[00:43:17] 06Analytics-Kanban, 06Discovery-Analysis, 07Browser-Support-Apple-Safari: Visits/searches from Safari 10 location bar search suggestions - https://phabricator.wikimedia.org/T157796#3016688 (10mpopov) We actually discussed this last year and I remember someone saying that Apple keeps a cached copy of Wikipedi...
[00:52:56] 10Analytics, 10EventBus, 10Reading-Web-Trending-Service, 10Reading Epics (Trending Edits), and 3 others: Compute the trending articles over a period of 24h rather than 1h - https://phabricator.wikimedia.org/T156411#3028036 (10Jdlrobson) @mobrovac can we safely bump this up to 24 hours now?
[01:37:39] 10Analytics, 10Analytics-EventLogging: Add ops-reportcard dashboard with analysis that shows the http to https slowdown on russian wikipedia - https://phabricator.wikimedia.org/T87604#3028129 (10Liuxinyu970226)
[04:04:32] 10Analytics, 10MediaWiki-extensions-WikimediaEvents, 10The-Wikipedia-Library, 10Wikimedia-General-or-Unknown, 13Patch-For-Review: Implement Schema:ExternalLinksChange - https://phabricator.wikimedia.org/T115119#3028316 (10Beetstra) If there is a rc-feed of edits for testwiki, I could set up LiWa3 to feed...
[05:55:29] 06Analytics-Kanban: Check abnormal pageviews for XHamster - https://phabricator.wikimedia.org/T158071#3025609 (10Tbayer) >>! In T158071#3025815, @MusikAnimal wrote: > A few things I found: > * It seems the number of unique IPs that visited `/wiki/XHamster` is significantly less than articles that received around...
[09:00:05] Piwik is using httpd prefork and mod_php
[09:00:16] my soul is devastated
[09:00:19] I feel reall sad
[09:00:21] *really
[09:01:05] !log restarted Piwik with bulk_requests_use_transaction=0 to try to fix the SQL deadlock issue (https://github.com/piwik/piwik/issues/6398#issuecomment-91093146)
[09:01:05] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[09:28:03] elukey: :(
[09:30:25] :D
[09:30:48] my idea is to fix the immediate issues and then spend some time in fixing it for good
[09:32:03] elukey: That'd be awesome :)
[10:00:43] 10Analytics, 10DBA, 06Labs: Discuss labsdb visibility of rev_text_id and ar_comment - https://phabricator.wikimedia.org/T158166#3028656 (10JAllemandou)
[10:12:12] (03CR) 10Joal: "Looks good for a single folder output." [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/337593 (https://phabricator.wikimedia.org/T156388) (owner: 10Mforns)
[10:13:10] (03CR) 10Joal: "@nuria: Can we abandon that patch?" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/327845 (https://phabricator.wikimedia.org/T151832) (owner: 10Nuria)
[10:16:03] (03CR) 10Joal: [C: 04-1] "Small nit for making this patch a minimal change" (032 comments) [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/337632 (https://phabricator.wikimedia.org/T154090) (owner: 10Nuria)
[10:19:37] joal_: qq - is there any reason why we'd want to connect to db1057? (s1 afaics)
[10:19:54] there are some firewall rules about prelabsdb-mysql
[10:20:02] probably super ancient
[10:20:23] hmmm (invoking ottomata power) - I don't know :)
[10:20:37] elukey: --^
[10:20:55] he didn't know either :D
[10:21:41] To be sure I have full understanding: db1057 is in labsdb - It's one of the 'old' db machines
[10:21:46] elukey: --^
[10:24:07] from what I can see it seems an s1 shard
[10:24:13] so no labsdb
[10:24:52] elukey: I have no idea - I have never used that connection
[10:25:00] super :)
[10:25:09] asking the dbas for final confirmation, but I am going to drop it
[10:26:25] (03CR) 10Joal: "@milimetric: See answer inline." (031 comment) [analytics/refinery] - 10https://gerrit.wikimedia.org/r/334042 (https://phabricator.wikimedia.org/T155658) (owner: 10Joal)
[10:26:58] elukey: I think it might be interesting to triple check with Erik Zachte (maybe?)
[10:29:26] (03Abandoned) 10Joal: [WIP] Modify MediawikiUserHistory [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/299729 (https://phabricator.wikimedia.org/T139745) (owner: 10Joal)
[10:32:05] (03CR) 10Joal: "@ottomata: Should we abandon that patch? There now is an InputFormat that allows reading/transforming XML dumps into whatever format is ne" [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/171056 (owner: 10Ottomata)
[11:05:10] 06Analytics-Kanban, 06Operations, 10netops: Review ACLs for the Analytics VLAN - https://phabricator.wikimedia.org/T157435#3028812 (10elukey) Proposed fixes: ``` delete firewall family inet filter analytics-in4 term udplog delete firewall family inet filter analytics-in4 term prelabsdb-mysql delete firewall...
[11:38:55] joal: I decided to go for the hammer, deleted what wasn't necessary from the ACLs :d
[11:39:16] if you see any issue ping me, but I don't expect much.. (famous last words)
[11:39:21] elukey: I like the approach: whoever gets hurt will shout !
[11:39:35] well it was carefully reviewed and planned :D
[11:39:41] usually I don't like this approach :D
[11:47:55] mmmmm I started to check ipv6 traffic
[11:48:12] and it seems that the traffic to the puppet master goes via IPv6
[11:48:23] afaics we don't have the puppet master whitelisted
[11:48:25] ahahahhaah
[11:50:35] elukey: :S
[12:04:16] * elukey lunch!
[12:24:02] elukey: I don't know if it's the ACL change or ottomata's java cert change, but build from archiva is back to normal !
[12:25:25] 06Analytics-Kanban, 13Patch-For-Review: Run a 1-off sqoop over the new labsdb servers - https://phabricator.wikimedia.org/T155658#3028963 (10chasemp) >>! In T155658#3028632, @gerritbot wrote: > Change 337793 had a related patch set uploaded (by Joal): > Add new fields to archive_p view in labsdb > > [[https:/...
[12:33:21] 06Analytics-Kanban, 13Patch-For-Review: Run a 1-off sqoop over the new labsdb servers - https://phabricator.wikimedia.org/T155658#3028988 (10JAllemandou) > Per @Jcrespo > >> Those fields are not in use on production (I blocked that), and they will be done properly (deleted) later in the year: https://www.med...
[12:37:13] (03CR) 10Milimetric: [V: 032 C: 032] "Missed that, thanks, merging" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/334042 (https://phabricator.wikimedia.org/T155658) (owner: 10Joal)
[13:04:46] joal: weird :D
[13:04:55] where do you execute the build usualy?
[13:04:57] *usually?
[13:05:07] I usually do it from stat1004
[13:05:07] (I don't remember :( )
[13:05:16] thanks milimetric :)
[13:05:37] maybe the next time we could run tcpdump to see what is used
[13:06:58] sure elukey
[13:07:14] I think andrew made changes yesterday to some certs stuff, might be related
[13:08:05] probably.. but it would be interesting to see if now something like IPv6 might explain the change in behavior
[13:09:12] elukey: sure
[13:13:17] milimetric: o/
[13:14:01] I followed https://github.com/piwik/piwik/issues/6398#issuecomment-91093146 and set bulk_requests_use_transaction=0, the SQL lock problem seems gone.. but I can see a constant rate of 503s registered from Varnish
[13:14:45] I didn't see traces of other problems on bohrium, so I am starting to wonder if Varnish returns 503s when it waits too long for the backend
[13:14:52] like apache queueing too many connections etc..
[13:15:50] the apache scoreboard looks fine
[13:15:51] mmm
[13:16:49] taking a small break a-team
[13:21:29] yes definitely something weird between apache and varnish
[13:41:21] o/ joal and milimetric
[13:41:38] I don't have anything for the love systems agenda. Do you?
[13:41:47] *live
[13:41:50] Lol
[13:55:55] !log disabled apache mod_deflate on bohrium (piwik test)
[13:55:56] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[14:00:02] :D
[14:00:05] Hey halfak|Mobile
[14:00:14] Nothing really funky on my side either
[14:01:16] milimetric, halfak|Mobile: cancellation?
[14:02:47] Oh, had not seen milimetric's email - halfak|Mobile : Cancellation !
[14:03:19] OK. Enjoy the extra hour!
[14:03:33] same to you halfak|Mobile :)
[14:04:15] Arf halfak|Mobile, just remember there were a couple of things I wanted to discuss with you - from your mobile tag line, I assume there might be a better time for you
[14:04:25] halfak|Mobile: later on today, tomorrow?
[14:05:03] elukey: yoooo
[14:05:55] ottomata: o/ o/
[14:06:03] Ooh. I can still meet. Just give me a couple minutes
[14:06:26] milimetric: too early to say but I might have found the source of the 503s from piwik
[14:07:06] elukey: shall we do the cdh dance in labs?
[14:07:34] ottomata: what about in an hour? Would you be available?
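For reference, the bulk_requests_use_transaction=0 workaround elukey !logged lives in Piwik's INI configuration. A sketch of the relevant stanza follows; the file path and the [Tracker] section placement are assumptions based on Piwik's shipped defaults and the linked upstream issue, so verify against the actual install on bohrium:

```ini
; config/config.ini.php (local overrides; path assumed)
; Stop wrapping bulk tracking requests in a single SQL transaction --
; the MySQL deadlock workaround discussed in piwik/piwik#6398.
[Tracker]
bulk_requests_use_transaction = 0
```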
[14:07:59] suuuuure, i think we'd run into standup and retro then though
[14:08:31] ah ok.. 30 mins?
[14:08:39] ok
[14:20:18] elukey: got a second @ batcave?
[14:25:21] fdans: a couple of minutes yes, after that I have a meeting with Andrew
[14:27:34] elukey: shouldn't need more than that
[14:43:06] elukey: YOoOOooO :)
[14:43:43] ottomata: I am in the batcave with Francisco but you can join, we have finished
[14:43:50] k
[14:46:29] nuria milimetric fdans mforns any feedback? have to start the detail page today and want to incorporate any feedback from the dashboard
[14:48:32] ashgrigas leaving some notes now, thank you! :)
[14:48:43] thanks fdans!
[15:00:21] hi team :]
[15:00:42] mforns o/
[15:00:49] hey fdans o/
[15:02:36] (03PS4) 10Mforns: Add spark job to aggregate historical projectviews [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/337593 (https://phabricator.wikimedia.org/T156388)
[15:03:56] (03CR) 10Mforns: "Forgot to change the file name to projectcounts instead of projectviews." [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/337593 (https://phabricator.wikimedia.org/T156388) (owner: 10Mforns)
[15:18:04] 06Analytics-Kanban, 13Patch-For-Review: Add "Damn Small XSS Scanner" (DSXS) to list of known bots - https://phabricator.wikimedia.org/T157528#3029512 (10JAllemandou) Awesome, thanks @Tbayer :)
[15:37:10] 06Analytics-Kanban, 13Patch-For-Review: CDH 5.10 upgrade - https://phabricator.wikimedia.org/T152714#3029552 (10Ottomata) Post upgrade manually: remove cdh5.5* packages from apt thirdparty
[15:48:17] sorry was at doctor's elukey, should've left a message
[15:48:22] so mod_deflate was causing the 503s?
[15:48:29] cool!
[15:50:05] milimetric: they are still there, but I think it is Varnish not liking what piwik returns for some reason
[15:50:10] will update the task after standup
[15:50:23] still trying to figure out what that is
[15:51:30] ok, thanks for all that work, glad the sql errors are gone, sorry I didn't expect this to be so nested and ugly
[15:54:13] milimetric: I am having fun in debugging, the nasty part will be the work afterwards to fix puppet etc.. :D
[15:54:25] but I really want to have a stable piwik
[16:00:38] joal, mforns : standdduppp
[16:02:34] ashgrigas: I will not be able to look at visuals until thu
[16:04:05] ok
[16:04:12] ill revise based on what i get today nuria
[16:30:42] elukey: in case our piwik users don't show you the proper appreciation for making piwik stable, know that at least I appreciate you
[16:31:49] joal: wow, expanded templates by itself should be a huge reason to use html if it makes sense to integrate in the pipeline
[16:32:19] in my ideal conception of a possibly nonexistent reality, we can bypass wikitext loading/parsing altogether!!
[16:32:45] * milimetric crushing people's 5 minute breaks since 2012
[16:32:54] milimetric: huhu :)
[16:33:27] milimetric: <3
[16:33:35] milimetric: I'm not sure whether I'd prefer to parse HTML or wikitext ;)
[16:33:53] oh, for me it's no contest, html by a mile
[16:34:18] expanding templates in wikitext: they're recursive!!
[16:34:31] * joal is afraid of HTML
[16:34:53] milimetric: spark doesn't mind recursive :) FixPoint for the WIN !
[16:35:25] oh, dude, but you don't know, some of these templates recurse thousands of times
[16:36:15] I know we can deal with it structurally, but performance wise... not confident
[16:36:28] turtles all the way down in template land yep
[17:03:41] joal, about the output format for projectcounts?
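The fixed-point template expansion milimetric and joal discuss above can be made concrete with a toy expander. This is purely illustrative: the {{name}} syntax and flat lookup table are a drastic simplification of real wikitext templates, but it shows why deeply recursive templates make expansion expensive and why a depth limit is needed.

```python
import re

TEMPLATE_RE = re.compile(r"\{\{(\w+)\}\}")

def expand(text, templates, max_depth=40):
    """Repeatedly substitute {{name}} references until nothing changes
    (a fixed point) or a depth limit trips. Each pass rescans the whole
    text, which is why templates that recurse thousands of times hurt."""
    for _ in range(max_depth):
        new = TEMPLATE_RE.sub(lambda m: templates.get(m.group(1), ""), text)
        if new == text:  # fixed point reached, expansion is done
            return new
        text = new
    raise RecursionError("template expansion did not converge")
```

For example, expanding "{{a}}" with {"a": "x{{b}}", "b": "y"} takes two passes and yields "xy", while a self-growing template like {"g": "a{{g}}"} never converges and hits the depth limit.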
[17:03:45] 06Analytics-Kanban: Check abnormal pageviews for XHamster - https://phabricator.wikimedia.org/T158071#3029918 (10Nuria) >What would be the SEO benefit of scraping the page? eh... traffic, of course, as a result of better positioning on search.
[17:03:47] sure mforns !
[17:03:55] didn't we agree on having everything in one folder?
[17:04:29] otherwise I'll partition the output in year=2007-like directories
[17:07:18] mforns: sorry - got distracted - I'm backlogging IRC to check
[17:13:25] joal, well reading it, it seems that we left it open to either all in a single folder or partitioned by year
[17:13:48] joal, is it important that a single file contains 1 year of data and only one?
[17:26:18] mforns: sorry - got hooked up with milimetric :)
[17:27:00] mforns: The various angles I was thinking of were - GZ + TSV is interesting for external purposes
[17:27:39] So, if we go for that (and not parquet+snappy), it would be cool to have files that kinda make sense for external consumption (yearly files)
[17:28:28] mforns: makes sense?
[17:35:42] joal, OK, and same folder vs year partition? does not matter?
[17:36:27] mforns: I think if we go to the burden of partitioning by year in files, let's put them in folders and take benefit of hadoop partitioning
[17:37:08] mforns: But, all this is one possible direction - We can also do parquet (I don't mind)
[17:37:31] joal, it's OK, will do year partitioning
[17:37:37] gzip+TSV
[17:37:46] mforns: + folder ;)
[17:38:10] joal, you mean year=2007 ?
[17:38:15] correct
[17:38:18] sure
[17:38:22] For have to take advantage
[17:38:27] for HIVE sorry
[17:38:41] Still makes sense mforns ?
[17:42:12] elukey: FYI: https://wikitech.wikimedia.org/wiki/Analytics/Cluster/Hadoop/Administration#Updating_Cloudera_Packages
[17:44:53] nice!
[17:45:24] so each time that we want to upgrade we just sync
[17:46:48] heyyy elukey! I know I +1ed your recent hadoop heapsize monitoring patch
[17:46:51] but i think we are doing something wrong
[17:47:01] we shouldn't be looking up cdh::hadoop::hadoop_namenode_heapsize in hiera
[17:47:06] we've already included the class
[17:47:11] so we should look it up from the class variable
[17:47:21] it's failing in labs because those hiera keys don't exist there
[17:47:33] the default param for those in the hadoop class is undef
[17:47:50] Oh, but some of those are wrapped in if production
[17:47:50] hmmm
[17:48:00] :(
[17:48:03] going to make a patch...
[17:48:57] it's also confusing because the cdh::...::heapsize params are not real cdh class parameters
[17:52:32] yeah I keep confusing these in puppet
[17:57:03] elukey: also, you reference T153951 in the comment there
[17:57:03] T153951: Yarn node manager JVM memory leaks - https://phabricator.wikimedia.org/T153951
[17:57:11] but these heap size monitors are only for namenode and RM
[17:57:13] right?
[17:58:02] yes yes
[17:58:16] I also put one for the node manager
[17:58:21] I should have
[17:58:23] checking
[17:58:44] oh ok
[17:58:57] ya you did
[17:58:57] cool
[17:58:59] sorry
[18:03:52] 06Analytics-Kanban, 06Operations: Periodic 500s from piwik.wikimedia.org - https://phabricator.wikimedia.org/T154558#3030097 (10elukey) Summary of today: * I followed https://github.com/piwik/piwik/issues/6398#issuecomment-91093146 and set `bulk_requests_use_transaction=0` manually to fix an error showing up...
[18:04:25] 06Analytics-Kanban, 06Operations, 15User-Elukey: Periodic 500s from piwik.wikimedia.org - https://phabricator.wikimedia.org/T154558#3030103 (10elukey) p:05Triage>03Normal a:05Milimetric>03elukey
[18:04:37] milimetric: ---^
[18:04:50] the mystery keeps returning surprises :D
[18:05:22] now I am waiting for a 503 on cp1058 to figure out what's happening, but it is definitely something weird between apache and varnish
[18:11:51] ottomata: everything good if I go afk?
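The output layout mforns and joal settle on above (gzip-compressed TSV files inside Hive-style year=YYYY folders) can be sketched in a few lines of Python. The base directory and file names here are illustrative, not the actual refinery output paths:

```python
import gzip
import os

def write_year_partition(base_dir, year, rows):
    """Write one year of rows as a single gzipped TSV file inside a
    Hive-style partition directory (base_dir/year=YYYY/)."""
    part_dir = os.path.join(base_dir, "year=%d" % year)
    os.makedirs(part_dir, exist_ok=True)
    path = os.path.join(part_dir, "projectcounts-%d.tsv.gz" % year)
    with gzip.open(path, "wt", encoding="utf-8") as f:
        for row in rows:
            f.write("\t".join(str(field) for field in row) + "\n")
    return path
```

With this layout a Hive external table pointed at the base directory can register each year=YYYY folder as a partition (e.g. via MSCK REPAIR TABLE or ALTER TABLE ... ADD PARTITION) without rewriting the files, which is the "take benefit of hadoop partitioning" point above.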
[18:11:59] (I mean no pending things to discuss :)
[18:12:55] ya
[18:12:58] we continue tomorrow
[18:13:02] oh elukey got a review for you in a sec
[18:13:03] not working yet
[18:13:05] but just fyi
[18:13:08] i can merge w/o you
[18:13:13] https://gerrit.wikimedia.org/r/#/c/337886/
[18:13:39] I saw it, looks good! Thanks!
[18:14:04] cool
[18:14:12] * elukey afk!
[18:22:57] 06Analytics-Kanban: Check abnormal pageviews for XHamster - https://phabricator.wikimedia.org/T158071#3030227 (10MusikAnimal) Yeah you've got me why article promotion is so important to them, but it seems to be inline with why people spam Wikipedia all the time. Search engines have heuristics that look at traffi...
[19:32:49] (03PS1) 10Joal: [WIP] Add job computing citations diffs over text [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/337900
[19:34:07] Done for today a-team, see you tomorrow !
[19:38:51] laters!
[19:50:44] ottomata: hive kaput?
[19:50:46] ottomata:
[19:50:50] https://www.irccloud.com/pastebin/Fj22v27v/
[19:51:00] Guessing this is a question for you wonderful lot ^^ can I get access to graphite.wikimedia.org for eventlogging data as I'm all NDA'd up and it *may* be helpful for looking at the ExternalLinksChange data? Aaaaand https://grafana.wikimedia.org/dashboard/db/eventlogging-schema?var-schema=All doesn't seem to have `ExternalLinksChange`
[19:51:18] hm
[19:53:34] nuria i just ran a query and it worked for me
[19:53:36] what's up?
[20:09:47] 10Analytics: Split unique devices data for Asiacell and non-Asiacell traffic in Iraq - https://phabricator.wikimedia.org/T158237#3030739 (10Nuria)
[20:09:57] ottomata: it worked!
[20:10:02] ottomata: momentarily kaput
[20:34:01] samtar: what are you looking for?
[20:34:13] hmm
[20:35:18] ottomata: on graphite? I'd like to be able to see the statistics for the ExternalLinksChange schema in eventlogging
[20:35:35] hmm, i think you want this
[20:35:36] https://grafana.wikimedia.org/dashboard/db/eventlogging?var-schema=ExternalLinksChange
[20:35:48] hm but does that work?
[20:35:54] or, how about logstash, ottomata?
[20:35:55] perhaps this is better?
[20:36:00] https://grafana.wikimedia.org/dashboard/db/kafka-by-topic?from=now-7d&to=now&var-cluster=analytics-eqiad&var-kafka_brokers=All&var-topic=eventlogging_ExternalLinksChange
[20:36:04] there aren't many messages for that schema
[20:36:33] oooh that's helpful!
[20:36:41] The low count is expected :)
[20:36:56] samtar: to dig deeper into errors and stuff, check logstash: where we log any validation or other errors on the EventLogging pipeline
[20:36:56] https://logstash.wikimedia.org/app/kibana#/dashboard/eventlogging
[20:37:02] you'll have to filter to find just your schema
[20:37:13] but if there's an error that didn't crash the system (which is most of them) it would be there
[20:37:17] "https://logstash.wikimedia.org is requesting your username and password. The site says: “WMF Labs (use wiki login name not shell) - nda/ops/wmf”"
[20:37:20] ?
[20:37:50] yeah. not public. you need to have signed an NDA explicitly or be WMF staff
[20:38:16] I've signed /a/ NDA
[20:38:25] is there a separate one for that?
[20:39:13] samtar: your wikitech account
[20:39:15] that should work
[20:39:23] I think 'nda' LDAP group membership is gated by the 'L2' NDA.
[20:39:50] 'L2' is the one described here -- https://wikitech.wikimedia.org/wiki/Volunteer_NDA
[20:39:56] I'm not sure about logstash, but samtar try logging into https://pivot.wikimedia.org
[20:40:11] if you can get in there with your wikitech account you're in the right LDAP group
[20:40:32] they should all use the same LDAP group membership control. Either wmf, ops, or nda group required
[20:41:08] ah haven't signed L2
[20:41:16] k, makes sense
[20:41:18] L3 and a researcher NDA
[20:41:30] E_TOOMANYNDAS
[20:41:52] heh, I was expecting that to return more than 0 google hits :)
[20:42:28] milimetric: obviously an SEO challenge that we should accept :)
[20:42:36] haha
[20:45:44] samtar: i fixed https://grafana.wikimedia.org/dashboard/db/eventlogging?var-topic=eventlogging_ExternalLinksChange a little bit
[20:45:49] the schema template var didn't work
[20:45:50] not sure why
[20:45:56] i had to change it to the full (kafka) topic name to make it work
[20:46:16] yay! Thank you ottomata :)
[20:46:40] milimetric: are eventlogging events in logstash?
[20:46:42] no, right?
[20:46:44] just errors
[20:47:08] oh sorry
[20:47:09] ottomata: correct, just things from the error kafka queue
[20:47:10] that's what you said :)
[20:47:58] It'd be helpful to get at logstash.. would it be worth seeing about the NDA?
[20:50:09] samtar: up to you, for now you have such low traffic that grafana is probably fine for you
[20:51:07] Okay ^^ there's not going to be any additional debug-useful information being stored there?
[20:51:16] it's not like it's super hard to get the NDA, but it comes with responsibilities to make sure you protect the data well
[20:51:30] debug-useful, maybe, are you going to dig into the mediawiki code?
[20:52:14] I will be yeah, we're currently getting a project instance set up so we can try to get to the bottom of the hook problem
[20:52:16] samtar: actually, you should debug on beta, but there's no problem with the eventlogging pipeline
[20:52:41] samtar: the problem is only on mediawiki's end right now, after you fix that and get events through, you can debug on beta where it's much easier and you don't need special access
[20:52:57] logstash is really only if you're maintaining the schema in production and notice it's doing something wrong.
[20:53:10] ah okay
[20:53:43] This will be the first time I've come across logstash so you'll have to excuse the stupid questions ^^
[20:58:11] no problem samtar, easier questions are easier to answer and that's what we're here for :)
[21:10:44] (03Abandoned) 10Nuria: POC of loading tile data into pivot [analytics/refinery] - 10https://gerrit.wikimedia.org/r/327845 (https://phabricator.wikimedia.org/T151832) (owner: 10Nuria)
[22:07:40] bye team, good night!
[22:27:45] 10Analytics, 10ContentTranslation, 07Spanish-Sites: Limn language dashboard: eswiki graph is wrong/stuck - https://phabricator.wikimedia.org/T99074#3031283 (10MarcoAurelio)
[22:27:57] 10Analytics, 10Analytics-General-or-Unknown, 07Spanish-Sites: "data:" URLs accounting for 6 of the top 10 most viewed articles reported by stats.grok.se - https://phabricator.wikimedia.org/T68112#3031291 (10MarcoAurelio)