[04:35:59] 10Analytics, 10Analytics-Kanban, 10Analytics-Wikistats, 10Patch-For-Review: Beta: Y-axis units and rounding issues - https://phabricator.wikimedia.org/T187429#4223934 (10sahil505)
[05:25:44] (03PS2) 10Sahil505: Corrected y-axis labels to one decimal place to avoid similar labels [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/434326 (https://phabricator.wikimedia.org/T187429)
[05:26:27] (03CR) 10Sahil505: Corrected y-axis labels to one decimal place to avoid similar labels (031 comment) [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/434326 (https://phabricator.wikimedia.org/T187429) (owner: 10Sahil505)
[06:14:08] !log re-run webrequest-load-wf-misc-2018-5-23-2 via Hue
[06:14:13] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[06:45:23] I am restarting zookeeper on druid100[4-6] (wikistats backend) for openjdk security upgrades
[06:55:14] 10Analytics, 10Analytics-Kanban, 10User-Elukey: Restart Analytics hosts for Java 8 Security upgrades - https://phabricator.wikimedia.org/T194268#4224061 (10elukey) Zookeeper restarted on druid100[4-6], the Druid daemons will be upgraded to a new version soon as part of T193712
[07:35:27] !log upgrading the Druid labs cluster to Debian Stretch
[07:35:32] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[07:40:51] 10Analytics, 10Patch-For-Review, 10User-Elukey: Upgrade Druid nodes (1001->1006) to Debian Stretch - https://phabricator.wikimedia.org/T192636#4224091 (10elukey) I am upgrading the Druid labs cluster to Stretch as a test, and two things came up: 1) the druid debs were missing from stretch-wikimedia, uploaded...
[07:45:00] 10Analytics, 10Operations, 10Research, 10Traffic, and 6 others: Referrer policy for browsers which only support the old spec - https://phabricator.wikimedia.org/T180921#4224098 (10TheDJ) @gh87 yeah, so that is correct. We indicate as referer policy: "origin, origin-when-crossorigin, origin-when-cross-origi...
[07:50:06] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review, 10User-Elukey: Upgrade Druid nodes (1001->1006) to Debian Stretch - https://phabricator.wikimedia.org/T192636#4224114 (10elukey)
[07:52:40] joal: d-* nodes in labs upgraded to Debian Stretch, everything seems fine, will do more tests but after the public upgrade to 0.11 we could start thinking about reimaging druid100[1-6] nodes to Stretch
[07:53:07] afaik there is nothing to save on the hosts, they can be wiped and reimaged without any precaution (one at a time of course)
[07:56:11] 10Analytics, 10Analytics-Kanban: Upgrade Analytics infrastructure to Debian Stretch - https://phabricator.wikimedia.org/T192642#4224131 (10elukey)
[08:27:09] 10Analytics-Legal, 10WMF-Legal, 10Wikidata: Solve legal uncertainty of Wikidata - https://phabricator.wikimedia.org/T193728#4224188 (10TomT0m) > Well, you just said that there might be cases "where copyright law permits the extraction of information from copyrighted texts", so I believe that an explicit...
[09:02:11] 10Analytics-Legal, 10WMF-Legal, 10Wikidata: Solve legal uncertainty of Wikidata - https://phabricator.wikimedia.org/T193728#4224279 (10Psychoslave) Sorry, I've been busy on other activities lately, I'll catch up and give feedback as soon as I can.
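The zookeeper restart logged at 06:45 is the usual rolling one: nodes are done one at a time so the ensemble never loses quorum. A minimal bash sketch of that procedure — the hostnames come from the log, while the systemd unit name, client port, and the use of ZooKeeper's four-letter-word probes are assumptions about the setup:

    # Rolling ZooKeeper restart for the JDK upgrade, one node at a time.
    # Assumed: a "zookeeper" systemd unit and client port 2181 on each host.
    for host in druid1004 druid1005 druid1006; do
        ssh "$host" 'sudo systemctl restart zookeeper'
        # Block until the node answers "imok" before touching the next one,
        # so a quorum of healthy nodes is kept at all times.
        until [ "$(echo ruok | nc "$host" 2181)" = "imok" ]; do
            sleep 5
        done
        # "stat" reports whether the node rejoined as leader or follower.
        echo stat | nc "$host" 2181 | grep Mode
    done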
[10:05:22] joal,milimetric - whenever you guys are around what do you think about upgrading the druid public cluster to 0.11? All metrics seem stable and nothing weird in the druid analytics logs so far. All indexations went well, so I think we could proceed with Druid public if you agree
[10:05:40] usually I prefer to wait longer but afaics 0.11 seems pretty stable
[11:29:59] * elukey afk for a bit!
[11:39:25] 10Analytics: Generate pagecounts-ez data back to 2008 - https://phabricator.wikimedia.org/T188041#4224853 (10CristianCantoro) I have run the script over 1 day's worth of data (`2007-12-11`), it took a little more than 8 hours (484 minutes) and around 34GB of RAM. I am testing on another day (`2007-12-12`). I was...
[11:40:25] Hi elukey - here for a minute while kids are asleep - No problem for us for the upgrade
[11:40:46] elukey: Next indexation is in ~2 weeks, so we have time
[11:58:39] nice!
[12:02:39] ok time in NYC is good now, if anything explodes :D
[12:02:53] fdans: hola! Do you have 2 mins to help?
[12:06:27] all right, starting the Druid public upgrade
[12:09:22] elukey: I can go in now!
[12:09:54] fdans: o/ - since I am upgrading Druid, I'd ask you to check that wikistats doesn't fall over
[12:10:39] elukey: ok!
[12:11:13] elukey: watching endpoints
[12:12:32] ack, starting with historicals
[12:18:20] historicals done
[12:18:48] waiting for https://grafana.wikimedia.org/dashboard/db/druid?refresh=1m&orgId=1&var-cluster=druid_public&var-druid_datasource=All&from=now-1h&to=now&var-datasource=eqiad%20prometheus%2Fanalytics&panelId=46&fullscreen
[12:19:54] the coordinator's UI shows 100% segments loaded, so good
[12:22:27] overlords done
[12:22:42] fdans: from my perspective no issues so far, but lemme know otherwise
[12:23:56] now doing the brokers
[12:24:00] I'll depool/repool each time
[12:24:25] elukey: looking good so far
[12:28:47] ok, all coordinators up
[12:29:44] the UI shows me 100% segments loaded
[12:30:41] all right, from my side the upgrade is done
[12:30:58] I also added the avro/parquet extensions to public, joal (but not KIS)
[12:34:05] fdans: all good? If so, Druid is good now :)
[12:34:33] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review, 10User-Elukey: Upgrade Druid clusters to 0.11 - https://phabricator.wikimedia.org/T193712#4225008 (10elukey) Both clusters upgraded to 0.11!
[12:35:01] 10Analytics, 10Analytics-Kanban, 10User-Elukey: Restart Analytics hosts for Java 8 Security upgrades - https://phabricator.wikimedia.org/T194268#4225010 (10elukey)
[12:36:06] (03CR) 10Elukey: "Adding Faidon to review the new fields!" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/433597 (https://phabricator.wikimedia.org/T194055) (owner: 10Elukey)
[12:38:26] (03PS3) 10Elukey: [WIP] Index some fields from isp_data and geocoded_data in Druid's webrequest [analytics/refinery] - 10https://gerrit.wikimedia.org/r/433597 (https://phabricator.wikimedia.org/T194055)
[12:47:08] elukey: sorry, irccloud is not notifying me of chats
[12:47:41] elukey: yes, I've been curling endpoints every 10s and everything's been good!
[13:01:08] fdans: \o/
[13:09:22] (03CR) 10Nuria: "Looks to me this is ready to go (testing) once faidon thinks the fields are sufficient. Would love to know if the new pivot adds the new d" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/433597 (https://phabricator.wikimedia.org/T194055) (owner: 10Elukey)
[13:10:40] hello eu team,
[13:18:04] hello eu nuria_! :)
[13:35:17] nuria_: helloooo are you in mad or bcn?
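The broker step elukey describes ("I'll depool/repool each time") is a standard rolling restart behind the load balancer. A sketch of one iteration, run per broker host — the depool/pool wrappers, service name, and ports are assumptions about the local setup, while /status and the coordinator loadstatus endpoint are Druid's own HTTP API:

    # One broker at a time: drain traffic, restart onto 0.11, verify, repool.
    sudo depool                          # assumed LVS depool wrapper
    sudo systemctl restart druid-broker  # assumed service name
    sleep 30
    # Druid's /status answers once the broker is up (default broker port 8082).
    curl -sf http://localhost:8082/status >/dev/null && sudo pool

    # On the coordinator, segment availability should stay at 100%,
    # matching the "100% segments loaded" checks in the log:
    curl -s http://localhost:8081/druid/coordinator/v1/loadstatus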
[13:47:29] * elukey fights with Namely
[14:01:03] https://gerrit.wikimedia.org/r/#/admin/projects/analytics/geowiki-data
[14:01:16] sorry, wrong place to paste
[14:04:46] ottomata: o/ - druid 0.11 deployed everywhere :)
[14:10:21] yeehawww
[14:10:22] :D
[14:11:22] elukey: NICE!
[14:25:35] !log redirecting pivot -> turnilo.wikimedia.org - https://phabricator.wikimedia.org/T194427
[14:25:38] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[14:26:03] bye bye pivot
[14:27:48] easy peasy!
[14:35:03] heloooo
[14:35:24] o/
[14:37:48] http://druid.io/docs/0.12.0-rc1/development/extensions-core/druid-basic-security.html
[14:38:25] with TLS enabled (supported by 0.11+) this one is really good
[14:45:06] elukey: I'm not getting a response from stat1005 when trying to ssh
[14:45:37] stat1004 is responding, don't know if this is normal
[14:45:59] it is not, pretty sure somebody is overloading it
[14:46:02] yeah
[14:47:56] scoundrels!
[14:49:08] bearloga: hi!! Sorry I had to kill your process, stat1005 was completely overloaded and not responding to ssh :(
[14:50:48] also the oom killer acted on some python process, maybe related?
[14:51:28] fdans: all yours
[14:52:34] thank you elukey, sorry bearloga
[15:22:42] (03CR) 10Mforns: [V: 032 C: 032] "Amazing! :DD" [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/434326 (https://phabricator.wikimedia.org/T187429) (owner: 10Sahil505)
[15:31:46] ottomata: ops sync?
[15:34:42] OG BOY
[15:50:00] joal yt?
[15:53:15] hey ottomata
[15:53:29] joining
[15:54:01] joal AH wait i think i figured it out sorry ...
[15:54:10] oki
[16:00:45] ping fdans
[16:00:49] stadduppp
[16:07:30] 10Analytics, 10Analytics-Kanban: Archive old geowiki data (editors per country) and make it easily available at WMF - https://phabricator.wikimedia.org/T190856#4225956 (10fdans) a:03fdans
[16:47:17] * elukey afk for a bit!
[16:55:39] 10Analytics, 10Operations, 10Research, 10Traffic, and 6 others: Referrer policy for browsers which only support the old spec - https://phabricator.wikimedia.org/T180921#4226106 (10Nuria) Latest data about this show internal referrers for safari creeping up from late March: {F18492888}
[16:56:26] joal: woah i got something working
[16:56:30] now i want to brain bounce some stuff
[16:56:32] got a min?
[16:56:32] \o/ !
[16:56:42] ottomata: in 1-1 in 5 minutes
[16:56:45] k
[16:56:46] in da cave !
[16:56:53] ya
[16:58:02] actually, will leave for 1-1 - I'll ping after
[16:58:05] ottomata: --^
[17:04:14] ottomata, can you batcave for a moment today? It's about that weird error in EL Sanitization, just to see if it rings a bell. It doesn't need to be now though
[17:07:06] yesssirrrr
[17:07:16] actually i should make some lunch wow
[17:07:21] i'm in bc now but will be afk lunching
[17:07:30] oh wait
[17:07:31] ottomata, ok! np
[17:07:35] joal is away
[17:07:39] will ping when back from lunch mforns
[17:07:45] sure ottomata np
[17:53:08] sorry mforns helping fr folks with kafka
[17:53:10] how much longer are you around?
[17:53:19] ottomata, np, I'll be here late
[17:53:34] also, I think I found a possible reason for the issue, lookin into it
[17:55:36] gotta run an errand, be back in an hour or so
[17:58:35] l
[17:58:36] k
[18:01:34] * elukey off!
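The stat1005 incident earlier in the afternoon (host overloaded, ssh unresponsive, OOM killer firing on a python process) is the classic resource-hog scenario. None of the commands below appear in the log; they are just the standard triage one might run from a surviving session or console before deciding whose job to kill:

    # Top CPU and memory consumers, to identify the offending process.
    ps -eo pid,user,%cpu,%mem,etime,cmd --sort=-%cpu | head -n 15

    # Check whether the kernel OOM killer has already stepped in,
    # as it did with the python process mentioned above.
    dmesg -T | grep -i 'out of memory'

    # Per-user resident memory totals (RSS is reported in KB).
    ps -eo user,rss --no-headers \
        | awk '{a[$1]+=$2} END {for (u in a) printf "%s %.1f GB\n", u, a[u]/1048576}' \
        | sort -k2 -rn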
[18:12:17] although, I went through the changes, and everything seems perfect [18:12:24] mforns: do you want to discuss in da cave? [18:12:27] sure [18:12:29] La baticueva ! [18:12:29] omw [18:12:33] :D [18:33:14] Hey ottomata - Quick question for you [18:33:59] ottomata: an you tell us if the event-refine jobs running in prod make use of the latest deployed jar (v0.0.64) [18:35:35] i don't think they do [18:35:42] that should be configured in cron [18:35:43] or [18:35:45] puppet [18:35:47] the jar version [18:35:54] ottomata: interestingn [18:36:15] ottomata: We're wondering with mforns if the problem comes from whitelist or the PartitionedDataframe thing [18:36:31] ottomata: I'd be interested to test the new version for a classical event-refine-job [18:40:37] joal: shoudl be easy ya? run a command and use the jar? [18:41:11] ottomata: I'm with the command - I should use the cron as an example I bet [18:41:47] joal on an03 [18:41:47] cat /usr/local/bin/json_refine_eventlogging_analytics [18:42:19] ottomata: I think you told me this at least 3 times ... [18:42:20] phew ok, i think i'm done with fr folks [18:42:21] :/ [18:42:23] so! [18:42:27] ook [19:30:34] mforns: my connection is dying :( [19:30:43] joal, :( [19:30:49] mforns: trying to see if I can figure out anything [19:31:09] I'm executing with code prior to refactor [19:31:14] well, still compiling [19:37:48] joal, same 2 problems without PartitionedDataFrame code! [19:43:55] o.o running again [19:48:39] :( [19:48:47] mforns: this feels wrong :( [19:49:07] mforns: I have experienced the connection issue when running with hdfs user - very weird ! [19:49:14] hmmm [19:51:18] yea, failed again, seems not related to code... [19:52:10] mforns: same value issue? [19:52:15] yes [19:52:31] :S [19:52:49] joal, next, I'm trying to run the code as it was on 2018-04-17 [20:05:06] mforns: I got a working refine job using jar v0.0.64 using cluster, not client [20:05:16] joal, aha [20:05:31] mforns: I'm assuming the issue come from the special jars in driver? but I'm not sure [20:05:40] but ELSanitization needs deploy-mode = client [20:05:46] I see [20:05:51] mforns: why? [20:06:01] because it looks for the whitelist file in the driver [20:06:12] file is in analytics1003 [20:06:14] not in cluster [20:06:26] mforns: deploy-mode cluster only means logs are more difficult to get (no real time, only post-application, with yarn logs command) [20:06:36] Ahhhh [20:06:42] yea, hehe [20:07:00] we can change that, if it is a problem [20:07:00] mforns: I think we should push that file to the cluster [20:07:05] ok [20:07:22] tomorrow I'll ask elukey [20:07:36] Let's test with that and see if if changes anything [20:07:43] sounds good mforns [20:07:47] ok, will test [20:07:50] Thanks for the tests ! [20:08:02] thank *you* for the help! [20:08:08] mforns: I'm gonna stop for today - connection is too bad :) [20:08:17] right, seeya tomorrow! [20:08:18] I'll see you tomorrow! [20:24:06] byyeye [20:32:07] ottomata, what would be a good place for the EL whitelist in HDFS? [20:32:43] \/user/hdfs/eventlogging_purging_whitelist.yaml? [20:43:17] hmm [20:43:29] i forget, it is in puppet right now..or? [20:43:35] we could put it in refinery? [20:43:42] or is that annoying? [20:43:52] i guess annoything because puppet will schedule the thing hm [20:43:53] ? [20:46:30] ottomata, I don't care much, although intuitively I'd say it belongs to puppet [20:46:42] yes, right now it's in puppet [20:47:14] only reason i'd say refinery is then there'd be an obvious place... 
[20:47:33] aha, it's easier to find [20:47:46] refinery gets deployed to hdfs [20:48:26] mforns: [20:48:33] we could make a new dir in /wmf ? [20:48:37] /wmf/config [20:48:38] ? [20:48:46] or [20:48:47] /wmf/var [20:48:48] ? [20:49:00] hm [20:50:17] or just /var maybe [20:50:48] but if you think refinery is cool, let's go with that [20:51:23] the only (temporary) problem is ensuring presence for mysql [20:52:29] ensuring precense? [20:52:31] oh right. [20:52:36] we need the same thing for mysql [20:52:37] yaaaa [20:52:37] hm [20:52:45] so its nice it is in puppet [20:53:07] althought, we do have the pageview wiki whitelist, right? [20:53:09] this is kinda like that? [20:53:17] we coudl deploy refinery to mysql eventlog repl hosts [20:53:24] where that script runs [20:58:15] ok [20:59:59] mforns: i'm not against deploying it with puppet to somwhere else, i'm jsut not sure where... [21:00:10] refinery seems a little bit more natural... [21:00:12] what do you think? [21:00:38] on the cluster side, refinery seems right [21:01:15] I think we can live with cloning refinery in mysql hosts for now [21:04:45] ok [21:43:52] 10Analytics, 10Operations, 10SRE-Access-Requests, 10Patch-For-Review: Access to usergroups for Marshall Miller - https://phabricator.wikimedia.org/T194550#4226980 (10MMiller_WMF) Thank you! I'm in.