[00:22:10] (CR) Ottomata: "Just ran this, looking good to meee! Gotta construct a test to make sure it fixes the problem we had before still.." [analytics/refinery/source] - https://gerrit.wikimedia.org/r/419217 (owner: Joal)
[08:02:06] Analytics-Kanban, Google-Summer-of-Code (2018): Proposal : [Analytics] Improvements to Wikistats2 front-end - https://phabricator.wikimedia.org/T189964#4060236 (sahil505)
[08:24:25] Hi team - Unusual weather in Brittany: 20cm of snow!!! Obviously everything is disrupted, so I won't be working a lot, to care for Lino and Naé
[08:24:44] o/
[08:24:45] I might post a picture of snow-man or an igloo :)
[08:24:51] snowing in bologna too :)
[08:24:57] not that heavily though :P
[08:25:41] Hi elukey - It's not that bad, it's just that since it happens once every twenty years for us, we are not prepared nor do we know how to react :D
[08:25:57] Except running and shouting in joy, throwing snowballs
[08:26:17] So I'll keep doing that with Lino, school being closed
[08:27:04] elukey: before I leave, anything you'd like me to help with (for synchro)
[08:27:07] ?
[08:28:38] joal: nope, just tried to deploy the new cassandra package on aqs1004 (to enable the jmx exporter) and failed, so rolled back everything
[08:28:42] not a good start of the week :D
[08:28:45] :(
[08:28:58] * joal offers some coffee to elukey
[08:29:07] :)
[08:40:51] going to reboot thorium now for security upgrades folks
[08:47:18] Analytics-Kanban, User-Elukey: Reboot all Analytics hosts for Kernel upgrade - https://phabricator.wikimedia.org/T188594#4060326 (elukey)
[08:48:07] aaand thorium rebooted
[09:38:22] !log restart hadoop daemons on analytics1070 for openjdk upgrades (canary)
[09:38:23] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[09:40:16] this datanode is now running with the new setting to tolerate 2 disk failures before shutting down
[09:52:04] Analytics-Kanban, User-Elukey: Reboot all Analytics hosts for Kernel upgrade - https://phabricator.wikimedia.org/T188594#4060459 (elukey)
[10:02:26] Analytics, Analytics-EventLogging, User-Elukey: Run eventlogging purging script on beta labs to avoid disk getting full - https://phabricator.wikimedia.org/T171203#4060493 (elukey)
[10:05:06] Analytics, Operations, Patch-For-Review: rack/setup/install notebook100[34] - https://phabricator.wikimedia.org/T183935#4060495 (elukey)
[10:09:34] Analytics, User-Elukey: latest varnishkafka fails to build on Debian - https://phabricator.wikimedia.org/T186250#4060561 (elukey) Very nice, just read http://man7.org/linux/man-pages/man3/daemon.3.html ``` daemon(): Since glibc 2.21: _DEFAULT_SOURCE In glibc...
[10:15:01] Analytics, MediaWiki-extensions-CentralAuth: Determine impact of Apple's Intelligent Tracking Prevention 1.1 - https://phabricator.wikimedia.org/T190031#4060567 (TheDJ)
[11:33:38] * elukey lunch + errand! (be back in ~2h)
[13:29:04] hey yall, I'm around, working on my patch
[13:29:12] Hi milimetric :)
[13:29:26] joal!! I thought you were building snow people
[13:29:41] joal: any thoughts on why that druid loader fails or where I should look for logs?
[13:29:44] milimetric: I did !!! Lino sleeps now, so I can spend a minute working :)
[13:30:15] milimetric: I've had a quick look - It seems related to python not being set up the same way on an1067 (and maybe others???)
[13:30:20] I couldn't see any other diff
[13:30:46] oh! So the data nodes are not homogeneous and it'll just fail whenever it runs on those machines?
[13:30:55] milimetric: I suspect it could
[13:31:10] milimetric: the error we got for your example was about command not found
[13:31:39] ok, cool, I'll run a fresh one and dig through yarn logs more
[13:31:50] milimetric: please ping me when started :)
[13:32:27] joal: started 0014326-180308085316299-oozie-oozi-C
[13:32:40] ack milimetric !
[13:38:07] milimetric: same error as last time: /usr/bin/env: python -- : No such file or directory
[13:40:15] right, https://hue.wikimedia.org/jobbrowser/jobs/job_1520532368078_34310/single_logs
[13:40:31] so... how does the same job work for mediawiki history?
[13:41:23] milimetric: That is so hugely weird !!!
[13:43:59] yeah, didn't find anything in stdout that would cause a problem, looks like params are passed ok
[13:44:07] milimetric: this sub-workflow is being used by any druid-loading-job we have
[13:44:25] yeah, exactly
[13:44:39] 8-(
[13:44:40] and it looks like the first part of it works, so I was stuck about why the index step would fail
[13:44:50] milimetric: indeed, only druid part
[13:46:25] hm, I'll try running the command myself on I guess analytics1067:
[13:46:26] https://www.irccloud.com/pastebin/LcjF5e3q/
[13:46:43] milimetric: as hdfs user
[13:46:51] yep
[13:46:52] I think I tried that ... hm
[13:46:55] Can't recall
[13:47:09] milimetric: this time it failed from an1039
[13:57:19] joal: wait, how/where is druid_loader.py on an1039? It's running it via java out of hadoop somehow?
[13:57:38] milimetric: oozie?
[13:57:43] milimetric: yessir
[13:58:16] oozie executes its app-master, and a container that executes the workflow actions (in our case, a shell action)
[13:58:44] oh... so... to do it manually you copy the python down to the machine out of oozie?
[13:58:56] out of the oozie directory in hdfs I mean
[13:59:06] ?
[14:00:19] milimetric: I can't think of a better way
[14:17:19] Oh milimetric - Wanted to discuss with you as well - I've managed to import with sqoop as parquet instead of avro :) Thought it could be interesting to discuss
[14:25:58] oh cool, is the import similarly fast?
[14:27:14] when we first looked we saw a big speed increase with Avro from plain text import
[14:28:13] (PS1) Fdans: Add hi.wikimedia and zh.wikidata to whitelist [analytics/refinery] - https://gerrit.wikimedia.org/r/420346
[14:28:42] milimetric: I have not checked performance at import, but from a reading perspective it'll be really better
[14:28:52] milimetric: there is a downside though
[14:29:48] milimetric: The code currently doesn't expect output-path to contain anything else than alphanumeric + '_' (meaning our partitioned folder can't be used as is, it would need to be renamed :S)
[14:30:51] ottomata: o/ - ok if I reboot kafka100[23]?
[14:31:15] hiiii
[14:31:19] yes go ahead elukey
[14:31:20] Analytics, Analytics-Cluster, Analytics-Kanban: Spike: Consider alternatives to MirrorMaker: uReplicator, Confluent Replicator - https://phabricator.wikimedia.org/T190049#4061233 (Ottomata) p:Triage>High
[14:31:25] ack!
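Note on the output-path restriction mentioned at 14:29:48: the log does not show the code that performs the check, so the following is only a minimal sketch of what "alphanumeric + '_'" would mean for a partition-style folder name. The pattern, the function names and the example folder names are all assumptions made for illustration.

```python
import re

# Hypothetical stand-in for the restriction described in the chat:
# only ASCII letters, digits and '_' are accepted. The real check is not shown in the log.
SAFE_NAME = re.compile(r"[A-Za-z0-9_]+")


def is_safe_name(name: str) -> bool:
    """Return True when the whole name uses only letters, digits and '_'."""
    return SAFE_NAME.fullmatch(name) is not None


def sanitize(name: str) -> str:
    """Replace every character outside the allowed set with '_'."""
    return re.sub(r"[^A-Za-z0-9_]", "_", name)


# Illustrative names only: a plain table-like name and a key=value partition folder.
plain = "cu_changes"
partitioned = "snapshot=2018-02"

print(is_safe_name(plain), is_safe_name(partitioned), sanitize(partitioned))
```

Under these assumptions a `key=value` folder is rejected because of `=` and `-`, which matches the remark that the partitioned folder could not be used as is and would need renaming.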
[14:31:28] on friday afternoon i blacklisted the change-prop topics as well
[14:31:36] mm has been a little more stable
[14:31:38] will watch it
[14:33:12] it was faster for mysql to dump with plain text, I think it used mysqldump directly, but I think Parquet will probably be similar to Avro in speed
[14:33:47] milimetric: there are chances yes, could even be a little slower: It'll use avro to convert to parquet I think
[14:34:41] that’s ok, and the path name I’m not 100% sure what you mean but if it makes the jobs work better we can rename whatever we need to
[14:37:54] milimetric: agreed - sqoop job would not be impacted, but reading job would
[14:38:14] milimetric: for python - I don't understand :(
[14:40:14] me neither, command runs fine on 1039
[14:40:37] yes
[14:40:40] same on an1067
[14:41:04] I think we need help from the masters milimetric - elukey / ottomata - If by chance you'd have a minute
[14:41:43] elukey/ottomata: We mere computer mortals are having a weird issue with python on analytics worker
[14:41:59] looking at something for petr atm, but ya in a bit
[14:42:09] in the meantime, --verbose :)
[14:43:27] Dan has tested an oozie job that uses the druid-loading subworkflow - And we got an error 127 (command not found)
[14:44:02] While this sub-workflow is currently working for many other jobs
[14:44:25] any specific host in which it ended up? analytics107* have a different version of python
[14:44:41] elukey: I thought about that - error experienced on an1039 and an1067
[14:44:49] but command not found seems a bit brutal
[14:44:50] Analytics, EventBus, MW-1.31-release-notes (WMF-deploy-2017-11-14 (1.31.0-wmf.8)), Patch-For-Review, Services (next): Timeouts on event delivery to EventBus - https://phabricator.wikimedia.org/T180017#4061305 (Ottomata) Most timeouts during normal operation seem to have been handled, but ther...
[14:44:58] do we have some logs to check?
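Note on "error 127 (command not found)": status 127 is what a POSIX shell, and `env`, return when the program to run cannot be found. The sketch below reproduces that status in two ways; it is not a diagnosis of the oozie failure, and the command name `no_such_command_for_demo` is made up. The second call hands `env` the string `python -- ` as a single program name, which is one way to get a message of the same shape as the one quoted at 13:38:07.

```python
import subprocess


def exit_status(argv):
    """Run argv without an intermediate shell and return the process exit status."""
    result = subprocess.run(
        argv,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode


# The shell cannot find the (made-up) command and reports "command not found".
shell_status = exit_status(["/bin/sh", "-c", "no_such_command_for_demo"])

# env receives "python -- " as ONE name, spaces included, and cannot find such a program.
env_status = exit_status(["/usr/bin/env", "python -- "])

print(shell_status, env_status)
```

On a typical Linux host both values should be 127, which is why the status alone says little about which program or argument was actually missing.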
[14:46:32] in the mean-time milimetric - I'd like to change the folder structure of your patch for oozie
[14:46:45] elukey: nothing more really - Will find you a url
[14:47:26] because the "command not found" is super generic and can mean anything
[14:47:37] elukey: https://yarn.wikimedia.org/jobhistory/logs/analytics1039.eqiad.wmnet:8041/container_e63_1520532368078_34310_01_000004/attempt_1520532368078_34310_m_000000_0/hdfs/stdout/?start=0
[14:48:52] what is it trying to execute? Can I see the subworkflow?
[14:50:27] elukey: https://github.com/wikimedia/analytics-refinery/tree/master/oozie/util/druid/load
[14:51:06] thanks :)
[14:57:10] elukey: This means err127 could be because of wrong params for instance?
[14:57:36] joal: oh! is it that I'm not passing a template_file but the literal string "template_file"?
[14:58:01] Could very well be milimetric
[14:58:05] I missed that first time I looked at the params but I would've expected the error to be something in python not command not found
[14:59:20] joal: as for folder structure, you can tell me and I'll change it
[14:59:36] but either way I want to make it work first
[15:09:40] (CR) Mforns: Add hi.wikimedia and zh.wikidata to whitelist (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/420346 (owner: Fdans)
[15:11:12] (PS2) Fdans: Add hi.wikimedia and zh.wikidata to whitelist [analytics/refinery] - https://gerrit.wikimedia.org/r/420346
[15:11:44] (CR) Fdans: Add hi.wikimedia and zh.wikidata to whitelist (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/420346 (owner: Fdans)
[15:29:43] Analytics: Make 'metric' field not a partition in mediawiki_metrics - https://phabricator.wikimedia.org/T190058#4061502 (mforns)
[15:32:03] Analytics: Turn off old geowiki jobs - https://phabricator.wikimedia.org/T190059#4061513 (Nuria)
[15:33:10] Analytics, Analytics-Kanban: Write agreggation job for eventlogging page preview data - https://phabricator.wikimedia.org/T188310#4061532 (Nuria)
[15:35:45] ping joal
[15:37:11] holaaa joal
[15:39:13] Analytics, User-Elukey: latest varnishkafka fails to build on Debian - https://phabricator.wikimedia.org/T186250#4061550 (Jrdnch) I can submit a patch. Thanks for the man page, that's helpful. My patch would probably just remove the `_BSD_SOURCE` macro, since `daemon()` has been in `_DEFAULT_SOURCE` sin...
[15:47:18] Analytics-Kanban, User-Elukey: Reboot all Analytics hosts for Kernel upgrade - https://phabricator.wikimedia.org/T188594#4061583 (elukey)
[16:03:11] Analytics, Analytics-Kanban: Write agreggation job for eventlogging page preview data - https://phabricator.wikimedia.org/T188310#4061649 (Nuria) * source data is on eventlogging * there is a refine success flag you can depend on * there are transformations needed similar to refine data transformations...
[16:34:32] milimetric: reviewed the thing a bit more - Given how the refinement of cu_changes into geo_wiki works, no need for _PARTITIONED flag - Success should be enough (and is actually currently not set I think)
[16:56:06] ottomata: can we come up with a role-name for stat1005?
[16:56:35] statistics::explorer::private?
[16:57:06] elukey: I was trying to find a joke, but didn't find it :(
[16:57:18] ahahah
[17:03:56] ooo just found https://github.com/so-fancy/diff-so-fancy, pretty nice!
[17:04:09] elukey: in meeting now hm
[17:04:28] elukey: we already have a role name, no?
[17:04:28] statistics::private
[17:04:29] ?
[17:05:12] ottomata: and I can put in there all the other includes lying around in site.pp ?
[17:05:28] ya i think so, stat1005 is only node that uses that role
[17:05:41] super, tomorrow you'll get a code review :)
[17:40:04] fdans: will be working on datetime patch a bit more today, i think we should merge that one before we look at mobile changes
[17:40:29] fdans: as it will fix three/four bugs that affect production version cc milimetric mforns
[17:40:37] let me know if you disagree
[17:40:49] k
[17:43:17] it’s fine either way nuria_, probably no merge conflicts
[17:46:39] nuria_: I’ll prioritise reviewing your change to mobile changes, but as milimetric I don’t think there’ll be conflicts
[18:05:55] going off!
[18:05:58] byyeee
[19:15:38] fdans: not so worried about conflicts, rather i do not want to announce the mobile work to be done and when you load the dashboard there are a couple of quite visible bugs about dates, i'd rather those are fixed by the time we call out the mobile work
[19:17:11] that’s fair nuria_
[19:17:27] fdans: ok!
[19:17:36] fdans: back to coding after my meetins
[19:17:39] *meetings
[19:21:24] (CR) Mforns: "I played around in the canary from my Android phone and it looks great!! The menus work great, the detail pages are awesome!" (8 comments) [analytics/wikistats2] - https://gerrit.wikimedia.org/r/416999 (owner: Fdans)
[19:23:01] (CR) Nuria: "@fdans: please add bug to commit message. @mforns: can you add screenshots of issues? that way is easier to verify they are fixed" [analytics/wikistats2] - https://gerrit.wikimedia.org/r/416999 (owner: Fdans)
[19:40:01] fdans, what is the corresponding task?
[19:56:53] (CR) Mforns: "As I'm not sure which task this is, I paste the screenshots for my previous indications here:" [analytics/wikistats2] - https://gerrit.wikimedia.org/r/416999 (owner: Fdans)
[20:09:36] Analytics, Performance-Team (Radar): Possible statsv corruption? - https://phabricator.wikimedia.org/T189530#4062449 (Krinkle)
[20:23:48] Analytics-Kanban, MW-1.31-release-notes (WMF-deploy-2018-02-27 (1.31.0-wmf.23)), Patch-For-Review: Record and aggregate page previews - https://phabricator.wikimedia.org/T186728#4062484 (Tbayer) >>! In T186728#4057522, @Ottomata wrote: > For reference, the EventLogging code that sets `userAgent.is_bo...
[21:22:34] (PS5) Nuria: Formats date objects always according to UTC timezone [analytics/wikistats2] - https://gerrit.wikimedia.org/r/417476 (https://phabricator.wikimedia.org/T189266) (owner: Fdans)
[21:23:21] please milimetric mforns fdans test https://gerrit.wikimedia.org/r/417476 to make sure things are consistent timezone wise, code is ready for CR
[21:23:36] will do
[21:32:02] Analytics, Analytics-Wikistats: roadmap of migration to Wikistats 2 - https://phabricator.wikimedia.org/T183180#4062662 (Nuria)
[21:32:04] Analytics, Analytics-Wikistats: Remaining reports. - https://phabricator.wikimedia.org/T186121#4062661 (Nuria)
[21:34:09] Analytics, Analytics-Wikistats: Beta: Provide easier way of accessing metrics such as active editors as defined in Wikistats 1 - https://phabricator.wikimedia.org/T187806#4062665 (Nuria)
[21:38:59] (CR) Mforns: "Whenever I select a wiki in the wiki selector, I get lots of errors like this one in the console:" [analytics/wikistats2] - https://gerrit.wikimedia.org/r/417476 (https://phabricator.wikimedia.org/T189266) (owner: Fdans)
[21:58:58] mforns: thanks for testing, i think there must be a crazy race condition, will try to find it
[21:59:24] nuria_, it's within a nextTick call
[21:59:47] mforns: on meeting can talk in a bit
[22:33:54] bye teaaam :]
[22:52:23] (PS11) Milimetric: Compute geowiki statistics from cu_changes data [analytics/refinery] - https://gerrit.wikimedia.org/r/413265 (https://phabricator.wikimedia.org/T188113)
[23:01:15] chelsyx: oozie jobs that import pageviews into druid: https://github.com/wikimedia/analytics-refinery/tree/master/oozie/pageview/druid/daily
[23:01:52] chelsyx: mediawiki history imports into druid: https://github.com/wikimedia/analytics-refinery/tree/master/oozie/mediawiki/history/druid
[23:03:08] chelsyx: example query from superset for mw history into druid
[23:04:37] nuria_: Thank you!!!!
[23:06:17] chelsyx: take a look at superset 1st, there are plenty of docs: https://github.com/apache/incubator-superset
[23:06:25] chelsyx: as always let us know
[23:07:00] nuria_: Definitely! Thank you!
[23:07:05] chelsyx: jon mentioned a ratio of UDdaily/UDmonthly, that we would need to ingest into druid fresh
[23:07:14] chelsyx: let me triple check that
[23:10:23] chelsyx: ya, indeed, that will require a new datasource on druid that can be exposed via superset
[23:10:51] chelsyx: or rather, i think we could add that to our daily indexing of unique devices
[23:11:55] nuria_: Cool! I will ask Jon to create a list of metrics he wants, and figure out what needs to be added to Druid. I will let you know
[23:13:45] chelsyx: i actually would add that ratio to our unique devices dataset on druid https://github.com/wikimedia/analytics-refinery/blob/master/oozie/unique_devices/per_project_family/druid/daily/generate_druid_unique_devices_per_project_family_daily.hql
[23:14:25] nuria_: okie dokie! Thanks!
[23:17:14] nuria_: This ticket is related I think: https://phabricator.wikimedia.org/T186828
[23:18:19] Analytics, Research: geowiki data for Global Innovation Index - 2017 - https://phabricator.wikimedia.org/T178183#4063100 (leila) @Ottomata I used to access staging using the following sample commands: > ssh stat1003.eqiad.wmnet > mysql --defaults-file=/etc/mysql/conf.d/research-client.cnf -hs1-a...
[23:21:58] Analytics, Research: geowiki data for Global Innovation Index - 2017 - https://phabricator.wikimedia.org/T178183#3683651 (Nuria) @leila: nuria@stat1006:~$ mysql --defaults-extra-file=/etc/mysql/conf.d/research-client.cnf -hanalytics-slave.eqiad.wmnet Welcome to the MariaDB monitor. Commands end with ;...
[23:23:46] nuria_: sorry T186828 needs to count uuid, not the same as unique devices. It will require a new datasource
[23:23:47] T186828: Productionize per-country daily & monthly active app user stats - https://phabricator.wikimedia.org/T186828
[23:29:47] Analytics, Research: geowiki data for Global Innovation Index - 2017 - https://phabricator.wikimedia.org/T178183#4063157 (leila) @Nuria Thanks! ( I was using "-hs1-analytics-slave.eqiad.wmnet" which based on your query I see I shouldn't have.) @Ottomata resolved! :)
[23:55:29] (PS12) Milimetric: Compute geowiki statistics from cu_changes data [analytics/refinery] - https://gerrit.wikimedia.org/r/413265 (https://phabricator.wikimedia.org/T188113)