[00:44:13] PROBLEM - Check the last execution of refinery-import-page-history-dumps on stat1007 is CRITICAL: connect to address 10.64.21.118 port 5666: Connection refused https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[00:45:25] PROBLEM - Check the last execution of archive-maxmind-geoip-database on stat1007 is CRITICAL: connect to address 10.64.21.118 port 5666: Connection refused https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[00:45:49] PROBLEM - Check the last execution of reportupdater-browser on stat1007 is CRITICAL: connect to address 10.64.21.118 port 5666: Connection refused https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[00:54:47] RECOVERY - Check the last execution of refinery-import-page-history-dumps on stat1007 is OK: OK: Status of the systemd unit refinery-import-page-history-dumps https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[00:55:57] RECOVERY - Check the last execution of archive-maxmind-geoip-database on stat1007 is OK: OK: Status of the systemd unit archive-maxmind-geoip-database https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[00:56:19] RECOVERY - Check the last execution of reportupdater-browser on stat1007 is OK: OK: Status of the systemd unit reportupdater-browser https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[06:12:04] morning!
[06:18:20] o/
[06:57:28] !log drop wmf_netflow from Analytics druid and restart the job with more dimensions
[06:57:30] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[06:58:01] the backfill is probably not needed in my opinion, since we had partial data up to a couple of days ago (some routers were not configured correctly)
[06:58:45] one thing that I noticed is that the hourly job calculates data with the range of -6h -> -5h before the current date
[06:59:22] but it could be different? see https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/531046/
[06:59:26] I'll wait for mforns :)
[06:59:35] (I am probably saying something silly)
[06:59:49] the current "issue" is that data in Turnilo takes a while to show up
[06:59:54] (recent data I mean)
[07:00:20] the good solution is to add real-time indexation :)
[07:00:35] buuut for the time being maybe having less lag is better too?
[07:03:12] brb
[07:42:00] Good morning :)
[07:44:48] o/
[07:45:17] elukey: how shall we start with Kerb?
[07:46:02] joal: in theory from https://wikitech.wikimedia.org/wiki/User:Elukey/Analytics/Hadoop_testing_cluster
[07:46:12] then basically coming up with things to test etc..
[07:46:22] I added notes about all the problems that we found
[07:47:19] ok elukey - will read and try to come up with whatever I think could be useful (possibly just a test plan :)
[07:47:30] ah, and possibly getting an account
[07:47:44] I don't think I have sent you the email with the tmp password for the account
[07:48:00] elukey: I don't know if accounts have been renewed but I had one before I left
[07:48:16] ah good
[07:48:25] just checked, yes
[07:48:34] we now have a script that sends an email with the tmp pass
[07:48:39] that you have to change upon first login
[07:48:43] so everything is automated
[07:48:52] ok
[07:49:16] ideally before enabling kerberos we'd need to roll out spark2.4
[07:49:30] so we could start from that?
[07:50:03] roll out meaning making sure everything works with it?
[07:50:18] no sorry, backtrack: spark2.3 with buster compatibility first
[07:50:19] https://phabricator.wikimedia.org/T229347
[07:50:43] stat1005 is basically ready to go, with buster
[07:50:54] Andrew left a package to roll out
[07:51:05] that we could do together, I feel a bit more comfortable :)
[07:51:10] (we can roll back if anything happens)
[07:51:22] no problem for me :)
[07:52:22] there is also https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/AMD_GPU that is interesting
[07:52:25] if you didn't see it
[07:52:28] now everything works :)
[07:53:18] joal: need to go out for an errand, 10/15 mins, will be back and roll out
[07:53:24] elukey: please :)
[07:53:30] should we drain spark jobs first? Not sure if needed
[07:53:31] I'll probably have questions for you
[07:54:41] (brb)
[08:06:19] Analytics, Analytics-Wikistats, Performance-Team: Piwik JS isn't cached - https://phabricator.wikimedia.org/T230772 (Gilles)
[08:06:32] Analytics, Analytics-Wikistats, Performance-Team: Piwik JS isn't cached - https://phabricator.wikimedia.org/T230772 (Gilles)
[08:09:40] back :)
[08:10:18] Hey :)
[08:10:32] elukey: I have a bunch of questions - shall we talk in da cave?
[08:10:36] sure!
[08:55:58] joal: does it look reasonable? https://etherpad.wikimedia.org/p/elukey-netflow
[08:59:05] Analytics, Analytics-Wikistats, Performance-Team: Piwik JS isn't cached - https://phabricator.wikimedia.org/T230772 (elukey) Nice catch, even if I was convinced that the static contents were cached by Varnish 24h if no Cache-Control (or similar) headers were found.. Maybe we have a specific pass pol...
[09:01:46] Analytics, Analytics-Wikistats, Performance-Team: Piwik JS isn't cached - https://phabricator.wikimedia.org/T230772 (elukey) @ema hi :) Are response headers like Cache-Control used by Varnish in case `caching: 'pass'` is configured?
[09:14:16] https://turnilo.wikimedia.org/#test_wmf_netflow
[09:14:18] \o/
[09:15:51] wow it is amazing
[09:17:57] so I guess that if it keeps working, I can just
[09:18:02] 1) kill the realtime indexation
[09:18:11] 2) restart it with the datasource 'wmf_netflow'
[09:18:22] since hourly/daily indexations will override the realtime data, right?
[09:19:20] (brb)
[09:21:28] seems very correct elukey :)
[09:28:18] Analytics, Product-Analytics, Reading Depth, Readers-Web-Backlog (Needs Product Owner Decisions): Reading_depth remove eventlogging instrumentation? - https://phabricator.wikimedia.org/T229042 (phuedx)
[09:38:20] joal: \o/ proceeding!
[09:38:28] elukey:
[09:38:41] elukey: I'm thinking of task time in relation to segment size
[09:39:08] elukey: can we leave the test job running for some time before proceeding, so that I get a better understanding?
[09:39:19] joal: ah sure, of course!
[09:42:04] elukey: I wonder about task duration being PT10M and segment granularity being 1H
[09:42:52] elukey: I have also found new information on segment optimization: we should take row count into account, not size
[09:43:59] finally elukey, the current Druid docs show an awesome UI - do we have plans to upgrade one of these days?
[09:44:04] * joal runs fast
[09:48:13] joal: we can definitely schedule one for next quarter!
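
For the record, the kill-and-restart plan above maps onto the overlord's supervisor API roughly as follows - a sketch only, run against the overlord on localhost:8090 as in the curl commands later in this log; the spec file name is hypothetical, and whether the stop endpoint is terminate or shutdown depends on the Druid version (as it turns out just below):

    # list the currently running supervisors
    curl localhost:8090/druid/indexer/v1/supervisor
    # stop the test supervisor (older Druid versions only have /shutdown)
    curl -X POST localhost:8090/druid/indexer/v1/supervisor/test_wmf_netflow/terminate
    # resubmit the spec, now pointing at the prod datasource
    curl -X POST -H 'Content-Type: application/json' \
         -d @wmf_netflow_supervisor.json \
         localhost:8090/druid/indexer/v1/supervisor
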
[09:48:51] elukey: I think having task duration lasting longer than segment granularity is better - having it last 10M makes no sense, and I think they actually last 1H
[09:49:13] ah sure, we can kill/change it
[09:49:51] elukey: Let's use PT6H for task duration - the only advantage of having small tasks is in case of error
[09:49:58] ack
[09:56:33] mmm so curl localhost:8090/druid/indexer/v1/supervisor/test_wmf_netflow/terminate -X POST -i doesn't seem to be working
[09:56:50] elukey: I think you need to wait for the task to finish
[09:57:38] not sure, since it tells me 404
[09:57:45] Ah - indeed
[09:57:59] hm
[09:58:18] https://druid.apache.org/docs/latest/development/extensions-core/kafka-ingestion.html is maybe for the latest upstream
[10:02:25] elukey: Worked for me with shutdown instead of terminate
[10:02:26] ah no, now it is gone
[10:02:29] ah!
[10:02:40] where did you find shutdown?
[10:02:41] elukey: probably not as clean though
[10:02:49] later https://druid.apache.org/docs/latest/operations/api-reference.html#supervisors
[10:03:09] ahahhah
[10:03:10] okok
[10:03:18] shall I restart it with PT6H?
[10:03:18] elukey: shutdown is being deprecated, so it's probably our thing :)
[10:03:47] elukey: please - you can actually restart with the prod datasource if you wish
[10:04:10] nice
[10:04:34] of course I didn't do it since I am stupid
[10:04:38] anyway, will kill again
[10:04:42] huhu
[10:05:28] ok done :)
[10:11:24] interesting, https://turnilo.wikimedia.org/#wmf_netflow seems zero
[10:13:38] but events are processed - https://grafana.wikimedia.org/d/000000538/druid?refresh=1m&orgId=1&var-datasource=eqiad%20prometheus%2Fanalytics&var-cluster=druid_analytics&var-druid_datasource=wmf_netflow&from=now-1h&to=now&panelId=41&fullscreen
[10:14:43] even more interesting
[10:14:44] ssh -L 9091:an-tool1007.eqiad.wmnet:80 an-tool1007.eqiad.wmnet
[10:14:51] it is different with the new version of Turnilo :D
[10:15:37] ahhh wait, count is zero
[10:15:53] okok now I can see the issue
[10:16:01] with realtime indexation, only the count measure is there
[10:16:04] for some reason
[10:16:17] meanwhile in the new version of turnilo all of them are there
[10:16:23] but count is still zero
[10:16:57] and in fact I haven't added the count measure to realtime indexation
[10:19:03] restarted
[10:19:33] working!
[10:19:41] but all the measures are only in the new turnilo
[10:23:31] elukey: how come they are only in the new one?
[10:27:02] elukey: I can see data in the new one
[10:27:08] the old one I mean, sorry
[10:29:53] Analytics, Analytics-Kanban: Wikistats: month on dashboard changes on any redraw - https://phabricator.wikimedia.org/T230514 (fdans) a: Milimetric→fdans
[10:32:27] joal: there is data but I can only see the 'count' measure in the old one
[10:32:39] elukey: ah yes - so can I
[10:32:59] didn't we have to specify the measures in turnilo for realtime banner impressions?
[10:33:03] (PS1) Fdans: Transition data rows to using time ranges instead of timestamps [analytics/wikistats2] - https://gerrit.wikimedia.org/r/531148 (https://phabricator.wikimedia.org/T230514)
[10:33:06] elukey: do we expect other measures?
[10:33:26] yeah, if you check in the new turnilo there is 'packets' and 'bytes'
[10:33:35] hm
[10:33:50] elukey: turnilo restart needed, or manual config update?
[10:34:18] joal: I think that we need to upgrade turnilo :D
[10:34:27] huhuhu :)
[10:34:38] ok :)
[10:34:50] gone errand, will be back in a bit
[10:34:54] o/
[10:36:09] lunch for me!
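
For reference, the knobs discussed above (task duration, segment granularity, and the count/bytes/packets measures) all live in the supervisor spec. A minimal sketch of what the hypothetical wmf_netflow_supervisor.json could contain, following the field layout of the Kafka ingestion docs linked above - the topic name, query granularity, and the netflow field names behind 'bytes' and 'packets' are assumptions, not taken from the real spec:

    {
      "type": "kafka",
      "dataSchema": {
        "dataSource": "wmf_netflow",
        "metricsSpec": [
          {"type": "count",   "name": "count"},
          {"type": "longSum", "name": "bytes",   "fieldName": "bytes"},
          {"type": "longSum", "name": "packets", "fieldName": "packets"}
        ],
        "granularitySpec": {"segmentGranularity": "HOUR", "queryGranularity": "MINUTE"}
      },
      "ioConfig": {"topic": "netflow", "taskDuration": "PT6H"}
    }
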
[12:08:35] hm - something is wrong with the netflow supervisor elukey - let's check when you get back
[12:11:52] I am here
[12:12:56] joal: what is wrong?
[12:14:01] empty segments for hours 11 and 12 elukey :(
[12:14:57] more info?
[12:15:00] :)
[12:15:18] nope, didn't look more
[12:15:39] I think the restart of the task (or the kill of the previous one) didn't succeed
[12:16:11] BUT - there is something I don't understand
[12:16:18] turnilo shows data!!
[12:17:32] It must be related to tasks not handing off segments as fast as previously - so segments don't show up, but the data is here
[12:17:39] Let's wait for more
[12:17:42] ack
[12:17:44] :)
[12:17:44] sorry for the ping eluke
[12:17:48] nono please!
[12:17:54] I was trying to understand :)
[12:18:06] I am trying to add jupyterhub to an-tool1006
[12:18:13] but it fails in weird ways
[12:18:15] lovely
[12:18:15] :D
[12:18:24] :S
[12:25:23] something must have changed since we first set it up on the notebooks
[12:25:42] elukey: buster?
[12:26:41] no no, it is stretch - I think I found the issue
[12:26:55] there is a symbolic link that should be in puppet (probably)
[12:27:23] working!
[12:27:24] :)
[12:27:31] ssh -N an-tool1006.eqiad.wmnet -L 8000:127.0.0.1:8000
[12:28:04] sort of, cannot login :
[12:28:05] :P
[12:28:29] User Elukey not in allowed groups (analytics-admins)
[12:28:31] looool
[12:28:34] joal: --^
[12:29:57] but it should work for you
[12:30:21] will try elukey
[12:30:47] elukey: another question - no more webrequest after august 16th - is it expected?
[12:31:27] nope, something broke
[12:31:32] Arf
[12:31:40] I think it was the last deployment
[12:31:47] hm
[12:32:10] notebook ok for me elukey (meaning, I'm in, will test later)
[12:32:21] super
[12:34:09] JA008: File does not exist: hdfs://analytics-test-hadoop/user/oozie/share/lib/lib_20190627073559/hive2/libfb303-0.9.3.jar
[12:34:20] wow
[12:34:55] when oozie complains about its own lib, I'm afraid
[12:35:06] there is /user/oozie/share/lib/lib_20190809093929
[12:35:33] but it is strange, it seems like a pruning of some sort
[12:37:02] so I think that it happened when Andrew installed spark2
[12:37:14] hm
[12:37:29] you mean spark2 for buster?
[12:37:42] no no, we don't have buster in there
[12:37:47] ah sorry
[12:37:49] the new version that works with both
[12:37:53] right
[12:38:52] probably not, the new version is 2.3.1-bin-hadoop2.6-4
[12:38:55] and I don't see it
[12:39:56] we have a thing called /usr/local/bin/spark2_oozie_sharelib_install
[12:40:04] that puppet sometimes executes
[12:40:52] but only if hdfs://analytics-test-hadoop/user/oozie/share/lib/lib_etc.. is not present
[12:41:01] so what I am wondering is if during a test it was removed
[12:41:04] and puppet re-created it
[12:42:12] but in this case it was
[12:42:12] drwxr-xr-x - oozie hadoop 0 2019-08-09 09:40 /user/oozie/share/lib/lib_20190809093929
[12:42:39] weird
[12:42:57] it feels as if the lib had not been registered by oozie
[12:45:10] I have restarted oozie and re-run the last failed hour, it seems to be passing the add_partition step
[12:46:51] joal: I think it was due to Andrew's testing, seems to be a one-off weird testing issue, never happened before
[12:46:59] possible
[12:47:12] I'll restart all the failed hours when we log off so it will not use all the cluster's resources :)
[12:47:25] (I meant now, so you can test :)
[12:47:49] makes sense - thanks elukey :)
[12:47:53] thank you!
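
On the sharelib issue above: the mismatch between what sits on HDFS and what Oozie has actually registered can be inspected, and usually fixed without a full restart, roughly like this - a sketch only, and the Oozie server URL here is an assumption:

    # compare the lib_<timestamp> dirs on HDFS with what Oozie has registered
    hdfs dfs -ls /user/oozie/share/lib
    oozie admin -oozie http://localhost:11000/oozie -shareliblist hive2
    # re-register the newest lib_<timestamp> directory without restarting Oozie
    oozie admin -oozie http://localhost:11000/oozie -sharelibupdate
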
[12:50:46] elukey: something else I have noticed: the spark2 folder in our cluster doesn't contain the examples anymore :(
[12:51:05] While it's no big deal in itself, it's useful for running tests
[12:51:48] wasn't aware of it, can you give me the path?
[12:51:54] I can check if it should be there or not
[12:52:34] on analytics1031, there is /usr/lib/spark for old spark1.6 and /usr/lib/spark2 for new spark - the former contains an examples folder
[12:53:21] elukey: --^
[12:53:22] helllooo team europeee
[12:53:28] holaaaa
[12:53:41] Hey! Good mornfternoon nuria :)
[12:53:50] holaaa joal !!!
[12:54:17] good to read nuria :) I hope you enjoyed the holidays (I certainly did :)
[12:55:46] Analytics-Kanban, Product-Analytics, Patch-For-Review: Make aggregate data on editors per country per wiki publicly available - https://phabricator.wikimedia.org/T131280 (Nuria) >Unless I'm very much mistaken, the described system will make it possible to determine the country of specific editors, in...
[12:58:47] Analytics-Kanban, Product-Analytics, Patch-For-Review: Make aggregate data on editors per country per wiki publicly available - https://phabricator.wikimedia.org/T131280 (Nuria) @Milimetric I think for our first release we can remove all countries mentioned on the surveillance report, we can work on...
[13:21:24] Analytics, Operations: Access to HUE for Mayakpwiki - https://phabricator.wikimedia.org/T229143 (Nuria) a: JAllemandou
[13:30:02] Analytics: Unable to access SWAP notebooks using LDAP - https://phabricator.wikimedia.org/T230627 (elukey) @cchen I'd suggest at this point to try to reset your password to something else, so we will see if it helps or not. From the logs' point of view it seems that the wrong password is inserted, but then not...
[13:33:46] a-team: if anybody of you has time please test superset/turnilo during these days :)
[13:34:07] (brb)
[13:34:24] hm - just noticed - we need to restart the webrequest-load bundle - the last restart has been done on the default queue instead of the production one
[13:42:41] elukey: i can test!
[13:42:49] elukey: on the staging host?
[13:46:32] joal: ah snap, didn't check - it was part of the hive2 actions move
[13:46:43] nuria: there's info in the emails, but yes :)
[13:46:59] an-tool1005 for superset, an-tool1007 for turnilo
[13:53:54] Analytics, Reading Depth: Publish aggregated reading time dataset - https://phabricator.wikimedia.org/T230642 (Nuria) >Hi Nuria. I'm proposing to start with a one-off release that I can handle easily. Sounds good, a one-off release is just a file with data on a public folder, no pipeline of any sort is...
[13:58:30] Analytics, Reading Depth: Publish aggregated reading time dataset - https://phabricator.wikimedia.org/T230642 (Nuria) Documenting any caveats with the data is important - for example, in a multi-tab browsing situation is this data of quality? If I open two tabs with wikipedia content, does the data take into...
[13:59:53] joal: thanks again for helping with DataGrip yesterday! I've documented the setup process here: https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Access#DataGrip
[14:07:12] Analytics, Product-Analytics, Reading Depth, Readers-Web-Backlog (Needs Product Owner Decisions): Reading_depth remove eventlogging instrumentation? - https://phabricator.wikimedia.org/T229042 (Nuria) @phuedx I vote also for disabling the instrumentation, can we use this ticket for this purpose?
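
Regarding the missing spark2 examples mentioned at 12:50 above: if the examples folder were shipped alongside /usr/lib/spark2, a typical smoke test would look roughly like this - a sketch only, the exact jar path is an assumption based on the layout described in the conversation:

    # assuming the examples jar shipped under /usr/lib/spark2/examples/jars/
    spark2-submit --master yarn \
        --class org.apache.spark.examples.SparkPi \
        /usr/lib/spark2/examples/jars/spark-examples*.jar 100
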
[14:22:38] Analytics, Analytics-Kanban, Patch-For-Review: Tune Wikistats 2 Varnish caching - https://phabricator.wikimedia.org/T230136 (Nuria) Open→Resolved
[14:22:52] hey alllll :]
[14:24:36] holaaa mforns
[14:25:03] hey nuria! welcome back
[14:25:15] grasias mforns
[14:30:47] Analytics-Kanban: Upgrade superset to 0.34 - https://phabricator.wikimedia.org/T230416 (Nuria) @nuria to test superset
[14:35:13] Analytics, Product-Analytics: Streamline Superset signup and authentication - https://phabricator.wikimedia.org/T203132 (Nuria) >caveat is that the email of the user created will be $uid@email.notfound I do not think that is a problem, I really cannot think of a superset feature that requires a true e-m...
[14:46:08] welcome back nuria!
[14:46:23] gracias bearloga
[14:55:39] (CR) Nuria: Add mediarequests hourly oozie job (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/529911 (https://phabricator.wikimedia.org/T229817) (owner: Fdans)
[14:58:11] nuria: thanks for taking a look - that file is just for the record, about how to backfill from the old mediacounts dataset, and I've changed the query substantially. I probably shouldn't have added it with this change, will post a change deleting it or updating it once I've backfilled from mediacounts
[14:59:08] hola fdans, I see - having a record of how the backfilling was done is helpful, so updating sounds good
[15:13:25] Analytics, Operations: Access to HUE for Mayakpwiki - https://phabricator.wikimedia.org/T229143 (Nuria) Assigning to @joal who has ops duty this week https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Access#Admin_Instructions_to_sync_a_Hue_account
[15:26:17] Analytics, Performance-Team, Research, Security-Team, WMF-Legal: A Large-scale Study of Wikipedia Users' Quality of Experience: data release - https://phabricator.wikimedia.org/T217318 (Nuria) Let's see, this dataset has no page info, nor timestamps - is that correct?
[15:38:50] Analytics, Performance-Team, Research, Security-Team, WMF-Legal: A Large-scale Study of Wikipedia Users' Quality of Experience: data release - https://phabricator.wikimedia.org/T217318 (Gilles) My bad. We probably want timestamp as well, but it can be very coarse (rounded to the hour is fine)...
[15:40:23] Analytics, Performance-Team, Research, Security-Team, WMF-Legal: A Large-scale Study of Wikipedia Users' Quality of Experience: data release - https://phabricator.wikimedia.org/T217318 (Gilles)
[15:49:45] Analytics, Performance-Team, Research, Security-Team, WMF-Legal: A Large-scale Study of Wikipedia Users' Quality of Experience: data release - https://phabricator.wikimedia.org/T217318 (Nuria) @Gilles it would be good to shift timestamps so this data cannot be linked (or rather, obviously lin...
[15:53:09] Analytics, Performance-Team, Research, Security-Team, WMF-Legal: A Large-scale Study of Wikipedia Users' Quality of Experience: data release - https://phabricator.wikimedia.org/T217318 (Gilles) Sure, we can shift the timestamps by an arbitrary amount. It would still prove lack of temporal cor...
[15:53:51] Thanks for the great doc bearloga !
[15:54:18] s mforns
[15:54:30] oops - wrong paste mforns
[15:54:36] heh
[15:55:16] s mforns
[15:55:21] xD
[15:55:32] Mwarf - I need to get back to using computers I guess
[15:58:11] hehehe
[16:09:39] where can i query kafka APIs from?
I can't seem to find any hosts that have things like `kafka-consumer-groups` from the kafka release installed
[16:09:57] only kafkacat, which while nice is missing things (like reporting the offset of a consumer group)
[16:11:13] ebernhardson: usually those are on the kafka hosts themselves, which we limit to ops
[16:11:15] hi ebernhardson - we're in standup; I'd say kafka-jumbo100X but elukey should confirm
[16:11:24] yes :)
[16:11:33] elukey: hmm, isn't that a bit of fake security? They just talk to the kafka APIs that anyone else can
[16:11:47] i mean, i could go download the appropriate deb, unpack it, copy the binaries anywhere, and they will talk to kafka
[16:11:54] ebernhardson: nono, I mean access to the hosts themselves, not talking about api security :)
[16:11:55] (but i won't, because we agreed not to do that :P)
[16:12:29] elukey: ahh, so we just don't install those anywhere else?
[16:12:37] stuff like kafka-consumer-groups is contained in the confluent kafka package IIRC, so we don't install it everywhere
[16:12:41] yes exactly
[16:12:54] but if you use stuff like kafka-python etc.. you can easily get that info
[16:12:59] but you'll need to code a bit :(
[16:13:07] if you need a one-off I can grab data for you
[16:14:23] elukey: i'm trying to see if jumbo-eqiad is correctly tracking the cirrussearch_updates_eqiad consumer group. I added a consumer, it says it joined the group, produced a message and .. the daemon doesn't report receiving anything
[16:14:55] it's for TopicPartition(topic='eqiad.swift.search_glent.upload-complete', partition=0)
[16:15:16] but the grafana dashboard for consumer lag reports no consumers for that topic, which makes me suspicious...
[16:15:39] ah I was about to check it
[16:17:07] ebernhardson: just to triple check, if you use kafkacat you can see the message sent, right?
[16:17:14] elukey: yup
[16:18:31] this daemon was recently re-written so maybe i mucked something up, but the logging from the python kafka consumer looks like it thinks it's talking to the right cluster and listening to the right partitions so... it's odd
[16:19:27] ebernhardson: kafka consumer-groups --list from kafka-jumbo1001 shows cirrussearch_updates_eqiad
[16:20:01] lemme see if I can find its status
[16:20:04] elukey: and the committed offset?
[16:20:08] should be either 0, 1 or 2
[16:22:07] so kafka consumer-groups --describe --group cirrussearch_updates_eqiad doesn't show me anything :D
[16:22:13] empty
[16:23:19] hmm, so indeed something weird going on somewhere :) The daemon also thinks it hasn't seen any messages, but i produced a new message at 15:57, after starting up the daemon and seeing it connect in the logs at about 15:55
[16:23:23] :S
[16:25:08] it's gotta be on my daemon's side somewhere... will look into what it's doing and add more logging
[16:27:02] ack, let me know if I can hel
[16:27:04] *help
[16:42:27] Analytics: Unable to access SWAP notebooks using LDAP - https://phabricator.wikimedia.org/T230627 (cchen) @elukey i updated the password and it's working! i am able to log into SWAP now. Thanks again!
[16:44:13] mayakpwiki: Hi! Would you mind trying to log in to hue.wikimedia.org?
[16:46:20] Analytics: Unable to access SWAP notebooks using LDAP - https://phabricator.wikimedia.org/T230627 (elukey) Open→Resolved a: elukey Good!
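
As mentioned above, kafka-python can fetch the same info with a bit of code, but for one-off checks the CLI tools on the brokers are quicker. A sketch of the checks performed in this debugging session - assuming the usual jumbo broker and port, and that the confluent package's kafka-consumer-groups is on the PATH (on the brokers it is wrapped as `kafka consumer-groups`):

    # list groups and inspect committed offsets (run on a jumbo broker)
    kafka-consumer-groups --bootstrap-server kafka-jumbo1001.eqiad.wmnet:9092 --list
    kafka-consumer-groups --bootstrap-server kafka-jumbo1001.eqiad.wmnet:9092 \
        --describe --group cirrussearch_updates_eqiad
    # kafkacat confirms messages are on the topic, but can't report group offsets
    kafkacat -C -b kafka-jumbo1001.eqiad.wmnet:9092 \
        -t eqiad.swift.search_glent.upload-complete -o beginning -e
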
[17:22:40] Analytics, Performance-Team, Research, Security-Team, WMF-Legal: A Large-scale Study of Wikipedia Users' Quality of Experience: data release - https://phabricator.wikimedia.org/T217318 (Nuria) >Another possibility is to only keep the ruwiki data, which has by far the largest traffic during th...
[17:23:31] Analytics, Performance-Team, Research, Security-Team, WMF-Legal: A Large-scale Study of Wikipedia Users' Quality of Experience: data release - https://phabricator.wikimedia.org/T217318 (Nuria) FYI that released one-off datasets get documented in meta like, for example: https://meta.wikimedia....
[17:24:37] Analytics, Performance-Team, Research, Security-Team, WMF-Legal: A Large-scale Study of Wikipedia Users' Quality of Experience: data release - https://phabricator.wikimedia.org/T217318 (Gilles)
[17:46:24] Analytics, Product-Analytics, Reading Depth, Patch-For-Review, Readers-Web-Backlog (Needs Product Owner Decisions): Reading_depth: remove eventlogging instrumentation - https://phabricator.wikimedia.org/T229042 (Jdlrobson)
[17:49:36] Analytics, Product-Analytics, Reading Depth, Patch-For-Review, Readers-Web-Backlog (Needs Product Owner Decisions): Reading_depth: remove eventlogging instrumentation - https://phabricator.wikimedia.org/T229042 (ovasileva) +1 on disabling for now and keeping the dataset.
[19:01:24] Analytics, Operations: Access to HUE for Mayakpwiki - https://phabricator.wikimedia.org/T229143 (JAllemandou) Action has been taken that should have granted access to shell username `Mayakpwiki`. @Mayakp.wiki can you test please? :)
[19:28:57] Analytics, Operations: Access to HUE for Mayakpwiki - https://phabricator.wikimedia.org/T229143 (Mayakp.wiki) Checked connection and ran queries against mediawiki history. Access is working as expected. Thanks @JAllemandou and @Nuria for your help !
[19:38:16] Analytics, Operations: Access to HUE for Mayakpwiki - https://phabricator.wikimedia.org/T229143 (JAllemandou) Open→Resolved
[20:04:39] (PS4) Mforns: [WIP] Add Oozie job for mediawiki history dumps [analytics/refinery] - https://gerrit.wikimedia.org/r/530002 (https://phabricator.wikimedia.org/T208612)
[20:05:24] (CR) Mforns: [C: -2] "Finally, this seems to work! But still need to write the README." [analytics/refinery] - https://gerrit.wikimedia.org/r/530002 (https://phabricator.wikimedia.org/T208612) (owner: Mforns)
[20:15:44] (PS5) Mforns: [WIP] Add Oozie job for mediawiki history dumps [analytics/refinery] - https://gerrit.wikimedia.org/r/530002 (https://phabricator.wikimedia.org/T208612)
[20:20:46] (PS6) Mforns: [WIP] Add Oozie job for mediawiki history dumps [analytics/refinery] - https://gerrit.wikimedia.org/r/530002 (https://phabricator.wikimedia.org/T208612)
[20:30:13] (PS3) Mforns: [WIP] Add spark job to create mediawiki history dumps [analytics/refinery/source] - https://gerrit.wikimedia.org/r/528504 (https://phabricator.wikimedia.org/T208612)
[20:31:22] (CR) Mforns: [C: -2] "This seems to work now! But we still have to agree on a final dumps format (splits) and add some detailed docs."
[analytics/refinery/source] - https://gerrit.wikimedia.org/r/528504 (https://phabricator.wikimedia.org/T208612) (owner: Mforns)
[20:44:43] Analytics, Fundraising-Backlog, Fundraising Sprint Q 2019: Identify source of discrepancy between HUE query in Count of event.impression and druid queries via turnilo/superset - https://phabricator.wikimedia.org/T204396 (DStrine)
[20:58:01] milimetric: re T224459, I just reviewed it and it's ready to go. Thanks for your edits. 2 points: I'll send the email to wiki-research-l and we will do some more pushes via our personal contacts and twitter. I'll have to give them 2 weeks' time, so changing the deadline to September 3, ok?
[20:58:01] T224459: Recommend the best format to release public data lake as a dump - https://phabricator.wikimedia.org/T224459
[21:07:31] Analytics, Research: Recommend the best format to release public data lake as a dump - https://phabricator.wikimedia.org/T224459 (leila) @Milimetric thanks! I added two questions at the end: field of research and email address. I changed the deadline to 2019-09-03 and started advertising for it now.
[22:47:43] Analytics, Product-Analytics, Reading Depth, Patch-For-Review, Readers-Web-Backlog (Needs Product Owner Decisions): Reading_depth: remove eventlogging instrumentation - https://phabricator.wikimedia.org/T229042 (kzimmerman) @Groceryheist here's the proposal for SessionLength, which we want t...
[23:22:50] Analytics, Product-Analytics, Reading Depth, Patch-For-Review, Readers-Web-Backlog (Readers-Web-Kanbanana-2019-20-Q1): Reading_depth: remove eventlogging instrumentation - https://phabricator.wikimedia.org/T229042 (Jdlrobson)
[23:34:14] Analytics, Product-Analytics, Reading Depth, Patch-For-Review, Readers-Web-Backlog (Readers-Web-Kanbanana-2019-20-Q1): Reading_depth: remove eventlogging instrumentation - https://phabricator.wikimedia.org/T229042 (Jdlrobson) It's off: https://grafana.wikimedia.org/d/000000566/overview?panel...