[07:07:18] morning! I'm dealing with the failed cron, no worries -.-
[07:08:29] fdans: o/
[07:08:45] I have received an email for /user/fdans/etc.. on root@
[07:09:04] with yesterday's failure
[07:09:11] you may have it also in your crontab
[07:09:40] also, do you want to have the emails sent to analytics-alerts@ or do you prefer your email?
[07:10:57] elukey: analytics-alerts is prob better, I just need to test this once more and make sure it runs correctly
[07:12:31] fdans: ack, I just commented out the crontab for your user on stat1007
[08:10:33] Morning team :)
[08:13:44] o/
[09:21:18] (CR) Joal: [C: -1] "This should work for backfilling but not for regular daily jobs: jobs are materialized at the beginning of the day (coord.actualTime), and" [analytics/refinery] - https://gerrit.wikimedia.org/r/549876 (owner: Fdans)
[09:29:56] (CR) Joal: Add python oozie lib and oozie-dumper script (3 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/549861 (https://phabricator.wikimedia.org/T237271) (owner: Joal)
[09:30:20] (PS3) Joal: Add python oozie lib and oozie-dumper script [analytics/refinery] - https://gerrit.wikimedia.org/r/549861 (https://phabricator.wikimedia.org/T237271)
[09:31:08] Wow elukey - I can tell stat1004 is kerberized :)
[09:35:33] (PS4) Joal: Add python oozie lib and oozie-dumper script [analytics/refinery] - https://gerrit.wikimedia.org/r/549861 (https://phabricator.wikimedia.org/T237271)
[09:36:47] joal: hahahaha
[09:36:55] is it too much or understandable?
[09:37:01] I also created a user guide
[09:38:05] This is great elukey - Very noticeable (that's the whole point), and understandable
[09:38:31] elukey: possibly we wait for kerberos to be deployed before updating all our hosts with the message?
[09:38:36] maybe not though :0
[09:42:35] now is the right time, all the Kerberos tools are available and working, so people can already familiarise
[09:44:29] yep exactly, this was my thought as well
[09:44:54] hosts are all kerberized, and in the UserGuide I added a note about the fact that it is not enabled yet in prod
[09:45:22] ok :)
[09:54:54] * joal can feel Kerberos' fetid breath on his neck
[09:59:52] omg joal
[10:01:09] yes fdans?
[10:01:31] joal: nah the breath comment left an impression of me :D
[10:01:34] D:
[10:01:47] impression on me*
[10:47:22] ah joal, little problem: the oozie dump script doesn't work with kerberos
[10:47:34] http needs to be instructed to use spnego probably
[10:47:50] that is not a problem for the migration
[10:48:38] the oozie http api will require kerberos (good since it will properly authenticate api requests)
[10:48:45] reading
[10:48:52] about spnego elukey :)
[10:49:03] Will update the script - thanks for pointing it out!
[10:49:56] maybe we can add a flag like --use-kerberos or similar
[10:49:59] np :)
[10:50:05] the rest looks perfect
[10:50:16] thanks for the review :)
[11:27:42] PROBLEM - Check the last execution of eventlogging_db_sanitization on db1107 is CRITICAL: NRPE: Command check_check_eventlogging_db_sanitization_status not defined https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[11:28:11] this is me --^
[11:34:18] PROBLEM - Check the last execution of eventlogging_db_sanitization on db1108 is CRITICAL: NRPE: Command check_check_eventlogging_db_sanitization_status not defined https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[11:35:07] same thing :)
[11:42:10] Analytics, Analytics-EventLogging, Analytics-Kanban, Patch-For-Review: Rerun sanitization before archiving eventlogging mysql data - https://phabricator.wikimedia.org/T236818 (elukey)
[11:43:51] Analytics, Analytics-EventLogging, Analytics-Kanban: Archive data on eventlogging MySQL to analytics replica before decomisioning - https://phabricator.wikimedia.org/T231858 (elukey) Sanitization done! The remaining step seems to be to take a mysql dump of both log databases (on db1107 and db1108) a...
[11:47:48] * elukey lunch!
[13:51:44] elukey: I've been reading a bit on spnego - Would you have a minute to talk with me about that?
[13:55:31] joal: sure
[13:55:40] in the cave or here?
[13:55:50] elukey: cave seems easier if you have time
[13:56:04] sure
[14:43:33] elukey: The refinery rsync function has worked great with kerb - I assume we need a wrapper to be able to use it as a script, right?
[14:43:45] \o/
[14:44:24] joal: does it need all of refinery to work? Because deploying it to labstore nodes might be problematic
[14:44:36] elukey: I tried it both ways: local-to-hdfs and hdfs-to-local - There are interesting details: hidden files are not copied
[14:44:48] elukey: it needs refinery-python :(
[14:45:19] elukey: it feels like we will want to separate refinery-jars from refinery-python - But some of it is intertwined
[14:45:30] yeah I can imagine
[14:45:45] For instance the camus wrapper uses a jar ...
[14:51:21] elukey: can scap deploy parts of repos? or could we have hierarchical scap projects (subprojects depending on parent ones)?
[14:54:38] Also elukey - Shall I use that to implement spnego for the script? https://github.com/pythongssapi/requests-gssapi
[14:56:14] yep seems good!
[14:56:32] no idea about scap.. maybe we could have git submodules, and deploy them separately if needed
[14:58:32] dropping for kids - back for standup
[15:26:01] elukey: do you recall anything about deploying wikistats 1?
[15:26:12] would just running the ol' puppet on thorium do the trick?
[15:26:27] I want to deploy a lil change in index.html
[15:27:24] fdans: good question, I don't recall
[15:27:43] but we can try with puppet
[15:28:04] elukey: yesss let's do it
[15:35:28] fdans: how should we proceed? do you have a change ready or is it still in progress?
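
For context on the SPNEGO discussion above: a minimal sketch, in Python, of how the oozie-dumper script could call the Oozie REST API through requests-gssapi behind an opt-in flag. The --use-kerberos flag name, the default Oozie URL and the endpoint shown here are illustrative assumptions, not the actual refinery code; with Kerberos enabled, running it would also require a valid ticket (kinit) on the client host.

    # Sketch only: SPNEGO (Kerberos) auth for Oozie REST calls via requests-gssapi.
    # The URL, flag name and endpoint are assumptions for illustration.
    import argparse
    import requests
    from requests_gssapi import HTTPSPNEGOAuth, OPTIONAL

    def fetch_oozie_job(oozie_url, job_id, use_kerberos=False):
        """Return the JSON description of an Oozie job, optionally authenticating with SPNEGO."""
        auth = HTTPSPNEGOAuth(mutual_authentication=OPTIONAL) if use_kerberos else None
        resp = requests.get(
            '{}/v2/job/{}'.format(oozie_url.rstrip('/'), job_id),
            params={'show': 'info'},
            auth=auth,
        )
        resp.raise_for_status()
        return resp.json()

    if __name__ == '__main__':
        parser = argparse.ArgumentParser(description='Dump an Oozie job definition (sketch).')
        parser.add_argument('job_id')
        parser.add_argument('--oozie-url', default='http://an-coord1001.eqiad.wmnet:11000/oozie')
        parser.add_argument('--use-kerberos', action='store_true',
                            help='Authenticate with SPNEGO (requires a valid kinit ticket).')
        args = parser.parse_args()
        print(fetch_oozie_job(args.oozie_url, args.job_id, args.use_kerberos))
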
[15:35:37] (just to understand if you are waiting for me or not)
[15:35:43] elukey: already merged :)
[15:35:49] ah ok
[15:36:53] elukey: I also had the devious idea of just replacing the file on thorium manually, but I thought I'd try the nice way first
[15:36:56] fdans: forced the puppet run but I didn't see any git pull
[15:37:13] so there is probably another way
[15:37:36] elukey: I feel like it's rsynced, because the folder on thorium isn't a git repo
[15:38:28] yep
[15:39:42] ah yes it is
[15:39:51] lemme find the info in puppet
[15:40:22] elukey: could it be that Erik just uploaded the files?
[15:40:38] fdans: it seems that Erik was able to rsync before, but now we have disabled it
[15:41:04] elukey: ok, do I have permission to just overwrite the file?
[15:41:22] * fdans smiles nervously
[15:41:27] fdans: +1, log it in here though
[15:41:38] thank you elukey
[15:42:10] !log manually overwriting index.html in Wikistats 1 to apply patch https://gerrit.wikimedia.org/r/#/c/analytics/wikistats/+/550338/
[15:42:12] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[15:43:27] be back in a bit :)
[15:47:07] (PS1) Mforns: Add data quality metric: traffic variations per country [analytics/refinery] - https://gerrit.wikimedia.org/r/550498 (https://phabricator.wikimedia.org/T234484)
[15:47:43] (CR) Mforns: [V: +2] Refactor data_quality oozie bundle to fix too many partitions [analytics/refinery] - https://gerrit.wikimedia.org/r/547320 (https://phabricator.wikimedia.org/T235486) (owner: Mforns)
[15:57:40] mforns: o/
[15:57:56] sanitization completed on both log databases, I have removed all our work :(
[16:06:53] (CR) Nuria: [C: -1] "Before merging these code and queries let's please study that the data we are pulling has a signal, for which we probably need to run the " [analytics/refinery] - https://gerrit.wikimedia.org/r/550498 (https://phabricator.wikimedia.org/T234484) (owner: Mforns)
[16:08:04] elukey, \o/
[16:08:28] eventlogging_cleaner will always exist in our hearts
[16:10:49] (CR) Nuria: [C: -1] Add query to track WDQS updater hitting Special:EntityData (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/549859 (https://phabricator.wikimedia.org/T218998) (owner: Ladsgroup)
[16:15:53] mforns: INDEED
[16:16:19] heh
[16:16:48] mforns: for the quality alarms, we still do not have the code that would alarm on sudden changes of entropy, right?
[16:17:03] mforns: for teh UA
[16:17:05] *the
[16:17:18] nuria, no no
[16:17:27] that would be another job
[16:17:33] on the same workflow
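
As background for the entropy alarming idea mentioned above, here is a hypothetical sketch of what such a check could look like: compute the Shannon entropy of each day's user-agent distribution and alarm when it deviates sharply from a trailing baseline. The column names, the 30-day window and the 3-sigma threshold are assumptions for illustration, not the eventual data_quality job.

    # Hypothetical sketch: alarm on sudden changes in the entropy of a user-agent
    # distribution. Column names and thresholds are assumptions only.
    import math
    import pandas as pd

    def shannon_entropy(counts):
        """Shannon entropy (in bits) of an iterable of counts."""
        total = sum(counts)
        probs = [c / total for c in counts if c > 0]
        return -sum(p * math.log2(p) for p in probs)

    def entropy_by_day(df):
        """Series of daily entropies of the user-agent distribution."""
        return df.groupby('day')['view_count'].apply(shannon_entropy)

    def alarms(daily_entropy, window=30, n_sigmas=3):
        """Days whose entropy deviates more than n_sigmas from the trailing mean."""
        mean = daily_entropy.rolling(window).mean().shift(1)
        std = daily_entropy.rolling(window).std().shift(1)
        deviation = (daily_entropy - mean).abs()
        return daily_entropy[deviation > n_sigmas * std]

    if __name__ == '__main__':
        # Tiny illustrative input: per-day view counts per user-agent family.
        df = pd.DataFrame({
            'day': ['2019-11-12'] * 3 + ['2019-11-13'] * 3,
            'user_agent': ['Chrome', 'Firefox', 'Safari'] * 2,
            'view_count': [50, 30, 20, 90, 5, 5],
        })
        print(entropy_by_day(df))
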
[16:19:03] mforns: let's do that before jumping into pageviews per country, let me know if you disagree. I think we can try the idea of removing the top 20% of pages, but still the underlying data series will be very seasonal (my bet), so I think we need to study a bit what threshold of pages to remove, if any, and consult with jelel
[16:19:34] (CR) Ladsgroup: Add query to track WDQS updater hitting Special:EntityData (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/549859 (https://phabricator.wikimedia.org/T218998) (owner: Ladsgroup)
[16:22:38] just added a pic to the Kerberos user guide https://wikitech.wikimedia.org/wiki/Analytics/Systems/Kerberos/UserGuide#High_level_overview
[16:22:47] will add some words about it
[16:27:34] elukey: ooohhh
[16:27:39] nuria, I already finished the query with: 1) top 20% articles ignored and 2) pageviews per country normalized by their continent's pageviews, which will reduce seasonality
[16:28:25] mforns: I think we should determine empirically whether 20% is a good threshold
[16:28:30] nuria, you can take a look at the metric in Superset: https://superset.wikimedia.org/superset/dashboard/73/
[16:28:36] nuria, sure!
[16:29:38] the dashboard for traffic_per_country is being backfilled, will take a bit
[16:30:00] mforns: and normalizing by continent also requires proof that it works, I do not see that it would for a country like Cuba
[16:31:08] mforns: I think we should test the assumptions we made in Jupyter and use a known event (of censorship or traffic decline) to see how things look, so at this time I feel the Superset dashboard is a bit premature
[16:31:31] nuria, yes yes I saw your CR, makes sense
[16:32:08] a-team: so now we're officially retiring Wikistats 1 in January 2020: https://stats.wikimedia.org/
[16:32:21] yes, normalizing Cuba by continent would not be ideal, but in general I thought that it would be a better normalization than global
[16:32:36] fdans, O.o!
[16:32:39] mforns: so let's do two things: 1) let's work with these events and our timeseries: https://office.wikimedia.org/wiki/Analytics/MonitoringWikipediaAccessibilityAroundTheWorld
[16:33:22] and see what we get in terms of metrics, and 2) let's work on the alarming scripts for UA entropy, which we know has a signal we can alarm on
[16:34:25] mforns: my hunch is that normalizing by continent is not effective, but you can prove me wrong
[16:34:47] nuria, yea I'm also not super happy with that, let's see
[16:35:07] you think it would be better to normalize by global?
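
To make the metric under discussion concrete, here is a rough pandas sketch of the two steps mforns describes (ignore the top 20% of articles by views, then normalize each country's daily pageviews by its continent's total). The column names are assumptions; the actual change lives in analytics/refinery (https://gerrit.wikimedia.org/r/550498), not in this snippet.

    # Hypothetical sketch of the traffic-per-country metric discussed above:
    # 1) drop the top 20% of articles by pageviews, 2) normalize each country's
    # daily pageviews by its continent's total. Column names are assumptions.
    import pandas as pd

    def traffic_per_country(df, top_articles_fraction=0.20):
        """df has columns: day, country, continent, article, views."""
        # 1) Ignore the most-viewed articles (they drive much of the seasonality).
        totals = df.groupby('article')['views'].sum().sort_values(ascending=False)
        top_articles = set(totals.head(int(len(totals) * top_articles_fraction)).index)
        filtered = df[~df['article'].isin(top_articles)]

        # 2) Aggregate per day/country and normalize by the continent's daily total.
        per_country = (filtered.groupby(['day', 'continent', 'country'])['views']
                       .sum().reset_index())
        continent_totals = (per_country.groupby(['day', 'continent'])['views']
                            .transform('sum'))
        per_country['share_of_continent'] = per_country['views'] / continent_totals
        return per_country
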
[16:35:16] mforns: let's work in Jupyter first in this case, we have all the data we need so we can try out different things there faster
[16:35:30] mforns: for entropy we needed to gather the data, but in this case we have it
[16:36:02] sure
[16:42:57] (CR) Nuria: [C: -1] Add query to track WDQS updater hitting Special:EntityData (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/549859 (https://phabricator.wikimedia.org/T218998) (owner: Ladsgroup)
[16:50:26] fdans: I am on Chrome and I don't see any banner indicating that
[16:50:31] might be that I have a page cached
[16:51:09] ah there you go, after some refreshes it popped up
[16:52:17] !log forced a purge in Varnish for the stats.wikimedia.org front page to pick up the new deprecation banner
[16:52:18] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[16:52:20] fdans: --^
[16:52:38] oh thanks elukey :)
[17:01:26] ping ottomata, mforns standdduppp
[17:20:57] Analytics: Request for a large request data set for caching research and tuning - https://phabricator.wikimedia.org/T225538 (Danielsberger) Thanks @lexnasser, that's very helpful. Is there a way to check how frequently submit queries happen for hosts other than upload?
[17:43:16] Analytics: Request for a large request data set for caching research and tuning - https://phabricator.wikimedia.org/T225538 (lexnasser) @Danielsberger I'm not sure if there's a public-facing way to check the frequency of submit queries. Will have to defer to @Nuria about that. That said, I believe that *....
[17:48:52] Analytics, Analytics-Wikistats: Wikistats data discrepancy for India page views from hive data pull - https://phabricator.wikimedia.org/T237579 (Iflorez) This is helpful clarity as we work with various data systems. I will read the AQS endpoints documentation. Thank you!
[17:51:11] Analytics, Analytics-Wikistats: Wikistats data discrepancy for India page views from hive data pull - https://phabricator.wikimedia.org/T237579 (Iflorez) @pdas This is the documentation for Wikistats 2: https://wikitech.wikimedia.org/wiki/Analytics/Systems/Wikistats_2 I will also be reading through this...
[18:05:41] Analytics, Desktop Improvements, Event-Platform, Readers-Web-Backlog (Kanbanana-2019-20-Q2): [SPIKE 8hrs] How will the changes to eventlogging affect desktop improvements - https://phabricator.wikimedia.org/T233824 (Jdrewniak) a: Jdrewniak
[18:36:12] #wikimedia-analytics: need to perform maintenance on analytics1062, would like some assistance depooling the host
[18:37:11] Analytics, DBA: Repurpose db1107 as a generic database - https://phabricator.wikimedia.org/T238113 (elukey)
[18:37:21] nuria: --^ :)
[18:45:40] Analytics, Analytics-Wikistats: Wikistats data discrepancy for India page views from hive data pull - https://phabricator.wikimedia.org/T237579 (Nuria) Wikistats is a limited UI over a more capable API (AQS the analytics query service) so what wikistats offers and AQS can do are different things. Data so...
[18:47:33] elukey: ya? db1107?
[18:48:25] elukey: tears come to my eyes when thinking of getting rid of those boxes, not that we do not LOVE every one of our boxes
[18:49:35] :)
[19:03:02] msg ottomata MEP meeting?
[19:49:30] (PS1) Joal: Add hdfs-rsync script based on Hdfs python lib [analytics/refinery] - https://gerrit.wikimedia.org/r/550536 (https://phabricator.wikimedia.org/T234229)
[19:53:39] ok team - leaving for dinner
[20:07:26] Analytics, Better Use Of Data, Event-Platform, Product-Infrastructure-Team-Backlog, Epic: Vertical: Virtualpageview datastream on MEP - https://phabricator.wikimedia.org/T238138 (Nuria)
[20:07:40] Analytics, Better Use Of Data, Event-Platform, Product-Infrastructure-Team-Backlog, Epic: Vertical: Virtualpageview datastream on MEP - https://phabricator.wikimedia.org/T238138 (Nuria) Helpful notes on etherpad: https://etherpad.wikimedia.org/p/event-platform
[20:09:58] ottomata: please modify as needed: https://phabricator.wikimedia.org/T238138
[20:12:00] Analytics, Analytics-Cluster, Operations, ops-eqiad: analytics1062 lost one of its power supplies - https://phabricator.wikimedia.org/T237133 (ops-monitoring-bot) Icinga downtime for 2:00:00 set by otto@cumin1001 on 1 host(s) and their services with reason: analytics1062 lost one of its power sup...
[20:59:59] (CR) Ladsgroup: Add query to track WDQS updater hitting Special:EntityData (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/549859 (https://phabricator.wikimedia.org/T218998) (owner: Ladsgroup)
[21:46:37] Analytics-EventLogging, Analytics-Kanban, Event-Platform, CPT Initiatives (Modern Event Platform (TEC2)), and 2 others: Modern Event Platform (TEC2) - https://phabricator.wikimedia.org/T185233 (Ottomata)
[22:20:09] mforns: apple copying your ideas: https://www.apple.com/uk/privacy/
[22:20:23] O.o
[23:08:21] Analytics, Growth-Team, Product-Analytics: Growth: implement wider data purge window - https://phabricator.wikimedia.org/T237124 (Nuria)
[23:22:56] fdans: wikistats2 getting closer to 2000 uniques a day!
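
The hdfs-rsync change above appears to be the script wrapper around the refinery rsync code that joal and elukey discussed earlier in the day. As a rough illustration of the idea only (the real patch at https://gerrit.wikimedia.org/r/550536 builds on the refinery python Hdfs lib instead), a minimal one-way local-to-HDFS sync can be sketched with the standard `hdfs dfs` CLI:

    #!/usr/bin/env python3
    # Hypothetical sketch of an hdfs-rsync style script: one-way sync of a local
    # directory into HDFS by shelling out to the standard `hdfs dfs` CLI.
    import argparse
    import os
    import subprocess

    def hdfs_exists(path):
        """True if the path exists on HDFS (`hdfs dfs -test -e` returns 0)."""
        return subprocess.call(['hdfs', 'dfs', '-test', '-e', path]) == 0

    def sync_local_to_hdfs(local_dir, hdfs_dir, dry_run=False):
        """Copy files missing on HDFS; skips hidden files, like the behaviour noted above."""
        for name in sorted(os.listdir(local_dir)):
            if name.startswith('.'):
                continue  # hidden files are not copied
            src = os.path.join(local_dir, name)
            dst = '{}/{}'.format(hdfs_dir.rstrip('/'), name)
            if hdfs_exists(dst):
                continue
            print('copying {} -> {}'.format(src, dst))
            if not dry_run:
                subprocess.check_call(['hdfs', 'dfs', '-put', src, dst])

    if __name__ == '__main__':
        parser = argparse.ArgumentParser(description='Minimal local-to-HDFS sync sketch.')
        parser.add_argument('local_dir')
        parser.add_argument('hdfs_dir')
        parser.add_argument('--dry-run', action='store_true')
        args = parser.parse_args()
        # With Kerberos enabled, this relies on a valid ticket (kinit) or a keytab.
        sync_local_to_hdfs(args.local_dir, args.hdfs_dir, args.dry_run)
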