[00:12:50] <wikibugs>	 (03CR) 10Ladsgroup: [C: 03+2] Add script to collect lexicographical data statistics (031 comment) [analytics/wmde/scripts] - 10https://gerrit.wikimedia.org/r/647284 (owner: 10Lucas Werkmeister (WMDE))
[00:16:51] <wikibugs>	 (03CR) 10Ladsgroup: [C: 03+2] Extract WikimediaSparql helper class [analytics/wmde/scripts] - 10https://gerrit.wikimedia.org/r/647283 (owner: 10Lucas Werkmeister (WMDE))
[00:18:39] <wikibugs>	 (03PS1) 10Ladsgroup: Extract WikimediaSparql helper class [analytics/wmde/scripts] - 10https://gerrit.wikimedia.org/r/649399
[00:18:42] <wikibugs>	 (03Merged) 10jenkins-bot: Extract WikimediaSparql helper class [analytics/wmde/scripts] - 10https://gerrit.wikimedia.org/r/647283 (owner: 10Lucas Werkmeister (WMDE))
[00:18:46] <wikibugs>	 (03CR) 10Ladsgroup: [C: 03+2] Extract WikimediaSparql helper class [analytics/wmde/scripts] - 10https://gerrit.wikimedia.org/r/649399 (owner: 10Ladsgroup)
[00:18:53] <wikibugs>	 (03Merged) 10jenkins-bot: Add script to collect lexicographical data statistics [analytics/wmde/scripts] - 10https://gerrit.wikimedia.org/r/647284 (owner: 10Lucas Werkmeister (WMDE))
[00:19:18] <wikibugs>	 (03PS1) 10Ladsgroup: Add script to collect lexicographical data statistics [analytics/wmde/scripts] - 10https://gerrit.wikimedia.org/r/649400
[00:19:53] <wikibugs>	 (03Merged) 10jenkins-bot: Extract WikimediaSparql helper class [analytics/wmde/scripts] - 10https://gerrit.wikimedia.org/r/649399 (owner: 10Ladsgroup)
[00:21:45] <wikibugs>	 (03CR) 10Ladsgroup: [C: 03+2] Add script to collect lexicographical data statistics [analytics/wmde/scripts] - 10https://gerrit.wikimedia.org/r/649400 (owner: 10Ladsgroup)
[00:22:29] <wikibugs>	 (03Merged) 10jenkins-bot: Add script to collect lexicographical data statistics [analytics/wmde/scripts] - 10https://gerrit.wikimedia.org/r/649400 (owner: 10Ladsgroup)
[01:31:16] <wikibugs>	 10Analytics-Radar, 10Product-Analytics, 10Growth-Team (Current Sprint): HomepageVisit schema validation errors - https://phabricator.wikimedia.org/T269966 (10Tgr) We removed the tutorial module (along with the entire start module, which has been replaced by startemail) in {T258008} so `start_tutorial_state`...
[02:12:53] <wikibugs>	 10Analytics-Radar, 10Product-Analytics, 10Growth-Team (Current Sprint), 10Patch-For-Review: HomepageVisit schema validation errors - https://phabricator.wikimedia.org/T269966 (10Tgr) a:03Tgr
[06:29:12] <wikibugs>	 10Analytics-Radar, 10Product-Analytics, 10Growth-Team (Current Sprint), 10Patch-For-Review: HomepageVisit schema validation errors - https://phabricator.wikimedia.org/T269966 (10Tgr) >>! In T269966#6690735, @Tgr wrote: > I don't know how backporting works with the new git-hosted schemas and this is probabl...
[06:33:08] <wikibugs>	 10Analytics-Radar, 10Product-Analytics, 10Growth-Team (Current Sprint), 10Patch-For-Review: HomepageVisit schema validation errors - https://phabricator.wikimedia.org/T269966 (10Tgr) >>! In T269966#6690735, @Tgr wrote: > ...in which case we should rename it to `start_startemail_state`.  Oops, I mean `start...
[07:21:58] <elukey>	 With three binding +1s, three non-binding +1s, and no -1 or 0 votes,
[07:22:01] <elukey>	 the vote for 1.5.0 release PASSED.
[07:22:02] <elukey>	 \o/
[07:22:05] <elukey>	 good morning
[07:41:36] <joal>	 Hey! good morning!
[07:42:55] <elukey>	 bonjour!
[07:44:25] <wikibugs>	 10Analytics-Clusters, 10Analytics-Kanban, 10Patch-For-Review: Kerberos credential cache location - https://phabricator.wikimedia.org/T255262 (10elukey) I just fixed a couple of bugs: - `kerberos-run-command` wasn't working properly. - `presto` (client) was not picking up the new location of the credential ca...
[08:38:44] <wikibugs>	 10Analytics, 10Product-Analytics, 10Epic: Readership Retention: New vs. Returning Unique devices - https://phabricator.wikimedia.org/T269815 (10JAllemandou) More thoughts. The `unique-devices` metrics (`daily` and `monthly`, `per-domain` and `per-project-family`) are each the sum of 2 sub-fields (see https:/...
[08:40:11] <wikibugs>	 (03CR) 10Joal: [C: 03+1] "LGTM! Thanks Razzi" [analytics/aqs] - 10https://gerrit.wikimedia.org/r/647756 (https://phabricator.wikimedia.org/T268809) (owner: 10Razzi)
[08:54:42] <wikibugs>	 (03CR) 10Joal: "Asking for LRUCache for isWMFHostname as well, then good for me! thanks" (031 comment) [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/646808 (https://phabricator.wikimedia.org/T256674) (owner: 10Ottomata)
[09:03:46] <icinga-wm>	 PROBLEM - aqs endpoints health on aqs1004 is CRITICAL: /analytics.wikimedia.org/v1/edits/per-page/{project}/{page-title}/{editor-type}/{granularity}/{start}/{end} (Get daily edits for english wikipedia page 0) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[09:03:46] <icinga-wm>	 PROBLEM - aqs endpoints health on aqs1005 is CRITICAL: /analytics.wikimedia.org/v1/edits/per-page/{project}/{page-title}/{editor-type}/{granularity}/{start}/{end} (Get daily edits for english wikipedia page 0) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[09:04:09] <joal>	 MEH - Druid datasource drop related :( --^
[09:04:28] <joal>	 elukey: Should we try to drop datasources in a different and nicer way?
[09:04:32] <icinga-wm>	 PROBLEM - aqs endpoints health on aqs1009 is CRITICAL: /analytics.wikimedia.org/v1/edits/per-page/{project}/{page-title}/{editor-type}/{granularity}/{start}/{end} (Get daily edits for english wikipedia page 0) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[09:05:42] <icinga-wm>	 PROBLEM - aqs endpoints health on aqs1007 is CRITICAL: /analytics.wikimedia.org/v1/edits/per-page/{project}/{page-title}/{editor-type}/{granularity}/{start}/{end} (Get daily edits for english wikipedia page 0) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[09:06:14] <icinga-wm>	 RECOVERY - aqs endpoints health on aqs1005 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[09:06:30] <elukey>	 joal: I am starting to think that this could be related to the aqs timeout to the druid-broker vip
[09:07:44] <icinga-wm>	 PROBLEM - aqs endpoints health on aqs1006 is CRITICAL: /analytics.wikimedia.org/v1/edits/per-page/{project}/{page-title}/{editor-type}/{granularity}/{start}/{end} (Get daily edits for english wikipedia page 0) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[09:07:50] <joal>	 elukey: making AQS accepting higher latency from druid would probably solve that I assume - But I don't know how much it should be (neither do I know how much it currently is actually)
[09:08:27] <elukey>	 joal: my idea is different - aqs should timeout aggressively, like max 5s/10s
[09:08:45] <elukey>	 what I suspect is that aqs taking a long time to timeout is also a reason for the AQS alerts to fire
[09:09:30] <elukey>	 since connections pile up due to the brief historical slowdown, but they are not timeout out and they also pile up on the aqs front
[09:10:14] <icinga-wm>	 PROBLEM - aqs endpoints health on aqs1005 is CRITICAL: /analytics.wikimedia.org/v1/edits/per-page/{project}/{page-title}/{editor-type}/{granularity}/{start}/{end} (Get daily edits for english wikipedia page 0) timed out before a response was received https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[09:11:00] <joal>	 Ah - I get it elukey - So actually the problem is the code doing the check timing out, while we're rather have AQS time out
[09:12:19] <elukey>	 joal: I think that the issue is on the Druid front for sure, dropping datasources like we do causes historicals to become slower, but we should also be more reactive on the aqs front, it should help
[09:12:34] <joal>	 right
[09:12:38] <elukey>	 does it make sense?
[09:12:52] <joal>	 As much as my brain can pick up, it does!
[09:14:32] <elukey>	 it should have recovered by now though
[09:14:34] <elukey>	 mmmmm
[09:14:36] <elukey>	 https://grafana.wikimedia.org/d/000000538/druid?viewPanel=60&orgId=1&refresh=1m is not great
[09:16:33] <joal>	 elukey: looking at https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&refresh=5m&var-server=druid1004&var-datasource=thanos&var-cluster=druid_public
[09:16:36] <icinga-wm>	 RECOVERY - aqs endpoints health on aqs1007 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[09:16:37] <joal>	 This is weird
[09:17:13] <elukey>	 what is weird?
[09:17:21] <joal>	 Spikes in TCP errors while not even a big load
[09:18:01] <elukey>	 I think those are the conns piling up on the broker front
[09:18:10] <icinga-wm>	 RECOVERY - aqs endpoints health on aqs1009 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[09:18:52] <icinga-wm>	 RECOVERY - aqs endpoints health on aqs1005 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[09:19:06] <icinga-wm>	 RECOVERY - aqs endpoints health on aqs1006 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[09:20:22] <icinga-wm>	 RECOVERY - aqs endpoints health on aqs1004 is OK: All endpoints are healthy https://wikitech.wikimedia.org/wiki/Services/Monitoring/aqs
[09:24:42] <joal>	 elukey: may I have your blessing for https://gerrit.wikimedia.org/r/c/analytics/refinery/+/647681 please?
[09:26:06] <wikibugs>	 (03CR) 10Elukey: [C: 03+1] Correct mediawiki_image table sqoop and creation [analytics/refinery] - 10https://gerrit.wikimedia.org/r/647681 (https://phabricator.wikimedia.org/T266077) (owner: 10Joal)
[09:26:29] <joal>	 Thanks :)
[09:26:55] <elukey>	 :)
[09:28:02] <elukey>	 ah very interesting, the historicals didn't send any metrics during the timeframe of the problem
[09:28:14] <joal>	 hm
[09:28:42] <joal>	 elukey: it feels like an internal deadlock no? The hosts were not even gently busy
[09:29:46] <elukey>	 yeah I was thinking the same, as if the historicals locked everything when deleting
[09:31:09] <joal>	 elukey: actual deletion seems to happen fast - I wonder why it locks for so long :(
[09:36:29] <elukey>	 joal: how many datasource do we drop at once?
[09:36:55] <joal>	 elukey: 1 datasource (in regular case) - But many segments
[09:39:58] <elukey>	 joal: ok so something definitely weird, I see loading/announcing activity for a lot of datasources, including mediawiki_history_reduced_2020_11
[09:40:04] <elukey>	 that I didn't expect
[09:40:43] <joal>	 Weird!
[09:40:49] <elukey>	 ah snap also loading from hdfs
[09:40:53] <wikibugs>	 (03CR) 10Andrew-WMDE: [C: 03+1] "+1 for Adam's changes" (031 comment) [analytics/reportupdater-queries] - 10https://gerrit.wikimedia.org/r/647742 (https://phabricator.wikimedia.org/T262209) (owner: 10Andrew-WMDE)
[09:40:55] <elukey>	 what the hell
[09:41:02] <joal>	 WUT?
[09:41:26] <elukey>	 ok so this would make more sense, the historicals are locked because they are loading also segments that aqs is asking
[09:41:40] <elukey>	 2020-12-15T09:22:08,812 INFO org.apache.druid.storage.hdfs.HdfsDataSegmentPuller: Unzipped 140106271 bytes from [hdfs://analytics-hadoop/user/druid/deep-storage-public-eqiad/mediawiki_history_reduced_2020_11/...
[09:42:31] <joal>	 elukey: Could it be possible that the historical caches (I thought they weren't used?) are shared among datasources? That would be aweful but eh
[09:43:05] <elukey>	 in theory no, we have for example /srv/druid/segment-cache/mediawiki_history_reduced_2020_11
[09:43:56] <joal>	 elukey: right - this is segment-cache
[09:43:56] <elukey>	 2020-12-15T09:10:37,433 INFO org.apache.druid.storage.hdfs.HdfsDataSegmentPuller: Unzipped 406035520 bytes from [hdfs://analytics-hadoop/user/druid/deep-storage-public-eqiad/mediawiki_history_reduced_2020_09
[09:44:12] <elukey>	 the above are 400MB from HDFS
[09:44:45] <joal>	 elukey: another idea - When dropping a datasource, the coordinator decides to reshuffle data, trying to make it more evenly spread out
[09:45:10] <elukey>	 it is a good idea indeed
[09:45:16] <elukey>	 it would explain this mess
[09:48:43] <elukey>	 2020-12-15T09:07:02,162 INFO org.apache.druid.server.coordinator.rules.LoadRule: Dropping segment [mediawiki_history_reduced_2020_11_2015-12-01T00:00:00.000Z_2016-01-01T00:00:00.000Z_2020-12-03T11:32:52.025Z] on server [druid1007.eqiad.wmnet:8083] in tier [_default_tier]
[09:48:48] <elukey>	 joal: --^
[09:49:12] <joal>	 meh
[09:51:13] <elukey>	 joal: ok mistery "solved", it is the coordinator
[09:51:26] <elukey>	 more specifically, its balancer
[09:52:02] <joal>	 elukey: Ahhhh! I'm sorry it didin't come to mind earlier :(
[09:52:28] <elukey>	 joal: it is totally crazy though, I hope there is a way to tell the coordinator to take it very slowly
[09:52:38] <joal>	 so do I elukey !!!
[09:55:49] <elukey>	 joal: is the replication 3 for the datasources?
[09:56:12] <joal>	 I think it's 2 in default-tier
[09:58:15] <elukey>	 in https://druid.apache.org/docs/latest/configuration/index.html there are some interesting things
[09:58:31] <elukey>	 for example, the coordinator can be stopped dynamically from doing things
[11:15:19] * elukey lunch!
[12:12:59] <wikibugs>	 10Analytics: Druid datasource drop triggers segment reshuffling by the coordinator - https://phabricator.wikimedia.org/T270173 (10elukey) p:05Triage→03High
[14:03:19] <joal>	 Hi a-team - today is train day (in addition to fun day :) - I've put up stuff in here https://etherpad.wikimedia.org/p/analytics-weekly-train - Please let me know if there is anything more you'd like to see dpeloyed
[14:07:02] <fdans>	 joal: just added a single item, but since you're already deploying source, there's no additional action needed
[14:07:12] <fdans>	 thank you for doing the train joseph!
[14:13:21] <elukey>	 joal: it is also fine if you want to deploy tomorrow and enjoy the day ;)
[14:14:10] <joal>	 elukey: tomorrow is kid's day for me - not easier :)
[14:16:58] <elukey>	 ack!
[14:24:01] <joal>	 Hi ottomata - Sorry to jump on you like this - Do you wish we do a fast turn around for merge and deploy of the isWMFHostname patch?
[14:53:31] <wikibugs>	 10Analytics-Kanban, 10Patch-For-Review: Test the Bigtop 1.5 RC release on the Hadoop test cluster - https://phabricator.wikimedia.org/T269919 (10elukey) Worked more on the cookbook: I added an explicit `systemctl mask` step for HDFS namenodes and datanodes, to prevent them from starting during the install pack...
[14:58:56] <elukey>	 joal: https://lists.fosdem.org/pipermail/fosdem/2020q4/003112.html
[14:59:31] <joal>	 Nice elukey!
[14:59:44] <joal>	 Thanks for the link :)
[15:00:26] <elukey>	 the Bigtop upstream devs asked if we either want to join a talk about bigtop or present bigtop@wikimedia
[15:01:22] <joal>	 elukey: It's too late to make a talk I think :)
[15:02:32] <joal>	 But it could be fun to attend!
[15:02:40] <joal>	 And maybe plan fro a talk next year?
[15:02:47] <joal>	 Need to drop for kids
[15:03:01] <joal>	 Back after, and train!
[15:05:06] <elukey>	 joal: the deadline is in few days, they only need an abstract now
[15:05:28] <elukey>	 they ask for a recording by mid jan
[15:16:12] <wikibugs>	 (03CR) 10Mforns: [C: 03+1] "I see 2 +1. Please, feel free to merge this! The next step to make this work, would be to add a reportupdater job snippet to puppet. This " [analytics/reportupdater-queries] - 10https://gerrit.wikimedia.org/r/647742 (https://phabricator.wikimedia.org/T262209) (owner: 10Andrew-WMDE)
[15:48:08] <wikibugs>	 (03CR) 10Awight: "Cool!  (I don't have CR+2 but will happily wait.)" [analytics/reportupdater-queries] - 10https://gerrit.wikimedia.org/r/647742 (https://phabricator.wikimedia.org/T262209) (owner: 10Andrew-WMDE)
[15:50:11] * elukey afk for a bit!
[15:51:05] <wikibugs>	 (03CR) 10Mforns: [V: 03+2 C: 03+2] "Oh, sorry! Will merge" [analytics/reportupdater-queries] - 10https://gerrit.wikimedia.org/r/647742 (https://phabricator.wikimedia.org/T262209) (owner: 10Andrew-WMDE)
[16:04:54] <wikibugs>	 (03CR) 10Mforns: [V: 03+2 C: 03+2] "LGTM!" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/648139 (https://phabricator.wikimedia.org/T269918) (owner: 10Itamar Givon)
[16:08:43] <wikibugs>	 (03CR) 10Mforns: [C: 03+1] "@Awight, please let me know if you prefer to merge as is, or comment out graphite lines." [analytics/reportupdater-queries] - 10https://gerrit.wikimedia.org/r/645345 (https://phabricator.wikimedia.org/T260138) (owner: 10Andrew-WMDE)
[16:17:12] <wikibugs>	 (03CR) 10Mforns: [WIP] Aggregate TemplateWizard metrics (032 comments) [analytics/reportupdater-queries] - 10https://gerrit.wikimedia.org/r/649351 (https://phabricator.wikimedia.org/T262209) (owner: 10Awight)
[16:20:43] <mforns>	 heya a-team, are we skipping meetings today?
[16:21:13] <fdans>	 a-team yes, skipping meetings because of FUN DAY
[16:24:29] <gmodena>	 elukey joal do you have experience with moving data from a stat host into a mysql prod host (dbproxies)? 
[16:25:04] <wikibugs>	 (03PS1) 10Lucas Werkmeister (WMDE): Limit number of lexeme metrics sent to Graphite [analytics/wmde/scripts] - 10https://gerrit.wikimedia.org/r/649669
[16:25:37] <mforns>	 fdans: 👍
[16:26:45] <mforns>	 wait... but, meetings are fun! :D
[16:29:53] <gmodena>	 elukey joal looks like we can't access dbproxy from stat because firewall. I was wondering if we could request fw changes, or if we'd better look for other routes. 
[16:29:58] <elukey>	 gmodena: it depends what dbproxies, we have a firewall in between analytics => production, so traffic going out a stat100x needs to follow some rules otherwise it gets dropped
[16:30:07] <elukey>	 yes :)
[16:30:19] <elukey>	 can you open a task with the specifics?
[16:31:14] <gmodena>	 elukey sure thing! 
[16:32:21] <gmodena>	 elukey joal alos, yay fosedm!
[16:33:25] <gmodena>	 bummer it will be online in 2021, otherwise we could have improvised a get together 
[16:34:50] <elukey>	 ah yes for sure! For working reasons (the all hands coincided with the fosdem every time) I never went to fosdem :(
[16:34:55] <wikibugs>	 (03CR) 10Ladsgroup: "there is a bit of duplication of logic here, can it be DRY'ed up a bit?" [analytics/wmde/scripts] - 10https://gerrit.wikimedia.org/r/649669 (owner: 10Lucas Werkmeister (WMDE))
[16:36:30] <elukey>	 razzi: o/ when you have a moment can you check my last comment in https://gerrit.wikimedia.org/r/c/operations/puppet/+/645120 ?
[16:36:47] <elukey>	 we can do it tomorrow of course, just wanted to know if it was in your radar
[16:36:58] <elukey>	 (so you'll be able to ack stuff in icinga etc..)
[16:39:53] <ottomata>	 gmodena: it is not something we have great support for
[16:39:58] <ottomata>	 some ideas and context in the discussion in https://phabricator.wikimedia.org/T266826
[16:40:24] <ottomata>	 https://phabricator.wikimedia.org/T266826#6600009
[16:45:18] <gmodena>	 ottomata ack. thanks, I'll review
[16:47:26] <wikibugs>	 (03PS1) 10Lucas Werkmeister (WMDE): Send accurate timestamp with lexeme statistics [analytics/wmde/scripts] - 10https://gerrit.wikimedia.org/r/649676
[16:48:58] <wikibugs>	 (03CR) 10Lucas Werkmeister (WMDE): "I thought the duplication was acceptable in the original script, since there still seemed to be a number of differences in the functions. " [analytics/wmde/scripts] - 10https://gerrit.wikimedia.org/r/649669 (owner: 10Lucas Werkmeister (WMDE))
[16:58:13] <wikibugs>	 (03CR) 10Ladsgroup: "> Patch Set 1:" [analytics/wmde/scripts] - 10https://gerrit.wikimedia.org/r/649669 (owner: 10Lucas Werkmeister (WMDE))
[16:58:18] <wikibugs>	 (03CR) 10Ladsgroup: [C: 03+2] Limit number of lexeme metrics sent to Graphite [analytics/wmde/scripts] - 10https://gerrit.wikimedia.org/r/649669 (owner: 10Lucas Werkmeister (WMDE))
[16:59:12] <wikibugs>	 (03Merged) 10jenkins-bot: Limit number of lexeme metrics sent to Graphite [analytics/wmde/scripts] - 10https://gerrit.wikimedia.org/r/649669 (owner: 10Lucas Werkmeister (WMDE))
[17:00:15] <wikibugs>	 (03PS1) 10Ladsgroup: Limit number of lexeme metrics sent to Graphite [analytics/wmde/scripts] - 10https://gerrit.wikimedia.org/r/649687
[17:00:18] <wikibugs>	 (03CR) 10Lucas Werkmeister (WMDE): "found an issue, fortunately only in a comment" (031 comment) [analytics/wmde/scripts] - 10https://gerrit.wikimedia.org/r/649669 (owner: 10Lucas Werkmeister (WMDE))
[17:00:21] <wikibugs>	 (03CR) 10Ladsgroup: [C: 03+2] Limit number of lexeme metrics sent to Graphite [analytics/wmde/scripts] - 10https://gerrit.wikimedia.org/r/649687 (owner: 10Ladsgroup)
[17:01:33] <wikibugs>	 (03Merged) 10jenkins-bot: Limit number of lexeme metrics sent to Graphite [analytics/wmde/scripts] - 10https://gerrit.wikimedia.org/r/649687 (owner: 10Ladsgroup)
[17:04:00] <joal>	 ottomata: heya - quick fixes before Ideploy, or not now?
[17:06:27] <joal>	 also ottomata - As per https://phabricator.wikimedia.org/T266826#6600009 - Wouldn't it be easier to have a hole open with dedicated restrictions to write to dome DBs?
[17:07:20] <joal>	 mforns: ping as well (you were not here where I send the original message) - Anything you'd like to add to the train before I start?
[17:10:47] <wikibugs>	 (03CR) 10Awight: "> @Awight, please let me know if you prefer to merge as is, or comment out graphite lines." [analytics/reportupdater-queries] - 10https://gerrit.wikimedia.org/r/645345 (https://phabricator.wikimedia.org/T260138) (owner: 10Andrew-WMDE)
[17:11:30] <ottomata>	 joal: maybe, but that might be something to talk to dbas about
[17:11:39] <ottomata>	 i don't love the idea of using stat1008 as part of a production pipeline
[17:11:47] <ottomata>	 but we do need to solve that problem more generally
[17:12:14] <joal>	 ottomata: Right, that was my point (trying to find a solution facilitating broad use-cases)
[17:13:06] <joal>	 ottomata: We could have a dedicated host on our side with restricted access to write to prod DB (single host for no overwhelming, restricted access for prod stuff only)
[17:13:14] <joal>	 anyway, just an idea 
[17:13:15] <wikibugs>	 (03CR) 10Awight: [WIP] Aggregate TemplateWizard metrics (032 comments) [analytics/reportupdater-queries] - 10https://gerrit.wikimedia.org/r/649351 (https://phabricator.wikimedia.org/T262209) (owner: 10Awight)
[17:13:20] <joal>	 ottomata: last ping on java code?
[17:13:30] <joal>	 (while I have your attention :)
[17:15:51] <joal>	 looks like my demand for attention has been rejected
[17:21:53] <wikibugs>	 10Quarry: quarry-web-01 / is full - https://phabricator.wikimedia.org/T270198 (10Reedy)
[17:22:44] * elukey sends some attention to joal
[17:22:49] <joal>	 <3
[17:28:03] <wikibugs>	 10Quarry: quarry-web-01 / is full - https://phabricator.wikimedia.org/T270198 (10Reedy) ` reedy@quarry-web-01:/srv/quarry$ df -h Filesystem                                                         Size  Used Avail Use% Mounted on udev                                                               2.0G     0  2.0G...
[17:28:08] <joal>	 Last call ottomata and mforns - Starting to deploy in 2 minutes if no news :)
[17:30:06] <mforns>	 oh sorry joal 
[17:30:19] <mforns>	 let me check quickly
[17:30:36] <joal>	 Actually elukey - Do you wish me to deploy AQS with the patch for cache-header?
[17:30:48] <wikibugs>	 (03PS1) 10Lucas Werkmeister (WMDE): Reduce duplicate code in lexeme statistics script [analytics/wmde/scripts] - 10https://gerrit.wikimedia.org/r/649710
[17:31:02] <wikibugs>	 (03CR) 10Lucas Werkmeister (WMDE): "> Patch Set 1:" [analytics/wmde/scripts] - 10https://gerrit.wikimedia.org/r/649669 (owner: 10Lucas Werkmeister (WMDE))
[17:31:02] <mforns>	 joal: I merged several changes to the eventlogging sanitization whitelist
[17:31:25] <mforns>	 on my side, you can go ahead, I will add them in the deployment train
[17:31:25] <elukey>	 joal: if you folks are ok with the change it seems good, but afaik it was never tested no?
[17:31:48] <joal>	 mforns: ack! it should be picked up automagically upon deploy right?
[17:31:58] <mforns>	 yes joal 
[17:32:37] <joal>	 mforns: ok! any patch you'd wish me to wait for or all good for you?
[17:32:48] <joal>	 elukey: it has not been tested I don't think so
[17:33:06] <joal>	 elukey: you tell me - I'm happy to deploy, or we can wait and test
[17:33:32] <mforns>	 joal: I'm good!
[17:33:38] <joal>	 Ack
[17:33:43] <joal>	 deploy time it is then
[17:35:27] <elukey>	 joal: let's wait for a test
[17:35:47] <joal>	 works for me elukey
[17:36:05] <joal>	 Today's train starts now with refinery-source and refinery on-board
[17:37:40] <wikibugs>	 (03PS1) 10Reedy: SECURITY: Set correct Mime Type on /api/preferences [analytics/quarry/web] - 10https://gerrit.wikimedia.org/r/649711 (https://phabricator.wikimedia.org/T270195)
[17:38:01] <wikibugs>	 (03CR) 10Reedy: [C: 03+2] SECURITY: Set correct Mime Type on /api/preferences [analytics/quarry/web] - 10https://gerrit.wikimedia.org/r/649711 (https://phabricator.wikimedia.org/T270195) (owner: 10Reedy)
[17:38:32] <wikibugs>	 (03Merged) 10jenkins-bot: SECURITY: Set correct Mime Type on /api/preferences [analytics/quarry/web] - 10https://gerrit.wikimedia.org/r/649711 (https://phabricator.wikimedia.org/T270195) (owner: 10Reedy)
[17:38:51] <ottomata>	 joal:  sorry am amking lunch and then prepping ofr interview
[17:38:56] <ottomata>	 not going to make this deploy! :) thank you thogh
[17:39:18] <joal>	 ottomata: no prob - Thanks for answer :)
[17:40:00] <wikibugs>	 (03CR) 10Joal: [C: 03+2] "Merging for deloy" [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/643255 (owner: 10Joal)
[17:45:01] <wikibugs>	 10Analytics, 10Quarry, 10Security-Team, 10Patch-For-Review, and 3 others: Reflected Cross-Site scripting (XSS) vulnerability in analytics-quarry-web - https://phabricator.wikimedia.org/T270195 (10Reedy) 05Open→03Resolved a:03Reedy Pulled onto hosts, workers restarted
[17:46:03] <wikibugs>	 10Analytics, 10Quarry, 10Security-Team, 10Patch-For-Review, and 3 others: Reflected Cross-Site scripting (XSS) vulnerability in analytics-quarry-web - https://phabricator.wikimedia.org/T270195 (10Reedy)
[17:46:28] <wikibugs>	 (03Merged) 10jenkins-bot: Update pageview title extraction for trailing EOL [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/643255 (owner: 10Joal)
[17:48:34] <wikibugs>	 10Quarry: quarry-web-01 / is full - https://phabricator.wikimedia.org/T270198 (10Reedy) 05Open→03Resolved a:03Reedy
[18:03:49] <wikibugs>	 10Analytics-Radar, 10Product-Analytics, 10Growth-Team (Current Sprint), 10Patch-For-Review: HomepageVisit schema validation errors - https://phabricator.wikimedia.org/T269966 (10nettrom_WMF) >>! In T269966#6690735, @Tgr wrote: > `start_email_state` should just be fixed to use the startemail module, unless...
[18:15:08] <wikibugs>	 10Analytics, 10Better Use Of Data, 10Product-Analytics, 10Product-Infrastructure-Data: Schema repository structure, naming - https://phabricator.wikimedia.org/T269936 (10kzimmerman)
[18:15:30] <wikibugs>	 10Analytics, 10Better Use Of Data, 10Product-Analytics, 10Product-Infrastructure-Data: Schema repository structure, naming - https://phabricator.wikimedia.org/T269936 (10mpopov) >>! In T269936#6685002, @Mholloway wrote: > So they're linked here, here are the recommendations from Product Analytics that I th...
[18:17:14] * elukey afk! (will check later)
[18:18:26] <wikibugs>	 (03PS1) 10Joal: Bump changelog.md to v0.0.141 [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/649717
[18:18:50] <wikibugs>	 10Analytics, 10Better Use Of Data, 10Event-Platform, 10Product-Analytics, 10Product-Infrastructure-Data: MEP: Should stream configurations be written in YAML? - https://phabricator.wikimedia.org/T269774 (10kzimmerman)
[18:19:38] <wikibugs>	 (03CR) 10Joal: [V: 03+2 C: 03+2] "Merging for deploy" [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/649717 (owner: 10Joal)
[18:22:00] <wikibugs>	 10Analytics, 10Product-Analytics, 10Inuka-Team (Kanban): Set up preview counting for KaiOS app - https://phabricator.wikimedia.org/T244548 (10Jpita) Neil was able to see the events
[18:26:19] <joal>	 !log Release refinery-source v0.0.141
[18:26:21] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[18:32:56] <wikibugs>	 (03CR) 10Ladsgroup: "I don't understand the reasoning behind this change. The stat is being called once a day and its metrics basically has precision of a day." [analytics/wmde/scripts] - 10https://gerrit.wikimedia.org/r/649676 (owner: 10Lucas Werkmeister (WMDE))
[19:02:21] <wikibugs>	 (03PS1) 10Maven-release-user: Add refinery-source jars for v0.0.141 to artifacts [analytics/refinery] - 10https://gerrit.wikimedia.org/r/649720
[19:02:44] <wikibugs>	 (03CR) 10Joal: [V: 03+2 C: 03+2] "Merging for deploy" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/649720 (owner: 10Maven-release-user)
[19:04:05] <wikibugs>	 (03PS2) 10Joal: Correct mediawiki_image table sqoop and creation [analytics/refinery] - 10https://gerrit.wikimedia.org/r/647681 (https://phabricator.wikimedia.org/T266077)
[19:04:21] <wikibugs>	 (03CR) 10Joal: [V: 03+2 C: 03+2] "Merging for dpeloy" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/647681 (https://phabricator.wikimedia.org/T266077) (owner: 10Joal)
[19:04:51] <wikibugs>	 (03PS3) 10Joal: oozie: Replace all references of an-coord1001 with analytics-hive [analytics/refinery] - 10https://gerrit.wikimedia.org/r/647612 (https://phabricator.wikimedia.org/T268028) (owner: 10Elukey)
[19:05:10] <wikibugs>	 (03CR) 10Joal: [V: 03+2 C: 03+2] "Merging for deploy" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/647612 (https://phabricator.wikimedia.org/T268028) (owner: 10Elukey)
[19:05:40] <ottomata>	 heya a-team, giving a quick tour of analytics event world to jason and others now
[19:05:43] <ottomata>	 feel free to joini!
[19:05:54] <mforns>	 where?
[19:05:56] <ottomata>	 https://meet.google.com/wij-cnxf-qor
[19:09:52] <wikibugs>	 (03PS1) 10Joal: Update webrequest hive jar to v0.0.141 [analytics/refinery] - 10https://gerrit.wikimedia.org/r/649722 (https://phabricator.wikimedia.org/T268630)
[19:10:39] <wikibugs>	 (03CR) 10Joal: [V: 03+2 C: 03+2] "Merging for deploy" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/649722 (https://phabricator.wikimedia.org/T268630) (owner: 10Joal)
[19:11:03] <ottomata>	 actually, that will be in 20 minutes!
[19:11:19] <joal>	 ottomata: I'll come say hello but won't stay :)
[19:11:35] <ottomata>	 ok :)
[19:12:00] <joal>	 ottomata: or might stay will doing the train, depending on the deployment stage
[19:14:37] <joal>	 !log Scap deploy refinery 
[19:14:38] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[19:28:25] <wikibugs>	 10Analytics, 10Product-Analytics, 10Inuka-Team (Kanban): Set up preview counting for KaiOS app - https://phabricator.wikimedia.org/T244548 (10AMuigai) 05Open→03Resolved
[19:32:34] <ottomata>	 mforns:  yoohoo
[19:43:42] <joal>	 !log Deploy refinery onto HDFS
[19:43:44] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[19:55:06] <wikibugs>	 10Analytics: Druid datasource drop triggers segment reshuffling by the coordinator - https://phabricator.wikimedia.org/T270173 (10Milimetric) Thanks for raising this.  It seems to be something other folks are dealing with, if not exactly due to shuffling segments at least in a related way with a large performanc...
[20:24:40] <joal>	 !log Kill restart webrequest_load oozie job after deploy
[20:24:42] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[20:25:37] <wikibugs>	 (03CR) 10Mforns: [V: 03+2 C: 03+2] Process EventLogging events and tally preferences for CodeMirror [analytics/reportupdater-queries] - 10https://gerrit.wikimedia.org/r/645345 (https://phabricator.wikimedia.org/T260138) (owner: 10Andrew-WMDE)
[20:27:23] <elukey>	 joal: all good? Do you need anything for the oozie stuff?
[20:27:43] <joal>	 Hi elukey - Shouldn't be in bed at that time :-P
[20:27:54] <elukey>	 (as I mentioned earlier on, the restarts can be done anytime, please take it easy, I'll do it too during the next days)
[20:28:12] <joal>	 elukey: all good - Just restarting webrequest, big bunch of other restarts will happen tomorrow
[20:28:19] <joal>	 Or the day after :)
[20:28:26] <elukey>	 it is 21:30, I know I am not that young anymore but it seems a little earlier for bed time :D
[20:28:38] <elukey>	 perfect :)
[20:28:41] <joal>	 Ah - It's me being old I guess ;)
[20:28:44] <elukey>	 hahahahah
[20:29:00] <joal>	 have a good evening :)
[20:29:06] <elukey>	 you too :)
[20:30:49] <wikibugs>	 (03CR) 10Mforns: "Oh, hmm.. we still have to figure out the split between hive queries and mysql queries..." [analytics/reportupdater-queries] - 10https://gerrit.wikimedia.org/r/645345 (https://phabricator.wikimedia.org/T260138) (owner: 10Andrew-WMDE)
[21:02:27] <ottomata>	 razzi:  assuming you are doing fun day stuff, ping me in case you want to meet!  
[21:02:28] <ottomata>	 :)