[02:15:17] Analytics-Radar, Product-Analytics, Growth-Team (Current Sprint), MW-1.36-notes (1.36.0-wmf.25; 2021-01-05): HomepageVisit schema validation errors - https://phabricator.wikimedia.org/T269966 (Etonkovidova) Checked Schema:HomepageVisit in ` betalabs` - the events are being recorded; [[ https://me...
[03:04:55] Analytics-Radar, Product-Analytics, Growth-Team (Current Sprint), MW-1.36-notes (1.36.0-wmf.25; 2021-01-05): HomepageVisit schema validation errors - https://phabricator.wikimedia.org/T269966 (Tgr) No idea how that would happen. `start_email_state` is only omitted when the startemail module is no...
[03:14:44] Analytics-Radar, Product-Analytics, Growth-Team (Current Sprint), MW-1.36-notes (1.36.0-wmf.25; 2021-01-05): HomepageVisit schema validation errors - https://phabricator.wikimedia.org/T269966 (Tgr) ...unless you are looking at entries from before merging the patch, in which case this is the bug t...
[06:15:35] Analytics-Clusters: Convert labsdb1012 from multi-source to multi-instance - https://phabricator.wikimedia.org/T269211 (Marostegui) >>! In T269211#6719873, @Ottomata wrote: > We should use this also as an opportunity to reinstall as Debian Buster Yes, that is the whole point of this migration. MariaDB 10.1...
[07:09:05] goood morning
[07:24:57] Analytics-Clusters: Convert labsdb1012 from multi-source to multi-instance - https://phabricator.wikimedia.org/T269211 (elukey) I took a look to the puppetization that Brooke did and it seems really straightforward, I am going to write down a high level plan in here to get people's opinion and confirmations:...
[08:03:48] Good morning
[08:05:31] elukey: let me know when we should talk about distcp )
[08:06:25] joal: bonjour! What do you have in mind?
[08:06:49] elukey: I was thinking about starting the data copy for backup cluster
[08:07:24] if it is small then ok!
[08:07:38] (we don't have a lot of free space in there)
[08:08:00] elukey: can we start the work now, or do you prefer us to wait?
[08:08:18] nono you can go ahead
[08:08:33] at some point during the week I'll try to test the cookbook to upgrade the cluster
[08:08:42] but we can sync about when so distcp is not running
[08:10:06] ok
[08:12:18] elukey: I'm planning on doing small-ish datasets first (in /wmf/data/wmf, everything being less than 1T useful)
[08:12:32] sure
[08:12:48] this would be a first test for not so big data, and would also allow me to check for incremental copy
[08:14:09] Analytics, Analytics-Kanban, Growth-Team, Product-Analytics, Patch-For-Review: Migrate Growth EventLogging schemas to Event Platform - https://phabricator.wikimedia.org/T267333 (Aklapper)
[08:14:14] Analytics, Better Use Of Data, Event-Platform, Product-Infrastructure-Data, and 2 others: MEP Client MediaWiki PHP - https://phabricator.wikimedia.org/T253121 (Aklapper)
[08:24:18] Analytics-Clusters: Convert labsdb1012 from multi-source to multi-instance - https://phabricator.wikimedia.org/T269211 (Marostegui) >>! In T269211#6721801, @elukey wrote: > > At this point I guess Data persistence will help in getting the data in on the node. How much time would it take? (I am trying to fig...
[09:21:19] (CR) Joal: [C: +1] "LGTM!" [analytics/refinery] - https://gerrit.wikimedia.org/r/654256 (owner: Mforns)
[09:26:01] Analytics, Analytics-Data-Quality: Unique devices numbers for all wikipedias missing for Agust and SEptember - https://phabricator.wikimedia.org/T271170 (JAllemandou) Thanks for pointing this @nuria :) I have checked data on the cluster and on dumps and all data seems present. This is related to the data...
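Note on the 08:12 messages: a minimal sketch of how the incremental copy joal wants to check could be launched, assuming the backup is done with `hadoop distcp` and its `-update` option (which skips files already present and unchanged at the destination). The helper name `build_distcp_command` and both cluster URIs are hypothetical; only the `/wmf/data/wmf` path comes from the chat.

```python
import shlex


def build_distcp_command(source_uri, dest_uri, incremental=True):
    """Build the argument list for a `hadoop distcp` run (hypothetical helper)."""
    cmd = ["hadoop", "distcp"]
    if incremental:
        # -update copies only files that are missing or differ at the destination.
        cmd.append("-update")
    cmd += [source_uri, dest_uri]
    return cmd


if __name__ == "__main__":
    # Both namenode names below are placeholders, not real cluster names.
    command = build_distcp_command(
        "hdfs://source-cluster/wmf/data/wmf",
        "hdfs://backup-cluster/wmf/data/wmf",
    )
    # Print the command rather than running it, so it can be reviewed first.
    print(shlex.join(command))
```

Running the same command a second time is what would exercise the incremental path: with `-update`, unchanged files should be skipped rather than copied again.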
[09:26:25] Analytics, Analytics-Data-Quality, Analytics-Kanban: Unique devices numbers for all wikipedias missing for Agust and SEptember - https://phabricator.wikimedia.org/T271170 (JAllemandou) a:JAllemandou
[09:29:22] !log Manually reload unique-devices monthly in cassandra to fix T271170
[09:29:25] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[09:29:25] T271170: Unique devices numbers for all wikipedias missing for Agust and SEptember - https://phabricator.wikimedia.org/T271170
[09:32:28] ok just tested the config that I was talking about related to PXE boot into debian-rescue mode, works :)
[09:32:36] very handy in case it is needed
[09:33:20] elukey: I'm interested to understand just a bit more detail if you can explain again please (sorry, slow brain me :/)
[09:33:23] now the question is - do we have all root partition on raid setups correctly configured? (namely boot sector present on all disks)
[09:33:49] joal: ah yes sure! So the patch is https://gerrit.wikimedia.org/r/654192, a little cryptic
[09:34:11] but the gist of it is that, by default, when you force PXE boot it goes straight into debian install
[09:34:59] there is an option called "rescue mode", that allows you to do more fancy things like recovering the current status of disks/partitions/etc.. and drop into a complete shell
[09:35:22] for example, say that we have /dev/sda and /dev/sdb in raid 1, with the root partition on top
[09:35:37] and the boot sector only on /dev/sda configured with mbr etc..
[09:35:49] if /dev/sda dies, we cannot boot anymore
[09:36:07] * joal follows with attention
[09:36:11] so using debian install in rescue mode allows you to get into a shell and execute "grub-install /dev/sdb"
[09:36:14] reboot
[09:36:25] and enjoy the host again
[09:36:31] Ok I follow
[09:37:08] so your patch makes hosts offer the possibility to boot in rescue-mode instead of directly going to install - right?
[09:37:21] exactly yes, leaving the default to install
[09:37:33] the long term goal for SRE is to provide a fancy menu for PXE
[09:37:42] and waiting for say, 5 seconds before going to install
[09:37:44] so people can install, rescue, memtest, etc..
[09:37:48] yep
[09:37:51] \o/
[09:37:57] * joal has understood!
[09:39:16] thanks elukey :)
[09:39:59] Analytics, Analytics-Data-Quality, Analytics-Kanban: Unique devices numbers for all wikipedias missing for Agust and SEptember - https://phabricator.wikimedia.org/T271170 (JAllemandou) Problem solved. {F33985369}
[09:51:50] Analytics, Event-Platform: ExternalLinksChange Event Platform Migration - https://phabricator.wikimedia.org/T271162 (Samwalton9) Huh - I assumed these two things were linked. In that case I'm not sure this has ever been used by us and can be removed.
[10:23:58] https://issues.apache.org/jira/browse/BIGTOP-3471
[10:24:14] they are planning to remove alluxio, weird
[10:24:25] I'll ask questions :)
[10:25:46] :(
[10:31:37] also, we use sqoop shipped by CDH right?
[10:31:42] with a wrapper around it
[10:32:00] correct elukey
[10:32:10] also planned to be removed :D
[10:32:11] I have seen that sqoop is removed
[10:32:27] Maybe we'll move to Gobblin, or Spark ?)
[10:32:33] I am adding a comment, if we offer time to fix issues they'll probably keep it
[10:34:27] added a comment, also about debian 11
[11:37:54] Quarry, Gerrit-Privilege-Requests, User-DannyS712: Quarry repo access should be cleaned up - https://phabricator.wikimedia.org/T201435 (DannyS712) a:DannyS712
[12:05:36] * elukey lunch!
[13:22:16] Analytics-Clusters, DC-Ops, Operations, ops-eqiad: (Need By: TBD) rack/setup/install an-worker11[18-41] - https://phabricator.wikimedia.org/T260445 (elukey) @Cmjohnson happy 2021 :) When you have a moment could you please unrack the one host added to B4 and the two to C2? Then I think we could...
[13:46:44] (PS4) DCausse: Add rdf-streaming-updater schemas for side outputs [schemas/event/secondary] - https://gerrit.wikimedia.org/r/647723 (https://phabricator.wikimedia.org/T269619)
[13:48:25] (CR) DCausse: "> Patch Set 3:" (8 comments) [schemas/event/secondary] - https://gerrit.wikimedia.org/r/647723 (https://phabricator.wikimedia.org/T269619) (owner: DCausse)
[13:49:57] Analytics, Event-Platform: ExternalLinksChange Event Platform Migration - https://phabricator.wikimedia.org/T271162 (Ottomata) Great! Thank you.
[13:50:27] Analytics, Event-Platform: ExternalLinksChange Event Platform Migration - https://phabricator.wikimedia.org/T271162 (Ottomata) Open→Declined
[13:50:30] Analytics, Analytics-EventLogging, Analytics-Kanban, Event-Platform, and 4 others: Migrate legacy metawiki schemas to Event Platform - https://phabricator.wikimedia.org/T259163 (Ottomata)
[13:50:46] Analytics, Analytics-EventLogging, Analytics-Kanban, Event-Platform, and 4 others: Migrate legacy metawiki schemas to Event Platform - https://phabricator.wikimedia.org/T259163 (Ottomata)
[13:54:48] Analytics, Event-Platform: TranslationRecommendation* Schemas Event Platform Migration - https://phabricator.wikimedia.org/T271163 (Ottomata) > If it's not hard, I'd ask to retain the geocoded data. It isn't hard, we can do! > eventlogging also always shows up in webrequests so we can always extract tha...
[13:58:20] Analytics, Event-Platform, Fundraising-Backlog: CentralNoticeBannerHistory Event Platform Migration - https://phabricator.wikimedia.org/T271168 (Ottomata) Will do ok!
[14:06:45] (CR) Ottomata: Add rdf-streaming-updater schemas for side outputs (2 comments) [schemas/event/secondary] - https://gerrit.wikimedia.org/r/647723 (https://phabricator.wikimedia.org/T269619) (owner: DCausse)
[14:16:20] Analytics, Analytics-EventLogging, Analytics-Kanban, Event-Platform, and 4 others: Migrate legacy metawiki schemas to Event Platform - https://phabricator.wikimedia.org/T259163 (Ottomata)
[14:17:15] Analytics, Analytics-EventLogging, Analytics-Kanban, Event-Platform, and 4 others: Migrate legacy metawiki schemas to Event Platform - https://phabricator.wikimedia.org/T259163 (Ottomata)
[14:18:35] Analytics, Event-Platform, Fundraising-Backlog: CentralNoticeBannerHistory and CentralNoticeImpression Event Platform Migration - https://phabricator.wikimedia.org/T271168 (Ottomata)
[14:19:58] Analytics, Event-Platform, Fundraising-Backlog: CentralNoticeBannerHistory and CentralNoticeImpression Event Platform Migration - https://phabricator.wikimedia.org/T271168 (Ottomata) Oh, I just noticed you also have [[ https://meta.wikimedia.org/wiki/Schema:CentralNoticeImpression | CentralNoticeImpr...
[14:27:50] Analytics, Event-Platform, Structured-Data-Backlog: SuggestedTagsAction Event Platform Migration - https://phabricator.wikimedia.org/T267351 (mforns) @Ramsey-WMF Hi! Just letting you know that next week we'll proceed to migrate this schema to Event Platform. Please, confirm, so I know you're aware :]...
[14:31:39] (CR) Mforns: [V: +2 C: +2] "Merging, given the 2 +1s." [analytics/refinery] - https://gerrit.wikimedia.org/r/654256 (owner: Mforns)
[14:41:30] Analytics, Analytics-EventLogging, Analytics-Kanban, Event-Platform, and 4 others: Migrate legacy metawiki schemas to Event Platform - https://phabricator.wikimedia.org/T259163 (Ottomata)
[15:01:00] Analytics, Analytics-Kanban, Event-Platform, EventStreams, and 5 others: Set up internal eventstreams instance exposing all streams declared in stream config (and in kafka jumbo) - https://phabricator.wikimedia.org/T269160 (Ottomata)
[15:12:39] (CR) Milimetric: [C: +2] Clarify requirements for building [analytics/wikistats2] - https://gerrit.wikimedia.org/r/654319 (owner: Milimetric)
[15:13:16] chrisalbon: o/ happy 2021 - I recall that we had a chat about prometheus libraries etc.., we have finally reached a more stable version of pywmflib - https://doc.wikimedia.org/wmflib/v0.0.6/api/index.html
[15:13:20] (available also on pypi)
[15:13:39] Analytics, Event-Platform, Fundraising-Backlog: CentralNoticeBannerHistory and CentralNoticeImpression Event Platform Migration - https://phabricator.wikimedia.org/T271168 (Ottomata) a:Ottomata
[15:13:52] it contains useful modules that SRE used to have only in spicerack (so requiring admin privileges)
[15:14:04] Analytics, Event-Platform: TranslationRecommendation* Schemas Event Platform Migration - https://phabricator.wikimedia.org/T271163 (Isaac) > If it's not hard, I'd ask to retain the geocoded data. >> It isn't hard, we can do! Thanks! > All webrequests are logged, so the POST of the event will be availabl...
[15:14:09] Analytics, Event-Platform, Structured-Data-Backlog: SuggestedTagsAction Event Platform Migration - https://phabricator.wikimedia.org/T267351 (Ottomata) a:mforns
[15:14:17] oh awesome, thanks elukey
[15:21:05] (CR) Mforns: [C: -1] "I get a console error when testing with: npm install, npm run dev, npm run server. See inline comment." (3 comments) [analytics/wikistats2] - https://gerrit.wikimedia.org/r/647792 (https://phabricator.wikimedia.org/T188859) (owner: Fdans)
[15:23:10] Analytics, Event-Platform, Research: TranslationRecommendation* Schemas Event Platform Migration - https://phabricator.wikimedia.org/T271163 (Ottomata)
[15:34:02] Analytics, Editing-Team-Request, Event-Platform: EditAttemptStep Event Platform Migration - https://phabricator.wikimedia.org/T271207 (Ottomata)
[15:34:29] Analytics, Editing-Team-Request, Event-Platform: EditAttemptStep Event Platform Migration - https://phabricator.wikimedia.org/T271207 (Ottomata) @MNeisler Let us know if this schema needs client IP and/or geocoded data. If not, it will be removed as part of this migration. TY!
[15:35:48] (CR) Fdans: [C: +2] AQS: add configuration for timeout to Druid requests [analytics/aqs] - https://gerrit.wikimedia.org/r/649884 (https://phabricator.wikimedia.org/T268809) (owner: Fdans)
[15:36:20] (CR) Fdans: [V: +2 C: +2] AQS: add configuration for timeout to Druid requests [analytics/aqs] - https://gerrit.wikimedia.org/r/649884 (https://phabricator.wikimedia.org/T268809) (owner: Fdans)
[15:42:54] Analytics, Analytics-EventLogging, Event-Platform, Performance-Team: NavigationTiming Extension schemas Event Platform Migration - https://phabricator.wikimedia.org/T271208 (Ottomata)
[15:44:42] Analytics, Analytics-EventLogging, Event-Platform, Performance-Team: NavigationTiming Extension schemas Event Platform Migration - https://phabricator.wikimedia.org/T271208 (Ottomata) @Gilles @Krinkle I'm not sure if this list is correct. I've taken it from our [[ https://docs.google.com/spreads...
[15:45:19] Analytics, Analytics-EventLogging, Analytics-Kanban, Event-Platform, and 4 others: Migrate legacy metawiki schemas to Event Platform - https://phabricator.wikimedia.org/T259163 (Ottomata)
[15:46:33] (PS6) Fdans: Wikistats testing framework: Replace Karma with Jest [analytics/wikistats2] - https://gerrit.wikimedia.org/r/648376
[15:47:22] (CR) Fdans: Wikistats testing framework: Replace Karma with Jest (1 comment) [analytics/wikistats2] - https://gerrit.wikimedia.org/r/648376 (owner: Fdans)
[15:48:24] elukey: I'm reading icinga docs, but having trouble to find what I'm looking for, is it possible that whenever a job ends with non-0 exit code, icinga adds the error message to the alert email?
[15:49:11] mforns: I am not sure, probably it is not really possible :(
[15:49:17] what is the use case?
[15:49:34] thinking about the old task of moving data quality alerts to icinga
[15:50:25] the email contents are templated in puppet I guess?
[15:50:27] one thing that we can add is a reference to a wikipage with a runbook explaining how to retrieve the error/info
[15:50:41] no I think it is something icinga specific
[15:51:12] I'd personally like more a link to a runbook that explains how to check what's wrong
[15:51:17] with links etc..
[15:51:35] more general
[15:51:44] what do you think?
[15:52:04] aha makes sense!
[15:57:47] elukey: maybe the anomaly detection spark job can put the anomaly files in the public folder that gets synced to https://analytics.wikimedia.org/published/
[15:58:16] so that we can add the link to them in the email
[15:58:32] and whoever looks at that can just see the affected metrics with one click
[15:58:43] mforns: I am not sure if we can customize the email, there is an option to add a link to a wikipage only
[15:59:03] elukey: the link needs to be to a wiki page?
[15:59:11] can not be to https://analytics.wikimedia.org/published/ ?
[15:59:38] mforns: it is a static link that cannot vary across alarm occurrences :(
[15:59:52] this is why I was saying to add a runbook
[16:00:02] in the runbook it can be mentioned to check into published etc..
[16:00:13] and if nothing is there, how to retrieve info
[16:00:27] elukey: we could point to the base directory for data quality alerts, but yea, the runbook seems a better idea in the end, the link could be there
[16:00:35] yes, I like
[16:00:36] :]
[16:00:56] thanks!!
[16:06:00] <3
[16:12:30] Analytics, Event-Platform, WMF-Architecture-Team, Services (later): Reliable (atomic) MediaWiki event production - https://phabricator.wikimedia.org/T120242 (Ottomata)
[16:23:25] heya a-team, razzi and I are going to get started on the train
[16:23:36] are there any outstanding refinery-source patches yall want to get in before we do?
[16:24:23] ottomata: not on my end
[16:24:34] (PS5) Fdans: Add Active Editors per Country metric to Wikistats [analytics/wikistats2] - https://gerrit.wikimedia.org/r/647792 (https://phabricator.wikimedia.org/T188859)
[16:25:41] (CR) Ottomata: [C: +2] Update changelog.md for 0.0.143 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/654305 (owner: Ottomata)
[16:25:49] (CR) Fdans: "Thanks @mforns for spotting the console error. Indeed the state doesn't set dimensions on the first init of the app, so I added a check fo" (3 comments) [analytics/wikistats2] - https://gerrit.wikimedia.org/r/647792 (https://phabricator.wikimedia.org/T188859) (owner: Fdans)
[16:25:52] razzi: most of the changes in refinery-source this week are from me
[16:25:53] Analytics-Clusters: Convert labsdb1012 from multi-source to multi-instance - https://phabricator.wikimedia.org/T269211 (Bstorm) >>! In T269211#6721801, @elukey wrote: > 3) Add basic puppetization on site.pp, that I guess it is role `wmcs::db::wikireplicas::analytics_multiinstance` plus hiera settings to add...
[16:25:55] i've already updated changelog
[16:25:56] https://gerrit.wikimedia.org/r/c/analytics/refinery/source/+/654305
[16:25:59] nice
[16:26:20] razzi: want to start the release?
[16:26:20] https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Deploy/Refinery-source
[16:26:20] cool fdans will review after meetings!
[16:26:28] (we have a few wikistats changes we're working on but we'll deploy them separately later, they're not ready yet)
[16:26:34] ok
[16:26:45] ottomata: yeah let's go for it
[16:27:06] you want to do it? or shall I?
[16:27:48] ottomata: Let me try
[16:27:51] coo
[16:29:07] ottomata: next step is the Jenkins build I take it?
[16:29:14] yup!
[16:29:26] i've done step1. changelog
[16:29:33] Starting build #66 for job analytics-refinery-maven-release-docker
[16:39:29] Project analytics-refinery-maven-release-docker build #66: SUCCESS in 9 min 56 sec: https://integration.wikimedia.org/ci/job/analytics-refinery-maven-release-docker/66/
[16:41:05] mforns: is the data quality hourly bundle oozie restart yours?
[16:41:22] ottomata: yes!
[16:41:38] it just updates the threshold of the traffic alarms
[16:42:20] ok, and we need to kill the current one first, right?
[16:42:22] Starting build #33 for job analytics-refinery-update-jars-docker
[16:42:26] via hue?
[16:42:37] what is the Dstart_time='YYYY-MM-DDTHH:00Z' bit about?
[16:42:52] (PS1) Maven-release-user: Add refinery-source jars for v0.0.143 to artifacts [analytics/refinery] - https://gerrit.wikimedia.org/r/654466
[16:42:52] Project analytics-refinery-update-jars-docker build #33: SUCCESS in 30 sec: https://integration.wikimedia.org/ci/job/analytics-refinery-update-jars-docker/33/
[16:42:54] whatever the next time is waiting to run when we kill it?
[16:43:20] ottomata: yes, you need to kill the hourly one first, in hue-next
[16:43:54] the -Dstart_time='...' should be as you say, yes!
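Note on the 16:42 to 16:43 exchange: a minimal sketch of how a value in the `YYYY-MM-DDTHH:00Z` shape could be produced for `-Dstart_time`, assuming the hour wanted is simply the next top-of-hour (UTC) after the moment the old bundle is killed. That is a simplification: the instance that was actually waiting to run may lag behind wall-clock time, so the value should be checked against what Hue shows before use. The function name `next_hourly_start_time` is hypothetical and not part of refinery.

```python
from datetime import datetime, timedelta, timezone


def next_hourly_start_time(now):
    """Return the next top-of-hour after `now` as YYYY-MM-DDTHH:00Z (UTC).

    `now` must be a timezone-aware datetime.
    """
    now_utc = now.astimezone(timezone.utc)
    # Truncate to the start of the current hour, then step forward one hour.
    current_hour = now_utc.replace(minute=0, second=0, microsecond=0)
    return (current_hour + timedelta(hours=1)).strftime("%Y-%m-%dT%H:00Z")


if __name__ == "__main__":
    # Arbitrary example timestamp, standing in for "when the bundle was killed".
    killed_at = datetime(2021, 1, 5, 16, 43, tzinfo=timezone.utc)
    print(next_hourly_start_time(killed_at))
```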
[16:44:05] ok cool
[16:51:16] Analytics: Separate RSVD anomaly detection into a systemd timer for better alarming with Icinga - https://phabricator.wikimedia.org/T263030 (mforns) @ssingh @elukey I've been looking into this a bit and have had some second thoughts. Current approach: - Oozie sends emails whenever the script runs and finds...
[16:52:16] (PS6) Milimetric: Add Active Editors per Country metric to Wikistats [analytics/wikistats2] - https://gerrit.wikimedia.org/r/647792 (https://phabricator.wikimedia.org/T188859) (owner: Fdans)
[16:52:18] (PS7) Milimetric: Wikistats testing framework: Replace Karma with Jest [analytics/wikistats2] - https://gerrit.wikimedia.org/r/648376 (owner: Fdans)
[16:52:20] (PS5) Milimetric: Upgrade Webpack from 2 to 5 [analytics/wikistats2] - https://gerrit.wikimedia.org/r/649311 (https://phabricator.wikimedia.org/T188759) (owner: Fdans)
[16:54:06] (CR) Ottomata: [C: +1] Add refinery-source jars for v0.0.143 to artifacts [analytics/refinery] - https://gerrit.wikimedia.org/r/654466 (owner: Maven-release-user)
[16:54:52] (CR) Razzi: [V: +2 C: +2] Add refinery-source jars for v0.0.143 to artifacts [analytics/refinery] - https://gerrit.wikimedia.org/r/654466 (owner: Maven-release-user)
[16:58:29] razzi, ottomata: There is a patch for you about restarting AQS for druid datasource upgrade :)
[17:00:14] yup thank you! it's in etherpad
[17:10:51] (PS7) Milimetric: Add Active Editors per Country metric to Wikistats [analytics/wikistats2] - https://gerrit.wikimedia.org/r/647792 (https://phabricator.wikimedia.org/T188859) (owner: Fdans)
[17:10:53] (PS8) Milimetric: Wikistats testing framework: Replace Karma with Jest [analytics/wikistats2] - https://gerrit.wikimedia.org/r/648376 (owner: Fdans)
[17:10:55] (PS6) Milimetric: Upgrade Webpack from 2 to 5 [analytics/wikistats2] - https://gerrit.wikimedia.org/r/649311 (https://phabricator.wikimedia.org/T188759) (owner: Fdans)
[17:14:42] Analytics, Event-Platform: Local - https://phabricator.wikimedia.org/T271219 (Pavone9919)
[17:34:41] (PS8) Milimetric: Add Active Editors per Country metric to Wikistats [analytics/wikistats2] - https://gerrit.wikimedia.org/r/647792 (https://phabricator.wikimedia.org/T188859) (owner: Fdans)
[17:34:43] (PS9) Milimetric: Wikistats testing framework: Replace Karma with Jest [analytics/wikistats2] - https://gerrit.wikimedia.org/r/648376 (owner: Fdans)
[17:34:45] (PS7) Milimetric: Upgrade Webpack from 2 to 5 [analytics/wikistats2] - https://gerrit.wikimedia.org/r/649311 (https://phabricator.wikimedia.org/T188759) (owner: Fdans)
[17:37:32] ottomata: I've done a quick count of revisions from mwh and events on simplewiki for December by day: ~0.5% diff, with some days higher (1 day ~4%)
[17:37:52] ottomata: We should continue to investigate :)
[18:13:33] joal: is that the same as before?
[18:13:40] Analytics, Analytics-Kanban: Replace Camus by Gobblin - https://phabricator.wikimedia.org/T271232 (JAllemandou)
[18:14:10] ottomata: I think so but will double check
[18:15:52] joal: lemme know when you want to sync up on gobblin plans for this quarter
[18:16:08] milimetric: when you wish :)
[18:16:37] milimetric: first round now or tomorrow after standup?
[18:17:14] now works for me, cave?
[18:18:04] Analytics, Event-Platform: Local - https://phabricator.wikimedia.org/T271219 (Reedy) Open→Invalid
[18:19:45] sure milimetric - joining
[18:38:19] razzi: shall we proceed with weekly train stuff?
[18:44:16] we have syslog in logstash.wikimedia.org now :)
[18:44:20] (For all our hosts)
[18:44:53] nice
[18:45:08] ottomata: yep, I'm good to proceed
[18:45:13] ok, so where we at?
[18:45:49] Analytics, Analytics-Kanban: Replace Camus by Gobblin - https://phabricator.wikimedia.org/T271232 (JAllemandou)
[18:48:10] ottomata: time to deploy refinery?
[18:48:44] great go for it
[18:48:57] after that we can jump in bc and do the rest together?
[18:50:28] Analytics, Analytics-Data-Quality, Analytics-Kanban: Unique devices numbers for all wikipedias missing for Agust and SEptember - https://phabricator.wikimedia.org/T271170 (Nuria) Open→Resolved
[18:50:29] ottomata,razzi - one thing for AQS - in theory the two changes could go out in one go (druid datasource config change + timeout) but I'd keep them separate (with some delay between them) to avoid mixing up
[18:50:39] k
[18:51:08] Analytics, Analytics-Data-Quality, Analytics-Kanban: Unique devices numbers for all wikipedias missing for Agust and SEptember - https://phabricator.wikimedia.org/T271170 (Nuria) Thanks for the fast response!
[18:51:21] razzi: is your caching change for AQS also in?
[18:52:01] elukey: yes, I suppose I should have updated the train etherpad
[18:52:07] will do so now
[18:52:47] super
[18:53:07] let's remember also to confirm that the new cache settings works
[18:53:25] and also https://grafana.wikimedia.org/dashboard/db/aqs
[18:53:33] in theory traffic patterns shouldn't change
[18:56:40] (PS1) Razzi: Bump up refinery-source version to 0.0.143 [analytics/refinery] - https://gerrit.wikimedia.org/r/654476
[18:58:19] * elukey dinner!
[19:02:17] hmm, razzi i don't know if there is a need for ^^, is there?
[19:02:32] is there a relevant change for webrequest load we need to make sure is applied?
[19:04:25] ottomata: not that I know of, so I suppose not
[19:06:26] ya we only need to bump jars and restart jobs for things that have relevant changes
[19:07:17] perhaps the "Refine - use PERMISSIVE mode and log more info about corrupt records" change should be deployed though?
[19:08:02] yes def
[19:08:08] that is in the etherpad
[19:08:31] was thinking we could do everything after the refinery deploy together
[19:08:35] starting on line 35
[19:08:37] ?
[19:09:35] perhaps I'm missing something, but wouldn't that permissive mode change not apply unless we upgrade the refinery-source version in refinery?
[19:11:00] the jars are deployed along with refinery
[19:11:08] but not everything that uses them is only in refinery
[19:11:09] in this case
[19:11:30] puppet is configuring a systemd timer to launch a spark job using refinery-job.jar
[19:11:38] so
[19:11:40] https://gerrit.wikimedia.org/r/c/operations/puppet/+/654308
[19:11:49] will bump the version that is used for that job
[19:16:55] ottomata: ok, I'll do the refinery scap deploy
[19:16:59] gr8
[19:17:22] !log deploying refinery for weekly train
[19:17:24] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[19:26:29] mforns: you looking at that metric patch? https://gerrit.wikimedia.org/r/c/analytics/wikistats2/+/647792
[19:26:43] I wanted to try and deploy today, so I'm testing as well
[19:33:52] milimetric: yes I intended to review in short
[19:34:02] I will do now
[19:36:11] ottomata: I think refinery is all set; meet in batcave for next steps?
[19:36:55] otw
[19:39:46] milimetric: I think you left a debugger in the code, or was it on purpose?
[19:40:01] yeah, mforns, a later patch gets rid of it
[19:40:36] milimetric: you mean a later patch set?
[19:40:47] the last patch set I see is 8, and it does contain the debugger
[19:40:49] yeah, look at the webpack 2->5 one, it removes it
[19:40:56] (later change, not patch, sorry)
[19:41:01] oh ok ok
[19:49:07] milimetric: I got another error on the metric change, commenting
[19:49:43] elukey: do we always deploy refinery to thin e.g. labstore, and hadoop-test
[19:49:44] ?
[19:58:48] ottomata: it depends, if needed yes, for hadoop-test I usually do it once in a while
[19:59:05] (CR) Mforns: [C: -1] "Now I can see the metric, which is Awesomeee :D" [analytics/wikistats2] - https://gerrit.wikimedia.org/r/647792 (https://phabricator.wikimedia.org/T188859) (owner: Fdans)
[20:00:12] it doesn't hurt to deploy on those systems :)
[20:06:58] Ending my day team - see you tomorrow
[20:07:36] ok elukey ya we did thin, but won't bother with hadoop-test unless needed
[20:40:36] (PS1) Razzi: Update aqs to cf9b064 [analytics/aqs/deploy] - https://gerrit.wikimedia.org/r/654491
[20:40:49] (CR) Milimetric: "> bundle.js:3574 TypeError: Cannot read property 'key' of undefined" [analytics/wikistats2] - https://gerrit.wikimedia.org/r/647792 (https://phabricator.wikimedia.org/T188859) (owner: Fdans)
[20:43:26] (CR) Razzi: [V: +2 C: +2] Update aqs to cf9b064 [analytics/aqs/deploy] - https://gerrit.wikimedia.org/r/654491 (owner: Razzi)
[20:43:55] !log deploy aqs as part of train
[20:43:57] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[20:45:46] !log Refine changes: event tables now have is_wmf_domain, canary events are removed, and corrupt records will result in a better monitoring email
[20:45:47] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[20:57:01] milimetric: razzi and I are about to bump mw hist snapshot in aqs
[20:57:06] you avail to test?
[20:58:48] Analytics, Better Use Of Data, Event-Platform, Product-Infrastructure-Data, and 2 others: MEP Client MediaWiki PHP - https://phabricator.wikimedia.org/T253121 (Mholloway) Open→Resolved
[20:58:50] Analytics, Analytics-Kanban, Growth-Team, Product-Analytics, Patch-For-Review: Migrate Growth EventLogging schemas to Event Platform - https://phabricator.wikimedia.org/T267333 (Mholloway)
[21:22:10] (PS1) Gerrit maintenance bot: Add tr.wikivoyage to pageview whitelist [analytics/refinery] - https://gerrit.wikimedia.org/r/654502 (https://phabricator.wikimedia.org/T271260)
[21:27:45] ottomata: I can test in like 10 min?
[21:32:20] milimetric: we deployed it!
[21:32:30] !log bumped mediawiki history snapshot version in AQS
[21:32:32] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[21:32:34] we tested i think
[21:39:20] i suppose a warning, i need to download and split a 200GB file on stat1007 to upload it to hadoop, stat1007 has 2+TB of disk space in /srv so i don't expect to cause anyone problems, but who knows...
[21:39:59] and decompress..i wouldn't be surprised if it needs ~1 TB in intermediate states
[21:45:28] k thanks for the heads up
[21:46:26] ottomata: tested, all good
[21:54:31] gr8
[22:21:17] l8rs all!
[23:28:50] Analytics, Event-Platform, Structured-Data-Backlog: SuggestedTagsAction Event Platform Migration - https://phabricator.wikimedia.org/T267351 (Ramsey-WMF) Thanks for the notice! By all means, proceed 😄 >>! In T267351#6722466, @mforns wrote: > @Ramsey-WMF Hi! Just letting you know that next week we'l...
[23:33:48] Analytics-Radar, Product-Analytics, Growth-Team (Current Sprint), MW-1.36-notes (1.36.0-wmf.25; 2021-01-05): HomepageVisit schema validation errors - https://phabricator.wikimedia.org/T269966 (Etonkovidova) Open→Resolved >>! In T269966#6721671, @Tgr wrote: > No idea how that would happen....
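Note on the 21:39 messages: a minimal sketch of splitting one large file into fixed-size parts before uploading them to hadoop, assuming plain byte-wise splitting is acceptable for the data in question. The chat does not say how the file was actually split; the function name `split_file`, the `.partNNNNN` naming and the chunk size are all hypothetical.

```python
from pathlib import Path


def split_file(path, chunk_bytes, buffer_bytes=1024 * 1024):
    """Split `path` into consecutive parts of at most `chunk_bytes` bytes each.

    Parts are written next to the source as <name>.part00000, <name>.part00001, ...
    Returns the list of part paths in order.
    """
    if chunk_bytes <= 0:
        raise ValueError("chunk_bytes must be positive")
    src = Path(path)
    parts = []
    with src.open("rb") as reader:
        index = 0
        while True:
            part_path = src.with_name(f"{src.name}.part{index:05d}")
            written = 0
            with part_path.open("wb") as writer:
                # Copy in small buffers so memory use stays flat for huge files.
                while written < chunk_bytes:
                    block = reader.read(min(buffer_bytes, chunk_bytes - written))
                    if not block:
                        break
                    writer.write(block)
                    written += len(block)
            if written == 0:
                # Source exhausted: drop the empty trailing part and stop.
                part_path.unlink()
                break
            parts.append(part_path)
            index += 1
    return parts
```

Because the parts are written alongside the source, the split alone needs roughly as much free disk again as the file itself, which fits the caution in the chat about intermediate space on stat1007.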