[02:17:20] Analytics, Pageviews-API, Chinese-Sites, Pageviews-Anomaly: Unusual high page view on Chinese Wikipedia - https://phabricator.wikimedia.org/T269065 (Shizhao)
[02:27:29] Analytics, Pageviews-API, Chinese-Sites, Pageviews-Anomaly: Unusual high page view on Chinese Wikipedia - https://phabricator.wikimedia.org/T269065 (Shizhao) >>! In T269065#6659583, @MilkyDefer wrote: > I think there are two possibilities: > > 1. Pageviews tool is broken; or > 2. Someone is sending fr...
[02:31:11] Analytics, Chinese-Sites, Pageviews-Anomaly: Unusual high page view on Chinese Wikipedia - https://phabricator.wikimedia.org/T269065 (Shizhao)
[07:11:29] good morning
[07:49:23] !log restart oozie to pick up new settings for T264358
[07:49:26] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[07:49:26] T264358: Investigate oozie banner monthly job timeouts - https://phabricator.wikimedia.org/T264358
[08:03:02] addshore: o/
[08:03:04] hello hello
[08:03:11] hope everything is well on your side :)
[08:03:54] if you have time, can you tell me nowadays who in WMDE requires access to analytics-wmde-users? I'd like to add it to https://wikitech.wikimedia.org/wiki/Analytics/Data_access#Access_Groups so SRE knows when people request access
[08:04:31] (sometimes they reach out asking "This person is in WMDE, should I add the username to analytics-wmde-users too?")
[08:16:46] team: I modified https://wikitech.wikimedia.org/wiki/Analytics/Data_access according to what we discussed during standup
[08:17:48] Analytics, Analytics-Kanban: Analytics Ops Technical Debt - https://phabricator.wikimedia.org/T240437 (elukey)
[08:17:53] Analytics-Clusters: Apply proper permissions to stat100x home directories - https://phabricator.wikimedia.org/T262183 (elukey) Open→Declined >>! In T262183#6451157, @Milimetric wrote: > potential easier way: require belonging to analytics-privatedata to log into stat1xxx. This is what we decided to...
[08:18:26] Analytics, Analytics-Kanban: Analytics Ops Technical Debt - https://phabricator.wikimedia.org/T240437 (elukey)
[08:18:28] Analytics: Deprecate the analytics-users POSIX group - https://phabricator.wikimedia.org/T269150 (elukey)
[08:18:42] going to send an email to analytics-announce@
[08:37:15] hi team; I'm helping out with spinning up a service on prod (https://phabricator.wikimedia.org/T265722), that requires some data generated on Hadoop. We are looking at ingesting the dataset in mysql. However, I was wondering if serving off one of the analytics datastores could be an option. IIRC joal showed wikistats dashboards that were using druid as a backend.
[08:39:11] do you think this could be a feasible path?
[08:40:09] Hi gmodena!
[08:40:16] we don't have an SLO in place yet, though my understanding is that we (=PET) are looking at providing support only during biz hours. There's an ongoing thread with SREs about this.
[08:40:19] joal 'morning :)
[08:41:46] gmodena: along with the SLO you already mentioned, the information we'd need to assess is: data size and format (granularity, etc) and data access patterns
[08:42:18] gmodena: Those two aspects are key to choose which backend would best fit the need for serving the data
[08:43:40] gmodena: The 2 backends we use in analytics-land are druid (dimensional data to slice and dice) and cassandra (random-read access) -- simplified view obviously
[08:45:09] gmodena: While analytics doesn't use mariadb as backend, we use it as source, and we could definitely load data into it (we use sqoop to get data FROM mariadb, we could use it to load INTO)
[08:46:01] gmodena: Wednesday is kids day for me, meaning I'll be mostly off up to siesta time (1pm CEST), work ~2h, then off again until ~6pm CEST, then back for a long evening
[08:46:05] joal roger that. The input data is ~16GB, refreshed and ingested monthly. The service is expected to see a QPS (and db reads) in the order of a hundred queries / hour. We *might* need to write data back, and I appreciate this would complicate things - this could be sidestepped. I'll share more details in pvt.
[08:46:54] gmodena: hi! one thing from my side - I'd avoid loading data into mysql (analytics I mean) unless it is a one-off or for some tests
[08:47:07] elukey: hello :)
[08:47:31] joal ack, no rush! It's mostly speculation at this point. Many thanks for the feedback & enjoy daddy day :)
[08:47:44] <3 gmodena - Talk later
[08:47:53] plus I want to make clear that we should be really really mindful in avoiding to link a production service (with possibly some tight and high availability requirements) with Analytics datastores
[08:48:00] joal: o/
[08:48:09] elukey ack
[08:48:59] elukey the feasibility of linking analytics to prod is what I'm trying to find out
[08:49:18] and no is a perfectly fine answer, but at least we know :)
[08:49:19] gmodena: I don't want to be the grumpy ops in the picture, just to make clear that we support our systems with the idea that if they break production is not affected (as much as possible)
[08:49:36] elukey understood
[08:49:41] to give you some context
[08:49:51] for example, we have an interesting use case for discovery
[08:50:33] they regularly train models for ranking on hadoop, and push the result to Swift (our object store basically, like S3), and from there they fetch it and upload it to elastic search
[08:51:40] we explored the mysql option, and pushing flat files to swift was considered but rejected. I just wanted to investigate if druid (or similar) could also have been a candidate. If that's not the case, that is good to know
[08:52:32] elukey that use case is very similar to what we want to achieve
[08:52:52] we don't have a training step (in ML terms), but the data prep one should be equivalent
[08:53:39] gmodena: so we can definitely think about druid, but it depends on the requirements for this service
[08:54:25] we can have a quick chat on meet if you want
[09:31:41] elukey: so it is / should basically only be for interacting with the wmde-analytics user (checking logs and manually running the scripts that run there)
[09:50:51] addshore: ack ok! So definitely not needed for all WMDE users
[09:51:00] nope
[09:51:39] I mean, at points it would be useful to just have everyone have access to it, and if people already have analytics access then they may as well also be in this group
[09:51:53] otherwise we have to do the classic 3 day wait for people to get added to the group?
[09:54:43] addshore: yes exactly, but let's keep the usage of the group to a limited set of users if possible
[09:54:50] yup :)
[09:54:55] :)
[09:55:01] ok updated docs, thanks!
[09:55:11] np!
[10:01:22] Analytics: Convert labsdb1012 from multi-source to multi-instance - https://phabricator.wikimedia.org/T269211 (Marostegui)
[10:06:31] Analytics: Convert labsdb1012 from multi-source to multi-instance - https://phabricator.wikimedia.org/T269211 (elukey) I am going to have a chat with my team about it, we'll add this as goal for next quarter if it is ok (seems so timing wise, but let me know). What we (as Analytics) we'll have to do is: 1...
[11:32:59] * elukey afk! Lunch!
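For a sense of scale on the serving question discussed above (a ~16GB dataset refreshed monthly, read on the order of a hundred queries per hour), here is a minimal sketch of the arithmetic. The figure of 100 queries/hour is an assumption taken from the rough estimate in the chat, and `queries_per_second` is a hypothetical helper, not part of any Wikimedia tooling.

```python
def queries_per_second(queries_per_hour: float) -> float:
    """Convert an hourly query count into an average per-second rate."""
    return queries_per_hour / 3600.0


# Assumed load: roughly a hundred queries per hour, per the estimate in the chat.
qps = queries_per_second(100)
print(f"{qps:.3f} queries/second on average")
```

At well under one query per second on average, read throughput alone probably would not decide between backends; the access pattern and the availability requirements raised by joal and elukey are likely to matter more.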
[12:14:57] Analytics-Radar, Editing-team, MediaWiki-Page-editing, Platform Engineering, and 2 others: EditPage save hooks pass an entire `EditPage` object - https://phabricator.wikimedia.org/T251588 (daniel) p:Medium→Low
[13:03:13] Analytics, Release-Engineering-Team (Development services): Unable to clone git repo from stat1008 - https://phabricator.wikimedia.org/T268290 (kostajh) Open→Resolved a:kostajh Thanks for the clarification @Dzahn and @brennen.
[14:12:32] (CR) Joal: "Comments on coordinator.xml and workflow.xml not taken into account :)" [analytics/refinery] - https://gerrit.wikimedia.org/r/640146 (https://phabricator.wikimedia.org/T251777) (owner: Fdans)
[14:30:42] bearloga
[14:31:37] oh weird, IRCCloud doesn't have a search/find text feature
[14:40:24] ottomata: o/
[14:40:42] so for the kerberos enabled refactoring, I found something that may also work fine in labs/cloud
[14:40:45] https://gerrit.wikimedia.org/r/c/operations/puppet/+/641958/15/modules/kerberos/manifests/exec.pp
[14:41:04] the idea is to have a flag that removes the wrapper from execs
[14:41:15] tunable per cluster, or even global
[14:41:51] and the flag is set for https://gerrit.wikimedia.org/r/c/operations/puppet/+/641958/15/modules/profile/manifests/kerberos/client.pp
[14:41:58] that we include everywhere now
[14:42:56] lemme know if it is something that could work, in case I am ready to merge the patch (then I have another one for systemd timers)
[14:43:05] (but same thing, it uses the same flag)
[14:49:30] Analytics, Product-Analytics, Inuka-Team (Kanban): Set up preview counting for KaiOS app - https://phabricator.wikimedia.org/T244548 (nshahquinn-wmf)
[14:51:42] +1 elukey looks great!
[14:52:19] ottomata: thank yoooo
[14:52:25] *youuuu
[14:52:53] I am going through the pcc diff another time, and will rollout with puppet disabled
[14:53:46] elukey: will you have time today for migrationnnnn?
[14:54:11] mforns: yep sure!
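The flag elukey describes at 14:41 can be sketched roughly as below. This is an illustration in Python, not the Puppet code from the linked patch; `build_exec_command`, the `kerberos_enabled` parameter, the `analytics` user, and the wrapper name `run-with-kerberos` are all hypothetical stand-ins.

```python
from typing import List

# Hypothetical wrapper name; the wrapper used in the actual Puppet patch may differ.
KERBEROS_WRAPPER = "run-with-kerberos"


def build_exec_command(command: List[str], user: str, kerberos_enabled: bool = True) -> List[str]:
    """Return the command to run, prefixed with the wrapper only when the flag is set."""
    if kerberos_enabled:
        return [KERBEROS_WRAPPER, user] + command
    # Flag disabled (e.g. a cluster without Kerberos): run the command unwrapped.
    return list(command)


# Same exec definition, with and without the flag.
print(build_exec_command(["hdfs", "dfs", "-ls", "/"], user="analytics"))
print(build_exec_command(["hdfs", "dfs", "-ls", "/"], user="analytics", kerberos_enabled=False))
```

Keeping the decision in one place means callers never mention the wrapper themselves, which is what would let a single per-cluster or global setting switch it off.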
[14:54:16] I think jo-seph won't be there... but I guess we can do it
[14:54:44] :D
[14:54:53] when is it good for you?
[14:56:51] mforns: In an hour it would be great, but we can do it now if it is better for you
[14:57:36] elukey: no no, let's do it at 17h
[14:57:43] great :]
[14:58:09] ack :)
[15:34:16] (CR) Fdans: Add historical_raw job to load data from pagecounts_raw (9 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/640146 (https://phabricator.wikimedia.org/T251777) (owner: Fdans)
[15:34:31] joal: my bad, I didn't hit reply with my comments
[15:35:33] mforns: https://gerrit.wikimedia.org/r/c/mediawiki/services/eventstreams/+/644854
[15:47:20] Analytics-Clusters, Operations, ops-eqiad: an-presto1004 shows only the NIC in the boot list - https://phabricator.wikimedia.org/T268951 (Cmjohnson) I will have to take a look at the server and get back to you. I am assuming this is okay to take down since the disks are not being seen. Sounds lik...
[15:48:11] Analytics-Clusters, Operations, ops-eqiad: an-presto1004 shows only the NIC in the boot list - https://phabricator.wikimedia.org/T268951 (elukey) @Cmjohnson yes please take it down anytime :)
[15:55:31] mforns: an snap I am merging a puppet change for systemd timers, will try to do an-launcher1002 first
[15:56:09] oh elukey no problemo, we can kill ingestion if necessary
[15:56:11] I am forcing all timers to use kerberos by default, to remove settings that we don't need
[15:56:20] so some timers on launcher are changed
[15:56:20] aha
[15:56:44] elukey: we can postpone the migration 30 mins?
[15:57:05] I think we only need 30 mins to complete it, so we'd have time before standup
[15:57:49] sure!
[15:58:06] ok, this way you can finish
[15:58:12] it helps me since puppet is failing on an-launcher of course -.-
[16:02:31] PROBLEM - Check the last execution of refinery-sqoop-whole-mediawiki on an-launcher1002 is CRITICAL: CRITICAL: Status of the systemd unit refinery-sqoop-whole-mediawiki https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[16:04:35] this is not me! commonswiki:image :D
[16:04:45] I think it was mentioned by Joseph yesterday
[16:23:34] mforns: I am still trying to fix the weird problem, sorry :(
[16:24:46] Analytics-Radar, MediaWiki-extension-requests: "Reverted edits" view for Contributions page - https://phabricator.wikimedia.org/T186536 (DannyS712) Open→Invalid >>! In T186536#6336639, @Ostrzyciel wrote: > If I understand how Special:Contributions' //Tag filter// field works, once T254074 is comp...
[16:25:09] hiya a-team, should I look into Check the last execution of refinery-sqoop-whole-mediawiki is CRITICAL ?
[16:26:31] ottomata: I think that it was expected, one table failed and Joseph mentioned yesterday IIRC, maybe we could wait standup to brainbounce with him
[16:27:44] k
[16:27:46] elukey: no problem, maybe I can do that later today with ottomata, or we can try tomorrow
[16:29:32] in general, though, yeah, that sqoop job is important for the mw history pipeline. I would look to see which table failed and if the rest of the pipeline can be allowed to continue (fake success flags, etc.)
[16:29:40] or maybe... ottomata, do you have time now until standuppp?
[16:30:21] milimetric: the commonswiki.image table is the only one that failed, wasn't it mentioned by Joseph during standup?
[16:30:38] it was but I thought he was going to fix it
[16:30:53] this is why I suggested to wait for standup, to ask him :)
[16:31:17] I guess he couldn't find the right way to split the min/max across reducers, that's probably what failed judging by commonswiki
[16:31:20] mforns: an-launcher1002 is in a weird state so you'll have to wait for me :(
[16:31:35] mforns: we can do it after I fix this, even after standup, if you'll have time
[16:31:35] elukey: oh, of course
[16:31:39] yeah, but I don't like to depend on Jo, we can figure it out :)
[16:31:46] ottomata: let's look together?
[16:31:52] ack :)
[16:31:52] I'll get in the cave (cc razzi )
[16:31:59] elukey: ok, let's see how it goes, don't want to make you work late
[16:32:04] ok milimetric
[16:32:19] 1 miin
[16:40:25] (CR) Mforns: Add SpecialInvestigate schema to EventLogging whitelist (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/628237 (https://phabricator.wikimedia.org/T262496) (owner: Jenniferwang)
[16:47:14] !log faked _SUCCESS flag for image table to allow daisy-chained mediawiki history load dependent coordinators to keep running
[16:47:16] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[16:50:34] !log restarted turnilo to clear deleted datasource
[16:50:36] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[16:59:24] Analytics, Data-release, Privacy Engineering, Research, Privacy: Evaluate a differentially private solution to release wikipedia's project-title-country data - https://phabricator.wikimedia.org/T267283 (Nuria) a:Nuria
[16:59:44] Analytics, Data-release, Privacy Engineering, Research, Privacy: Evaluate a differentially private solution to release wikipedia's project-title-country data - https://phabricator.wikimedia.org/T267283 (Nuria) @Aklapper I assigned to myself again after my account was re-activated
[17:27:11] (CR) Jenniferwang: Add SpecialInvestigate schema to EventLogging whitelist (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/628237 (https://phabricator.wikimedia.org/T262496) (owner: Jenniferwang)
[17:30:12] fdans: sorry got other meeting we can talk later this afternoon
[17:42:57] elukey: we found the reason here: https://phabricator.wikimedia.org/T222378#5155235 and we think it's totally fine to uncouple the timers, just have them all run separately
[17:44:27] milimetric: ack nice!
[17:50:09] !log Manually start refinery-sqoop-production on an-launcher1002 to cover for coupled runs failure
[17:50:10] !log starting netflow migration wmf->event
[17:50:10] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[17:50:12] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[18:20:46] !log finished netflow migration wmf->event
[18:20:49] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[18:21:23] Analytics-Radar, Performance-Team (Radar), Readers-Web-Backlog (Kanbanana-FY-2020-21), Vue.js (Vue.js-Search): Revise schema and performance dashboards for Vue.js search - https://phabricator.wikimedia.org/T250336 (Jdlrobson)
[18:28:40] Analytics, Data-release, Privacy Engineering, Research, Privacy: Evaluate a differentially private solution to release wikipedia's project-title-country data - https://phabricator.wikimedia.org/T267283 (Aklapper) Thanks (and again sorry!)
[18:41:23] ottomata: related funny piece of software: https://issues.apache.org/jira/browse/CALCITE-4034
[18:41:36] ottomata: related to our question about dumps
[18:41:41] ottomata: I'd love to try that
[18:43:21] MEH - I don't manage to get that image table from commons out :(
[18:48:23] * elukey afk!
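The 16:31 guess about splitting the min/max refers to dividing a table's key range into chunks that parallel workers import independently. A minimal sketch of even range splitting, assuming an integer key; `split_range` is a hypothetical helper and this is not Sqoop's actual implementation.

```python
from typing import List, Tuple


def split_range(lo: int, hi: int, parts: int) -> List[Tuple[int, int]]:
    """Split the inclusive integer range [lo, hi] into `parts` contiguous chunks."""
    if parts < 1 or hi < lo:
        raise ValueError("need parts >= 1 and hi >= lo")
    total = hi - lo + 1
    parts = min(parts, total)  # never create empty chunks
    base, extra = divmod(total, parts)
    ranges = []
    start = lo
    for i in range(parts):
        # The first `extra` chunks take one additional key each.
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size - 1))
        start += size
    return ranges


print(split_range(1, 10, 3))  # [(1, 4), (5, 7), (8, 10)]
```

Even splits assume keys are spread uniformly between the minimum and maximum; when they are not, some chunks end up far larger than others, which could be one reason a very large table is hard to import this way.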
[19:01:26] ottomata: materialize just raised 32M dollars, check out their examples, they are all done with wikipedia streams: https://materialize.com/quickstart/
[19:28:38] (CR) Mforns: [C: -1] "I see, the tool field is not sensitive then. Thanks for the explanation." (3 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/628237 (https://phabricator.wikimedia.org/T262496) (owner: Jenniferwang)
[19:35:54] oh sorry razzi, wrong channel, now ok
[19:42:25] oh razzi, don't worry, cdanis just beat me to asking you and merged that
[19:43:41] sure thing
[20:00:49] OH nuria cool!
[20:01:01] cc ottomata I KNOW
[20:01:47] nuria: did you marcel's eventstreams gui?
[20:01:53] stream.wikimedia.org/v2/ui
[20:02:01] see*
[20:02:16] joal: yeah that would be pretty cool
[20:04:12] mforns: So niceeeeee
[20:04:52] (PS4) Jenniferwang: Add SpecialInvestigate schema to EventLogging whitelist [analytics/refinery] - https://gerrit.wikimedia.org/r/628237 (https://phabricator.wikimedia.org/T262496)
[20:05:21] nuria: it's in beta too
[20:05:21] https://stream-beta.wmflabs.org/v2/ui/#/
[20:05:23] with all streams
[20:05:35] ottomata: ooohh, ne BABY!
[20:05:38] *new
[20:07:00] (CR) Jenniferwang: Add SpecialInvestigate schema to EventLogging whitelist (2 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/628237 (https://phabricator.wikimedia.org/T262496) (owner: Jenniferwang)
[20:07:51] mforns: review meeeee https://gerrit.wikimedia.org/r/c/mediawiki/services/eventstreams/+/644854 :)
[20:11:51] luuukin
[20:13:12] ottomata: should we add a link to ?doc?
[20:13:17] in the UI
[20:14:36] oh, you already did! xD sorry
[20:15:01] :D
[20:35:38] (CR) Mforns: [C: -1] "I think your last changes didn't make it to Gerrit! :]" [analytics/refinery] - https://gerrit.wikimedia.org/r/628237 (https://phabricator.wikimedia.org/T262496) (owner: Jenniferwang)
[20:36:58] ottomata: code looks good! just thought link names could be 'Spec' and 'Wiki'?
[20:36:58] ottomata: code looks good! just thought link names could be 'Spec' and 'Wiki'?
[20:37:02] two times?
[20:37:05] xD
[20:37:09] mforns: just pushed patch to do 'API Docs' and 'Wiki'
[20:37:19] ok, LGTM
[20:39:25] ottomata: +2, I let you merge it
[20:40:25] oh, jenkins will...
[20:41:07] great :)
[20:42:27] are you deploying prod?
[20:44:56] nuria: thanks :]
[20:49:00] mforns: i will soon
[20:49:06] :D
[20:49:49] !log deployed eventgate-analytics-external with refactored stream config, hopefully this will work around the canary events alarm bug - T266573
[20:49:52] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[20:49:53] T266573: eventgate-analytics-external occasionally seems to fail lookups of dynamic stream config from MW EventStreamConfig API - https://phabricator.wikimedia.org/T266573
[21:03:19] RECOVERY - Check the last execution of produce_canary_events on an-launcher1002 is OK: OK: Status of the systemd unit produce_canary_events https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[21:16:32] !log Rerun timed out jobs after oozie config got updated (mediawiki-geoeditors-yearly-coord and banner_activity-druid-monthly-coord)
[21:16:34] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[21:20:37] ottomata: I am on clinic duty this week and no APPROVED so far :(
[21:21:04] (PS5) Jenniferwang: Add SpecialInvestigate schema to EventLogging whitelist [analytics/refinery] - https://gerrit.wikimedia.org/r/628237 (https://phabricator.wikimedia.org/T262496)
[21:21:06] they are all lower case :P
[21:21:35] hahah
[21:21:41] i'll try to do better
[21:21:47] i was just so excited to give my first ever approval i guess
[21:21:50] the thrill is gone
[21:22:18] lol
[21:22:45] (CR) Mforns: [V: +2 C: +2] "LGTM!" [analytics/refinery] - https://gerrit.wikimedia.org/r/628237 (https://phabricator.wikimedia.org/T262496) (owner: Jenniferwang)
[21:23:56] (CR) Mforns: "This will take effect after our next deployment train, which happens every Tuesday." [analytics/refinery] - https://gerrit.wikimedia.org/r/628237 (https://phabricator.wikimedia.org/T262496) (owner: Jenniferwang)
[21:26:21] mforns_brb: deployed, you are the front page :)
[21:37:42] !log Manually create _SUCCESS flags for banner history monthly jobs to kick off (they'll be deleted by the purge tomorrow morning)
[21:37:43] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[21:50:13] Analytics, Analytics-Kanban: Alter table for navigation timing errors out in Hadoop test - https://phabricator.wikimedia.org/T268733 (Ottomata) a:Ottomata→None
[22:02:04] ottomata: O.O
[22:17:36] Analytics: Investigate oozie banner monthly job timeouts - https://phabricator.wikimedia.org/T264358 (JAllemandou) I relaunched timedout tasks, manually recreated missing _SUCCESS flags, and jobs succeeded! This looks like a solved problem to me :)
[22:18:07] Analytics, Analytics-Kanban: Investigate oozie banner monthly job timeouts - https://phabricator.wikimedia.org/T264358 (JAllemandou) a:Ottomata
[22:28:03] Analytics, Analytics-Kanban: Investigate oozie banner monthly job timeouts - https://phabricator.wikimedia.org/T264358 (Ottomata) Well, those ones would succeed because now the _SUCCESS flags exist. The problem is when the job times out before the _SUCCESS flags exist. If the October job succeeds witho...
[23:26:48] Analytics, Analytics-Wikistats, Inuka-Team, Language-strategy, and 2 others: Have a way to show the most popular pages per country - https://phabricator.wikimedia.org/T207171 (lexnasser) Hey everyone, @JFishback_WMF has completed his risk analysis of the working API design, and, from a privacy pe...
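The `_SUCCESS` flags mentioned in several `!log` entries are empty marker files whose presence tells downstream jobs that a dataset directory is complete. A minimal sketch of the pattern, using the local filesystem as a stand-in for HDFS; `mark_success`, `is_ready`, and the example directory layout are hypothetical.

```python
import tempfile
from pathlib import Path

SUCCESS_FLAG = "_SUCCESS"


def mark_success(dataset_dir: Path) -> Path:
    """Create an empty _SUCCESS marker in the dataset directory and return its path."""
    dataset_dir.mkdir(parents=True, exist_ok=True)
    flag = dataset_dir / SUCCESS_FLAG
    flag.touch()
    return flag


def is_ready(dataset_dir: Path) -> bool:
    """A downstream job treats the dataset as complete only if the marker exists."""
    return (dataset_dir / SUCCESS_FLAG).is_file()


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical layout for illustration only.
    dataset = Path(tmp) / "image" / "snapshot=example"
    print(is_ready(dataset))  # False: nothing has been written yet
    mark_success(dataset)     # the manual "fake" step from the log entries
    print(is_ready(dataset))  # True: dependent jobs would now proceed
```

Because the check looks only at the marker and not at the data, creating the flag by hand unblocks dependent jobs even when the underlying import failed, which is the trade-off Ottomata's 22:28 comment points at.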