[01:35:24] Analytics, Pageviews-API, Pageviews-Anomaly, Patch-For-Review: "Venuše (planeta)" on cs.wp has surprisingly high numbers in Pageviews Analysis (and also Topviews Analysis) - https://phabricator.wikimedia.org/T239532 (Nuria) The bot detection running on shadow mode now should be able to detect thi...
[01:35:26] (CR) Nuria: [C: +2] Add MeetingRoomApp to the bot regex [analytics/refinery/source] - https://gerrit.wikimedia.org/r/586057 (https://phabricator.wikimedia.org/T239532) (owner: Urbanecm)
[01:36:38] Analytics, Pageviews-API, Pageviews-Anomaly, Patch-For-Review: "Venuše (planeta)" on cs.wp has surprisingly high numbers in Pageviews Analysis (and also Topviews Analysis) - https://phabricator.wikimedia.org/T239532 (Nuria)
[01:36:40] Analytics: Label high volume bot spikes in pageview data as automated traffic - https://phabricator.wikimedia.org/T238357 (Nuria)
[02:02:57] Analytics-Kanban, Better Use Of Data, Product-Analytics: Superset Updates - https://phabricator.wikimedia.org/T211706 (Nuria) That indeed looks very useful, pinging @Gehel and company (@EBernhardson and @dcausse) so they know that if we have a version of elastic that supports SQL access it can be mad...
[06:29:00] !log allow all analytics-privatedata-users to use the GPUs on stat1005/8
[06:29:01] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[06:49:28] elukey: Hey, we want to rename a really large table in labsdb1012, is it okay?
[06:49:46] discussion at #wikimedia-databases
[06:54:39] It's already removed from sqoop but I'm not sure if it's deployed https://gerrit.wikimedia.org/r/c/analytics/refinery/+/585720
[06:57:32] you're moving to airflow instead of oozie? NICE
[06:58:12] Amir1: hi! I have no opposition but I am fairly ignorant about the sqoop tables, would you mind waiting for joal or milimetric?
[06:58:33] sure thing
[06:58:47] ah wait you are asking if https://gerrit.wikimedia.org/r/#/c/analytics/refinery/+/585720/ is deployed
[06:59:03] I think so but I can triple check, one second
[06:59:19] I mean, if it's not it's still fine unless it gets deployed by the next run
[06:59:23] I heard it's monthly
[06:59:41] s/unless/as long as/
[07:00:38] so it is not deployed, and we just completed the monthly mw history sqoop
[07:00:53] it will be surely deployed before next month, so +1 from my
[07:00:54] *me
[07:01:19] cool, is there anything else I should be worried about? for the dbstore replicas?
[07:01:25] Amir1: about airflow - the search team is already testing it on an-airflow1001, we will do the same soon
[07:02:08] Amir1: not that I can think of, dbstore replicas would probably need the same maintenance (I guess) but you are free to go in there too
[07:02:09] nice
[07:02:20] thanks
[07:32:04] fdans: https://datavizcatalogue.com/index.html :)
[07:32:24] I was looking for some example of https://datavizcatalogue.com/methods/treemap.html for superset
[07:33:09] and then I wondered - superset seems to be something that you'd enjoy a lot working on, interested in being more involved with upstream?
[07:41:22] Analytics-Kanban, Better Use Of Data, Product-Analytics: Superset Updates - https://phabricator.wikimedia.org/T211706 (Gehel) >>! In T211706#6030682, @Nuria wrote: > That indeed looks very useful, pinging @Gehel and company (@EBernhardson and @dcausse) so they know that if we have a version of elasti...
[08:43:34] Amir1 / elukey: yep, we just need it deployed before May 1st, but we also need to check the puppet code that calls the sqoop to remove it from the table list
[09:13:04] Analytics, Analytics-EventLogging, Patch-For-Review: Validate JSON-schema before allowing saves in the Schema namespace - https://phabricator.wikimedia.org/T249333 (awight)
[09:17:30] Analytics, Analytics-EventLogging, Patch-For-Review: Validate JSON-schema before allowing saves in the Schema namespace - https://phabricator.wikimedia.org/T249333 (awight) The next step is to translate https://wikitech.wikimedia.org/wiki/Event_Platform/Schemas/Guidelines into a formal schema and "al...
[09:22:53] * elukey going out to get some groceries
[10:32:44] Analytics-Kanban, Better Use Of Data, Product-Analytics: Upgrade to Superset 0.36.0 - https://phabricator.wikimedia.org/T249495 (elukey)
[10:37:30] Analytics-Kanban, Better Use Of Data, Product-Analytics, Patch-For-Review: Upgrade to Superset 0.36.0 - https://phabricator.wikimedia.org/T249495 (elukey) Opened https://github.com/apache/incubator-superset/issues/9468 to upstream to fix a bug found in 0.36.0rc3
[10:37:55] Analytics: Review error in table visualization with Superset 0.36.0 - https://phabricator.wikimedia.org/T249405 (elukey)
[10:38:00] Analytics-Kanban, Better Use Of Data, Product-Analytics, Patch-For-Review: Upgrade to Superset 0.36.0 - https://phabricator.wikimedia.org/T249495 (elukey)
[10:52:31] Analytics, Wikidata, Wikidata-Campsite (Wikidata-Campsite-Iteration-∞): Remove wb_terms from sqoop - https://phabricator.wikimedia.org/T249319 (Addshore)
[10:52:48] Analytics, Wikidata, Wikidata-Campsite (Wikidata-Campsite-Iteration-∞): Remove wb_terms from sqoop - https://phabricator.wikimedia.org/T249319 (Addshore) a:JAllemandou @joal I'll leave this one here until you remove the tables!
[11:32:54] elukey: fast grocery today :)
[11:33:24] yeah not the big one, I went to a small shop
[11:33:53] ok :)
[11:34:07] elukey: datasource has been tested, seems ready to deploy
[11:41:43] joal: aqs1004 depooled and ready for testing
[11:41:50] ack elukey - testing
[11:42:43] All good for me elukey
[11:48:23] super
[11:50:23] !log deploy new druid datasource in Druid public
[11:50:24] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[11:55:43] joal: done!
[11:55:50] \o/ Checking UI elukey
[11:55:53] so I just discovered that apt doesn't work in the vlan now
[11:56:03] meh
[11:56:06] since there is a new host not whitelisted in the firewall
[11:57:28] elukey: UI works for me :)
[12:18:42] elukey: the new TwoColConflictExit schema is finally deployed and ready to refine, at your convenience.
[12:20:16] awight: nice!
[12:21:15] elukey: oooh the 26-hour thing, huh...
[12:33:44] !log Bump AQS druid backend to 2020-03
[12:33:45] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[12:48:26] Analytics, Operations, netops: Move netflow data to Eventgate Analytics - https://phabricator.wikimedia.org/T248865 (MoritzMuehlenhoff) p:Triage→Medium
[13:14:25] Just ran an example with Tensorflow 2.0 and ROCm 3.3! \o/
[13:23:10] !log upgraded stat1008 to AMD ROCm 3.3 (enables tensorflow 2.x)
[13:23:11] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[13:36:24] joal: tf 2.x available on stat1008 :O
[13:40:04] Analytics-Kanban, Better Use Of Data, Product-Analytics: Superset Updates - https://phabricator.wikimedia.org/T211706 (elukey) >>! In T211706#5840804, @ayounsi wrote: > Playing around Superset I came across those already reported bugs: > https://github.com/apache/incubator-superset/issues/7649 > http...
[13:45:54] \o/ elukey
[13:45:57] This is great!
[13:51:17] rerunning the failed refine job
[13:52:24] ottomata: morninggg!
I am wondering if the new web-proxies are throttling us sometimes, leading to these failures in refine
[13:52:42] isaacj: Hi - let me know when you're ready, I'd like to try to help with the request
[13:52:50] hmmm, maybe, there shouldn't be that many requests....although maybe there are a bunch happening all at once?
[13:53:00] at the beginning of the job? it does parallelize them
[13:53:12] I was about to mention that ottomata --^ :)
[13:53:15] joal: about to receive an email :)
[13:53:27] https://phabricator.wikimedia.org/T247510 would help
[13:54:05] isaacj: hm, I think I don't get it - send?
[13:58:36] ottomata: but api.svc would still go through the webproxy no?
[13:59:09] no
[13:59:24] webproxy is only needed I think because of public url
[13:59:40] api.svc resolves to an internal ip
[14:00:22] Ah - I get it isaacj :)
[14:01:36] :thumbs up:
[14:01:48] ottomata: sure but there is no explicit rule for that in the analytics vlan, IIRC Arzhel at the time standardized the access to HTTP via webproxy
[14:02:09] so not only public ips
[14:02:21] ?
[14:02:34] right we'd have to add the rule to the vlan, ya?
[14:02:47] yes but all HTTP traffic now goes through the webproxy
[14:02:47] am free after 1
[14:02:51] oops wrong chat :)
[14:03:03] Analytics, Analytics-Kanban, Continuous-Integration-Infrastructure (phase-out-jessie), Patch-For-Review, and 2 others: Migrate analytics/refinery/source release jobs to Docker - https://phabricator.wikimedia.org/T210271 (hashar)
[14:03:19] ?
[14:03:37] elukey: IIRC sre doesn't like it when internal services use webproxy to access internal services
[14:04:03] ottomata: no idea, my point was to verify this before adding rules
[14:04:26] ah k
[14:04:31] sounds good, in either case
[14:04:51] i looked at the code that would be needed to pass a custom http header down to http lib
[14:04:58] it isn't that easy, mostly because of Java abstractions :p
[14:05:27] Analytics, Pageviews-API, Pageviews-Anomaly, Patch-For-Review: "Venuše (planeta)" on cs.wp has surprisingly high numbers in Pageviews Analysis (and also Topviews Analysis) - https://phabricator.wikimedia.org/T239532 (JAllemandou) I confirm that our automated-traffic detection heuristic (shadow mo...
[14:06:07] joal: is the change I made to EventSchemaLoader last week deployed?
[14:06:22] let me check ottomata
[14:06:34] yessir
[14:06:37] ottomata: --^
[14:06:38] great
[14:06:43] gonna try to use it :)
[14:06:45] :)
[14:20:28] Analytics-EventLogging, Analytics-Kanban, Event-Platform, CPT Initiatives (Modern Event Platform (TEC2)), and 3 others: Modern Event Platform (TEC2) - https://phabricator.wikimedia.org/T185233 (Ottomata)
[14:34:08] isaacj: Added some comments to the query
[14:36:09] Analytics, Analytics-Kanban, Patch-For-Review: Upgrade AMD ROCm to latest upstream - https://phabricator.wikimedia.org/T247082 (elukey)
[14:37:15] joal: thanks, i'll go through them!
[14:48:16] Analytics-EventLogging, Analytics-Kanban, Event-Platform, CPT Initiatives (Modern Event Platform (TEC2)), and 3 others: Modern Event Platform (TEC2) - https://phabricator.wikimedia.org/T185233 (Ottomata)
[14:48:32] joal: all looked good, thanks! just left a clarifying question for you about the time windowing in the anonymized_sessions query
[14:51:13] Just answered isaacj
[14:52:53] joal: oh nice!
by the time this is done, i'll be able to add Spark to my CV ;)
[14:53:00] ;)
[15:22:37] Analytics, Analytics-EventLogging, Patch-For-Review: Validate JSON-schema before allowing saves in the Schema namespace - https://phabricator.wikimedia.org/T249333 (Ottomata) @awight we will be deprecating on wiki schemas over the next couple of quarters. The EventLogging extension won't be used (by...
[15:23:57] Analytics, Analytics-EventLogging, Patch-For-Review: Validate JSON-schema before allowing saves in the Schema namespace - https://phabricator.wikimedia.org/T249333 (Ottomata) There is already extensive CI to ensure that users using new schema repos will abide by these guidelines. BTW, if you are wil...
[16:01:27] Analytics, Analytics-EventLogging, Patch-For-Review: Validate JSON-schema before allowing saves in the Schema namespace - https://phabricator.wikimedia.org/T249333 (Milimetric) Open→Declined we don't plan on working on this, so feel free to reopen if you want to continue work on it (but maybe...
[16:02:49] Analytics, Wikidata, Wikidata-Campsite (Wikidata-Campsite-Iteration-∞): Remove wb_terms from sqoop - https://phabricator.wikimedia.org/T249319 (Milimetric) FYI: checked monthly sqoop and it's not referenced there so puppet is good to go (https://github.com/wikimedia/puppet/blob/ef13ee28202b2ff9653bfa...
[16:04:12] Analytics, VPS-project-codesearch: Add analytics/* gerrit repos to code search - https://phabricator.wikimedia.org/T249318 (Milimetric) a:Milimetric I don't know how to add them. But would love to.
[16:05:01] Analytics, Analytics-Kanban, Patch-For-Review: Make sqoop run in production queue - https://phabricator.wikimedia.org/T249155 (Milimetric) p:Triage→High
[16:06:16] Analytics, Analytics-Kanban, Product-Analytics, Patch-For-Review: Import regularly via sqoop mediawiki_imagelinks table - https://phabricator.wikimedia.org/T249113 (Milimetric) p:Triage→High a:JAllemandou→Nuria
[16:06:52] Analytics, Analytics-Kanban, Product-Analytics, Patch-For-Review: Add new dimensions to druid's pageview_hourly datasource - https://phabricator.wikimedia.org/T243090 (JAllemandou) I merged the patch without realizing that monthly job has not been updated (hourly and daily, but not monthly). Movi...
[16:07:03] Analytics, Better Use Of Data, Product-Analytics, Epic: Session Length Metric. Web implementation - https://phabricator.wikimedia.org/T248987 (Milimetric) a:jlinehan
[16:08:21] a-team: coming to our hangtime? :)
[16:08:41] Analytics, Product-Analytics: Can't publish my draft dashboard on superset - https://phabricator.wikimedia.org/T248904 (Milimetric) will research upstream for a fix and file if not found
[16:08:42] nshahquinn: nuria sent an email cancelling it IIRC
[16:09:01] joal: oh! I don't remember seeing it.
[16:09:13] me neither
[16:09:17] meh - checking
[16:09:33] joal: at this point, should it stay cancelled? :)
[16:09:39] or be recancelled?
[16:09:41] or something
[16:10:07] nshahquinn: sorry, i sent a cancelation
[16:10:13] nshahquinn: and an e-mail about it
[16:10:45] bearloga, nshahquinn: trying to send cancelation again
[16:11:11] Analytics, Better Use Of Data, Product-Infrastructure-Team-Backlog, Wikimedia-Logstash, and 3 others: Documentation of client side error logging capabilities on mediawiki - https://phabricator.wikimedia.org/T248884 (Milimetric) a:jlinehan assigned for documentation, ping @phuedx as well
[16:11:15] nuria: well, no worries, we've got the message now hahaha
[16:11:28] nshahquinn, bearloga: I think you're not in the original invite!
[16:12:05] joal: our team group is...in any case, this meeting was definitely on our calendars
[16:12:25] nshahquinn: not cancelled indeed, but an email had been sent to all attendees
[16:12:34] ah I see
[16:12:48] Analytics, Analytics-Kanban, Product-Analytics, Patch-For-Review: Add new dimensions to druid's pageview_hourly datasource - https://phabricator.wikimedia.org/T243090 (mforns) The reason the monthly job isn't modified is I thought we didn't want to add the dimensions for pageviews_daily datasourc...
[16:12:50] nshahquinn: I just forwarded it to you and bearloga - sorry about the miscommunication folks :(
[16:13:20] joal: hahaha I see. it's okay! we're going to take this time to play some team pictionary 😁
[16:14:12] Analytics, Event-Platform, Inuka-Team, KaiOS-Wikipedia-app: Capture and send back client-side errors - https://phabricator.wikimedia.org/T248615 (Milimetric) Documentation task here: T248884, we'll track this as well to help with docs.
[16:15:01] Analytics, Pageviews-API, Pageviews-Anomaly, Patch-For-Review: "Venuše (planeta)" on cs.wp has surprisingly high numbers in Pageviews Analysis (and also Topviews Analysis) - https://phabricator.wikimedia.org/T239532 (Milimetric) \o/ for our new detection :)
[16:15:07] Analytics, Pageviews-API, Pageviews-Anomaly, Patch-For-Review: "Venuše (planeta)" on cs.wp has surprisingly high numbers in Pageviews Analysis (and also Topviews Analysis) - https://phabricator.wikimedia.org/T239532 (Milimetric) p:Triage→Medium
[16:16:27] Analytics, Analytics-Kanban, Product-Analytics, Patch-For-Review: Add new dimensions to druid's pageview_hourly datasource - https://phabricator.wikimedia.org/T243090 (mforns) Sorry @JAllemandou, posted comment too quickly, do you think we should add the dimensions to the pageviews_daily datasour...
[16:27:04] Analytics, Commons, Epic: Provide download statistics of files on Wikimedia Commons - https://phabricator.wikimedia.org/T218076 (Milimetric) @Ramsey-WMF any more thoughts on the definition of "download"? Once we have that, the technical work is relatively easy.
[16:33:50] Analytics, Analytics-Kanban, Product-Analytics, Patch-For-Review: Add new dimensions to druid's pageview_hourly datasource - https://phabricator.wikimedia.org/T243090 (JAllemandou) @mforns: I had not thought about it this way :) I think there is value in adding the same fields to the pageview_dai...
[16:33:56] elukey: heya
[16:34:31] elukey: can we meet tomorrow before standup to look at airflow in a stats machine?
[16:36:11] mforns: sure!
[16:36:29] elukey: cool :] do you want me to create a meeting?
[16:36:52] mforns: seems ok, otherwise when you join is fine
[16:37:38] ok then I'll just ping you
[17:09:55] Analytics, Operations, ops-eqiad: (Need by: TBD) rack/setup/install kafka-jumbo100[789].eqiad.wmnet - https://phabricator.wikimedia.org/T244506 (elukey) >>!
In T244506#6022851, @Cmjohnson wrote: > These are failing during install. @elukey can you verify the raid configuration please > > Failed to...
[17:11:26] (PS1) Mforns: Add gr.wikimedia to pageviews white-list [analytics/refinery] - https://gerrit.wikimedia.org/r/586400
[17:16:25] * elukey afk for ~30 mins
[17:27:34] (CR) Nuria: [C: +2] Add gr.wikimedia to pageviews white-list [analytics/refinery] - https://gerrit.wikimedia.org/r/586400 (owner: Mforns)
[17:27:42] (CR) Nuria: [V: +2 C: +2] Add gr.wikimedia to pageviews white-list [analytics/refinery] - https://gerrit.wikimedia.org/r/586400 (owner: Mforns)
[17:28:40] (CR) Nuria: [V: +2 C: +2] Add imagelinks to mediawiki-history-load oozie job [analytics/refinery] - https://gerrit.wikimedia.org/r/585294 (https://phabricator.wikimedia.org/T249113) (owner: Joal)
[17:29:47] ottomata: please take a look at this CR when you can https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/585292/
[17:46:57] nuria: seems easy enough, I can merge if you want
[17:47:37] ty!
[17:47:54] ok done!
[17:51:28] * elukey off!
[19:38:47] joal: you still around?
[19:38:51] up
[19:39:00] got a sec for spark scala brain bounce in cave?
[19:39:05] sure, joining
[20:07:30] Analytics, Analytics-EventLogging, Patch-For-Review: Validate JSON-schema before allowing saves in the Schema namespace - https://phabricator.wikimedia.org/T249333 (awight) Thanks for talking me down from the ledge :-) I'd be curious to see how the new system produces events from MediaWiki extension...
[20:13:40] (PS1) Mforns: Add new dimensions to druid pageviews_daily [analytics/refinery] - https://gerrit.wikimedia.org/r/586432 (https://phabricator.wikimedia.org/T243090)
[20:16:34] (CR) Mforns: [V: +2] "I tested this by loading a test datasource to Druid and vetting the loaded data. All seems fine!
See: https://tinyurl.com/s6ydxmo" [analytics/refinery] - https://gerrit.wikimedia.org/r/586432 (https://phabricator.wikimedia.org/T243090) (owner: Mforns)
[21:20:19] (PS1) Ottomata: Unify Refine transform functions to work with both legacy and new event data [analytics/refinery/source] - https://gerrit.wikimedia.org/r/586447 (https://phabricator.wikimedia.org/T238230)
[21:24:15] (CR) jerkins-bot: [V: -1] Unify Refine transform functions to work with both legacy and new event data [analytics/refinery/source] - https://gerrit.wikimedia.org/r/586447 (https://phabricator.wikimedia.org/T238230) (owner: Ottomata)
[21:45:49] Analytics, Analytics-EventLogging, Timeless: EventLogging revision popup gets hidden behind content in Timeless - https://phabricator.wikimedia.org/T249557 (Legoktm)
[21:48:26] Analytics, Analytics-Kanban, Continuous-Integration-Infrastructure (phase-out-jessie), Patch-For-Review, and 2 others: Migrate analytics/refinery/source release jobs to Docker - https://phabricator.wikimedia.org/T210271 (Jdforrester-WMF)
[22:52:41] Analytics, Event-Platform, Growth-Team, MediaWiki-Revision-backend, and 7 others: Replace LinksUpdate Revision methods with RevisionRecord - https://phabricator.wikimedia.org/T249397 (DannyS712)
[23:03:59] Analytics, VPS-project-codesearch: Add analytics/* gerrit repos to code search - https://phabricator.wikimedia.org/T249318 (Ladsgroup) >>! In T249318#6032746, @Milimetric wrote: > I don't know how to add them. But would love to. Awesome. It's in this file: https://gerrit.wikimedia.org/r/plugins/gitiles...