[08:33:42] !log stat1005 upgraded with ROCm 2.7.1
[08:33:44] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[08:35:42] updated https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/AMD_GPU
[08:38:50] Analytics, Operations, Research-management, Patch-For-Review, User-Elukey: Remove computational bottlenecks in stats machine via adding a GPU that can be used to train ML models - https://phabricator.wikimedia.org/T148843 (elukey) Upgraded stat1005 with ROCm 2.7.1, from my tests everything lo...
[08:51:23] neilpquinn: https://github.com/wikimedia-research/Audiences-External_automatic_translation/pull/1 ping :)
[08:59:00] so archiva should work now, have you guys tested a build?
[09:11:58] (PS6) Fdans: Add per file mediarequests endpoint to AQS [analytics/aqs] - https://gerrit.wikimedia.org/r/534824 (https://phabricator.wikimedia.org/T231589)
[09:14:41] (CR) jerkins-bot: [V: -1] Add per file mediarequests endpoint to AQS [analytics/aqs] - https://gerrit.wikimedia.org/r/534824 (https://phabricator.wikimedia.org/T231589) (owner: Fdans)
[09:17:36] what the hell
[09:17:57] why does it say "jerkins-bot" in IRC?
[09:18:08] someone is quite the prankster
[09:18:40] gerrit and in email notifications it still says jenkins
[09:18:41] lol
[09:20:41] aahhaah
[09:46:26] Analytics, Analytics-Kanban, User-Elukey: Decouple analytics zookeeper cluster from kafka zookeeper cluster [2019-2020] - https://phabricator.wikimedia.org/T217057 (elukey) Open→Stalled The testing is currently blocked by two things: * in T231067 we are wondering if Java 8 should be used ins...
[10:05:38] Analytics, Analytics-Kanban: Add more dimensions to netflow's druid ingestion specs - https://phabricator.wikimedia.org/T229682 (elukey) Remaining things to do: 1) Add the netflow kafka supervisor job somewhere in refinery 2) Add alert on realtime data for netflow (we used to do it for banner impression...
[10:45:22] * elukey lunch!
[12:16:23] (PS7) Fdans: Add per file mediarequests endpoint to AQS [analytics/aqs] - https://gerrit.wikimedia.org/r/534824 (https://phabricator.wikimedia.org/T231589)
[12:20:31] (CR) Fdans: Add per file mediarequests endpoint to AQS (4 comments) [analytics/aqs] - https://gerrit.wikimedia.org/r/534824 (https://phabricator.wikimedia.org/T231589) (owner: Fdans)
[12:23:28] (CR) jerkins-bot: [V: -1] Add per file mediarequests endpoint to AQS [analytics/aqs] - https://gerrit.wikimedia.org/r/534824 (https://phabricator.wikimedia.org/T231589) (owner: Fdans)
[12:23:38] ffs
[12:24:28] (PS8) Fdans: Add per file mediarequests endpoint to AQS [analytics/aqs] - https://gerrit.wikimedia.org/r/534824 (https://phabricator.wikimedia.org/T231589)
[12:28:16] (CR) jerkins-bot: [V: -1] Add per file mediarequests endpoint to AQS [analytics/aqs] - https://gerrit.wikimedia.org/r/534824 (https://phabricator.wikimedia.org/T231589) (owner: Fdans)
[12:28:44] (PS9) Fdans: Add per file mediarequests endpoint to AQS [analytics/aqs] - https://gerrit.wikimedia.org/r/534824 (https://phabricator.wikimedia.org/T231589)
[13:45:13] Analytics, New-Readers: Add KaiOS to the list of OS query options for pageviews in Turnilo - https://phabricator.wikimedia.org/T231998 (SBisson) From @KartikMistry Jio phone: `Mozilla/5.0 (Mobile; LYF/F300B/LYF-F300B-001-01-34-070519; Android; rv:48.0) Gecko/48.0 Firefox/48.0 MAIOS/2.5`
[14:27:25] ottomata: didn't know about https://phabricator.wikimedia.org/T232122 \o/
[14:27:30] congrats!
[14:37:05] Analytics, New-Readers: Add KaiOS to the list of OS query options for pageviews in Turnilo - https://phabricator.wikimedia.org/T231998 (SBisson) From @AMuigai Nokia 8110: `Mozilla/5.0 (Mobile; Nokia_8110_4G; rv:48.0) Gecko/48.0 Firefgox/48.0 KAIOS/2.5.1`
[15:04:40] Analytics, Operations, hardware-requests, User-Elukey: codfw: 1 misc node for the Kerberos KDC service - https://phabricator.wikimedia.org/T227425 (RobH) a:RobH→faidon @Faidon, Please note that T227425 & T227288 are for spare pool allocations for Kerberos in both codfw and eqiad. as such,...
[15:05:28] Analytics, Operations, hardware-requests, User-Elukey: eqiad: 1 misc node for the Kerberos KDC service - https://phabricator.wikimedia.org/T227288 (RobH) a:RobH→faidon @Faidon, Please note that T227425 & T227288 are for spare pool allocations for Kerberos in both codfw and eqiad. as such,...
[15:33:14] (CR) Nuria: "Almost there, I think last thing we need to do is sort out possible values for referrer." (1 comment) [analytics/aqs] - https://gerrit.wikimedia.org/r/534824 (https://phabricator.wikimedia.org/T231589) (owner: Fdans)
[15:43:30] Analytics, Analytics-Kanban: Add more dimensions to netflow's druid ingestion specs - https://phabricator.wikimedia.org/T229682 (Nuria) >Add the netflow kafka supervisor job somewhere in refinery Can you explain a bit what we need this for? >Add dimensions for country IP src/dst and amend puppet/refi...
[16:00:38] ping fdans , ottomata
[16:01:09] ping ottomata standdduppp
[16:10:16] (CR) Joal: "Not that I know, it's good to go for train IMO." [analytics/refinery] - https://gerrit.wikimedia.org/r/534611 (https://phabricator.wikimedia.org/T231856) (owner: Joal)
[16:26:15] ottomata: Your spark-streaming-SQL is very cool :)
[16:27:04] :)
[16:27:20] wouldn't it be amazing if we had the jsonschema reg auto integration!!!!
[16:31:03] oh yes :)
[16:32:29] ottomata: something fun to think about in that direction (whether we use spark or flink) is that, since all data needs to transit on the network, running in K8s should be the way
[16:32:41] indeed
[16:32:44] for prod job stuff
[16:32:54] even for fun-jobs stuff
[16:42:41] * elukey off o/
[17:13:11] (CR) Nuria: [C: +2] Cleanup artifacts folder [analytics/refinery] - https://gerrit.wikimedia.org/r/534611 (https://phabricator.wikimedia.org/T231856) (owner: Joal)
[17:13:53] nuria: I'm confident about that --^, but there always is a risk doing those things :S
[17:14:10] So I'm both confident and afraid
[17:15:02] joal: ya, the risk is that we are deleting a jar that is used no?
[17:15:15] correct, that something fails because jar is not here anymore
[17:15:35] fdans, mforns , joal : anything else you are thinking of for deployment train? https://etherpad.wikimedia.org/p/analytics-weekly-train
[17:17:45] Analytics: Superset + Turnilo access for Verena Lindner + Raja Gumienny (WMDE) - https://phabricator.wikimedia.org/T231677 (Nuria) Has @Verena also signed an NDA?
[17:21:19] (CR) Nuria: [V: +2 C: +2] Cleanup artifacts folder [analytics/refinery] - https://gerrit.wikimedia.org/r/534611 (https://phabricator.wikimedia.org/T231856) (owner: Joal)
[17:24:25] Analytics, New-Readers: Add KaiOS to the list of OS query options for pageviews in Turnilo - https://phabricator.wikimedia.org/T231998 (SBisson) PR is here: https://github.com/ua-parser/uap-core/pull/428 Waiting for feedback from the project maintainers.
[17:26:09] (PS1) Nuria: Changelog v0.0.99 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/535897
[17:28:03] (CR) Nuria: [C: +2] "Self merging to deploy." [analytics/refinery/source] - https://gerrit.wikimedia.org/r/535897 (owner: Nuria)
[17:38:04] nuria joal I was thinking that we could save a little space in cassandra by computing all-agents inside aqs
[17:38:19] since for monthly we are already doing some aggregation
[17:38:35] and not load that number in cassandra
[17:39:08] fdans: if we want to do so, we need to change the cassandra-keys to avoid looking up multiple rows for multiple agents
[17:40:02] I'm not sure of the tradeoff here (space gain vs computation needed/queries)
[17:40:15] (CR) Nuria: [V: +2 C: +2] Changelog v0.0.99 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/535897 (owner: Nuria)
[17:40:17] joal: wait why would we need more queries?
[17:40:34] we are already getting the three columns on each cassandra request right?
[17:40:46] for every all-agents query, you need 2 queries (one for each agent)
[17:41:09] joal: hmmmm no each agent has a column in cassandra
[17:41:13] agents are pivoted
[17:41:43] Oh I see ! Pivoted, therefore computing all means only taking 2 values and summing
[17:41:45] we would only have to sum the user + spider
[17:41:50] yes
[17:42:31] The complexity might come with column explosion
[17:44:34] joal: should I do it then?
[17:44:55] fdans: in a meeting now so I can't compute :)
[17:45:02] oh sorry
[17:45:12] np - my bad shouldn't have answered
[17:45:17] will ping later
[17:58:38] building jar 0.0.99
[18:14:54] !log deployment of v0.0.99 of refinery that includes quite a bit of cleanup
[18:14:56] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[18:17:11] ottomata: no capacity (looks like) to deploy to stats1007
[18:17:51] bwa
[18:17:54] ottomata: this keeps on happening but looks like we need to cleanup the dir
[18:18:08] hmm s7 is fine nuria
[18:19:17] you sure it is stat1007 nuria ?
[18:19:39] ottomata: ya
[18:19:44] https://www.irccloud.com/pastebin/J5OBG25E/
[18:20:06] nuria: is there more output?
[18:20:13] ottomata: unless error is reporting something else i do not see, one sec
[18:40:09] Hey fdans - sorry went away for dinner after meeting
[18:40:17] fdans: do you want to talk now or tomorrow?
[18:42:58] !log deployment of v0.0.99 to cluster succeeded, letting it bake for a bit
[18:43:00] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[18:49:52] joal: I can do batcave a couple mins if you're available
[18:50:02] OMW !
[19:05:11] nuria: would you have a minute?
[19:05:25] joal: yes, bc?
[19:05:30] Yes please :)
[19:21:12] Analytics, EventBus, Scoring-platform-team: Change event.mediawiki_revision_score schema to use map types - https://phabricator.wikimedia.org/T225211 (Ottomata)
[19:22:53] Analytics, EventBus, Scoring-platform-team: Change event.mediawiki_revision_score schema to use map types - https://phabricator.wikimedia.org/T225211 (Ottomata) @JAllemandou @Pchelolo Q: do we want to have `scores` be a map type keyed by model name, or still an array of objects? (We'll certainly cha...
[19:23:48] Analytics, EventBus, Scoring-platform-team: Change event.mediawiki_revision_score schema to use map types - https://phabricator.wikimedia.org/T225211 (Ottomata)
[19:26:57] Analytics, EventBus, Scoring-platform-team: Change event.mediawiki_revision_score schema to use map types - https://phabricator.wikimedia.org/T225211 (Ottomata)
[19:30:00] nuria: sorry was writing a task. can I merge that puppet change tomorrow morning?
[19:30:04] so we can watch just in case?
[19:30:09] for refinery jar
[19:30:10] ?
[19:30:11] ottomata: sure
[19:30:30] ottomata: it is going to be GOLDEN
[19:31:20] :)
[19:36:46] Analytics, EventBus, Scoring-platform-team: Change event.mediawiki_revision_score schema to use map types - https://phabricator.wikimedia.org/T225211 (Pchelolo) >>! In T225211#5485276, @Ottomata wrote: > @JAllemandou @Pchelolo Q: do we want to have `scores` be a map type keyed by model name, or still...
[19:41:10] Analytics, EventBus, Scoring-platform-team: Change event.mediawiki_revision_score schema to use map types - https://phabricator.wikimedia.org/T225211 (JAllemandou) I like the idea of having the model-name as map-key. Only limitation I can think of is that only one model version can be reported on a r...
[19:42:19] Analytics, EventBus, Scoring-platform-team: Change event.mediawiki_revision_score schema to use map types - https://phabricator.wikimedia.org/T225211 (Ottomata) I think we were told by ORES folks that that would never happen (at least not for a given revision-score event). If a revision is again scor...
[20:08:38] Analytics, EventBus, Scoring-platform-team: Change event.mediawiki_revision_score schema to use map types - https://phabricator.wikimedia.org/T225211 (JAllemandou) I recall that as well @Ottomata - My note was purely theoretical :) I'm fully in favor of having a map keyed by model names.
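[editor's note] The T225211 thread above weighs keeping `scores` as an array of objects against a map keyed by model name. The sketch below is only an illustration of that trade-off, not the actual event.mediawiki_revision_score schema: the field names `model_name`, `model_version` and `prediction`, the sample values, and the helper `scores_array_to_map` are all assumptions made for the example. It also shows the limitation JAllemandou raises: a map can hold only one entry per model name in a given event.

```python
from typing import Any, Dict, List


def scores_array_to_map(scores: List[Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Re-key a list of per-model score objects by model name.

    Field names are hypothetical; the real schema may differ.
    """
    by_model: Dict[str, Dict[str, Any]] = {}
    for score in scores:
        name = score["model_name"]
        if name in by_model:
            # A map cannot carry two versions of the same model in one event.
            raise ValueError(f"duplicate model in one event: {name}")
        # Drop the key field from the value, since the map key now carries it.
        by_model[name] = {k: v for k, v in score.items() if k != "model_name"}
    return by_model


# Hypothetical sample data in the "array of objects" shape.
example_scores = [
    {"model_name": "damaging", "model_version": "0.5.0", "prediction": ["false"]},
    {"model_name": "goodfaith", "model_version": "0.5.0", "prediction": ["true"]},
]

scores_by_model = scores_array_to_map(example_scores)
```

With the map shape, a consumer can read `scores_by_model["damaging"]` directly instead of scanning the array for the matching model.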
[20:09:00] Gone for tonight team - see you tomorrow
[20:09:08] Analytics, EventBus, Scoring-platform-team: Change event.mediawiki_revision_score schema to use map types - https://phabricator.wikimedia.org/T225211 (Ottomata) :)
[20:10:54] Analytics-EventLogging, Analytics-Kanban, EventBus, MediaWiki-JobQueue, and 3 others: Migrate JobQueue to eventgate - https://phabricator.wikimedia.org/T228705 (Pchelolo)
[20:10:57] Analytics, Analytics-EventLogging, EventBus, MediaWiki-JobQueue, and 3 others: Delayed jobs fail validation in eventgate - https://phabricator.wikimedia.org/T230049 (Pchelolo) Open→Resolved
[20:14:45] Analytics, Analytics-Kanban, Product-Analytics: event_user_id is always NULL for anonymous edits in Mediawiki History table - https://phabricator.wikimedia.org/T232171 (kzimmerman) p:Triage→Normal
[20:17:06] Analytics, Analytics-Kanban, Product-Analytics: Discrepancies in Superset Pageview Data - https://phabricator.wikimedia.org/T232382 (kzimmerman) p:Triage→High Thanks so much for investigating & fixing this, @JAllemandou! And I appreciate the updated usage notes.
[20:20:57] Analytics, Analytics-Kanban, Patch-For-Review: Set up automatic deletion for netflow datasource in Druid - https://phabricator.wikimedia.org/T229674 (mforns) That patch should do the trick, but we should wait about 2 months before merging. Netflow data from 90 days ago still has the old schema and wo...
[20:31:45] Analytics, Analytics-EventLogging, Analytics-Kanban, EventBus, and 4 others: Decomission eventlogging-service-eventbus and clean up related configs and code - https://phabricator.wikimedia.org/T232122 (Ottomata) Woo hoo, eventlogging-service-eventbus is no more! Still have some various cleanup t...
[21:46:49] Analytics, Analytics-Kanban, StructuredDataOnCommons, Tool-Pageviews, Patch-For-Review: Add literal transcoding to media file properties UDF - https://phabricator.wikimedia.org/T230312 (Nuria)
[21:47:02] Analytics, Analytics-Kanban, StructuredDataOnCommons, Tool-Pageviews, Patch-For-Review: Add literal transcoding to media file properties UDF - https://phabricator.wikimedia.org/T230312 (Nuria) Open→Resolved
[21:47:06] Analytics, Analytics-Kanban, Tool-Pageviews: Create new mediarequests table - https://phabricator.wikimedia.org/T229817 (Nuria)
[21:48:43] Analytics, Analytics-Kanban: Refactor quenename into HQL hive2 action oozie jobs - https://phabricator.wikimedia.org/T231002 (Nuria) Open→Resolved
[21:49:04] Analytics, Analytics-Kanban, Tool-Pageviews: Create new mediarequests table - https://phabricator.wikimedia.org/T229817 (Nuria)
[21:49:17] Analytics, Analytics-Kanban, Tool-Pageviews: Create new mediarequests table - https://phabricator.wikimedia.org/T229817 (Nuria) Open→Resolved
[21:49:25] Analytics, Tool-Pageviews, Patch-For-Review: Load media requests data into cassandra - https://phabricator.wikimedia.org/T228149 (Nuria)
[21:49:36] Analytics, Analytics-Kanban: Add --skip-trash arg to refinery-drop-older-than calls in data_purge.pp - https://phabricator.wikimedia.org/T229436 (Nuria) Open→Resolved
[21:49:51] Analytics, Analytics-EventLogging, EventBus, Core Platform Team Legacy (Watching / External), Services (watching): Modern Event Platform: Stream Intake Service - https://phabricator.wikimedia.org/T201068 (Nuria)
[21:49:58] Analytics-EventLogging, Analytics-Kanban, EventBus, CPT Initiatives (Modern Event Platform (TEC2)), and 4 others: Modern Event Platform: Stream Intake Service: Migrate eventlogging-service-eventbus events to eventgate-main - https://phabricator.wikimedia.org/T211248 (Nuria) Open→Resolved
[21:50:19] Analytics-EventLogging, Analytics-Kanban, EventBus, CPT Initiatives (Modern Event Platform (TEC2)), and 4 others: Modern Event Platform: Stream Intake Service: Migrate eventlogging-service-eventbus events to eventgate-main - https://phabricator.wikimedia.org/T211248 (Nuria) Let's do the wave!!!!!
[22:04:05] Analytics, Operations, Traffic: Images served with text/html content type - https://phabricator.wikimedia.org/T232679 (Nuria)
[22:06:24] Analytics, Operations, Traffic: Images served with text/html content type - https://phabricator.wikimedia.org/T232679 (Nuria) This has the effect that these images are being considered content pageviews when they are just asset requests
[22:11:51] Analytics, Analytics-EventLogging, EventBus, CPT Initiatives (Modern Event Platform (TEC2)), Services (watching): Migrate all event-schemas schemas to current.yaml and materialize with jsonschema-tools and remove old schemas - https://phabricator.wikimedia.org/T232144 (Pchelolo) Open→R...
[22:11:55] Analytics, Analytics-EventLogging, Analytics-Kanban, EventBus, and 4 others: Decomission eventlogging-service-eventbus and clean up related configs and code - https://phabricator.wikimedia.org/T232122 (Pchelolo)
[22:12:00] Analytics, Analytics-EventLogging, EventBus, CPT Initiatives (Modern Event Platform (TEC2)), and 2 others: CI Support for Schema Registry - https://phabricator.wikimedia.org/T206814 (Pchelolo)
[22:22:46] Analytics, Analytics-Kanban: Add more dimensions to netflow's druid ingestion specs - https://phabricator.wikimedia.org/T229682 (ayounsi) >>! In T229682#5481530, @Nuria wrote: > I just thought i can easily setup turnilo to decode tcp_flags so they are not ints, let me give it a try If it works fine and i...
[22:34:21] Analytics, Operations, Traffic: Images served with text/html content type - https://phabricator.wikimedia.org/T232679 (Nuria) I think we need to add proxy=googleweblight to x-analytics