[02:42:29] Analytics, Analytics-EventLogging, Contributors-Team, MobileFrontend: Schema:MobileWebEditing: What are commons sorts of errors? - https://phabricator.wikimedia.org/T118366 (Jdlrobson)
[07:12:50] Analytics, Operations: Move Superset to a Buster VM - https://phabricator.wikimedia.org/T258768 (MoritzMuehlenhoff)
[07:13:37] Analytics-Clusters, Patch-For-Review, User-Elukey: Upgrade Druid to its latest upstream version (currently 0.18.1) - https://phabricator.wikimedia.org/T244482 (elukey) Tested some indexing jobs; * navtiming hive2druid * webrequest json * webrequest parquet The last two thanks to Joseph's help :). Ev...
[07:17:52] Analytics, Operations, Patch-For-Review: Move Hue to a Buster VM - https://phabricator.wikimedia.org/T258768 (elukey)
[08:31:31] Analytics-Clusters, Patch-For-Review, User-Elukey: Upgrade Druid to its latest upstream version (currently 0.18.1) - https://phabricator.wikimedia.org/T244482 (elukey) Tried a kafka supervisor for netflow, plus some queries to the broker, everything looks good. I'll file a gh issue with the above des...
[09:16:40] Analytics-Clusters, Discovery, Discovery-Search: Move mjolnir kafka daemon from ES to search-loader VMs - https://phabricator.wikimedia.org/T258245 (Gehel)
[09:21:56] o/ if I want webrequests for query.wikidata.org what partition should I pick? (misc, text or something else)?
[09:24:38] Analytics-Clusters, Analytics-Radar, User-Elukey: Monitoring GPU Usage on stat Machines - https://phabricator.wikimedia.org/T251938 (Aroraakhil) @elukey just curious if there are any updates on this?
[09:25:55] Analytics-Clusters, Analytics-Radar, User-Elukey: Monitoring GPU Usage on stat Machines - https://phabricator.wikimedia.org/T251938 (elukey) @Aroraakhil sorry still no progress, I hope to get something done this Quarter :(
[09:31:21] looks like it's "text"
[09:40:58] dcausse: yes sorry didn't see your msg
[09:41:08] np :)
[09:45:14] Analytics-Clusters, Patch-For-Review, User-Elukey: Upgrade Druid to its latest upstream version (currently 0.18.1) - https://phabricator.wikimedia.org/T244482 (elukey) Created https://github.com/apache/druid/issues/10209
[09:46:39] Analytics-Clusters, Patch-For-Review, User-Elukey: Upgrade Druid to its latest upstream version (currently 0.18.1) - https://phabricator.wikimedia.org/T244482 (elukey) So current status: I'd love to get somebody else to test the new Druid version, but from my point of view we should be ready to upgra...
[10:26:21] Analytics-Clusters, Analytics-Radar, User-Elukey: Monitoring GPU Usage on stat Machines - https://phabricator.wikimedia.org/T251938 (Aroraakhil) @elukey thanks!
[10:31:20] * elukey lunch!
[12:31:13] Analytics: Check home/HDFS leftovers of drossi/fsalutari - https://phabricator.wikimedia.org/T258788 (MoritzMuehlenhoff)
[14:03:09] Analytics-Clusters, Discovery, Discovery-Search, Patch-For-Review: Move mjolnir kafka daemon from ES to search-loader VMs - https://phabricator.wikimedia.org/T258245 (elukey) While working on https://gerrit.wikimedia.org/r/c/operations/puppet/+/616101/, I realized that mjolnir is also used on rel...
[14:26:38] Analytics: Deprecate Hue (if possible) - https://phabricator.wikimedia.org/T258799 (elukey)
[14:26:52] Analytics-Clusters: Deprecate Hue (if possible) - https://phabricator.wikimedia.org/T258799 (elukey)
[14:27:17] Analytics-Clusters: Deprecate Hue (if possible) - https://phabricator.wikimedia.org/T258799 (elukey)
[14:27:19] Analytics, Analytics-Kanban: Analytics Ops Technical Debt - https://phabricator.wikimedia.org/T240437 (elukey)
[14:28:19] elukey: still around? I was looking at https://sonarcloud.io/organizations/wmftest/projects and it looks like refinery isn't being analyzed.
[14:28:46] maryum has done some work on our side, she might be able to help get it analyzed on your side as well.
[14:29:07] (note that I haven't talked about that with her yet)
[14:35:50] gehel: that would be great yes!
[14:36:10] I'll open a task
[14:37:32] gehel: I'll also talk with Nuria about the possibility of assigning one of us to follow all the things to do for refinery, we should probably track all in a task and make a plan about how to fix, when etc..
[14:38:01] elukey: I did create a few tasks under T258680
[14:38:02] T258680: Update Code Conventions for Java and Scala - https://phabricator.wikimedia.org/T258680
[14:38:13] gehel: ah ok nice I didn't know it, checking
[14:38:30] well, I'm creating the second one now, so "a few" might be misleading
[14:39:10] I am sure we'll open more :)
[14:39:49] there is probably another one to create to improve the maven dependency settings that we have right now, IIRC you already started working on it right?
[14:40:06] Analytics, Code-Health: Integrate SonarCloud analysis as part of the analytics refinery builds - https://phabricator.wikimedia.org/T258800 (Gehel)
[14:42:42] elukey: I've pushed my few cleanups to T258699
[14:42:42] T258699: Introduce various static analysis tools to analytics/refinery - https://phabricator.wikimedia.org/T258699
[14:43:09] not sure if more details are needed in Phab
[14:43:10] Analytics, Analytics-Kanban, Performance-Team: Validation rules on eventgate should take max int values into account in order to validate data for an schema - https://phabricator.wikimedia.org/T258659 (Nuria) a:Gilles→fdans
[14:43:15] Analytics, Analytics-Kanban, Performance-Team: Validation rules on eventgate should take max int values into account in order to validate data for an schema - https://phabricator.wikimedia.org/T258659 (Nuria) B. Make EventGate always add max and min long validation to integer types, even if the jsons...
[14:43:35] I'm not going to push anything else until those are merged
[14:44:26] ack thanks a lot
[14:44:44] The maven dependencies are a complete mess (I'm surprised that it isn't causing more issues). I'm not going to try to address those, too much of a mess given my limited understanding of the code base. But you (as a team) should really do something about it.
[14:45:08] I can spend an hour with you (joal, ottomata, someone else?) to go over the principles of what is broken there.
[14:46:26] gehel: yep definitely, I realized the mess when working on archiva the last time :( If you have time it would be great to have your thoughts about why it is a mess (maybe very high level points) in a subtask so we can keep it as reference
[14:46:28] nuria, joal, ottomata: ping me if you want an introduction to the wonderful world of broken Maven dependencies and a few tools to address the issues.
[14:47:12] then in my opinion we'd need to have somebody assigned to it, and plan when to resolve the problems during the next quarters
[14:47:13] task coming up!
[14:47:16] <3
[14:47:49] gehel: joal is on vacation and ottomata does not work fridays. Filing a task sounds great, i think elukey explained why we cannot possibly do changes in source without slowly retsrating jobs affected
[14:47:57] *restarting
[14:48:25] gehel: but we can most definitely schedule a project to do it once the team has the full staff again
[14:48:44] I'll push a few more thoughts in phab and after that I'll let you come back to me if you think any of those are a good idea and / or if you need my help
[14:49:12] ofc, your call if it makes sense to do those cleanups or not
[14:49:53] gehel: it totally makes sense
[14:49:58] gehel: really
[14:50:11] it always comes down to priorities :)
[14:50:37] introducing sonarcloud is easy and very low risk, that's a good first one
[14:50:56] gehel: but we cannot have source be very different from runtime, and, until end of august we are in life support
[14:51:06] yep
[14:51:24] (PS3) Gehel: Introduce Takari Maven Wrapper. [analytics/refinery/source] - https://gerrit.wikimedia.org/r/615481 (https://phabricator.wikimedia.org/T258699)
[14:52:05] gehel: let's schedule a meeting between you and i (and ottomata if he can come) to go over your thoughts with mvn
[14:52:24] (PS4) Gehel: Use properties to configure compiler source and target versions. [analytics/refinery/source] - https://gerrit.wikimedia.org/r/615485 (https://phabricator.wikimedia.org/T258699)
[14:53:07] (PS4) Gehel: Introduce ForbiddenAPI as a static analysis tool. [analytics/refinery/source] - https://gerrit.wikimedia.org/r/615722 (https://phabricator.wikimedia.org/T258699)
[14:58:01] Analytics, Code-Health: Cleanup Maven dependencies in analytics/refinery - https://phabricator.wikimedia.org/T258802 (Gehel)
[15:15:05] Analytics, MobileFrontend, Readers-Web-Backlog, Performance-Team (Radar), Technical-Debt: Figure out XAnalytics stuff - https://phabricator.wikimedia.org/T190381 (Jdlrobson) @mforns Per T190381#5054344 when can we plan to remove this code. I'm not sure what to do with this task :)
[15:44:52] Analytics, Event-Platform, Technical-blog-posts: Story idea for Blog: Wikimedia's Event Platform - https://phabricator.wikimedia.org/T253649 (srodlund) @Ottomata I left a note on the doc, but this looks like a good start. Do you want to draft the first post and let me know when your draft is done? I...
[16:22:11] Analytics, MobileFrontend, Readers-Web-Backlog, Performance-Team (Radar), Technical-Debt: Figure out XAnalytics stuff - https://phabricator.wikimedia.org/T190381 (Nuria) @Jdlrobson there are bits of data in X-Analytics that are used in the pageview pipeline like for example the "special" page...
[16:43:18] * elukey off!
[16:43:24] have a good weekend folks!
[16:43:26] :)
[16:46:52] hey a-team! There's a table called wikidata_entity in the Data Lake, can we get something like that for Commons too? Maybe there's already a phab task for that? If not, I can file one.
[16:59:46] Analytics-Clusters: Move the stat1004-6-7 hosts to Debian Buster - https://phabricator.wikimedia.org/T255028 (Isaac) @elukey thanks for the heads up -- any expectation that any Python packages will be problematic to reinstall? The one that generally gives me the most headache btw is [[https://pypi.org/projec...
[17:32:02] Nettrom: which table specifically from commons?
[17:33:05] Analytics: Check home/HDFS leftovers of nathante - https://phabricator.wikimedia.org/T256356 (Groceryheist) Hi @leila, I don't have reviews back yet. I expect them to come next week. If it's a hassle we can wait and see if the reviewers want anything. There were a few things I wanted to double-check in any...
[17:50:11] Analytics: Check home/HDFS leftovers of nathante - https://phabricator.wikimedia.org/T256356 (leila) @Groceryheist ok, let's wait then. If you need it, we will figure it out. for the stat1006 copy: any copy out of the servers will need a review and thumbs up on our end. Can you list in this task which of th...
[17:58:58] Analytics-Clusters, Discovery, Discovery-Search, Patch-For-Review: Move mjolnir kafka daemon from ES to search-loader VMs - https://phabricator.wikimedia.org/T258245 (RKemper) @elukey In response to your earlier ping in this thread - yup, let's work to get this moving next week
[17:59:31] milimetric: the wikidata_entity table contains a conversion of the Wikidata entities JSON dumps (https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Edits/Wikidata_entity) Now that we have structured data on Commons, having a similar table for that would be helpful.
[18:11:45] Nettrom: you mean wikibase in commons right?
[18:12:06] nuria: yes
[18:12:17] Nettrom: you mean mediawiki_wbc_entity_usage
[18:12:18] ?
[18:12:23] no
[18:12:53] Nettrom: ah, i see, the table on wmf
[18:13:23] it's a parsing of the Wikidata dump, so it'd be a parsing of the JSON dump of Structured Data on Commons
[18:13:37] sounds like I should make a phab task for this?
[18:15:43] Nettrom: yes a task would be handy but if i remember correctly such a dump of commons does not exist, maybe that chnaged?
[18:16:07] *changed
[18:22:55] Nettrom: are you looking for the dump in question?
[18:24:22] nuria: there should be dumps of Commons now, but I'm unsure where they are. I'll dig into this and file a phab task as well with information in it.
[18:24:31] Nettrom: https://phabricator.wikimedia.org/T221917
[18:25:49] Nettrom: I think the rdfs are used to feed in the sparql query endpoint
[18:26:00] Nettrom: thus they exist for bootstrapping
[18:26:20] Nettrom: I am not sure any other dumps in json format were created for commons, let us know what you find
[18:27:03] nuria: ah, I see! I'll dig into this, thanks for the help so far!
[18:57:57] Analytics, Analytics-Kanban, Performance-Team: Validation rules on eventgate should take max int values into account in order to validate data for an schema - https://phabricator.wikimedia.org/T258659 (Nuria)
[20:07:57] Analytics: Check home/HDFS leftovers of nathante - https://phabricator.wikimedia.org/T256356 (Groceryheist) Okay, For the project with @halfak any risks would arise from the internal histories of historical ores scores of revisions. The rest of the data used in the project is the publicly available wikimedi...
[20:35:53] Analytics, Product-Analytics, Structured-Data-Backlog: Create a Commons equivalent of the wikidata_entity table in the Data Lake - https://phabricator.wikimedia.org/T258834 (nettrom_WMF)
[20:37:54] Analytics, Product-Analytics, Structured-Data-Backlog: Create a Commons equivalent of the wikidata_entity table in the Data Lake - https://phabricator.wikimedia.org/T258834 (nettrom_WMF) From my understanding of the documentation of `wmf.wikidata_entity`, it's based on the JSON dump of the data. For...
[20:38:54] Analytics, Product-Analytics, Structured-Data-Backlog: Create a Commons equivalent of the wikidata_entity table in the Data Lake - https://phabricator.wikimedia.org/T258834 (nettrom_WMF) @Miriam : If I remember correctly, this kind of table would be useful for your work. Could you add some use cases...
[20:49:16] Analytics, Analytics-EventLogging: Events being lost in Chrome when navigating to an external URL - https://phabricator.wikimedia.org/T258513 (Mayakp.wiki) On July 23 @Edtadros helped fire many events to external links on [[ https://test.wikipedia.org/wiki/Main_Page | Testwiki ]] to check on this issue a...
[20:50:52] Analytics-Clusters, DC-Ops, Operations, ops-eqiad: analytics1050 host + mgmt down - https://phabricator.wikimedia.org/T258370 (wiki_willy) a:Cmjohnson
[22:01:49] Analytics, Product-Analytics, Structured-Data-Backlog: Create a Commons equivalent of the wikidata_entity table in the Data Lake - https://phabricator.wikimedia.org/T258834 (Nuria) Putting this in radar until the json dump is created for commons.
[22:02:01] Analytics-Radar, Product-Analytics, Structured-Data-Backlog: Create a Commons equivalent of the wikidata_entity table in the Data Lake - https://phabricator.wikimedia.org/T258834 (Nuria)
[22:36:21] Analytics, Analytics-Kanban, Product-Analytics, Patch-For-Review: Add editors per country data to AQS API (geoeditors) - https://phabricator.wikimedia.org/T238365 (kzimmerman)