[07:00:36] Analytics, Graphite: Grafana shows zero EventLogging events for around 44 hours around January 15 - https://phabricator.wikimedia.org/T215744 (elukey) If I follow the links I can see the hole in ReadingDepth only sometimes, so the first thought that comes into mind is that since this data is backed up by...
[07:04:43] morningggg
[07:06:11] Analytics, Analytics-Kanban, DBA, Patch-For-Review, User-Banyek: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5] - https://phabricator.wikimedia.org/T210478 (Marostegui) Now that we know that the biggest and and most painful table (as it was Aria and it was huge - around 180G...
[07:38:49] updated https://wikitech.wikimedia.org/wiki/Analytics/Data_access#MariaDB_replicas
[07:38:54] addshore, Amir1 ---^
[07:38:59] SRV records ready
[08:24:46] \o/!
[08:24:51] Morning elukey :)
[08:27:07] Analytics, Analytics-Kanban, DBA, Patch-For-Review, User-Banyek: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5] - https://phabricator.wikimedia.org/T210478 (Marostegui) After having al the sections ready and compressed on all hosts, there is one thought I had, where to leave...
[08:38:25] elukey: woo
[08:38:35] !log restart superset to pick up new settings in config.py
[08:38:37] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[08:42:42] addshore: I am going to send an email about this but there is a caveat - at some point we'll need to set the stagingdb on dbstore1002 in read-only, dump its last version and import to the new host
[08:42:53] otherwise the staging db will have different data
[08:42:56] ack
[08:43:01] how does this affect your scripts?
[08:43:19] yes, but also none of the scripts that write to the db are that critical
[08:43:30] to be honest, if they missed 1 weeks worth of data nooone would care :)
[08:43:31] :P
[08:44:08] so IMO feel free to switch to readyonly mode whever you want!
[08:45:22] addshore: will try to set a date so everybody can coordinate :)
[08:48:23] amazing
[08:57:44] Analytics, Analytics-Kanban, Patch-For-Review: Generate edit totals by country by month - https://phabricator.wikimedia.org/T215655 (JAllemandou) Hey @Milimetric - Could we add "sum_edit_counts" to the existing dataset instead of creating a new one ?
[09:00:29] Analytics, Analytics-Kanban, EventBus: Spike: Can Refine handle map types if Hive Schema already exists with map fields? - https://phabricator.wikimedia.org/T215442 (JAllemandou) Arf - Will try to be clearer: Instead of getting schema data from reading json and double read in case the schema you get...
[09:02:36] Analytics, Product-Analytics, Research, WMDE-Analytics-Engineering, and 3 others: Provide tools for querying MediaWiki replica databases without having to specify the shard - https://phabricator.wikimedia.org/T212386 (elukey) SRV records deployed and documentation updated! If anybody could give i...
[09:24:34] Analytics, Analytics-Cluster, Analytics-Kanban, DBA, and 2 others: Cleanup or remove mysql puppet module; repurpose mariadb module to cover misc use cases - https://phabricator.wikimedia.org/T162070 (jcrespo) > Second, mariadb::packages_wmf and mariadb::packages should probably be merged into one...
[09:42:35] (PS3) Elukey: Introduce analytics-mysql [analytics/refinery] - https://gerrit.wikimedia.org/r/488473 (https://phabricator.wikimedia.org/T212386)
[09:43:48] Analytics, Analytics-Kanban, DBA, Patch-For-Review, User-Banyek: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5] - https://phabricator.wikimedia.org/T210478 (Marostegui)
[09:44:19] (CR) Elukey: "Dan if you want to start commenting please do, the script seems working on stat1007 but I'd like to know more of what you'd like to see in" [analytics/refinery] - https://gerrit.wikimedia.org/r/488473 (https://phabricator.wikimedia.org/T212386) (owner: Elukey)
[10:01:33] !log restart superset to pick up new config.py changes
[10:01:35] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[10:03:10] elukey: I wrote something similar https://gerrit.wikimedia.org/r/c/analytics/wmde/scripts/+/489097/1/lib/WikimediaDbSectionMapper.php
[10:04:56] Amir1: ah wow I didn't know about https://noc.wikimedia.org/conf/db-eqiad.php.txt
[10:04:59] looks awesome
[10:05:24] not sure though how to parse it in python, it would help a lot
[10:10:01] yeah, even what I did in php is horrible hack IMO
[10:11:01] ahahhaah
[10:11:18] it is handy since you don't need mediawiki-config deployed
[10:12:59] Analytics, Analytics-Kanban, DBA, Patch-For-Review, User-Banyek: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5] - https://phabricator.wikimedia.org/T210478 (Marostegui)
[10:13:46] Analytics, Analytics-Kanban, DBA, Patch-For-Review, User-Banyek: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5] - https://phabricator.wikimedia.org/T210478 (Marostegui)
[10:14:15] Analytics, Analytics-Kanban, DBA, Patch-For-Review, User-Banyek: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5] - https://phabricator.wikimedia.org/T210478 (Marostegui)
[10:17:07] Analytics, Analytics-Kanban, DBA,
Patch-For-Review, User-Banyek: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5] - https://phabricator.wikimedia.org/T210478 (Marostegui) >>! In T210478#4942496, @Marostegui wrote: > After having al the sections ready and compressed on all host...
[10:28:15] Analytics: Check home leftovers of ISI researchers - https://phabricator.wikimedia.org/T215775 (MoritzMuehlenhoff)
[10:28:54] Analytics: Check home leftovers of ISI researchers - https://phabricator.wikimedia.org/T215775 (elukey)
[10:36:36] Analytics: Check home leftovers of ISI researchers - https://phabricator.wikimedia.org/T215775 (elukey) List of leftovers: * mtizzoni ` ====== stat1007 ====== total 1004 [..cut..] ====== notebook1004 ====== total 4 drwxr-xr-x 7 mtizzoni wikidev 4096 Jul 5 2018 venv ` * panisson ` ====== stat1007 =====...
[10:36:57] Analytics: Check home leftovers of ISI researchers - https://phabricator.wikimedia.org/T215775 (elukey) p:Triage→Normal
[11:09:42] elukey: Is mediawiki-config deployed on stat machines?
[11:09:57] not yet :)
[11:10:01] k
[11:10:25] elukey: I would help to test the shard-awareness :)
[11:12:44] makes sense, lemme add it to puppet
[11:40:43] joal: created https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/489660/, going to wait for Andrew to see if he wants to change name etc..
[11:40:55] after that, the repo will be deployed on each stat/notebook
[11:45:19] Analytics, Research: Add (scoop) wikidatadawiki.wb_items_per_site MariaDB table to wmf_raw - https://phabricator.wikimedia.org/T215616 (JAllemandou) I wonder if the indformation present in the table mentioned is the same as the one we could extract from site-links in the wikidata items. @diego : Could yo...
[12:06:33] dsaez: Heya - another pain about joining by title: the namespace ...
[12:06:39] Just thought about that
[12:06:54] yes
[12:07:07] worst idea ever
[12:07:13] hehehe
[12:07:41] dsaez: I'll work a join for you using some tricks ;)
[12:07:59] <3
[12:15:55] elukey: on stat1007 I got the following error when running puppet after the removal of the ISI researchers:
[12:16:07] Notice: /Stage[main]/Admin/Admin::Groupmembers[statistics-privatedata-users]/Exec[statistics-privatedata-users_ensure_members]/returns: executed successfully
[12:16:09] Error: Command exceeded timeout
[12:16:10] Error: /Stage[main]/Admin/Exec[enforce-users-groups-cleanup]/returns: change from notrun to 0 failed: Command exceeded timeout
[12:16:32] maybe this is also related to timeouts caused by huge homes, haven't invstigated further
[12:17:33] ah lovely
[12:17:44] there is a couple with huge homes
[12:40:39] Analytics: Clean up home dirs for users jamesur and nithum - https://phabricator.wikimedia.org/T212127 (elukey) Zoom in into tlwiki.db: ` checking table: tlwiki.logging location:hdfs://analytics-hadoop/user/hive/warehouse/tlwiki.db/logging, checking table: tlwiki.page location:hdfs://analytics-hadoop/user/...
[12:40:42] joal: --^ if you have time
[12:47:57] Analytics: Clean up home dirs for users jamesur and nithum - https://phabricator.wikimedia.org/T212127 (JAllemandou) More info on this db: It contains sqooped data from `tlwiki` as naming suggests (number of revision coherent with recent snapshot). Data format is not optimal (hive-oriented, not even avro) an...
[12:48:03] elukey: --^ :)
[12:48:21] <3
[12:49:31] Analytics: Clean up home dirs for users jamesur and nithum - https://phabricator.wikimedia.org/T212127 (elukey) Everything cleaned up!
[12:49:50] Analytics, Analytics-Kanban: Clean up home dirs for users jamesur and nithum - https://phabricator.wikimedia.org/T212127 (elukey) a:elukey
[12:50:27] going away for lunch + gym, bbl!
[12:58:14] Analytics, DBA, Research: Improve interlingual links across wikis through Wikidata IDs - https://phabricator.wikimedia.org/T215616 (diego)
[13:02:30] Analytics, DBA, Research, Wikidata: Improve interlingual links across wikis through Wikidata IDs - https://phabricator.wikimedia.org/T215616 (diego)
[13:11:44] (PS1) GoranSMilovanovic: update engine re-factor [analytics/wmde/WiktionaryCognateDashboard] - https://gerrit.wikimedia.org/r/489674
[13:11:47] (PS1) GoranSMilovanovic: re-factor again [analytics/wmde/WiktionaryCognateDashboard] - https://gerrit.wikimedia.org/r/489675
[13:12:00] (CR) GoranSMilovanovic: [V: +2 C: +2] update engine re-factor [analytics/wmde/WiktionaryCognateDashboard] - https://gerrit.wikimedia.org/r/489674 (owner: GoranSMilovanovic)
[13:12:18] (CR) GoranSMilovanovic: [V: +2 C: +2] re-factor again [analytics/wmde/WiktionaryCognateDashboard] - https://gerrit.wikimedia.org/r/489675 (owner: GoranSMilovanovic)
[13:55:52] (PS3) Fdans: Change email send workflow to notify of completed jobs [analytics/refinery] - https://gerrit.wikimedia.org/r/484657 (https://phabricator.wikimedia.org/T206894)
[14:04:03] Analytics, Product-Analytics, Research, WMDE-Analytics-Engineering, and 3 others: Provide tools for querying MediaWiki replica databases without having to specify the shard - https://phabricator.wikimedia.org/T212386 (Neil_P._Quinn_WMF) >>! In T212386#4942636, @elukey wrote: > SRV records deploye...
[14:11:11] Analytics, DBA, MediaWiki-Database, Research, Wikidata: Improve interlingual links across wikis through Wikidata IDs - https://phabricator.wikimedia.org/T215616 (Marostegui)
[14:18:01] Analytics, Product-Analytics, Research, WMDE-Analytics-Engineering, and 3 others: Provide tools for querying MediaWiki replica databases without having to specify the shard - https://phabricator.wikimedia.org/T212386 (elukey) @Neil_P._Quinn_WMF Thanks a lot!
Mind if I add your snipped to https://...
[15:06:13] fdans: is all ok with that coordinator?
[15:06:26] (was reviewing the alerts)
[15:06:52] elukey: no but don't worry it isn't the production one, it's just some tests I'm doing
[15:07:12] ack :) Can you change the email to yours?
[15:07:12] * fdans knows he should have changed the email subworkflow
[15:09:48] (CR) Fdans: [V: +1 C: +1] "Tested with oozie today, looking good in my user hive database" (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/489313 (https://phabricator.wikimedia.org/T215655) (owner: Milimetric)
[15:10:00] fdans: despite all the errors?
[15:10:06] or those were just oozie setup stuff
[15:10:42] milimetric: nah the errors are my doing
[15:11:21] joal: I thought of adding the edit count to the existing dataset, but thought it would complicate things a bit. Adding more bits of information there, and it would be a bit harder to aggregate. But yeah, that would work fine, what do you think?
[15:11:34] fdans: you too, it's definitely a lot simpler to do it that way
[15:19:26] milimetric: morning! The SRV records for the dbstores are in place, they seem to work
[15:19:34] Neil already tested them in a notebook
[15:20:39] Analytics: Check home leftovers of ISI researchers - https://phabricator.wikimedia.org/T215775 (elukey)
[15:24:10] elukey: ok, great, will get started on the patches for RU and sqoop
[15:24:23] well, maybe after standup
[15:26:37] milimetric: sure!
I have sent a patch for the draft of the analytics-mysql wrapper
[15:26:42] it contains also two functions for utils.py
[15:26:58] and I am about to deploy mediawiki-config on stat/notebooks
[15:31:55] joal: mediawiki-config is rolling out on all stat/notebooks under /srv/deployment/mediawiki-config
[15:32:02] * elukey updates the docs as well
[15:34:30] Analytics, Product-Analytics, Research, WMDE-Analytics-Engineering, and 3 others: Provide tools for querying MediaWiki replica databases without having to specify the shard - https://phabricator.wikimedia.org/T212386 (elukey) As FYI to everybody, puppet is currently deploying a copy of mediawiki-...
[15:39:08] brb
[15:50:37] Analytics, Discovery, Operations, Research: Workflow to be able to move data files computed in jobs from analytics cluster to production - https://phabricator.wikimedia.org/T213976 (Ottomata) > How big is the dataset and how fast is it going to grow? In the hundreds of megabytes I believe. @half...
[15:51:51] Analytics, Analytics-Kanban, Patch-For-Review: Add new wikis to analytics - https://phabricator.wikimedia.org/T209822 (fdans) Verified that shnwiki, yuewiktionary and liwikinews were all included on the latest sqoop. This task can be closed now.
[15:52:39] (PS1) GoranSMilovanovic: Re-factor CloudVPS + xml config [analytics/wmde/TW/AdvancedSearchExtension-Dashboard] - https://gerrit.wikimedia.org/r/489721
[15:52:56] (CR) GoranSMilovanovic: [V: +2 C: +2] Re-factor CloudVPS + xml config [analytics/wmde/TW/AdvancedSearchExtension-Dashboard] - https://gerrit.wikimedia.org/r/489721 (owner: GoranSMilovanovic)
[15:55:45] (PS1) GoranSMilovanovic: minor [analytics/wmde/TW/AdvancedSearchExtension-Dashboard] - https://gerrit.wikimedia.org/r/489722
[15:55:58] (CR) GoranSMilovanovic: [V: +2 C: +2] minor [analytics/wmde/TW/AdvancedSearchExtension-Dashboard] - https://gerrit.wikimedia.org/r/489722 (owner: GoranSMilovanovic)
[15:57:05] Analytics, Discovery, Operations, Research: Workflow to be able to move data files computed in jobs from analytics cluster to production - https://phabricator.wikimedia.org/T213976 (EBernhardson) >>! In T213976#4943916, @Ottomata wrote: >> How big is the dataset and how fast is it going to grow?...
[15:57:41] ebernhardson: but you said you also had plans for datasets ~10GB, right?
[15:57:44] in the future
[15:58:02] you might want to mention that too, as this is more of a long term proposal than an immediate fix
[15:59:35] milimetric: maybe some day, but not yet. Those things are mostly fancy ideas in my head still
[15:59:41] milimetric: while model shipping is a concrete problem we have today
[16:00:04] i'll at least mention it
[16:01:08] ideas in your head matter :)
[16:01:16] Analytics, monitoring: Grafana shows zero EventLogging events for around 44 hours around January 15 - https://phabricator.wikimedia.org/T215744 (fgiunchedi) I can confirm what @elukey was seeing / saying, namely that the data seems missing only from prometheus instance (hitting `d` and then `r` in grafan...
[16:01:36] ping ottomata
[16:01:42] ping joal
[16:02:11] milimetric: also what do i say, "i have this plan to turn all wikis into about 10TB of floating point numbers. After that not sure yet :P"
[16:02:42] nuria: using new app trying to join.,..
[16:03:20] ping joal hola, standup
[16:03:37] ebernhardson: haha, it's ok to say that but we may suspect you of being The Matrix
[16:06:45] Analytics, Analytics-Kanban, Patch-For-Review: Add new cluster to superset db config - https://phabricator.wikimedia.org/T215680 (Nuria)
[16:06:57] Analytics, Analytics-Kanban, Patch-For-Review: Add new cluster to superset db config - https://phabricator.wikimedia.org/T215680 (Nuria) a:Nuria
[16:07:44] Analytics, Discovery, Operations, Research: Workflow to be able to move data files computed in jobs from analytics cluster to production - https://phabricator.wikimedia.org/T213976 (EBernhardson) Longer term search will potentially want to generate some significantly larger datasets to ship to pr...
[16:12:16] joal: correction - mediawiki-config is under /srv/mediawiki-config
[16:33:07] Analytics, Product-Analytics: Superset's rolling average feature results in error message - https://phabricator.wikimedia.org/T213488 (Nuria) We would like to test upstream in a staging environment before making more changes to fork, once the db cluster migration is done we can spend some cycles on test...
[16:33:20] Analytics, Analytics-Kanban, Product-Analytics: Superset's rolling average feature results in error message - https://phabricator.wikimedia.org/T213488 (Nuria) a:elukey
[16:33:42] Analytics, Research: Check home leftovers of ISI researchers - https://phabricator.wikimedia.org/T215775 (leila)
[16:34:15] Analytics, Analytics-Kanban: Test sqooping from the new dedicated labsdb host - https://phabricator.wikimedia.org/T215550 (Nuria) p:Triage→High
[16:35:41] Analytics, Analytics-Kanban, Product-Analytics: Superset's rolling average feature results in error message - https://phabricator.wikimedia.org/T213488 (Milimetric) p:Triage→High
[16:43:40] Analytics, Analytics-Kanban, DBA, Patch-For-Review, User-Banyek: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5] - https://phabricator.wikimedia.org/T210478 (Marostegui) a:elukey→Marostegui
[16:43:54] Analytics: Move FR banner-impression jobs to events (lambda) - https://phabricator.wikimedia.org/T215636 (Nuria) Pinging @DStrine Waiting for FR tech input to productionize this.
[16:44:38] Analytics, Analytics-Kanban, DBA, Patch-For-Review, User-Banyek: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5] - https://phabricator.wikimedia.org/T210478 (Marostegui)
[16:44:42] Analytics: Move FR banner-impression jobs to events (lambda) - https://phabricator.wikimedia.org/T215636 (Nuria) p:Triage→Normal
[16:44:43] ottomata: otherwise I'll forget - shall we deprecate officially spark 1.x ?
[16:45:52] Analytics, Analytics-Kanban, DBA, Patch-For-Review, User-Banyek: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5] - https://phabricator.wikimedia.org/T210478 (Marostegui)
[16:46:59] Analytics, Analytics-Kanban, DBA, Patch-For-Review, User-Banyek: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5] - https://phabricator.wikimedia.org/T210478 (Marostegui)
[16:47:43] Analytics, DBA, MediaWiki-Database, Research, Wikidata: Improve interlingual links across wikis through Wikidata IDs - https://phabricator.wikimedia.org/T215616 (Nuria) p:Triage→High
[16:47:50] Analytics, Pageviews-API: Yearly endpoint for the /pageviews/top API - https://phabricator.wikimedia.org/T154381 (Nuria) p:High→Low
[16:48:35] elukey, ottomata - let's do that !!!!
[16:48:42] Spark1 deprecation !
[16:49:51] Analytics, MediaWiki-Vagrant, cloud-services-team (Kanban): Kafka in mw vagrant: kafka_broker.keystore.jks has expired - https://phabricator.wikimedia.org/T214593 (Nuria)
[16:50:06] Analytics, Analytics-Kanban, MediaWiki-Vagrant: Kafka in mw vagrant: kafka_broker.keystore.jks has expired - https://phabricator.wikimedia.org/T214593 (Nuria)
[16:50:19] Analytics, Analytics-Kanban, MediaWiki-Vagrant: Kafka in mw vagrant: kafka_broker.keystore.jks has expired - https://phabricator.wikimedia.org/T214593 (Nuria) p:Triage→High
[16:51:20] joal: ah nice ottomata said in his last email to people that we'd have done it this week
[16:51:23] perfect
[16:51:27] Analytics, Analytics-Kanban, Wikimedia-Stream: EventStreams returns 502 errors from outside the WMF network - https://phabricator.wikimedia.org/T215013 (Nuria) a:Ottomata
[16:52:22] Analytics: Refine Monitor should be a systemd timer such if process cannot start we get notified - https://phabricator.wikimedia.org/T210759 (Nuria) p:High→Normal
[16:56:39] Analytics, Analytics-Kanban, DBA, Patch-For-Review,
User-Banyek: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5] - https://phabricator.wikimedia.org/T210478 (Marostegui)
[17:03:05] Analytics, DBA, MediaWiki-Database, Research, Wikidata: Improve interlingual links across wikis through Wikidata IDs - https://phabricator.wikimedia.org/T215616 (JAllemandou) @diego : This has worked for me (takes some time to compute and needs a bunch of resources). I hope it's close enough...
[17:03:54] Analytics, ExternalGuidance, Product-Analytics, Patch-For-Review: Measure the impact of externally-originated contributions - https://phabricator.wikimedia.org/T212414 (Nuria) Seems like there are several issues here, from the requests we are not clear that you actually have that data right now t...
[17:04:56] Analytics: Move FR banner-impression jobs to events (lambda) - https://phabricator.wikimedia.org/T215636 (DStrine) I'm not able to totally understand the impact and choices needed here If this is a cary-over task for @Seddon 's request on T203669 then we need him to respond here. If this is related to the...
[17:10:39] Analytics, ExternalGuidance, Product-Analytics, Patch-For-Review: Measure the impact of externally-originated contributions - https://phabricator.wikimedia.org/T212414 (Nuria) >Again, an alternative proposal would be to register (and aggregate) them as a virtual pageview instead, using the existi...
[17:17:55] Analytics, ExternalGuidance, Product-Analytics, Patch-For-Review: Measure the impact of externally-originated contributions - https://phabricator.wikimedia.org/T212414 (Nuria)
[17:52:15] meta-erroring - https://xkcd.com/2110/
[17:54:50] Analytics, Discovery, Operations, Research: Workflow to be able to move data files computed in jobs from analytics cluster to production - https://phabricator.wikimedia.org/T213976 (Halfak) I think our biggest models are around 100MB. I don't expect to have a model larger than 1GB any time soon....
[18:11:03] Analytics, monitoring: Grafana shows zero EventLogging events for around 44 hours around January 15 - https://phabricator.wikimedia.org/T215744 (Volans) Isn't this due to the PDU issue we had that affected `prometheus1003`? See https://wikitech.wikimedia.org/wiki/Incident_documentation/20190115-PDU-fuses...
[18:11:43] * elukey ofF!
[18:12:56] Gone for diner - will be back after
[18:21:39] Analytics, Discovery, Operations, Research: Workflow to be able to move data files computed in jobs from analytics cluster to production - https://phabricator.wikimedia.org/T213976 (diego) > > We do have one very large asset file at 1.9GB (word2vec embedding). I don't need that to be much bigge...
[18:23:45] Analytics, DBA, MediaWiki-Database, Research, Wikidata: Improve interlingual links across wikis through Wikidata IDs - https://phabricator.wikimedia.org/T215616 (diego) Looks good @JAllemandou, thanks. This is a good workaround, but imho, we should have an structure or schema that makes this...
[18:28:07] Analytics, DBA, MediaWiki-Database, Research, Wikidata: Improve interlingual links across wikis through Wikidata IDs - https://phabricator.wikimedia.org/T215616 (EBernhardson) I don't know if this meets your needs, but the cirrussearch dumps have the wikidata id's broken out. This is the `wi...
[18:41:34] Quarry: Show query run date above outputs section - https://phabricator.wikimedia.org/T215831 (Framawiki)
[18:41:41] Quarry: Show query run date above outputs section - https://phabricator.wikimedia.org/T215831 (Framawiki) p:Triage→High
[19:07:48] Analytics, Operations, SRE-Access-Requests, Patch-For-Review: Allow Erik Bernhardson to have root access on stat1005 for GPU testing - https://phabricator.wikimedia.org/T215384 (Dzahn) Approved in SRE meeting (SRE-2019-02-11#Access_Requests)
[19:08:38] Analytics, Operations, SRE-Access-Requests, Patch-For-Review: Allow Erik Bernhardson to have root access on stat1005 for GPU testing - https://phabricator.wikimedia.org/T215384 (Dzahn) Normally would have merged Gerrit change but see comments from Moritz there, he said we should wait until buster...
[19:11:24] Analytics, Analytics-Kanban, EventBus, Services (watching): EventBusRCFeedEngine should use FormattedRCFeed instead of RCFeedEngine to use updated configuration - https://phabricator.wikimedia.org/T215834 (Ottomata) p:Triage→Normal
[19:15:29] Analytics, Analytics-Kanban, EventBus, Patch-For-Review, Services (watching): EventBus mediawiki extension should support multiple 'event service' endpoints - https://phabricator.wikimedia.org/T214446 (Ottomata) Order of operations: - {T215834} -- Deploy backwards compatible code change -- D...
[19:21:03] milimetric: https://phabricator.wikimedia.org/T215830 is this sufficient?
[19:40:02] Analytics, Product-Analytics, Reading-analysis: [EventLogging Sanitization] Update EL sanitization white-list for field renames in EL schemas - https://phabricator.wikimedia.org/T209087 (mforns) @Neil_P._Quinn_WMF > Can you make sure my VisualEditorFeatureUse whitelist patch (T212588) is merged and...
[19:50:31] edsanders: yeah, looks good, cc your manager
[19:51:26] Analytics, DBA, MediaWiki-Database, Research, Wikidata: Improve interlingual links across wikis through Wikidata IDs - https://phabricator.wikimedia.org/T215616 (diego) @EBernhardson , this looks exactly what I was looking for, initially. Thank you very much for that. However, I wont close...
[19:51:46] edsanders: I think the process requires manager approval
[19:52:09] milimetric: as much as i woudl like to stay I think i need to remove myself from analytics ops rotation
[19:52:42] nuria: no problem I’ll shift the rotation
[19:52:51] milimetric: super thanks
[19:53:02] Thanks, done
[19:53:14] Analytics, DBA, MediaWiki-Database, Research, Wikidata: Improve interlingual links across wikis through Wikidata IDs - https://phabricator.wikimedia.org/T215616 (EBernhardson) >>! In T215616#4944986, @diego wrote: > @EBernhardson , this looks exactly what I was looking for, initially. Thank...
[20:07:30] Analytics, DBA, MediaWiki-Database, Research, Wikidata: Improve interlingual links across wikis through Wikidata IDs - https://phabricator.wikimedia.org/T215616 (jcrespo) > diego added a project: DBA. I don't understand what is the actionable here for us. Without context, I would say that:...
[20:08:32] Analytics, EventBus, Growth-Team, MediaWiki-Watchlist, and 6 others: Clear watchlist on enwiki only removes 50 items at a time - https://phabricator.wikimedia.org/T207329 (Pchelolo) Sorry, got out of my radar somehow. After the latest deploy ^^ this has to work properly.
[21:23:16] Analytics-Kanban, EventBus, Patch-For-Review, Services (watching): EventBusRCFeedEngine should use FormattedRCFeed instead of RCFeedEngine to use updated configuration - https://phabricator.wikimedia.org/T215834 (Ottomata)
[21:23:21] (PS1) Mforns: Lowercase capsule fields in EL sanitization whitelist [analytics/refinery] - https://gerrit.wikimedia.org/r/489817 (https://phabricator.wikimedia.org/T209503)
[21:31:08] mforns: for the lowercase of fields on sanitization.
[21:31:14] nuria, yes?
[21:31:32] mforns: don't all fields need to be lowercased? like all columns all schemas?
[21:31:49] nuria, no, the struct subfields keep their casing
[21:32:04] mforns: ok, there are also fields on edit schema with "."
[21:33:18] nuria, that schema is not populated in the event dtabase
[21:33:44] mforns: ok
[21:33:57] it does not even exist in the event_sanitized database
[21:34:00] (CR) Nuria: [V: +2 C: +2] Lowercase capsule fields in EL sanitization whitelist [analytics/refinery] - https://gerrit.wikimedia.org/r/489817 (https://phabricator.wikimedia.org/T209503) (owner: Mforns)
[21:34:02] although... it is in the whitelist
[21:36:53] mforns: ya, it is in the refine blacklist
[21:37:16] I wonder what would happen if it was sanitized with those field names...
[21:47:37] (CR) Nuria: [C: -1] Change email send workflow to notify of completed jobs (2 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/484657 (https://phabricator.wikimedia.org/T206894) (owner: Fdans)
[22:34:51] Analytics, DBA, MediaWiki-Database, Research, Wikidata: Improve interlingual links across wikis through Wikidata IDs - https://phabricator.wikimedia.org/T215616 (diego) @jcrespo, the API works good for query specific pages/entities, not for example to know which pages that existing in X_wiki...
[22:49:27] Analytics, Analytics-EventLogging, Patch-For-Review: eventlogging fails flake8 due to new upstream version, breaking CI - https://phabricator.wikimedia.org/T212396 (mforns) I fixed the flake8 and flake8-bin issues in the patch above. However, Jenkins still fails, because "ImportError: No module named...
[22:49:42] Analytics, Analytics-EventLogging, Analytics-Kanban, Patch-For-Review: eventlogging fails flake8 due to new upstream version, breaking CI - https://phabricator.wikimedia.org/T212396 (mforns) a:mforns
[23:34:11] Analytics, Operations, WMF-Legal, Privacy: Honor DNT header for access logs & varnish logs - https://phabricator.wikimedia.org/T98831 (leila) @Gilles in light of https://www.w3.org/2011/tracking-protection/ shall we decline this task? (Apple already announced that they will remove DNT from Safari).
[23:43:35] Analytics: Percentage of users with DNT on - https://phabricator.wikimedia.org/T127571 (leila) Open→Resolved