[02:45:26] Analytics, ORES, Patch-For-Review, Scoring-platform-team (Current): Wire ORES recent_score events into Hadoop - https://phabricator.wikimedia.org/T209732 (awight)
[04:23:18] (PS11) Awight: Oozie jobs to produce ORES data [analytics/refinery] - https://gerrit.wikimedia.org/r/482753 (https://phabricator.wikimedia.org/T209732)
[04:24:32] (CR) Awight: "I ran into an interesting twist: we need to watch both datacenters' mediawiki_revision_score streams in case of a service switchover, but " (14 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/482753 (https://phabricator.wikimedia.org/T209732) (owner: Awight)
[04:30:25] Analytics, ORES, Patch-For-Review, Scoring-platform-team (Current): Wire ORES recent_score events into Hadoop - https://phabricator.wikimedia.org/T209732 (awight) Something tricky I ran into: success files aren't written for hours where there are zero changeprop events through codfw. Maybe we ha...
[07:44:29] Analytics, Analytics-Kanban, Patch-For-Review: Set up a Analytics Hadoop test cluster in production that runs a configuration as close as possible to the current one. - https://phabricator.wikimedia.org/T212256 (ops-monitoring-bot) Script wmf-auto-reimage was launched by elukey on cumin1001.eqiad.wmn...
[08:20:46] Analytics, Analytics-Kanban, Patch-For-Review: Set up a Analytics Hadoop test cluster in production that runs a configuration as close as possible to the current one. - https://phabricator.wikimedia.org/T212256 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['analytics1039.eqiad.wmnet'] ` a...
[08:21:28] Hi team - still not 100% today - I think it's a flu-ish stuff - Will read emails and connect every now and then but will not try to produce
[08:22:51] rest joal!!
[08:42:47] Analytics, Analytics-Kanban, Patch-For-Review: Set up a Analytics Hadoop test cluster in production that runs a configuration as close as possible to the current one. - https://phabricator.wikimedia.org/T212256 (elukey)
[08:47:39] Analytics, Analytics-Kanban, Patch-For-Review: Set up a Analytics Hadoop test cluster in production that runs a configuration as close as possible to the current one. - https://phabricator.wikimedia.org/T212256 (elukey)
[09:21:55] Analytics, DBA, Operations, ops-eqiad: swap a2-eqiad PDU with on-site spare - https://phabricator.wikimedia.org/T213748 (fgiunchedi)
[09:24:17] Analytics, Analytics-Kanban, Patch-For-Review: Set up a Analytics Hadoop test cluster in production that runs a configuration as close as possible to the current one. - https://phabricator.wikimedia.org/T212256 (elukey)
[09:26:11] (CR) Mforns: Make saltrotate store salts with timestamps as file name. (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/484250 (https://phabricator.wikimedia.org/T212014) (owner: Mforns)
[09:32:35] (PS3) Mforns: Make saltrotate store salts with timestamps as file name. [analytics/refinery] - https://gerrit.wikimedia.org/r/484250 (https://phabricator.wikimedia.org/T212014)
[09:43:50] Analytics, Analytics-Kanban: Clean up staging db - https://phabricator.wikimedia.org/T212493 (elukey) So the staging db is kinda problematic to clean up, since it is difficult to figure out owners and reach out to people. I have already started to ask to people to review/drop the old tables, but as preca...
[09:46:10] Analytics, Analytics-Kanban: Clean up staging db - https://phabricator.wikimedia.org/T212493 (elukey) @Neil_P._Quinn_WMF @Milimetric @mforns @leila @nettrom_WMF @DarTar @Tbayer would you mind to review the tables in the description and see if anything is definitely not needed and can be dropped?
[09:48:28] Analytics, Analytics-Wikistats: [Wikistats v2] Default selection for (active) editors is confusing for inexperienced users - https://phabricator.wikimedia.org/T213800 (Nemo_bis)
[10:04:54] anyone around to help me try and locate the data for https://meta.wikimedia.org/wiki/Schema:WikibaseTermboxInteraction ?
[10:04:59] not sure where it is getting lost :/
[10:06:42] I'm seeing the JS code hit https://www.wikidata.org/beacon/event, and that looks correct
[10:07:40] oh wait, i see it in hive now, wow, maybe this event just hasn't been triggered by real users
[10:07:41] hah
[10:09:14] good :)
[10:22:56] Analytics, User-Elukey: Convert Aria tables to InnoDB on dbstore1002 - https://phabricator.wikimedia.org/T213706 (elukey) a:elukey
[10:24:42] Analytics, User-Elukey: Convert Aria tables to InnoDB on dbstore1002 - https://phabricator.wikimedia.org/T213706 (elukey) So as far as I can understand I'd need to grab the list of tables and produce a list of: ` ALTER TABLE $table-name ENGINE=InnoDB; ` @Marostegui does replication need to be stopped w...
[10:28:57] Analytics, User-Elukey: Convert Aria tables to InnoDB on dbstore1002 - https://phabricator.wikimedia.org/T213706 (Marostegui) No, I don't think you have to stop it. Keep in mind that you can also do: `alter table $SCHEMA.$TABLE engine=InnoDB`
[10:34:56] Analytics, User-Elukey: Convert Aria tables to InnoDB on dbstore1002 - https://phabricator.wikimedia.org/T213706 (elukey) Thanks! I have created `/home/elukey/aria_tables_alter.sql` on dbstore1002, if you can review them quickly as sanity check it would be great. Then I'd just execute mysql --skip-ssl <...
[10:36:45] Analytics, User-Elukey: Convert Aria tables to InnoDB on dbstore1002 - https://phabricator.wikimedia.org/T213706 (elukey) Another question - should we back up the staging database just in case something goes wrong?
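(Editor's sketch, not part of the log.) The T213706 messages above describe producing one `ALTER TABLE ... ENGINE=InnoDB` statement per Aria table, in the `$SCHEMA.$TABLE` form Marostegui suggests. A minimal Python sketch of that generation step follows. The function name `build_alter_statements` and the example table names are hypothetical, and it assumes the table list has already been collected, for instance from `information_schema.TABLES` filtered on `ENGINE = 'Aria'`; it does not reproduce the contents of `aria_tables_alter.sql`.

```python
def quote_identifier(name):
    """Backtick-quote a MySQL/MariaDB identifier, doubling embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def build_alter_statements(tables):
    """Return one ALTER TABLE ... ENGINE=InnoDB statement per (schema, table) pair."""
    statements = []
    for schema, table in tables:
        qualified = f"{quote_identifier(schema)}.{quote_identifier(table)}"
        statements.append(f"ALTER TABLE {qualified} ENGINE=InnoDB;")
    return statements


if __name__ == "__main__":
    # Hypothetical table names, used only to show the output shape.
    example_tables = [("staging", "example_table_a"), ("staging", "example_table_b")]
    print("\n".join(build_alter_statements(example_tables)))
```

Writing the printed statements to a file gives something that can be reviewed before it is fed to the `mysql` client, which matches the review-then-execute flow described in the task.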
[10:39:04] Analytics, Analytics-Kanban, Patch-For-Review, User-Elukey: Set up a Analytics Hadoop test cluster in production that runs a configuration as close as possible to the current one. - https://phabricator.wikimedia.org/T212256 (elukey)
[10:42:27] Analytics, User-Elukey: Convert Aria tables to InnoDB on dbstore1002 - https://phabricator.wikimedia.org/T213706 (Marostegui) >>! In T213706#4880452, @elukey wrote: > Thanks! > > I have created `/home/elukey/aria_tables_alter.sql` on dbstore1002, if you could review them quickly as sanity check it would...
[12:22:12] * elukey lunch!
[12:36:25] joal: feeling any better?
[12:36:42] oh no, I see your message now from earlier!
[12:36:45] get well soon!
[12:50:41] PROBLEM - eventbus grafana alert on icinga1001 is CRITICAL: CRITICAL: EventBus ( https://grafana.wikimedia.org/d/000000201/eventbus ) is alerting: EventBus POST Response Status alert.
[12:52:52] mmmm
[12:52:56] grafana alert?
[12:53:07] RECOVERY - eventbus grafana alert on icinga1001 is OK: OK: EventBus ( https://grafana.wikimedia.org/d/000000201/eventbus ) is not alerting.
[12:54:48] EventBus POST Response Status alert
[12:54:48] NO DATA for 2 minutes
[12:54:52] ah there you go
[12:55:57] will ask to Andrew
[13:08:06] Analytics: virtualpageview_hourly lacks data from December 17 on - https://phabricator.wikimedia.org/T213602 (Tbayer) Great, thank you @Ottomata and everyone else for solving this so quickly! >>! In T213602#4878976, @Nuria wrote: > Data is present now up to the 22nd. >>! In T213602#4878990, @Nuria wrote: >...
[13:10:47] Analytics, Analytics-Kanban, DBA, Patch-For-Review, User-Banyek: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5] - https://phabricator.wikimedia.org/T210478 (Marostegui) Let's wait for {T213706} to be done before we migrate the existing copy of `stagingdb` to any of the hosts.
[13:10:58] Analytics, User-Elukey: Convert Aria tables to InnoDB on dbstore1002 - https://phabricator.wikimedia.org/T213706 (Marostegui)
[13:11:02] Analytics, Analytics-Kanban, DBA, Patch-For-Review, User-Banyek: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5] - https://phabricator.wikimedia.org/T210478 (Marostegui)
[13:42:35] Analytics: virtualpageview_hourly lacks data from December 17 on - https://phabricator.wikimedia.org/T213602 (Nuria) It finished by midnight the December data, which is the one you needed for the report.
[13:48:55] Analytics, User-Elukey: Convert Aria tables to InnoDB on dbstore1002 - https://phabricator.wikimedia.org/T213706 (elukey) p:Normal→High
[13:51:05] Analytics: Alarms for virtualpageview should exist (probably in oozie) for jobs that have been idle too long - https://phabricator.wikimedia.org/T213716 (elukey) As FYI I can see the following for Dec 17th in my inbox for analytics-alerts@: ` OOZIE - SLA END_MISS (AppName=virtualpageview-hourly-coord, JobID...
[14:02:04] fdans: moved systemd timers and couple other snippets to the new doc and "emptied" the old one
[14:04:00] nuria: those bits should be turned into something actionable, I don't really know what to do with the systemd timers
[14:05:01] fdans: we can rework that but it is very useful, if a job fails, you want to see logs the way to do it since we use systemd has changed a lot
[14:12:29] nuria: in the docs the only bit missing is, as far as I can see, how to restart jobs
[14:12:32] I can add it now
[14:12:53] but it is basically issuing a start to the service unit
[14:14:11] ah no it is mentioned, trying to highlight it
[14:14:32] anyway, I'd suggest to play with them and see what are the doubts etc..
[14:16:04] ah no just realized that it was fdans to discuss about timers
[14:16:15] :)
[14:17:29] Analytics: Reportupdater should alert if it fails over and over - https://phabricator.wikimedia.org/T213309 (elukey) We decided to try a simple systemd timer for the moment, that will alarm if report updater will run and return a non zero code. This is currently tracked in T172532
[14:28:36] Analytics, Analytics-Kanban, User-Elukey: Convert Aria tables to InnoDB on dbstore1002 - https://phabricator.wikimedia.org/T213706 (elukey)
[14:29:03] fdans: you there?
[14:29:12] hellooo
[14:29:14] o/
[14:29:25] if you have time we can deploy superset
[14:29:38] elukey: I'm ready if you are
[14:29:50] let's do it
[14:30:01] so I am going to merge https://gerrit.wikimedia.org/r/#/c/analytics/superset/deploy/+/481056/ (please check that it is the good one)
[14:30:19] stop superset, take a dump of the database, and finally deploy
[14:30:23] how does it sound?
[14:35:21] fdans: ?
[14:35:46] haha I read it as "take a dump on the database"
[14:36:02] ahahhahaha
[14:36:16] the patch looks good to me
[14:36:24] (CR) Fdans: [C: +1] Bump to superset version 0.26.3-wikimedia1 [analytics/superset/deploy] - https://gerrit.wikimedia.org/r/481056 (owner: Ottomata)
[14:36:44] fdans: can you +2 merge it and update the superset repo on deploy1001 while I take the mysql dump ?
[14:36:59] !log stop superset to allow a clean mysqldump
[14:37:01] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[14:37:10] elukey: on it
[14:38:23] (CR) Fdans: [V: +2 C: +2] Bump to superset version 0.26.3-wikimedia1 [analytics/superset/deploy] - https://gerrit.wikimedia.org/r/481056 (owner: Ottomata)
[14:38:46] fdans: feel free to deploy whenever you want
[14:39:50] elukey: merged and pulled last version on repo
[14:40:06] you should have perms to deploy right?
[14:40:15] yeah, using scap?
[14:40:18] yep
[14:40:21] cool
[14:40:58] !log deploying superset 0.26.3-wikimedia1
[14:40:59] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[14:41:29] elukey: done
[14:42:02] fdans: goood! so in theory no db upgrade is needed
[14:42:29] let's check the dashboards
[14:42:36] and then if the issue has been fixed
[14:44:08] Analytics, ORES, Patch-For-Review, Scoring-platform-team (Current): Wire ORES recent_score events into Hadoop - https://phabricator.wikimedia.org/T209732 (Ottomata) Yeah, this isn't the first time we've had this problem. It isn't actually that easy to solve, because the Kafka consumer doesn't ad...
[14:48:38] Analytics, EventBus, Operations, Services (watching): Discovery for Kafka cluster brokers - https://phabricator.wikimedia.org/T213561 (fgiunchedi) Discovery records for kafka would come handy in the logging pipeline case too, namely during datacenter failover to move producers off a given datacen...
[14:49:49] something is weird, it seems like we deployed 0.28 again
[14:49:50] elukey: lol the issue is fixed by f-strings in the superset repo
[14:50:07] yeah but that was the thing that we noticed after the deploy to 0.28
[14:50:24] we are seeing the same errors?
[14:50:28] how the hell is possible?
[14:51:26] elukey: hmmm, if that's the case the filter box should be all broken, lemme check
[14:51:47] elukey@analytics-tool1003:/srv/deployment/analytics/superset/deploy$ ls artifacts/stretch/superset*
[14:51:50] artifacts/stretch/superset-0.26.3_wikimedia1-py3-none-any.whl
[14:51:56] this looks good
[14:52:05] elukey: yep, filterbox is broken
[14:52:05] https://superset.wikimedia.org/superset/dashboard/geowikiarchive/?preselect_filters=%7B%0A%20%20%2248%22%3A%20%7B%0A%20%20%20%20%22__from%22%3A%20%222018-03-01T00%3A00%3A00%22%2C%0A%20%20%20%20%22__to%22%3A%20%222018-04-01T00%3A00%3A00%22%0A%20%20%7D%0A%7D
[14:52:21] it seems like we are indeed in 0.28
[14:53:07] sigh
[14:53:14] let's rollback fdans
[14:53:41] we need the staging environment
[14:53:46] before any more deployment
[14:53:55] elukey: can we rollback using scap?
[14:54:14] or revert + scap deploy?
[14:54:34] fdans: revert + scap deploy afaik
[14:55:05] ok doing it elukey
[14:55:19] (PS1) Fdans: Revert "Bump to superset version 0.26.3-wikimedia1" [analytics/superset/deploy] - https://gerrit.wikimedia.org/r/484444
[14:55:44] (CR) Fdans: [V: +2 C: +2] Revert "Bump to superset version 0.26.3-wikimedia1" [analytics/superset/deploy] - https://gerrit.wikimedia.org/r/484444 (owner: Fdans)
[14:56:30] !log "rolling back to stable superset"
[14:56:31] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[14:58:26] fdans: is the deploy ongoing or already finished?
[14:58:51] elukey: still on promote and restart_service stage(s)
[14:59:14] ah okok
[14:59:18] so that explains the 502s
[15:00:03] elukey: it does seem hard stuck there
[15:00:14] it didn't take that long at all before
[15:00:34] Analytics, ORES, Patch-For-Review, Scoring-platform-team (Current): Wire ORES recent_score events into Hadoop - https://phabricator.wikimedia.org/T209732 (Nuria) So i understand, the expectation here will be that files are written for all hours but empty for those of which there was no data?
[15:01:27] elukey: failed
[15:02:02] https://www.irccloud.com/pastebin/z2D2TojK/
[15:02:38] that doesn't make any sense though
[15:03:17] because that git repo is only present in the new version
[15:03:23] ah no right
[15:03:30] nono this is the revert
[15:03:30] there are other two patches merged by andrew
[15:03:36] before that
[15:04:27] fdans: we can do something like that - create a branch on deploy1001 from the last commit from upstream
[15:04:39] should be the one before andrew's merges
[15:04:43] then deploy from that one
[15:04:57] in the meantime we'll try to figure out how to proceed
[15:05:01] how does it sound?
[15:05:47] fdans: ?
[15:05:48] elukey: not sure what you mean with the last commit from upstream
[15:06:29] you are right, I meant the last "stable" commit from us
[15:06:46] that should be fcc7058e90a8fc83eeaa012bd751af2a0f7f3fb0
[15:07:06] after that there are 3 commits from andrew + your revert
[15:08:02] otherwise I can do it
[15:08:04] let me know
[15:08:19] ok, let me see
[15:08:19] or even better, bc?
[15:08:28] probably more productive
[15:11:35] Analytics, EventBus, Operations, Services (watching): Discovery for Kafka cluster brokers - https://phabricator.wikimedia.org/T213561 (Joe) Sorry, I need some more specifics: you want to make a dns query, and get as a response the "nearest" kafka cluster in the form of a list of hostnames/ports?...
[15:11:43] o/ lemme know if yall need help!
[15:13:35] Analytics, EventBus, Operations, Services (watching): Discovery for Kafka cluster brokers - https://phabricator.wikimedia.org/T213561 (Ottomata) No no for me, all I want is an alias for the list of Kafka brokers in a given Kafka cluster. I don't need any DC failover stuff. Perhaps discovery is...
[15:13:45] ottomata: if you have time we are in bc!
[15:23:05] Analytics, Analytics-Kanban: Clean up staging db - https://phabricator.wikimedia.org/T212493 (mforns) @elukey Definitely the tables prefixed with mforns_ can be deleted.
[15:26:14] Analytics, EventBus, Operations, Services (watching): Discovery for Kafka cluster brokers - https://phabricator.wikimedia.org/T213561 (Joe) Might I suggest that you use a SRV dns record instead? It's more appropriate for enumerating members in a cluster. We use those for etcd discovery.
[15:29:35] Analytics, EventBus, WMF-JobQueue, Core Platform Team (Security, stability, performance and scalability (TEC1)), and 2 others: EventBus error "Unable to deliver all events: (curl error: 28) Timeout was reached" - https://phabricator.wikimedia.org/T204183 (kchapman)
[15:33:15] Analytics, EventBus, WMF-JobQueue, Core Platform Team (Security, stability, performance and scalability (TEC1)), and 3 others: EventBus error "Unable to deliver all events: (curl error: 28) Timeout was reached" - https://phabricator.wikimedia.org/T204183 (CCicalese_WMF) a:Pchelolo
[16:15:41] Analytics, Analytics-EventLogging, Analytics-Kanban, MW-1.32-notes (WMF-deploy-2018-10-16 (1.32.0-wmf.26)), and 3 others: Spin out a tiny EventLogging RL module for lightweight logging - https://phabricator.wikimedia.org/T187207 (Milimetric) I just checked and I think we've exorcised any async or...
[16:21:16] Analytics: Alarms for virtualpageview should exist (probably in oozie) for jobs that have been idle too long - https://phabricator.wikimedia.org/T213716 (Nuria) a:Nuria
[16:27:33] Analytics, ORES, Patch-For-Review, Scoring-platform-team (Current): Wire ORES recent_score events into Hadoop - https://phabricator.wikimedia.org/T209732 (awight) >>! In T209732#4881126, @Ottomata wrote: > We could emit a single test event per hour into the topic in each dc... :) That works for...
[16:33:24] Analytics, Analytics-Kanban, User-Elukey: Convert Aria tables to InnoDB on dbstore1002 - https://phabricator.wikimedia.org/T213706 (elukey) Took a mysqldump of the staging database and moved it in two places: * on dbstore1002's /srv/elukey_backup * on stat1007's /home/elukey home dir (chmod root:root...
[17:06:23] Analytics, DBA, Operations, ops-eqiad: swap a2-eqiad PDU with on-site spare - https://phabricator.wikimedia.org/T213748 (RobH) @fgiunchedi: Thanks for updating about the ms-be systems! I see you added they can be gracefully powered down, can we just power them back up and ensure puppet runs post...
[17:07:31] Analytics, DBA, Operations, ops-eqiad: swap a2-eqiad PDU with on-site spare - https://phabricator.wikimedia.org/T213748 (RobH) >>! In T213748#4881709, @RobH wrote: > @fgiunchedi: Thanks for updating about the ms-be systems! I see you added they can be gracefully powered down, can we just power t...
[17:50:33] Analytics, Analytics-Kanban: Clean up staging db - https://phabricator.wikimedia.org/T212493 (leila) @elukey Please delete tables that include leila and leizi in their name. And my apologies that I didn't clean up after myself. I will do better in the future.
[17:52:42] Analytics, Analytics-Kanban: Clean up staging db - https://phabricator.wikimedia.org/T212493 (DarTar) @elukey same for all dartar_* tables, they can safely be removed.
[17:54:14] Analytics, Analytics-Kanban: Clean up staging db - https://phabricator.wikimedia.org/T212493 (elukey) Thanks all!
[17:55:16] milimetric: whenever you have time, can you check the milimetrics_ prefixed tables --^
[17:55:19] ?
[17:56:00] * elukey off!
[17:56:02] o/
[17:58:10] Analytics, Analytics-Kanban: Clean up staging db - https://phabricator.wikimedia.org/T212493 (Neil_P._Quinn_WMF) It looks like I only have a few staging tables because I've been cleaning up as I go, but I checked and dropped `ve_experiment_expanded` and `neilpquinn_VE_experiment_revs`
[18:01:08] Analytics, Analytics-Kanban: Clean up staging db - https://phabricator.wikimedia.org/T212493 (diego) Hi! I'm not sure what is this, but for sure you can delete diego_tmp. Thanks
[18:09:27] Analytics, Analytics-Kanban: Clean up staging db - https://phabricator.wikimedia.org/T212493 (nettrom_WMF) I went ahead and deleted all tables starting with "nettrom_" except the four tables referenced in T190434#4085830.
[18:54:10] It's not clear to me whether it's safe to run concurrent inserts into the same table from Oozie...
[18:57:55] Analytics, Analytics-Kanban, Operations, netops: Figure out networking details for new cloud-analytics-eqiad Hadoop/Presto cluster - https://phabricator.wikimedia.org/T207321 (Ottomata) Tomorrow (Jan 15) we have a meeting with some SRE folks to revisit this. We've got the cloud-analytics Hadoop...
[18:59:25] Analytics, Analytics-Kanban, Operations, netops: Figure out networking details for new cloud-analytics-eqiad Hadoop/Presto cluster - https://phabricator.wikimedia.org/T207321 (Ottomata) Some links: - https://prestodb.io/docs/current/security/ldap.html - https://prestodb.io/docs/current/connector...
[19:01:19] hey awight, i see some oozie failure emails
[19:01:21] i think you don't get those
[19:01:30] you should add your email address to the list of emails that get alerts for that job
[19:02:04] OH
[19:02:08] its hardcoded!
[19:02:22] in send_error_email workflow (assuming you are using that)
[19:02:39] oh hmm maybe its not
[19:03:35] hm its not hardcoded but we never override it anywhere
[19:03:37] you probably should for this
[19:03:50] awight: to answer your previous question
[19:03:56] as long as the inserts are into separate hive partitions
[19:04:00] it is fine to do it concurrently
[19:04:43] ottomata: thanks for the heads-up, I was fumbling the send_error_email overwrites, would be nicer if I could override or something...
[19:04:52] Hopefully it stops, the job is killed.
[19:05:08] hrm, definitely going to be the same partition, so I'll just set concurrency to 1
[19:05:22] Analytics, Analytics-Kanban: Clean up staging db - https://phabricator.wikimedia.org/T212493 (Neil_P._Quinn_WMF) I also checked with @JKatzWMF and dropped `jkatz_foo`, `jkatz_foosss26`, and `editor_stats_JK_test`.
[19:09:27] if same partition it might be ok awight depends on how the insert is
[19:09:32] actually, i think it will be fine if you are not doing insert overwrite
[19:09:37] it'll just write new files
[19:09:39] in the same partition
[19:10:19] awight: for email overriding, i think if you set something like error_alert_contact=analytics-alerts@wikimedia.org,awight@wikimedia.org
[19:10:26] you can then pass it to send_error_email workflow as the
[19:10:47] ${error_alert_contact} email
[19:10:50] param*
[19:10:59] oh
[19:11:00] i guess
[19:11:14]
[19:11:14] to
[19:11:14] ${error_alert_contact}
[19:11:14]
[19:11:19] something like that
[19:11:25] ottomata: yes! great, thanks
[19:11:34] I was missing that parameter
[19:11:44] and apologies for the team spam...
[19:58:31] Analytics, EventBus, Operations, Patch-For-Review, Services (watching): Discovery for Kafka cluster brokers - https://phabricator.wikimedia.org/T213561 (Pchelolo) Once we get it we would need to update Change-Prop, JQ-Change-Prop, EventBus-service, event streams to use the new DNS record.
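(Editor's sketch, not part of the log.) The 19:10 to 19:11 exchange above suggests passing `${error_alert_contact}` to the `send_error_email` workflow under a parameter named `to`; the four 19:11:14 messages appear to be a pasted markup snippet whose tags were lost in logging. A minimal Python sketch that builds a property of that shape follows. The helper name `oozie_property` is hypothetical, the `<property>`/`<name>`/`<value>` layout is the usual Oozie configuration form and is an assumption about what was pasted, and whether `send_error_email` reads `to` exactly this way is not confirmed by the log.

```python
import xml.etree.ElementTree as ET


def oozie_property(name, value):
    """Serialize one <property><name>..</name><value>..</value></property> element."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = value
    # encoding="unicode" returns a str without an XML declaration.
    return ET.tostring(prop, encoding="unicode")


if __name__ == "__main__":
    # Oozie resolves ${error_alert_contact} at runtime; here it is a literal string.
    print(oozie_property("to", "${error_alert_contact}"))
```

With `error_alert_contact` set in the job properties to a comma-separated address list, as in the 19:10:19 message, a property like this would go in the `<configuration>` block of the sub-workflow action that calls `send_error_email`.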
[20:20:51] Analytics, EventBus, Operations, Patch-For-Review, Services (watching): Discovery for Kafka cluster brokers - https://phabricator.wikimedia.org/T213561 (akosiaris) >>! In T213561#4881255, @Joe wrote: > Might I suggest that you use a SRV dns record instead? It's more appropriate for enumeratin...
[20:21:56] Analytics, Analytics-Kanban: Clean up staging db - https://phabricator.wikimedia.org/T212493 (Milimetric) dropped all `milimetric_` tables, and gave up on my dreams of figuring out what exactly is going on with mediawiki's revision table.
[20:23:45] Analytics, EventBus, Operations, Patch-For-Review, Services (watching): Discovery for Kafka cluster brokers - https://phabricator.wikimedia.org/T213561 (Ottomata) Kafka doesn't support SRV. Hence my Round Robin DNS patch. After more discussion with @bblack, I think I've decided to abandon t...
[20:24:41] Analytics: Find out what happens to the old rows in the revision table - https://phabricator.wikimedia.org/T142535 (Milimetric) I just dropped the data I mentioned in this task. Since we've been sqooping from mediawiki, we have a version of this kind of data in the `wmf_raw` database, in the `mediawiki_revi...
[21:17:25] (PS12) Awight: Oozie jobs to produce ORES data [analytics/refinery] - https://gerrit.wikimedia.org/r/482753 (https://phabricator.wikimedia.org/T209732)
[23:18:20] Analytics, Research, Article-Recommendation: Transferring data from Hadoop to production MySQL database - https://phabricator.wikimedia.org/T213566 (Nuria) The first thing we need to do is to oozie-fy the data creation workflow that produces the files you would be loading into mysql (likely tsv), let...
[23:33:00] Analytics, Analytics-Kanban, Operations, netops: Figure out networking details for new cloud-analytics-eqiad Hadoop/Presto cluster - https://phabricator.wikimedia.org/T207321 (ayounsi) I think there is a distinction to make here when saying "prod", as it's made of several vlans/networks, especial...