[00:10:42] DBA, DC-Ops, Operations, ops-codfw: (Need By: 2020-11-29) rack/setup/install db214[234] - https://phabricator.wikimedia.org/T267041 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['db2142.codfw.wmnet'] ` Of which those **FAILED**: ` ['db2142.codfw.wmnet'] `
[00:22:01] DBA, Community-Tech, Expiring-Watchlist-Items: Watchlist Expiry: Release plan [rough schedule] - https://phabricator.wikimedia.org/T261005 (ifried) @Marostegui The feature has now been enabled on all wikis.
[02:55:34] DBA, MediaWiki-extensions-FlaggedRevs, User-DannyS712: flaggedrevs_statistics is never purged - https://phabricator.wikimedia.org/T269196 (DannyS712)
[03:09:39] DBA, MediaWiki-extensions-FlaggedRevs, User-DannyS712: flaggedrevs_statistics is never purged - https://phabricator.wikimedia.org/T269196 (Zache) Please don't purge them. If one wants to do timelines or graphs (Like i do) of how the revieweing have been developed or wants to compare different languag...
[03:21:32] DBA, MediaWiki-extensions-FlaggedRevs, Technical-Debt, User-DannyS712: flaggedrevs_statistics is never purged - https://phabricator.wikimedia.org/T269196 (Reedy) I guess this is from some never finished feature, maybe (as it was never really exposed much in FR itself?)... @aaron ? Or is it used b...
[03:42:10] DBA, MediaWiki-extensions-FlaggedRevs, Technical-Debt, User-DannyS712: flaggedrevs_statistics is never purged - https://phabricator.wikimedia.org/T269196 (Zache) I think the most important use case is statistics analysis. Also afaik i think it is needed also WMF/Community figures out what to do w...
[03:59:53] DBA, MediaWiki-extensions-FlaggedRevs, Technical-Debt, User-DannyS712: flaggedrevs_statistics is never purged - https://phabricator.wikimedia.org/T269196 (Zache) And for current uses, least I am using them for the graphs. Though, this is not realtime as it used to aggregate data from other tables...
[05:55:45] DBA, Data-Services, Patch-For-Review, cloud-services-team (Kanban): Productionize clouddb10[13-20] - https://phabricator.wikimedia.org/T267090 (Marostegui) s1 situation: Transfer from db1106 (sanitarium master) to clouddb1013:3311 and clouddb1017:3311 completed successfully. Sanitization on clou...
[05:56:22] DBA, Data-Services, Patch-For-Review, cloud-services-team (Kanban): Productionize clouddb10[13-20] - https://phabricator.wikimedia.org/T267090 (Marostegui)
[06:06:15] DBA, Data-Services, Patch-For-Review, cloud-services-team (Kanban): Deploy labsdbuser and views to new clouddb hosts - https://phabricator.wikimedia.org/T268312 (Marostegui) @Bstorm clouddb1013:3311 and clouddb1017:3311 are back and with all the grants and the `_p` database. Can you try those?...
[06:06:50] DBA, Data-Services, Patch-For-Review, cloud-services-team (Kanban): Deploy labsdbuser and views to new clouddb hosts - https://phabricator.wikimedia.org/T268312 (Marostegui)
[06:12:46] DBA, Data-Services, Patch-For-Review, cloud-services-team (Kanban): Deploy labsdbuser and views to new clouddb hosts - https://phabricator.wikimedia.org/T268312 (Marostegui)
[06:23:20] DBA, decommission-hardware, Patch-For-Review: decommission es1017.eqiad.wmnet - https://phabricator.wikimedia.org/T268825 (Marostegui) @Volans this host has the mgmt interface down (and most likely broken) so as expected, the boot loaders cannot be wiped, how should we proceed with those issues?
[06:28:11] DBA, decommission-hardware, Patch-For-Review: decommission es1017.eqiad.wmnet - https://phabricator.wikimedia.org/T268825 (Marostegui) The host is off by the way, it never came back from a reboot a few days ago.
[06:31:55] DBA, Community-Tech, Expiring-Watchlist-Items: Watchlist Expiry: Release plan [rough schedule] - https://phabricator.wikimedia.org/T261005 (Marostegui) Thanks @ifried For the record and for future reference when checking graphs/logs this is when it was enabled: ` Mentioned in SAL (#wikimedia-operatio...
[06:32:45] DBA, decommission-hardware, Patch-For-Review: decommission es1017.eqiad.wmnet - https://phabricator.wikimedia.org/T268825 (ops-monitoring-bot) cookbooks.sre.hosts.decommission executed by marostegui@cumin1001 for hosts: `es1017.eqiad.wmnet` - es1017.eqiad.wmnet (**FAIL**) - **Failed downtime host o...
[06:59:18] DBA, GrowthExperiments, Growth-Team (Current Sprint), Patch-For-Review, and 2 others: Slow load times for Special:Homepage on cswiki - https://phabricator.wikimedia.org/T267216 (Tgr) Breaking up the query doesn't seem to affect the planner: ` MariaDB [cswiki]> explain SELECT /* GrowthExperiments\...
[07:03:03] DBA, GrowthExperiments, Growth-Team (Current Sprint), Patch-For-Review, and 2 others: Slow load times for Special:Homepage on cswiki - https://phabricator.wikimedia.org/T267216 (Tgr) There is a less arbitrary way to improve it though. The above query can actually be written as ` SELECT /* GrowthE...
[07:03:38] DBA, GrowthExperiments, Growth-Team (Current Sprint), Patch-For-Review, and 2 others: Slow load times for Special:Homepage on cswiki - https://phabricator.wikimedia.org/T267216 (Marostegui) @Tgr it does seem it is picking the right index (the PK) with the first query: ` root@db1129.eqiad.wmnet[cs...
[07:23:28] DBA, decommission-hardware: decommission es1017.eqiad.wmnet - https://phabricator.wikimedia.org/T268825 (Marostegui)
[08:06:57] DBA: New database request: sockpuppet - https://phabricator.wikimedia.org/T268505 (Marostegui) >>! In T268505#6660388, @hnowlan wrote: >>>! In T268505#6659031, @Marostegui wrote: >> @hnowlan let's throttle the writes if we can and it is not too much of a hassle. This database will live with many more, so let...
[08:08:28] DBA: New database request: sockpuppet - https://phabricator.wikimedia.org/T268505 (Marostegui)
[08:12:54] DBA: New database request: sockpuppet - https://phabricator.wikimedia.org/T268505 (Marostegui) @hnowlan how will you create the tables initially? Do you need CREATE?
[08:19:51] marostegui: we have 600 more drifts since more tables got abstracted and now we can check their size or field types. wikitech:s10 leading with 30 unique drifts
[08:20:11] I'll make tickets
[08:20:17] Amir1: thank you - appreciated it
[08:21:09] I'll fix s10 later, currently swamped in work
[08:22:10] thanks Amir1
[09:01:54] DBA, decommission-hardware: decommission es1017.eqiad.wmnet - https://phabricator.wikimedia.org/T268825 (Volans) @Marostegui currently the wipe of bootloaders is done from the OS, not the mgmt, so if the host is already down/broken it can't be done, but is not affected by the mgmt console not working. T...
[09:02:40] DBA, decommission-hardware: decommission es1017.eqiad.wmnet - https://phabricator.wikimedia.org/T268825 (Marostegui) Ah, excellent @volans - thanks for clarifying that.
[09:03:33] DBA, decommission-hardware: decommission es1017.eqiad.wmnet - https://phabricator.wikimedia.org/T268825 (Marostegui)
[09:06:34] DBA, DC-Ops, decommission-hardware: decommission es1017.eqiad.wmnet - https://phabricator.wikimedia.org/T268825 (Marostegui) a:Marostegui→wiki_willy
[09:39:16] DBA, Data-Services, Patch-For-Review, cloud-services-team (Kanban): Productionize clouddb10[13-20] - https://phabricator.wikimedia.org/T267090 (Marostegui)
[09:46:41] DBA, GrowthExperiments, Growth-Team (Current Sprint), Patch-For-Review, and 2 others: Slow load times for Special:Homepage on cswiki - https://phabricator.wikimedia.org/T267216 (Tgr) Yeah, I meant to say, as soon as there's any disjunction in the query it will pick tl_namespace, even if it's just...
[09:54:41] DBA, Data-Services, Patch-For-Review, cloud-services-team (Kanban): Productionize clouddb10[13-20] - https://phabricator.wikimedia.org/T267090 (Marostegui)
[09:55:55] DBA, Data-Services, Patch-For-Review, cloud-services-team (Kanban): Productionize clouddb10[13-20] - https://phabricator.wikimedia.org/T267090 (Marostegui)
[10:27:41] DBA, Data-Services, Patch-For-Review, cloud-services-team (Kanban): Productionize clouddb10[13-20] - https://phabricator.wikimedia.org/T267090 (Marostegui) @Bstorm just for my own organization, any ETA on when clouddb1020 will be released from your side? Thanks
[12:07:28] DBA, Operations, Performance-Team, Platform Engineering, User-Kormat: Remove groups from db configs - https://phabricator.wikimedia.org/T263127 (daniel) Back to the PET inbox per @WDoranWMF. We need to figure out where this fits in our process/roadmap.
[12:54:26] DBA: Add a link engineering: Database for link recommendation service - https://phabricator.wikimedia.org/T267214 (Marostegui) Open→Resolved I have tested the connection from kubernetes1017 which is on 10.64.0, and it works fine, it can reach `m2-master.eqiad.wmnet` thru port 3306 just fine. Going t...
[13:01:07] DBA: Add a link engineering: Database for link recommendation service - https://phabricator.wikimedia.org/T267214 (kostajh) Thank you @Marostegui!
[13:18:53] DBA: Add a link engineering: Database for link recommendation service - https://phabricator.wikimedia.org/T267214 (akosiaris) >>! In T267214#6662762, @Marostegui wrote: > I have tested the connection from kubernetes1017 which is on 10.64.0, and it works fine, it can reach `m2-master.eqiad.wmnet` thru port 3...
[13:25:33] DBA: Add a link engineering: Database for link recommendation service - https://phabricator.wikimedia.org/T267214 (Marostegui) >>! In T267214#6662790, @akosiaris wrote: >>>! In T267214#6662762, @Marostegui wrote: >> I have tested the connection from kubernetes1017 which is on 10.64.0, and it works fine, it c...
[13:30:22] DBA: Add a link engineering: Database for link recommendation service - https://phabricator.wikimedia.org/T267214 (akosiaris) >>! In T267214#6662807, @Marostegui wrote: >>>! In T267214#6662790, @akosiaris wrote: >>>>! In T267214#6662762, @Marostegui wrote: >>> I have tested the connection from kubernetes1017...
[13:31:52] DBA: Add a link engineering: Database for link recommendation service - https://phabricator.wikimedia.org/T267214 (Marostegui) Thanks - that makes sense!
[13:37:53] DBA, Patch-For-Review: New database request: sockpuppet - https://phabricator.wikimedia.org/T268505 (hnowlan) >>! In T268505#6662043, @Marostegui wrote: > @hnowlan how will you create the tables initially? Do you need CREATE? Yes please! Just for `sockpuppet_import`.
[13:39:06] DBA, Patch-For-Review: New database request: sockpuppet - https://phabricator.wikimedia.org/T268505 (Marostegui)
[14:39:58] DBA, Data-Services, Patch-For-Review, cloud-services-team (Kanban): Productionize clouddb10[13-20] - https://phabricator.wikimedia.org/T267090 (Marostegui)
[15:19:35] DBA, Data-Services, Patch-For-Review, cloud-services-team (Kanban): Deploy labsdbuser and views to new clouddb hosts - https://phabricator.wikimedia.org/T268312 (Bstorm) >>! In T268312#6661950, @Marostegui wrote: > > I am not fully sure how the script works, but given that each instance has its...
[15:19:45] DBA, Patch-For-Review: New database request: sockpuppet - https://phabricator.wikimedia.org/T268505 (Marostegui) @hnowlan one more clarifying question. On the initial task you wrote: `Probably not needed - data for the database is generated from PySpark models and can be regenerated ` but at T268505#6641...
[15:21:42] DBA, Data-Services, Patch-For-Review, cloud-services-team (Kanban): Deploy labsdbuser and views to new clouddb hosts - https://phabricator.wikimedia.org/T268312 (Marostegui) That'd be strange if they were missing databases, as the copies are being done from the sanitarium hosts or even their mast...
[15:29:48] DBA, Community-Tech, Expiring-Watchlist-Items: Watchlist Expiry: Release plan [rough schedule] - https://phabricator.wikimedia.org/T261005 (ifried) Sure, sounds good. Thanks, @Marostegui!
[15:29:53] DBA, Data-Services, Patch-For-Review, cloud-services-team (Kanban): Deploy labsdbuser and views to new clouddb hosts - https://phabricator.wikimedia.org/T268312 (Bstorm) I'll get more info today by trying again on servers that have possible issues with debug logging (s7, s6, s5...I think all the...
[15:35:42] DBA, Data-Services, Patch-For-Review, cloud-services-team (Kanban): Deploy labsdbuser and views to new clouddb hosts - https://phabricator.wikimedia.org/T268312 (Marostegui) >>! In T268312#6663238, @Bstorm wrote: > I'll get more info today by trying again on servers that have possible issues with...
[15:38:02] DBA, Operations, ops-eqiad: Degraded RAID on es1023 - https://phabricator.wikimedia.org/T268796 (Cmjohnson) Ticket created with Dell, Sending a new disk to equinix.
[15:43:14] DBA, Patch-For-Review: New database request: sockpuppet - https://phabricator.wikimedia.org/T268505 (hnowlan) >>! In T268505#6663207, @Marostegui wrote: > @hnowlan one more clarifying question. > On the initial task you wrote: `Probably not needed - data for the database is generated from PySpark models...
[17:36:45] DBA, DC-Ops, Operations, ops-codfw: (Need By: 2020-11-29) rack/setup/install db214[234] - https://phabricator.wikimedia.org/T267041 (ops-monitoring-bot) Script wmf-auto-reimage was launched by pt1979 on cumin2001.codfw.wmnet for hosts: ` db2143.codfw.wmnet ` The log can be found in `/var/log/wmf-...
[17:39:29] DBA, Operations, ops-eqiad: db1139 memory errors on boot (issue continues after board change) 2020-08-27 - https://phabricator.wikimedia.org/T261405 (jcrespo) @Jclark-ctr were you able to contact HP again? Host is again down so it can be managed at any time.
[18:00:43] DBA, DC-Ops, Operations, ops-codfw: (Need By: 2020-11-29) rack/setup/install db214[234] - https://phabricator.wikimedia.org/T267041 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['db2143.codfw.wmnet'] ` Of which those **FAILED**: ` ['db2143.codfw.wmnet'] `
[18:02:41] DBA, Data-Services, Patch-For-Review, cloud-services-team (Kanban): Deploy labsdbuser and views to new clouddb hosts - https://phabricator.wikimedia.org/T268312 (Bstorm) Yeah, I think I'll update the task over there today to take clouddb1020 off it. It just makes it more confusing anyway.
[18:04:06] DBA, DC-Ops, Operations, ops-codfw: (Need By: 2020-11-29) rack/setup/install db214[234] - https://phabricator.wikimedia.org/T267041 (Papaul)
[19:52:02] DBA, DC-Ops, Operations, ops-codfw: (Need By: 2020-11-29) rack/setup/install db214[234] - https://phabricator.wikimedia.org/T267041 (ops-monitoring-bot) Script wmf-auto-reimage was launched by pt1979 on cumin2001.codfw.wmnet for hosts: ` db2144.codfw.wmnet ` The log can be found in `/var/log/wmf-...
[20:10:26] DBA, DC-Ops, Operations, ops-codfw: (Need By: 2020-11-29) rack/setup/install db214[234] - https://phabricator.wikimedia.org/T267041 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['db2144.codfw.wmnet'] ` Of which those **FAILED**: ` ['db2144.codfw.wmnet'] `
[20:33:53] DBA, DC-Ops, Operations, ops-codfw: (Need By: 2020-11-29) rack/setup/install db214[234] - https://phabricator.wikimedia.org/T267041 (ops-monitoring-bot) Script wmf-auto-reimage was launched by pt1979 on cumin2001.codfw.wmnet for hosts: ` db2144.codfw.wmnet ` The log can be found in `/var/log/wmf-...
[20:52:31] DBA, DC-Ops, Operations, ops-codfw: (Need By: 2020-11-29) rack/setup/install db214[234] - https://phabricator.wikimedia.org/T267041 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['db2144.codfw.wmnet'] ` Of which those **FAILED**: ` ['db2144.codfw.wmnet'] `
[21:44:51] DBA, DC-Ops, Operations, ops-codfw: (Need By: 2020-11-29) rack/setup/install db214[234] - https://phabricator.wikimedia.org/T267041 (Papaul)
[21:45:25] DBA, DC-Ops, Operations, ops-codfw: (Need By: 2020-11-29) rack/setup/install db214[234] - https://phabricator.wikimedia.org/T267041 (Papaul) Open→Resolved @Marostegui all yours
[22:39:29] DBA, Data-Services, Patch-For-Review, cloud-services-team (Kanban): Deploy labsdbuser and views to new clouddb hosts - https://phabricator.wikimedia.org/T268312 (Bstorm) So I'm glad I noticed that warning. Two things have come out of it: 1. I've fix the script to be much better around the multi-i...