[04:54:12] DBA, MediaWiki-General, MW-1.35-notes (1.35.0-wmf.34; 2020-05-26), Patch-For-Review, and 2 others: Normalise MW Core database language fields length - https://phabricator.wikimedia.org/T253276 (Marostegui) s4 eqiad progress [] labsdb1012 [] labsdb1011 [] labsdb1010 [] labsdb1009 [x] dbstore1004...
[07:10:36] DBA, Operations, Patch-For-Review: db1097 (m1 master) crashed due to memory issues. - https://phabricator.wikimedia.org/T256717 (Marostegui) pre failover steps done
[07:20:00] DBA, Patch-For-Review: Make checksum parallel to the data transfer in transferpy package - https://phabricator.wikimedia.org/T254979 (Privacybatm) I have created a code for parallel data transfer using `multiprocessing`. I have benchmarked the code in our test machines and the results are given below: *...
[08:16:17] DBA, Epic, Patch-For-Review, User-Kormat: Upgrade es4 to debian buster + mariadb 10.4 - https://phabricator.wikimedia.org/T257284 (ops-monitoring-bot) Script wmf-auto-reimage was launched by kormat on cumin2001.codfw.wmnet for hosts: ` ['es2020.codfw.wmnet'] ` The log can be found in `/var/log/wm...
[08:18:47] DBA, Operations: decommission db1097.eqiad.wmnet - https://phabricator.wikimedia.org/T257406 (Marostegui)
[08:19:44] DBA, Operations, Patch-For-Review: db1097 (m1 master) crashed due to memory issues. - https://phabricator.wikimedia.org/T256717 (Marostegui) Open→Resolved a:Marostegui All done - the decommissioning on db1097 will be tracked at T257406 Thanks Jaime and Alex for supporting this maintenance!
[08:40:22] DBA, Patch-For-Review: Remove grants for the old dbproxy hosts from the misc databases - https://phabricator.wikimedia.org/T231280 (Marostegui) Grants for dbproxy1003's IP removed: `for i in db1132 db1128 db1133 db1080; do echo $i; mysql.py -h$i -e "select user,host from mysql.user where host like '10.64...
[08:40:32] DBA, Patch-For-Review: Remove grants for the old dbproxy hosts from the misc databases - https://phabricator.wikimedia.org/T231280 (Marostegui) Open→Resolved
[08:40:43] hmm. repartitioning failed on es2020. investigating
[08:41:47] oh ffs
[08:41:58] netboot.cfg is missing that specific node too
[08:42:11] `es20[12][1-9]`
[08:42:56] Oh, can you check for es1020 too?
[08:43:04] yeah, same issue
[08:43:09] i'm going to send a CR
[08:45:30] DBA, Patch-For-Review: Remove grants for the old dbproxy hosts from the misc databases - https://phabricator.wikimedia.org/T231280 (Marostegui)
[08:46:30] DBA, Operations, decommission-hardware, ops-eqiad, Patch-For-Review: Decommission dbproxy1003.eqiad.wmnet - https://phabricator.wikimedia.org/T256216 (Marostegui)
[08:48:15] marostegui, jynus: https://gerrit.wikimedia.org/r/c/operations/puppet/+/610240
[08:49:49] we can probably do dbstore* pc* and labsdb*
[08:50:03] uf, I would gamble with that
[08:50:13] when pcomponents1001 host is setup :-D
[08:50:30] we already have issues with es and elastic confusion
[08:50:35] jynus: yeah, that's the sort of thing i was thinking of
[08:50:52] so this is one of the things that were like that for a reason
[08:50:59] yeah, that's why I didn't mention es :)
[08:51:02] (it was easier to handle the patches)
[08:51:05] but dbstore and labsdb...
[08:51:13] but now that we should not be doing that anymore
[08:51:26] it is cleaner like kormat proposes IMHO
[08:51:57] I don't have any strong opinions on that
[08:52:00] talking about the patch itself
[08:52:09] before vs patch
[08:52:27] if in theory we will not edit the line anymore
[08:52:54] plus the risk is minimal
[08:53:15] I would like, however, to have testing at least via cumin to check
[08:53:39] i'm currently working on that
[08:53:41] does cumin use * or actual regular expressions?
[08:53:45] cannot remember
[08:53:57] it uses plain globbing, or clustershell nodesets
[08:54:21] the equivalent cumin pattern is 'db[1-2]* or dbstore[1-2]* or es[1-2]* or pc[1-2]* or labsdb1*'
[08:54:46] so let's do that
[08:54:48] get a list of hosts
[08:55:14] and if it doesn't take much effort compare it with zarcillo to see if we have some instance missing
[08:55:51] DBA, Operations, decommission-hardware, ops-eqiad, Patch-For-Review: Decommission dbproxy1003.eqiad.wmnet - https://phabricator.wikimedia.org/T256216 (ops-monitoring-bot) cookbooks.sre.hosts.decommission executed by marostegui@cumin1001 for hosts: `dbproxy1003.eqiad.wmnet` - dbproxy1003.eqiad...
[08:56:09] for example, there are dbmonitor* hosts
[08:57:03] and dbproxy* and dbprov* we don't want to match
[08:57:18] DBA, Operations, decommission-hardware, ops-eqiad, Patch-For-Review: Decommission dbproxy1003.eqiad.wmnet - https://phabricator.wikimedia.org/T256216 (Marostegui)
[08:58:20] marostegui: one unrelated thing, I modified weights of s1 dbs while you were away
[08:58:37] because with one host out there was an imbalance with one host being very busy
[08:59:10] should be checked that things are ok after db1089 got repooled
[08:59:37] in case I broke something
[08:59:48] jynus: will do, I repooled db1089 and depooled another host, so it should be fine
[08:59:52] but will check, thanks for the heads up
[09:00:03] ok, then check when everything is finished
[09:00:21] just wanted to give a heads up as I forgot to comment on the meeting
[09:11:31] jynus: both comparisons done (vs cumin, and zarcillo)
[09:12:22] and no differences?
[09:12:29] no _unexpected_ differences
[09:12:39] ok, fair, but still surprised
[09:13:02] so my +1 stands, but you know who decides :-D
[09:16:36] I am updating misc db documentation
[09:16:46] there were some dbs missing that were recently added
[09:25:26] DBA, Epic, Patch-For-Review, User-Kormat: Upgrade es4 to debian buster + mariadb 10.4 - https://phabricator.wikimedia.org/T257284 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['es2020.codfw.wmnet'] ` Of which those **FAILED**: ` ['es2020.codfw.wmnet'] `
[09:25:42] I have updated https://wikitech.wikimedia.org/wiki/MariaDB/misc#Current_schemas to the latest info
[09:26:22] I will leave a clean up / visual update (e.g. a table with name / description / service owner / client / needs backups?) for a later time
[09:27:38] marostegui: any objections to https://gerrit.wikimedia.org/r/c/operations/puppet/+/610240 ? otherwise i'd like to merge it so i can continue with reimaging es2020
[09:28:06] kormat: I am ok with it
[09:28:46] that's the most positive thing i could have wished for!
[09:37:56] DBA, Epic, Patch-For-Review, User-Kormat: Upgrade es4 to debian buster + mariadb 10.4 - https://phabricator.wikimedia.org/T257284 (ops-monitoring-bot) Script wmf-auto-reimage was launched by kormat on cumin2001.codfw.wmnet for hosts: ` ['es2020.codfw.wmnet'] ` The log can be found in `/var/log/wm...
[09:44:31] DBA, Operations, CAS-SSO, Patch-For-Review, User-jbond: Request new database for idp-test.wikimedia.org - https://phabricator.wikimedia.org/T256120 (Marostegui) @jbond the empty `cas_staging` database has been created on m1. You need to point your application to: `m1-master.eqiad.wmnet` and...
[10:05:02] DBA, Epic, Patch-For-Review, User-Kormat: Upgrade es4 to debian buster + mariadb 10.4 - https://phabricator.wikimedia.org/T257284 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['es2020.codfw.wmnet'] ` and were **ALL** successful.
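Editor's note on the globs discussed above: the `es20[12][1-9]` pattern quoted at 08:42:11 cannot cover es2020, because its last character class starts at 1 and so excludes a trailing 0. The sketch below illustrates this, and also how the cumin-equivalent globs from 08:54:21 leave out dbproxy* hosts. It is only an approximation: it uses Python's `fnmatch` as a stand-in for the matching done by netboot.cfg and cumin, the short hostnames are taken from this log rather than from a real inventory, and the helper names `matches_any` and `selected` are hypothetical.

```python
from fnmatch import fnmatchcase

# Glob quoted in the log at 08:42:11.
OLD_ES_GLOB = "es20[12][1-9]"

# Cumin-equivalent globs quoted in the log at 08:54:21.
CUMIN_GLOBS = ["db[1-2]*", "dbstore[1-2]*", "es[1-2]*", "pc[1-2]*", "labsdb1*"]


def matches_any(host, globs):
    """Return True if the host matches at least one glob (case-sensitive)."""
    return any(fnmatchcase(host, glob) for glob in globs)


def selected(hosts, globs):
    """Return the sorted subset of hosts matched by any of the globs."""
    return sorted(host for host in hosts if matches_any(host, globs))


# Short hostnames mentioned in this log (assumption: real queries use FQDNs,
# which the trailing * in each cumin glob would also cover).
hosts = ["db1097", "db1089", "dbstore1004", "es2020", "es1020",
         "labsdb1012", "dbproxy1003"]

# [1-9] rejects the final "0", so es2020 falls through the old pattern.
print("es2020 matched by old glob:", fnmatchcase("es2020", OLD_ES_GLOB))
print("es2021 matched by old glob:", fnmatchcase("es2021", OLD_ES_GLOB))

# The digit class after each prefix keeps dbproxy1003 out of the selection.
print("selected by cumin globs:", selected(hosts, CUMIN_GLOBS))
```

The digit class right after each prefix (`db[1-2]*` rather than `db*`) is what keeps hosts such as dbproxy1003 out of the match, which is the concern raised at 08:57:03.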
[11:06:20] DBA, MediaWiki-General, MW-1.35-notes (1.35.0-wmf.34; 2020-05-26), Patch-For-Review, and 2 others: Normalise MW Core database language fields length - https://phabricator.wikimedia.org/T253276 (Marostegui)
[11:11:49] DBA, MediaWiki-General, MW-1.35-notes (1.35.0-wmf.34; 2020-05-26), Patch-For-Review, and 2 others: Normalise MW Core database language fields length - https://phabricator.wikimedia.org/T253276 (Marostegui)
[11:13:40] DBA, MediaWiki-General, MW-1.35-notes (1.35.0-wmf.34; 2020-05-26), Patch-For-Review, and 2 others: Normalise MW Core database language fields length - https://phabricator.wikimedia.org/T253276 (Marostegui) s8 eqiad progress [] labsdb1012 [] labsdb1011 [] labsdb1010 [] labsdb1009 [x] dbstore1005...
[11:59:58] DBA, MediaWiki-General, MW-1.35-notes (1.35.0-wmf.34; 2020-05-26), Patch-For-Review, and 2 others: Normalise MW Core database language fields length - https://phabricator.wikimedia.org/T253276 (Marostegui)
[12:13:57] marostegui: could you give https://gerrit.wikimedia.org/r/c/operations/puppet/+/610275 a review if you get a sec
[12:26:17] jbond42: checking
[12:26:34] thanks <3
[12:57:41] marostegui: thanks for the +1 on the TLS change. i see you use the puppet certs and wonder if you have tried using the alt_dns_names options to add in m1-master (or other names that may be useful)?
[13:52:45] DBA, MediaWiki-General, MW-1.35-notes (1.35.0-wmf.34; 2020-05-26), Patch-For-Review, and 2 others: Normalise MW Core database language fields length - https://phabricator.wikimedia.org/T253276 (Marostegui)
[14:33:08] DBA, MediaWiki-General, MW-1.35-notes (1.35.0-wmf.34; 2020-05-26), Patch-For-Review, and 2 others: Normalise MW Core database language fields length - https://phabricator.wikimedia.org/T253276 (Marostegui) s7 eqiad progress [] labsdb1012 [] labsdb1011 [] labsdb1010 [] labsdb1009 [x] dbstore1003...
[17:00:48] DBA, Cloud-Services, CPT Initiatives (MCR Schema Migration), Core Platform Team Workboards (Clinic Duty Team), and 2 others: Apply updates for MCR, actor migration, and content migration, to production wikis. - https://phabricator.wikimedia.org/T238966 (Marostegui) Just for the record, altering s...
[17:00:56] DBA, Cloud-Services, CPT Initiatives (MCR Schema Migration), Core Platform Team Workboards (Clinic Duty Team), and 2 others: Apply updates for MCR, actor migration, and content migration, to production wikis. - https://phabricator.wikimedia.org/T238966 (Marostegui)
[18:41:16] DBA, Data-Services, User-Ladsgroup, cloud-services-team (Kanban): Prepare and check storage layer for shnwiktionary - https://phabricator.wikimedia.org/T256010 (Ladsgroup) a:Ladsgroup→None This is definitely can't be done by yours truly :D
[18:44:21] DBA, Data-Services, User-Ladsgroup, cloud-services-team (Kanban): Prepare and check storage layer for shnwiktionary - https://phabricator.wikimedia.org/T256010 (Nintendofan885) >>! In T256010#6290908, @Ladsgroup wrote: > This is definitely can't be done by yours truly :D :)
[18:54:21] DBA, CheckUser, Trust-and-Safety, WMF-Legal, and 2 others: Configure WMF wikis to log login attempts in CheckUser - https://phabricator.wikimedia.org/T253802 (Huji)
[19:16:33] DBA, Cloud-Services, CPT Initiatives (MCR Schema Migration), Core Platform Team Workboards (Clinic Duty Team), and 2 others: Apply updates for MCR, actor migration, and content migration, to production wikis. - https://phabricator.wikimedia.org/T238966 (WDoranWMF) @Marostegui That's really useful...