[05:56:47] Blocked-on-schema-change, DBA, Fundraising-Backlog: CentralNotice: Update DB schema on Meta for campaign types feature - https://phabricator.wikimedia.org/T272953 (Marostegui)
[05:57:16] Blocked-on-schema-change, DBA, Fundraising-Backlog: CentralNotice: Update DB schema on Meta for campaign types feature - https://phabricator.wikimedia.org/T272953 (Marostegui) Open→Resolved Change applied to `metawiki`
[06:31:14] Blocked-on-schema-change, DBA, Fundraising-Backlog: CentralNotice: Update DB schema on Meta for campaign types feature - https://phabricator.wikimedia.org/T272953 (AndyRussG) >>! In T272953#6790271, @Marostegui wrote: > Change applied to `metawiki` Cool beans, thanks so much!! :D
[07:33:03] DBA, mariadb-optimizer-bug: Investigate possible optimizer regression on 10.4.17 with DELETE statements - https://phabricator.wikimedia.org/T268457 (Marostegui) Open→Resolved So the summary is: Bug confirmed on 10.4.17 and fixed on 10.4.18. The workaround: * If upgrading from 10.4.15 (or older)...
[08:02:23] DBA, SRE, Patch-For-Review: Productionize db1155-db1175 and refresh and decommission db1074-db1095 (22 servers) - https://phabricator.wikimedia.org/T258361 (Marostegui)
[08:28:20] DBA, SRE, Patch-For-Review: Productionize db1155-db1175 and refresh and decommission db1074-db1095 (22 servers) - https://phabricator.wikimedia.org/T258361 (Marostegui)
[08:29:06] DBA, SRE, Patch-For-Review: Productionize db1155-db1175 and refresh and decommission db1074-db1095 (22 servers) - https://phabricator.wikimedia.org/T258361 (Marostegui)
[08:35:15] DBA, decommission-hardware: decommission db1089.eqiad.wmnet - https://phabricator.wikimedia.org/T273417 (Marostegui)
[08:46:01] DBA, decommission-hardware, Patch-For-Review: decommission db1089.eqiad.wmnet - https://phabricator.wikimedia.org/T273417 (Marostegui)
[08:46:03] DBA, SRE, Patch-For-Review: Productionize db1155-db1175 and refresh and decommission db1074-db1095 (22 servers) - https://phabricator.wikimedia.org/T258361 (Marostegui)
[08:46:30] DBA, SRE, Patch-For-Review: Productionize db1155-db1175 and refresh and decommission db1074-db1095 (22 servers) - https://phabricator.wikimedia.org/T258361 (Marostegui)
[08:56:30] DBA, decommission-hardware, Patch-For-Review: decommission db1089.eqiad.wmnet - https://phabricator.wikimedia.org/T273417 (Marostegui)
[09:05:24] DBA, Orchestrator: Add m* and es4/es5 sections to Orchestrator - https://phabricator.wikimedia.org/T272568 (Marostegui) a:Marostegui
[09:06:06] Blocked-on-schema-change, DBA: Alter objectcache.exptime - https://phabricator.wikimedia.org/T272512 (Marostegui) a:Marostegui
[09:07:36] marostegui: happy Monday :D
[09:07:39] * Amir1 hides
[09:11:22] Blocked-on-schema-change, DBA: Alter objectcache.exptime - https://phabricator.wikimedia.org/T272512 (Marostegui) @Ladsgroup this table isn't empty and it is in fact written quite often.
For instance these are values for s6: ` frwiki +----------+ | count(*) | +----------+ | 263130 | +----------+ jawi...
[09:11:26] Amir1: hahaha I just pinged you there ^
[09:11:59] Blocked-on-schema-change: Schema change for renaming two indexes of site_identifiers - https://phabricator.wikimedia.org/T273361 (Marostegui) a:Marostegui
[09:12:18] Blocked-on-schema-change: Schema change for renaming two indexes of site_identifiers - https://phabricator.wikimedia.org/T273361 (Marostegui) p:Triage→Medium
[09:12:38] oh it got more fun
[09:12:41] let me check
[09:12:53] sure, no rush, we have plenty of tasks.......................
[09:12:55] :)
[09:13:47] Blocked-on-schema-change: Schema change for renaming two indexes of site_identifiers - https://phabricator.wikimedia.org/T273361 (Marostegui)
[09:15:14] Will add one soon, a couple more later including PK for image table
[09:15:20] :D
[09:15:40] Is there a way for me to help (beside creating less tasks?)
[09:16:17] For the index renaming/removals the only thing that would be is if you could give a quick look at the code to check if they might be hardcoded somewhere
[09:18:21] Blocked-on-schema-change, DBA: Alter objectcache.exptime - https://phabricator.wikimedia.org/T272512 (Ladsgroup) ugh, it's heavily used by Echo: ` wikiadmin@10.64.16.103(frwiki)> select * from objectcache limit 5; +-------------------------------------+--------------------------+---------------------+ |...
[09:20:41] Blocked-on-schema-change: Schema change for renaming two indexes of site_identifiers - https://phabricator.wikimedia.org/T273361 (Marostegui)
[09:20:52] marostegui: oh after one incident we caused, I double check them with codesearch all the time
[09:20:55] Blocked-on-schema-change: Schema change for renaming two indexes of site_identifiers - https://phabricator.wikimedia.org/T273361 (Marostegui) Altered s6 codfw - leaving it for a day before going for some slaves in eqiad
[09:21:13] Amir1: ah good!
I always do it too, but another pair of eyes always helps!
[09:21:44] (the incident was with third parties)
[09:26:36] Blocked-on-schema-change, DBA: Alter objectcache.exptime - https://phabricator.wikimedia.org/T272512 (Ladsgroup) Why the expiry of the echo is for eighteen years in the future? Are we in a war or something?
[12:14:19] I noticed on orchestrator, all hosts are reporting 1s lag, is there a placeholder now until blockers are solved?
[12:46:30] it's defined by this query:
[12:46:31] `SELECT ROUND(TIME_TO_SEC(TIMEDIFF(UTC_TIMESTAMP(6),ts))) FROM heartbeat.heartbeat WHERE ts>0 ORDER BY ts ASC LIMIT 1`
[12:59:02] DBA, SRE, Patch-For-Review: Productionize db1155-db1175 and refresh and decommission db1074-db1095 (22 servers) - https://phabricator.wikimedia.org/T258361 (Marostegui) @jcrespo be aware that you can proceed replacing db1095 with db1171 anytime.
[12:59:18] given that heartbeat only gets updated once a second, it'll show a 1s lag at least 50% of the time
[13:01:36] DBA, SRE, Patch-For-Review: Productionize db1155-db1175 and refresh and decommission db1074-db1095 (22 servers) - https://phabricator.wikimedia.org/T258361 (jcrespo) Thanks for the notice!
[13:02:51] Blocked-on-schema-change, DBA: Alter objectcache.exptime - https://phabricator.wikimedia.org/T272512 (Marostegui) Open→Stalled @Ladsgroup going to stall this for now until you've had time to investigate :) - no rush, we have plenty of other schema changes to do!
[13:06:14] Blocked-on-schema-change: Schema change for renaming two indexes of site_identifiers - https://phabricator.wikimedia.org/T273361 (Marostegui) s6 eqiad [] labsdb1012 [] labsdb1011 [] labsdb1010 [] labsdb1009 [x] dbstore1005 [] db1155 [x] db1140 [x] db1139 [] db1131 [] db1125 [x] db1113 [x] db1098 [x] db1096...
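A quick sanity check of the "1s lag at least 50% of the time" estimate, as a minimal Python sketch. It assumes the heartbeat row is written exactly once per second, that replication itself adds no delay, that the time difference keeps its fractional part, and that the query runs at offsets spread evenly across the second. `mysql_round` and `share_showing_one_second` are hypothetical helpers written for this illustration; they are not taken from orchestrator or wmfmariadbpy.

```python
import math
from typing import Callable


def mysql_round(x: float) -> int:
    """Round half away from zero (hypothetical stand-in for SQL ROUND())."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


def share_showing_one_second(to_int: Callable[[float], int] = mysql_round,
                             samples: int = 1000) -> float:
    """Fraction of evenly spaced query offsets that report a lag of 1s."""
    # Offset = time elapsed since the last heartbeat write, in [0, 1) seconds.
    offsets = [i / samples for i in range(samples)]
    ones = sum(1 for offset in offsets if to_int(offset) == 1)
    return ones / samples


print(share_showing_one_second())            # rounding: half of the offsets
print(share_showing_one_second(math.floor))  # truncating: none of them
```

Any real replication delay pushes more offsets past the half-second mark, which is why the estimate is a lower bound. Truncating instead of rounding is only one possible way to get an integer that reads 0s on a healthy host; the log does not say which change was actually made.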
[13:33:40] Blocked-on-schema-change: Schema change for renaming two indexes of site_identifiers - https://phabricator.wikimedia.org/T273361 (Marostegui)
[13:46:03] jynus: can i get a +1 on https://gerrit.wikimedia.org/r/c/operations/software/wmfmariadbpy/+/658580, please?
[13:49:15] checking
[13:50:45] jynus: ty :)
[13:58:40] hmm, https://dbdiagram.io/home
[14:03:22] Amir1: is that on the left CSS? :-P
[14:03:34] Amir1, re: link/email, I would like, at some point, to rethink this as json properties: https://gerrit.wikimedia.org/r/c/mediawiki/core/+/337390/4/maintenance/tables.sql
[14:04:26] volans: looks like it but since I have been on a fully frontend project for half a year now (felt like five though) I can assure you it's not css
[14:04:33] not even scss or less
[14:04:37] lol
[14:05:25] jynus: yeah that sounds good. Just one thing is that we ditched all explicit FKs in PG as well
[14:05:33] that is ok
[14:05:50] jynus: the lag on orchestrator should now normally show 0s. thanks for your question earlier :)
[14:05:51] but we really need to capture the semantics, hopefully in a machine-readable way
[14:06:20] yeah, we can at least record that
[14:06:37] then we can write tests to make sure for example they have the same datatype, etc.
[14:06:43] thanks, kormat, I hope I didn't create many inconveniences, as an outsider, it looked weird to have 1s lag on primary dbs at least
[14:07:08] and the 50-50 ended up giving me 1s on 3-4 pages I checked in a row
[14:07:13] jynus: not at all. when looking at it, i _finally_ figured out why i'd seen code that subtracted 0.5s from the lag :)
[14:08:00] kormat, yeah, that is not really justified, but because the error margin is so large, it was easier to read
[14:08:14] (talking when I made it so for icinga)
[14:08:41] I am guessing orch doesn't allow fractional seconds?
[14:09:55] jynus: is there a ticket for discussion around it?
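The idea of recording column relationships as machine-readable properties and then testing that both sides share a datatype could look roughly like the sketch below. Everything in it is an assumption made for illustration: the `references` property, the dict layout, the toy `author`/`book` tables and the `find_type_mismatches` helper are hypothetical and do not reflect the actual MediaWiki schema format.

```python
from typing import Dict, List, Tuple

# Hypothetical schema description; "references" is an invented property that
# records a relationship without declaring an explicit foreign key.
SCHEMA: Dict[str, dict] = {
    "author": {
        "columns": {
            "author_id": {"type": "bigint unsigned"},
        },
    },
    "book": {
        "columns": {
            "book_id": {"type": "bigint unsigned"},
            "book_author": {
                "type": "int unsigned",
                "references": "author.author_id",
            },
        },
    },
}


def find_type_mismatches(schema: Dict[str, dict]) -> List[Tuple[str, str, str, str]]:
    """Return (column, type, target, target_type) for every mismatched link."""
    mismatches = []
    for table, spec in schema.items():
        for column, props in spec["columns"].items():
            target = props.get("references")
            if target is None:
                continue  # column does not point at another table
            target_table, target_column = target.split(".")
            target_type = schema[target_table]["columns"][target_column]["type"]
            if props["type"] != target_type:
                mismatches.append(
                    (f"{table}.{column}", props["type"], target, target_type)
                )
    return mismatches


for mismatch in find_type_mismatches(SCHEMA):
    print(mismatch)
```

A CI job could fail whenever this list is non-empty, which gives some of the protection of a foreign key check without declaring explicit FKs in the database.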
[14:10:24] Amir1, I can check, probably there is something, but there wasn't at the time of the patch
[14:10:48] Data-Persistence-Backup, SRE: Revert OpenSSL min version configuration introduced for bacula compatibility - https://phabricator.wikimedia.org/T273182 (jcrespo) I didn't get any answer here or on the other ticket, so this is my plan now: * Add a conditional so the above code only affects jessie host (co...
[14:12:03] Let me know, I'll add it to list of things we are doing
[14:12:19] (currently, cleaning 500 files from mediawiki core)
[14:12:40] Amir1, the only thing related that I found is: https://phabricator.wikimedia.org/T91859
[14:13:43] Amir1: 500?
[14:13:43] nah, that's not
[14:14:05] Majavah: yeah the archive .sql files add up to more than 500
[14:14:19] ah
[14:14:31] gosh knows how many of them are orphan, broken, etc.
[14:14:58] install mw 1.2 and test
[14:16:54] jynus: correct, orchestrator requests the replication lag query to return an integer number of seconds
[14:16:58] *requires
[14:17:25] Amir1, apparently there is a lua module that parses sql: https://www.mediawiki.org/wiki/Module:SchemaDiagram
[14:17:51] yeah, we are planning to replace it, it's less than optimal
[14:18:09] Krinkle wrote it a while ago
[14:19:50] anyway, my intention with the original patch was not as much rendering it, as additional CI preventing breaking stuff
[14:20:24] this amount of issues were caught when I sent it: https://phabricator.wikimedia.org/T157227
[14:21:26] Amir1, is there any GSOC-sized task that could help the whole process?
[14:23:15] DBA: Fix db-switchover update zarcillo part - https://phabricator.wikimedia.org/T272954 (Kormat) The fix is merged, but not yet released.
[14:25:50] can't say for sure
[14:25:58] there are migrating the extensions
[14:26:32] yeah, but I don't think we can mentor that ourselves
[14:27:09] I was mostly thinking something self-contained, like a web frontend to some of your tools or something like that, less core
[14:28:30] don't worry, we have lots of small tasks ourselves too
[14:30:52] DBA: Fix db-switchover update zarcillo part - https://phabricator.wikimedia.org/T272954 (Marostegui) Thank you
[14:34:53] yeah, I will think
[14:38:48] DBA, Orchestrator, Patch-For-Review, User-Kormat: Enable report_host for mariadb - https://phabricator.wikimedia.org/T266483 (Marostegui)
[14:43:36] DBA, Orchestrator, Patch-For-Review, User-Kormat: Enable report_host for mariadb - https://phabricator.wikimedia.org/T266483 (Marostegui)
[15:00:31] Data-Persistence-Backup, SRE: Revert OpenSSL min version configuration introduced for bacula compatibility - https://phabricator.wikimedia.org/T273182 (jcrespo) I'm silly, I was totally convinced that the revert applied to clients. It does not, only to storage hosts, which is easier to revert. That also...
[15:00:39] Data-Persistence-Backup, SRE: Revert OpenSSL min version configuration introduced for bacula compatibility - https://phabricator.wikimedia.org/T273182 (jcrespo)
[17:10:02] DBA: Fix db-switchover update zarcillo part - https://phabricator.wikimedia.org/T272954 (LSobanski) a:Kormat
[17:41:56] Data-Persistence-Backup, SRE, decommission-hardware, ops-codfw: decommission heze and heze-array1 - https://phabricator.wikimedia.org/T273051 (Papaul)
[18:02:59] DBA, Data-Services: Prepare and check storage layer for mniwiki - https://phabricator.wikimedia.org/T273465 (LSobanski) p:Triage→Medium Thanks, let us know when the database is created, so we can sanitize it.
[18:03:22] DBA, Data-Services: Prepare and check storage layer for mniwiktionary - https://phabricator.wikimedia.org/T273459 (LSobanski) p:Triage→Medium Thanks, let us know when the database is created, so we can sanitize it.
[18:23:12] Data-Persistence-Backup, SRE, decommission-hardware, ops-codfw: decommission heze and heze-array1 - https://phabricator.wikimedia.org/T273051 (Papaul)
[18:24:11] Data-Persistence-Backup, SRE, decommission-hardware, ops-codfw: decommission heze and heze-array1 - https://phabricator.wikimedia.org/T273051 (Papaul) Open→Resolved complete. @jcrespo thanks for getting this done.
[20:33:51] DBA, Patch-For-Review: Productionize x2 databases - https://phabricator.wikimedia.org/T269324 (Krinkle) Resolved→Open Writing locally to mainstash is a hard requirement. Data is expected to generally persist and be eventually consistent but loss is tolerable. E.g. under maintenance a db can simp...
[21:08:08] DBA, Patch-For-Review: Productionize x2 databases - https://phabricator.wikimedia.org/T269324 (Marostegui) >>! In T269324#6793793, @Krinkle wrote: > Writing locally to mainstash is a hard requirement. > That's ok - but we need to make changes to our puppet and verify how this would work with `dbctl` (I...
[21:17:22] DBA, Patch-For-Review: Productionize x2 databases - https://phabricator.wikimedia.org/T269324 (Krinkle) >>! In T269324#6793944, @Marostegui wrote: > What's the impact if one of the masters goes down unexpectedly? > Also, what if we need to do maintenance on of them? Ie: reboot for a kernel upgrade?. Can...
[21:29:24] DBA, Patch-For-Review: Productionize x2 databases - https://phabricator.wikimedia.org/T269324 (Marostegui) >>! In T269324#6793987, @Krinkle wrote: >>>! In T269324#6793944, @Marostegui wrote: >> What's the impact if one of the masters goes down unexpectedly? >> Also, what if we need to do maintenance on...
[21:42:49] DBA, Orchestrator: Cleanup heartbeat.heartbeat on all production instances - https://phabricator.wikimedia.org/T268336 (Marostegui) m2 cleaned
[21:43:01] DBA, Orchestrator: Cleanup heartbeat.heartbeat on all production instances - https://phabricator.wikimedia.org/T268336 (Marostegui)