[06:13:33] DBA, Operations, Puppet, User-jbond: DB: perform rolling restart of mariadb deamons to pick up CA changes - https://phabricator.wikimedia.org/T239791 (Marostegui)
[06:14:00] Blocked-on-schema-change, DBA, Core Platform Team, Patch-For-Review: Schema change for refactored actor and comment storage - https://phabricator.wikimedia.org/T233135 (Marostegui)
[06:14:15] Blocked-on-schema-change, DBA, Core Platform Team, Patch-For-Review: Schema change for refactored actor and comment storage - https://phabricator.wikimedia.org/T233135 (Marostegui) Open→Resolved All done
[06:26:38] DBA, Operations: Upgrade BIOS and firmware on db2084 - https://phabricator.wikimedia.org/T241103 (Marostegui)
[06:27:10] DBA, Operations: Upgrade BIOS and firmware on db2084 - https://phabricator.wikimedia.org/T241103 (Marostegui) p:Triage→Normal
[06:40:29] DBA, Cloud-Services: Prepare and check storage layer for ngwikimedia - https://phabricator.wikimedia.org/T240772 (Ammarpad)
[06:47:00] DBA, Operations, Puppet, User-jbond: DB: perform rolling restart of mariadb deamons to pick up CA changes - https://phabricator.wikimedia.org/T239791 (Marostegui)
[06:57:21] DBA: db1130 BBU possible issues - https://phabricator.wikimedia.org/T240823 (Marostegui) Looks like the BBU is discharging 1% per day, but normally BBUs tend to swing between 90 and 100% all the time, so so far it is normal - lets keep monitoring it for a few more days: ` root@db1130:~# megacli -AdpBbuCmd -a...
[06:59:54] DBA, GlobalUsage, StructuredDataOnCommons: Normalize globalimagelinks table - https://phabricator.wikimedia.org/T241053 (Marostegui) p:Triage→Normal +1 to normalize it, specially given how big it is and the fact that it doesn't have a PK, which makes operations even harder. Do you have an est...
[07:13:00] DBA, Operations, Puppet, User-jbond: DB: perform rolling restart of mariadb deamons to pick up CA changes - https://phabricator.wikimedia.org/T239791 (Marostegui)
[07:22:45] DBA: Investigate possible memory leak on db1115 - https://phabricator.wikimedia.org/T231769 (Marostegui) I have restarted MySQL today again as part of {T239791}
[07:22:55] DBA, Operations, Puppet, User-jbond: DB: perform rolling restart of mariadb deamons to pick up CA changes - https://phabricator.wikimedia.org/T239791 (Marostegui)
[07:32:09] DBA, Operations, Puppet, User-jbond: DB: perform rolling restart of mariadb deamons to pick up CA changes - https://phabricator.wikimedia.org/T239791 (Marostegui)
[07:50:21] DBA, Operations, Puppet, User-jbond: DB: perform rolling restart of mariadb deamons to pick up CA changes - https://phabricator.wikimedia.org/T239791 (Marostegui)
[08:21:34] dbproxy1018 haproxy failover
[08:21:43] yep
[08:21:44] it is me
[08:21:48] dbproxy1019 haproxy failover
[08:21:52] ok
[08:21:52] yep
[08:22:10] and also labsdb1010 ?
[08:22:36] yes
[08:22:45] sorry, I should have checked sal, my fault
[08:22:53] no worries!
[08:23:21] I didn't immediately know the proxies by name
[08:23:34] yeah, it is confusing now
[08:24:32] DBA, Operations, Puppet, User-jbond: DB: perform rolling restart of mariadb deamons to pick up CA changes - https://phabricator.wikimedia.org/T239791 (Marostegui)
[08:36:33] DBA, Operations, ops-codfw: Upgrade BIOS and firmware on db2084 - https://phabricator.wikimedia.org/T241103 (Marostegui)
[09:06:58] DBA, Operations, Puppet, User-jbond: DB: perform rolling restart of mariadb deamons to pick up CA changes - https://phabricator.wikimedia.org/T239791 (Marostegui)
[09:19:14] o/ Is it possible to get the selects that happened on the master for wikidatawiki around the same time as that transaction that you dumped me?
[09:21:23] addshore: nope, that's not possible
[09:21:30] unless they were slow and some of them got logged
[09:21:33] okay! :P worth a shot
[09:21:37] I can still try
[09:21:59] but if they were not slow queries we won't see anything
[09:22:02] nah they wont have been
[09:22:08] thanks though!
[09:22:12] :(
[09:42:12] DBA, Operations, ops-codfw: Upgrade BIOS and firmware on db2084 - https://phabricator.wikimedia.org/T241103 (Marostegui) I have depooled this host. So before acting on it we just need to stop downtime + stop MySQL
[09:54:40] DBA, Operations, Puppet, User-jbond: DB: perform rolling restart of mariadb deamons to pick up CA changes - https://phabricator.wikimedia.org/T239791 (Marostegui)
[10:14:13] Found the bug! YAY!
[10:16:31] DBA, Cloud-VPS, cloud-services-team (Kanban): CloudVPS: m5-master databases for openstack may require re-encoding - https://phabricator.wikimedia.org/T234830 (aborrero) p:High→Lowest Lowering priority of this task. We did the Newton->Ocata upgrade ({T237749}) without seeing any issue related...
[10:17:23] addshore: what is it!
[10:17:49] its complicated https://phabricator.wikimedia.org/T237984#5753567 ;)
[10:24:37] DBA, Cloud-VPS, cloud-services-team (Kanban): CloudVPS: m5-master databases for openstack may require re-encoding - https://phabricator.wikimedia.org/T234830 (Marostegui) Should we close it for now instead?
[11:17:46] marostegui: how easy would it be to grep for a transaction that happened today/
[11:18:18] I think I found 1 bug (with the previous transaction) but there might be another one hiding in there with a different transaction too
[11:20:52] should be easy, specially if you have a time frame. I'm heading out for lunch, but we can talk later or just send the transaction details and I can look for it once I'm back
[11:20:58] addshore: is this an urgent request?
[11:21:29] ack, we can talk later, I'm going to spend another few hours looking through the code, just wanted to make sure asking later wouldnt make it harder
[11:21:58] how often are the logs rotated? daily? or weekly?
[11:22:00] my suggestion would be to ask on phabricator, so it is not forgotten
[11:22:04] ack!
[11:22:06] ty again
[11:22:41] also there you could give all the details we asked last time (timestamp, query, table, server, etc.)
[11:22:57] thank you!
[11:43:00] hey folks, do we have public metrics/graphs for wiki replica servers? asking for some community members
[11:44:12] you mean like load stats, etc?
[11:45:24] we have https://grafana.wikimedia.org/d/000000278/mysql-aggregated?orgId=1&var-dc=eqiad%20prometheus%2Fops&var-group=labs&var-shard=multi&var-role=slave&from=1576734313860&to=1576755913860
[11:46:07] and people can check individual servers at: https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-dc=eqiad%20prometheus%2Fops&var-server=labsdb1009&var-port=9104&from=1576745148939&to=1576755948939
[11:47:19] and of course those too, on the host summary ones for os/hw stats
[11:50:08] yes thanks
[11:50:33] e.g.: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&refresh=5m&var-server=labsdb1009&var-datasource=eqiad%20prometheus%2Fops&var-cluster=mysql&from=1576745428257&to=1576756228257
[12:53:11] addshore: from your last comment it is not clear to me what transaction you'd like to us to check :)
[12:53:26] I think I managed to find it without the transaction this time :)
[12:53:33] ah cool :)
[12:57:02] DBA, Operations, Puppet, User-jbond: DB: perform rolling restart of mariadb deamons to pick up CA changes - https://phabricator.wikimedia.org/T239791 (Marostegui)
[12:57:21] DBA, Operations, Puppet, User-jbond: DB: perform rolling restart of mariadb deamons to pick up CA changes - https://phabricator.wikimedia.org/T239791 (Marostegui) All candidate masters in eqiad and codfw have been restarted (and upgraded)
[13:18:47] DBA, GlobalUsage, StructuredDataOnCommons: Normalize globalimagelinks table - https://phabricator.wikimedia.org/T241053 (Ladsgroup) dev time it should not be much as it's contained in one extension.
[13:30:39] DBA, GlobalUsage, StructuredDataOnCommons: Normalize globalimagelinks table - https://phabricator.wikimedia.org/T241053 (jcrespo) I wonder (and this is just an idea), if we could create a `titles` table as part of this extension (or a different name, if people don't like that first `gu_titles`), and...
[23:09:09] DBA: Remove grants for the old dbproxy hosts from the misc databases - https://phabricator.wikimedia.org/T231280 (RobH)
[23:09:13] DBA: Productionize dbproxy101[2-7].eqiad.wmnet and dbproxy200[1-4] - https://phabricator.wikimedia.org/T202367 (RobH)