[00:10:45] DBA, Wikimedia-Site-requests, User-MarcoAurelio: Global rename of The_Photographer → Wilfredor: supervision needed - https://phabricator.wikimedia.org/T215107 (Wilfredor) I cant login with any user or any password. I tried to reset my password but its told me that the username not exist
[00:13:06] DBA, Wikimedia-Site-requests, User-MarcoAurelio: Global rename of The_Photographer → Wilfredor: supervision needed - https://phabricator.wikimedia.org/T215107 (Zoranzoki21) >>! In T215107#4962965, @Wilfredor wrote: > I cant login with any user or any password. I tried to reset my password but its tol...
[00:18:17] DBA, Wikimedia-Site-requests, User-MarcoAurelio: Global rename of The_Photographer → Wilfredor: supervision needed - https://phabricator.wikimedia.org/T215107 (Wilfredor) How much it could take?
[01:39:30] DBA, Wikimedia-Site-requests, User-MarcoAurelio: Global rename of The_Photographer → Wilfredor: supervision needed - https://phabricator.wikimedia.org/T215107 (Tgr) So that's an impressive cascade of failures: * Updates for the `image` table are not batched so the query times out. (Why doesn't the DB...
[04:10:29] DBA, Analytics, MediaWiki-Database, Research, Wikidata: Improve interlingual links across wikis through Wikidata IDs - https://phabricator.wikimedia.org/T215616 (Tbayer)
[06:10:43] DBA, Analytics, Analytics-Kanban, Patch-For-Review, User-Banyek: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5] - https://phabricator.wikimedia.org/T210478 (Marostegui) The migration finished. These are the times in UTC from 18th Feb 2019: - Read only on dbstore1002: 05:53...
[06:10:54] DBA, Analytics, Analytics-Kanban, Patch-For-Review, User-Banyek: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5] - https://phabricator.wikimedia.org/T210478 (Marostegui)
[07:15:03] DBA, Wikimedia-Site-requests: Global rename of Дагиров Умар → Takhirgeran Umar: supervision needed - https://phabricator.wikimedia.org/T216444 (Marostegui) Open→Stalled
[07:49:18] DBA, Data-Services, Datasets-General-or-Unknown, User-notice: Archive and drop education program (ep_*) tables on all wikis - https://phabricator.wikimedia.org/T174802 (Marostegui)
[07:56:36] DBA, Operations, ops-codfw: Reboot, upgrade firmware and kernel of db1096-db1106, db2071-db2092 - https://phabricator.wikimedia.org/T216240 (Marostegui)
[07:56:48] DBA, Operations, ops-codfw: Reboot, upgrade firmware and kernel of db1096-db1106, db2071-db2092 - https://phabricator.wikimedia.org/T216240 (Marostegui) db1106 has been rebooted (and kernel was upgraded)
[08:06:17] DBA, Data-Services, Datasets-General-or-Unknown, User-notice: Archive and drop education program (ep_*) tables on all wikis - https://phabricator.wikimedia.org/T174802 (Marostegui)
[08:24:17] DBA, Data-Services, Datasets-General-or-Unknown, User-notice: Archive and drop education program (ep_*) tables on all wikis - https://phabricator.wikimedia.org/T174802 (Marostegui)
[08:34:18] jynus: I would appreciate another pair of eyes on: https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/491413/
[08:35:00] DBA, Data-Services, Datasets-General-or-Unknown, Patch-For-Review, User-notice: Archive and drop education program (ep_*) tables on all wikis - https://phabricator.wikimedia.org/T174802 (Marostegui)
[08:41:30] DBA, Data-Services, Datasets-General-or-Unknown, Patch-For-Review, User-notice: Archive and drop education program (ep_*) tables on all wikis - https://phabricator.wikimedia.org/T174802 (Marostegui)
[09:22:46] DBA, Data-Services, Datasets-General-or-Unknown, Patch-For-Review, User-notice: Archive and drop education program (ep_*) tables on all wikis - https://phabricator.wikimedia.org/T174802 (Marostegui)
[09:23:25] DBA, Epic, Tracking: Database tables to be dropped on Wikimedia wikis and other WMF databases (tracking) - https://phabricator.wikimedia.org/T54921 (Marostegui)
[09:23:31] DBA, Data-Services, Datasets-General-or-Unknown, Patch-For-Review, User-notice: Archive and drop education program (ep_*) tables on all wikis - https://phabricator.wikimedia.org/T174802 (Marostegui) Open→Resolved This is all done. The only pending follow up is to remove the views whic...
[09:37:30] so es backups no longer fit on 10 TB, so I will have to do es2 on es2002, es3 somewhere else
[09:37:53] :/
[09:38:03] how big is it?
[09:38:17] 5TB each aprox
[09:39:18] which means I will have to delete es1 on either es2003 or es2004
[09:50:58] so I left es2004 with es1 in raw format, es2002 with and es2 dump and es2003 with an es3 dump (ongoing)
[09:51:09] are you using xtrabackup?
[09:51:20] no
[09:51:23] mydumper
[09:51:32] with xtrabackup it wouldn't fit
[09:52:09] dump_section.py --host=es2019.codfw.wmnet --user=root --pass=$pass --backup-dir=/srv/backups/dumps/ongoing es3
[09:52:56] so to recover, recover_section.py --host
[09:53:06] Ah right I see
[09:53:07] or
[09:53:23] so to recover, recover_section.py es2 | es3 --host
[10:44:47] DBA, Analytics, Analytics-Kanban, Patch-For-Review, User-Banyek: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5] - https://phabricator.wikimedia.org/T210478 (Marostegui)
[11:32:15] marostegui / jynus - so afaics the commons rename broke due to lack of batched updates, right?
[11:33:03] that is what I understand from Tgr's comment (and gerrit patch)
[11:33:30] I hope it can be reviewed/fixed, cherry-picked too soon :)
[11:33:44] not only for big global renames but for all renames in general
[11:33:51] it may make the dbs happy
[12:37:28] hi there
[12:37:40] what do you think about this memory graph for clouddb1001?
[12:37:41] https://grafana-labs.wikimedia.org/dashboard/db/labs-project-board?orgId=1&var-project=clouddb-services&var-server=clouddb1001&from=now-24h&to=now
[12:38:23] server memory should grow to 40-50GB
[12:38:48] ok
[12:38:50] thanks
[12:38:59] that seems normal to me and expected: https://grafana-labs.wikimedia.org/dashboard/db/labs-project-board?orgId=1&var-project=clouddb-services&var-server=clouddb1001&from=now-24h&to=now&panelId=17&fullscreen
[12:39:08] myisam and other things use cached memory
[12:39:27] if we want to control those a bit better, we should force the use of Innodb
[12:39:49] and increase the buffer pool to 75% of total available memory
[12:39:55] let's see how this evolves in the next week
[12:40:17] may I ask about your concerns, the growing trend?
[12:40:36] yes
[12:40:50] yeah, that is normal, a db consumes 100% of available memory
[12:41:28] arturo: example https://grafana.wikimedia.org/d/000000274/prometheus-machine-stats?panelId=4&fullscreen&orgId=1&var-server=db1089&var-datasource=eqiad%20prometheus%2Fops
[12:42:49] I see, thanks jynus
[12:43:04] you can see the same pattern on labsdb1004: https://grafana.wikimedia.org/d/000000274/prometheus-machine-stats?panelId=4&fullscreen&orgId=1&var-server=labsdb1004&var-datasource=eqiad%20prometheus%2Fops&from=1547988166054&to=1550580106054
[13:47:46] DBA, Core Platform Team, MediaWiki-API, Patch-For-Review, Wikimedia-production-error: Certain ApiQueryRecentChanges::run api query is too slow, slowing down dewiki - https://phabricator.wikimedia.org/T149077 (Marostegui) Could this be another case of MariaDB getting the optimizer fixed with a...
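Editor's note on the batched-updates point raised at 11:32-11:33 (and in Tgr's 01:39:30 comment): the sketch below illustrates the general idea of splitting one large UPDATE into small transactions so that no single query runs long enough to time out. It is not the gerrit patch mentioned in the log and not MediaWiki's rename code. The table `image_demo`, its columns, the function `rename_in_batches`, and the batch size are all hypothetical, and SQLite stands in for the production database only so the example is self-contained.

```python
import sqlite3


def rename_in_batches(conn, old_name, new_name, batch_size=1000):
    """Rename old_name to new_name in small batches; return the number of batches run.

    Hypothetical helper: `image_demo` and `uploader` are illustrative names,
    not the real MediaWiki schema.
    """
    if old_name == new_name:
        raise ValueError("old and new names must differ")
    batches = 0
    while True:
        # Touch at most batch_size rows per statement.
        cur = conn.execute(
            "UPDATE image_demo SET uploader = ? "
            "WHERE id IN (SELECT id FROM image_demo WHERE uploader = ? LIMIT ?)",
            (new_name, old_name, batch_size),
        )
        conn.commit()  # one short transaction per batch
        if cur.rowcount == 0:
            break  # nothing left to rename
        batches += 1
    return batches


# Demo data: 2500 rows to rename plus 10 unrelated rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE image_demo (id INTEGER PRIMARY KEY, uploader TEXT)")
conn.executemany(
    "INSERT INTO image_demo (uploader) VALUES (?)",
    [("The_Photographer",)] * 2500 + [("SomeoneElse",)] * 10,
)
conn.commit()

batches = rename_in_batches(conn, "The_Photographer", "Wilfredor", batch_size=1000)
remaining = conn.execute(
    "SELECT COUNT(*) FROM image_demo WHERE uploader = 'The_Photographer'"
).fetchone()[0]
renamed = conn.execute(
    "SELECT COUNT(*) FROM image_demo WHERE uploader = 'Wilfredor'"
).fetchone()[0]
untouched = conn.execute(
    "SELECT COUNT(*) FROM image_demo WHERE uploader = 'SomeoneElse'"
).fetchone()[0]
```

On a replicated setup one would presumably also pause between batches to let replicas catch up; that part is left out here because the log does not describe how it is done.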
[14:47:37] DBA, Operations, ops-codfw: Reboot, upgrade firmware and kernel of db1096-db1106, db2071-db2092 - https://phabricator.wikimedia.org/T216240 (Papaul) Can db2089 be depool please if it is not yet? Thanks
[14:49:21] DBA, Operations, ops-codfw: Reboot, upgrade firmware and kernel of db1096-db1106, db2071-db2092 - https://phabricator.wikimedia.org/T216240 (jcrespo) Rebooting db2090: ` PowerEdge R630 BIOS Version: 2.4.3 ` ` 1st reboot: OK 2nd reboot: FAIL 3rd reboot: OK 4th reboot: OK 5th reboot: OK 6th reboot:...
[14:49:45] DBA, Operations, ops-codfw: Reboot, upgrade firmware and kernel of db1096-db1106, db2071-db2092 - https://phabricator.wikimedia.org/T216240 (jcrespo) Preparing db2089 for you, @Papaul give me 5 minutes.
[15:04:43] DBA, Operations, ops-codfw: Reboot, upgrade firmware and kernel of db1096-db1106, db2071-db2092 - https://phabricator.wikimedia.org/T216240 (jcrespo) Rebooting db2090: ` PowerEdge R630 BIOS Version: 2.4.3 ` ` 1st reboot: OK 2nd reboot: FAIL 3rd reboot: OK 4th reboot: OK 5th reboot: OK 6th reboot:...
[15:15:49] DBA, MediaWiki-API, Core Platform Team Backlog (Watching / External), Patch-For-Review, Wikimedia-production-error: Certain ApiQueryRecentChanges::run api query is too slow, slowing down dewiki - https://phabricator.wikimedia.org/T149077 (EvanProdromou) We're watching this problem, but it sou...
[15:28:03] DBA, MediaWiki-API, Core Platform Team Backlog (Watching / External), Core Platform Team Kanban (Waiting for Review), and 2 others: Certain ApiQueryRecentChanges::run api query is too slow, slowing down dewiki - https://phabricator.wikimedia.org/T149077 (Anomie)
[15:33:38] Not sure what to make of https://phabricator.wikimedia.org/T216240#4964736
[15:37:42] DBA, Operations, ops-codfw: Reboot, upgrade firmware and kernel of db1096-db1106, db2071-db2092 - https://phabricator.wikimedia.org/T216240 (Papaul) db2089 upgrade complete Upgrade BIOS from 2.4.3 to 2.9.1 IDRAC from 2.40. to 2.61
[15:39:59] DBA, Operations, ops-codfw: Reboot, upgrade firmware and kernel of db1096-db1106, db2071-db2092 - https://phabricator.wikimedia.org/T216240 (jcrespo) a:Papaul→jcrespo Thanks, will ping you when/if tested more issues on that and other servers.
[15:40:29] DBA, Operations, ops-codfw: Reboot, upgrade firmware and kernel of db1096-db1106, db2071-db2092 - https://phabricator.wikimedia.org/T216240 (jcrespo)
[16:12:53] DBA, Operations, ops-codfw: Reboot, upgrade firmware and kernel of db1096-db1106, db2071-db2092 - https://phabricator.wikimedia.org/T216240 (jcrespo) Rebooting db2089: ` PowerEdge R630 BIOS Version: 2.9.1 ` ` 1st reboot: OK 2nd reboot: OK 3rd reboot: OK 4th reboot: OK 5th reboot: OK 6th reboot: O...
[16:30:05] he :-)
[16:31:32] lol
[16:31:43] three people updating the dbstore task
[16:32:09] haha
[16:57:09] wmgWikibaseRepoIdGeneratorSeparateDbConnection = true
[17:13:31] is it Java? :-P
[17:18:07] it is wikibase configuration
[17:18:46] if it was java, it would start with org.wikimedia....
[17:43:02] ;)
[17:47:50] DBA, Operations, ops-codfw: Reboot, upgrade firmware and kernel of db1096-db1106, db2071-db2092 - https://phabricator.wikimedia.org/T216240 (jcrespo) Open→Stalled p:Triage→Low So I believe this is still an ongoing issue, but the remaining hosts may have a lower probability of failing...
[17:49:35] backups are working as expected with the patches
[17:50:14] I will wait until they are done and then finish the raw backups
[18:59:38] DBA, Analytics, MediaWiki-Database, Research, Wikidata: Improve interlingual links across wikis through Wikidata IDs - https://phabricator.wikimedia.org/T215616 (Isaac) thank you @JAllemandou this is awesome!!! completely unblocks me (i have a bunch of page titles across all the wikipedias an...
[19:10:24] DBA, Analytics, MediaWiki-Database, Research, Wikidata: Improve interlingual links across wikis through Wikidata IDs - https://phabricator.wikimedia.org/T215616 (diego) @JAllemandou , yes. Having this by revision would be great!
[19:18:03] DBA, Analytics, MediaWiki-Database, Research, Wikidata: Improve interlingual links across wikis through Wikidata IDs - https://phabricator.wikimedia.org/T215616 (Isaac) @diego: my interpretation is that right now in the revision history version, the same wikidb/page ID/title is associated wit...
[19:40:18] DBA, Analytics, MediaWiki-Database, Research, Wikidata: Improve interlingual links across wikis through Wikidata IDs - https://phabricator.wikimedia.org/T215616 (JAllemandou) Thanks @Isaac for reformulating the question I tried to explain above :) @diego: Can you confirm there is value for yo...
[19:41:01] DBA, Data-Services, Datasets-General-or-Unknown, Patch-For-Review, User-notice: Archive and drop education program (ep_*) tables on all wikis - https://phabricator.wikimedia.org/T174802 (GTirloni)
[21:32:50] DBA, MediaWiki-API, Core Platform Team Backlog (Watching / External), Core Platform Team Kanban (Waiting for Review), and 2 others: Certain ApiQueryRecentChanges::run api query is too slow, slowing down dewiki - https://phabricator.wikimedia.org/T149077 (Anomie) >>! In T149077#4964548, @Marostegu...
[22:44:02] DBA, Jade, Scoring-platform-team: Review real-world query plans and performance for Jade - https://phabricator.wikimedia.org/T212435 (Halfak) p:Normal→Low