[05:32:05] Blocked-on-schema-change, DBA: Schema change: Make page.page_restrictions column NULL - https://phabricator.wikimedia.org/T248333 (Marostegui)
[05:58:49] DBA, MediaWiki-extensions-WikibaseRepository, Wikidata, MW-1.35-notes (1.35.0-wmf.27; 2020-04-07), and 8 others: Wikidata's wb_items_per_site table has suddenly disappeared, creating DBQueryErrors on page views - https://phabricator.wikimedia.org/T249565 (Ladsgroup) >>! In T249565#6037357, @Addsh...
[06:11:10] DBA, Data-Services, Operations: Replace labsdb (wikireplicas) dbproxies: dbproxy1010 and dbproxy1011 - https://phabricator.wikimedia.org/T231520 (Marostegui) Looks like there are no more connections going through dbproxy1011: ` root@cumin1001:/home/marostegui# host dbproxy1011 dbproxy1011.eqiad.wmnet...
[06:11:57] DBA, Data-Services, Operations: Replace labsdb (wikireplicas) dbproxies: dbproxy1010 and dbproxy1011 - https://phabricator.wikimedia.org/T231520 (Marostegui) I have stopped haproxy on dbproxy1011
[06:22:26] DBA, Operations, Wikimedia-Incident: Redefine mysql GRANTs for wikiadmin - https://phabricator.wikimedia.org/T249683 (Marostegui)
[06:25:17] DBA, Operations, Wikimedia-Incident: Redefine mysql GRANTs for wikiadmin - https://phabricator.wikimedia.org/T249683 (Marostegui) p:Triage→High
[06:26:09] DBA, Operations, Wikimedia-Incident: Redefine mysql GRANTs for wikiadmin - https://phabricator.wikimedia.org/T249683 (Marostegui)
[06:26:31] DBA, Operations, Wikimedia-Incident: Redefine mysql GRANTs for wikiadmin - https://phabricator.wikimedia.org/T249683 (Marostegui)
[06:26:45] DBA, MediaWiki-extensions-WikibaseRepository, Wikidata, MW-1.35-notes (1.35.0-wmf.27; 2020-04-07), and 8 others: Wikidata's wb_items_per_site table has suddenly disappeared, creating DBQueryErrors on page views - https://phabricator.wikimedia.org/T249565 (Marostegui)
[06:27:17] https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/587403/ when able
[06:27:24] http://dictionary.dauntless-soft.com/definitions/groundschoolfaa/WHEN+ABLE
[06:30:08] Blocked-on-schema-change, DBA: Schema change: Make page.page_restrictions column NULL - https://phabricator.wikimedia.org/T248333 (Marostegui) [] labsdb1012 [] labsdb1011 [] labsdb1010 [] labsdb1009 [x] dbstore1004 [] db1124 [] db1123 [] db1112 [] db1095 [] db1078 [] db1075
[06:34:58] marostegui: jynus I just wanted to thank you for all you have done for the incident. It's greatly appreciated <3
[07:04:35] thank you too! you did a lot as well as addshore
[07:27:12] DBA, Operations, Wikimedia-Incident: Redefine mysql GRANTs for wikiadmin - https://phabricator.wikimedia.org/T249683 (Ladsgroup) CREATE is already there but I want to emphasize that we need it for two reasons: 1- Sometimes devs, with coordination with DBAs, create tables in production. I have done it...
[07:49:20] DBA, Operations, Wikimedia-Incident: Redefine mysql GRANTs for wikiadmin - https://phabricator.wikimedia.org/T249683 (jcrespo) > creating indexes as well, right No, that would require ALTER rights. Creates are a non issue, that is why we let them be created by owners (e.g new extension or new wiki)....
[08:06:23] Amir1: should we rename wb_terms on the wiki replicas?
[08:07:01] Didn't we already do this? I think the script that dropped the table probably created it again
[08:07:04] Can you check
[08:07:13] sure
[08:07:35] We only renamed the wb_terms table in production, analytics and labsdb1012 but not on labsdb1009-1011
[08:08:15] labsdb1012 has: T248086_wb_terms only
[08:08:32] 1009 - 1011 were not renamed, as it requires the view recreation, which I can run as soon as we decide to rename them
[08:09:59] could I steal some of your time later, marostegui, in around 30 minutes- when backup finishes- for codfw importing/replication?
[08:10:08] sure
[08:10:31] will ping you when ready
[08:11:14] marostegui: I see, let me check with PM
[08:11:21] ok!
[08:12:37] Amir1: I am essentially talking about: https://phabricator.wikimedia.org/T248592#6031988
[08:12:43] Yup
[08:17:46] https://gerrit.wikimedia.org/r/c/operations/puppet/+/586384/4/modules/profile/templates/labs/db/views/maintain-views.yaml
[08:17:56] marostegui: is it deployed?
[08:17:59] addshore: is asking
[08:18:15] \o
[08:18:21] it is merged but not run (I have to run it once I have renamed)
[08:18:38] gotcha
[08:18:44] yes, then let's do it all now :)
[08:18:53] / let us check with product :P
[08:18:54] ok
[08:18:56] sure
[08:37:10] marostegui: we have the go ahead!
[08:37:12] make it so!
[08:37:13] :D
[08:37:19] ok, give me 10 mins
[08:37:38] Yeah, no rush, anytime in the next 8 hours is fine!
[08:37:38] https://usercontent.irccloud-cdn.com/file/aRUUAYyn/image.png
[08:49:36] addshore: done on labsdb1009, can you check?
[08:49:41] I have recreated the views
[08:49:58] let me check
[08:49:59] for both testwikidatawiki and wikidata
[08:50:09] I need to also do the empty ones for commons and test commons
[08:50:25] root@labsdb1009.eqiad.wmnet[wikidatawiki_p]> select * from wb_terms limit 1;
[08:50:25] ERROR 1356 (HY000): View 'wikidatawiki_p.wb_terms' references invalid table(s) or column(s) or function(s) or definer/invoker of view lack rights to use them
[08:50:32] looks like something is wrong with the views
[08:50:46] ah right
[08:50:48] I forgot the create
[08:50:49] fixing
[08:50:51] :D
[08:50:56] yes currently they both still have data
[08:51:27] (for me)
[08:51:59] https://www.irccloud.com/pastebin/kZpOw0FE/
[08:53:04] ok done
[08:53:24] root@labsdb1009.eqiad.wmnet[wikidatawiki_p]> select * from wb_terms limit 1;
[08:53:25] Empty set (0.00 sec)
[08:53:48] * addshore waits for it to propagate to the server he is on
[08:53:49] I think there is something wrong with the views generation
[08:53:56] root@labsdb1009.eqiad.wmnet[wikidatawiki_p]> select * from wb_terms_no_longer_updated limit 1;
[08:53:56] Empty set (0.00 sec)
[08:54:21] yeah, it is using the wrong table
[08:54:25] hmmm, wb_terms should be empty, but wb_terms_no_longer_updated still points to the data?
[08:54:32] let me check bstorm_ patch
[08:54:46] no, right now wb_terms_no_longer_updated keeps pointing to wb_terms
[08:54:50] and it should point to the renamed version
[08:55:00] I'm not seeing any of this on labsdb1011 yet
[08:55:05] it is only on 1009
[08:55:16] aaah :D
[08:55:25] but the view for wb_terms_no_longer_updated is wrong anyways
[08:55:26] * addshore will just watch the outputs of the commands you paste :D
[08:55:31] it keeps pointing to wb_terms
[08:55:37] I can make a patch?
[08:55:38] I am checking https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/586384/
[08:55:57] I see the problem
[08:55:58] Fixing
[08:56:09] ty!
[08:57:50] https://phabricator.wikimedia.org/P10938
[08:57:53] looks good?
[08:58:03] looks perfect
[08:58:40] ok, going to apply it to testwikidatawiki, commonswiki and testcommonswiki
[08:58:42] on 1009
[08:59:34] done
[08:59:46] Going ahead for 1010 and 1011 then
[09:00:40] and 1012
[09:04:30] DBA, Growth-Team, MediaWiki-Maintenance-system, Performance-Team, and 8 others: sql.php runs LoadExtensionSchemaUpdates - https://phabricator.wikimedia.org/T157651 (daniel)
[09:04:46] 1010 done
[09:05:35] wooo!
[09:06:20] I need to depool 1011 to be able to do it
[09:06:35] jynus: can you check: https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/587403/ ?
[09:06:38] Going to 1012 meanwhile
[09:06:55] yup
[09:07:02] checking
[09:07:44] *ignore my yup* that patch was not for me ;)
[09:07:45] thanks!
[09:07:47] haha
[09:08:59] will you have to wait for draining?
[09:09:08] 1012 done
[09:09:11] yeah, I will wait a bit
[09:09:22] saying in case you have time for codfw
[09:09:52] yep
[09:09:56] give me one sec to reload proxies
[09:11:27] jynus: I am ready
[09:11:51] confirm db2079 as the host to load WITH replication?
[09:11:56] checking
[09:12:44] db2079, s8 master, currently replication is stopped. All its slaves have server_id from codfw
[09:12:49] So, looks like it!
[09:13:03] ok, proceeding to load it with 20 threads
[09:13:06] which table are you going to load
[09:13:16] _recovered
[09:13:27] replication will take care of renaming it
[09:13:50] ok, make sure to use -e with myloader
[09:13:53] that enables binlog
[09:14:04] I use --enable-binlog
[09:14:10] same, yes
[09:14:29] hosts downtimed?
[09:14:35] or should I downtime them for another day?
[09:14:38] nah
[09:14:39] marostegui: when do you think we can drop that monster?
[09:14:47] Amir1: next week?
[09:14:48] one thing failed, which is the grants for dbprov2001
[09:14:55] makes sense
[09:15:00] sure
[09:15:06] we need root@10.192.0.114
[09:15:12] jynus: want me to add them? I might have it on my history
[09:15:24] ok, but ofc with new ip
[09:15:28] yep
[09:15:31] I am reloading locally for performance
[09:15:31] want it with replication?
[09:15:46] yes, although not sure if on eqiad or codfw master
[09:15:52] only codfw, no?
[09:15:55] ok
[09:15:59] ok, adding them
[09:16:10] done
[09:16:14] retrying
[09:16:28] working nicely now
[09:16:32] excellent
[09:16:42] once it finishes, I calculate 15 minutes or less
[09:16:49] I will restart replication
[09:16:55] ok
[09:16:58] will load it also on the backup hosts pending
[09:17:05] as we already have one backup before it
[09:17:13] and 24 hours of no complaints
[09:17:42] cool
[09:18:24] DBA, Operations, Wikimedia-Incident: Redefine mysql GRANTs for wikiadmin - https://phabricator.wikimedia.org/T249683 (Ladsgroup) hmm, my problem is that the command to create the table is like this: `lang=sql CREATE TABLE IF NOT EXISTS /*_*/wb_items_per_site ( ips_row_id BIGINT unsi...
[09:18:55] that should all be it, except of course to shout if you see something wrong/alerts/replication broken :-D
[09:20:13] will do
[09:20:15] thanks!
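Note on the 09:13 exchange: the load into db2079 is run with 20 threads and with binary logging enabled (`-e`, equivalent to `--enable-binlog`) so that the restored rows reach the codfw replicas through replication. A minimal sketch of how such a myloader command line could be assembled follows. The helper name, the dump directory and the host string are hypothetical placeholders and are not taken from the log; only the thread count and the binlog flag come from the conversation.

```python
import shlex


def build_myloader_command(dump_dir, host, threads=20, enable_binlog=True):
    """Build a myloader argv list (hypothetical helper, not the DBAs' real tooling)."""
    argv = [
        "myloader",
        "--host", host,            # placeholder target host
        "--directory", dump_dir,   # placeholder path to a mydumper dump
        "--threads", str(threads), # the log mentions loading with 20 threads
    ]
    if enable_binlog:
        # Long form of -e: log the restored data to the binary log so that
        # replicas of the target host receive the load as well.
        argv.append("--enable-binlog")
    return argv


cmd = build_myloader_command("/path/to/dump", "db2079.example")
print(shlex.join(cmd))
```

Leaving `enable_binlog` off would restore the table only on the host being loaded, which is why the flag is double-checked in the conversation before the load starts.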
[09:20:47] DBA: Drop wb_terms in production from s4 (commonswiki, testcommonswiki), s3 (testwikidatawiki), s8 (wikidatawiki) - https://phabricator.wikimedia.org/T248086 (Marostegui)
[09:21:51] oh, I think I won't recover db1095, just trash it
[09:21:55] FYI
[09:22:12] from which host will you rebuild it?
[09:22:13] we already have db1116
[09:22:19] ah
[09:22:23] yes, it was supposed to be temporary
[09:22:35] cool then yeah
[09:23:25] DBA, Wikidata, User-Addshore, Wikidata-Campsite (Wikidata-Campsite-Iteration-∞), cloud-services-team (Kanban): Move wb_terms data in cloud replicas to wb_terms_no_longer_updated - https://phabricator.wikimedia.org/T248592 (Addshore)
[09:23:44] marostegui: everything is closed / done from your side for ^^ no right? If so I'll resolve it
[09:26:39] nope
[09:26:40] not yet
[09:26:43] labsdb1011 pending
[09:26:48] Waiting for it to drain connections
[09:27:08] let's give it an hour or so
[09:29:54] okay!
[09:38:42] load finished, will restart replication everywhere
[09:39:03] count confirms the load
[09:42:08] nice
[09:42:13] aside from 21 hours of lag, everything seems fine
[09:48:54] will 10.4 include hot buffer pool size change?
[09:59:44] yep
[09:59:47] since 10.2 it does
[10:01:53] cool
[10:01:56] DBA, Operations, Wikimedia-Incident: Redefine mysql GRANTs for wikiadmin - https://phabricator.wikimedia.org/T249683 (Tgr) This is the wrong direction to attack the problem from, IMO. ALTER is useful but also dangerous; there is no way to separate functionality into non-overlapping "safe" and "not ne...
[10:05:00] DBA, Operations, Wikimedia-Incident: Redefine mysql GRANTs for wikiadmin - https://phabricator.wikimedia.org/T249683 (Marostegui) I can see that ALTER might be useful sometimes and might require a longer discussion and some other deep changes (more different roles etc), I don't think we should be kee...
[10:08:59] DBA, Wikidata, User-Addshore, Wikidata-Campsite (Wikidata-Campsite-Iteration-∞), cloud-services-team (Kanban): Move wb_terms data in cloud replicas to wb_terms_no_longer_updated - https://phabricator.wikimedia.org/T248592 (Marostegui) This is all done. All labsdb hosts show this behaviour: `...
[10:09:26] DBA, Wikidata, User-Addshore, Wikidata-Campsite (Wikidata-Campsite-Iteration-∞), cloud-services-team (Kanban): Move wb_terms data in cloud replicas to wb_terms_no_longer_updated - https://phabricator.wikimedia.org/T248592 (Marostegui) Open→Resolved
[10:09:49] DBA: Drop wb_terms in production from s4 (commonswiki, testcommonswiki), s3 (testwikidatawiki), s8 (wikidatawiki) - https://phabricator.wikimedia.org/T248086 (Marostegui)
[10:10:00] DBA, Operations, Wikimedia-Incident: Redefine mysql GRANTs for wikiadmin - https://phabricator.wikimedia.org/T249683 (jcrespo) > I can't add the indexes as part of creating the table Why not? This is literally the definition live on the DB: ` CREATE TABLE `wb_items_per_site` ( `ips_row_id` bigint(...
[10:14:48] DBA, Operations, Wikimedia-Incident: Redefine mysql GRANTs for wikiadmin - https://phabricator.wikimedia.org/T249683 (jcrespo) > This is the wrong direction to attack the problem from Removing wikiadmin grants, or stopping using that account and using 3 or 4 others without those grants are just a se...
[10:26:14] DBA, Growth-Team, MediaWiki-Maintenance-system, Performance-Team, and 8 others: sql.php runs LoadExtensionSchemaUpdates - https://phabricator.wikimedia.org/T157651 (Tgr) I would suggest the opposite: keep `sql.php`, drop `patchSql.php`. I don't think many people are familiar with the latter (comp...
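Note on the view mix-up seen between 08:53 and 08:58 and resolved in T248592: `wb_terms_no_longer_updated` kept selecting from `wb_terms` when it should have selected from the renamed table, while `wb_terms` itself was meant to sit on a freshly created empty table. The sketch below only illustrates that intended mapping. It is not the content of maintain-views.yaml or of P10938; the function name is hypothetical, and the renamed table name `T248086_wb_terms` is the one reported for labsdb1012, assumed here to apply to the other hosts as well.

```python
def view_statements(db, renamed_table):
    """Return illustrative CREATE VIEW statements for the two public views.

    Hypothetical helper: the real definitions are generated by the
    maintain-views tooling and may differ (definer, column lists, etc.).
    """
    public_db = f"{db}_p"  # public schema name as seen in the log (wikidatawiki_p)
    return [
        # wb_terms view: sits on the re-created, empty wb_terms table.
        f"CREATE OR REPLACE VIEW `{public_db}`.`wb_terms` "
        f"AS SELECT * FROM `{db}`.`wb_terms`",
        # wb_terms_no_longer_updated view: must select from the RENAMED table,
        # not from wb_terms (that was the bug spotted at 08:54).
        f"CREATE OR REPLACE VIEW `{public_db}`.`wb_terms_no_longer_updated` "
        f"AS SELECT * FROM `{db}`.`{renamed_table}`",
    ]


statements = view_statements("wikidatawiki", "T248086_wb_terms")
for statement in statements:
    print(statement + ";")
```

The ERROR 1356 at 08:50 fits this picture: a view whose underlying table does not exist yet is reported as referencing an invalid table until that table is created.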
[10:40:17] I've acked the s8 lagged replicas, in case recovery takes more than 1 hour
[10:41:06] cool
[13:20:02] Wow https://grafana.wikimedia.org/d/000000278/mysql-aggregated?panelId=8&fullscreen&orgId=1&var-dc=eqiad%20prometheus%2Fops&var-group=core&var-shard=s4&var-role=All&from=now-3h&to=now
[13:28:23] what was that?
[13:53:16] * addshore looks
[13:53:27] whut O_o
[13:53:54] s4 is commons right?
[13:58:49] SELECT /* SpecialRecentChanges::doMainQuery */
[14:01:10] or more likely, SELECT /* Wikibase\Client\Usage\Sql\EntityUsageTable::getAffectedRowIds */
[14:43:27] DBA, Operations, Wikimedia-Incident: Redefine mysql GRANTs for wikiadmin - https://phabricator.wikimedia.org/T249683 (Anomie) > Anything else? Does CREATE cover CREATE TEMPORARY TABLES? If not, we should most likely include that one too. There's also INDEX, see below. While we don't currently use...
[14:57:31] DBA, Operations, Wikimedia-Incident: Redefine mysql GRANTs for wikiadmin - https://phabricator.wikimedia.org/T249683 (Marostegui) >>! In T249683#6040077, @Anomie wrote: >> Anything else? > > Does CREATE cover CREATE TEMPORARY TABLES? If not, we should most likely include that one too. No, `CREATE`...
[14:57:53] DBA, Operations, Wikimedia-Incident: Redefine mysql GRANTs for wikiadmin - https://phabricator.wikimedia.org/T249683 (Marostegui)
[16:20:42] DBA, Operations, Wikimedia-Incident: Redefine mysql GRANTs for wikiadmin - https://phabricator.wikimedia.org/T249683 (Reedy) >>! In T249683#6039249, @jcrespo wrote: >> I can't add the indexes as part of creating the table > Why not? This is literally the definition live on the DB: > > ` > CREATE TAB...
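Note on the T249683 privilege discussion: the questions raised are whether `CREATE` also covers temporary tables and what is needed for indexes. In MySQL and MariaDB these are separate privileges: `CREATE TEMPORARY TABLES` is granted on its own, and `INDEX` covers `CREATE INDEX` on existing tables, while indexes declared inside `CREATE TABLE` need only `CREATE`. A minimal sketch of checking a planned grant set against the statements an account is expected to run follows. The mapping is deliberately simplified (for instance, `ALTER TABLE` can require more than `ALTER` alone), and the helper and statement labels are hypothetical, not part of any Wikimedia tooling.

```python
# Simplified statement -> privilege mapping (illustrative, not exhaustive).
REQUIRED_PRIVILEGE = {
    "CREATE TABLE": "CREATE",                            # inline indexes included
    "CREATE TEMPORARY TABLE": "CREATE TEMPORARY TABLES", # separate from CREATE
    "CREATE INDEX": "INDEX",                             # index on an existing table
    "ALTER TABLE": "ALTER",                              # at least ALTER
}


def missing_privileges(granted, statements):
    """Return the privileges needed by `statements` that are not in `granted`."""
    needed = {REQUIRED_PRIVILEGE[statement] for statement in statements}
    return sorted(needed - set(granted))


gaps = missing_privileges(
    {"CREATE"},
    ["CREATE TABLE", "CREATE TEMPORARY TABLE", "CREATE INDEX"],
)
print(gaps)  # ['CREATE TEMPORARY TABLES', 'INDEX']
```

With only `CREATE` granted, the account in this sketch can create tables (indexes included in the definition) but not temporary tables or standalone indexes, which matches the two gaps raised in the task.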