[05:02:17] DBA, Core-Platform-Team, Patch-For-Review, Schema-change: Fix WMF schemas to not break when comment store goes WRITE_NEW - https://phabricator.wikimedia.org/T187089 (Marostegui)
[05:05:59] DBA, Datacenter-Switchover-2018, Patch-For-Review: Reclone db2054 and db2068 - https://phabricator.wikimedia.org/T204127 (Marostegui) >>! In T204127#4581347, @jcrespo wrote: > db2068 has been recloned, but needs time to catch up replication and then be slowly repooled with the above patch. I have me...
[05:26:51] Blocked-on-schema-change, Wikibase-Quality, Wikibase-Quality-Constraints, Wikidata, and 4 others: Deploy schema change for adding numeric primary key to wbqc_constraints table - https://phabricator.wikimedia.org/T189101 (Marostegui) Table imported on testwikidatawiki on eqiad hosts: ``` root@db10...
[05:27:18] Blocked-on-schema-change, Wikibase-Quality, Wikibase-Quality-Constraints, Wikidata, and 4 others: Deploy schema change for adding numeric primary key to wbqc_constraints table - https://phabricator.wikimedia.org/T189101 (Marostegui)
[05:27:28] DBA, Operations, Epic, Patch-For-Review: DB meta task for next DC failover issues - https://phabricator.wikimedia.org/T189107 (Marostegui)
[05:27:34] DBA, MediaWiki-Database, PostgreSQL, Schema-change: Some tables lack unique or primary keys, may allow confusing duplicate data - https://phabricator.wikimedia.org/T17441 (Marostegui)
[05:27:38] Blocked-on-schema-change, Wikibase-Quality, Wikibase-Quality-Constraints, Wikidata, and 4 others: Deploy schema change for adding numeric primary key to wbqc_constraints table - https://phabricator.wikimedia.org/T189101 (Marostegui) Open>Resolved
[05:50:45] Blocked-on-schema-change, MediaWiki-extensions-Translate, Language-2018-July-September: Apply schema change to translate_reviews in WMF - https://phabricator.wikimedia.org/T201011 (Marostegui) The biggest table is on metawiki which is 22M, so
probably good to go directly on the masters someday early...
[06:09:10] DBA, Core-Platform-Team, Patch-For-Review, Schema-change: Fix WMF schemas to not break when comment store goes WRITE_NEW - https://phabricator.wikimedia.org/T187089 (Marostegui)
[06:43:37] https://grafana.wikimedia.org/dashboard/db/mysql?panelId=11&fullscreen&orgId=1&var-dc=codfw%20prometheus%2Fops&var-server=db2055&var-port=9104&from=now-24h&to=now
[06:44:10] yeah, I saw it finished the long query :)
[06:45:02] jynus: ok to merge: https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/460472/ ?
[06:45:23] Blocked-on-schema-change, DBA, Patch-For-Review: Make several mediawiki table fields unsigned ints on wmf databases - https://phabricator.wikimedia.org/T89737 (Marostegui)
[06:46:04] I had done https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/460389/
[06:46:11] Ah!
[06:46:16] I didn't know!
[06:46:19] I will abandon mine
[06:50:07] DBA, Datacenter-Switchover-2018, Patch-For-Review: Reclone db2054 and db2068 - https://phabricator.wikimedia.org/T204127 (jcrespo) a:jcrespo>Marostegui
[06:52:01] I will upgrade now db1062, db1069
[06:52:08] <3
[07:36:26] I have issues with remote management iface on db1062 :-/
[07:37:11] RAC0218: The maximum number of user sessions is reached.
[07:37:23] that just means someone is currently logged in?
[07:37:33] we only have a license for one user
[07:37:36] I doubt that
[07:38:18] I think you can try a "racadm racreset", if there's really someone logged in, he/she will be logged off
[07:38:35] I cannot execute that if I cannot log in!
[07:38:53] I tried both the web and the ssh interfaces
[07:39:06] ah, you can't even get to the mgmt?
I thought you couldn't get a console
[07:40:01] then it usually needs dc ops intervention, usually they reset the idrac module or check the cabling
[07:40:27] yeah, but that is a few hours away
[07:42:01] if it should happen sooner, I think Mark or Faidon can summon smarthands, that's probably within the set of actions smarthands can perform
[07:42:12] that is ok, I can wait
[07:48:23] If I do an "mc reset warm" will it reset to factory or just restart it?
[07:58:36] "According to Dell this is a known issue with the following firmware releases 1.50.50, 1.51.51, 1.51.52, 1.55.55 & 1.56.55."
[08:02:22] I filed T204302
[08:02:23] T204302: db1062 management interface busy (no sessions allowed) - https://phabricator.wikimedia.org/T204302
[08:03:30] Thanks
[08:11:01] DBA, Operations, Epic, Patch-For-Review: DB meta task for next DC failover issues - https://phabricator.wikimedia.org/T189107 (Marostegui)
[08:13:14] no idea on the "mc reset", sorry
[08:14:08] going with db1069
[08:14:19] +1
[08:32:09] I finished db1069, is there some more masters you are not going to touch?
[08:32:30] only touching s4 now
[08:33:23] then I would like to reimage db1070
[08:33:27] +1
[08:33:33] (s5)
[08:37:00] I think I am going to touch s7 master
[08:37:04] Before chris wakes up
[08:37:17] any objection?
[08:37:17] will you finish by then?
[08:37:26] yeah
[08:37:26] I don't want to delay the fix
[08:37:29] ok, then
[08:38:03] I am checking table sizes, I think I will be finished.
But we have plenty of time, I will do it next week
[08:38:04] ping when finished just to be sure we don't clash
[08:38:07] No need to finish everything today
[08:38:18] Better to get it fixed by chris today
[08:38:54] I am saying to do it
[08:39:37] nah, I am checking all the table sizes and I think I will be finished by then, but not 100%, so I prefer to postpone it
[08:39:40] we have lots of days :)
[08:42:16] we don't know if chris could do it today or when
[08:43:40] This is how I feel now: https://i.ytimg.com/vi/RseLZ9LqQv0/hqdefault.jpg
[08:44:40] I will do it now, it will be at least 4-5 hours until chris gets to it (if done today)
[08:44:43] Plenty of time
[08:44:45] so there is like 7 dbs there?
[08:44:53] yeah
[08:44:53] do it in slow steps
[08:45:03] so you can stop quickly and problem solved
[08:45:26] yeah
[08:45:51] or do it in small -> large order
[09:07:00] while talking to banyek, I just remembered that now would be a good time to do learning cycles for our batteries on eqiad
[09:07:59] yeah, but also not sure what will happen if we do it, they are so old and delicate that I don't know
[09:08:35] I think it is better to know
[09:08:39] than to do nothing
[09:08:48] yep
[09:08:50] after all it is our fault for disabling them
[09:09:00] and better now than at random time
[09:09:18] yes
[09:22:18] jynus: I am done with db1062
[09:24:35] Blocked-on-schema-change, DBA, Patch-For-Review: Make several mediawiki table fields unsigned ints on wmf databases - https://phabricator.wikimedia.org/T89737 (Marostegui)
[09:24:49] DBA, Operations, Epic, Patch-For-Review: DB meta task for next DC failover issues - https://phabricator.wikimedia.org/T189107 (Marostegui)
[09:24:53] DBA, Schema-change, Tracking: [DO NOT USE] Schema changes for Wikimedia wikis (tracking) [superseded by #Blocked-on-schema-change] - https://phabricator.wikimedia.org/T51188 (Marostegui)
[09:24:57] Blocked-on-schema-change, DBA, Patch-For-Review: Make several
mediawiki table fields unsigned ints on wmf databases - https://phabricator.wikimedia.org/T89737 (Marostegui) Open>Resolved All done
[09:26:18] marostegui: thanks
[09:56:47] do we have a master upgrade/10.1 task?
[09:57:43] mmmm nope
[09:57:57] I propose to upgrade to mariadb 10.1, what do you think?
[09:58:10] Oh yeah, I thought that was given :)
[09:58:38] ok, I am going to create a ticket about that
[10:00:28] jynus: I thought the idea was to upgrade to 10.1 :)
[10:01:39] we agree then
[10:03:08] DBA, Operations, Epic: Upgrade all core (mediawiki) database servers to mariadb 10.1 - https://phabricator.wikimedia.org/T204311 (jcrespo) p:Triage>Normal
[10:05:14] DBA, Operations, Epic: Upgrade all core (mediawiki) database servers to mariadb 10.1 - https://phabricator.wikimedia.org/T204311 (Marostegui)
[10:05:33] DBA, Operations, Epic: Upgrade all core (mediawiki) database servers to mariadb 10.1 - https://phabricator.wikimedia.org/T204311 (ops-monitoring-bot) Script wmf-auto-reimage was launched by banyek on neodymium.eqiad.wmnet for hosts: ``` ['db1070.eqiad.wmnet'] ``` The log can be found in `/var/log/wmf...
[10:06:02] DBA, Operations: Upgrade all core (mediawiki) database servers to mariadb 10.1 - https://phabricator.wikimedia.org/T204311 (Marostegui)
[10:06:24] DBA, Operations: Upgrade all core (mediawiki) database servers to mariadb 10.1 - https://phabricator.wikimedia.org/T204311 (Marostegui)
[10:09:10] DBA, Operations: Upgrade all core (mediawiki) database servers to mariadb 10.1 - https://phabricator.wikimedia.org/T204311 (Marostegui)
[10:19:48] Blocked-on-schema-change, MediaWiki-extensions-Translate, Language-2018-July-September: Apply schema change to translate_reviews in WMF - https://phabricator.wikimedia.org/T201011 (Marostegui)
[10:19:51] DBA, MediaWiki-Database, PostgreSQL, Schema-change: Some tables lack unique or primary keys, may allow confusing duplicate data - https://phabricator.wikimedia.org/T17441 (Marostegui)
[10:58:36] DBA, Operations: Upgrade all core (mediawiki) database servers to mariadb 10.1 - https://phabricator.wikimedia.org/T204311 (ops-monitoring-bot) Completed auto-reimage of hosts: ``` ['db1070.eqiad.wmnet'] ``` and were **ALL** successful.
[10:59:39] DBA, Operations: Upgrade all core (mediawiki) database servers to mariadb 10.1 - https://phabricator.wikimedia.org/T204311 (jcrespo)
[12:53:42] DBA, Datacenter-Switchover-2018, Patch-For-Review: Reclone db2054 and db2068 - https://phabricator.wikimedia.org/T204127 (Banyek) Open>Resolved
[12:53:45] DBA, Operations, Epic, Patch-For-Review: DB meta task for next DC failover issues - https://phabricator.wikimedia.org/T189107 (Banyek)
[13:03:05] DBA, Operations: Upgrade all core (mediawiki) database servers to mariadb 10.1 - https://phabricator.wikimedia.org/T204311 (jcrespo) a:jcrespo
[13:32:52] DBA, Operations: Upgrade all core (mediawiki) database servers to mariadb 10.1 - https://phabricator.wikimedia.org/T204311 (jcrespo)
[13:32:57] DBA, Operations, Epic, Patch-For-Review: DB meta task for next DC failover issues - https://phabricator.wikimedia.org/T189107 (jcrespo)
[15:20:54] DBA, Operations: Upgrade all core (mediawiki) database servers to mariadb 10.1 - https://phabricator.wikimedia.org/T204311 (Cmjohnson)
[16:02:25] DBA, Operations: Upgrade all core (mediawiki) database servers to mariadb 10.1 - https://phabricator.wikimedia.org/T204311 (jcrespo) db1090 took a long time to recover replication lag after db1062 initial maintenance, even more than dbstore1002 - need to check why next week.
[21:32:51] DBA, Wikimedia-Extension-setup, Patch-For-Review, User-Urbanecm: Extension:Translate for id.wikimedia.org website - https://phabricator.wikimedia.org/T204292 (Urbanecm) Well, I know I said this will be deployed in the week of September 17. I just now realized it's DC switchover those days, which...