[06:08:34] DBA, Wikimedia-Site-requests: Global rename of The_Photographer → Wilfredor: supervision needed - https://phabricator.wikimedia.org/T215107 (Marostegui) Sorry, I wasn't available during the weekend. I am normally available from Monday to Friday from 7:00 UTC to 16:00 UTC
[06:35:45] DBA, MediaWiki-Database: Convert primary key integers and references thereto from int to bigint (unsigned) - https://phabricator.wikimedia.org/T63111 (Marostegui) >>! In T63111#4941569, @Krinkle wrote: > Re-opening so as to let the DBAs triage this. One question I wasn't able to answer quickly is: What u...
[07:06:10] DBA, Analytics, Analytics-Kanban, Patch-For-Review, User-Banyek: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5] - https://phabricator.wikimedia.org/T210478 (Marostegui) Now that we know that the biggest and most painful table (as it was Aria and it was huge - around 180G...
[07:06:19] elukey: ^
[07:07:10] you are awesome
[07:07:15] thanks a lot
[07:07:24] I am already asking people to test the new stuff
[07:07:41] I am now thinking if we should move staging from dbstore1003 to dbstore1005 to better optimize the disk space
[07:07:46] I will do some math and comment on the tas
[07:07:47] task
[07:07:59] (it is a matter of transferring 130GB, so it shouldn't take long anyways)
[07:08:11] ack, I'll leave this choice to the expert :)
[07:08:15] :)
[07:08:23] I am going to merge the SRV dns change and complete the code snippets
[07:08:24] so that means you!
[07:08:26] great
[07:08:28] yeah sure
[07:09:13] with the SRV records there is a ton of code that is not needed in people's scripts
[07:09:24] so asking them to migrate should be easier
[07:09:48] awesome :)
[07:10:02] I would like to aim for a given day (tbd)
[07:10:09] but to kind of set a deadline
[07:14:23] ;; ANSWER SECTION:
[07:14:23] _staging-analytics._tcp.eqiad.wmnet. 300 IN SRV 0 1 3340 dbstore1003.eqiad.wmnet.
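For readers following the SRV discussion: the answer line above carries everything a client script needs (priority, weight, port, target host). Below is a minimal sketch, assuming Python and only the `dig` answer line quoted in the log. `SrvRecord` and `parse_srv_answer` are hypothetical names for illustration, not part of any existing analytics script, and a real script would more likely query DNS directly instead of parsing text.

```python
from typing import NamedTuple


class SrvRecord(NamedTuple):
    """Fields of one SRV answer line (hypothetical helper type)."""
    name: str
    ttl: int
    priority: int
    weight: int
    port: int
    target: str


def parse_srv_answer(line: str) -> SrvRecord:
    """Parse a single SRV line as printed in a dig ANSWER SECTION."""
    fields = line.split()
    # Expected layout: name ttl IN SRV priority weight port target
    if len(fields) != 8 or fields[2] != "IN" or fields[3] != "SRV":
        raise ValueError(f"not an SRV answer line: {line!r}")
    name, ttl, _cls, _rtype, priority, weight, port, target = fields
    return SrvRecord(
        name=name.rstrip("."),      # drop the trailing root dot
        ttl=int(ttl),
        priority=int(priority),
        weight=int(weight),
        port=int(port),
        target=target.rstrip("."),
    )


# The answer line quoted in the log above.
record = parse_srv_answer(
    "_staging-analytics._tcp.eqiad.wmnet. 300 IN SRV 0 1 3340 dbstore1003.eqiad.wmnet."
)
print(f"{record.target}:{record.port}")
```

The point of the SRV approach is that a script only needs the service name; host and port come from DNS, so moving staging to another host means changing the record, not every script.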
[07:14:26] lovely
[07:14:56] yep yep makes sense, today I am going to send an email to alert people, if you are ok I'd do it next monday
[07:15:17] Including the staging migration?
[07:15:22] or, maybe better, we can ask to migrate staging stuff for say Thursday
[07:15:24] (which includes read only)
[07:15:50] in SF they asked me to have the two systems in parallel for a bit
[07:16:03] to have time to re-tune the scripts, test, etc..
[07:16:16] but they are aware that changes done to staging on one host will not get replicated to another host, right?
[07:17:20] not sure, this is why I want to send an email explaining the plan. My idea was to give them a week to prepare for the migration to dbstore1003/5 (depending where we'll put staging) and then do the migration
[07:17:31] great
[07:17:42] I will check the final destination of staging today
[07:17:51] ack
[07:18:02] I am prepping scripts/docs/etc..
[07:18:09] What we can do is do a final migration of staging with the latest data (which includes read only time as I specified) one day before or something?
[07:41:20] (sorry just seen this)
[07:41:29] yes definitely, as long as everybody is onboard
[07:41:37] sure
[07:41:46] then we set staging to read-only (on dbstore1002)
[07:41:59] so whoever didn't see the email hopefully will reach out to me
[07:42:03] hehe
[07:42:04] yes
[07:42:15] Keep in mind that we have to set read only to do the staging migration
[07:42:15] in the meantime, I added more info to https://wikitech.wikimedia.org/wiki/Analytics/Data_access#MariaDB_replicas
[07:42:23] even better
[07:42:36] oh nice
[08:06:49] Blocked-on-schema-change, MediaWiki-Change-tagging, Patch-For-Review, User-Ladsgroup: Drop change_tag.ct_tag column in production - https://phabricator.wikimedia.org/T210713 (Marostegui)
[08:27:07] DBA, Analytics, Analytics-Kanban, Patch-For-Review, User-Banyek: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5] - https://phabricator.wikimedia.org/T210478 (Marostegui) After having all the sections ready and compressed on all hosts, there is one thought I had, where to leave...
[09:07:31] elukey: I am going to move the staging db to dbstore1005 yeah
[09:07:39] I will update the SRV records
[09:08:51] and also the CNAME
[09:08:54] but +1
[09:08:55] yep
[09:09:00] I will add you as a reviewer
[09:24:33] DBA, Analytics, Analytics-Cluster, Analytics-Kanban, and 2 others: Cleanup or remove mysql puppet module; repurpose mariadb module to cover misc use cases - https://phabricator.wikimedia.org/T162070 (jcrespo) > Second, mariadb::packages_wmf and mariadb::packages should probably be merged into one...
[09:43:14] elukey: https://gerrit.wikimedia.org/r/#/c/operations/dns/+/489636/
[09:43:44] DBA: Remove AFT tables from the analytics slaves - https://phabricator.wikimedia.org/T92739 (Marostegui) Open→Resolved a:Marostegui The clicktracking no longer exist on either s1 master or dbstore1002: ` root@db1067.eqiad.wmnet[enwiki]> show tables like '%click%'; Empty set (0.00 sec) root@dbsto...
[09:43:47] DBA, Analytics, Analytics-Kanban, Patch-For-Review, User-Banyek: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5] - https://phabricator.wikimedia.org/T210478 (Marostegui)
[09:43:50] DBA, Epic, Tracking: Database tables to be dropped on Wikimedia wikis and other WMF databases (tracking) - https://phabricator.wikimedia.org/T54921 (Marostegui)
[09:45:51] marostegui: merged and authdns-updated
[09:45:57] Thank you!
[10:12:45] Blocked-on-schema-change, MediaWiki-Change-tagging, Patch-For-Review, User-Ladsgroup: Drop change_tag.ct_tag column in production - https://phabricator.wikimedia.org/T210713 (Marostegui)
[10:12:59] DBA, Analytics, Analytics-Kanban, Patch-For-Review, User-Banyek: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5] - https://phabricator.wikimedia.org/T210478 (Marostegui)
[10:13:46] DBA, Analytics, Analytics-Kanban, Patch-For-Review, User-Banyek: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5] - https://phabricator.wikimedia.org/T210478 (Marostegui)
[10:14:15] DBA, Analytics, Analytics-Kanban, Patch-For-Review, User-Banyek: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5] - https://phabricator.wikimedia.org/T210478 (Marostegui)
[10:14:48] Blocked-on-schema-change, MediaWiki-Change-tagging, Patch-For-Review, User-Ladsgroup: Drop change_tag.ct_tag column in production - https://phabricator.wikimedia.org/T210713 (Marostegui) s8 eqiad progress [] labsdb1011 [] labsdb1010 [] labsdb1009 [] dbstore1005 [] dbstore1002 [] db1124 [] db1116...
[10:17:07] DBA, Analytics, Analytics-Kanban, Patch-For-Review, User-Banyek: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5] - https://phabricator.wikimedia.org/T210478 (Marostegui) >>! In T210478#4942496, @Marostegui wrote: > After having al the sections ready and compressed on all host...
[10:42:52] I have uploaded 10.1.38-MariaDB, seems to work ok
[10:43:20] \o/
[10:44:08] should I add db1114 to tendril?
[10:44:18] db1114 should already be there no?
[10:44:47] it had to be removed to take away the topology while it was down
[10:44:52] aaaaah
[10:44:53] right
[10:44:54] sure
[10:45:45] we don't have many core hosts not running 10.1.37 in eqiad anymore
[10:45:51] only a few on s8 I think
[10:49:49] dumps finished on db1106, so we can restart that one now
[10:50:00] once I pool db1118
[12:18:34] jynus: can you paste the cve or mariadb link on the sre etherpad? I cannot find it now
[12:19:06] let me see
[12:20:20] some of them are not public, I would redirect to something like https://www.percona.com/blog/2019/02/06/percona-responds-to-mysql-local-infile-security-issues/
[12:21:23] as it provides easy to read config or upgrade suggestions
[12:58:13] DBA, Analytics, Research: Improve interlingual links across wikis through Wikidata IDs - https://phabricator.wikimedia.org/T215616 (diego)
[13:02:28] DBA, Analytics, Research, Wikidata: Improve interlingual links across wikis through Wikidata IDs - https://phabricator.wikimedia.org/T215616 (diego)
[14:06:46] thanks!
[14:11:10] DBA, Analytics, MediaWiki-Database, Research, Wikidata: Improve interlingual links across wikis through Wikidata IDs - https://phabricator.wikimedia.org/T215616 (Marostegui)
[16:07:21] DBA, Operations, ops-eqiad, Patch-For-Review, User-Marostegui: rack/setup/install db11[26-38].eqiad.wmnet - https://phabricator.wikimedia.org/T211613 (Marostegui) @Cmjohnson I can take care of the installations once you've done the RAID and added DNS and pxeboot entries with the MACs :-)
[16:35:19] DBA, MediaWiki-Database, Core Platform Team Backlog (Watching / External), Performance-Team (Radar), Wikimedia-Incident: Fix mediawiki heartbeat model, change pt-heartbeat model to not use super-user, avoid SPOF and switch automatically to the real mast... - https://phabricator.wikimedia.org/T172497
[16:41:27] DBA, Patch-For-Review: Drop valid_tag table - https://phabricator.wikimedia.org/T212254 (GTirloni)
[16:43:38] DBA, Analytics, Analytics-Kanban, Patch-For-Review, User-Banyek: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5] - https://phabricator.wikimedia.org/T210478 (Marostegui) a:elukey→Marostegui
[16:44:38] DBA, Analytics, Analytics-Kanban, Patch-For-Review, User-Banyek: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5] - https://phabricator.wikimedia.org/T210478 (Marostegui)
[16:45:51] DBA, Analytics, Analytics-Kanban, Patch-For-Review, User-Banyek: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5] - https://phabricator.wikimedia.org/T210478 (Marostegui)
[16:46:59] DBA, Analytics, Analytics-Kanban, Patch-For-Review, User-Banyek: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5] - https://phabricator.wikimedia.org/T210478 (Marostegui)
[16:47:41] DBA, Analytics, MediaWiki-Database, Research, Wikidata: Improve interlingual links across wikis through Wikidata IDs - https://phabricator.wikimedia.org/T215616 (Nuria) p:Triage→High
[16:56:15] db1106 of course stuck on loading ramdisk
[16:56:39] DBA, Analytics, Analytics-Kanban, Patch-For-Review, User-Banyek: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5] - https://phabricator.wikimedia.org/T210478 (Marostegui)
[16:56:46] jynus: same case as db2085??
[16:57:06] probably, it rebooted on its own
[16:57:36] not booting a second time
[16:57:44] should I force the older kernel?
[16:57:45] :(
[16:57:49] yeah
[16:58:04] I think moritzm mentioned on the task trying to enable debugging on initrd
[16:58:24] no time for that now
[16:58:32] could you note it on the task, cannot find it now
[16:59:04] yeah, it enters an infinite reboot loop
[16:59:22] I am not saying to do it _now_ I mean as a next step
[16:59:26] PowerEdge R630
[17:01:25] DBA, Operations, Packaging: db2085 doesn't boot with 4.9.0-8-amd64 - https://phabricator.wikimedia.org/T214840 (Marostegui) Same thing just happened with db1106 (PowerEdge R630 - same chassis as db2085) @MoritzMuehlenhoff can you help us with the approach you mentioned at T214840#4918369 ?
[17:02:00] thanks
[17:03:04] DBA, Analytics, MediaWiki-Database, Research, Wikidata: Improve interlingual links across wikis through Wikidata IDs - https://phabricator.wikimedia.org/T215616 (JAllemandou) @diego : This has worked for me (takes some time to compute and needs a bunch of resources). I hope it's close enough...
[17:03:34] it doesn't boot on -6 either
[17:03:38] :-(
[17:04:01] :|
[17:04:02] I don't have any other kernel available
[17:04:10] that is very strange
[17:04:22] any error logged? (hw error I mean)
[17:04:26] on the idrac
[17:06:40] it finally booted with -6
[17:06:43] :-P
[17:07:08] pheeeeew
[17:07:15] please mention this issue on the codfw restart
[17:07:24] DBA, Operations, Packaging: db2085/db1106 don't boot with 4.9.0-8-amd64 - https://phabricator.wikimedia.org/T214840 (Marostegui)
[17:07:25] codfw restart?
[17:07:35] the "rolling restart"
[17:07:44] ah yes
[17:07:47] I added it as a line
[17:07:49] earlier today
[17:07:50] line 199
[17:11:20] as in, hey, this is happening, be careful when restarting
[17:11:32] no need for help (except moritz) for now
[17:13:43] are db2085 and db1106 from the same hw batch/class?
[17:13:52] same HW yes
[17:13:56] same batch, don't know
[17:14:28] seems so, yes. very similar purchase date
[17:16:12] we can narrow this down further by testing the 4.9.144-3 kernel from the stretch point update (pending for next weekend, but we can already test the new kernel), I need to leave after the SRE meeting, but happy to investigate further tomorrow morning
[17:17:47] DBA, Operations, Packaging: db2085/db1106 don't boot with 4.9.0-8-amd64 - https://phabricator.wikimedia.org/T214840 (Marostegui)
[17:18:06] thanks moritzm
[17:23:27] moritzm: sorry for the ping
[17:23:43] I was just getting worried because meeting + that host being more difficult to failover
[17:23:57] we wasted the restart because we wen't back to -6
[17:24:03] *went
[17:27:34] ack, np wrt ping, happy to help/fix this
[17:29:00] oh, we will try to look at this ourselves, don't worry, but obviously guidance will be helpful
[17:46:41] DBA, Operations, Packaging: db2085/db1106 don't boot with 4.9.0-8-amd64 - https://phabricator.wikimedia.org/T214840 (Marostegui) @paravoid gave us some food for thought: ` stuck at "loading ramdisk" is sometimes an indication of misconfigured serial redirection after boot basically when Linux and the...
[18:23:43] DBA, Analytics, MediaWiki-Database, Research, Wikidata: Improve interlingual links across wikis through Wikidata IDs - https://phabricator.wikimedia.org/T215616 (diego) Looks good @JAllemandou, thanks. This is a good workaround, but imho, we should have a structure or schema that makes this...
[18:28:05] DBA, Analytics, MediaWiki-Database, Research, Wikidata: Improve interlingual links across wikis through Wikidata IDs - https://phabricator.wikimedia.org/T215616 (EBernhardson) I don't know if this meets your needs, but the cirrussearch dumps have the wikidata id's broken out. This is the `wi...
[19:49:35] db1083 looks like it is slower than average to connect, maybe we can do better connection rebalancing weights, something to look at tomorrow
[19:50:19] apparently it was just a spike, so nm
[19:51:25] DBA, Analytics, MediaWiki-Database, Research, Wikidata: Improve interlingual links across wikis through Wikidata IDs - https://phabricator.wikimedia.org/T215616 (diego) @EBernhardson , this looks exactly what I was looking for, initially. Thank you very much for that. However, I won't close...
[19:53:13] DBA, Analytics, MediaWiki-Database, Research, Wikidata: Improve interlingual links across wikis through Wikidata IDs - https://phabricator.wikimedia.org/T215616 (EBernhardson) >>! In T215616#4944986, @diego wrote: > @EBernhardson , this looks exactly what I was looking for, initially. Thank...
[20:07:29] DBA, Analytics, MediaWiki-Database, Research, Wikidata: Improve interlingual links across wikis through Wikidata IDs - https://phabricator.wikimedia.org/T215616 (jcrespo) > diego added a project: DBA. I don't understand what is the actionable here for us. Without context, I would say that:...
[22:34:49] DBA, Analytics, MediaWiki-Database, Research, Wikidata: Improve interlingual links across wikis through Wikidata IDs - https://phabricator.wikimedia.org/T215616 (diego) @jcrespo, the API works good for query specific pages/entities, not for example to know which pages that existing in X_wiki...