[00:14:03] DBA, MediaWiki-extensions-WikibaseClient, Wikidata, Patch-For-Review, and 3 others: wikibase-addUsagesForPage doesn't batch properly - https://phabricator.wikimedia.org/T172015#3482899 (Marostegui) Was this deployed?
[06:47:57] DBA, MediaWiki-extensions-WikibaseClient, Operations, Wikidata, and 5 others: Cache invalidations coming from the JobQueue are causing lag on several wikis - https://phabricator.wikimedia.org/T164173#3515010 (jcrespo) I've been told that several thousands of UPDATES Title::invalidateCache per sec...
[07:04:49] DBA, MediaWiki-extensions-WikibaseClient, Wikidata, Patch-For-Review, and 3 others: wikibase-addUsagesForPage doesn't batch properly - https://phabricator.wikimedia.org/T172015#3515021 (Ladsgroup) Not yet, Wikidata got rollbacked.
[08:02:10] DBA, MediaWiki-extensions-WikibaseClient, Operations, Wikidata, and 5 others: Cache invalidations coming from the JobQueue are causing lag on several wikis - https://phabricator.wikimedia.org/T164173#3515038 (jcrespo) To avoid the continuous lagging on non-directly pooled hosts (passive dc codfw,...
[10:59:09] DBA, Patch-For-Review: Migrate dbstore2001 to multi instance - https://phabricator.wikimedia.org/T168409#3515164 (Marostegui) >>! In T168409#3512149, @jcrespo wrote: > I am finishing adding s1 and s2 today, will continue adding the other 3 tomorrow. s5 and s6 will soon be ready to be added too. How woul...
[11:43:57] DBA, Patch-For-Review: Migrate dbstore2001 to multi instance - https://phabricator.wikimedia.org/T168409#3515211 (jcrespo) I will deploy some conditional code, want to finish the first 5 first.
[13:20:22] DBA: Drop m3 from dbstore servers - https://phabricator.wikimedia.org/T156758#3515306 (Marostegui) Open>Resolved m3 is now gone from dbstore1002, dbstore2001 and dbstore2002 (those last two didn't have it a loooong time ago)
[13:48:34] Should we maybe migrate s5 from db1095 to db1102?
Given that db1095 doesn't have a lot more space available and db1102 has 2T...(s5 compressed is around 800G)
[13:54:06] yah
[13:54:13] yeah
[13:54:36] we should move to multi-instance soon anyway
[13:56:30] Should we use db1069 for it maybe?
[13:59:44] we can setup db1102:s5, the rest can wait
[14:00:47] for some reason dbstore2001/2 s3, s4 aren't working on grafana
[14:01:12] I am creating the ticket for s5 yes :)
[14:05:17] DBA: Migrate s5 from db1095 to db1102 - https://phabricator.wikimedia.org/T172996#3515549 (Marostegui)
[14:05:34] DBA: Migrate s5 from db1095 to db1102 - https://phabricator.wikimedia.org/T172996#3515562 (Marostegui) p:Triage>Normal
[14:05:46] DBA: Migrate s5 from db1095 to db1102 - https://phabricator.wikimedia.org/T172996#3515549 (Marostegui)
[14:07:56] prometheus2003.codfw.wmnet:9900/ops/graph?g0.range_input=1h&g0.expr=mysql_version_info&g0.tab=1
[14:08:11] only s1, s2 and x1 are there
[14:13:47] silly question...grants?
[14:14:04] something is wrong
[14:14:07] it works locally
[14:14:12] but not from prometheus server
[14:14:26] maybe iptables?
[14:15:11] let's see
[14:15:40] no, it doesn't work locally either
[14:16:55] yeah, you are right, it is the grants
[14:17:14] \o/
[14:20:43] now it works: https://grafana.wikimedia.org/dashboard/db/mysql?orgId=1&var-dc=codfw%20prometheus%2Fops&var-server=dbstore2002&var-port=13313
[14:20:45] thank you
[14:24:01] DBA: Migrate s5 from db1095 to db1102 - https://phabricator.wikimedia.org/T172996#3515652 (Marostegui)
[14:24:38] awesome!! I am curious to see how dbstore2001 performs with multi instance
[14:24:49] We know how it did with multisource.
specially how bad enwiki did
[14:24:51] me, too, that was why I want to see that
[14:25:14] I felt that enwiki was slower than 2001
[14:25:21] but maybe it was just the ongoing transfer
[14:25:40] (reads "load"ing less than writes from another shard)
[14:26:18] s4 is about to finish
[14:26:22] DBA: Migrate s5 from db1095 to db1102 - https://phabricator.wikimedia.org/T172996#3515655 (Marostegui)
[14:26:37] is s4 the last one?
[14:26:38] I am currently more worried about the general lag issues we discussed
[14:26:43] on production
[14:26:58] I was just checking the graphs I sent you yesterday
[14:27:02] s4 codfw is still falling behind
[14:27:12] even with the current hack
[14:27:32] Ah, sorry I thought you meant the invalidation thingy
[14:27:40] it is the same thing
[14:28:05] I thought s4 issues were more imports than invalidation
[14:28:18] I don't think so anymore
[14:28:23] Just checking the graphs for the slow slaves, looks like the Updates have finished or reduced
[14:28:45] yeah, the hack should have slowed down
[14:28:57] but maybe that is not ideal
[14:29:20] yeah, but there is not much we can do
[14:29:24] apart from that
[14:29:30] well, complain louder
[14:29:54] if there is some code that forced us to slowdown the database, that is close to an unbreak now
[14:37:50] DBA, MediaWiki-extensions-WikibaseClient, Operations, Wikidata, and 5 others: Cache invalidations coming from the JobQueue are causing lag on several wikis - https://phabricator.wikimedia.org/T164173#3515701 (Marostegui) @daniel this happened again yesterday evening (Montreal time, and we got som...
[14:39:10] DBA, MediaWiki-extensions-WikibaseClient, Operations, Wikidata, and 5 others: Cache invalidations coming from the JobQueue are causing lag on several wikis - https://phabricator.wikimedia.org/T164173#3515704 (hoo) @Marostegui @daniel: Please keep me in the loop when discussing this.
[14:44:43] marostegui: thoughts on https://gerrit.wikimedia.org/r/#/c/370217/ ?
[14:44:50] DBA, Patch-For-Review: Productionize 22 new codfw database servers - https://phabricator.wikimedia.org/T170662#3515767 (Marostegui)
[14:45:57] haha you are in front of me and you ask me here!
[15:30:45] DBA, Operations, ops-codfw, Patch-For-Review: es2013 faulty BBU - https://phabricator.wikimedia.org/T172265#3515925 (Papaul) a:Papaul>Marostegui Raid controller replacement complete
[15:37:59] DBA, Operations, ops-codfw, Patch-For-Review: es2013 faulty BBU - https://phabricator.wikimedia.org/T172265#3515977 (Marostegui) Thanks @Papaul Everything looks good so far, it is re-charging. I have disabled BBU auto-learn. The BBU itself looks good, and so does the storage ``` Charger Status:...
[15:46:23] DBA, Operations, ops-codfw, Patch-For-Review: es2013 faulty BBU - https://phabricator.wikimedia.org/T172265#3516028 (Marostegui) ``` Charger Status: Complete ```
[15:48:10] DBA, Operations, ops-codfw, Patch-For-Review: es2013 faulty BBU - https://phabricator.wikimedia.org/T172265#3516033 (Marostegui) Open>Resolved I have started MySQL and everything else looks fine. So I am going to close this as resolved and I hope that we do not have to re-open it :-)
[15:59:08] DBA, Wikimedia-Site-requests: Create Wikivoyage Hindi - https://phabricator.wikimedia.org/T173013#3516056 (MF-Warburg)
[16:03:28] DBA, Wikimedia-Site-requests: Create Wikivoyage Hindi - https://phabricator.wikimedia.org/T173013#3516091 (Marostegui) p:Triage>Normal Will this be a public wiki? Please, ping us (DBAs) once the database is actually created.
[16:10:28] DBA, ContentTranslation, Language-2017 Sprint 10, WorkType-Maintenance: Remove cx_drafts table from production - https://phabricator.wikimedia.org/T172364#3516147 (KartikMistry) Removed cx_drafts from Labs instances and it is OK.
[16:12:44] DBA, ContentTranslation, Language-2017 Sprint 10, WorkType-Maintenance: Remove cx_drafts table from production - https://phabricator.wikimedia.org/T172364#3516155 (Marostegui) >>! In T172364#3516147, @KartikMistry wrote: > Removed cx_drafts from Labs instances and it is OK. What do you mean with...
[16:15:19] DBA, ContentTranslation, Language-2017 Sprint 10, WorkType-Maintenance: Remove cx_drafts table from production - https://phabricator.wikimedia.org/T172364#3516179 (KartikMistry) >>! In T172364#3516155, @Marostegui wrote: > What do you mean with removed from labs instances? Just to double check i...
[16:16:53] DBA, ContentTranslation, Language-2017 Sprint 10, WorkType-Maintenance: Remove cx_drafts table from production - https://phabricator.wikimedia.org/T172364#3516186 (Marostegui) >>! In T172364#3516179, @KartikMistry wrote: >>>! In T172364#3516155, @Marostegui wrote: >> What do you mean with removed...
[16:58:40] DBA, Wikimedia-Site-requests: Create Wikivoyage Hindi - https://phabricator.wikimedia.org/T173013#3516056 (MarcoAurelio) Taking care of the initial configuration patches. @Marostegui Yes, this will be a public wiki. Will create a subtask for preparing and check the storage layer for hi.wikivoyage. Regards.
[16:59:34] DBA, Wikimedia-Site-requests: Create Wikivoyage Hindi - https://phabricator.wikimedia.org/T173013#3516379 (MarcoAurelio)
[17:00:31] DBA, Wikimedia-Site-requests, User-MarcoAurelio: Create Wikivoyage Hindi - https://phabricator.wikimedia.org/T173013#3516056 (MarcoAurelio)
[17:04:30] DBA, ContentTranslation, Language-2017 Sprint 10, WorkType-Maintenance: Remove cx_drafts table from production - https://phabricator.wikimedia.org/T172364#3516388 (KartikMistry) >>! In T172364#3516186, @Marostegui wrote: > But did you actually drop the tables from the labs databases? > Sorry for...
[17:16:24] DBA, Cloud-Services, User-MarcoAurelio: Prepare and check storage layer for hi.wikivoyage - https://phabricator.wikimedia.org/T173027#3516428 (MarcoAurelio)
[17:23:34] DBA, Cloud-Services, User-MarcoAurelio: Prepare and check storage layer for hi.wikivoyage - https://phabricator.wikimedia.org/T173027#3516472 (Marostegui) Thanks - ping us once the database is actually created. Note for the DBAs, if this is done before - T172514 gets merged, we need to apply the ALTE...
[17:38:11] DBA, MediaWiki-extensions-WikibaseClient, Wikidata, Patch-For-Review, and 2 others: Usage tracking: record which statement group is used - https://phabricator.wikimedia.org/T151717#3516501 (Halfak) Talked to @hoo. If everything goes right with the deployment, we should be able to test this in th...
[17:40:07] DBA, MediaWiki-extensions-WikibaseClient, Wikidata, Patch-For-Review, and 2 others: Usage tracking: record which statement group is used - https://phabricator.wikimedia.org/T151717#3516505 (Halfak) Thinking about MariaDB capacity -- we could get ahead of potential capacity issues by designing a s...
[18:08:12] DBA, MediaWiki-extensions-WikibaseClient, Wikidata, Patch-For-Review, and 2 others: Usage tracking: record which statement group is used - https://phabricator.wikimedia.org/T151717#3516648 (Ottomata) I haven't fully grokked this ticket, but in general I am for events! :) 2 qs: - How many events...
[18:46:44] DBA, MediaWiki-extensions-WikibaseClient, Operations, Wikidata, and 5 others: Cache invalidations coming from the JobQueue are causing lag on several wikis - https://phabricator.wikimedia.org/T164173#3516822 (mmodell)
[18:52:39] DBA, Operations, media-storage, monitoring: icinga hp raid check timeout on busy ms-be and db machines - https://phabricator.wikimedia.org/T141252#3516853 (herron)
[19:56:06] DBA, Cloud-Services: Prepare and check storage layer for hi.wikivoyage - https://phabricator.wikimedia.org/T173027#3517111 (MarcoAurelio)