[07:10:57] Amir1: the pattern kept happening during the night (as expected sort of)
[08:51:49] marostegui: So I found https://grafana.wikimedia.org/d/000000548/wikibase-wb_terms?refresh=30s&orgId=1&from=now-24h&to=now that says which part of the code actually has the spikes
[08:52:14] Amir1: Oh, nice one, I had no idea about that graph :)
[08:52:46] yeah, it's very code-related bits but it's damn useful
[08:53:53] marostegui: interestingly it has been happening before the deployment: https://grafana.wikimedia.org/d/000000548/wikibase-wb_terms?refresh=30s&orgId=1&from=now-36h&to=now
[08:54:03] but in a different type
[08:55:05] Amir1: yeah, but not in the same way we have seen with the traffic or the handlers
[08:55:54] yeah, it's something hitting us but in the old system it hits cache, in the new one it bypasses the cache
[08:55:57] https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-dc=eqiad%20prometheus%2Fops&var-server=db1109&var-port=9104&from=1564527404568&to=1564736146168&panelId=5&fullscreen
[09:00:28] yeah, it seems the spikes are user-based but the new thing lets it reach the users
[09:00:40] *reach the database
[09:00:44] yeah
[09:00:59] let's revert for the weekend then?
[09:01:21] sure
[09:01:34] marostegui: do you want to do it?
[09:03:04] Amir1: Do I have to do it? You don't have privileges for it?
[09:03:35] I have, I was just asking :D
[09:03:46] ah! haha
[09:04:04] I'd rather not touch anything code related apart from db-eqiad.php and db-codfw.php!
[09:04:29] sure :P
[09:05:56] Revert "Revert "Revert
[09:05:57] hahah
[09:06:12] :P
[09:06:19] And next week we'll have Revert Revert Revert Revert
[09:06:41] I won't stop there :D I'll continue until someone gets angry at me
[09:06:47] Amir1: Sorry for asking you to revert, but I'd rather have a normal pattern during the weekend, especially nowadays when the coverage is reduced due to holidays
[09:07:40] All good, I've got PTSD for this thing, I'm very cautious
[09:08:00] hahaha
[09:08:04] Yeah, better be safe
[09:18:41] so it's deployed now
[09:20:49] Thank you :)
[12:30:33] DBA, Data-Services: Compress and defragment tables on labsdb hosts - https://phabricator.wikimedia.org/T222978 (Marostegui)
[12:31:26] DBA, Operations, cloud-services-team, wikitech.wikimedia.org: Switchover m5 primary master: db1073 to db1133 - https://phabricator.wikimedia.org/T229657 (Marostegui)
[12:31:30] DBA, Goal: Address Database infrastructure blockers on datacenter switchover & multi-dc deployment - https://phabricator.wikimedia.org/T220170 (Marostegui)
[12:33:21] DBA, Operations, wikitech.wikimedia.org, cloud-services-team (Kanban): Switchover m5 primary master: db1073 to db1133 - https://phabricator.wikimedia.org/T229657 (aborrero) I think we could either do this next week or wait until september because the WMCS team we will be traveling for Wikimania +...
[12:35:00] DBA, Operations, wikitech.wikimedia.org, cloud-services-team (Kanban): Switchover m5 primary master: db1073 to db1133 - https://phabricator.wikimedia.org/T229657 (Marostegui) >>! In T229657#5387587, @aborrero wrote: > I think we could either do this next week or wait until september because the W...
[12:37:35] DBA, Operations, wikitech.wikimedia.org, cloud-services-team (Kanban): Switchover m5 primary master: db1073 to db1133 - https://phabricator.wikimedia.org/T229657 (CDanis) FYI I'll be on vacation and without a work laptop approx Sept 10th - Sept 20th, and possibly Sept 9th as well. Outside of tha...
[12:42:07] DBA, Operations, wikitech.wikimedia.org, cloud-services-team (Kanban): Switchover m5 primary master: db1073 to db1133 - https://phabricator.wikimedia.org/T229657 (Marostegui) Let me pick a tentative.....Tuesday 3rd Sept at 13:00 UTC? @aborrero @CDanis ?
[12:42:26] DBA, Operations, wikitech.wikimedia.org, cloud-services-team (Kanban): Switchover m5 primary master: db1073 to db1133 - https://phabricator.wikimedia.org/T229657 (aborrero) Ok, so I'm proposing two dates: * 2019-10-03 -- I'm unavailable, but I think both @JHedden and @Andrew will be around. Also...
[12:43:45] DBA, Operations, wikitech.wikimedia.org, cloud-services-team (Kanban): Switchover m5 primary master: db1073 to db1133 - https://phabricator.wikimedia.org/T229657 (CDanis) >>! In T229657#5387609, @Marostegui wrote: > Let me pick a tentative.....Tuesday 3rd Sept at 13:00 UTC? @aborrero @CDanis ? L...
[12:44:31] DBA, Operations, wikitech.wikimedia.org, cloud-services-team (Kanban): Switchover m5 primary master: db1073 to db1133 - https://phabricator.wikimedia.org/T229657 (Marostegui) @aborrero are you proposing October?
[12:45:53] DBA, Operations, wikitech.wikimedia.org, cloud-services-team (Kanban): Switchover m5 primary master: db1073 to db1133 - https://phabricator.wikimedia.org/T229657 (aborrero) Ok, **2019-10-03**, work for us. Will let my team know, since I won't be around. >>! In T229657#5387615, @Marostegui wrote:...
[12:47:01] DBA, Operations, wikitech.wikimedia.org, cloud-services-team (Kanban): Switchover m5 primary master: db1073 to db1133 - https://phabricator.wikimedia.org/T229657 (Marostegui) Let's try to go for the 3rd of September at 13:00 UTC if @Andrew and/or @JHedden can confirm they'll be available to suppo...
[13:01:14] marostegui: https://phabricator.wikimedia.org/T229407#5387629
[13:03:12] Amir1: looks like you will need to play Sherlock :)
[13:03:36] 🕵️
[13:03:40] hahahah
[13:04:07] I have been dealing with six UBN tasks this week, I won't think about anything work related for the next three days :P
[13:04:58] Amir1: You can dream about the moment we can run: drop wb_terms;
[13:05:26] by the time I'm probably a 5000-year-old mummy :P
[13:05:37] It will be like when we dropped tag_summary
[13:05:41] but 1000 times happier!
[13:06:26] oh yeah that was awesome ^_^
[15:30:57] DBA, Gerrit, Operations, Release-Engineering-Team-TODO, and 2 others: Gerrit is failing to connect to db on gerrit2001 thus preventing systemd from working - https://phabricator.wikimedia.org/T176532 (Dzahn) a:Dzahn I'll add it. Thanks Manuel!
[16:50:58] DBA, conftool: #dbctl: add 'comment'/'description' metadata to instances - https://phabricator.wikimedia.org/T229677 (CDanis)
[17:22:01] DBA, conftool: #dbctl: add 'comment'/'description' metadata to instances - https://phabricator.wikimedia.org/T229677 (Marostegui) At the moment, on the zarcillo database we include the table `masters` which contains the master for each section of each dc: ` root@db1115.eqiad.wmnet[zarcillo]> select * fr...
[17:34:03] DBA: Update rack information on zarcillo.servers - https://phabricator.wikimedia.org/T229683 (Marostegui)
[18:36:55] DBA, conftool: #dbctl: manage 'externalLoads' data - https://phabricator.wikimedia.org/T229686 (CDanis)
[18:46:10] DBA, conftool: #dbctl: manage 'externalLoads' data - https://phabricator.wikimedia.org/T229686 (Marostegui) Those weights are treated the same way as the other sX sections, but given that we have few hosts we normally use them like that (0/1). Sometimes when we are repooling a cold host we might go for 3...
[21:02:32] DBA, Gerrit, Operations, Release-Engineering-Team-TODO, and 2 others: Gerrit is failing to connect to db on gerrit2001 thus preventing systemd from working - https://phabricator.wikimedia.org/T176532 (Dzahn) Open→Resolved gerrit, gerrit's httpd and gerrit's sshd are now all running and li...
[21:02:59] DBA, Gerrit, Operations, Release-Engineering-Team-TODO, Release-Engineering-Team (Development services): Gerrit is failing to connect to db on gerrit2001 thus preventing systemd from working - https://phabricator.wikimedia.org/T176532 (Dzahn)