[06:09:53] Blocked-on-schema-change, DBA, Patch-For-Review: Drop now redundant indexes from pagelinks and templatelinks - https://phabricator.wikimedia.org/T174509#3705524 (Marostegui)
[06:33:38] DBA, Operations, Puppet: Switch databases to the future parser - https://phabricator.wikimedia.org/T172498#3705570 (Joe) Since we switched all of production to the future parser almost 2 months ago, we clearly fixed these issues as part of the more general ticket about the future parser.
[06:33:49] DBA, Operations, Puppet: Switch databases to the future parser - https://phabricator.wikimedia.org/T172498#3705571 (Joe) Open>Resolved
[07:41:16] DBA, Patch-For-Review: Run pt-table-checksum on s3 - https://phabricator.wikimedia.org/T164488#3705618 (Marostegui)
[08:08:58] DBA, Wiki-Setup: fix flow table issue on techconductwiki - https://phabricator.wikimedia.org/T178868#3705646 (ArielGlenn)
[08:09:29] DBA, Dumps-Generation, Wiki-Setup: fix flow table issue on techconductwiki - https://phabricator.wikimedia.org/T178868#3705660 (ArielGlenn)
[08:17:33] DBA, Patch-For-Review, Wiki-Setup (Create): Create CoC committee private wiki - https://phabricator.wikimedia.org/T165977#3705675 (Marostegui) Hi guys! Please review this: T178868#3705667 as it looks like the flow extension table isn't created.
[08:21:41] DBA, Patch-For-Review, Wiki-Setup (Create): Create CoC committee private wiki - https://phabricator.wikimedia.org/T165977#3705699 (Marostegui) Resolved>Open
[08:26:21] DBA, Operations, ops-eqiad, Patch-For-Review: db1082 storage crashed - https://phabricator.wikimedia.org/T178460#3705705 (Marostegui) Open>Resolved a:Marostegui db1082 is fully repooled now, let's close this for now
[08:32:41] DBA, Operations, ops-codfw, Patch-For-Review: Decommission db2010 and move m1 codfw to db2078 - https://phabricator.wikimedia.org/T175685#3705727 (Marostegui) Hi, Is there anything pending here?
Thanks!
[08:34:16] DBA, Patch-For-Review: Productionize 22 new codfw database servers - https://phabricator.wikimedia.org/T170662#3705750 (Marostegui)
[08:48:40] DBA, Patch-For-Review: Productionize 22 new codfw database servers - https://phabricator.wikimedia.org/T170662#3531173 (ops-monitoring-bot) Script wmf-auto-reimage was launched by marostegui on neodymium.eqiad.wmnet for hosts: ``` db2088.codfw.wmnet ``` The log can be found in `/var/log/wmf-auto-reimage/...
[09:07:29] DBA, Patch-For-Review: Productionize 22 new codfw database servers - https://phabricator.wikimedia.org/T170662#3705815 (ops-monitoring-bot) Completed auto-reimage of hosts: ``` ['db2088.codfw.wmnet'] ``` and were **ALL** successful.
[10:08:52] DBA, Operations: operations/software repo: flake8 check - https://phabricator.wikimedia.org/T178877#3705904 (Volans)
[10:14:20] DBA, Operations: operations/software repo: flake8 check - https://phabricator.wikimedia.org/T178877#3705921 (Volans) p:Triage>Normal
[12:41:25] DBA, Patch-For-Review, Wiki-Setup (Create): Create CoC committee private wiki - https://phabricator.wikimedia.org/T165977#3706305 (Marostegui) Open>Resolved The flow table issue was handled (and fixed) at T178868 Thanks @Reedy!
[13:01:38] marostegui, volans - I think that we reached a good status for https://gerrit.wikimedia.org/r/#/c/385173, mind if I merge and run puppet on db1046/7/store1002?
[13:02:39] elukey: can we maybe disable puppet on db1046 and dbstore1002 and only run it on db1047?
[13:02:43] just being over careful
[13:02:48] as those hosts are kinda snowflakes..
[13:02:52] sorry to be a pain
[13:03:01] elukey: I was about to ask the same
[13:03:09] also are we sure it doesn't apply to any other hosts?
[13:05:13] marostegui: sure sure that was my plan too :)
[13:05:58] volans: I don't think so since those are all new roles, but I can triple check
[13:06:32] me neither, just to be on the safe side
[13:06:44] I'm also about to step out for a bit
[13:06:56] I'll call you to keep you company
[13:07:00] :D
[13:07:00] xddddd
[13:07:16] so kind of you! :-p
[13:09:21] marostegui: are you available for the next 10 mins if something is weird?
[13:09:30] elukey: yep!
[13:09:36] all right! merging then
[13:09:43] that doesn't mean I will help
[13:09:44] :ppp
[13:09:53] let's merge!
[13:13:11] all right puppet disabled, merged and now let's see how db1047 takes it
[13:14:05] https://d30y9cdsu7xlg0.cloudfront.net/png/115782-200.png
[13:15:30] marostegui: seems good https://phabricator.wikimedia.org/P6169
[13:15:47] oh nice
[13:15:55] * marostegui cries
[13:16:13] i would say let's apply it to db1046?
[13:16:15] eventlogging sync is running fine :)
[13:16:18] yep!
[13:16:24] (and leave dbstore1002 for the last one?)
[13:17:44] marostegui: 1046 even better https://phabricator.wikimedia.org/P6169#34248
[13:17:54] nice!
[13:17:55] :)
[13:18:01] let's go for the hard one then!
[13:19:13] marostegui: https://phabricator.wikimedia.org/P6169#34249 - all good!
[13:19:47] sweet!!! good job!!
[13:21:16] all credits to volans
[13:21:17] :)
[13:22:07] marostegui: my plan is the following for db1108
[13:22:27] 1) add the puppet code to support eventlogging_sync with systemd
[13:22:55] 2) assign the new eventlogging replica code to db1108
[13:23:13] 3) ping you to figure out when/how to migrate the data on db1047 to it
[13:23:34] Sounds good to me
[13:23:36] 4) let it replicate alongside with db1047 for a bit to spot anomalies
[13:23:40] 5) nuke db1047 :D
[13:23:49] Are you going for mariadb10 or 10.1?
[13:24:09] whatever you suggest!
[13:24:37] I don't think we have stretch
[13:24:40] 10.0 package
[13:24:46] I would go for 10.1 anyways
[13:24:52] with systemd and all that
[13:24:58] we are going to have our new core servers with 10.1 anyways
[13:25:07] so we better start seeing if that works on eventlogging :)
[13:25:37] sure, let's go for it
[13:25:37] so I would suggest 10.1
[13:26:19] the data migration is basically a netcat + mysql_upgrade
[13:26:30] we'll see how it goes and if we face issues with 10.1+toku or something
[13:42:49] DBA, Patch-For-Review, Wiki-Setup (Create): Create CoC committee private wiki - https://phabricator.wikimedia.org/T165977#3706435 (Dereckson) I've prepared https://gerrit.wikimedia.org/r/#/c/386180/ so Flow tables will be present for the next private wiki.
[13:48:16] DBA, Patch-For-Review, Wiki-Setup (Create): Create CoC committee private wiki - https://phabricator.wikimedia.org/T165977#3706445 (Reedy) >>! In T165977#3706435, @Dereckson wrote: > I've prepared https://gerrit.wikimedia.org/r/#/c/386180/ so Flow tables will be present for the next private wiki. I f...
[13:59:22] <_joe_> marostegui: we're raising the number of workers for the jobrunners temporarily
[13:59:58] temporarily meaning a few hours? days?
[14:00:04] (just to keep an eye)
[14:00:31] <_joe_> a couple hours max
[14:00:35] ah coolio
[14:00:38] thanks for the heads up :)
[14:00:41] <_joe_> please ping me if something goes wrong
[14:00:45] will do :)
[14:00:46] <_joe_> but it should not
[14:00:48] hopefully not!
[14:01:07] <_joe_> in fact, we probably need to tune them up a bit in the upcoming weeks
[14:01:44] why is that?
[14:02:00] <_joe_> because the jobqueue is lagging behind badly
[14:02:46] I was looking for a graph, but it seems I cannot find it
[14:03:14] <_joe_> https://grafana.wikimedia.org/dashboard/db/job-queue-health?orgId=1&refresh=1m
[14:03:19] <_joe_> 10 million jobs now
[14:03:23] <_joe_> meh
[14:03:58] wow
[14:03:58] XD
[15:28:18] Blocked-on-schema-change, DBA, Patch-For-Review: Drop now redundant indexes from pagelinks and templatelinks - https://phabricator.wikimedia.org/T174509#3706903 (Marostegui)
[15:42:08] DBA, Patch-For-Review: Support multi-instance on core hosts - https://phabricator.wikimedia.org/T178359#3706959 (Marostegui)
[15:42:56] DBA, Patch-For-Review: Support multi-instance on core hosts - https://phabricator.wikimedia.org/T178359#3689827 (Marostegui) db2088 is now populated. Compressing s2 there as s1 was already transferred compressed from db2092.
[15:50:58] DBA, MediaWiki-extensions-WikibaseRepository, Wikidata, Goal: Remove numeric entity IDs from database schema - https://phabricator.wikimedia.org/T114902#3706987 (Lydia_Pintscher)
[16:34:46] DBA, Wikidata, Performance, User-Ladsgroup, Wikidata-Sprint: slow master queries on Wikibase\Client\Usage\Sql\EntityUsageTable::getAffectedRowIds - https://phabricator.wikimedia.org/T169336#3707135 (thiemowmde)
[19:04:22] DBA, Data-Services, Goal, cloud-services-team (FY2017-18): Migrate all users to new Wiki Replica cluster and decommission old hardware - https://phabricator.wikimedia.org/T142807#3707820 (madhuvishy)
[19:04:29] DBA, Operations, cloud-services-team, Scoring-platform-team (Current): Labsdb* servers need to be rebooted - https://phabricator.wikimedia.org/T168584#3707817 (madhuvishy) Resolved>Open Reopening since we are scheduling the labsdb1001 and 1003 reboots over the next couple weeks.
[19:53:26] DBA, Operations, cloud-services-team, Scoring-platform-team (Current): Labsdb* servers need to be rebooted - https://phabricator.wikimedia.org/T168584#3707945 (madhuvishy) Proposed timing for the 2 reboots: labsdb1001: Monday Oct 30 2017, 14:30 UTC (16:30 Madrid, 10:30 EST, 07:30 PT) labsdb1003:...
[19:55:11] DBA, Operations, cloud-services-team, Scoring-platform-team (Current): Labsdb* servers need to be rebooted - https://phabricator.wikimedia.org/T168584#3707948 (Marostegui) Looks good to me! Thanks for getting this arranged
[20:14:30] DBA, Operations, cloud-services-team, Scoring-platform-team (Current): Labsdb* servers need to be rebooted - https://phabricator.wikimedia.org/T168584#3708038 (madhuvishy) Thanks @Marostegui. I've updated the lists, and our wiki here -https://wikitech.wikimedia.org/wiki/Wiki_Replica_c1_and_c3_sh...
[20:17:31] DBA, Data-Services, Goal, cloud-services-team (FY2017-18): Migrate all users to new Wiki Replica cluster and decommission old hardware - https://phabricator.wikimedia.org/T142807#3708070 (madhuvishy)
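
Note on the 13:02-13:19 puppet rollout: the change was applied one host at a time, db1047 first, then db1046, then dbstore1002 ("the hard one"), checking the result before moving on. A minimal Python sketch of that ordering follows. The function name staged_rollout and the apply callback are hypothetical stand-ins for illustration; nothing here reflects the actual tooling used in the channel.

```python
from typing import Callable, Iterable, List


def staged_rollout(hosts: Iterable[str], apply: Callable[[str], bool]) -> List[str]:
    """Apply a change to hosts one at a time, in the given order.

    Stops at the first host where apply() returns False, so the hosts
    later in the list are left untouched. Returns the hosts that succeeded.
    """
    done: List[str] = []
    for host in hosts:
        if not apply(host):
            break
        done.append(host)
    return done


# Order taken from the log: db1047 first, dbstore1002 last.
ROLLOUT_ORDER = ["db1047", "db1046", "dbstore1002"]

if __name__ == "__main__":
    # Hypothetical callback: a real one would run the change on the host
    # and report whether the result looked healthy.
    print(staged_rollout(ROLLOUT_ORDER, lambda host: True))
```

If the callback reports a failure on db1046, only db1047 is returned and dbstore1002 is never touched, which matches the cautious approach asked for at 13:02:39.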