[06:14:44] Blocked-on-schema-change, DBA, Patch-For-Review, Schema-change: Dropping site_stats.ss_total_views on wmf databases - https://phabricator.wikimedia.org/T86339 (Marostegui)
[06:27:17] Blocked-on-schema-change, DBA, Patch-For-Review, Schema-change: Dropping site_stats.ss_total_views on wmf databases - https://phabricator.wikimedia.org/T86339 (Marostegui)
[06:32:38] Blocked-on-schema-change, DBA, Patch-For-Review, Schema-change: Dropping site_stats.ss_total_views on wmf databases - https://phabricator.wikimedia.org/T86339 (Marostegui) s3 eqiad progress: [] labsdb1011 [] labsdb1010 [] labsdb1009 [x] dbstore1002 [] db1124 [] db1123 [] db1095 [] db1078 [] db10...
[06:33:10] Blocked-on-schema-change, DBA, Patch-For-Review, Schema-change: Dropping site_stats.ss_total_views on wmf databases - https://phabricator.wikimedia.org/T86339 (Marostegui)
[06:50:39] DBA, Operations, ops-eqiad, Patch-For-Review: rack/setup/install pc1007-pc1010 - https://phabricator.wikimedia.org/T207258 (Marostegui)
[06:51:01] DBA, Operations, ops-eqiad, Patch-For-Review: rack/setup/install pc1007-pc1010 - https://phabricator.wikimedia.org/T207258 (Marostegui) @Cmjohnson reminder: this is RAID5 instead of 10 as noted on top of the task.
[06:57:43] Blocked-on-schema-change, DBA, Patch-For-Review, Schema-change: Dropping site_stats.ss_total_views on wmf databases - https://phabricator.wikimedia.org/T86339 (Marostegui) s3 eqiad progress: [] labsdb1011 [] labsdb1010 [] labsdb1009 [x] dbstore1002 [] db1124 [x] db1123 [] db1095 [] db1078 [] db1...
[07:25:06] DBA, Operations, monitoring: Create a check/calendar alert for MariaDB TLS certs - https://phabricator.wikimedia.org/T152427 (Marostegui) Just to be on the safe side I have created Calendar events (personals and on Ops maintenance) as a reminder: 1 year before expiration 6 months before expiration 3...
[08:32:08] It seems s3@dbstore2002 performs better with the increased buffer pool
[08:32:15] at least it is not lagging now
[08:32:34] I guess after next week's backup we'll see how it catches up, right?
[08:34:13] how long will we keep the sqldata.s2.bak dir? it's almost 700 G (I know we have plenty of space, and I know we prefer not to remove stuff, I am just curious)
[08:37:03] It was pretty much lagging even without the backups
[08:37:15] So it should be fine next week I think
[08:37:29] It will probably lag whilst the backup runs, but it will be at least able to catch back up :)
[08:38:43] as is expected :)
[08:39:07] Regarding the removal, I guess once we need some space, we can delete it
[09:11:56] emails?
[09:13:23] ?
[09:14:23] T152427
[09:14:23] T152427: Create a check/calendar alert for MariaDB TLS certs - https://phabricator.wikimedia.org/T152427
[09:14:41] That is an "at least we have something for now"
[09:14:48] It took me 2 minutes to create those
[09:14:54] the health checks in place check the TLS expiration
[09:15:02] but they don't alert on it
[09:15:24] I am aware of that
[09:15:36] We need to include them on icinga and place an alert
[09:15:56] where did you get the date from?
[09:16:12] because on reimage, the cert was renewed
[09:16:37] I got it from one of the masters
[09:16:40] But good point
[09:16:44] It is useless :)
[10:50:01] DBA, Operations, StructuredDiscussions, Growth-Team (Current Sprint), and 2 others: Setup separate logical External Store for Flow in production - https://phabricator.wikimedia.org/T107610 (Banyek) As I plan to get involved in this I read back the ticket, and put the following tldr together. Did I...
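(Editor's sketch for the T152427 exchange above, where the existing health checks read the TLS expiration but do not alert on it.) One way an alerting check could turn a certificate's expiry date into an Icinga state is shown below. This is only a minimal sketch, not the check that was deployed: the function names and the 90/30-day thresholds are hypothetical, and the only convention relied on is the standard Nagios/Icinga plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL). It also assumes the date is read per host, since, as noted above, a reimage renews the cert and a date taken from one master says nothing about the others.

```python
import ssl
from datetime import datetime, timezone

# Standard Nagios/Icinga plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


def days_until_expiry(not_after: str, now: datetime) -> int:
    """Whole days between `now` and a certificate's notAfter value.

    `not_after` is the OpenSSL text form, e.g. "Nov 23 12:00:00 2019 GMT".
    `now` must be timezone-aware (UTC).
    """
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).days


def check_status(days_left: int, warn_days: int = 90, crit_days: int = 30) -> int:
    """Map remaining days to an exit code (thresholds are hypothetical)."""
    if days_left <= crit_days:
        return CRITICAL
    if days_left <= warn_days:
        return WARNING
    return OK
```

A wrapper script would read the notAfter value of the certificate on the host being checked, print a one-line status, and exit with the code returned by `check_status`.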
[11:01:16] Blocked-on-schema-change, DBA, Patch-For-Review, Schema-change, User-Banyek: Dropping user.user_options on wmf databases - https://phabricator.wikimedia.org/T85757 (Banyek) @Marostegui : I am thinking of executing this schema change today across the codfw sections and all the sanitarium mast...
[11:04:00] marostegui: is db1118 available? I want to test xtrabackup, but I can take another host if it is not
[11:10:42] jynus: No, sorry :(
[11:10:51] jynus: if you can find some other host, I would appreciate it
[11:11:32] is banyek going to do an all-dc schema change after midday on thanksgiving friday?
[11:11:44] I was replying to that ticket right now
[11:11:48] I can do it on monday
[11:11:58] The point is to do that part first
[11:13:03] The today part is easily avoidable
[11:15:29] Blocked-on-schema-change, DBA, Patch-For-Review, Schema-change, User-Banyek: Dropping user.user_options on wmf databases - https://phabricator.wikimedia.org/T85757 (Marostegui) >>! In T85757#4769511, @Banyek wrote: > @Marostegui : I am thinking of executing this schema change today across th...
[11:17:46] Blocked-on-schema-change, DBA, Patch-For-Review, Schema-change, User-Banyek: Dropping user.user_options on wmf databases - https://phabricator.wikimedia.org/T85757 (Banyek) Okay, then I stick to the original plan, thanks!
[12:33:16] I'd like to proceed on https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/473546/ but I need a CR
[12:35:06] you need to disable alerts before puppetizing them
[12:37:50] why go with 15 and 16 first? you don't say in the description
[12:38:36] you are right on both!
I'll fix those, thanks
[12:39:28] you are also "breaking" all other hosts
[12:39:33] I voted -1
[12:40:20] you don't need to do all at once, but please propose a plan first on the ticket
[12:40:44] T202367 is empty
[12:40:44] T202367: Productionize dbproxy101[2-7].eqiad.wmnet - https://phabricator.wikimedia.org/T202367
[12:41:24] for example, it is ok to not do a refactoring, but if you don't, it doesn't make sense to not do all at once
[12:41:34] please explain on ticket first
[12:59:45] please put the plan on the ticket as suggested above
[13:20:04] My plan was originally to do it on all hosts at once, but marostegui suggested doing only the hosts which are not used now. I can accept this idea.
[13:20:15] But you are right, I'll put that on the ticket first
[13:41:42] The reason I said to do only the unused ones first was to basically catch any possible errors when deploying or proceeding
[13:41:59] Instead of committing to replace all of them at once
[13:48:52] DBA, Patch-For-Review: Productionize dbproxy101[2-7].eqiad.wmnet - https://phabricator.wikimedia.org/T202367 (Banyek) The mapping of the new and old hosts will be the following: dbproxy1001 -> dbproxy1012 dbproxy1002 -> dbproxy1013 dbproxy1003 -> dbproxy1014 dbproxy1004 -> dbproxy1015 dbproxy1005 -> dbp...
[13:50:01] DBA, Patch-For-Review: Productionize dbproxy101[2-7].eqiad.wmnet - https://phabricator.wikimedia.org/T202367 (jcrespo) what about dbproxy1007 to 11?
[14:27:22] DBA, Patch-For-Review: Productionize dbproxy101[2-7].eqiad.wmnet - https://phabricator.wikimedia.org/T202367 (Banyek) T191595 mentions that 10 and 11 will be moved to hosts which are purchased later ("next fiscal a new set of proxies will be purchased for codfw and labsdb-eqiad.") But indeed I was not...
[14:31:27] DBA: Implement a proof of concept of a snapshot cycle automation for a mediawiki section database - https://phabricator.wikimedia.org/T210292 (jcrespo) p:Triage>Normal
[15:13:06] DBA, Patch-For-Review: Productionize dbproxy101[2-7].eqiad.wmnet - https://phabricator.wikimedia.org/T202367 (Banyek) Ok, I see what I am missing here, I'll get back with something clever
[15:16:32] DBA, Patch-For-Review: Productionize dbproxy101[2-7].eqiad.wmnet - https://phabricator.wikimedia.org/T202367 (jcrespo) Note you don't have to plan everything on your own; maybe we had some ideas, but you should think and ask questions. Originally, the plan was going to refactor the dbproxy profile into m...
[15:17:17] DBA, Patch-For-Review: Productionize dbproxy101[2-7].eqiad.wmnet - https://phabricator.wikimedia.org/T202367 (Banyek) ack
[16:16:33] DBA, Patch-For-Review: Productionize dbproxy101[2-7].eqiad.wmnet - https://phabricator.wikimedia.org/T202367 (Banyek) I am thinking of running the HAProxy service to listen on 3306 and 3307 on each host. From dbproxy1012 to 1016 we could implement the currently used mapping on 3306, and we could create red...
[16:17:14] I'm leaving for today, have a good weekend, all of you
[16:17:16] bye
[16:40:10] DBA, Patch-For-Review: Productionize dbproxy101[2-7].eqiad.wmnet - https://phabricator.wikimedia.org/T202367 (jcrespo) This is going in the right direction, except that we already have assigned ports for misc services: ` m1: 3321 m2: 3322 m3: 3323 m5: 3325 ` See https://phabricator.wikimedia.org/source...
[17:27:25] DBA, Patch-For-Review: Implement a proof of concept of a snapshot cycle automation for a mediawiki section database - https://phabricator.wikimedia.org/T210292 (jcrespo) So the above patch kinda works (it requires additional puppet changes to create a new hierarchy: ` /srv /backups /dumps /...