[00:01:10] VPS-project-Codesearch, Patch-For-Review: Codesearch: Index schema/events/* repos - https://phabricator.wikimedia.org/T275705 (Legoktm) Open→Resolved
[00:02:53] Release-Engineering-Team-TODO (2021-01-01 to 2021-03-31 (Q3)), MW-on-K8s, serviceops: Missing docker iptables nat rules for releases hosts - https://phabricator.wikimedia.org/T276869 (dduvall)
[00:03:32] Release-Engineering-Team-TODO (2021-01-01 to 2021-03-31 (Q3)), MW-on-K8s, serviceops: Missing docker iptables nat rules for releases hosts - https://phabricator.wikimedia.org/T276869 (dduvall)
[00:18:17] Release-Engineering-Team (Logspam), Wikimedia-General-or-Unknown, JavaScript, Wikimedia-production-error: TypeError: format.replace is not a function in randomToken function used by SearchSatisfaction schema - https://phabricator.wikimedia.org/T272904 (Jdlrobson) Open→Resolved a:Jdlrob...
[00:22:38] Release-Engineering-Team-TODO (2021-01-01 to 2021-03-31 (Q3)), MW-on-K8s, serviceops: Missing docker iptables nat rules for releases hosts - https://phabricator.wikimedia.org/T276869 (Legoktm) Including `profile::docker::builder` would be wrong since that also pulls in a bunch of other stuff to build...
[02:30:00] I'm hitting issues on our CI because the cache directory for composer doesn't exist. I see that doing the composer cache as a volume seems to be recommended in some places for performance with docker when containers go up and down a lot (like CI). I'm wondering if other CI builds do anything along those lines?
[02:30:01] The error is generated by wikimedia-composer-merge-plugin "/cache/composer/vcs does not exist and could not be created."
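The error at 02:30:01 suggests the job expects a writable Composer cache directory that was never created. A minimal sketch of one way to guard against that, assuming the job can run a small setup step before Composer: it creates the cache tree and points Composer at it through the `COMPOSER_CACHE_DIR` environment variable. The helper name `prepare_composer_cache` and the temporary root used in the demo are hypothetical and are not taken from the CI configuration discussed here.

```python
import os
import tempfile
from pathlib import Path


def prepare_composer_cache(cache_root: str) -> dict:
    """Create <cache_root>/composer/vcs and return an environment for Composer.

    Hypothetical helper: the real job may mount /cache as a volume instead.
    """
    vcs_dir = Path(cache_root) / "composer" / "vcs"
    # exist_ok=True makes the step safe to repeat on a warm cache.
    vcs_dir.mkdir(parents=True, exist_ok=True)
    env = dict(os.environ)
    # Composer reads its cache location from COMPOSER_CACHE_DIR.
    env["COMPOSER_CACHE_DIR"] = str(vcs_dir.parent)
    return env


if __name__ == "__main__":
    # Demo against a throwaway directory rather than the real /cache.
    with tempfile.TemporaryDirectory() as root:
        env = prepare_composer_cache(root)
        print(env["COMPOSER_CACHE_DIR"])
```

The returned mapping could be passed as `env=` to `subprocess.run` when invoking Composer, so the cache location does not leak into the rest of the build.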
[02:30:01] https://integration.wikimedia.org/ci/job/wikimedia-fundraising-civicrm-docker/4628/console
[06:32:15] Phabricator (Upstream): If you are not subscribed to a task, you will be subscribed to this task when you try to send an empty message under that task - https://phabricator.wikimedia.org/T276880 (IN)
[06:34:47] Phabricator (Upstream): If you are not subscribed to a task, you will be subscribed to this task when you try to send an empty message under that task without prompt - https://phabricator.wikimedia.org/T276880 (IN)
[06:42:30] Phabricator (Upstream): If you are not subscribed to a task, you will be subscribed to this task when you try to send an empty message under that task without prompt - https://phabricator.wikimedia.org/T276880 (Aklapper) Open→Declined No thanks
[08:15:41] Release-Engineering-Team-TODO (2021-01-01 to 2021-03-31 (Q3)), GitLab (Initialization), User-brennen: Define auth strategy for GitLab - https://phabricator.wikimedia.org/T274461 (MoritzMuehlenhoff) >>! In T274461#6894542, @brennen wrote: > Could we get the Speed & Function folks a handful of test acc...
[08:17:47] Release-Engineering-Team-TODO (2021-01-01 to 2021-03-31 (Q3)), SRE, GitLab (Initialization), User-brennen: Define auth strategy for GitLab - https://phabricator.wikimedia.org/T274461 (MoritzMuehlenhoff)
[09:40:58] Release-Engineering-Team-TODO (2021-01-01 to 2021-03-31 (Q3)), SRE, GitLab (Initialization), User-brennen: Define auth strategy for GitLab - https://phabricator.wikimedia.org/T274461 (jbond) >>! In T274461#6895107, @MoritzMuehlenhoff wrote: > @Sergey.Trofimovsky.SF : You can now log into idp01.ss...
[13:08:32] Gerrit: Rename Gerrit repository "LdapGroups" to "LDAPGroups" - https://phabricator.wikimedia.org/T200736 (planetenxin) any chance to get the broken Github mirror fixed?
[13:23:44] (PS3) ZPapierski: Add sonar scanner to discolytics [integration/config] - https://gerrit.wikimedia.org/r/669770 (https://phabricator.wikimedia.org/T264877)
[13:25:00] (CR) jerkins-bot: [V: -1] Add sonar scanner to discolytics [integration/config] - https://gerrit.wikimedia.org/r/669770 (https://phabricator.wikimedia.org/T264877) (owner: ZPapierski)
[13:29:55] (PS4) ZPapierski: Add sonar scanner to discolytics [integration/config] - https://gerrit.wikimedia.org/r/669770 (https://phabricator.wikimedia.org/T264877)
[13:31:05] (CR) jerkins-bot: [V: -1] Add sonar scanner to discolytics [integration/config] - https://gerrit.wikimedia.org/r/669770 (https://phabricator.wikimedia.org/T264877) (owner: ZPapierski)
[13:46:48] patches have been waiting up to one hour to even get started on gate-and-submit jobs
[13:50:08] (PS1) Lars Wirzenius: feat: add script to test Scap under train-dev [tools/scap] - https://gerrit.wikimedia.org/r/670172
[14:03:22] (PS5) ZPapierski: Add sonar scanner to discolytics [integration/config] - https://gerrit.wikimedia.org/r/669770 (https://phabricator.wikimedia.org/T264877)
[14:11:51] (CR) ZPapierski: Add sonar scanner to discolytics (3 comments) [integration/config] - https://gerrit.wikimedia.org/r/669770 (https://phabricator.wikimedia.org/T264877) (owner: ZPapierski)
[14:42:04] Release-Engineering-Team-TODO (2021-01-01 to 2021-03-31 (Q3)), Scap, EngProd-Virtual-Hackathon, Patch-For-Review, Python3-Porting: Fix Scap test suite problems under Python 3 - https://phabricator.wikimedia.org/T246025 (LarsWirzenius) Open→Declined Scap is to be retired before we lose...
[14:43:38] Release-Engineering-Team-TODO, DNS, SRE, Traffic, and 3 others: DNS for GitLab - https://phabricator.wikimedia.org/T276170 (wkandek) gerrit.wikimedia.org lives on a second IP address on gerrit1001. Should we follow that model here as well?
[14:46:39] Release-Engineering-Team (Development services), Release-Engineering-Team-TODO (2021-01-01 to 2021-03-31 (Q3)), Scap, User-brennen: Applying security patches should be robust and also give some useful output - https://phabricator.wikimedia.org/T269153 (LarsWirzenius) I've been thinking about this...
[14:50:29] Release-Engineering-Team (Development services), Release-Engineering-Team-TODO (2021-01-01 to 2021-03-31 (Q3)), Scap, User-brennen: Applying security patches should be robust and also give some useful output - https://phabricator.wikimedia.org/T269153 (LarsWirzenius) I note that I would like to r...
[14:53:42] Continuous-Integration-Config, Release-Engineering-Team: Enable PHP assertions in WMF CI - https://phabricator.wikimedia.org/T276940 (Daimona)
[14:59:27] Release-Engineering-Team-TODO (2021-01-01 to 2021-03-31 (Q3)), Patch-For-Review, Release, Train Deployments: 1.36.0-wmf.33 deployment blockers - https://phabricator.wikimedia.org/T274937 (LarsWirzenius) Open→Resolved Closing this, new train starting.
[16:16:43] !log taking deployment-deploy01 agent offline to mitigate stuck post-merge jobs
[16:16:45] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[16:21:39] hmm, that doesn't appear to have worked. deployment-deploy01 still isn't receiving builds
[16:24:54] !log cycling gearman plugin on integration.wikimedia.org
[16:24:56] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[16:26:18] Nikerabbit: wikibase DOS'd CI, should be clearing soon.
[16:26:50] !log builds once again being scheduled on deployment-deploy01
[16:27:04] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[16:32:40] Beta-Cluster-Infrastructure, Cloud-VPS (Debian Jessie Deprecation): Migrate deployment-prep away from Debian Jessie to Debian Stretch/Buster - https://phabricator.wikimedia.org/T218729 (MoritzMuehlenhoff) I think we can simply remove deployment-sca01/sca02? The respective hosts in production have been re...
[16:37:13] huh. still waiting on queued builds to be scheduled on deployment-deploy01
[16:47:24] !log still seeing "JobOffer[deployment-deploy01 #3] rejected beta-scap-eqiad: Waiting for next available executor on ‘deployment-deploy01’" despite available executors
[16:47:26] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[16:54:57] Release-Engineering-Team (Development services), Release-Engineering-Team-TODO (2021-01-01 to 2021-03-31 (Q3)), Scap, User-brennen: Applying security patches should be robust and also give some useful output - https://phabricator.wikimedia.org/T269153 (hashar) About being atomic. All security pa...
[17:14:36] seems to be worse now, not better
[17:20:22] yep. working on it
[17:20:27] T276783#6896365 might fix it, once that makes it through gate-and-submit itself
[17:20:27] T276783: "Page should be undoable" selenium test is flaky - https://phabricator.wikimedia.org/T276783
[17:20:40] if it’s still about CI being full
[17:21:32] deployment-db05 seems to be acting up (intermittent connection failures) which is causing issues with beta-update-databases-eqiad
[17:22:14] !log deployment-db05 seems to be acting up (intermittent connection failures) which is causing issues with beta-update-databases-eqiad, which is (possibly) causing post-merge jobs to pile up
[17:22:17] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[17:22:42] !log restarting deployment-db05 via horizon
[17:22:44] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[17:25:07] !log seeing "[ 2886.337845] EXT4-fs error (device vda3): ext4_validate_block_bitmap:" for deployment-db05
[17:25:10] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[17:27:41] Does that mean could be a bad block?
[17:37:29] deployment-db05 is current database master
[17:37:46] and the hypervisor it was on was having some trouble today, a.rturo live migrated it
[17:38:47] marxarelli: given T268628 we could just build a new replica and promote db06 into beta master
[17:38:48] T268628: Upgrade deployment-prep-db hosts to buster/MariaDB 10.4 - https://phabricator.wikimedia.org/T268628
[17:39:13] Majavah: that's our current thinking (see -ops)
[17:39:27] yeah, didn't see that, moving there
[17:46:24] Continuous-Integration-Infrastructure, Release-Engineering-Team (CI & Testing services), Cloud-VPS, Patch-For-Review, cloud-services-team (Kanban): integration instances suffer from high IO latency due to Ceph - https://phabricator.wikimedia.org/T266777 (Phamhi) Hi @hashar, Can this ticket be...
[17:50:05] Beta-Cluster-Infrastructure: deployment-db05 disk issues - https://phabricator.wikimedia.org/T276968 (Majavah) p:Triage→Unbreak!
[17:51:30] Beta-Cluster-Infrastructure: deployment-db05 disk issues - https://phabricator.wikimedia.org/T276968 (Majavah) that host is on Jessie / MariaDB 10.1, see also {T268628}
[18:04:01] !log deleting shut down memc* deployment-prep instances to free up quota for replacement db instances (T276968)
[18:04:05] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[18:04:06] T276968: deployment-db05 disk issues - https://phabricator.wikimedia.org/T276968
[18:09:07] !log set deployment-db05 to read-only to avoid issues with T276968
[18:09:10] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[18:09:10] T276968: deployment-db05 disk issues - https://phabricator.wikimedia.org/T276968
[18:10:26] Majavah: thinking we might as well tackle the buster upgrade at the same time. 1) mysqldump on db06, 2) spin up replacement db instances, 3) restore, 4) reconfigure pool. does that sound reasonable?
[18:11:47] marxarelli: as long as exporting on mariadb 10.1 and importing on .4 won't cause issues
[18:13:55] Beta-Cluster-Infrastructure: deployment-db05 disk issues - https://phabricator.wikimedia.org/T276968 (Majavah) db06 slave status looks worrying: ` root@127.0.01[(none)]> show slave status\G *************************** 1. row *************************** Slave_IO_State: Master_...
[18:14:02] marxarelli: db06 slave status has some errors :(
[18:14:45] and that's the only replica that we have
[18:15:07] wee!
[18:15:20] what are the errors?
[18:15:48] posted it on the task, but tl;dr is "Client requested master to start replication from impossible position"
[18:16:06] Release-Engineering-Team (Development services), Release-Engineering-Team-TODO (2021-01-01 to 2021-03-31 (Q3)), Scap, User-brennen: Applying security patches should be robust and also give some useful output - https://phabricator.wikimedia.org/T269153 (mmodell) >>! In T269153#6897343, @hashar wro...
[18:16:10] oh yeah, that should be expected i think since db05 is failing
[18:19:10] Agreed.
[18:20:53] !log disabled puppet on deployment-db06 and started mysqldump (T276968)
[18:20:56] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[18:20:56] T276968: deployment-db05 disk issues - https://phabricator.wikimedia.org/T276968
[18:21:00] MediaWiki-Codesniffer: "UnrecognizedAnnotation" sniff conflicts with "PhpunitAnnotations" - https://phabricator.wikimedia.org/T276971 (Krinkle)
[18:21:03] !log create deployment-db07 as g2.cores8.ram16.disk160 Buster T276968
[18:21:07] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[18:21:15] or should it use Cinder?
[18:21:30] andrewbogott, bd808: ^
[18:22:26] Majavah: If you want to do it today then best to just use lvm. If you can wait a day or two I might have a cinder workflow ironed out.
[18:23:18] going with LVM in that case, unfortunately need it right now
[18:25:27] !log "View 'labswiki.tag_summary' references invalid table(s) or column(s) or function(s) or definer/invoker of view lack rights to use them when using LOCK TABLES" during mysqldump on db06 (T276968)
[18:25:31] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[18:25:34] (CR) 20after4: [C: +2] "> Patch Set 7: Code-Review+1" [releng/phatality] - https://gerrit.wikimedia.org/r/668202 (https://phabricator.wikimedia.org/T237682) (owner: Krinkle)
[18:26:10] (CR) 20after4: [V: +2 C: +2] Various updates and alignment with MediaWiki/PHP semantics [releng/phatality] - https://gerrit.wikimedia.org/r/668202 (https://phabricator.wikimedia.org/T237682) (owner: Krinkle)
[18:26:19] Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO (2021-01-01 to 2021-03-31 (Q3)), serviceops, Patch-For-Review: Replace production deployment servers and update them to Buster - https://phabricator.wikimedia.org/T265963 (Papaul)
[18:26:59] (CR) 20after4: [V: +2 C: +2] helpers: Remove search for obsolete 'fatal_error' field [releng/phatality] - https://gerrit.wikimedia.org/r/668201 (owner: Krinkle)
[18:27:11] (PS5) Krinkle: Add .editorconfig [releng/phatality] - https://gerrit.wikimedia.org/r/668203
[18:27:13] (PS5) Krinkle: helpers: Redact trace and url [releng/phatality] - https://gerrit.wikimedia.org/r/668207
[18:31:34] (CR) 20after4: [V: +2 C: +2] Add .editorconfig [releng/phatality] - https://gerrit.wikimedia.org/r/668203 (owner: Krinkle)
[18:32:22] (CR) 20after4: [C: +2] helpers: Redact trace and url [releng/phatality] - https://gerrit.wikimedia.org/r/668207 (owner: Krinkle)
[18:34:14] (CR) 20after4: [V: +2 C: +2] "I haven't tested this but it lgtm." [releng/phatality] - https://gerrit.wikimedia.org/r/668207 (owner: Krinkle)
[18:34:29] (CR) DannyS712: "This change is ready for review." [tools/codesniffer] - https://gerrit.wikimedia.org/r/670120 (https://phabricator.wikimedia.org/T276971) (owner: DannyS712)
[18:38:21] !log installing mariadb 10.4 via role::mariadb::beta to db07 T276968
[18:38:24] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[18:38:25] T276968: deployment-db05 disk issues - https://phabricator.wikimedia.org/T276968
[18:39:29] (CR) Daimona Eaytoy: FunctionAnnotationsSniff: recognize more phpunit annotations (1 comment) [tools/codesniffer] - https://gerrit.wikimedia.org/r/670120 (https://phabricator.wikimedia.org/T276971) (owner: DannyS712)
[18:45:41] Release-Engineering-Team-TODO (2021-01-01 to 2021-03-31 (Q3)), Patch-For-Review, Release, Train Deployments, User-brennen: 1.36.0-wmf.34 deployment blockers - https://phabricator.wikimedia.org/T274938 (brennen) Actions to date: - Patched a copy of `logspam-watch` to consolidate "Prefix sea...
[18:49:14] !log restarting db dump on db06 `mysqldump -h 127.0.0.1 --events --routines --triggers --all-databases -f --single-transaction` (T276968)
[18:49:19] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[18:49:20] T276968: deployment-db05 disk issues - https://phabricator.wikimedia.org/T276968
[18:59:58] Majavah: how'd the role apply go? i see mysqld running/listening
[19:00:12] still dumping the db on db06
[19:02:46] marxarelli: it's running, but I can't connect to it
[19:08:51] (CR) Ahmon Dancy: [C: -1] "I haven't tested this out yet but I hope to soon." (3 comments) [tools/scap] - https://gerrit.wikimedia.org/r/670172 (owner: Lars Wirzenius)
[19:15:06] Release-Engineering-Team-TODO (2021-01-01 to 2021-03-31 (Q3)), Patch-For-Review, Release, Train Deployments, User-brennen: 1.36.0-wmf.34 deployment blockers - https://phabricator.wikimedia.org/T274938 (brennen) Ran `scap clean --delete 1.36.0-wmf.31`. Some errors there: ` cannot delete non-e...
[19:50:26] !log restoring database dump on deployment-db07 (T276968)
[19:50:31] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[19:50:32] T276968: deployment-db05 disk issues - https://phabricator.wikimedia.org/T276968
[19:51:16] marxarelli: will you make db06 or db07 the new master?
[19:51:39] i'm going to make db07 the new master
[19:51:46] we might as well spin up a new replica too
[19:52:46] so we have both db servers on buster with the same version of mariadb
[19:53:12] that may mean deleting db05 to free up quota
[19:53:37] which i see no reason to keep around since the filesystem is corrupt and we can't connect
[19:53:42] dancy: ^ any thoughts?
[19:53:56] either that or I think we can just ask WMCS for temporary quota for the upgrade
[19:54:25] deleting db05 is an option too
[19:56:10] k
[19:56:39] !log deleting deployment-db05 to free up quota for new replica (T276968)
[19:56:43] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[19:56:44] T276968: deployment-db05 disk issues - https://phabricator.wikimedia.org/T276968
[19:58:53] marxarelli: https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/670273/
[19:59:55] !log creating new instance deployment-db08 to use as new beta replica db (T276968)
[20:00:01] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[20:00:18] will you take care of the puppet and mariadb install for db08 or should I?
[20:00:19] Majavah: nice. might as well put db08 in there as the new replica
[20:00:48] should I remove db06 as well? or leave for now?
[20:01:00] leave it for now just in case we need a new dump
[20:01:10] i can handle the role and mariadb init
[20:01:34] i had to manually set a new root pw. not sure what's up with the puppet config there but i'll take a look after fixing this up
[20:01:41] and the puppet cert dance?
[20:01:46] root *mysql* pw
[20:01:52] oh right
[20:01:58] (CR) Jeena Huneidi: [C: -1] "failing systemtests" [integration/pipelinelib] - https://gerrit.wikimedia.org/r/668245 (https://phabricator.wikimedia.org/T269902) (owner: Jeena Huneidi)
[20:02:14] Majavah: well, in that case. if you don't mind doing the dance, that would be really helpful
[20:02:27] it would probably be faster than me double checking the docs :)
[20:03:05] sure, done
[20:03:48] tl;dr is ssh into the new instance, sudo rm -rf /var/lib/puppet/ssl && sudo run-puppet-agent, ssh into deployment-puppetmaster04 and sudo puppet ca sign deployment-db08.deployment-prep.eqiad1.wikimedia.cloud, then sudo run-puppet-agent on the new instance again
[20:04:12] now has puppet installed but not the hiera/roles for mariadb
[20:04:28] also amended the config patch to add the new replica
[20:12:01] Majavah: please do ping me if i can be of any help :)
[20:14:11] Urbanecm: ask marxarelli :P
[20:15:51] I'm happy to help both, if i can :D
[20:17:03] are the db grants puppetised or do they need to be added manually somehow?
[20:17:35] was puppet run for the first time?
[20:18:09] puppet is running on the new instances, but for example it had some issues with connecting to the database
[20:18:46] Majavah: i can't find any mysql process on db08
[20:19:08] Majavah, Urbanecm: i appreciate the help!
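The puppet certificate steps described at 20:03:48 can be written down as an ordered plan. The sketch below only builds and prints the commands; it does not run them. The host names and commands are the ones quoted in that message, while the function names `puppet_cert_dance` and `as_ssh_commands` are hypothetical, as is the assumption that each step is run through a plain `ssh <host> <command>` invocation.

```python
import shlex


def puppet_cert_dance(instance_fqdn, puppetmaster="deployment-puppetmaster04"):
    """Return the (host, command) steps from the 20:03:48 message, in order."""
    return [
        # 1. On the new instance: drop the old SSL state and run the agent.
        (instance_fqdn, "sudo rm -rf /var/lib/puppet/ssl && sudo run-puppet-agent"),
        # 2. On the puppetmaster: sign the new instance's certificate.
        (puppetmaster, f"sudo puppet ca sign {shlex.quote(instance_fqdn)}"),
        # 3. Back on the new instance: run the agent again.
        (instance_fqdn, "sudo run-puppet-agent"),
    ]


def as_ssh_commands(steps):
    """Render each step as an ssh command line (assumed transport, dry run only)."""
    return [f"ssh {host} {shlex.quote(command)}" for host, command in steps]


if __name__ == "__main__":
    plan = puppet_cert_dance("deployment-db08.deployment-prep.eqiad1.wikimedia.cloud")
    for line in as_ssh_commands(plan):
        print(line)
```

Keeping the plan as data makes it easy to review before anything touches the puppetmaster.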
[20:19:20] currently restoring on db07
[20:19:38] i forgot that bin logging needed to be turned off first so i've done that and restarted the restore
[20:20:54] Urbanecm: now that I think of it, could you set ops/mw-config repo to enforce beta readonly? so that even when we merge the patch to add the new master and replica, nothing starts writing and potentially breaking replication
[20:20:59] sure
[20:21:05] I'll do it now
[20:21:35] marxarelli: do you want me to install mariadb on db08 in the meantime?
[20:21:48] yes please
[20:22:35] Release-Engineering-Team-TODO (2021-01-01 to 2021-03-31 (Q3)), SRE, GitLab (Initialization), User-brennen: Define auth strategy for GitLab - https://phabricator.wikimedia.org/T274461 (Sergey.Trofimovsky.SF) >>! In T274461#6895448, @jbond wrote: > I just wanted to note that the SSO project provi...
[20:23:49] Majavah: mind a quick +1 on https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/670277?
[20:25:30] Urbanecm: maybe add a phab link? otherwise lgtm
[20:25:36] good point
[20:25:44] Majavah: do you have the link at hand?
[20:26:02] https://phabricator.wikimedia.org/T276968
[20:26:03] nvm, found it
[20:26:05] thanks
[20:27:17] +2'ed
[20:27:22] brb. i need to grab some lunch from my kitchen before i write any more sql :)
[20:30:20] Release-Engineering-Team-TODO, DNS, SRE, Traffic, and 3 others: DNS for GitLab - https://phabricator.wikimedia.org/T276170 (Dzahn) >>! In T276170#6896563, @wkandek wrote: > gerrit.wikimedia.org lives on a second IP address on gerrit1001. Should we follow that model here as well? It would be appr...
[20:31:04] Majavah: patch got merged. Should be live soon, hopefully
[20:31:09] (syncing manually)
[20:33:01] Majavah: confirmed, MW now enforces read only mode
[20:33:20] thanks!
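The dump flags logged at 18:49:14 and the note at 20:19:38 about turning binary logging off before the restore can be combined into a dry-run sketch. The log does not say how binary logging was disabled; setting `sql_log_bin=0` for the restoring session through the `mysql` client's `--init-command` option is an assumption here, one common way to do it. The function names are hypothetical, and nothing is executed.

```python
import shlex

# Flags exactly as quoted in the 18:49:14 !log entry.
DUMP_FLAGS = [
    "--events",
    "--routines",
    "--triggers",
    "--all-databases",
    "-f",
    "--single-transaction",
]


def dump_command(host="127.0.0.1"):
    """Build the mysqldump argument list used on the old replica."""
    return ["mysqldump", "-h", host, *DUMP_FLAGS]


def restore_command():
    """Build a mysql client invocation that skips binary logging.

    Assumption: the session-level sql_log_bin switch is used; the log only
    says that bin logging was turned off, not how.
    """
    return ["mysql", "--init-command=SET SESSION sql_log_bin=0"]


if __name__ == "__main__":
    # Print shell-safe command lines for review; redirecting to and from a
    # dump file is left to the operator.
    print(shlex.join(dump_command()))
    print(shlex.join(restore_command()))
```

Disabling the binary log for the restoring session keeps a multi-gigabyte import from being written to the binlog of the host that is about to become the new master.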
[20:33:50] !log install mariadb on deployment-db08 T276968
[20:33:53] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[20:33:54] T276968: deployment-db05 disk issues - https://phabricator.wikimedia.org/T276968
[20:34:44] * marxarelli back
[20:36:46] marxarelli: db08 now has mariadb installed but with the same problem with connecting, please do your magic and document it on the task
[20:36:59] Majavah: will do. thank you
[20:37:26] https://github.com/wikimedia/puppet/blob/production/modules/profile/manifests/mariadb/beta.pp does not include grants at a quick look, we need to fish them somewhere
[20:37:52] i'm confused about why it won't connect over the unix socket. maybe there's a ferm rule in place or something
[20:38:09] because afaik, the default root@localhost user has no password
[20:39:11] the install script says that it should let root@localhost and mysql@localhost connect via the socket
[20:39:42] !log doing `--skip-grant-tables` on deployment-db08 and creating a new root@127.0.0.1 user (T276968)
[20:39:46] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[20:39:46] T276968: deployment-db05 disk issues - https://phabricator.wikimedia.org/T276968
[20:39:48] something that should be investigated at some point, but not sure if this is the right moment for that
[20:40:26] Majavah: right. and lsof says mysqld is listening at /tmp/mysql.sock
[20:40:37] but that path doesn't exist
[20:41:17] Beta-Cluster-Infrastructure, Quality-and-Test-Engineering-Team (QTE): Upgrade deployment-prep-db hosts to buster/MariaDB 10.4 - https://phabricator.wikimedia.org/T268628 (Majavah) beta is getting 10.4 replicas as part of {T276968}
[20:41:53] Beta-Cluster-Infrastructure, Quality-and-Test-Engineering-Team (QTE): Upgrade deployment-prep-db hosts to buster/MariaDB 10.4 - https://phabricator.wikimedia.org/T268628 (Majavah)
[20:41:55] Beta-Cluster-Infrastructure, Patch-For-Review: deployment-db05 disk issues - https://phabricator.wikimedia.org/T276968 (Majavah)
[20:42:11] weird
[20:42:48] how's the import going?
[20:42:55] slooowly
[20:43:52] well, 10G out of ~ 40G
[20:43:57] not too slowly i guess
[20:50:16] the production db grants are in https://github.com/wikimedia/puppet/blob/production/modules/role/templates/mariadb/grants/production-core.sql.erb, but I can't find beta grants stored anywhere
[20:53:05] !log restore on db07 failed. appears to be a bug w/ mariadb/mysqldump 10.4 compat https://jira.mariadb.org/browse/MDEV-22127
[20:53:09] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[20:53:33] !log restore on db07 failed. appears to be a bug w/ mariadb/mysqldump 10.4 compat https://jira.mariadb.org/browse/MDEV-22127 (T276968)
[20:53:37] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[20:53:37] T276968: deployment-db05 disk issues - https://phabricator.wikimedia.org/T276968
[20:54:35] uhh
[20:55:59] note that https://wikitech.wikimedia.org/wiki/Nova_Resource:Deployment-prep/MariaDB_Slave_instance_setup lists some other tools
[20:56:41] er, db07 is not responding...
[20:56:56] oh, thank you. yeah i was looking for those docs earlier
[20:57:07] i tried mariabackup but it failed for some obscure reason
[20:57:57] and https://wikitech.wikimedia.org/wiki/Setting_up_a_MySQL_replica
[20:58:13] but they say to do that before mysqld, which we already did
[21:01:06] It's starting to get fairly late so I think i'm going to bed. Please keep the task updated, I can continue looking tomorrow morning if still needed
[21:01:47] marxarelli, Urbanecm: ^
[21:01:53] Majavah: ok. thanks for your help
[21:14:08] (CR) DannyS712: FunctionAnnotationsSniff: recognize more phpunit annotations (1 comment) [tools/codesniffer] - https://gerrit.wikimedia.org/r/670120 (https://phabricator.wikimedia.org/T276971) (owner: DannyS712)
[21:51:59] puppet dance
[21:54:40] !log restoring from db06 dump on db07 and db08 following `DROP VIEW IF EXISTS user` workaround (T276968)
[21:54:43] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[21:54:44] T276968: deployment-db05 disk issues - https://phabricator.wikimedia.org/T276968
[21:57:32] looks like the restore is a little farther along than last time
[21:58:06] * marxarelli goes for a walk while the restore is running
[22:13:31] (PS1) Ahmon Dancy: feat: Add train-dev ssh subcommand [tools/train-dev] - https://gerrit.wikimedia.org/r/670311
[23:03:32] (PS1) Ahmon Dancy: deploy-promote: Don't use a fixed branch for train-dev [tools/release] - https://gerrit.wikimedia.org/r/670315
[23:05:03] (PS2) Ahmon Dancy: deploy-promote: Don't use a fixed branch for train-dev [tools/release] - https://gerrit.wikimedia.org/r/670315
[23:05:54] (CR) Ahmon Dancy: "Something for upcoming train-dev changes." [tools/release] - https://gerrit.wikimedia.org/r/670315 (owner: Ahmon Dancy)
[23:12:10] (PS1) Ahmon Dancy: Use train-dev branch of operations/mediawiki-config [tools/train-dev] - https://gerrit.wikimedia.org/r/670316
[23:12:18] (PS1) Hashar: Add /cache to wikimedia-fundraising-civicrm-docker [integration/config] - https://gerrit.wikimedia.org/r/670317 (https://phabricator.wikimedia.org/T276983)
[23:21:19] (CR) Hashar: "I have updated the Jenkins job ! :)" [integration/config] - https://gerrit.wikimedia.org/r/670317 (https://phabricator.wikimedia.org/T276983) (owner: Hashar)
[23:25:29] MediaWiki-Codesniffer, Wikidata, wdwb-tech-focus: Consider adding bash scripts to mediawiki-codesniffer for only running on touched files - https://phabricator.wikimedia.org/T251533 (Addshore) >>! In T251533#6142372, @Umherirrender wrote: > There is a bootstrap for use of CI/jenkins, but such a way h...
[23:33:49] Release-Engineering-Team-TODO, DNS, SRE, Traffic, and 3 others: DNS for GitLab - https://phabricator.wikimedia.org/T276170 (wkandek) Let's go with the simpler solution and use the CNAME.
[23:39:22] Continuous-Integration-Config, MediaWiki-Core-Testing, MediaWiki-Vendor: Ensure with test that composer.json matches between mediawiki/vendor and mediawiki/core - https://phabricator.wikimedia.org/T113360 (Addshore) >>! In T113360#6889415, @Jdforrester-WMF wrote: > Surely we implicitly test this, bec...
[23:49:02] Release-Engineering-Team, MW-on-K8s, serviceops: Investigate how we can provide an mwdebug functionality on kubernetes - https://phabricator.wikimedia.org/T276994 (jijiki)