[01:24:06] PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [10.0]
[01:47:11] hmm...
[01:49:58] so I don't know what shinken-wm is smoking because the site appears to be up for me and no longer read-only either
[01:51:11] \o/
[01:51:34] That could be errors twentyafterfour
[01:51:39] Rather than uptime
[01:52:19] Beta-Cluster-Infrastructure, Release-Engineering-Team (Kanban), Patch-For-Review: Recover from corrupted beta MySQL slave (deployment-db04) - https://phabricator.wikimedia.org/T216067 (mmodell) it appears that we're out of read-only and replicating properly
[01:54:59] can't seem to remember a valid user/pass for beta logstash
[01:55:42] I think Firefox may reveal where you can get that info from
[01:59:54] “Logstash (ssh deployment-deploy01.deployment-prep.eqiad.wmflabs sudo cat /root/secrets.txt)”
[01:59:59] ah got it
[02:01:18] cool, looks like it's all good
[02:04:15] twentyafterfour: thanks mukunda :)
[02:04:32] take the fun of updating that task :)
[02:06:28] Beta-Cluster-Infrastructure, Release-Engineering-Team (Kanban), Patch-For-Review, User-Ryasmeen: Recover from corrupted beta MySQL slave (deployment-db04) - https://phabricator.wikimedia.org/T216067 (mmodell) Open→Resolved beta logstash looks good. we're back folks. Ping @RazShuty, @Otto...
[02:07:28] we should probably document the different commands on the page
[02:09:05] Beta-Cluster-Infrastructure, Release-Engineering-Team (Kanban), Patch-For-Review, User-Ryasmeen: Recover from corrupted beta MySQL slave (deployment-db04) - https://phabricator.wikimedia.org/T216067 (Ryasmeen) Yay! Thanks so much @mmodell and everyone else involved in fixing this.
[02:09:05] Beta-Cluster-Infrastructure, Release-Engineering-Team, DBA: MySQL database on deployment-db03 does not start due to InnoDB issue - https://phabricator.wikimedia.org/T216635 (mmodell) Stalled→Declined I think we should just delete db03 now that we've got db04 and db05 going. Can we snapshot t...
[02:09:41] twentyafterfour, btw I feel like there was some crashed table somewhere on one of those hosts
[02:09:56] should ensure that's taken care of, might be possible to extract non-broken data from the other host
[02:10:08] IIRC under centralauth?
[02:49:07] PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [10.0]
[03:51:33] Continuous-Integration-Config, MediaWiki-extensions-General, MW-1.32-notes (WMF-deploy-2018-05-08 (1.32.0-wmf.3)): Add phan to MediaWiki extensions and skins for static analysis [cloneable] - https://phabricator.wikimedia.org/T179554 (Tgr) Should this be closed as resolved, or moved to #outreach-prog...
[04:04:09] PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [10.0]
[04:27:36] Krenair: thanks I'll look into it
[05:01:40] Beta-Cluster-Infrastructure, Release-Engineering-Team (Kanban), Patch-For-Review, User-Ryasmeen: Recover from corrupted beta MySQL slave (deployment-db04) - https://phabricator.wikimedia.org/T216067 (RazShuty) Omg you're all amazing! Thanks a lot!!!! Much much much appreciated!
[05:39:45] Phabricator, Release-Engineering-Team (Kanban): Unable to edit Herald rule, "unhandled exception" - https://phabricator.wikimedia.org/T217082 (Aklapper)
[05:39:48] Phabricator, Regression: Cannot edit Herald rules: "Argument 1 passed to HeraldTokenizerFieldValue::setValueMap() must be of the type array, object given" - https://phabricator.wikimedia.org/T216785 (Aklapper)
[05:40:36] Phabricator, Proton, Reading-Infrastructure-Team-Backlog: Update Herald (H228) to include project [insert project number here] - https://phabricator.wikimedia.org/T217078 (Aklapper)
[05:40:39] Phabricator, Release-Engineering-Team (Kanban): Unable to edit Herald rule, "unhandled exception" - https://phabricator.wikimedia.org/T217082 (Aklapper) duplicate→Open
[05:40:44] Phabricator, Regression: Cannot edit Herald rules: "Argument 1 passed to HeraldTokenizerFieldValue::setValueMap() must be of the type array, object given" - https://phabricator.wikimedia.org/T216785 (Aklapper)
[05:40:47] Phabricator, Release-Engineering-Team (Kanban): Unable to edit Herald rule, "unhandled exception" - https://phabricator.wikimedia.org/T217082 (Aklapper)
[05:41:09] Phabricator, Release-Engineering-Team (Kanban): Unable to edit Herald rules:: "Argument 1 passed to HeraldTokenizerFieldValue::setValueMap() must be of the type array, object given" - https://phabricator.wikimedia.org/T217082 (Aklapper)
[05:49:11] PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [10.0]
[06:09:09] PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [10.0]
[06:28:58] Continuous-Integration-Infrastructure, docker-pkg, Patch-For-Review: Pruning docker-pkg images - https://phabricator.wikimedia.org/T207703 (Joe) >>!
In T207703#4982786, @thcipriani wrote: > I think this just needs to be deployed at this point (which looking at `/srv/deployment` maybe hasn't happened...
[06:49:04] Phabricator-Bot-Requests, WMSE-Organisational-Development-2019, User-Sebastian_Berlin-WMSE: Create bot for adding WMSE's projects - https://phabricator.wikimedia.org/T211351 (Sebastian_Berlin-WMSE) Ok, I'll keep digging. Here's the error log, by the way: ` lines=10 $ arc tasks --unassigned --limit...
[06:56:51] Phabricator-Bot-Requests, WMSE-Organisational-Development-2019, User-Sebastian_Berlin-WMSE: Create bot for adding WMSE's projects - https://phabricator.wikimedia.org/T211351 (Aklapper) Hmm, I wonder if the "cert" stuff in the docs is outdated (but I'm rather clueless). Could you please give this in ....
[06:58:49] Phabricator-Bot-Requests, WMSE-Organisational-Development-2019, User-Sebastian_Berlin-WMSE: Create bot for adding WMSE's projects - https://phabricator.wikimedia.org/T211351 (Sebastian_Berlin-WMSE) Open→Resolved a:Sebastian_Berlin-WMSE Yes, that fixed it. Thanks!
[07:00:03] Phabricator-Bot-Requests, WMSE-Organisational-Development-2019, User-Sebastian_Berlin-WMSE: Create bot for adding WMSE's projects - https://phabricator.wikimedia.org/T211351 (Sebastian_Berlin-WMSE)
[07:00:32] Phabricator-Bot-Requests, WMSE-Organisational-Development-2019, User-Sebastian_Berlin-WMSE: Create bot for adding WMSE's projects - https://phabricator.wikimedia.org/T211351 (Aklapper) Thanks! (And sorry.) I've updated https://www.mediawiki.org/w/index.php?title=Phabricator%2FBots&type=revision&diff=...
[07:19:08] PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [10.0]
[07:49:09] PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [10.0]
[08:14:31] (PS2) Alexandros Kosiaris: Install heapdump and gc-stats when env production [blubber] - https://gerrit.wikimedia.org/r/492922 (https://phabricator.wikimedia.org/T205911)
[08:22:51] Release-Engineering-Team (Watching / External), Operations, Release Pipeline, Core Platform Team Backlog (Watching / External), and 2 others: Track and install additional npm packages for all service container images - https://phabricator.wikimedia.org/T205911 (akosiaris) I 've just added a simpl...
[09:05:37] (CR) Hashar: [C: +1] Setup Basic CI for mediawiki/services/kask (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/490678 (https://phabricator.wikimedia.org/T209106) (owner: Clarakosi)
[09:08:17] (PS1) Hashar: Register mediawiki/extensions/CollapsibleSections [integration/config] - https://gerrit.wikimedia.org/r/492972
[09:08:37] (CR) Hashar: [C: +2] Register mediawiki/extensions/CollapsibleSections [integration/config] - https://gerrit.wikimedia.org/r/492972 (owner: Hashar)
[09:09:09] PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [10.0]
[09:09:57] (Merged) jenkins-bot: Register mediawiki/extensions/CollapsibleSections [integration/config] - https://gerrit.wikimedia.org/r/492972 (owner: Hashar)
[09:10:58] (PS1) Hashar: Register mediawiki/extensions/TEI [integration/config] - https://gerrit.wikimedia.org/r/492973
[09:11:09] (CR) Hashar: [C: +2] Register mediawiki/extensions/TEI [integration/config] - https://gerrit.wikimedia.org/r/492973 (owner: Hashar)
[09:12:30] (Merged) jenkins-bot: Register mediawiki/extensions/TEI [integration/config] -
https://gerrit.wikimedia.org/r/492973 (owner: Hashar)
[09:18:16] Continuous-Integration-Config, Tracking: Add CI to all Gerrit repositories - https://phabricator.wikimedia.org/T180317 (hashar)
[09:18:23] Continuous-Integration-Config, Release-Engineering-Team (Kanban), Security-Team, Patch-For-Review: Add tests/CI to wikimedia/security/tooling - https://phabricator.wikimedia.org/T216801 (hashar) Open→Resolved ;)
[09:20:15] Continuous-Integration-Config, Release-Engineering-Team (Kanban), Security-Team: Add tests/CI to wikimedia/security/puppet - https://phabricator.wikimedia.org/T217123 (hashar)
[09:30:59] !log remove now-merged node-exporter timer disable, cherry pick https://gerrit.wikimedia.org/r/c/operations/puppet/+/492632
[09:31:29] stashbot: :(
[09:32:42] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[09:32:42] See https://wikitech.wikimedia.org/wiki/Tool:Stashbot for help.
[09:33:48] godog: seems it is just lagged out hehe
[09:41:25] hashar: indeed, sure enough pink timeout
[09:44:15] hashar: could you take a look at https://gerrit.wikimedia.org/r/c/operations/puppet/+/492633 when you get a chance? and/or add other folks that might be interested?
[09:46:06] godog: no clue? :)
[09:46:24] does that mean currently we send jetty and gerrit logs twice?
[09:46:39] it does yeah
[09:46:44] (once via logstash and once via the logging pipeline)
[09:46:58] and we probably had some custom rules in logstash as well
[09:47:29] or maybe that was just the generic log4j rule ./modules/logstash/templates/input/log4j.erb
[09:48:11] yeah afaics processing was verbatim for gerrit/log4j
[09:49:14] hashar: do you know who else other than the folks already in the review might be interested in reviewing that change?
[09:49:30] tyler and i
[09:50:27] godog: just do it. I have +1ed it
[09:51:12] kk, thanks hashar !
[09:51:23] eventually I will have to dig into the Gerrit logging system
[09:51:33] entries lack metadata :/
[09:52:51] and we lack logs anyway ;D
[09:54:27] (PS4) Hashar: gerrit: Set zuul-status plugin url [All-Projects] (refs/meta/config) - https://gerrit.wikimedia.org/r/488512 (https://phabricator.wikimedia.org/T214068) (owner: Paladox)
[09:54:41] ah? meaning there's things gerrit isn't logging but should ?
[09:54:46] (CR) Hashar: [V: +2 C: +2] "Hope for the best." [All-Projects] (refs/meta/config) - https://gerrit.wikimedia.org/r/488512 (https://phabricator.wikimedia.org/T214068) (owner: Paladox)
[10:12:25] (PS1) Vgutierrez: Add acme-chief-tox-docker tests for operations/software/acme-chief [integration/config] - https://gerrit.wikimedia.org/r/492982 (https://phabricator.wikimedia.org/T207389)
[10:12:38] (PS1) Alex Monk: Copy rights setup over from certcentral.git instead of inheriting [software/acme-chief] (refs/meta/config) - https://gerrit.wikimedia.org/r/492983
[10:20:41] (CR) Vgutierrez: [C: +1] "LGTM" [integration/config] - https://gerrit.wikimedia.org/r/492982 (https://phabricator.wikimedia.org/T207389) (owner: Vgutierrez)
[10:25:14] maintenance-disconnect-full-disks build 50080 integration-slave-docker-1053 (/: 95%): OFFLINE due to disk space
[10:41:17] hashar: :)
[10:44:31] (CR) Hashar: "log/* -> log/ that is a nice one thank you :)" [integration/config] - https://gerrit.wikimedia.org/r/492401 (https://phabricator.wikimedia.org/T216622) (owner: Krinkle)
[10:49:05] (CR) Hashar: [C: +2] "The job already exist so that is all fine :)" [integration/config] - https://gerrit.wikimedia.org/r/492982 (https://phabricator.wikimedia.org/T207389) (owner: Vgutierrez)
[10:49:39] grr
[10:49:42] that integration-slave-docker-1053 is wrong
[10:50:13] maintenance-disconnect-full-disks build
50085 integration-slave-docker-1053: OFFLINE due to disk space
[10:50:51] (Merged) jenkins-bot: Add acme-chief-tox-docker tests for operations/software/acme-chief [integration/config] - https://gerrit.wikimedia.org/r/492982 (https://phabricator.wikimedia.org/T207389) (owner: Vgutierrez)
[10:53:56] (CR) Hashar: [C: +2] "I have deployed the change" [integration/config] - https://gerrit.wikimedia.org/r/492982 (https://phabricator.wikimedia.org/T207389) (owner: Vgutierrez)
[11:01:47] (CR) Hashar: [C: +2] Pin flake8 to 3.6.x [integration/quibble] - https://gerrit.wikimedia.org/r/491937 (owner: Hashar)
[11:02:00] (CR) Hashar: [C: +2] doc: quibble-stretch no more has php [integration/quibble] - https://gerrit.wikimedia.org/r/491934 (owner: Hashar)
[11:02:33] (Merged) jenkins-bot: Pin flake8 to 3.6.x [integration/quibble] - https://gerrit.wikimedia.org/r/491937 (owner: Hashar)
[11:02:35] (CR) Hashar: [C: +2] Upgrade to flake8 3.7.x [integration/quibble] - https://gerrit.wikimedia.org/r/491938 (owner: Hashar)
[11:02:41] (Merged) jenkins-bot: doc: quibble-stretch no more has php [integration/quibble] - https://gerrit.wikimedia.org/r/491934 (owner: Hashar)
[11:03:08] (CR) jenkins-bot: Pin flake8 to 3.6.x [integration/quibble] - https://gerrit.wikimedia.org/r/491937 (owner: Hashar)
[11:03:32] (CR) jenkins-bot: doc: quibble-stretch no more has php [integration/quibble] - https://gerrit.wikimedia.org/r/491934 (owner: Hashar)
[11:03:47] (Merged) jenkins-bot: Upgrade to flake8 3.7.x [integration/quibble] - https://gerrit.wikimedia.org/r/491938 (owner: Hashar)
[11:04:12] (CR) jenkins-bot: Upgrade to flake8 3.7.x [integration/quibble] - https://gerrit.wikimedia.org/r/491938 (owner: Hashar)
[11:05:09] (PS2) Hashar: fabfile: simplify a string concatenation [integration/config] - https://gerrit.wikimedia.org/r/487822
[11:05:17] (PS2) Hashar: Update flake8 3.5.0 to 3.7.4
[integration/config] - https://gerrit.wikimedia.org/r/487824
[11:07:05] (CR) Hashar: [C: +2] fabfile: simplify a string concatenation [integration/config] - https://gerrit.wikimedia.org/r/487822 (owner: Hashar)
[11:07:23] (PS3) Hashar: Update flake8 3.5.0 to 3.7.7 [integration/config] - https://gerrit.wikimedia.org/r/487824
[11:07:32] (CR) Hashar: [C: +2] Update flake8 3.5.0 to 3.7.7 [integration/config] - https://gerrit.wikimedia.org/r/487824 (owner: Hashar)
[11:08:31] (Merged) jenkins-bot: fabfile: simplify a string concatenation [integration/config] - https://gerrit.wikimedia.org/r/487822 (owner: Hashar)
[11:08:56] (Merged) jenkins-bot: Update flake8 3.5.0 to 3.7.7 [integration/config] - https://gerrit.wikimedia.org/r/487824 (owner: Hashar)
[11:12:33] (PS1) Hashar: Allow JenkinsBot to submit changes [eventlogging] (refs/meta/config) - https://gerrit.wikimedia.org/r/492998
[11:12:49] (PS2) Hashar: Allow JenkinsBot to submit changes [eventlogging] (refs/meta/config) - https://gerrit.wikimedia.org/r/492998 (https://phabricator.wikimedia.org/T212396)
[11:13:09] (CR) Hashar: [V: +2 C: +2] Allow JenkinsBot to submit changes [eventlogging] (refs/meta/config) - https://gerrit.wikimedia.org/r/492998 (https://phabricator.wikimedia.org/T212396) (owner: Hashar)
[12:46:33] hashar: about wmf-quibble/mediawiki okay if I try to patch that later?
[13:28:26] (CR) Vgutierrez: "> Patch Set 1:" [integration/config] - https://gerrit.wikimedia.org/r/492982 (https://phabricator.wikimedia.org/T207389) (owner: Vgutierrez)
[13:39:02] Krinkle: sorry I have missed your ping
[13:39:14] I guess the job filters are broken somehow :((((
[14:01:56] (PS5) Thcipriani: Improve Deployment Pipeline/Gerrit feedback [integration/pipelinelib] - https://gerrit.wikimedia.org/r/492225
[14:02:02] (CR) Thcipriani: Improve Deployment Pipeline/Gerrit feedback (13 comments) [integration/pipelinelib] - https://gerrit.wikimedia.org/r/492225 (owner: Thcipriani)
[14:02:06] hashar: Krinkle: I guess this is the same issue? https://integration.wikimedia.org/ci/job/quibble-vendor-mysql-hhvm-docker/36799/console
[14:02:13] https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/CentralNotice/+/492486/
[14:19:06] Beta-Cluster-Infrastructure, Release-Engineering-Team (Kanban), Patch-For-Review, User-Ryasmeen: Recover from corrupted beta MySQL slave (deployment-db04) - https://phabricator.wikimedia.org/T216067 (Ottomata) Thank youuuu!
[14:29:09] (CR) Lars Wirzenius: [C: +1] "Looks good to me." [integration/pipelinelib] - https://gerrit.wikimedia.org/r/492225 (owner: Thcipriani)
[14:51:09] Project beta-scap-eqiad build #239353: FAILURE in 8.7 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/239353/
[14:56:27] Release-Engineering-Team (Kanban), Release Pipeline, Patch-For-Review: Refactor integration/pipelinelib to use blubberoid.wikimedia.org - https://phabricator.wikimedia.org/T212247 (thcipriani) Open→Resolved a:dduvall
[15:02:37] Beta-Cluster-Infrastructure, Release-Engineering-Team (Kanban), Patch-For-Review, User-Ryasmeen: Recover from corrupted beta MySQL slave (deployment-db04) - https://phabricator.wikimedia.org/T216067 (ggellerman) Thanks, @mmodell !
[15:04:38] Yippee, build fixed!
[15:04:39] Project beta-scap-eqiad build #239354: FIXED in 10 min: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/239354/
[15:29:00] Release-Engineering-Team (Next), Release Pipeline, Patch-For-Review: Define pipeline failure developer feedback - https://phabricator.wikimedia.org/T177868 (Ottomata) In addition to failures, it'd be nice to know when a tested and built image is ready and available for deployment.
[15:29:08] PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [10.0]
[15:30:05] Anyone know/remember offhand how to check if someone has 2FA enabled on phab?
[15:31:45] Reedy: https://phabricator.wikimedia.org/people/query/advanced/ has "Has MFA"
[15:32:13] cheers
[15:33:14] Release-Engineering-Team (Next), Release Pipeline, Patch-For-Review: Define pipeline failure developer feedback - https://phabricator.wikimedia.org/T177868 (thcipriani) We currently have 2 paths to an image being built and pushed: 1. Gerrit patch merging, triggers a "post-merge" job to build and pus...
[15:51:33] Release-Engineering-Team (Kanban), MW-1.32-notes, MW-1.32-release: Release 1.32.1 as a maintenance release - https://phabricator.wikimedia.org/T213595 (mmodell) I //still// haven't announced this release. I've been overwhelmed with phabricator work and that doesn't seem to be easing up much this week.
[16:00:07] Continuous-Integration-Infrastructure (shipyard), Release-Engineering-Team (Kanban), Patch-For-Review: Phase out Nodepool from production - https://phabricator.wikimedia.org/T209361 (RobH)
[16:00:12] Continuous-Integration-Infrastructure (shipyard), Release-Engineering-Team (Kanban), LDAP-Access-Requests: Disable nodepoolmanager user in LDAP - https://phabricator.wikimedia.org/T217064 (RobH) Open→Resolved a:RobH Thanks to @volans for pointing out to me we have an offboard script to ha...
[16:00:16] Continuous-Integration-Infrastructure (shipyard), Release-Engineering-Team (Kanban), LDAP-Access-Requests: Disable nodepoolmanager user in LDAP - https://phabricator.wikimedia.org/T217064 (RobH) a:RobH→None
[16:01:11] Continuous-Integration-Config, Lexicographical data, Wikidata, Patch-For-Review, Wikidata-Campsite (Wikidata-Campsite-Iteration-∞): Enable phan checks for WikibaseLexeme extension - https://phabricator.wikimedia.org/T215556 (Addshore)
[16:01:22] Continuous-Integration-Config, Lexicographical data, Wikidata, Patch-For-Review, Wikidata-Campsite (Wikidata-Campsite-Iteration-∞): Enable phan checks for WikibaseLexeme extension - https://phabricator.wikimedia.org/T215556 (Addshore)
[16:03:54] (PS2) Giuseppe Lavagetto: Fix OpcacheManager.invalidate_all() [tools/scap] - https://gerrit.wikimedia.org/r/492738
[16:11:18] Continuous-Integration-Config, Lexicographical data, Wikidata, Patch-For-Review, and 2 others: Enable phan checks for WikibaseLexeme extension - https://phabricator.wikimedia.org/T215556 (Addshore) a:Ladsgroup
[16:11:39] !log Generating 1.33.0-wmf.19 deploy notes https://integration.wikimedia.org/ci/job/train-deploy-notes/9/console | T206673
[16:11:42] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[16:11:42] T206673: 1.33.0-wmf.19 deployment blockers - https://phabricator.wikimedia.org/T206673
[16:15:00] Release-Engineering-Team (Kanban), Patch-For-Review, Release, Train Deployments: 1.33.0-wmf.19 deployment blockers - https://phabricator.wikimedia.org/T206673 (hashar) a:hashar
[16:18:41] (PS3) Thcipriani: Train notes: automatically upload changelog [integration/config] - https://gerrit.wikimedia.org/r/492758
[16:23:37] (PS1) Ladsgroup: Enable phan for WikibaseLexeme [integration/config] - https://gerrit.wikimedia.org/r/493061 (https://phabricator.wikimedia.org/T215556)
[16:25:00] Release-Engineering-Team (Kanban),
Patch-For-Review, Release, Train Deployments: 1.33.0-wmf.19 deployment blockers - https://phabricator.wikimedia.org/T206673 (hashar)
[16:29:10] PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [10.0]
[16:37:44] Release-Engineering-Team (Kanban), serviceops, Release Pipeline (Blubber): Add k8s credentials for Blubberoid continuous deployment - https://phabricator.wikimedia.org/T217147 (thcipriani)
[16:38:12] Release-Engineering-Team (Kanban), serviceops, Release Pipeline (Blubber): Add k8s credentials for Blubberoid continuous deployment - https://phabricator.wikimedia.org/T217147 (thcipriani) a:LarsWirzenius→None
[16:39:56] thcipriani i've done https://gerrit.wikimedia.org/r/#/c/operations/software/gerrit/+/493066/-1..2 (lfs has had a fix for over ssh).
[16:42:46] Builds locally so good to merge!
[16:46:46] !log added Cparle to deployment-prep
[16:46:47] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[16:54:08] PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [10.0]
[17:34:38] Release-Engineering-Team (Kanban), Release Pipeline, Patch-For-Review: Pipeline image build cleanup - https://phabricator.wikimedia.org/T177867 (mmodell) maybe possibly helpful? docker tag tool: https://github.com/gofunky/tuplip
[17:44:07] RECOVERY - Mediawiki Error Rate on graphite-labs is OK: OK: Less than 1.00% above the threshold [1.0]
[17:45:22] question: who should i talk to in order to get the rights to create support pages for testing out QuickSurveys on beta enwiki? i.e.
creating pages like: https://en.wikipedia.beta.wmflabs.org/wiki/MediaWiki:Reader-trust-1-message
[18:01:10] !log deployment-prep: installing elastic 6.5.4 to deployment-elastic* machines
[18:01:11] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[18:15:37] AndyRussG: I don't know the context for that ping, but the build failure in CN is not related to what I was mentioning to Antoine about wmf-quibble/mediawiki-quibble, that was about the inefficiency of running both. No actual failures.
[18:16:10] Krinkle: ah ok! thx
[18:17:12] Krinkle: in a meeting, thanks for the Gerrit comment :)
[18:17:22] Release-Engineering-Team (Kanban), Patch-For-Review, Release, Train Deployments: 1.33.0-wmf.19 deployment blockers - https://phabricator.wikimedia.org/T206673 (Jdlrobson)
[18:24:01] Beta-Cluster-Infrastructure, Discovery-Search, MediaWiki-Search, Services (next): Beta Cluster search box displays unexisting pages as results - https://phabricator.wikimedia.org/T186993 (EBernhardson)
[18:27:09] isaacj: what rights do you need and what's your username?
[18:29:08] (PS1) Thcipriani: Pipeline: use wmf base image [blubber] - https://gerrit.wikimedia.org/r/493090
[18:29:57] ALL OF THE RIGHTS
[18:30:18] thanks greg-g: page creation rights. if i go to a page that i want to create (e.g., https://en.wikipedia.beta.wmflabs.org/wiki/MediaWiki:Reader-demographics-1-message), there is no way for me to create it. my user name is "Isaac" (https://en.wikipedia.beta.wmflabs.org/wiki/User:Isaac) -- I couldn't add the "WMF" when generating the account as I think that's been blacklisted :)
[18:30:32] let me know if i can do anything else re: verification etc.
[18:31:00] Mostly we don't care for beta
[18:31:33] https://en.wikipedia.beta.wmflabs.org/wiki/Special:UserRights/Isaac which group(s) do you want? :)
[18:32:48] If it's just messages..
Possibly just administrator I think
[18:34:50] based on https://en.wikipedia.beta.wmflabs.org/wiki/Special:UserRights/Bmansurov_(WMF) who has also generated these pages in the past, I think it's "Interface administrators"
[18:35:25] thanks for helping figure this out...not my domain either :)
[18:35:36] editinterface right looks to be the key there
[18:35:50] As it's not js, json and css
[18:36:05] isaacj: You should be able to create that page now
[18:36:37] yep - thanks Reedy and greg-g !!
[18:36:38] thanks reedy, you beat me
[18:36:49] * greg-g is multitasking and lost
[18:36:53] interface admin might've been enough... But whatever
[18:45:51] Gerrit, Release-Engineering-Team: Deploy multi-site plugin to cobalt and gerrit2001 - https://phabricator.wikimedia.org/T217174 (Paladox)
[18:57:22] (CR) Paladox: "recheck" [integration/zuul] (debian/jessie-wikimedia) - https://gerrit.wikimedia.org/r/491641 (https://phabricator.wikimedia.org/T214807) (owner: Paladox)
[19:21:01] thcipriani https://gerrit.git.wmflabs.org/r/c/testing/test/+/2161#message-b4b1866986220a97d3c0655f11eb93c698f5ffa8
[19:21:34] bd808: so I was looking how to process your repo creation request but I'm not sure how to make replication not to follow the usual scheme
[19:21:42] thcipriani i also figured out how to get it to work constantly now :)
[19:21:49] (ie a hard refresh works now!)
[19:28:10] (CR) Alexandros Kosiaris: [C: +1] Pipeline: use wmf base image [blubber] - https://gerrit.wikimedia.org/r/493090 (owner: Thcipriani)
[19:29:19] I wonder if a status badge would work
[19:34:53] Phabricator: Update H26 to be "first time" instead of "every time" - https://phabricator.wikimedia.org/T217188 (MBinder_WMF)
[19:35:28] Phabricator, Release-Engineering-Team (Kanban): Unable to edit Herald rules:: "Argument 1 passed to HeraldTokenizerFieldValue::setValueMap() must be of the type array, object given" - https://phabricator.wikimedia.org/T217082 (MBinder_WMF)
[19:35:31] Phabricator: Update H26 to be "first time" instead of "every time" - https://phabricator.wikimedia.org/T217188 (MBinder_WMF)
[20:00:54] Project beta-scap-eqiad build #239380: FAILURE in 6.4 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/239380/
[20:14:46] Yippee, build fixed!
[20:14:46] Project beta-scap-eqiad build #239381: FIXED in 10 min: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/239381/
[20:20:23] Beta-Cluster-Infrastructure, Release-Engineering-Team (Kanban), Patch-For-Review, User-Ryasmeen: Recover from corrupted beta MySQL slave (deployment-db04) - https://phabricator.wikimedia.org/T216067 (ArielGlenn) I have a request: can the procedure for setting up the replica/new master/whatever fi...
[20:28:43] Release-Engineering-Team (Kanban), Patch-For-Review, Release, Train Deployments: 1.33.0-wmf.19 deployment blockers - https://phabricator.wikimedia.org/T206673 (hashar)
[21:29:58] Continuous-Integration-Config, Fresnel, Performance-Team: Add "Console section" pattern to Jenkins to make Fresnel results easy to find - https://phabricator.wikimedia.org/T216572 (Krinkle) Open→Resolved {F28290123} {F28290124 size=full}
[21:31:18] Beta-Cluster-Infrastructure, Release-Engineering-Team (Kanban), Patch-For-Review, User-Ryasmeen: Recover from corrupted beta MySQL slave (deployment-db04) - https://phabricator.wikimedia.org/T216067 (mmodell) a:mmodell→None @ArielGlenn it's mostly covered on https://wikitech.wikimedia.org...
[21:51:15] (CR) Thcipriani: [C: +2] "Nice! Looks like everything should work." (1 comment) [tools/scap] - https://gerrit.wikimedia.org/r/492738 (owner: Giuseppe Lavagetto)
[21:52:39] (Merged) jenkins-bot: Fix OpcacheManager.invalidate_all() [tools/scap] - https://gerrit.wikimedia.org/r/492738 (owner: Giuseppe Lavagetto)
[21:53:17] (CR) jenkins-bot: Fix OpcacheManager.invalidate_all() [tools/scap] - https://gerrit.wikimedia.org/r/492738 (owner: Giuseppe Lavagetto)
[22:16:05] Beta-Cluster-Infrastructure, Release-Engineering-Team (Kanban), Patch-For-Review, User-Ryasmeen: Recover from corrupted beta MySQL slave (deployment-db04) - https://phabricator.wikimedia.org/T216067 (mmodell) https://wikitech.wikimedia.org/wiki/Nove_Resource:Deployment-prep/MariaDB_Slave_instance...
[22:17:47] Beta-Cluster-Infrastructure, Release-Engineering-Team (Kanban), Patch-For-Review, User-Ryasmeen: Recover from corrupted beta MySQL slave (deployment-db04) - https://phabricator.wikimedia.org/T216067 (Krenair) The package doesn't come from puppet?
[22:18:56] Beta-Cluster-Infrastructure, Release-Engineering-Team (Kanban), Patch-For-Review, User-Ryasmeen: Recover from corrupted beta MySQL slave (deployment-db04) - https://phabricator.wikimedia.org/T216067 (mmodell) I'm not sure, I got to this kinda late, I'm still tweaking the docs by re-reading the hi...
[22:24:45] Beta-Cluster-Infrastructure, Release-Engineering-Team (Kanban), Patch-For-Review, User-Ryasmeen: Recover from corrupted beta MySQL slave (deployment-db04) - https://phabricator.wikimedia.org/T216067 (mmodell) @krenair: does that look better now?
[22:27:18] Beta-Cluster-Infrastructure, Release-Engineering-Team (Kanban), Patch-For-Review, User-Ryasmeen: Recover from corrupted beta MySQL slave (deployment-db04) - https://phabricator.wikimedia.org/T216067 (Krenair) Yep, ty
[22:29:55] Gerrit, Jade, Scoring-platform-team: Clone gerrit repo mediawiki/extensions/JADE to mediawiki/extensions/Jade - https://phabricator.wikimedia.org/T212180 (Halfak) p:Triage→Normal
[22:30:20] Continuous-Integration-Config, Jade, Scoring-platform-team, Patch-For-Review: Rename JADE->Jade in continuous integration - https://phabricator.wikimedia.org/T212181 (Halfak) p:Triage→Normal
[22:30:36] Continuous-Integration-Config, Jade, Patch-For-Review, Scoring-platform-team (Current): Rename JADE->Jade in continuous integration - https://phabricator.wikimedia.org/T212181 (Halfak)
[22:44:10] Beta-Cluster-Infrastructure, Release-Engineering-Team (Kanban), Patch-For-Review, User-Ryasmeen: Recover from corrupted beta MySQL slave (deployment-db04) - https://phabricator.wikimedia.org/T216067 (ArielGlenn) The page title looks wrong: 'Nove Resource:' instead of Nova Resource.
[22:44:54] Continuous-Integration-Infrastructure (shipyard), Release-Engineering-Team (Kanban), LDAP-Access-Requests: Disable nodepoolmanager user in LDAP - https://phabricator.wikimedia.org/T217064 (hashar) That is excellent!
Thank you @RobH
[22:45:12] Continuous-Integration-Infrastructure (shipyard), Release-Engineering-Team (Kanban), Patch-For-Review: Phase out Nodepool from production - https://phabricator.wikimedia.org/T209361 (hashar)
[22:46:25] Continuous-Integration-Infrastructure (shipyard), Release-Engineering-Team (Kanban), Patch-For-Review: Phase out Nodepool from production - https://phabricator.wikimedia.org/T209361 (hashar) Stalled→Resolved All clean up tasks conducted. The remaining one is to dispose of the hardware which...
[22:50:31] Gerrit: Install rename-project plugin - https://phabricator.wikimedia.org/T201953 (MarcoAurelio) @Paladox Does the plugin support NoteDB (or whichever storage we will be using in the next gerrit upgrade?) Thanks.
[22:51:30] Gerrit: Install rename-project plugin - https://phabricator.wikimedia.org/T201953 (Paladox) Nope, not currently :(
[22:52:59] Release-Engineering-Team: Create production code deployment management process - https://phabricator.wikimedia.org/T203703 (Jrbranaa) Although the goal is to have all code that deploys to production go through this process, it may be necessary to have two branches in this process. One that is for "permanent...
[22:53:13] Gerrit: Install rename-project plugin - https://phabricator.wikimedia.org/T201953 (MarcoAurelio) >>!
In T201953#4986655, @Paladox wrote: > Nope, not currently :( 😿
[22:53:42] paladox: sad indeed
[22:53:47] yup
[22:53:48] looks like a good plugin
[22:59:12] Continuous-Integration-Infrastructure, Code-Health, JavaScript, Test-Coverage: Generate JavaScript code coverage reports for extensions - https://phabricator.wikimedia.org/T184657 (Jrbranaa)
[22:59:38] Continuous-Integration-Infrastructure, Code-Health, JavaScript, Test-Coverage: Generate JavaScript code coverage reports for extensions - https://phabricator.wikimedia.org/T184657 (Jrbranaa) p:Normal→High
[23:06:26] (CR) Brennen Bearnes: [C: +1] Pipeline: use wmf base image [blubber] - https://gerrit.wikimedia.org/r/493090 (owner: Thcipriani)
[23:51:03] Project beta-scap-eqiad build #239401: FAILURE in 1 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/239401/
[23:52:23] Beta-Cluster-Infrastructure, Discovery-Search, Elasticsearch, Wikimedia-Logstash: ApiFeatureUsage data is not being populated in the Beta Cluster - https://phabricator.wikimedia.org/T183156 (EBernhardson) poked this a little today. The logstash api claims it is outputting events, the numbers are...