[00:01:03] PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [10.0]
[00:24:43] twentyafterfour: At https://phabricator.wikimedia.org/T227709, I am unable to close the task.
[00:24:56] It seems since the "Task to Prod Error" change, there is no longer a "status" field
[00:25:14] Krinkle: my mistake, I'll fix it
[00:25:53] thx :)
[00:26:39] Krinkle: how about now?
[00:31:31] twentyafterfour: yep, it's back. I think on the edit form we usually have it near the top after title, but don't mind much
[00:31:47] the comment form shows it again as well now. cool
[00:33:18] Krinkle: the changes I made are in preparation for having the form pre-filled automatically via a link from kibana
[00:34:23] So we can click a "Submit phab task" button in kibana and it'll link directly to the submit form with all of the values pre-filled with data copied from the logstash values
[00:34:43] should save significant time when reporting errors from kibana monitoring
[00:35:28] I'm also working on a way to avoid duplicate submissions but that may take a bit more work to perfect
[00:45:42] twentyafterfour: "Wikimedia-production-error (Shared Build Failure)" tasks should not be marked production error. (not logstash related)
[00:46:42] perhaps it could be transplanted to #ci-test-error if that helps
[01:01:02] PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [10.0]
[01:16:04] PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [10.0]
[01:32:19] (PS2) 20after4: Phatality plugin for kibana [releng/phatality] - https://gerrit.wikimedia.org/r/531047
[01:33:54] Krinkle: ok ...
[01:37:16] (CR) 20after4: "This is now ready for code review.
If there are no objections from logstash maintainers then I think this is very nearly ready for product" [releng/phatality] - https://gerrit.wikimedia.org/r/531047 (owner: 20after4)
[01:38:09] (PS3) 20after4: Phatality plugin for kibana [releng/phatality] - https://gerrit.wikimedia.org/r/531047 (https://phabricator.wikimedia.org/T230752)
[02:11:05] PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [10.0]
[02:11:44] Continuous-Integration-Config, Release-Engineering-Team (Unit & Int & System Tooling), Release-Engineering-Team-TODO, MediaWiki-Core-Testing, and 5 others: Reduce runtime of MW shared gate Jenkins jobs to 5 min - https://phabricator.wikimedia.org/T225730 (Krinkle) @Simetrical Agreed. The slow tes...
[02:32:17] !log Delete erroneous branch 'REL1_34-5' (108cb95c57a6e02b8) from mediawiki/extensions/SoftRedirector. ref https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/SoftRedirector/+/532274/
[02:32:39] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[04:11:00] PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [10.0]
[06:11:05] PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [10.0]
[06:54:47] RECOVERY - Free space - all mounts on deployment-fluorine02 is OK: OK: All targets OK
[07:37:39] Gerrit, Release-Engineering-Team-TODO (201908), User-zeljkofilipin: Can not download a specific patch from Gerrit using git-review - https://phabricator.wikimedia.org/T194520 (hashar) @Huji great and thank you for the feedback ;-]
[08:11:03] PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [10.0]
[08:13:23] Continuous-Integration-Config, Gerrit, ISA, translatewiki.net, and 3 others: Setup L10n for labs/tools/Isa.git -
https://phabricator.wikimedia.org/T230646 (Eugene233) @siebrand @Nikerabbit @Nemo_bis Can you please have a look at this when you get the chance to ?
[08:32:35] Continuous-Integration-Config, Gerrit, ISA, translatewiki.net, and 3 others: Setup L10n for labs/tools/Isa.git - https://phabricator.wikimedia.org/T230646 (Nikerabbit) I see bunch of files in i18n and translations. Which files should be set up for translation? Also, I don't see mention of which l...
[08:38:31] Continuous-Integration-Config, Gerrit, ISA, translatewiki.net, and 3 others: Setup L10n for labs/tools/Isa.git - https://phabricator.wikimedia.org/T230646 (Nemo_bis) The basic information Nikerabbit mentioned is listed at https://translatewiki.net/wiki/Translating:New_project and https://translat...
[08:44:04] Continuous-Integration-Config, Gerrit, ISA, translatewiki.net, and 3 others: Setup L10n for labs/tools/Isa.git - https://phabricator.wikimedia.org/T230646 (Eugene233)
[08:45:15] Phabricator, Documentation: Document subtypes / task types in Phabricator - https://phabricator.wikimedia.org/T224417 (Nemo_bis) >>! In T224417#5215068, @Quiddity wrote: > I made screenshots, but I don't know enough to write the words. > https://commons.wikimedia.org/wiki/File:Phabricator_task_subtypes.j...
[08:49:02] Phabricator-Production-Instance: Decide whether we need to add a severity (impact) field to match Bugzilla's - https://phabricator.wikimedia.org/T102 (Nemo_bis) >>! In T102#11219, @Nemo_bis wrote: > Moreover, it's a huge culture shift. For countless years we've stressed how priority has a relative value (ver...
[08:58:25] Continuous-Integration-Config, Gerrit, ISA, translatewiki.net, and 3 others: Setup L10n for labs/tools/Isa.git - https://phabricator.wikimedia.org/T230646 (Eugene233) @Nikerabbit ideally, the files to be setup for translations are files under `translations//messages.po` which is actual...
[10:09:32] (PS1) Hashar: zuul: group PhpTags* extensions together [integration/config] - https://gerrit.wikimedia.org/r/533179
[10:11:08] PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [10.0]
[10:21:46] (PS1) Hashar: jjb: allow parallel maven builds [integration/config] - https://gerrit.wikimedia.org/r/533182
[10:22:21] (CR) Hashar: [C: +2] "Deployed!" [integration/config] - https://gerrit.wikimedia.org/r/533182 (owner: Hashar)
[10:41:56] (CR) Hashar: [C: +2] "Noop in Zuul since the output has projects sorted :]" [integration/config] - https://gerrit.wikimedia.org/r/533179 (owner: Hashar)
[10:43:39] (Merged) jenkins-bot: zuul: group PhpTags* extensions together [integration/config] - https://gerrit.wikimedia.org/r/533179 (owner: Hashar)
[10:44:23] (Merged) jenkins-bot: jjb: allow parallel maven builds [integration/config] - https://gerrit.wikimedia.org/r/533182 (owner: Hashar)
[10:45:48] (PS1) Hashar: PhpTags now passes php 7.1 [integration/config] - https://gerrit.wikimedia.org/r/533187 (https://phabricator.wikimedia.org/T206296)
[10:46:03] (CR) Hashar: [C: +2] PhpTags now passes php 7.1 [integration/config] - https://gerrit.wikimedia.org/r/533187 (https://phabricator.wikimedia.org/T206296) (owner: Hashar)
[10:47:35] (CR) jerkins-bot: [V: -1] PhpTags now passes php 7.1 [integration/config] - https://gerrit.wikimedia.org/r/533187 (https://phabricator.wikimedia.org/T206296) (owner: Hashar)
[10:47:47] (CR) jerkins-bot: [V: -1] PhpTags now passes php 7.1 [integration/config] - https://gerrit.wikimedia.org/r/533187 (https://phabricator.wikimedia.org/T206296) (owner: Hashar)
[10:47:52] pfff
[10:47:52] :-
[10:47:54] (
[10:50:19] (PS2) Hashar: PhpTags now passes php 7.1 [integration/config] - https://gerrit.wikimedia.org/r/533187 (https://phabricator.wikimedia.org/T206296)
[10:53:06] (CR) Hashar: [C: +2]
PhpTags now passes php 7.1 [integration/config] - https://gerrit.wikimedia.org/r/533187 (https://phabricator.wikimedia.org/T206296) (owner: Hashar)
[10:58:21] (Merged) jenkins-bot: PhpTags now passes php 7.1 [integration/config] - https://gerrit.wikimedia.org/r/533187 (https://phabricator.wikimedia.org/T206296) (owner: Hashar)
[11:19:57] Release-Engineering-Team (Deployment services), Release, Train Deployments: 1.34.0-wmf.21 deployment blockers - https://phabricator.wikimedia.org/T220746 (Mainframe98)
[11:32:10] Continuous-Integration-Config, Release-Engineering-Team (Unit & Int & System Tooling), Release-Engineering-Team-TODO, MediaWiki-Core-Testing, and 5 others: Reduce runtime of MW shared gate Jenkins jobs to 5 min - https://phabricator.wikimedia.org/T225730 (Simetrical) If you want to split things u...
[11:35:51] Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO (201908), Release, Train Deployments, User-zeljkofilipin: 1.34.0-wmf.20 deployment blockers - https://phabricator.wikimedia.org/T220745 (zeljkofilipin)
[11:37:06] Phabricator: Phabricator shows that I removed other subscribers on a ticket. - https://phabricator.wikimedia.org/T231543 (Fnielsen)
[11:42:49] Phabricator: Phabricator shows that I removed other subscribers on a ticket. - https://phabricator.wikimedia.org/T231543 (Peachey88) Where? I can't see that in my view, could you perhaps attach a screenshot?
[11:45:30] Phabricator: Phabricator shows that I removed other subscribers on a ticket. - https://phabricator.wikimedia.org/T231543 (Reedy)
[11:46:30] Phabricator: Phabricator shows that I removed other subscribers on a ticket. - https://phabricator.wikimedia.org/T231543 (Reedy) >>! In T231543#5450718, @Peachey88 wrote: > Where? I can't see that in my view, could you perhaps attach a screenshot? Ditto, I can't see anything. Have you made changes above th...
[11:47:30] Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO (201908), Release, Train Deployments, User-zeljkofilipin: 1.34.0-wmf.20 deployment blockers - https://phabricator.wikimedia.org/T220745 (zeljkofilipin)
[11:55:09] Phabricator: Phabricator shows that I removed other subscribers on a ticket. - https://phabricator.wikimedia.org/T231543 (Fnielsen) Open→Resolved a:Fnielsen Thanks, that was apparently not submitted but just shown as preview The line showed up at the bottom when "Change Subscribers" is selected...
[12:11:04] PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [10.0]
[12:18:48] Continuous-Integration-Config, Release-Engineering-Team, phan: Phan now (wrongly) complains about code using variadic parameters - https://phabricator.wikimedia.org/T228695 (daniel) I'm getting: > 11:09:17 Fatal error: Parameter $components is variadic and has a type constraint (array); variadic para...
[12:21:29] Continuous-Integration-Config, Release-Engineering-Team, phan: Phan now (wrongly) complains about code using variadic parameters - https://phabricator.wikimedia.org/T228695 (Daimona) >>! In T228695#5450795, @daniel wrote: > I'm getting: >> 11:09:17 Fatal error: Parameter $components is variadic and h...
[12:34:24] Continuous-Integration-Config, Release-Engineering-Team, phan: Phan now (wrongly) complains about code using variadic parameters - https://phabricator.wikimedia.org/T228695 (daniel) >>! In T228695#5450805, @Daimona wrote: > Heh, this is an HHVM +PHPUnit limitation with variadic functions. IIRC, if yo...
[12:38:52] Continuous-Integration-Config, Release-Engineering-Team, phan: Phan now (wrongly) complains about code using variadic parameters - https://phabricator.wikimedia.org/T228695 (Daimona) >>! In T228695#5450824, @daniel wrote: >>>!
In T228695#5450805, @Daimona wrote: >> Heh, this is an HHVM +PHPUnit limit...
[13:04:04] (PS1) Hashar: DonationInterface job is now passing fine [integration/config] - https://gerrit.wikimedia.org/r/533205 (https://phabricator.wikimedia.org/T203084)
[13:05:38] (CR) jerkins-bot: [V: -1] DonationInterface job is now passing fine [integration/config] - https://gerrit.wikimedia.org/r/533205 (https://phabricator.wikimedia.org/T203084) (owner: Hashar)
[13:10:29] Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO (201908), Release, Train Deployments, User-zeljkofilipin: 1.34.0-wmf.20 deployment blockers - https://phabricator.wikimedia.org/T220745 (zeljkofilipin)
[13:17:49] Daimona: replied :)
[13:17:58] => AbuseFilterVariableHolder {#2703
[13:17:58] +forFilter: false,
[13:17:58] +mVarsVersion: 2,
[13:18:46] hashar: thanks, following-up on phab
[13:20:42] Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO (201908), Release, Train Deployments, User-zeljkofilipin: 1.34.0-wmf.20 deployment blockers - https://phabricator.wikimedia.org/T220745 (zeljkofilipin)
[13:30:31] Continuous-Integration-Config, Release-Engineering-Team, phan: Phan now (wrongly) complains about code using variadic parameters - https://phabricator.wikimedia.org/T228695 (daniel) >>! In T228695#5450847, @Daimona wrote: > There are various ways... One is to remove the variadic params, but I don't l...
[14:09:20] Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO (201908), Release, Train Deployments, User-zeljkofilipin: 1.34.0-wmf.20 deployment blockers - https://phabricator.wikimedia.org/T220745 (Nikerabbit)
[14:11:01] PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [10.0]
[14:15:41] Daimona: that worked :)
[14:15:58] How?
[14:19:11] Daimona: and i have uploaded the result
[14:19:59] Thanks
[14:22:32] (PS1) Mholloway: Revert "MachineVision: Depend on Wikibase" [integration/config] - https://gerrit.wikimedia.org/r/533232
[14:29:58] (PS2) Hashar: DonationInterface job is now passing fine [integration/config] - https://gerrit.wikimedia.org/r/533205 (https://phabricator.wikimedia.org/T203084)
[14:42:59] Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO (201908), Release, Train Deployments, User-zeljkofilipin: 1.34.0-wmf.20 deployment blockers - https://phabricator.wikimedia.org/T220745 (zeljkofilipin)
[14:47:33] Phabricator: Phabricator shows that I removed other subscribers on a ticket. - https://phabricator.wikimedia.org/T231543 (JJMC89) Resolved→Invalid a:Fnielsen→None
[15:04:32] (CR) Thcipriani: [V: +2 C: +2] "> Any reason this has not been merged yet?" [All-Projects] (refs/meta/config) - https://gerrit.wikimedia.org/r/528772 (https://phabricator.wikimedia.org/T230015) (owner: Ladsgroup)
[15:14:22] (CR) Hashar: [C: +2] DonationInterface job is now passing fine [integration/config] - https://gerrit.wikimedia.org/r/533205 (https://phabricator.wikimedia.org/T203084) (owner: Hashar)
[15:16:32] (Merged) jenkins-bot: DonationInterface job is now passing fine [integration/config] - https://gerrit.wikimedia.org/r/533205 (https://phabricator.wikimedia.org/T203084) (owner: Hashar)
[15:32:25] (CR) Jforrester: "Wikibase's tests will be run anyway as it's in the gate, right?" [integration/config] - https://gerrit.wikimedia.org/r/533232 (owner: Mholloway)
[15:37:49] (CR) Mholloway: "I don't believe that Wikibase tests were running on gate-and-submit before this change, but I could be wrong.
The console output from back" [integration/config] - https://gerrit.wikimedia.org/r/533232 (owner: Mholloway)
[15:38:58] PROBLEM - Work requests waiting in Zuul Gearman server on contint1001 is CRITICAL: CRITICAL: 53.33% of data above the critical threshold [140.0] https://www.mediawiki.org/wiki/Continuous_integration/Zuul https://grafana.wikimedia.org/dashboard/db/zuul-gearman?panelId=10&fullscreen&orgId=1
[15:40:14] (PS1) Hashar: jjb: remove some useless wmf-quibble jobs [integration/config] - https://gerrit.wikimedia.org/r/533248
[15:43:41] (CR) Jforrester: "Nice simplification." [integration/config] - https://gerrit.wikimedia.org/r/533248 (owner: Hashar)
[15:44:12] (CR) Jforrester: "(Should we retain the php73 ones given we may switch to php73 in production and so want them "soon"?)" [integration/config] - https://gerrit.wikimedia.org/r/533248 (owner: Hashar)
[15:44:27] (CR) Jforrester: [C: +2] Revert "MachineVision: Depend on Wikibase" [integration/config] - https://gerrit.wikimedia.org/r/533232 (owner: Mholloway)
[15:45:50] (CR) Hashar: "Possibly? Then I am not aware of any plan to upgrade to php7.3. My understanding is we first want to migrate out of HHVM?"
[integration/config] - https://gerrit.wikimedia.org/r/533248 (owner: Hashar)
[15:46:29] (Merged) jenkins-bot: Revert "MachineVision: Depend on Wikibase" [integration/config] - https://gerrit.wikimedia.org/r/533232 (owner: Mholloway)
[15:46:31] (CR) Jforrester: "> Patch Set 1:" [integration/config] - https://gerrit.wikimedia.org/r/533248 (owner: Hashar)
[15:47:58] !log Zuul: Drop Wikibase dependency from MachineVision
[15:48:00] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[15:49:21] (CR) Hashar: "Good point, though that will be trivial to add back later on:" [integration/config] - https://gerrit.wikimedia.org/r/533248 (owner: Hashar)
[15:50:11] (CR) Jforrester: [C: +1] jjb: remove some useless wmf-quibble jobs [integration/config] - https://gerrit.wikimedia.org/r/533248 (owner: Hashar)
[15:50:33] James_F: I tried to clean out the mess in the layout.yaml file today but eventually gave up :-]
[15:50:46] notably tried to move DonationInterface / fundraising/REL1_31 to something easier to manage
[15:50:51] but I could not find a good solution :^
[15:50:53] hashar: I've been thinking about making a section for archived repos.
[15:51:02] oh
[15:51:09] like move all archived project down the file?
[15:51:10] Did you see my patches trying to remove skip-ifs and instead have positive match rules?
[15:51:18] yeah seen it
[15:51:23] Yeah, so they don't blow things up.
[15:51:26] E_TOO_COMPLICATED :-\
[15:51:37] True. But skip-if is very complicated.
[15:51:53] And we're about to need to skip php70/php71 on master/REL1_34…
[15:54:01] (CR) Hashar: [C: +2] "I have deleted the jobs." [integration/config] - https://gerrit.wikimedia.org/r/533248 (owner: Hashar)
[15:54:59] I thought also about expanding all those filters
[15:55:04] since only the last one matches
[15:55:07] Yeah.
[15:55:35] The magic is a bit scary. Maybe we should avoid it entirely?
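(Editor's sketch, not part of the channel log.) The exchange above contrasts skip-if rules with positive match rules. The snippet below is a deliberately simplified model of that difference and is not Zuul's actual configuration format or matching code. The function names and the branch patterns are hypothetical, chosen only to mirror the "skip php70/php71 on master/REL1_34" case mentioned in the conversation.

```python
import re


def runs_with_skip_if(branch, skip_patterns):
    """Negative rule: the job runs unless some skip pattern matches the branch."""
    return not any(re.fullmatch(p, branch) for p in skip_patterns)


def runs_with_positive_match(branch, allowed_patterns):
    """Positive rule: the job runs only if some allowed pattern matches the branch."""
    return any(re.fullmatch(p, branch) for p in allowed_patterns)


# Hypothetical rules for a php70 job (illustration only).
PHP70_SKIP = ["master", "REL1_34"]      # skip-if style
PHP70_ALLOW = [r"REL1_3[1-3]"]          # positive-match style

for branch in ["REL1_31", "REL1_34", "master", "REL1_35"]:
    print(
        branch,
        runs_with_skip_if(branch, PHP70_SKIP),
        runs_with_positive_match(branch, PHP70_ALLOW),
    )
```

In this toy model the two styles agree on the branches that exist today but diverge on a future branch such as `REL1_35`: the skip-if list would run the job there until someone extends the list, while the positive rule stays off until a branch is explicitly added.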
[15:55:56] I also thought about having a pipeline per branch
[15:56:18] then the jobs to trigger per branch are listed explicitly
[15:56:31] Yeah, that could work.
[15:56:31] and we no more need branch filter / skip if :]
[15:56:36] +1
[15:58:46] (Merged) jenkins-bot: jjb: remove some useless wmf-quibble jobs [integration/config] - https://gerrit.wikimedia.org/r/533248 (owner: Hashar)
[15:58:58] the new wikimedia error page does not give the requestid anymore, is this known issue?
[16:00:26] Nikerabbit: not that I know of. If in doubt, file a new task :]
[16:01:17] hmm, this is fatal error, maybe that one never had....
[16:11:03] PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [10.0]
[16:13:10] RECOVERY - Work requests waiting in Zuul Gearman server on contint1001 is OK: OK: Less than 30.00% above the threshold [90.0] https://www.mediawiki.org/wiki/Continuous_integration/Zuul https://grafana.wikimedia.org/dashboard/db/zuul-gearman?panelId=10&fullscreen&orgId=1
[16:20:30] (PS1) Jforrester: layout: Move all archived repos to their own section at the end (no-op) [integration/config] - https://gerrit.wikimedia.org/r/533257
[16:41:13] (CR) Krinkle: [C: +1] layout: Move all archived repos to their own section at the end (no-op) [integration/config] - https://gerrit.wikimedia.org/r/533257 (owner: Jforrester)
[16:58:05] (CR) Thcipriani: [C: +2] Update go-import and wikimedia plugins [software/gerrit] (wmf/stable-2.15) - https://gerrit.wikimedia.org/r/525869 (owner: Paladox)
[17:05:12] (Merged) jenkins-bot: Update go-import and wikimedia plugins [software/gerrit] (wmf/stable-2.15) - https://gerrit.wikimedia.org/r/525869 (owner: Paladox)
[17:14:33] (PS1) Mholloway: WikibaseMediaInfo: Stop running all Wikibase-related tests [integration/config] - https://gerrit.wikimedia.org/r/533272
[17:25:11] Project beta-scap-eqiad build #264625: FAILURE in 43 sec:
https://integration.wikimedia.org/ci/job/beta-scap-eqiad/264625/
[17:35:14] Project beta-scap-eqiad build #264626: STILL FAILING in 49 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/264626/
[17:45:08] Project beta-scap-eqiad build #264627: STILL FAILING in 41 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/264627/
[17:55:11] Project beta-scap-eqiad build #264628: STILL FAILING in 45 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/264628/
[17:57:52] Fixing Beta Cluster with https://gerrit.wikimedia.org/r/c/mediawiki/extensions/MachineVision/+/533278
[18:02:36] PROBLEM - Work requests waiting in Zuul Gearman server on contint1001 is CRITICAL: CRITICAL: 35.71% of data above the critical threshold [140.0] https://www.mediawiki.org/wiki/Continuous_integration/Zuul https://grafana.wikimedia.org/dashboard/db/zuul-gearman?panelId=10&fullscreen&orgId=1
[18:03:01] Gearman queue issue again?
[18:05:21] Project beta-scap-eqiad build #264629: STILL FAILING in 42 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/264629/
[18:11:03] PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [10.0]
[18:11:37] Project beta-scap-eqiad build #264630: STILL FAILING in 43 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/264630/
[18:15:17] Project beta-scap-eqiad build #264631: STILL FAILING in 44 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/264631/
[18:22:13] > ExtensionDependencyError from line 345 of /srv/mediawiki-staging/php-master/includes/registration/ExtensionRegistry.php: MachineVision requires WikibaseClient to be installed.
[18:22:23] re:beta-scap-eqiad failure
[18:22:54] (Abandoned) Mholloway: WikibaseMediaInfo: Stop running all Wikibase-related tests [integration/config] - https://gerrit.wikimedia.org/r/533272 (owner: Mholloway)
[18:22:56] thcipriani: Already fixing, it's waiting on CI.
[18:23:04] ah, heh, good luck :)
[18:23:06] thcipriani: https://gerrit.wikimedia.org/r/533278
[18:23:32] thanks for that
[18:24:02] all the jenkins executors are busy
[18:24:04] mdholloway is doing great work trying to make CI faster for his teams, it's just unfortunately not possible.
[18:24:11] so zuul seems to be doing the best it can, currently
[18:24:30] Yeah, we've had extended periods of 100% saturation of executors for a few weeks. I think we need more.
[18:25:19] Project beta-scap-eqiad build #264632: STILL FAILING in 43 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/264632/
[18:26:06] indeed. Was there any particular trigger. I know there were slow tests briefly clogging the tubes, but it looks like all those tasks are resolved
[18:26:23] s/trigger./trigger?/
[18:26:41] Project beta-scap-eqiad build #264633: STILL FAILING in 41 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/264633/
[18:27:25] I don't think so. Just the mass of code, the mass of patches, and (particularly) the mass of tests is getting too much.
[18:27:40] ugh
[18:28:05] (PS1) Mholloway: Revert "Revert "MachineVision: Depend on Wikibase"" [integration/config] - https://gerrit.wikimedia.org/r/533291
[18:28:13] needed after all
[18:28:33] We should probably have a base unit test that fatals when you try to depend on an extension that isn't found, which only errors in CI but not in dev?
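(Editor's sketch, not part of the channel log.) The kind of check proposed in the last message could look roughly like the following. It relies on the `requires` → `extensions` mapping that MediaWiki's `extension.json` uses to declare dependencies. Everything else is an assumption made for illustration: the helper names are hypothetical, extensions are assumed to live one directory deep with an `extension.json` each, and version constraints are ignored. Extensions that do not use extension registration would not be detected by this scan at all.

```python
import json
from pathlib import Path


def missing_dependencies(extension_json, available):
    """Return the required extension names that are not in `available`."""
    required = extension_json.get("requires", {}).get("extensions", {})
    return sorted(name for name in required if name not in available)


def check_extensions_dir(extensions_dir):
    """Map each extension under `extensions_dir` to its unsatisfied dependencies.

    Assumes a layout of <extensions_dir>/<Name>/extension.json (hypothetical).
    """
    manifests = {}
    for path in sorted(Path(extensions_dir).glob("*/extension.json")):
        data = json.loads(path.read_text(encoding="utf-8"))
        manifests[data.get("name", path.parent.name)] = data

    available = set(manifests)
    problems = {}
    for name, data in manifests.items():
        missing = missing_dependencies(data, available)
        if missing:
            problems[name] = missing
    return problems
```

A CI job could call `check_extensions_dir()` on the checked-out extensions directory and fail the build when the returned mapping is non-empty.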
[18:28:44] (CR) Jforrester: [C: +2] Revert "Revert "MachineVision: Depend on Wikibase"" [integration/config] - https://gerrit.wikimedia.org/r/533291 (owner: Mholloway)
[18:31:50] (Merged) jenkins-bot: Revert "Revert "MachineVision: Depend on Wikibase"" [integration/config] - https://gerrit.wikimedia.org/r/533291 (owner: Mholloway)
[18:32:00] maybe i'll have to devote some 10% time to helping get T88258 over the line
[18:32:01] T88258: Convert WikibaseRepository, WikibaseClient, WikibaseLib and WikibaseView to use extension registration - https://phabricator.wikimedia.org/T88258
[18:32:10] this is slowing down a lot of people
[18:32:25] and causing no small amount of frustration
[18:35:19] Project beta-scap-eqiad build #264634: STILL FAILING in 45 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/264634/
[18:38:46] !log Zuul: Adding Wikibase back into dependencies for MachineVision
[18:38:49] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[18:40:02] RECOVERY - Work requests waiting in Zuul Gearman server on contint1001 is OK: OK: Less than 30.00% above the threshold [90.0] https://www.mediawiki.org/wiki/Continuous_integration/Zuul https://grafana.wikimedia.org/dashboard/db/zuul-gearman?panelId=10&fullscreen&orgId=1
[18:45:12] Project beta-scap-eqiad build #264635: STILL FAILING in 47 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/264635/
[18:55:09] Project beta-scap-eqiad build #264636: STILL FAILING in 41 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/264636/
[19:05:10] Project beta-scap-eqiad build #264637: STILL FAILING in 42 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/264637/
[19:15:12] Project beta-scap-eqiad build #264638: STILL FAILING in 45 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/264638/
[19:17:28] [b146ac55d21eb6e78266a6a8] [no req] ExtensionDependencyError from line 345 of
/srv/mediawiki-staging/php-master/includes/registration/ExtensionRegistry.php: MachineVision requires WikibaseClient to be installed.
[19:19:19] hauskatze: Yes, see scrollback. ;-)
[19:19:32] Just logged in James_F
[19:19:59] I know, hence the wink.
[19:20:01] Suffices to know that you're all aware
[19:20:02] https://gerrit.wikimedia.org/r/c/mediawiki/extensions/MachineVision/+/533278
[19:20:58] Good Lord... Changes still taking one hour to be tested :/
[19:25:10] Project beta-scap-eqiad build #264639: STILL FAILING in 44 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/264639/
[19:27:47] hauskatze: Yeah, repeated failures in CI slow everything down.
[19:35:10] Project beta-scap-eqiad build #264640: STILL FAILING in 42 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/264640/
[19:37:03] mdholloway: I'm afraid I'm going to have to just force-merge the MachineVision patch to fix Beta Cluster, and you'll need to fix your tests from a broken master, sorry.
[19:37:18] But it's been two hours.
[19:38:59] Project beta-scap-eqiad build #264641: STILL FAILING in 42 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/264641/
[19:39:39] James_F: sorry, in meetings, but that sounds reasonable to me
[19:40:34] sorry for the breakage
[19:43:47] Not your fault, CI should catch this.
[19:45:09] (Filed as T231601.)
[19:45:10] T231601: Have a CI test that fails if a Beta Cluster extension has an unsatisfiable dependency - https://phabricator.wikimedia.org/T231601
[19:47:35] It looks like it's passing this time
[19:49:08] Yeah, finally.
[19:53:34] Yippee, build fixed!
[19:53:35] Project beta-scap-eqiad build #264642: FIXED in 9 min 1 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/264642/
[19:57:16] James_F: I've spent entirely too long fiddling with CI graphs that say what we already know https://grafana.wikimedia.org/d/000000108/releng-kpis?orgId=1&from=1566146078804&to=1567108432570
[19:57:28] ci is slower than it used to be.
[19:59:24] ^^
[20:01:16] Project mwcore-phpunit-coverage-master build #141: FAILURE in 5 hr 0 min: https://integration.wikimedia.org/ci/job/mwcore-phpunit-coverage-master/141/
[20:02:22] 5 h
[20:11:02] PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [10.0]
[20:11:04] thcipriani: :-)
[20:11:24] thcipriani: Do you have a triggering event/period where it got worse?
[20:11:44] thcipriani: I'm speculatively blaming us adding php73 to the testing mix, a few months ago.
[20:12:19] Also, yay KPIs etc.
[20:12:51] Eyeballing, it looks to have notably worsened around the end of July?
[20:19:18] that might be about right. It was trending in a bad direction at that time. then we hit some dead-space with lots of folks out mid-august.
[20:21:38] Yeah.
[20:22:21] Well, theoretically (🤞🏽🤞🏽🤞🏽) TechCom are going to approve dropping php70 and php71 support from master next week, which will reduce our job-load-per-patch by ~30%.
[20:22:34] Which will help, but it's not a long term fix.
[20:22:50] Eventually we'll be adding php74 jobs, after all.
[20:23:07] (And killing HHVM, or SRE will owe me some serious apologies. ;-))
[20:24:57] is there a request for more capacity in the integration project already?
[20:26:37] My understanding in Chicago was that we asked for that and were told no?
[20:26:48] But maybe I misunderstood.
[20:28:04] More capacity please.
[20:29:50] (PS1) D3r1ck01: Remove ViewFiles extension from Integration Config [integration/config] - https://gerrit.wikimedia.org/r/533326 (https://phabricator.wikimedia.org/T228367)
[20:32:10] (CR) MarcoAurelio: Remove ViewFiles extension from Integration Config (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/533326 (https://phabricator.wikimedia.org/T228367) (owner: D3r1ck01)
[20:33:27] hrm, that may be right, but at this point might be worth asking again.
Should track utilization for a bit, make sure that is, indeed, our bottleneck (although intuitively it sure looks like it is)
[20:34:22] (PS2) D3r1ck01: Archive ViewFiles extension in Integration Config [integration/config] - https://gerrit.wikimedia.org/r/533326 (https://phabricator.wikimedia.org/T228367)
[20:35:08] (CR) D3r1ck01: "done!" (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/533326 (https://phabricator.wikimedia.org/T228367) (owner: D3r1ck01)
[20:35:18] (CR) MarcoAurelio: [C: +1] Archive ViewFiles extension in Integration Config [integration/config] - https://gerrit.wikimedia.org/r/533326 (https://phabricator.wikimedia.org/T228367) (owner: D3r1ck01)
[20:35:23] I think maybe we should reserve some space for test runs so that a long stack of C+2'ed changes doesn't starve the oxygen of our developer community.
[20:37:29] " is there a request for more capacity in the integration project already?" -- not that I know of at least from the Cloud Services side
[20:38:30] There was talk about it at annual plan time, and the hope then was that it would move "somewhere" with the pipeline project. That maybe fell flat by the end of planning.
[20:39:09] Would be happy to talk about what growth might help and see if we can find the spare compute in Cloud VPS to do some easing
[20:40:39] quota bump request process is at https://phabricator.wikimedia.org/project/view/2880/
[20:41:30] Fair warning that today Cloud VPS is a bit low on spare compute, but that's a temporary thing as we shuffle some physical hardware around
[20:44:55] (CR) D3r1ck01: "This change is ready for review." [integration/config] - https://gerrit.wikimedia.org/r/533326 (https://phabricator.wikimedia.org/T228367) (owner: D3r1ck01)
[20:51:52] bd808: thanks for the info! I'll checkout the capacity project/make a request. The pipeline work is a bit underfunded at the moment as well.
Also, the always problem of capacity for maintenance vs capacity for the new shiny.
[20:54:39] Yeah. :-(
[21:10:21] PROBLEM - App Server Main HTTP Response on deployment-mediawiki-09 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[21:14:09] (PS1) Paladox: Merge tag 'v2.15.16' into wmf/stable-2.15 [software/gerrit] (wmf/stable-2.15) - https://gerrit.wikimedia.org/r/533332
[21:15:14] RECOVERY - App Server Main HTTP Response on deployment-mediawiki-09 is OK: HTTP OK: HTTP/1.1 200 OK - 48211 bytes in 1.358 second response time
[21:15:46] thcipriani ^
[21:16:39] "Failed to load Starlark extension '//plugins:gitiles/external_plugin_deps.bzl'."
[21:17:00] (CR) jerkins-bot: [V: -1] Merge tag 'v2.15.16' into wmf/stable-2.15 [software/gerrit] (wmf/stable-2.15) - https://gerrit.wikimedia.org/r/533332 (owner: Paladox)
[21:20:26] paladox: yep, that's exactly what I got with bazel 0.25.0-0.28.0
[21:22:48] thcipriani https://gerrit-review.googlesource.com/c/gerrit/+/231396
[21:24:00] interesting
[21:24:47] * paladox tries following that
[21:24:51] seems to be building with that flag
[21:25:31] thcipriani it works when using "--incompatible_disallow_load_labels_to_cross_package_boundaries=false"
[21:27:26] yes indeed
[21:27:45] So i'm going to see if i can change this to use the new format
[21:27:47] paladox: nice find
[21:27:50] :)
[21:31:23] thcipriani works \o/
[21:32:33] (PS1) Paladox: Support newer bazel versions [software/gerrit] (wmf/stable-2.15) - https://gerrit.wikimedia.org/r/533336
[21:32:44] (PS2) Paladox: Support newer bazel versions [software/gerrit] (wmf/stable-2.15) - https://gerrit.wikimedia.org/r/533336
[21:32:55] (PS3) Paladox: Support newer bazel versions [software/gerrit] (wmf/stable-2.15) - https://gerrit.wikimedia.org/r/533336
[21:36:50] (PS4) Paladox: Support newer bazel versions [software/gerrit] (wmf/stable-2.15) - https://gerrit.wikimedia.org/r/533336
[21:36:52] (PS2) Paladox: Merge tag 'v2.15.16' into
wmf/stable-2.15 [software/gerrit] (wmf/stable-2.15) - https://gerrit.wikimedia.org/r/533332 [21:36:54] thcipriani ^ :) [21:37:14] :) [21:37:16] * thcipriani looks [21:41:28] thcipriani https://integration.wikimedia.org/ci/job/gerrit-docker/23/console :D [21:41:29] that [21:41:37] *that's the build for the 2.15.16 merge above [21:45:11] (Abandoned) Paladox: Merge branch 'stable-2.15' into wmf/stable-2.15 [software/gerrit] (wmf/stable-2.15) - https://gerrit.wikimedia.org/r/525865 (owner: Paladox) [21:45:44] thcipriani passed :) [21:45:53] can you merge https://gerrit.wikimedia.org/r/#/c/operations/software/gerrit/+/533336/ please? :) [21:55:56] (PS4) 20after4: Phatality plugin for kibana [releng/phatality] - https://gerrit.wikimedia.org/r/531047 (https://phabricator.wikimedia.org/T230752) [22:00:10] (CR) Thcipriani: [C: +2] "Nice work!" [software/gerrit] (wmf/stable-2.15) - https://gerrit.wikimedia.org/r/533336 (owner: Paladox) [22:02:11] thanks! [22:03:57] the gerrit project also use the .gitreview file [22:08:27] (Merged) jenkins-bot: Support newer bazel versions [software/gerrit] (wmf/stable-2.15) - https://gerrit.wikimedia.org/r/533336 (owner: Paladox) [22:11:02] PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [10.0] [22:20:16] (PS1) Paladox: Support newer bazel versions [software/gerrit] (wmf/stable-2.16) - https://gerrit.wikimedia.org/r/533349 [22:20:49] (PS1) Paladox: Merge tag 'v2.16.11' into wmf/stable-2.16 [software/gerrit] (wmf/stable-2.16) - https://gerrit.wikimedia.org/r/533350 [22:31:18] (CR) Paladox: [C: +2] "Self merging as it's been merged into 2.15 and it passes the build."
[software/gerrit] (wmf/stable-2.16) - https://gerrit.wikimedia.org/r/533349 (owner: Paladox) [22:39:42] (Merged) jenkins-bot: Support newer bazel versions [software/gerrit] (wmf/stable-2.16) - https://gerrit.wikimedia.org/r/533349 (owner: Paladox) [23:02:39] twentyafterfour: looks like "Assignee" also got lost - https://phabricator.wikimedia.org/T227817 [23:06:56] PROBLEM - Puppet staleness on deployment-mediawiki-jhuneidi is CRITICAL: CRITICAL: 80.00% of data above the critical threshold [43200.0] [23:12:49] (CR) D3r1ck01: [C: +1] layout: [mediawiki/services/recommendation-api/deploy] Archive [integration/config] - https://gerrit.wikimedia.org/r/516718 (owner: Jforrester) [23:14:07] (PS2) Jforrester: layout: Move all archived repos to their own section at the end (no-op) [integration/config] - https://gerrit.wikimedia.org/r/533257 [23:14:15] (CR) Jforrester: [C: +2] layout: Move all archived repos to their own section at the end (no-op) [integration/config] - https://gerrit.wikimedia.org/r/533257 (owner: Jforrester) [23:16:11] (Merged) jenkins-bot: layout: Move all archived repos to their own section at the end (no-op) [integration/config] - https://gerrit.wikimedia.org/r/533257 (owner: Jforrester) [23:16:58] RECOVERY - Puppet staleness on deployment-mediawiki-jhuneidi is OK: OK: Less than 1.00% above the threshold [3600.0] [23:23:44] !log Zuul: Move all archived repos to their own section at the end (no-op) [23:23:46] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [23:51:02] PROBLEM - Puppet errors on deployment-mediawiki-jhuneidi is CRITICAL: CRITICAL: 48.98% of data above the critical threshold [3.0]