[00:06:53] (CR) Jforrester: [C: +1] "Yay. Why do we still have hhvmlint on top of composertest?" [integration/config] - https://gerrit.wikimedia.org/r/496688 (owner: Krinkle)
[00:09:04] (CR) Jforrester: [C: +1] Remove a few redundant mediawiki/job quibble jobs (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/496688 (owner: Krinkle)
[00:12:32] (PS2) Krinkle: Remove a few redundant mediawiki/job quibble jobs [integration/config] - https://gerrit.wikimedia.org/r/496688
[00:13:14] (CR) Krinkle: Remove a few redundant mediawiki/job quibble jobs (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/496688 (owner: Krinkle)
[00:14:06] * James_F grins at Krinkle.
[00:17:30] Project beta-scap-eqiad build #241435: FAILURE in 9 min 19 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/241435/
[00:29:06] Yippee, build fixed!
[00:29:07] Project beta-scap-eqiad build #241436: FIXED in 10 min: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/241436/
[04:26:40] Release-Engineering-Team (Kanban), Jenkins: Evaluate Jenkins - https://phabricator.wikimedia.org/T218333 (Liuxinyu970226)
[04:26:42] Release-Engineering-Team (Kanban), Jenkins: Evaluate Jenkins X - https://phabricator.wikimedia.org/T218334 (Liuxinyu970226)
[05:01:04] Project beta-scap-eqiad build #241460: FAILURE in 8 min 49 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/241460/
[05:12:23] Yippee, build fixed!
[05:12:23] Project beta-scap-eqiad build #241461: FIXED in 10 min: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/241461/
[07:02:44] Continuous-Integration-Config, Fresnel, Performance-Team: Omit "npm install" step in Fresnel job output - https://phabricator.wikimedia.org/T218374 (hashar) `npm install` is quite slow, partly due to the Quibble containers using nodejs6 and npm 3.8.x. Eventually I will get the containers updated. T...
[07:04:34] Release-Engineering-Team (Backlog), Developer Productivity, local-charts, Epic: Create official docker images for Mediawiki and services used in the local development environment - https://phabricator.wikimedia.org/T217872 (hashar) Wikimedia base Docker images are built using a home grown utilit...
[07:33:31] (CR) Alexandros Kosiaris: Switch change-propagation to the pipeline (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/496387 (https://phabricator.wikimedia.org/T213193) (owner: Alexandros Kosiaris)
[07:51:59] Gerrit, Phabricator, Operations, Security-Team, Traffic: Add gerrit.wikimedia.org to the Phabricator CSP - https://phabricator.wikimedia.org/T218308 (Jdrewniak)
[08:52:50] PROBLEM - English Wikipedia Mobile Main page on beta-cluster is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - string 'Wikipedia' not found on 'https://en.m.wikipedia.beta.wmflabs.org:443/wiki/Main_Page?debug=true' - 2534 bytes in 0.035 second response time
[08:53:39] oooof that's likely me on deployment-mediawiki-07, apologies
[08:53:40] fixing
[08:54:36] PROBLEM - App Server Main HTTP Response on deployment-mediawiki-07 is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - string 'Wikipedia' not found on 'http://en.wikipedia.beta.wmflabs.org:80/wiki/Main_Page?debug=true' - 1904 bytes in 0.019 second response time
[08:55:15] PROBLEM - English Wikipedia Main page on beta-cluster is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - string 'Wikipedia' not found on 'https://en.wikipedia.beta.wmflabs.org:443/wiki/Main_Page?debug=true' - 2533 bytes in 0.035 second response time
[08:57:50] RECOVERY - English Wikipedia Mobile Main page on beta-cluster is OK: HTTP OK: HTTP/1.1 200 OK - 36919 bytes in 0.905 second response time
[08:59:33] RECOVERY - App Server Main HTTP Response on deployment-mediawiki-07 is OK: HTTP OK: HTTP/1.1 200 OK - 48181 bytes in 1.216 second response time
[09:00:14] RECOVERY - English Wikipedia Main page on beta-cluster is OK: HTTP OK: HTTP/1.1 200 OK - 48807 bytes in 0.766 second response time
[09:22:25] Continuous-Integration-Infrastructure, HHVM, Language-Team (Language-2019-January-March), Patch-For-Review, Wikimedia-production-error (Shared Build Failure): Merge blocker: quibble-vendor-mysql-hhvm-docker in gate fails for most merges (exit status -11... - https://phabricator.wikimedia.org/T216689
[09:59:40] Phabricator: On Phabricator workboard, show status of associated Gerrit patches - https://phabricator.wikimedia.org/T215148 (Jdrewniak) @hashar thanks for the tip about the `bug` parameter! I didn't fix the issue with the multiple bugs on one patch you mention above, but it did make the queries faster. That...
[10:18:30] Continuous-Integration-Infrastructure, Growth-Team, Notifications: quibble-vendor-mysql-hhvm-docker fails on EchoDiscussionParserTest - https://phabricator.wikimedia.org/T218388 (dcausse)
[11:37:08] (CR) Ppchelko: Switch change-propagation to the pipeline (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/496387 (https://phabricator.wikimedia.org/T213193) (owner: Alexandros Kosiaris)
[11:39:04] "You are watching this project and will receive mail about changes made to any related object." <-- should say "receive notifications" ..it's not always mail and it's good that it isn't:)
[11:39:43] in which system?
[11:39:59] andre__: Phabricator
[11:40:13] when becoming a "watcher" of a project
[11:40:32] mutante, file a bug :)
[11:42:41] Scap, serviceops: Scap2 to use etcd for target servers - https://phabricator.wikimedia.org/T218328 (jijiki)
[12:30:36] Gerrit, Release-Engineering-Team (Next), Patch-For-Review: Upgrade to Gerrit 2.15.11 - https://phabricator.wikimedia.org/T214359 (hashar) Open→Resolved a:thcipriani I was not able to reach Gerrit SSH over IPv6 (TCP connection established but some KEX exchange never got a reply from the se...
[12:33:26] Project beta-scap-eqiad build #241500: FAILURE in 12 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/241500/
[12:46:07] Yippee, build fixed!
[12:46:07] Project beta-scap-eqiad build #241501: FIXED in 11 min: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/241501/
[12:49:01] (CR) Ppchelko: Switch change-propagation to the pipeline (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/496387 (https://phabricator.wikimedia.org/T213193) (owner: Alexandros Kosiaris)
[13:22:06] (CR) Alexandros Kosiaris: "> Regarding the integration tests - change-prop is special in a sense that it has no REST API, so service-checker doesn't do anything for " [integration/config] - https://gerrit.wikimedia.org/r/496387 (https://phabricator.wikimedia.org/T213193) (owner: Alexandros Kosiaris)
[13:31:57] Release-Engineering-Team (Kanban), Patch-For-Review, Release, Train Deployments, User-zeljkofilipin: 1.33.0-wmf.21 deployment blockers - https://phabricator.wikimedia.org/T206675 (zeljkofilipin) >>! In T206675#5024650, @Aklapper wrote: > I'm not sure yet what to make out of {T218310} (missing...
[13:33:48] Release-Engineering-Team (Kanban), Patch-For-Review, Release, Train Deployments, User-zeljkofilipin: 1.33.0-wmf.21 deployment blockers - https://phabricator.wikimedia.org/T206675 (zeljkofilipin) >>! In T206675#5024498, @herron wrote: > There looks to be a significant increase (about 1.5 milli...
[13:35:58] PROBLEM - Puppet errors on integration-slave-docker-1053 is CRITICAL: CRITICAL: 7.78% of data above the critical threshold [3.0]
[13:36:41] PROBLEM - Puppet errors on deployment-mediawiki-07 is CRITICAL: CRITICAL: 8.89% of data above the critical threshold [3.0]
[13:37:10] PROBLEM - Puppet errors on deployment-db03 is CRITICAL: CRITICAL: 6.74% of data above the critical threshold [3.0]
[13:37:51] PROBLEM - Puppet errors on integration-slave-jessie-1002 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [3.0]
[13:38:19] Release-Engineering-Team (Kanban), Patch-For-Review, Release, Train Deployments, User-zeljkofilipin: 1.33.0-wmf.21 deployment blockers - https://phabricator.wikimedia.org/T206675 (zeljkofilipin) >>! In T206675#5025503, @Jdforrester-WMF wrote: > That's now fixed, but the new glut of `PasswordP...
[13:38:57] PROBLEM - Puppet errors on integration-slave-docker-1034 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [3.0]
[13:39:37] PROBLEM - Puppet errors on integration-slave-docker-1048 is CRITICAL: CRITICAL: 7.78% of data above the critical threshold [3.0]
[13:40:25] PROBLEM - Puppet errors on integration-slave-docker-1049 is CRITICAL: CRITICAL: 11.43% of data above the critical threshold [3.0]
[13:40:38] PROBLEM - Puppet errors on deployment-deploy02 is CRITICAL: CRITICAL: 10.00% of data above the critical threshold [3.0]
[13:42:47] PROBLEM - Puppet errors on integration-slave-docker-1043 is CRITICAL: CRITICAL: 6.67% of data above the critical threshold [3.0]
[13:48:06] Release-Engineering-Team (Kanban), Patch-For-Review, Release, Train Deployments, User-zeljkofilipin: 1.33.0-wmf.21 deployment blockers - https://phabricator.wikimedia.org/T206675 (zeljkofilipin) I'm not sure if this is a blocker: [[ https://lists.wikimedia.org/pipermail/wikitech-l/2019-March/...
[13:51:13] PROBLEM - Puppet errors on integration-slave-docker-1050 is CRITICAL: CRITICAL: 8.99% of data above the critical threshold [3.0]
[13:52:28] Release-Engineering-Team (Kanban), Patch-For-Review, Release, Train Deployments, User-zeljkofilipin: 1.33.0-wmf.21 deployment blockers - https://phabricator.wikimedia.org/T206675 (herron) >>! In T206675#5026999, @zeljkofilipin wrote: >>>! In T206675#5024498, @herron wrote: >> https://logstash...
[13:52:31] PROBLEM - Puppet errors on integration-slave-docker-1052 is CRITICAL: CRITICAL: 10.00% of data above the critical threshold [3.0]
[13:54:26] PROBLEM - Puppet errors on integration-slave-docker-1041 is CRITICAL: CRITICAL: 10.00% of data above the critical threshold [3.0]
[13:56:53] PROBLEM - Puppet errors on integration-slave-jessie-1001 is CRITICAL: CRITICAL: 7.78% of data above the critical threshold [3.0]
[13:59:18] PROBLEM - Puppet errors on integration-slave-docker-1037 is CRITICAL: CRITICAL: 8.99% of data above the critical threshold [3.0]
[13:59:24] PROBLEM - Puppet errors on integration-slave-jessie-1004 is CRITICAL: CRITICAL: 11.24% of data above the critical threshold [3.0]
[14:00:01] PROBLEM - Puppet errors on integration-slave-docker-1051 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [3.0]
[14:01:45] Continuous-Integration-Config, MediaWiki-Core-Testing, MediaWiki-extensions-MultimediaViewer, MobileFrontend, and 9 others: Audit tests/selenium/LocalSettings.php file aiming at possibly deprecating the feature - https://phabricator.wikimedia.org/T199939 (kostajh) I'm working on {T218345}, where...
[14:21:47] (PS1) Ema: Test operations/debs/superior-cache-analyzer [integration/config] - https://gerrit.wikimedia.org/r/496783 (https://phabricator.wikimedia.org/T213263)
[14:51:35] (CR) Daniel Kinzler: [C: +1] Adds Vedmaka Wakalaka to the CI whitelist [integration/config] - https://gerrit.wikimedia.org/r/496196 (owner: Vedmaka Wakalaka)
[14:51:38] Phabricator: On Phabricator workboard, show status of associated Gerrit patches - https://phabricator.wikimedia.org/T215148 (hashar) It is good to see enhancements are being made :] For the limit of 50 items to `q=`, I guess you can also consider making those queries in parallel.
[15:00:52] Continuous-Integration-Infrastructure, HHVM, Language-Team (Language-2019-January-March), Patch-For-Review, Wikimedia-production-error (Shared Build Failure): Merge blocker: quibble-vendor-mysql-hhvm-docker in gate fails for most merges (exit status -11... - https://phabricator.wikimedia.org/T216689
[15:05:19] Continuous-Integration-Infrastructure, HHVM, Language-Team (Language-2019-January-March), Patch-For-Review, Wikimedia-production-error (Shared Build Failure): Merge blocker: quibble-vendor-mysql-hhvm-docker in gate fails for most merges (exit status -11... - https://phabricator.wikimedia.org/T216689
[15:10:02] Continuous-Integration-Infrastructure, Growth-Team, Notifications: quibble-vendor-mysql-hhvm-docker fails on EchoDiscussionParserTest - https://phabricator.wikimedia.org/T218388 (kostajh) See also {T194632}
[15:12:28] Release-Engineering-Team (Kanban), Patch-For-Review, Release, Train Deployments, User-zeljkofilipin: 1.33.0-wmf.21 deployment blockers - https://phabricator.wikimedia.org/T206675 (Jdforrester-WMF) >>! In T206675#5027002, @zeljkofilipin wrote: >>>! In T206675#5025503, @Jdforrester-WMF wrote: >...
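hashar's suggestion on T215148 above (work around the limit of 50 items per `q=` by issuing the queries in parallel) could be sketched roughly as follows. This is only an illustration under stated assumptions: `fetch_batch` is a hypothetical callable standing in for whatever actually performs one Gerrit query, and the worker count is arbitrary. Nothing here is taken from the actual patch.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence

BATCH_SIZE = 50  # the per-query item limit mentioned in the comment above


def chunked(items: Sequence[str], size: int = BATCH_SIZE) -> List[List[str]]:
    """Split items into consecutive batches of at most `size` entries."""
    return [list(items[i:i + size]) for i in range(0, len(items), size)]


def query_in_parallel(
    task_ids: Sequence[str],
    fetch_batch: Callable[[List[str]], List[dict]],  # hypothetical: runs one query
    max_workers: int = 4,  # arbitrary choice for the sketch
) -> List[dict]:
    """Run one fetch per batch concurrently and flatten results in batch order."""
    batches = chunked(task_ids)
    if not batches:
        return []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the same order as the input batches.
        results = pool.map(fetch_batch, batches)
        return [change for batch in results for change in batch]
```

Passing the fetch function in as a parameter keeps the batching logic testable without any network access.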
[15:22:33] Release-Engineering-Team (Kanban), Patch-For-Review, Release, Train Deployments, User-zeljkofilipin: 1.33.0-wmf.21 deployment blockers - https://phabricator.wikimedia.org/T206675 (zeljkofilipin) >>! In T206675#5027019, @herron wrote: > The logstash link above might be too specific since it on...
[15:26:07] Beta-Cluster-Infrastructure, Elasticsearch, Wikimedia-Logstash, Discovery-Search (Current work), Patch-For-Review: ApiFeatureUsage data is not being populated in the Beta Cluster - https://phabricator.wikimedia.org/T183156 (Anomie) Confirmed that deployment-elastic05 now contains indexes for...
[15:35:40] RECOVERY - Puppet errors on deployment-deploy02 is OK: OK: Less than 1.00% above the threshold [2.0]
[15:36:42] RECOVERY - Puppet errors on deployment-mediawiki-07 is OK: OK: Less than 1.00% above the threshold [2.0]
[16:07:42] Project beta-scap-eqiad build #241519: ABORTED in 2 min 47 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/241519/
[16:08:20] !log disable beta-scap-eqiad to test new php, back shortly
[16:08:21] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[16:20:44] (CR) Hashar: "Still have to update all jobs." [integration/config] - https://gerrit.wikimedia.org/r/496620 (https://phabricator.wikimedia.org/T216689) (owner: Hashar)
[16:26:44] Beta-Cluster-Infrastructure, Elasticsearch, Wikimedia-Logstash, Discovery-Search (Current work): ApiFeatureUsage data is not being populated in the Beta Cluster - https://phabricator.wikimedia.org/T183156 (EBernhardson)
[16:46:00] hey @thcipriani yesterday Greg-g gave us permission for a special Friday deploy
[16:46:20] (2:41 PM)
[16:46:32] I just want to work out what time that would be possible
[16:48:16] Scap, serviceops: Define a mediawiki "version" - https://phabricator.wikimedia.org/T218412 (jijiki) p:Triage→Normal
[16:48:33] jdlrobson: deploy of what? probably the sooner the better just so more folks are around if they're needed. Do you need me to deploy?
[16:49:06] i'll prepare the patch now
[16:49:11] basically the mobile editor is broken in iOS
[16:49:17] (see thread up from kaldari)
[16:49:19] ah
[16:49:21] i'll get the patch ready now for you..
[16:49:25] cool
[16:54:48] https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/MobileFrontend/+/496827 iOS: Fix mobile editor (squash of 3 commits) < @thcipriani that will be the commit; I'm just getting some local verification from others before proceeding
[16:54:57] !log reenable beta-scap-eqiad
[16:54:58] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[16:58:11] Scap, serviceops: Define a mediawiki "version" - https://phabricator.wikimedia.org/T218412 (Dzahn) How about this: We calculate the sha1 sum of each file in /srv/mediawiki/php/cache/gitinfo and then the sha1 sum of the sum of these, like so: `sha1sum /srv/mediawiki/php/cache/gitinfo/* | sha1sum` I tri...
[16:58:16] Scap, serviceops: Define a mediawiki "version" - https://phabricator.wikimedia.org/T218412 (jijiki)
[17:01:24] Beta-Cluster-Infrastructure, PHP 7.0 support: Cannot access beta cluster db - https://phabricator.wikimedia.org/T217938 (greg) >>! In T217938#5024516, @mmodell wrote: > on `deployment-deploy01` in `/usr/local/bin/sql` we have `php=php7.0` but apt doesn't have a `php7.0-redis package` Erik B just tried m...
[17:01:55] Beta-Cluster-Infrastructure, PHP 7.0 support: Beta Cluster does not have php7.0-redis available - https://phabricator.wikimedia.org/T217938 (greg)
[17:02:10] thcipriani: just waiting on a green light from @matmarex
[17:05:20] Release-Engineering-Team (Kanban), User-zeljkofilipin: Evaluate sourcehut builds - https://phabricator.wikimedia.org/T217852 (zeljkofilipin) I've tested sourcehut too, and I really like it. The workflow is exactly what we are looking for: # push dotfile (`.build.yml`) to a repo (example: [[ https://git...
[17:06:31] thcipriani: ok ready!
[17:06:47] do you need me to put https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/MobileFrontend/+/496827 somewhere on the calendar?
[17:08:11] jdlrobson: if you could add a row there for...SWAT I suppose...that'd be great :)
[17:08:19] patch +2'd waiting on CI, FYI
[17:11:49] https://wikitech.wikimedia.org/wiki/Deployments < done
[17:11:58] thank you!
[17:18:26] Continuous-Integration-Infrastructure, HHVM, Language-Team (Language-2019-January-March), Patch-For-Review, Wikimedia-production-error (Shared Build Failure): Merge blocker: quibble-vendor-mysql-hhvm-docker in gate fails for most merges (exit status -11... - https://phabricator.wikimedia.org/T216689
[17:18:28] Release-Engineering-Team (Kanban): Consider and evaluate possible new CI tooling - https://phabricator.wikimedia.org/T217325 (zeljkofilipin)
[17:18:30] Release-Engineering-Team (Kanban), User-zeljkofilipin: Evaluate sourcehut builds - https://phabricator.wikimedia.org/T217852 (zeljkofilipin) Open→Resolved
[17:20:33] oh no it quibbled thcipriani
[17:20:41] can we force merge or do we need to try again?
[17:20:44] seems unrelated
[17:21:01] ERROR in Flow\Tests\Api\ApiFlowModerateTopicTest::testModerateLockedTopic: [PHPUnit\Framework\ExceptionWrapper] A database query error has occurred. Did you forget to run your application's database schema updater after upgrading?
[17:23:21] huh, that seems like a legit test failure rather than an infrastructure failure (possibly)
[17:24:02] but shouldn't be related to this change - i'm not touching any PHP code
[17:24:08] hrm, maybe a weird test.
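The "version" fingerprint Dzahn proposes on T218412 above (`sha1sum /srv/mediawiki/php/cache/gitinfo/* | sha1sum`) can be approximated in Python as below. This is a sketch of the idea, not a drop-in replacement: it hashes file names rather than full paths and sorts by code point, so its output is not expected to match the shell pipeline byte for byte. The function names are made up for illustration.

```python
import hashlib
from pathlib import Path


def file_sha1(path: Path) -> str:
    """Return the hex SHA-1 of one file, read in blocks to bound memory use."""
    digest = hashlib.sha1()
    with path.open("rb") as handle:
        for block in iter(lambda: handle.read(65536), b""):
            digest.update(block)
    return digest.hexdigest()


def directory_fingerprint(directory: str) -> str:
    """SHA-1 of the sorted "<sha1>  <name>" listing of all files in a directory.

    Assumption: only regular files directly inside `directory` count,
    mirroring what the `*` glob in the shell one-liner would pick up.
    """
    files = sorted(p for p in Path(directory).iterdir() if p.is_file())
    listing = "".join(f"{file_sha1(p)}  {p.name}\n" for p in files)
    return hashlib.sha1(listing.encode("utf-8")).hexdigest()
```

Using the file name instead of the absolute path makes the result independent of where the cache directory lives, which is a deliberate difference from the shell version.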
[17:25:14] * thcipriani re-runs merge tests
[17:25:34] I can only flaunt so many norms and feel comfortable with it :)
[17:28:08] Phabricator: Phabricator admin rights for bawolff - https://phabricator.wikimedia.org/T217917 (Aklapper) Open→Resolved a:Aklapper Done.
[17:29:14] oh good.
[17:29:18] npm checksum failure.
[17:30:15] thcipriani: we merged a change to keyholder in ops/puppet that will restart the service. I think most prod copies are re-armed now, but it will hit deployment-prep soon
[17:31:55] bd808: thanks for the heads-up. I can rearm (assuming puppet update is working, which I can check after I take care of deploying mobilefrontend fix)
[17:37:04] Project beta-scap-eqiad build #241522: FAILURE in 9 min 20 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/241522/
[17:37:59] heh, well, there it is
[17:38:02] * thcipriani arms
[17:38:35] o_O
[17:40:25] jdlrobson: merged.
[17:40:40] w00t and synced?
[17:40:43] or on debug1002?
[17:40:48] not yet, one sec
[17:40:52] !log rearm beta keyholder
[17:40:53] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[17:48:23] Yippee, build fixed!
[17:48:24] Project beta-scap-eqiad build #241523: FIXED in 9 min 59 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/241523/
[17:55:46] Beta-Cluster-Infrastructure, Elasticsearch, Wikimedia-Logstash, Discovery-Search (Current work): ApiFeatureUsage data is not being populated in the Beta Cluster - https://phabricator.wikimedia.org/T183156 (Anomie) Open→Resolved Let's call this resolved then, unless you want to keep it ope...
[18:11:15] PROBLEM - Disk space on contint1001 is CRITICAL: DISK CRITICAL - free space: / 2566 MB (5% inode=57%)
[18:14:14] (PS1) Smalyshev: Bump Blazegraph time limit again [integration/config] - https://gerrit.wikimedia.org/r/496843
[18:17:44] * thcipriani looks at contint1001
[18:21:12] !log clean old docker images from contint1001
[18:21:13] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[18:34:12] RECOVERY - Disk space on contint1001 is OK: DISK OK
[18:37:50] Release-Engineering-Team (Kanban), Developer Productivity, Documentation: Improve documentation on Docker-based development environments for new developers - https://phabricator.wikimedia.org/T217614 (srodlund) @brennen After you feel you have a solid plan in place for improving the Docker docs or ha...
[19:05:04] Project beta-scap-eqiad build #241530: FAILURE in 9 min 3 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/241530/
[19:16:10] Yippee, build fixed!
[19:16:10] Project beta-scap-eqiad build #241531: FIXED in 9 min 46 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/241531/
[19:26:49] Continuous-Integration-Config, Project-Admins: Create phabricator tag to track CI blockers - https://phabricator.wikimedia.org/T218043 (Aklapper) I'd love to have input from the CI crew here before proceeding
[19:28:04] Continuous-Integration-Config, Project-Admins: Create phabricator tag to track CI blockers - https://phabricator.wikimedia.org/T218043 (Jdforrester-WMF) We use #jenkins-failure for these. I don't think we need a second tag?
[19:28:26] Project beta-scap-eqiad build #241533: FAILURE in 1.1 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/241533/
[19:34:24] Project beta-scap-eqiad build #241534: STILL FAILING in 5.2 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/241534/
[19:40:30] Continuous-Integration-Config, Project-Admins: Create phabricator tag to track CI blockers - https://phabricator.wikimedia.org/T218043 (greg) Open→Invalid >>! In T218043#5027961, @Jdforrester-WMF wrote: > We use #jenkins-failure for these. I don't think we need a second tag? yeah, use that one ^...
[19:44:45] Continuous-Integration-Config, Project-Admins: Create phabricator tag to track CI blockers (#jenkins-failure) - https://phabricator.wikimedia.org/T218043 (Aklapper)
[19:54:00] Yippee, build fixed!
[19:54:01] Project beta-scap-eqiad build #241535: FIXED in 9 min 38 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/241535/
[20:01:02] PROBLEM - Puppet errors on deployment-mwmaint01 is CRITICAL: CRITICAL: 6.67% of data above the critical threshold [3.0]
[20:01:10] Release-Engineering-Team (Watching / External), Operations, Release Pipeline, Core Platform Team Backlog (Watching / External), and 2 others: Track and install additional npm packages for all service container images - https://phabricator.wikimedia.org/T205911 (dduvall) >>! In T205911#5026051, @m...
[20:01:13] greg-g: how did you get that "silent edit" icon in https://phabricator.wikimedia.org/T174022#5009969 ? Did that require stuff on the shell, or only web ui?
[20:01:39] * andre__ still wondering how to mass-move the rest of sre tasks in https://phabricator.wikimedia.org/T197624 to another column without being super noisy
[20:03:05] ah, CLI only, I guess
[20:04:21] (CR) Krinkle: [C: +2] Remove a few redundant mediawiki/job quibble jobs [integration/config] - https://gerrit.wikimedia.org/r/496688 (owner: Krinkle)
[20:04:59] PROBLEM - Puppet errors on deployment-jobrunner03 is CRITICAL: CRITICAL: 8.99% of data above the critical threshold [3.0]
[20:07:03] (Merged) jenkins-bot: Remove a few redundant mediawiki/job quibble jobs [integration/config] - https://gerrit.wikimedia.org/r/496688 (owner: Krinkle)
[20:10:41] !log Reloading Zuul to deploy https://gerrit.wikimedia.org/r/496688
[20:10:43] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[20:11:22] Thanks Krinkle, I was just about to complain that the gate-and-submit queue had been stuck for 93 minutes, but your restart seems to have gotten it unstuck
[20:12:09] Haha
[20:12:10] Nice
[20:26:37] Project beta-scap-eqiad build #241538: FAILURE in 9 min 8 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/241538/
[20:26:46] (PS1) Krinkle: Update mwext-EventLogging postmerge from jsduck to generic node10 [integration/config] - https://gerrit.wikimedia.org/r/496882 (https://phabricator.wikimedia.org/T138401)
[20:26:53] (CR) Krinkle: [C: +2] Update mwext-EventLogging postmerge from jsduck to generic node10 [integration/config] - https://gerrit.wikimedia.org/r/496882 (https://phabricator.wikimedia.org/T138401) (owner: Krinkle)
[20:29:45] (PS2) VolkerE: Update mwext-EventLogging postmerge from jsduck to generic node10 [integration/config] - https://gerrit.wikimedia.org/r/496882 (https://phabricator.wikimedia.org/T138401) (owner: Krinkle)
[20:30:00] (CR) VolkerE: "Nice." [integration/config] - https://gerrit.wikimedia.org/r/496882 (https://phabricator.wikimedia.org/T138401) (owner: Krinkle)
[20:30:28] (CR) Krinkle: [C: +2] Update mwext-EventLogging postmerge from jsduck to generic node10 [integration/config] - https://gerrit.wikimedia.org/r/496882 (https://phabricator.wikimedia.org/T138401) (owner: Krinkle)
[20:32:01] (Merged) jenkins-bot: Update mwext-EventLogging postmerge from jsduck to generic node10 [integration/config] - https://gerrit.wikimedia.org/r/496882 (https://phabricator.wikimedia.org/T138401) (owner: Krinkle)
[20:37:35] Yippee, build fixed!
[20:37:36] Project beta-scap-eqiad build #241539: FIXED in 9 min 38 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/241539/
[20:38:43] andre__: yeah I did the cli trick
[20:38:57] Project beta-scap-eqiad build #241540: FAILURE in 2.4 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/241540/
[20:42:21] meh, thanks
[20:52:43] thcipriani https://gerrit-review.googlesource.com/c/gerrit/+/218010
[20:53:57] Yippee, build fixed!
[20:53:58] Project beta-scap-eqiad build #241541: FIXED in 9 min 35 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/241541/
[21:01:41] (PS1) Jforrester: [Flow] Replace mwext-jsduck-publish with generic-node10-docs-docker-publish [integration/config] - https://gerrit.wikimedia.org/r/496887
[21:01:41] (PS1) Jforrester: [Kartographer] Replace mwext-jsduck-publish with generic-node10-docs-docker-publish [integration/config] - https://gerrit.wikimedia.org/r/496888
[21:01:44] (PS1) Jforrester: [TemplateData] Replace mwext-jsduck-publish with generic-node10-docs-docker-publish [integration/config] - https://gerrit.wikimedia.org/r/496889
[21:01:45] (PS1) Jforrester: [CollabKit] Replace mwext-jsduck-publish with generic-node10-docs-docker-publish [integration/config] - https://gerrit.wikimedia.org/r/496890
[21:01:47] (PS1) Jforrester: [MultimediaViewer] Replace mwext-jsduck-publish with generic-node10-docs-docker-publish [integration/config] - https://gerrit.wikimedia.org/r/496891
[21:01:50] (PS1) Jforrester: [GuidedTour] Replace mwext-jsduck-publish with generic-node10-docs-docker-publish [integration/config] - https://gerrit.wikimedia.org/r/496892
[21:01:52] :O
[21:01:53] (PS1) Jforrester: [Wikibase] Replace mwext-jsduck-publish with generic-node10-docs-docker-publish [integration/config] - https://gerrit.wikimedia.org/r/496893
[21:01:53] (PS1) Jforrester: Drop mwext-jsduck-publish, no longer used [integration/config] - https://gerrit.wikimedia.org/r/496894
[21:03:15] what's going on with Wikibase\Repo\Tests\Api\SetAliasesTest::testUserCannotSetAliasesWhenTheyLackPermission ?
[21:03:28] PROBLEM - Host integration-slave-docker-1046 is DOWN: CRITICAL - Host Unreachable (172.16.1.115)
[21:03:50] cscott: It's flaking but not failing 100%.
[21:03:54] a bunch of gate-and-submit builds are failing with that, but not all.
[21:03:56] yeah
[21:04:08] James_F: and when it fails, it throws away a huge amount of speculative work
[21:04:15] 39 jobs worth or so
[21:04:17] https://phabricator.wikimedia.org/T218378
[21:04:29] Well, yes, all gate failures are always terribly inefficient.
[21:04:44] can we dial down the speculation queue on gate-and-submit until that's fixed, so it doesn't take quite so long for the top of the queue to fail?
[21:04:50] But CI being backlogged by two hours isn't great.
[21:05:17] +1
[21:05:25] cscott: There's no "speculation queue". It's a strict queue of C+2'ed patches in repos which depend on each other. Wikibase is defined as a gate (depended-by-everyone) repo.
[21:05:53] (As are ~20 other repos, like VE and Notifications and Vector and so on.)
[21:05:59] James_F: yes, but it starts executing the tests for the second, third, fourth, etc entries in the queue on the assumption that the head will pass
[21:06:16] Project beta-scap-eqiad build #241543: FAILURE in 1 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/241543/
[21:06:23] and that's what's grinding jenkins to a halt, since that's a lot of extra work it is wasting
[21:06:29] It only uses available executors which aren't needed earlier in the run.
[21:06:35] Otherwise they'd just be idle.
[21:07:05] well, judging by the pace at which phpunit dots are crawling across the console logs, i'd say that there is actual contention going on
[21:07:27] Yes, but that's contention on the cloud services level, which we can't control.
[21:07:45] Aka go shout at people writing expensive queries in Quarry or whatever. ;-(
[21:08:05] the unicorns are unhappy
[21:08:11] They're not the only ones.
[21:09:52] https://gerrit.wikimedia.org/r/496080 is one of a vanishingly small # of patches which have actually merged in the past two hours
[21:09:58] James_F: it looks like it's going to get more backlogged, apparently l10n-bot is submitting translations
[21:10:05] * James_F sighs.
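The behaviour cscott describes above (changes behind the head of gate-and-submit are tested on the assumption that everything ahead of them passes, so a failure at the head throws that work away) can be illustrated with a toy model. This is a deliberately simplified sketch, not Zuul's actual scheduler: it assumes every queued change is under test at once, that each change has a fixed pass/fail outcome, and that a head failure discards exactly one run per change still behind it.

```python
from typing import List, Tuple


def simulate_gate(outcomes: List[bool]) -> Tuple[int, int]:
    """Return (merged, discarded_runs) for a strictly ordered gate queue.

    outcomes[i] is True when change i passes its tests. Toy assumption:
    all queued changes are tested concurrently on top of the changes
    ahead of them, so a failing head invalidates every run behind it.
    """
    queue = list(outcomes)
    merged = 0
    discarded = 0
    while queue:
        head_passes = queue.pop(0)
        if head_passes:
            merged += 1
        else:
            # Everything still queued was built on top of the failed change
            # and has to be re-run without it.
            discarded += len(queue)
    return merged, discarded
```

In this model `simulate_gate([False, True, True])` returns `(2, 2)`: both later changes still merge, but only after their first runs are thrown away, which is the cost being argued about in the conversation.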
[21:10:22] TBF l10n-bot's CI checks are cheap, but the merges are disruptive, of course
[21:12:51] can we disable the broken wikibase test?
[21:13:53] Yes, but it might not be the cause, just a symptom. Ideally we'd have a Wikibassist to help diagnose.
[21:14:16] (This is the… fifth? CI-wide flaky/broken test I've dealt with this week.)
[21:20:37] James_F: yeah, even just today to have the echo issue and the wikibase issue together seems ... much
[21:21:02] anyway, i'm going to C-2 and bail on the patch set I was trying to get merged, enjoy my weekend, and try again on monday.
[21:22:23] T218172 T216689 T218388 and now T218378. Oy.
[21:22:23] T218378: Flaky test Wikibase\Repo SetAliasesTest::testUserCannotSetAliasesWhenTheyLackPermission - https://phabricator.wikimedia.org/T218378
[21:22:24] T216689: Merge blocker: quibble-vendor-mysql-hhvm-docker in gate fails for most merges (exit status -11) - https://phabricator.wikimedia.org/T216689
[21:22:24] T218172: Can't merge in master of UploadWizard; error thrown by MobileFrontend NearbyGateway.js _distanceMessage() test - https://phabricator.wikimedia.org/T218172
[21:22:25] T218388: quibble-vendor-mysql-hhvm-docker fails on EchoDiscussionParserTest - https://phabricator.wikimedia.org/T218388
[21:23:58] Yippee, build fixed!
[21:23:58] Project beta-scap-eqiad build #241544: FIXED in 9 min 37 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/241544/
[21:25:14] James_F: is there a quicker way to cancel the build for https://gerrit.wikimedia.org/r/496845 than just to wait for it to inevitably fail?
[21:25:47] cscott: One mo.
[21:27:13] cscott: There, I aborted all of the still-running jobs.
[21:27:34] thanks. that should reduce the backlog a little bit, too, since i had a nine-patch sequence in there.
[21:28:07] And 480889 etc. have dropped out of the backlog, yes.
[21:28:25] Down to 'only' 16 patches.
[21:29:11] if we keep cancelling jobs, eventually our backlog will be zero and we can pretend the problem is totally fixed [21:29:17] ;) [21:29:39] Can I just set the "State" of Gerrit projects from "Active" to "Read Only" or does that break random stuff? Asking because https://gerrit.wikimedia.org/r/#/admin/projects/generator-wikimedia-php-library has been archived for ages (see its last commit). [21:31:05] before I go, quick wikilove James_F -- stumbled a bunch of "squid doesn't exist any more" cleanup patches from you and wanted to give you thanx and appreciation for toiling in the cleanup-our-codebase pits [21:31:37] cscott: That's very kind. Thank *you* for your epic patience in making Parsoid better. :-) [21:32:25] 10Release-Engineering-Team (Watching / External), 10Operations, 10Release Pipeline, 10Core Platform Team Backlog (Watching / External), and 2 others: Track and install additional npm packages for all service container images - https://phabricator.wikimedia.org/T205911 (10dduvall) >>! In T205911#5028106, @d... [21:36:18] Project beta-scap-eqiad build #241546: 04FAILURE in 2.9 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/241546/ [21:43:29] (03PS1) 10Krinkle: Disable post-merge doc/coverage publish for l10n commits [integration/config] - 10https://gerrit.wikimedia.org/r/496976 [21:43:55] (03CR) 10Krinkle: "Every week day this causes a backlog on the postmerge pipeline which then drains resources from other pipelines taking quite a while to fl" [integration/config] - 10https://gerrit.wikimedia.org/r/496976 (owner: 10Krinkle) [21:44:09] (03CR) 10Jforrester: [C: 03+1] "Neat." [integration/config] - 10https://gerrit.wikimedia.org/r/496976 (owner: 10Krinkle) [21:45:00] Krinkle: In case you can't tell, I git-stalk integration/config.git. 
;-)
[21:46:11] (PS1) Krinkle: Set low priority on job scheduling for postmerge pipeline [integration/config] - https://gerrit.wikimedia.org/r/496978
[21:46:19] James_F: Yeah, I figured :)
[21:46:24] Phabricator: drydock: translation typo on source: `chagne` - https://phabricator.wikimedia.org/T218447 (MarcoAurelio)
[21:47:05] Some comments appear before my browser has notified me of the commit arriving in Gerrit via IRC. Which, admittedly isn't always very hard to beat (it's a *lot* of layers to go through), but does still mean you got it elsewhere :)
[21:47:11] * James_F laughs.
[21:47:37] Krinkle: Feel like pushing https://gerrit.wikimedia.org/r/496887 and friends? I'll test and confirm.
[21:47:59] Sure. Still need to test the EL change... it's in a tab somewhere.
[21:48:04] James_F: I assume it has package.json set to jsduck?
[21:48:21] Phabricator: drydock: translation typo on source: `chagne` - https://phabricator.wikimedia.org/T218447 (epriestley) Thanks, see .
[21:48:23] (CR) Krinkle: [Flow] Replace mwext-jsduck-publish with generic-node10-docs-docker-publish (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/496887 (owner: Jforrester)
[21:48:39] Krinkle: https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/extensions/Flow/+/master/package.json#8
[21:49:03] yay @ scripts.doc. boo @ name/version/desc.
[21:49:25] Krinkle: Doesn't removing the extension-jsduck template mean that documentation is only tested in gate, not in check?
[21:49:27] (CR) Krinkle: [C: +2] Disable post-merge doc/coverage publish for l10n commits [integration/config] - https://gerrit.wikimedia.org/r/496976 (owner: Krinkle)
[21:49:51] (That's why I left it in.)
[21:50:31] James_F: indeed. didn't think of that. Can you find an example of a repo that has npm run doc on test (e.g.
not publish, but test only)
[21:50:36] It must exist I assume
[21:50:43] (CR) Krinkle: [C: +2] Set low priority on job scheduling for postmerge pipeline [integration/config] - https://gerrit.wikimedia.org/r/496978 (owner: Krinkle)
[21:51:27] (Merged) jenkins-bot: Disable post-merge doc/coverage publish for l10n commits [integration/config] - https://gerrit.wikimedia.org/r/496976 (owner: Krinkle)
[21:51:31] Phabricator: drydock: translation typo on source: `chagne` - https://phabricator.wikimedia.org/T218447 (MarcoAurelio) That was fast. Thank you very much.
[21:51:37] Krinkle: mediawiki/extensions/OOJsUIAjaxLogin
[21:51:45] extension-jsduck but no publish.
[21:52:20] (So test and gate but not publish.)
[21:52:29] !log Reloading Zuul to deploy https://gerrit.wikimedia.org/r/496882
[21:52:30] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[21:52:43] (Merged) jenkins-bot: Set low priority on job scheduling for postmerge pipeline [integration/config] - https://gerrit.wikimedia.org/r/496978 (owner: Krinkle)
[21:56:57] Yippee, build fixed!
[21:56:57] Project beta-scap-eqiad build #241547: FIXED in 12 min: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/241547/
[21:58:13] !log Reloading Zuul to deploy https://gerrit.wikimedia.org/r/496978
[21:58:14] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[21:59:58] RECOVERY - Puppet errors on deployment-jobrunner03 is OK: OK: Less than 1.00% above the threshold [2.0]
[22:01:02] RECOVERY - Puppet errors on deployment-mwmaint01 is OK: OK: Less than 1.00% above the threshold [2.0]
[22:10:23] Phabricator: Unhandled exception when accessing to a Phabricator mock - https://phabricator.wikimedia.org/T218450 (abian)
[22:22:53] Phabricator (Upstream), Upstream: drydock: translation typo on source: `chagne` - https://phabricator.wikimedia.org/T218447 (Aklapper)
[22:47:33] Phabricator: Exception when accessing to a Phabricator mock: "Attempting to add more metadata after metadata has been locked." - https://phabricator.wikimedia.org/T218450 (Aklapper)
[22:49:13] Phabricator: Exception when accessing to a Phabricator mock: "Attempting to add more metadata after metadata has been locked." - https://phabricator.wikimedia.org/T218450 (Aklapper) Confirming. https://phabricator.wikimedia.org/pholio/edit/101/ works though. https://secure.phabricator.com/D17209 (closed two...
[23:25:12] James_F: is https://integration.wikimedia.org/ci/job/quibble-vendor-mysql-php70-docker/18204/console failure expected? Was this the misbehaving test you all were talking about?
[23:26:04] Project beta-scap-eqiad build #241555: FAILURE in 11 min: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/241555/
[23:42:32] Yippee, build fixed!
[23:42:32] Project beta-scap-eqiad build #241556: FIXED in 15 min: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/241556/
[23:49:14] is zuul stuck on something?
A lot of tests in submit state with all unit tests green but not merged
[23:51:19] SMalyshev: The stack's waiting on the last job for 486008.
[23:51:37] Which is https://integration.wikimedia.org/ci/job/quibble-vendor-mysql-hhvm-docker/40099/console
[23:53:01] (Yay for backlogs.)
[23:55:00] SMalyshev: Unfortunately the Wikibase flaky test failed, so the entire two hour stack of green got reset. :-(
[23:58:03] Krinkle, SMalyshev: Feel free to C+2 https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/Wikibase/+/497015 which disables the test.