[03:52:09] 10Phabricator, 10Release-Engineering-Team, 10Operations, 10Wikimedia-Incident: Analyze and amend (if necessary) workflow of user reporting and detecting large regressions/outages - https://phabricator.wikimedia.org/T219589 (10Aklapper) >>! In T219589#5072592, @Yann wrote: > Hi, There should be a clear way... [05:59:47] PROBLEM - Free space - all mounts on deployment-fluorine02 is CRITICAL: CRITICAL: deployment-prep.deployment-fluorine02.diskspace._srv.byte_percentfree (<50.00%) [06:34:21] 10Release-Engineering-Team, 10serviceops: Rebuild integration/config images based on jessie - https://phabricator.wikimedia.org/T219748 (10Joe) [06:34:27] (03CR) 10PipelineBot: "pipeline-dashboard: service-pipeline-test" [tools/scap] - 10https://gerrit.wikimedia.org/r/498340 (https://phabricator.wikimedia.org/T216518) (owner: 10Effie Mouzeli) [06:53:16] (03PS1) 10Giuseppe Lavagetto: Create an image for building php packages [integration/config] - 10https://gerrit.wikimedia.org/r/500381 (https://phabricator.wikimedia.org/T216712) [06:53:18] (03PS1) 10Giuseppe Lavagetto: Rebuild all jessie-based images for removal of backports, updates [integration/config] - 10https://gerrit.wikimedia.org/r/500382 (https://phabricator.wikimedia.org/T219748) [06:53:37] (03CR) 10jerkins-bot: [V: 04-1] Rebuild all jessie-based images for removal of backports, updates [integration/config] - 10https://gerrit.wikimedia.org/r/500382 (https://phabricator.wikimedia.org/T219748) (owner: 10Giuseppe Lavagetto) [06:54:23] 10Phabricator, 10Release-Engineering-Team, 10Operations, 10Wikimedia-Incident: Analyze and amend (if necessary) workflow of user reporting and detecting large regressions/outages - https://phabricator.wikimedia.org/T219589 (10Peachey88) I would prefer to see someone over prioritize a task so it shows up ea... 
[06:59:32] (03PS2) 10Giuseppe Lavagetto: Rebuild all jessie-based images for removal of backports, updates [integration/config] - 10https://gerrit.wikimedia.org/r/500382 (https://phabricator.wikimedia.org/T219748) [07:04:48] RECOVERY - Free space - all mounts on deployment-fluorine02 is OK: OK: All targets OK [07:22:08] (03PS1) 10Giuseppe Lavagetto: Edit Project Config [docker-images/production-images] (refs/meta/config) - 10https://gerrit.wikimedia.org/r/500384 [07:23:50] (03PS1) 10Giuseppe Lavagetto: Edit Project Config [docker-images] (refs/meta/config) - 10https://gerrit.wikimedia.org/r/500385 [07:24:53] (03PS1) 10Giuseppe Lavagetto: Edit Project Config [docker-images/production-images] (refs/meta/config) - 10https://gerrit.wikimedia.org/r/500386 [07:27:50] (03PS2) 10Giuseppe Lavagetto: Edit Project Config [docker-images/production-images] (refs/meta/config) - 10https://gerrit.wikimedia.org/r/500384 [07:28:05] (03CR) 10Giuseppe Lavagetto: [V: 03+2 C: 03+2] Edit Project Config [docker-images/production-images] (refs/meta/config) - 10https://gerrit.wikimedia.org/r/500384 (owner: 10Giuseppe Lavagetto) [08:04:13] (03CR) 10Hashar: [C: 03+2] docker: rebuild ci-jessie due to changes in apt repo [integration/config] - 10https://gerrit.wikimedia.org/r/500143 (https://phabricator.wikimedia.org/T219683) (owner: 10Hashar) [08:06:13] (03Merged) 10jenkins-bot: docker: rebuild ci-jessie due to changes in apt repo [integration/config] - 10https://gerrit.wikimedia.org/r/500143 (https://phabricator.wikimedia.org/T219683) (owner: 10Hashar) [08:07:17] !log Rebuilding container docker-registry.wikimedia.org/wikimedia-jessie # T219683 [08:07:19] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [08:07:19] T219683: Rebuild docker-registry.wikimedia.org/wikimedia-jessie to drop jessie-update/jessie-backports - https://phabricator.wikimedia.org/T219683 [08:08:06] !log Rebuilding Quibble Jessie containers that failed to build last week due to 
wikimedia-jessie container. # T219647 [08:08:08] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [08:08:09] T219647: Upgrade CI jobs to Quibble 0.0.30 - https://phabricator.wikimedia.org/T219647 [08:10:04] <_joe_> hashar: ney [08:10:40] <_joe_> hashar: please stop [08:10:41] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team (Kanban), 10Quibble, 10Patch-For-Review: Upgrade CI jobs to Quibble 0.0.30 - https://phabricator.wikimedia.org/T219647 (10hashar) [08:10:42] <_joe_> and coordinate [08:11:13] <_joe_> 10:08:51 <_joe_> you should revert https://gerrit.wikimedia.org/r/c/integration/config/+/500143 [08:11:15] <_joe_> 10:09:21 <_joe_> and then merge and use https://gerrit.wikimedia.org/r/c/integration/config/+/500382 [08:11:27] <_joe_> I've rebuilt our base image not to use -updates [08:12:41] <_joe_> uhm no your patch can actually stay [08:12:49] <_joe_> but then you have to update mine :) [08:13:05] <_joe_> https://gerrit.wikimedia.org/r/c/integration/config/+/500382/2/dockerfiles/ci-jessie/changelog this is now out of date [08:42:50] _joe_: arghhhh sorry :( [08:43:10] <_joe_> hashar: that's fine, you just need to change the changelog entry for ci-jessie [08:43:19] _joe_: sorry those rebuilds were on my todo list for this morning, and since you had rebuilt the wikimedia-jessie container I immediately proceeded to my pending change :] [08:43:26] <_joe_> eheh [08:43:31] I haven't even seen your other changes :(( [08:43:37] <_joe_> I wanted to test the new version of docker-pkg update [08:43:44] oh [08:43:45] <_joe_> which will be ready and merged today [08:43:51] have you deployed it on contint1001 as well? 
[08:43:54] <_joe_> I need the prune action in production [08:43:56] <_joe_> nope [08:44:02] <_joe_> it will be deployed later :) [08:44:09] I have been running it locally from master forever :] [08:44:13] so it is probably fine [08:44:17] <_joe_> so that we can use all changes we made in the last few months [08:44:18] <_joe_> me too [08:44:19] notably --info is definitely useful [08:44:21] <_joe_> it should be [08:44:28] <_joe_> and prune, and --select [08:45:43] --select, I am always confused about it. I never remember what it is supposed to match against [08:46:18] the full image name with tag, the full image name, the local namespaced basename (e.g. releng/ci-jessie) or the directory name (ci-jessie) :D [08:46:26] I end up using a wildcard: *ci-jessie* [08:46:50] <_joe_> the full name with the tag :) [08:48:43] and sorry again for not having used your config change, I would have used it if I had noticed it :( [08:53:01] <_joe_> hah no problems [08:53:16] <_joe_> I just didn't want you to rebuild and publish without my update [08:53:21] <_joe_> btw, now that I think of it [08:53:48] <_joe_> we can even remove the change from ci-jessie - you'll rebuild it on top of the correct wikimedia-jessie anyways [08:54:20] 2019-04-01 10:52:25,497 [docker-pkg-build] INFO - E: Unable to locate package python3-git [08:54:21] sniff [08:54:47] * hashar remembers to never ever use -backports in the future [09:01:38] 10Continuous-Integration-Config, 10MediaWiki-extensions-Scribunto, 10Wikidata, 10Patch-For-Review, 10Wikidata-Campsite (Wikidata-Campsite-Iteration-∞): [Task] Add Scribunto to extension-gate in CI - https://phabricator.wikimedia.org/T125050 (10Tarrow) [09:01:56] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team (Kanban), 10Quibble, 10Patch-For-Review: Upgrade CI jobs to Quibble 0.0.30 - https://phabricator.wikimedia.org/T219647 (10hashar) Eventually `releng/quibble-jessie` fails to build: 2019-04-01 10:52:25,497 [docker-pkg-build] INFO - E:... 
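For reference, the --select confusion in the exchange above can be sketched in a few lines. This is only an illustration: it assumes the selection is a shell-style glob tested against the full image name with the tag (which is what _joe_ says it matches), the image name and tag are made-up values in the shape seen in the build logs, and `selects` is a hypothetical helper rather than docker-pkg code.

```python
from fnmatch import fnmatchcase

# Illustrative full image name with tag (assumed shape: registry/namespace/name:tag).
FULL_NAME = "docker-registry.discovery.wmnet/releng/ci-jessie:0.4.0"


def selects(pattern, full_name=FULL_NAME):
    """Hypothetical helper: glob-match a pattern against the full tagged name."""
    return fnmatchcase(full_name, pattern)


if __name__ == "__main__":
    print(selects("*ci-jessie*"))       # True: wildcards absorb registry and tag
    print(selects("releng/ci-jessie"))  # False: namespaced basename alone
    print(selects("ci-jessie"))         # False: directory name alone
```

Under that assumption, a bare name never matches because the glob has to cover the whole string, which would explain why wrapping the name in wildcards is the habit that works.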
[09:22:40] (03PS4) 10Effie Mouzeli: Add more tests for main [tools/scap] - 10https://gerrit.wikimedia.org/r/498340 (https://phabricator.wikimedia.org/T216518) [09:25:29] (03CR) 10PipelineBot: "pipeline-dashboard: service-pipeline-test" [tools/scap] - 10https://gerrit.wikimedia.org/r/498340 (https://phabricator.wikimedia.org/T216518) (owner: 10Effie Mouzeli) [09:35:23] PROBLEM - Puppet errors on deployment-cache-text05 is CRITICAL: CRITICAL: 8.99% of data above the critical threshold [3.0] [09:36:05] ^ me [09:53:59] (03PS3) 10Giuseppe Lavagetto: Rebuild all jessie-based images for removal of backports, updates [integration/config] - 10https://gerrit.wikimedia.org/r/500382 (https://phabricator.wikimedia.org/T219748) [10:00:13] (03PS1) 10Hashar: docker: quibble-jessie lack python3-git, use pip3 instead [integration/config] - 10https://gerrit.wikimedia.org/r/500402 (https://phabricator.wikimedia.org/T219647) [10:00:28] (03CR) 10Hashar: [C: 03+2] docker: quibble-jessie lack python3-git, use pip3 instead [integration/config] - 10https://gerrit.wikimedia.org/r/500402 (https://phabricator.wikimedia.org/T219647) (owner: 10Hashar) [10:02:06] (03Merged) 10jenkins-bot: docker: quibble-jessie lack python3-git, use pip3 instead [integration/config] - 10https://gerrit.wikimedia.org/r/500402 (https://phabricator.wikimedia.org/T219647) (owner: 10Hashar) [10:10:18] (03CR) 10Lucas Werkmeister (WMDE): Add my private email address to the CI whitelist (031 comment) [integration/config] - 10https://gerrit.wikimedia.org/r/500178 (owner: 10Lucas Werkmeister (WMDE)) [10:11:06] (03PS2) 10Lucas Werkmeister (WMDE): Add my private email address to the CI whitelist [integration/config] - 10https://gerrit.wikimedia.org/r/500178 [10:13:00] <_joe_> hashar: my patch can be merged I think [10:14:11] <_joe_> and well, then you need to rebuild the images :) [10:27:50] <_joe_> hashar: fyi, I'm deploying docker-pkg 1.1.2, with most of our enhancements [10:28:05] _joe_: lovely! 
thank you :) [10:28:33] <_joe_> including the parallelization of the queries to the registry [10:28:46] <_joe_> so building on contint1001 should be a tad better [10:38:01] (03PS1) 10Hashar: docker: typos depends on ci-src-setup-simple [integration/config] - 10https://gerrit.wikimedia.org/r/500412 [10:38:33] 10Release-Engineering-Team (Kanban), 10Release, 10Train Deployments: 1.33.0-wmf.24 deployment blockers - https://phabricator.wikimedia.org/T206678 (10Krenair) [10:43:19] 10Beta-Cluster-Infrastructure, 10Release-Engineering-Team (Watching / External), 10Patch-For-Review: Get letsencrypt wildcard cert for *.beta.wmflabs.org domains - https://phabricator.wikimedia.org/T182927 (10Krenair) So the remaining part of this - other than the patch for the subtask is this puppet commit... [10:43:42] Is there a command to trigger a post-merge build like there is "recheck" for the pipeline test? [10:43:54] https://gerrit.wikimedia.org/r/#/c/mediawiki/services/citoid/+/497315/ post merge build hasn't happened. 
[10:44:39] And I can't figure out how to trigger it [10:46:56] (03PS1) 10Hashar: docker: update sury.org GPG key and rebuild containers [integration/config] - 10https://gerrit.wikimedia.org/r/500416 (https://phabricator.wikimedia.org/T218735) [10:47:36] mvolz: hello [10:47:50] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Testing, 10Patch-For-Review, 10User-zeljkofilipin: Upgrade webdriverio to version 5 - https://phabricator.wikimedia.org/T213268 (10zeljkofilipin) [10:47:52] mvolz: postmerge steps have been broken eventually :- [10:47:58] and did not trigger for some time [10:48:03] hashar: hi [10:48:07] ah ok [10:48:12] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Testing, 10Patch-For-Review, 10User-zeljkofilipin: Upgrade webdriverio to version 5 - https://phabricator.wikimedia.org/T213268 (10zeljkofilipin) [10:48:15] but I can trigger it manually [10:49:30] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Testing, 10Patch-For-Review, 10User-zeljkofilipin: Upgrade webdriverio to version 5 - https://phabricator.wikimedia.org/T213268 (10zeljkofilipin) 05Open→03Stalled [10:50:18] !log Manually triggering postmerge step of citoid due to T219017 for mvolz. On contint1001: zuul enqueue --trigger gerrit --pipeline postmerge --project mediawiki/services/citoid --change 497315,1 [10:50:21] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [10:50:21] T219017: CI job beta-mediawiki-config-update-eqiad has stopped running - https://phabricator.wikimedia.org/T219017 [10:50:36] hashar: thanks! 
[10:50:52] mvolz: I think it is there https://integration.wikimedia.org/ci/blue/organizations/jenkins/service-pipeline-test-and-publish/detail/service-pipeline-test-and-publish/91/ [10:50:53] :) [10:52:43] (03CR) 10Hashar: [C: 03+2] docker: typos depends on ci-src-setup-simple [integration/config] - 10https://gerrit.wikimedia.org/r/500412 (owner: 10Hashar) [10:52:50] (03CR) 10Hashar: [C: 03+2] docker: update sury.org GPG key and rebuild containers [integration/config] - 10https://gerrit.wikimedia.org/r/500416 (https://phabricator.wikimedia.org/T218735) (owner: 10Hashar) [10:54:19] (03Merged) 10jenkins-bot: docker: typos depends on ci-src-setup-simple [integration/config] - 10https://gerrit.wikimedia.org/r/500412 (owner: 10Hashar) [10:55:42] (03Merged) 10jenkins-bot: docker: update sury.org GPG key and rebuild containers [integration/config] - 10https://gerrit.wikimedia.org/r/500416 (https://phabricator.wikimedia.org/T218735) (owner: 10Hashar) [11:06:27] HTTPSConnectionPool(host='docker-registry.discovery.wmnet', port=443): Max retries exceeded with url: /v2/releng/operations-dnslint/manifests/0.0.4 (Caused by SSLError(SSLError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:600)'),)) [11:06:31] right now [11:06:33] my life is miserable [11:08:20] :( :( [11:08:34] <_joe_> hashar: I'm fixing that right now [11:08:42] :) :) [11:08:43] <_joe_> hashar: quick fix is [11:08:45] _joe_: but it works with curl! 
:/ [11:09:01] <_joe_> export REQUESTS_CA_BUNDLE=/etc/ssl/certs/ca-certificates.crt [11:09:14] <_joe_> hashar: and it works with debian's version of python-requests [11:09:15] curl https://docker-registry.discovery.wmnet/v2/releng/oprations-dnslint/manifests/0.0.4 [11:09:16] ;D [11:09:28] ah [11:09:30] <_joe_> hashar: use that export there [11:09:34] <_joe_> and it will work [11:09:41] <_joe_> sorry for the inconvenience, quickfix coming [11:09:52] don't worry :] [11:09:57] <_joe_> which makes me think [11:10:04] <_joe_> we should have some sort of integration test [11:10:14] <_joe_> anyways [11:10:14] CURL_INSECURE=1 works too \o/ [11:10:39] <_joe_> https://gerrit.wikimedia.org/r/c/operations/docker-images/docker-pkg/+/500417 [11:10:47] <_joe_> yeah don't use that :P [11:11:07] * python-requests that won't use the system CA bundle by default. [11:11:10] ahhhhh [11:13:47] <_joe_> hashar: I'm deploying the new version in a couple minutes anyways [11:14:51] <_joe_> sorry I usually deploy just to boron [11:16:59] <_joe_> hashar: fix deployed, I now go to lunch :) [11:20:01] _joe_: awesome, thank you! [11:26:38] Arguments: ('Get https://docker-registry.discovery.wmnet/v2/releng/ci-jessie/manifests/0.4.0: no basic auth credentials',) [11:26:40] :D [11:30:22] RECOVERY - Puppet errors on deployment-cache-text05 is OK: OK: Less than 1.00% above the threshold [2.0] [11:32:29] that is official [11:32:35] I am done with Docker container [11:32:51] it is a shit show of doom [11:34:09] <_joe_> hashar: uh what's the problem? 
[11:34:50] <_joe_> if you're trying from contint, you do have to provide credentials [11:34:53] <_joe_> IIRC [11:34:54] well [11:35:05] <_joe_> that's how our registry works [11:35:09] I polished up some Python last week for the MediaWiki test runner Quibble [11:35:22] which really is supposed to be straightforward, just a few bits here and there that are updated [11:35:51] and I ended up having to fix some apt configuration, hack in a GPG key for some unrelated thing and it is Monday 1pm and I don't even remember what I was supposed to do today :( [11:36:04] I think I am just getting old :/ [11:36:06] <_joe_> heh welcome to SRE [11:36:17] <_joe_> no, docker makes you have to think about the OS [11:36:26] <_joe_> which is somewhat funny for me :) [11:36:30] hashar: hey I'm not old and I do the same thing, I think it's just being a developer :P [11:37:15] _joe_: and I got some other unrelated error due to logger.error() missing some arguments :] [11:37:28] but yeah hmm. Let me file a task about it to paste the errors [11:40:11] 10Continuous-Integration-Infrastructure, 10docker-pkg: docker-pkg is unhappy on contint1001 - https://phabricator.wikimedia.org/T219778 (10hashar) [11:40:27] _joe_: https://phabricator.wikimedia.org/T219778 but I am going to have some lunch of some sort first ;D [11:42:17] 10Continuous-Integration-Infrastructure, 10docker-pkg: docker-pkg is unhappy on contint1001 - https://phabricator.wikimedia.org/T219778 (10hashar) So the root cause is: Get https://docker-registry.discovery.wmnet/v2/releng/ci-stretch/manifests/0.1.4: no basic auth credentials' There is also a concern wi... 
[11:47:15] (03Abandoned) 10WMDE-leszek: Remove redundant WikibaseLexeme jobs [integration/config] - 10https://gerrit.wikimedia.org/r/441178 (owner: 10WMDE-leszek) [11:51:26] (03CR) 10Lucas Werkmeister (WMDE): "We just noticed during SWAT that CI for a WikibaseLexeme backport needs almost 45 minutes, so if you don’t +2 a backport at the beginning " [integration/config] - 10https://gerrit.wikimedia.org/r/441178 (owner: 10WMDE-leszek) [11:57:35] <_joe_> hashar: I think I know what's happening [11:57:42] found it [11:57:55] <_joe_> but I need to rest a bit, will be back in ~ 1 hour or so [11:58:25] <_joe_> the issue is our docker registry requires auth from servers that can upload [11:58:34] 10Continuous-Integration-Infrastructure, 10docker-pkg, 10Patch-For-Review: docker-pkg is unhappy on contint1001 - https://phabricator.wikimedia.org/T219778 (10hashar) ` Message: 'Build failed: $s' ^^^ ` Must be a percentage sign, not a dollar! [11:58:47] <_joe_> it's not a hard thing to fix, but I'll do that later [11:59:00] <_joe_> we can redeploy an old version if you're blocked [11:59:07] _joe_: I had two mixed errors actually. One being logging.error('$s'), the dollar is broken and should be a percent [11:59:13] <_joe_> yes [11:59:20] then the "no basic auth credentials" error, I don't know what is going on [11:59:25] <_joe_> I do [11:59:28] maybe it cannot load its configuration [11:59:33] <_joe_> nope [12:00:12] <_joe_> I can fix it, but later. If you need docker-pkg, we can roll back [12:00:21] let's do a rollback yes [12:00:22] :( [12:01:01] it is a pity really since it works just fine on my local machine [12:01:10] * hashar ships his laptop to the datacenter [12:01:57] Lol [12:02:55] <_joe_> hashar: can you try again? 
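The '$s' versus '%s' mix-up discussed above is easy to reproduce. Python's logging module builds the final message as `template % args`, so a template with no percent placeholder cannot absorb the argument, and the handler reports a formatting error instead of the real failure. A small sketch, assuming nothing about docker-pkg beyond the quoted template; `render`, `log_failure`, and the logger name are made up for illustration.

```python
import io
import logging

GOOD_TEMPLATE = "Build failed: %s"  # percent sign: a real placeholder
BAD_TEMPLATE = "Build failed: $s"   # dollar sign: no placeholder at all


def render(template, reason):
    """Mimic what logging does internally: template % args."""
    return template % (reason,)


def log_failure(template, reason):
    """Log through a real logger and return whatever reached the handler."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    logger = logging.getLogger("docker-pkg-sketch")  # hypothetical logger name
    logger.propagate = False
    logger.addHandler(handler)
    try:
        logger.error(template, reason)
    finally:
        logger.removeHandler(handler)
    return stream.getvalue()


if __name__ == "__main__":
    print(render(GOOD_TEMPLATE, "no basic auth credentials"))
    try:
        render(BAD_TEMPLATE, "no basic auth credentials")
    except TypeError as exc:
        print("TypeError:", exc)
```

With the bad template, `logger.error` itself does not raise: the handler catches the TypeError and prints a "Logging error" report to stderr that includes the message template and the arguments, which is consistent with the `Arguments: (...)` line pasted earlier in the channel.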
[12:03:16] doing [12:03:16] <_joe_> hashar: your local machine has no private docker registry to interact with :) [12:03:25] <_joe_> --info won't work ofc [12:04:43] fun fun fun [12:04:45] [contint1001.wikimedia.org] out: docker.errors.BuildError: Building image docker-registry.discovery.wmnet/releng/sury-php:0.3.0 failed [12:04:49] is all I get haha [12:05:22] * hashar tries with --debug [12:05:23] Try --debug [12:05:48] <_joe_> hashar: well you have the build log [12:06:00] hmm [12:06:04] looks like the revert did not work [12:06:11] logger.error('Build failed: $s' [12:06:15] <_joe_> what do you mean? [12:07:18] https://gerrit.wikimedia.org/r/#/c/operations/docker-images/docker-pkg/+/500429/1/docker_pkg/image.py [12:07:26] which got introduced later on I think [12:07:30] <_joe_> yeah it didn't [12:07:45] <_joe_> clearly something in how scap works escapes me [12:07:57] ditto [12:08:03] orrr [12:08:08] I use the wrong command :) [12:09:37] /srv/deployment/docker-pkg/venv/bin/docker-pkg is from November 2017 [12:09:56] then that is just the entry point [12:10:20] <_joe_> no that's correct [12:10:43] <_joe_> the problem is I did revert the local copy to the version I want to redeploy [12:10:49] <_joe_> but apparently that doesn't work [12:13:30] well the venv has several versions of docker-pkg [12:13:33] so that is certainly broken [12:14:01] <_joe_> no, that's not the issue [12:14:12] <_joe_> the issue is scap doesn't allow rolling back this way [12:14:16] <_joe_> only git reverts [12:17:28] <_joe_> hashar: can you please disable pulling the parent images? [12:17:36] <_joe_> and try again? 
[12:19:27] trying [12:19:56] oh I mean [12:20:02] I don't use docker-pkg --pull yet [12:20:06] run('docker pull docker-registry.wikimedia.org/wikimedia-jessie') [12:20:07] run('docker pull docker-registry.wikimedia.org/wikimedia-stretch') [12:20:08] I removed those [12:20:11] but that is unrelated [12:20:11] <_joe_> it pulls by default [12:20:42] <_joe_> --no-pull [12:20:50] <_joe_> you have to tell it --no-pull [12:21:08] yeah that seems better [12:21:25] <_joe_> ok so I got what the problem is, and it *should* be easy to fix [12:21:27] so we can't pull from the discovery URL? :( [12:21:42] <_joe_> lemme roll back for now [12:21:58] <_joe_> this will allow me to also roll back the --nightly change [12:22:34] I am letting the images build ;) [12:24:13] <_joe_> anyways I'm rolling back now [12:24:23] <_joe_> I'll try and fix this now [12:26:22] 10Continuous-Integration-Infrastructure, 10docker-pkg, 10Patch-For-Review: docker-pkg is unhappy on contint1001 - https://phabricator.wikimedia.org/T219778 (10Joe) The underlying issue is that we're trying to pull from a private registry (since we're contacting the internal registry there) and we don't provi... [12:28:28] _joe_: thank you ;] [12:40:53] 10Phabricator-Bot-Requests, 10WMSE-Bug-Reporting-and-Translation-2019: Show active projects first on subproject page, then archived projects - https://phabricator.wikimedia.org/T218041 (10mmodell) >>! In T218041#5072521, @Aklapper wrote: > @mmodell: Thanks! Is that one-liner change something to propose to upst... [12:41:19] <_joe_> hashar: uhm what user are you running docker-pkg as? [12:41:58] _joe_: as 'hashar' apparently [12:42:13] <_joe_> ok [12:42:27] <_joe_> and I guess you never ran "docker login" from your user [12:42:51] <_joe_> which is good, and I'm keen on not changing that [12:43:16] <_joe_> at the same time, you have access to those credentials, so... 
[13:04:56] PROBLEM - Puppet errors on deployment-restbase01 is CRITICAL: CRITICAL: 5.56% of data above the critical threshold [3.0] [13:10:24] (03PS1) 10Hashar: Switch Jenkins jobs to Quibble 0.0.30 [integration/config] - 10https://gerrit.wikimedia.org/r/500442 (https://phabricator.wikimedia.org/T219647) [13:11:01] !log Upgraded CI Jenkins jobs to Quibble 0.0.30 # T219647 [13:11:06] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [13:11:11] T219647: Upgrade CI jobs to Quibble 0.0.30 - https://phabricator.wikimedia.org/T219647 [13:16:35] <_joe_> hashar: https://gerrit.wikimedia.org/r/c/operations/docker-images/docker-pkg/+/500445 should be the solution to your problems FYI [13:20:37] 10Continuous-Integration-Config, 10Operations, 10Patch-For-Review, 10User-zeljkofilipin: npm 6 consistently fails with "Z_DATA_ERROR: invalid distance too far back" on some repos - https://phabricator.wikimedia.org/T215562 (10MoritzMuehlenhoff) @Krinkle I've prepared a new build and uploaded it to https://... [13:24:35] PROBLEM - Puppet errors on deployment-restbase02 is CRITICAL: CRITICAL: 1.11% of data above the critical threshold [3.0] [13:33:15] _joe_: thank you! 
will dig into it tomorrow :) [13:37:10] (03CR) 10Hashar: [C: 03+2] Switch Jenkins jobs to Quibble 0.0.30 [integration/config] - 10https://gerrit.wikimedia.org/r/500442 (https://phabricator.wikimedia.org/T219647) (owner: 10Hashar) [13:38:03] 10Continuous-Integration-Infrastructure, 10Patch-For-Review: sury.org packages do not validate gpg - https://phabricator.wikimedia.org/T218735 (10hashar) 05Open→03Resolved a:03hashar [13:38:17] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team (Kanban), 10Quibble, 10Patch-For-Review: Upgrade CI jobs to Quibble 0.0.30 - https://phabricator.wikimedia.org/T219647 (10hashar) [13:38:20] 10Continuous-Integration-Infrastructure, 10Patch-For-Review: sury.org packages do not validate gpg - https://phabricator.wikimedia.org/T218735 (10hashar) [13:40:39] (03Merged) 10jenkins-bot: Switch Jenkins jobs to Quibble 0.0.30 [integration/config] - 10https://gerrit.wikimedia.org/r/500442 (https://phabricator.wikimedia.org/T219647) (owner: 10Hashar) [13:48:18] AZOEIUR [13:49:22] (03PS1) 10Hashar: Revert "Switch Jenkins jobs to Quibble 0.0.30" [integration/config] - 10https://gerrit.wikimedia.org/r/500452 (https://phabricator.wikimedia.org/T219647) [13:49:30] (03CR) 10Hashar: [C: 03+2] Revert "Switch Jenkins jobs to Quibble 0.0.30" [integration/config] - 10https://gerrit.wikimedia.org/r/500452 (https://phabricator.wikimedia.org/T219647) (owner: 10Hashar) [13:49:55] Azoeiur? Is that French? 
:) [13:50:03] !log Reverted CI Jenkins jobs to Quibble 0.0.28 # T219647 [13:50:09] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [13:50:09] T219647: Upgrade CI jobs to Quibble 0.0.30 - https://phabricator.wikimedia.org/T219647 [13:50:43] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team (Kanban), 10Quibble, 10Patch-For-Review: Upgrade CI jobs to Quibble 0.0.30 - https://phabricator.wikimedia.org/T219647 (10hashar) And that fails due to `counterexample Database was successfully set up MediaWiki has been successfully instal... [13:51:45] (03Merged) 10jenkins-bot: Revert "Switch Jenkins jobs to Quibble 0.0.30" [integration/config] - 10https://gerrit.wikimedia.org/r/500452 (https://phabricator.wikimedia.org/T219647) (owner: 10Hashar) [14:57:28] 10Continuous-Integration-Config, 10Operations, 10Patch-For-Review, 10User-zeljkofilipin: npm 6 consistently fails with "Z_DATA_ERROR: invalid distance too far back" on some repos - https://phabricator.wikimedia.org/T215562 (10Krinkle) a:05MoritzMuehlenhoff→03Krinkle [14:58:46] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team (Kanban), 10Quibble: Quibble 0.0.30 misses quibble/mediawiki/local_settings.php - https://phabricator.wikimedia.org/T219786 (10hashar) [15:02:16] (03PS1) 10Hashar: Fix package_data inclusion for old setuptools [integration/quibble] - 10https://gerrit.wikimedia.org/r/500469 (https://phabricator.wikimedia.org/T219786) [15:09:45] (03CR) 10Hashar: [C: 03+2] Fix package_data inclusion for old setuptools [integration/quibble] - 10https://gerrit.wikimedia.org/r/500469 (https://phabricator.wikimedia.org/T219786) (owner: 10Hashar) [15:10:23] (03Merged) 10jenkins-bot: Fix package_data inclusion for old setuptools [integration/quibble] - 10https://gerrit.wikimedia.org/r/500469 (https://phabricator.wikimedia.org/T219786) (owner: 10Hashar) [15:10:52] (03CR) 10jenkins-bot: Fix package_data inclusion for old setuptools [integration/quibble] - 
10https://gerrit.wikimedia.org/r/500469 (https://phabricator.wikimedia.org/T219786) (owner: 10Hashar) [15:16:06] (03PS1) 10Kosta Harlan: GrowthExperiments: Add dependency on EventLogging for phan [integration/config] - 10https://gerrit.wikimedia.org/r/500471 [15:39:05] (03CR) 10Hashar: [C: 03+2] "And that is still broken" [integration/quibble] - 10https://gerrit.wikimedia.org/r/500469 (https://phabricator.wikimedia.org/T219786) (owner: 10Hashar) [15:51:40] 10Phabricator: Form 46 imagery is broken - https://phabricator.wikimedia.org/T219739 (10Krinkle) Thanks @MarcoAurelio! I suppose it won't surprise you to hear that it works for me... {F28548090} I believe there's a bug in Phabricator's image uploader. When uploading an image in a task description or task comme... [15:55:39] (03CR) 10Sbisson: [C: 03+1] GrowthExperiments: Add dependency on EventLogging for phan [integration/config] - 10https://gerrit.wikimedia.org/r/500471 (owner: 10Kosta Harlan) [16:07:05] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team (Kanban), 10Quibble: Quibble 0.0.30 misses quibble/mediawiki/local_settings.php - https://phabricator.wikimedia.org/T219804 (10hashar) [16:10:31] PROBLEM - Host deployment-db03 is DOWN: CRITICAL - Host Unreachable (172.16.5.23) [16:13:38] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team (Kanban), 10Quibble: Quibble 0.0.30 misses quibble/mediawiki/local_settings.php - https://phabricator.wikimedia.org/T219804 (10hashar) With Jessie, `pip3 install git+https://gerrit.wikimedia.org/r/integration/quibble.git@0.0.30#egg=quibble`... 
[16:16:21] (03PS7) 10Dduvall: WIP Allow configuration of pipeline [integration/pipelinelib] - 10https://gerrit.wikimedia.org/r/499918 [16:17:00] (03CR) 10PipelineBot: "pipeline-dashboard: service-pipeline-test" [integration/pipelinelib] - 10https://gerrit.wikimedia.org/r/499918 (owner: 10Dduvall) [16:17:02] (03CR) 10jerkins-bot: [V: 04-1] WIP Allow configuration of pipeline [integration/pipelinelib] - 10https://gerrit.wikimedia.org/r/499918 (owner: 10Dduvall) [16:17:12] 10Release-Engineering-Team, 10MediaWiki-Core-Testing, 10Wikimedia-production-error (Shared Build Failure), 10phan: phan 1.2.6 is OOMing on MediaWiki core - https://phabricator.wikimedia.org/T219114 (10Krinkle) [16:18:05] (03PS1) 10Hashar: docker: try to rebuild quibble-jessie with 0.0.31 [integration/config] - 10https://gerrit.wikimedia.org/r/500482 (https://phabricator.wikimedia.org/T219804) [16:20:17] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team (Kanban), 10Quibble, 10Patch-For-Review: Quibble 0.0.30 misses quibble/mediawiki/local_settings.php - https://phabricator.wikimedia.org/T219786 (10hashar) With Jessie, `pip3 install git+https://gerrit.wikimedia.org/r/integration/quibble.gi... 
[16:20:21] (03PS2) 10Hashar: docker: try to rebuild quibble-jessie with 0.0.31 [integration/config] - 10https://gerrit.wikimedia.org/r/500482 (https://phabricator.wikimedia.org/T219786) [16:20:35] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team (Kanban), 10Quibble, 10Patch-For-Review: Quibble 0.0.30 misses quibble/mediawiki/local_settings.php - https://phabricator.wikimedia.org/T219786 (10hashar) [16:20:39] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team (Kanban), 10Quibble, 10Patch-For-Review: Quibble 0.0.30 misses quibble/mediawiki/local_settings.php - https://phabricator.wikimedia.org/T219804 (10hashar) [16:28:51] 10Continuous-Integration-Config, 10MediaWiki-extensions-Scribunto, 10Wikidata, 10Patch-For-Review, 10Wikidata-Campsite (Wikidata-Campsite-Iteration-∞): [Task] Add Scribunto to extension-gate in CI - https://phabricator.wikimedia.org/T125050 (10WMDE-leszek) So this task has been Unbreak now for almost two... [16:29:29] (03CR) 10Alexandros Kosiaris: [C: 03+1] Rebuild all jessie-based images for removal of backports, updates [integration/config] - 10https://gerrit.wikimedia.org/r/500382 (https://phabricator.wikimedia.org/T219748) (owner: 10Giuseppe Lavagetto) [16:30:58] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team (Kanban), 10Quibble, 10Patch-For-Review: Quibble 0.0.30 misses quibble/mediawiki/local_settings.php - https://phabricator.wikimedia.org/T219786 (10hashar) $ docker run --rm -it --entrypoint=find docker-registry.wikimedia.org/releng/quibble... 
[16:41:18] 10Gerrit: Triple-clicking Gerrit change subject selects unwanted space at the beginning - https://phabricator.wikimedia.org/T219809 (10Lucas_Werkmeister_WMDE) [16:44:35] 10Release-Engineering-Team (Kanban), 10Release, 10Train Deployments: 1.33.0-wmf.23 deployment blockers - https://phabricator.wikimedia.org/T206677 (10dduvall) [16:46:39] (03PS1) 10Hashar: Archive LocalSettings.php before linting it [integration/quibble] - 10https://gerrit.wikimedia.org/r/500489 [16:49:27] (03PS1) 10Hashar: Fix prepend due to missing ?> in LocalSettings.php [integration/quibble] - 10https://gerrit.wikimedia.org/r/500490 (https://phabricator.wikimedia.org/T219786) [16:52:22] (03CR) 10Hashar: [C: 03+2] Fix prepend due to missing ?> in LocalSettings.php [integration/quibble] - 10https://gerrit.wikimedia.org/r/500490 (https://phabricator.wikimedia.org/T219786) (owner: 10Hashar) [16:52:24] (03CR) 10Hashar: [C: 03+2] Archive LocalSettings.php before linting it [integration/quibble] - 10https://gerrit.wikimedia.org/r/500489 (owner: 10Hashar) [16:52:58] (03Merged) 10jenkins-bot: Archive LocalSettings.php before linting it [integration/quibble] - 10https://gerrit.wikimedia.org/r/500489 (owner: 10Hashar) [16:53:08] (03Merged) 10jenkins-bot: Fix prepend due to missing ?> in LocalSettings.php [integration/quibble] - 10https://gerrit.wikimedia.org/r/500490 (https://phabricator.wikimedia.org/T219786) (owner: 10Hashar) [16:53:24] (03CR) 10jenkins-bot: Archive LocalSettings.php before linting it [integration/quibble] - 10https://gerrit.wikimedia.org/r/500489 (owner: 10Hashar) [16:54:02] (03CR) 10jenkins-bot: Fix prepend due to missing ?> in LocalSettings.php [integration/quibble] - 10https://gerrit.wikimedia.org/r/500490 (https://phabricator.wikimedia.org/T219786) (owner: 10Hashar) [16:58:29] (03PS3) 10Hashar: docker: try to rebuild quibble-jessie with 0.0.31 [integration/config] - 10https://gerrit.wikimedia.org/r/500482 (https://phabricator.wikimedia.org/T219786) [17:00:04] (03CR) 
10Hashar: [C: 03+2] "At least quibble-jessie-hhvm worked for me locally :]" [integration/config] - 10https://gerrit.wikimedia.org/r/500482 (https://phabricator.wikimedia.org/T219786) (owner: 10Hashar) [17:01:30] (03PS4) 10Hashar: docker: trebuild quibble-jessie with 0.0.31 [integration/config] - 10https://gerrit.wikimedia.org/r/500482 (https://phabricator.wikimedia.org/T219786) [17:01:55] (03PS5) 10Hashar: docker: rebuild for Quibble 0.0.31 [integration/config] - 10https://gerrit.wikimedia.org/r/500482 (https://phabricator.wikimedia.org/T219786) [17:02:13] (03CR) 10Hashar: [C: 03+2] docker: rebuild for Quibble 0.0.31 [integration/config] - 10https://gerrit.wikimedia.org/r/500482 (https://phabricator.wikimedia.org/T219786) (owner: 10Hashar) [17:03:49] 10Phabricator: Form 46 imagery is broken - https://phabricator.wikimedia.org/T219739 (10MarcoAurelio) Thank you. Images are now visible on the form (good idea fwiw - it'll help those reporting errors to add the relevant information to the tickets). [17:04:10] (03Merged) 10jenkins-bot: docker: rebuild for Quibble 0.0.31 [integration/config] - 10https://gerrit.wikimedia.org/r/500482 (https://phabricator.wikimedia.org/T219786) (owner: 10Hashar) [17:04:34] !log Building CI docker images for Quibble 0.0.31 (yes it is a long day...) [17:04:35] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [17:06:17] 10Beta-Cluster-Infrastructure, 10Analytics, 10EventBus, 10Patch-For-Review, 10Wikimedia-production-error: PHP Warning: Array key should be either a string or an integer - https://phabricator.wikimedia.org/T219738 (10MarcoAurelio) Is this a wmf.24 blocker as T219737 was? [17:17:54] hashar: you there? [17:21:00] 10Release-Engineering-Team, 10MediaWiki-Core-Testing, 10Wikimedia-production-error (Shared Build Failure), 10phan: phan 1.2.6 is OOMing on MediaWiki core - https://phabricator.wikimedia.org/T219114 (10hashar) Can it be due to the Phan upgrade? 
Or most probably we started to exceed some memory threshold.... [17:25:06] !log fresnel-node10-browser-docker failing with ENOMEM. Depooled integration-slave-docker-1049 as precaution. [17:25:07] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [17:25:21] hashar: ^ Don't know what's going on, depooled for now. [17:26:42] Krinkle: yup, and memory issues are hard to track down :-(((( [17:27:06] Maybe due to overcommit? [17:27:42] our OpenStack used to have memory overcommit [17:27:46] I'm guessing UNSTABLE on some quibble jobs "'Publish JUnit test result report' changed build result to UNSTABLE" [17:27:55] https://gerrit.wikimedia.org/r/500107 [17:27:59] until I found out that nova does not support memory overcommit when using kvm [17:28:04] not sure if this is a problem with the extension or the platform [17:28:16] Krinkle: so openstack memory overcommitting should be disabled. OpenStack does not support it with KVM [17:29:57] hashar: hm.. ok. I read somewhere you are considering enabling overcommit for fork(). thought maybe it was related :) [17:30:05] Krinkle: another possibility is that a process already taking a lot of memory then attempting a fork() [17:30:21] the instance has /proc/sys/vm/overcommit_memory = 0 [17:30:39] and in this case the kernel will refuse the fork since it considers that there is potentially not enough memory for the forked process [17:30:48] Krinkle: yeah exactly ;] [17:30:54] yeah, but the npm-test job doesn't fork afaik. [17:31:07] maybe it's a coincidence. but haven't seen the issue before. [17:31:30] jdlrobson: that is because somehow the job compared the current build test result with the previous build results and found out that some difference in tests has been introduced. 
On our system that usually does not make sense so the job junit config has to be fixed up so that it always FAILs [17:31:47] jdlrobson: but I can't just fix it on the spot, but a task would get it done eventually :] [17:33:19] hashar: so how do I deal with this one? [17:33:26] force merge? [17:33:35] jdlrobson: no [17:33:56] jdlrobson: check the job page, there must be a test result on it [17:33:58] ohhh is it the bit that says "There were 2 risky tests:" [17:34:04] it was hidden in the report [17:34:06] there must be something in phpunit that triggers some error [17:34:11] ah yeah [17:34:18] so my previous assumption was probably wrong [17:34:20] but [17:34:28] probably we should flag that as being a failure [17:35:01] jdlrobson: it might be phpunit in mediawiki/core being configured to no longer output any messages for risky tests [17:35:08] besides the R in the dot progress [17:35:25] thanks! yeh the "unstable" bit was the thing that threw me [17:36:01] jdlrobson: https://integration.wikimedia.org/ci/job/quibble-vendor-mysql-php71-docker/8323/testReport/Tests.QuickSurveys/ :) [17:36:27] phpunit still exits 0, which is why the build is not marked as a FAILURE [17:36:39] but some tests do fail and are marked as such in junit [17:37:08] when Jenkins processes the result it flags the build as UNSTABLE which in Jenkins world means: the build worked perfectly fine, all tests ran [17:37:13] but some tests failed [17:37:20] jdlrobson: the longer explanation ;) [17:39:56] RECOVERY - Puppet errors on deployment-restbase01 is OK: OK: Less than 1.00% above the threshold [2.0] [17:40:43] I am off for dinner etc [17:41:33] got it! thanks for unblocking me hashar! 
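Editor's note: the FAILURE / UNSTABLE distinction explained above can be sketched roughly as follows. This is a simplified model written for illustration, not Jenkins' actual implementation; the function name build_result and the sample report are assumptions, and a real JUnit publisher may apply configurable thresholds.

```python
import xml.etree.ElementTree as ET


def build_result(exit_code, junit_xml):
    """Simplified model of the status logic described in the log.

    Hypothetical helper, not part of Jenkins or Quibble.
    """
    # A non-zero exit code from the test command fails the build outright.
    if exit_code != 0:
        return "FAILURE"
    # The command exited 0, so inspect the JUnit report for failed test cases.
    root = ET.fromstring(junit_xml)
    for case in root.iter("testcase"):
        if case.find("failure") is not None or case.find("error") is not None:
            # Everything ran, but at least one test failed.
            return "UNSTABLE"
    return "SUCCESS"


# Made-up report with one passing and one failing test case.
SAMPLE_REPORT = (
    '<testsuites><testsuite name="example">'
    '<testcase name="passes"/>'
    '<testcase name="fails"><failure>boom</failure></testcase>'
    '</testsuite></testsuites>'
)

print(build_result(0, SAMPLE_REPORT))
```

With exit code 0 and a failed test case in the report, the sketch returns UNSTABLE, which matches the situation described for the quibble job above.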
[17:44:34] RECOVERY - Puppet errors on deployment-restbase02 is OK: OK: Less than 1.00% above the threshold [2.0] [17:52:50] (03PS1) 10Phedenskog: docker: update to fresnel 0.2.2 [integration/config] - 10https://gerrit.wikimedia.org/r/500504 [17:57:22] (03CR) 10Krinkle: [C: 03+2] docker: update to fresnel 0.2.2 [integration/config] - 10https://gerrit.wikimedia.org/r/500504 (owner: 10Phedenskog) [17:58:54] (03Merged) 10jenkins-bot: docker: update to fresnel 0.2.2 [integration/config] - 10https://gerrit.wikimedia.org/r/500504 (owner: 10Phedenskog) [18:00:24] !log Updating docker-pkg files on contint1001 for https://gerrit.wikimedia.org/r/#/c/integration/config/+/500504/ [18:00:25] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [18:00:51] Krinkle: also the refactoring you proposed for quibble turned out to be broken in some non-obvious way ;] But 0.0.31 should solve it now [18:01:06] I will switch the ci jobs tomorrow after I have tested each of the containers [18:01:07] hashar: Yeah, saw that. MANIFEST.in was not used? [18:01:12] hashar: I wonder how it worked previously. [18:02:21] the rule "include mediawiki *.php" is not understood by some older setuptools versions [18:02:31] it literally looks for /mediawiki or /*.php [18:02:46] whereas later versions seem to treat *.php as a recursive match [18:02:56] well that is how I understand it, I gave up trying to figure out the root cause [18:02:58] anyway [18:03:06] it worked fine for me locally [18:03:07] hashar: ah okay, so the file was used, but a different version format. [18:03:16] right, me too :D [18:03:18] but [18:03:26] I don't think the local_settings.php got included anyway [18:03:34] cause that causes a php -l error anyway [18:03:47] the file is prepended and lacks a "?>\n" [18:03:52] so that caused some syntax error :] [18:03:55] so yeah [18:04:03] ah interesting. 
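Editor's note: the packaging problem discussed above (a data file such as quibble/mediawiki/local_settings.php missing from the built package) is the kind of thing a small post-build check could catch. The following is a minimal sketch under assumptions: missing_package_files is a hypothetical helper, not part of Quibble or setuptools, and the demo paths are created in a temporary directory purely for illustration.

```python
import tempfile
from pathlib import Path


def missing_package_files(package_root, expected):
    """Return the expected relative paths that are absent under package_root.

    Hypothetical helper for illustration only.
    """
    root = Path(package_root)
    return [rel for rel in expected if not (root / rel).is_file()]


# Demo against a throwaway directory that mimics an installed package
# in which the data file was not shipped.
with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp) / "quibble" / "mediawiki"
    pkg.mkdir(parents=True)
    (pkg / "__init__.py").write_text("")
    demo_missing = missing_package_files(
        tmp,
        ["quibble/mediawiki/__init__.py", "quibble/mediawiki/local_settings.php"],
    )
    print(demo_missing)
```

Run against the real install location after building a container, a non-empty result would flag the image before any CI job is switched to it.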
[18:04:06] no issue while manually testing, but the file was not included anyway somehow [18:04:16] will look at testing those containers tomorrow [18:04:21] thanks [18:04:32] hashar: which host is affected by the older version? Is it built in a jenkins job? [18:04:36] and probably add some kind of integration test for patches proposed to integration/quibble to help catch those nasty issues [18:05:02] the Jessie-based containers at least [18:05:09] but I had not looked at the stretch ones [18:05:23] the faulty ones got published with version 0.0.30 something [18:05:40] the new ones are 0.0.31 which I published a few minutes ago, but I have not switched the ci jobs to it yet [18:05:41] ;D [18:05:51] anyway I am overtime. Gotta have dinner! Take care :] [18:07:05] hashar: I'm updating quibble-fresnel now for an npm update. [18:07:10] But it still uses 0.0.28 [18:07:17] so I guess we'll find out if it works or not for plain mediawiki [18:07:29] If it breaks I'll just roll back for now and let you re-deploy as part of [18:08:11] oops [18:08:20] .. as part of the 0.0.31 roll out [18:08:29] Krinkle: it will get 0.0.31 [18:08:32] (03PS1) 10Phedenskog: jjb: Updated Fresnel to 0.2.2 [integration/config] - 10https://gerrit.wikimedia.org/r/500511 [18:08:32] Yeah [18:08:35] ^^ [18:08:58] (03CR) 10Krinkle: [C: 03+1] "Compiling/deploying now" [integration/config] - 10https://gerrit.wikimedia.org/r/500511 (owner: 10Phedenskog) [18:09:12] indeed :] [18:09:28] there is a hack somewhere which just installs the latest tag from the git repo [18:09:42] might be nicer to install the version based on the image changelog [18:16:19] FileNotFoundError: [Errno 2] No such file or directory: '/usr/local/lib/python3.5/dist-packages/quibble/mediawiki/local_settings.php' [18:18:38] OK. 
rolling back to master for now [18:19:00] (03CR) 10Phedenskog: "Ooops it's broken:" [integration/config] - 10https://gerrit.wikimedia.org/r/500511 (owner: 10Phedenskog) [18:50:26] !log Created mediawiki/extensions/ContributionCredits.git per request on mediawiki.org [18:50:27] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [18:53:11] PROBLEM - Host deployment-db04 is DOWN: CRITICAL - Host Unreachable (172.16.5.5) [19:45:32] (03CR) 10Hashar: "Yup T219786 :(" [integration/config] - 10https://gerrit.wikimedia.org/r/500511 (owner: 10Phedenskog) [20:01:40] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team (Kanban), 10Quibble, 10Patch-For-Review: Quibble 0.0.30 misses quibble/mediawiki/local_settings.php - https://phabricator.wikimedia.org/T219786 (10hashar) Eventually locally that worked for me T219786#5074879 but the images still do not ha... [20:05:08] Krinkle: sorry about the quibble container. The 0.0.31 are wrong :( [20:05:17] they did not install the proper version of quibble due to a bug somewhere [20:10:24] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team (Kanban), 10Quibble, 10Patch-For-Review: Quibble 0.0.30 misses quibble/mediawiki/local_settings.php - https://phabricator.wikimedia.org/T219786 (10hashar) The https remote being used lacks the tag: ` diff -u <(git ls-remote --refs --tags s... 
[20:10:48] !log gerrit: flush-caches --cache git_tags # some tag got stalled when querying over https - T219786 [20:10:50] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [20:10:51] T219786: Quibble 0.0.30 misses quibble/mediawiki/local_settings.php - https://phabricator.wikimedia.org/T219786 [20:11:37] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team (Kanban), 10Quibble, 10Patch-For-Review: Quibble 0.0.30 misses quibble/mediawiki/local_settings.php - https://phabricator.wikimedia.org/T219786 (10hashar) Had to flush Gerrit internal cache for git tags. I guess I now have to rebuild all i... [20:19:08] PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [10.0] [20:34:07] RECOVERY - Mediawiki Error Rate on graphite-labs is OK: OK: Less than 1.00% above the threshold [1.0] [20:34:47] 10Release-Engineering-Team, 10MediaWiki-Core-Testing, 10Wikimedia-production-error (Shared Build Failure), 10phan: phan 1.2.6 is OOMing on MediaWiki core - https://phabricator.wikimedia.org/T219114 (10Legoktm) Disabling phan isn't an option due to the amount of issues it regularly catches. Worst case we ca... [20:39:12] 10Beta-Cluster-Infrastructure, 10Analytics, 10EventBus, 10Patch-For-Review, 10Wikimedia-production-error: PHP Warning: Array key should be either a string or an integer - https://phabricator.wikimedia.org/T219738 (10MarcoAurelio) [20:39:16] 10Release-Engineering-Team (Kanban), 10Release, 10Train Deployments: 1.33.0-wmf.24 deployment blockers - https://phabricator.wikimedia.org/T206678 (10MarcoAurelio) [20:40:02] 10Release-Engineering-Team (Kanban), 10Release, 10Train Deployments: 1.33.0-wmf.24 deployment blockers - https://phabricator.wikimedia.org/T206678 (10MarcoAurelio) Precautionarily adding T219738 due to the enormous ammount of logspam traffic it generated on Beta and because it depends on #eventbus. 
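Editor's note: the stale tag cache mentioned above was spotted by comparing the tag lists reported by two remotes. A rough sketch of that comparison, assuming the standard `git ls-remote --refs --tags` output format of one "<sha><TAB>refs/tags/<name>" entry per line; the function names and the sample hashes below are made up for illustration.

```python
def parse_tags(ls_remote_output):
    """Extract tag names from `git ls-remote --refs --tags` style output."""
    tags = set()
    for line in ls_remote_output.splitlines():
        parts = line.split("\t")  # "<sha>\t<ref>"
        if len(parts) == 2 and parts[1].startswith("refs/tags/"):
            tags.add(parts[1][len("refs/tags/"):])
    return tags


def missing_tags(reference_output, candidate_output):
    """Tags present in the reference listing but absent from the candidate."""
    return sorted(parse_tags(reference_output) - parse_tags(candidate_output))


# Made-up listings: the second remote lags one tag behind the first.
FRESH = "aaa111\trefs/tags/0.0.30\nbbb222\trefs/tags/0.0.31\n"
STALE = "aaa111\trefs/tags/0.0.30\n"

print(missing_tags(FRESH, STALE))
```

A non-empty result for the remote that the image build uses would explain why an install of "the latest tag" picked up an older release.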
[20:40:54] (03PS1) 10Hashar: docker: rebuild Quibble containers to get 0.0.31 [integration/config] - 10https://gerrit.wikimedia.org/r/500568 (https://phabricator.wikimedia.org/T219786) [20:41:13] Krinkle: sorry the quibble 0.0.31 tag was invisible I had to flush some cache. I am going to rebuild all those quibble containers [20:41:50] (03CR) 10Hashar: [C: 03+2] docker: rebuild Quibble containers to get 0.0.31 [integration/config] - 10https://gerrit.wikimedia.org/r/500568 (https://phabricator.wikimedia.org/T219786) (owner: 10Hashar) [20:44:22] (03Merged) 10jenkins-bot: docker: rebuild Quibble containers to get 0.0.31 [integration/config] - 10https://gerrit.wikimedia.org/r/500568 (https://phabricator.wikimedia.org/T219786) (owner: 10Hashar) [20:44:55] !log Building Quibble 0.0.31 containers again # T219647 T219786 [20:44:58] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [20:44:59] T219647: Upgrade CI jobs to Quibble 0.0.30 - https://phabricator.wikimedia.org/T219647 [20:45:00] T219786: Quibble 0.0.30 misses quibble/mediawiki/local_settings.php - https://phabricator.wikimedia.org/T219786 [20:53:10] !log ssh contint1001.wikimedia.org sudo rm /tmp/docker-pkg-build.log [20:53:11] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [20:54:34] hashar: I need a gerrit admin [21:19:12] (03PS3) 10Effie Mouzeli: Fix flake8 errors [tools/scap] - 10https://gerrit.wikimedia.org/r/498341 [21:22:33] (03CR) 10PipelineBot: "pipeline-dashboard: service-pipeline-test" [tools/scap] - 10https://gerrit.wikimedia.org/r/498341 (owner: 10Effie Mouzeli) [21:24:22] (03PS9) 10Effie Mouzeli: Add --canary-wait-time flag [tools/scap] - 10https://gerrit.wikimedia.org/r/495398 (https://phabricator.wikimedia.org/T217924) [21:29:40] (03CR) 10PipelineBot: "pipeline-dashboard: service-pipeline-test" [tools/scap] - 10https://gerrit.wikimedia.org/r/495398 (https://phabricator.wikimedia.org/T217924) (owner: 10Effie Mouzeli) [21:33:40] !log 
Created https://gerrit.wikimedia.org/r/#/admin/projects/labs/tools/ldap | T219703 [21:33:43] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [21:33:43] T219703: Move `tool-ldap` from Differential to Gerrit - https://phabricator.wikimedia.org/T219703 [21:42:23] !log Imported tool-ldap from Diffusion to Gerrit with full history | T219703 [21:42:26] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [21:42:27] T219703: Move `tool-ldap` from Differential to Gerrit - https://phabricator.wikimedia.org/T219703 [21:59:43] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team (Kanban), 10Quibble, 10Patch-For-Review: Upgrade CI jobs to Quibble 0.0.30 - https://phabricator.wikimedia.org/T219647 (10hashar) [21:59:48] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team (Kanban), 10Quibble, 10Patch-For-Review: Quibble 0.0.30 misses quibble/mediawiki/local_settings.php - https://phabricator.wikimedia.org/T219786 (10hashar) 05Open→03Resolved docker-registry.discovery.wmnet/releng/quibble-stretch-php71:0... [22:04:05] 10Phabricator, 10Release-Engineering-Team, 10Operations, 10Wikimedia-Incident: Analyze and amend (if necessary) workflow of user reporting and detecting large regressions/outages - https://phabricator.wikimedia.org/T219589 (10Yann) >>! In T219589#5072801, @Aklapper wrote: >>>! In T219589#5072592, @Yann wro... 
[22:25:26] 10Gerrit, 10Phabricator, 10Release-Engineering-Team (Backlog): Stop using Differential for code review - https://phabricator.wikimedia.org/T191182 (10MarcoAurelio) [22:30:20] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team (Kanban), 10Quibble, 10Patch-For-Review: Upgrade CI jobs to Quibble 0.0.31 - https://phabricator.wikimedia.org/T219647 (10hashar) [22:53:31] 10Continuous-Integration-Config, 10Operations, 10Patch-For-Review, 10User-zeljkofilipin: npm 6 consistently fails with "Z_DATA_ERROR: invalid distance too far back" on some repos - https://phabricator.wikimedia.org/T215562 (10Krinkle) >>! In T215562#5074013, @MoritzMuehlenhoff wrote: > @Krinkle I've prepar...