[00:38:08] Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO (201907), Quibble, Documentation: Document how to add a new development dependency for an extension in Quibble - https://phabricator.wikimedia.org/T227909 (greg) p:Normal→Triage Not sure if this is something t...
[00:38:47] Krinkle, James_F: does eslint automatically ignore node_modules/** or does that need to be in .eslintignore?
[00:44:07] Continuous-Integration-Infrastructure, Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO, Patch-For-Review, Technical-Debt: Clear /srv/.git on contint1001; move integration.wikimedia.org docroot to new location - https://phabricator.wikimedia.org/T149924 (greg)
[00:46:00] legoktm: by default
[00:46:25] Beta-Cluster-Infrastructure, Release-Engineering-Team (Other / Uncategorized), Release-Engineering-Team-TODO (201907), User-greg: Request access to deployment-prep - https://phabricator.wikimedia.org/T228021 (greg) a:greg What is your wikitech.wm.o username?
[00:47:38] Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO (201907), Wikidata, Wikidata Query UI, Wikimedia-production-error (Shared Build Failure): WDQS GUI deploy build fails - https://phabricator.wikimedia.org/T227818 (greg)
[00:57:45] Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO, Operations, User-Joe: Create jenkins job for creating deployment artifacts for `docker-pkg-deploy` - https://phabricator.wikimedia.org/T179562 (greg) Is this task still generally accurate?
[00:57:59] Krinkle: thanks
[00:58:17] so I'm just going to have .eslintignore have "vendor/"
[00:58:31] or, /vendor
[01:29:34] Krinkle: I rebased https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/520148/
[01:41:32] (PS1) Mholloway: WIP: Fix detection of use-system-flag [blubber] - https://gerrit.wikimedia.org/r/523421
[01:45:00] (CR) Mholloway: "Hmm... maybe I'm just having a local dev environment problem." [blubber] - https://gerrit.wikimedia.org/r/523421 (owner: Mholloway)
[04:43:33] Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO (201907), Wikidata, Wikidata Query UI, Wikimedia-production-error (Shared Build Failure): WDQS GUI deploy build fails - https://phabricator.wikimedia.org/T227818 (Smalyshev) p:High→Unbreak! Still failing:...
[06:28:39] Continuous-Integration-Config, Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO, LibUp: LibraryUpgrader CI normalisation tasks, June/July 2019 - https://phabricator.wikimedia.org/T225325 (Legoktm)
[06:33:08] (PS1) Santhosh: Add Phan dependencies for Content Translation [integration/config] - https://gerrit.wikimedia.org/r/523569
[06:48:54] James_F: "view proposed patch" on https://libraryupgrader2.wmflabs.org/r/mediawiki/extensions/AJAXPoll
[06:52:32] Beta-Cluster-Infrastructure, Release-Engineering-Team (Other / Uncategorized), Release-Engineering-Team-TODO (201907), User-greg: Request access to deployment-prep - https://phabricator.wikimedia.org/T228021 (dom_walden) >>! In T228021#5335585, @greg wrote: > What is your wikitech.wm.o username?...
[07:44:49] Phabricator: Create a project on phabricator for twitter to Commons - https://phabricator.wikimedia.org/T228139 (Jnanaranjan_sahu)
[08:14:56] (CR) Awight: Don't run phpunit-unit stage if the composer script doesn't exist (1 comment) [integration/quibble] - https://gerrit.wikimedia.org/r/521515 (https://phabricator.wikimedia.org/T87781) (owner: Kosta Harlan)
[09:08:24] Gerrit, Release-Engineering-Team (Development services), Release-Engineering-Team-TODO: Create mirror of Gerrit repositories for consumption by various tools - https://phabricator.wikimedia.org/T226240 (hashar)
[09:11:27] (CR) Hashar: [C: +2] "It is now using 4 threads." [integration/config] - https://gerrit.wikimedia.org/r/523158 (https://phabricator.wikimedia.org/T221969) (owner: Hashar)
[09:12:34] Continuous-Integration-Infrastructure, Release-Engineering-Team-TODO (201907), puppet-compiler, Patch-For-Review: Puppet catalog compiler - increasing max concurrent jobs - https://phabricator.wikimedia.org/T221969 (hashar) https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compil...
[09:13:47] (Merged) jenkins-bot: puppet compiler: bump threads 2 -> 4 [integration/config] - https://gerrit.wikimedia.org/r/523158 (https://phabricator.wikimedia.org/T221969) (owner: Hashar)
[09:47:14] Diffusion, Release-Engineering-Team (Development services), Release-Engineering-Team-TODO (201907), Operations, and 4 others: Cannot connect to vcs@git-ssh.wikimedia.org (since move from phab1001 to phab1003) - https://phabricator.wikimedia.org/T224677 (MoritzMuehlenhoff) I've submitted a propose...
[11:30:26] Release-Engineering-Team (Unit & Int & System Tooling), Release-Engineering-Team-TODO, Wikimedia-General-or-Unknown, Epic, phan: Enable mediawiki/mediawiki-phan-config on all Wikimedia-deployed repositories - https://phabricator.wikimedia.org/T224783 (Daimona)
[11:42:52] Continuous-Integration-Infrastructure, phan-taint-check-plugin: mwext-php70-phan-seccheck-docker times out for CommonsMetadata extension - https://phabricator.wikimedia.org/T224351 (Daimona) Resolved→Open I didn't read T224351#5227653 carefully enough. I still have to determine whether this still...
[11:43:07] Continuous-Integration-Infrastructure, phan-taint-check-plugin, User-Daimona: mwext-php70-phan-seccheck-docker times out for CommonsMetadata extension - https://phabricator.wikimedia.org/T224351 (Daimona)
[11:51:17] Continuous-Integration-Config, Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO, LibUp: LibraryUpgrader CI normalisation tasks, June/July 2019 - https://phabricator.wikimedia.org/T225325 (Daimona)
[11:52:18] Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO (201907), Release, Train Deployments: 1.34.0-wmf.14 deployment blockers - https://phabricator.wikimedia.org/T220739 (Krinkle)
[11:55:13] Project-Admins: Create a project on phabricator for twitter to Commons - https://phabricator.wikimedia.org/T228139 (Krinkle)
[11:55:26] Continuous-Integration-Config, MediaWiki-extensions-Page_Forms: Add phan to PageForms - https://phabricator.wikimedia.org/T228155 (Daimona)
[11:55:45] Continuous-Integration-Config, MediaWiki-extensions-Page_Forms: Add phan to PageForms - https://phabricator.wikimedia.org/T228155 (Daimona)
[11:58:41] Continuous-Integration-Infrastructure (phase-out-jessie), Operations, serviceops, Patch-For-Review: Upload docker-ce 18.06.3 upstream package for Stretch - https://phabricator.wikimedia.org/T226236 (MoritzMuehlenhoff) Open→Resolved Packages have been synched to thirdparty/ci for stretch-w...
[11:58:43] Continuous-Integration-Infrastructure (phase-out-jessie), Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO (201907): Rebuild integration-slave-docker-* instances to use less RAM, new name and Stretch - https://phabricator.wikimedia.org/T226233 (MoritzMuehlenhoff)
[11:58:46] Continuous-Integration-Infrastructure (phase-out-jessie), Operations: Migrate contint* hosts to Stretch/Buster - https://phabricator.wikimedia.org/T224591 (MoritzMuehlenhoff)
[11:58:59] Project-Admins: Create a project on phabricator for twitter to Commons - https://phabricator.wikimedia.org/T228139 (Krinkle) Open→Resolved a:Krinkle Done. See #twitter-to-commons (), subproject of #Toolforge.
[12:03:55] (PS2) Mholloway: Add usage test for use-system-flag [blubber] - https://gerrit.wikimedia.org/r/523421
[12:06:19] (CR) Mholloway: "I upgraded my local Go installation to 1.12.7, and that resolved the issue, so I've changed this patch just to be about adding another tes" [blubber] - https://gerrit.wikimedia.org/r/523421 (owner: Mholloway)
[12:24:39] Gerrit: Support posting screenshots in Gerrit - https://phabricator.wikimedia.org/T228084 (Aklapper) p:Normal→Triage I don't see how/why this has been prioritized hence resetting
[12:51:30] Continuous-Integration-Infrastructure: Increase TTL of failed builds - https://phabricator.wikimedia.org/T228158 (Daimona)
[13:04:05] Release-Engineering-Team (Unit & Int & System Tooling), Release-Engineering-Team-TODO (201907), Ruby, User-zeljkofilipin: Mark mediawiki_api and mediawiki_selenium Ruby gems as deprecated - https://phabricator.wikimedia.org/T228160 (zeljkofilipin)
[13:04:14] Release-Engineering-Team (Unit & Int & System Tooling), Release-Engineering-Team-TODO (201907), Ruby, User-zeljkofilipin: Mark mediawiki_api and mediawiki_selenium Ruby gems as deprecated - https://phabricator.wikimedia.org/T228160 (zeljkofilipin) p:Triage→Normal
[13:18:41] PROBLEM - Work requests waiting in Zuul Gearman server on contint1001 is CRITICAL: CRITICAL: 53.33% of data above the critical threshold [140.0] https://www.mediawiki.org/wiki/Continuous_integration/Zuul https://grafana.wikimedia.org/dashboard/db/zuul-gearman?panelId=10&fullscreen&orgId=1
[13:25:17] Release-Engineering-Team (Unit & Int & System Tooling), Release-Engineering-Team-TODO (201907), Ruby, User-zeljkofilipin: mediawiki_api Ruby gem improvements - https://phabricator.wikimedia.org/T227584 (zeljkofilipin)
[13:25:33] Release-Engineering-Team (Unit & Int & System Tooling), Release-Engineering-Team-TODO, Ruby: mediawiki_api Ruby gem incorrectly assumes path to index.php - https://phabricator.wikimedia.org/T149169 (zeljkofilipin) Open→Declined Because of {T228160}.
[13:25:35] Release-Engineering-Team (Unit & Int & System Tooling), Release-Engineering-Team-TODO (201907), Ruby, User-zeljkofilipin: mediawiki_api Ruby gem improvements - https://phabricator.wikimedia.org/T227584 (zeljkofilipin)
[13:25:47] Release-Engineering-Team, Release-Engineering-Team-TODO, Documentation, Ruby: Improve mediawiki_api documentation with inline yard - https://phabricator.wikimedia.org/T102726 (zeljkofilipin) Open→Declined Because of {T228160}.
[13:25:49] Release-Engineering-Team (Unit & Int & System Tooling), Release-Engineering-Team-TODO (201907), Ruby, User-zeljkofilipin: mediawiki_api Ruby gem improvements - https://phabricator.wikimedia.org/T227584 (zeljkofilipin)
[13:26:04] Release-Engineering-Team, Release-Engineering-Team-TODO, Patch-For-Review, Ruby, and 2 others: mediawiki_api gem recursion on log_in - https://phabricator.wikimedia.org/T111133 (zeljkofilipin) Stalled→Declined Because of {T228160}.
[13:26:06] Release-Engineering-Team (Unit & Int & System Tooling), Release-Engineering-Team-TODO (201907), Ruby, User-zeljkofilipin: mediawiki_api Ruby gem improvements - https://phabricator.wikimedia.org/T227584 (zeljkofilipin)
[13:26:34] Release-Engineering-Team (Unit & Int & System Tooling), Release-Engineering-Team-TODO (201907), Ruby, User-zeljkofilipin: mediawiki_api Ruby gem improvements - https://phabricator.wikimedia.org/T227584 (zeljkofilipin) Open→Declined Because of {T228160}.
[14:17:49] RECOVERY - Work requests waiting in Zuul Gearman server on contint1001 is OK: OK: Less than 30.00% above the threshold [90.0] https://www.mediawiki.org/wiki/Continuous_integration/Zuul https://grafana.wikimedia.org/dashboard/db/zuul-gearman?panelId=10&fullscreen&orgId=1
[14:24:50] (CR) Legoktm: "recheck" [integration/docroot] - https://gerrit.wikimedia.org/r/523639 (owner: Libraryupgrader)
[14:25:23] (CR) jerkins-bot: [V: -1] build: Updating mediawiki/mediawiki-codesniffer to 26.0.0 [integration/docroot] - https://gerrit.wikimedia.org/r/523639 (owner: Libraryupgrader)
[14:25:49] legoktm: Re. https://libraryupgrader2.wmflabs.org/r/mediawiki/extensions/AJAXPoll "proposed patch", the options should go before not after the files list in Gruntfile, but otherwise it's great.
[14:30:58] (CR) Jforrester: "This repo runs php56 because contint1001 is still a jessie box. :-(" [integration/docroot] - https://gerrit.wikimedia.org/r/523639 (owner: Libraryupgrader)
[14:32:10] (CR) Jforrester: "> Patch Set 1:" [integration/docroot] - https://gerrit.wikimedia.org/r/523639 (owner: Libraryupgrader)
[14:34:31] (CR) Mholloway: "For posterity, I figured out the real issue, which is that I was trying to cherry-pick onto 0.6.0, but this also requires 56e830f in order" [blubber] - https://gerrit.wikimedia.org/r/523421 (owner: Mholloway)
[14:40:18] Release-Engineering-Team (CI & Testing services), Reading-Infrastructure-Team-Backlog: Implement CI rules for new kartotherian repo - https://phabricator.wikimedia.org/T228170 (MSantos)
[14:46:14] Is something wrong with beta sync jobs?
[14:46:33] We're seeing code on test-commons that's not on Beta Commons yet
[14:46:45] marktraceur: Looking.
[14:47:05] Compare https://commons.wikimedia.beta.wmflabs.org/wiki/File:Crystal-1265.stl with https://test-commons.wikimedia.org/wiki/File:Album_cover_Andrew_Fortnum.png
[14:47:20] The latter has links to Wikidata on the super accurate qualifiers
[14:49:12] Tip of wmf.14 is 6c57748aeee6e4f2a197d64785102306fbd4a297
[14:49:36] Beta Cluster is still running de25a85a22ea8ceafdbe270c84143f7f1571c495 from yesterday.
[14:49:41] Huh
[14:49:44] But the beta jobs think they're working.
[14:49:57] Ooooh.
[14:50:05] Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO (201907), Release, Train Deployments: 1.34.0-wmf.14 deployment blockers - https://phabricator.wikimedia.org/T220739 (LarsWirzenius)
[14:50:11] "Waiting for next available executor on ‘deployment-deploy01'"
[14:50:42] !log Jenkins deadlock? Beta jenkins jobs haven't successfully run for ~20 hours.
[14:50:45] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[14:51:40] Cool, good to know I'm not crazy
[14:51:44] Or not as crazy as usual
[14:53:58] OK, I've run `sudo -u jenkins jstat -gcutil PID_HERE 1000 3` but I have no idea what I'm looking at. :-)
[14:57:11] !log Taking deployment-deploy01 offline
[14:57:12] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[15:02:21] Project beta-scap-eqiad build #258051: FAILURE in 0.84 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/258051/
[15:02:46] Eurgh, lovely.
[15:03:51] marktraceur: OK, things should be coming back online. Give it 20?
[15:05:08] Cool thanks James_F
[15:05:30] Project beta-scap-eqiad build #258052: STILL FAILING in 0.79 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/258052/
[15:08:41] Project beta-scap-eqiad build #258053: STILL FAILING in 0.66 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/258053/
[15:09:08] Project beta-scap-eqiad build #258054: STILL FAILING in 0.62 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/258054/
[15:09:35] Hmm, maybe not.
[15:09:45] Not looking great
[15:10:06] !log beta-scap-eqiad failing because mw-config repo is out of date?
[15:10:08] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[15:14:18] (CR) Thcipriani: [C: +2] "Thanks!" [blubber] - https://gerrit.wikimedia.org/r/523421 (owner: Mholloway)
[15:14:24] Project beta-scap-eqiad build #258055: STILL FAILING in 0.8 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/258055/
[15:14:49] oh boy
[15:15:17] (CR) Hashar: "Yeah there is no urgency to bump mediawiki-codesniffer on integration/docroot.git . It is not like it is very active anyway." [integration/docroot] - https://gerrit.wikimedia.org/r/523639 (owner: Libraryupgrader)
[15:15:54] FTR canceling beta-scap-eqiad when it gets into that "waiting for available executor" state works
[15:15:54] `jforrester@deployment-deploy01:/srv/mediawiki-staging` shows master, as expected.
[15:16:14] thcipriani: I did that, it came back immediately with the same state issue.
[15:16:42] (Merged) jenkins-bot: Add usage test for use-system-flag [blubber] - https://gerrit.wikimedia.org/r/523421 (owner: Mholloway)
[15:17:45] OK, so the version of mw-config in /srv/mediawiki is out of date, and that's what scap is using to try to build?
[15:17:51] hrm, haven't had that happen to me recently. There's also this horrible thing: https://www.mediawiki.org/wiki/Continuous_integration/Jenkins#Hung_beta_code/db_update
[15:18:17] What will break if I manually copy from /mediawiki-staging to /mediawiki?
[15:18:28] scap should do that as a first step FWIW
[15:18:41] or you can run scap pull
[15:19:21] !log Ran manual scap pull on deployment-deploy01
[15:19:22] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[15:19:30] Project beta-scap-eqiad build #258056: STILL FAILING in 0.64 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/258056/
[15:19:41] (CR) PipelineBot: "pipeline-dashboard: blubber-pipeline-publish" [blubber] - https://gerrit.wikimedia.org/r/523421 (owner: Mholloway)
[15:19:43] (CR) PipelineBot: "pipeline-dashboard: blubber-pipeline-publish" [blubber] - https://gerrit.wikimedia.org/r/523421 (owner: Mholloway)
[15:19:45] (CR) jenkins-bot: Add usage test for use-system-flag [blubber] - https://gerrit.wikimedia.org/r/523421 (owner: Mholloway)
[15:19:48] Oh, wait.
[15:19:57] I'm an idiot. It's a real failure.
[15:20:23] yep, looks like an undefined variable?
[15:20:59] scap does: echo 1 | mwscript eval.php as a very first step now
[15:21:40] suspenders and a belt deployment strategy
[15:22:11] woot, it works
[15:22:28] Fix is https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/523744
[15:22:30] I didn't realise the error_reporting thingy fix made it so that means this part works now
[15:22:44] Krinkle: Which part?
[15:22:51] Krinkle: it only recently started working. I noticed it actually caught one production problem (yay!)
[15:23:25] James_F: the 'echo 1 | mwscript eval.php' sanity check, which catches all stderr like php undefined vars
[15:23:36] Oh, right.
[15:23:46] was written a few months ago but didn't work initially because PHP7 had pretty loose error_reporting by default (only reports fatals)
[15:23:52] but since then fixed to match hhvm
[15:23:59] I'm not sure errors in the post-merge of config is something people will normally pay attention to.
[15:24:04] caught this one: https://tools.wmflabs.org/sal/log/AWuKvotnEHTBTPG-sd81
[15:24:13] in prod
[15:24:16] I suppose in prod it runs before each scap and so should catch there?
[15:24:19] was the first I noticed it was working
[15:24:21] it catches it during 'scap sync-file'
[15:24:25] and prevents deploy
[15:24:26] * James_F nods.
[15:24:29] Project beta-scap-eqiad build #258057: STILL FAILING in 3.3 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/258057/
[15:24:30] Nice.
[15:24:47] thcipriani: does it run before going to canaries from the deploy server, or is it among the swagger checks we run on canaries?
[15:24:57] I think we settled on the latter as first step right?
[15:25:17] Krinkle: it runs before canaries
[15:25:26] on the deployment host itself
[15:25:37] oh perfect.
[15:25:40] Continuous-Integration-Infrastructure, Jenkins: Increase TTL of failed builds - https://phabricator.wikimedia.org/T228158 (hashar) The retention policy for a job shows up as: {F29782258 size=full} The artifacts are kept for 5, 7, 15, 30 or 60 days. Depending on how often they are triggered and the siz...
[15:25:50] I recall there were some issues with that originally, but I guess that was resolved
[15:25:55] ah, that was about calling it over HTTP
[15:25:58] CLI is fine indeed, nice
[15:26:32] it is actually the first thing we do after obtaining a lock, now that I look
[15:26:46] I approve.
[15:27:15] it was inspired by the arrray() incident
[15:27:27] ( T121597 )
[15:27:27] T121597: Implement MediaWiki pre-promote checks - https://phabricator.wikimedia.org/T121597
[15:27:51] Damn talk like a pirate day
[15:28:04] :)
[15:28:06] Scap (Scap3-MediaWiki-MVP), scap2, Wikimedia-Incident: Implement MediaWiki pre-promote checks - https://phabricator.wikimedia.org/T121597 (Krinkle) Open→Resolved a:thcipriani >>! In T121597#5258039, @Krinkle wrote: > Is this done/enabled in prod? (Yes.)
[15:28:08] Deployments, Scap, WorkType-NewFunctionality: Create canary deploy process for MediaWiki - https://phabricator.wikimedia.org/T136883 (Krinkle)
[15:28:13] heh, "arrray" is still a deployment anecdote I bring up from time-to-time
[15:28:26] Yeah, it's a very good, simple example.
[15:28:57] it means that, in theory, trying to deploy a config file or common core file containing arrray() will not even reach the canary traffic.
[15:29:03] OK, config fix is merged and theoretically that should make things better.
[15:30:07] marktraceur: oh no, I can never look at arrray the same way again now. 🏴‍☠️
[15:30:22] Hehe
[15:30:42] Beta is fixed and I've infected Krinkle with a stupid joke, my work here is done
[15:35:43] Yippee, build fixed!
[15:35:43] Project beta-scap-eqiad build #258058: FIXED in 8 min 30 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/258058/
[15:36:41] Yippee, thanks all
[15:37:33] (PS1) Thcipriani: Unit tests: PosOf InsertElement [blubber] - https://gerrit.wikimedia.org/r/523748
[15:38:16] Continuous-Integration-Config, MediaWiki-extensions-Page_Forms, Patch-For-Review: Add phan to PageForms - https://phabricator.wikimedia.org/T228155 (Yaron_Koren) @Daimona - thanks for putting this together, and of course for your work on the actual patch, which looks like it will make a lot of improv...
[15:39:25] (PS1) Thcipriani: Make test: add coverage output [blubber] - https://gerrit.wikimedia.org/r/523749
[15:41:18] Continuous-Integration-Infrastructure, Jenkins: Increase TTL of failed builds - https://phabricator.wikimedia.org/T228158 (Daimona) >>! In T228158#5337446, @hashar wrote: > [...] Most of the time that is sufficient since people would either fix the issue right away or fill a task or the issue is still r...
[15:45:41] Continuous-Integration-Infrastructure, Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO (201907), Wikidata, and 3 others: WDQS GUI deploy build fails - https://phabricator.wikimedia.org/T227818 (greg) Odd, @hashar ?
[15:48:27] Continuous-Integration-Config, MediaWiki-extensions-Page_Forms, Patch-For-Review: Add phan to PageForms - https://phabricator.wikimedia.org/T228155 (Daimona) >>! In T228155#5337535, @Yaron_Koren wrote: > @Daimona - thanks for putting this together, and of course for your work on the actual patch, whi...
[15:58:02] Release-Engineering-Team (Unit & Int & System Tooling), Release-Engineering-Team-TODO (201907), Ruby, User-zeljkofilipin: Mark mediawiki_api and mediawiki_selenium Ruby gems as deprecated - https://phabricator.wikimedia.org/T228160 (zeljkofilipin)
[16:12:48] Release-Engineering-Team, Release-Engineering-Team-TODO, Operations, Release Pipeline, and 3 others: Migrate production services to kubernetes using the pipeline - https://phabricator.wikimedia.org/T198901 (Jdforrester-WMF)
[16:14:48] Continuous-Integration-Config, MediaWiki-extensions-Page_Forms, Patch-For-Review: Add phan to PageForms - https://phabricator.wikimedia.org/T228155 (Yaron_Koren) Hi, > Unfortunately, phan will complain about undefined classes if SMW is not installed in the test environment, Are you sure about this?...
[16:16:31] Continuous-Integration-Config, MediaWiki-extensions-Page_Forms, Patch-For-Review: Add phan to PageForms - https://phabricator.wikimedia.org/T228155 (Daimona) >>! In T228155#5337679, @Yaron_Koren wrote: > Hi, > >> Unfortunately, phan will complain about undefined classes if SMW is not installed in th...
[16:18:58] Release-Engineering-Team, Release-Engineering-Team-TODO, Operations, Release Pipeline, and 3 others: Migrate production services to kubernetes using the pipeline - https://phabricator.wikimedia.org/T198901 (Jdforrester-WMF)
[16:19:19] (CR) Legoktm: [C: -1] "Oh shoot. libup wasn't supposed to bump this." [integration/docroot] - https://gerrit.wikimedia.org/r/523639 (owner: Libraryupgrader)
[16:21:06] Continuous-Integration-Config, MediaWiki-extensions-Page_Forms, Patch-For-Review: Add phan to PageForms - https://phabricator.wikimedia.org/T228155 (Yaron_Koren) Oh, okay. That's too bad.
[16:24:11] Continuous-Integration-Config, MediaWiki-extensions-Page_Forms, Patch-For-Review: Add phan to PageForms - https://phabricator.wikimedia.org/T228155 (Daimona) >>! In T228155#5337705, @Yaron_Koren wrote: > Oh, okay. That's too bad. Well, not necessarily bad :-) I guess that helps to ensure the extensi...
[16:26:09] Continuous-Integration-Config, MediaWiki-extensions-Page_Forms, Patch-For-Review: Add phan to PageForms - https://phabricator.wikimedia.org/T228155 (Daimona)
[16:36:04] Release-Engineering-Team (CI & Testing services), Reading-Infrastructure-Team-Backlog: Implement CI rules for new kartotherian repo - https://phabricator.wikimedia.org/T228170 (Jdforrester-WMF) a:Jdforrester-WMF
[16:38:41] Release-Engineering-Team (CI & Testing services), Reading-Infrastructure-Team-Backlog: Implement CI rules for new kartotherian repo - https://phabricator.wikimedia.org/T228170 (Jdforrester-WMF) I'll add node10, and also the pipeline as an experimental job.
[16:39:33] (PS1) Jforrester: layout: [mediawiki/services/kartotherian] Initial task, node10 plus experimental pipeline [integration/config] - https://gerrit.wikimedia.org/r/523769 (https://phabricator.wikimedia.org/T228170)
[16:45:39] (CR) Jforrester: [C: +2] layout: [mediawiki/services/kartotherian] Initial task, node10 plus experimental pipeline [integration/config] - https://gerrit.wikimedia.org/r/523769 (https://phabricator.wikimedia.org/T228170) (owner: Jforrester)
[16:46:21] (CR) Jforrester: [C: +2] Add Phan dependencies for Content Translation [integration/config] - https://gerrit.wikimedia.org/r/523569 (owner: Santhosh)
[16:48:01] (Merged) jenkins-bot: layout: [mediawiki/services/kartotherian] Initial task, node10 plus experimental pipeline [integration/config] - https://gerrit.wikimedia.org/r/523769 (https://phabricator.wikimedia.org/T228170) (owner: Jforrester)
[16:48:04] (Merged) jenkins-bot: Add Phan dependencies for Content Translation [integration/config] - https://gerrit.wikimedia.org/r/523569 (owner: Santhosh)
[16:48:44] !log Zuul: Adding first tasks for mediawiki/services/kartotherian T228170
[16:48:49] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[16:48:53] T228170: Implement CI rules for new kartotherian repo - https://phabricator.wikimedia.org/T228170
[16:49:40] !log Zuul: [ContentTranslation] Adding CentralAuth to phan dependencies
[16:49:41] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[16:56:27] Continuous-Integration-Infrastructure, docker-pkg: Pruning docker-pkg images - https://phabricator.wikimedia.org/T207703 (thcipriani) Open→Resolved a:Joe This looks to be released as a feature on contint1001 now.
[16:56:29] Continuous-Integration-Infrastructure, Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO, Release Pipeline: contint1001:/var/lib/docker growth - https://phabricator.wikimedia.org/T207702 (thcipriani)
[16:56:49] Continuous-Integration-Infrastructure, Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO, Release Pipeline: contint1001:/var/lib/docker growth - https://phabricator.wikimedia.org/T207702 (thcipriani) Open→Invalid We use `/mnt/docker` now and it's got a lot of...
[16:59:11] Project-Admins: Create a project on phabricator for twitter to Commons - https://phabricator.wikimedia.org/T228139 (Jnanaranjan_sahu) Thank you
[17:02:23] I'm presuming that I can abandon https://gerrit.wikimedia.org/r/c/integration/config/+/295396 ? Two years old, and we're not planning to use Harbormaster any time soon. ;-)
[17:02:52] Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO (201907), Reading-Infrastructure-Team-Backlog, Patch-For-Review: Implement CI rules for new kartotherian repo - https://phabricator.wikimedia.org/T228170 (Jdforrester-WMF)
[17:03:22] That ship departed long ago James_F :-) No longer in the harbour nor on the dry docks.
[17:06:40] James_F: and the linked task is closed, so yeah cc twentyafterfour
[17:07:46] (Abandoned) Jforrester: Phabricator/harbormaster job templates [integration/config] - https://gerrit.wikimedia.org/r/295396 (https://phabricator.wikimedia.org/T130950) (owner: 20after4)
[17:08:13] OK, *now* the oldest open patch in integration/config is "only" a year old.
[17:08:42] :)
[17:09:06] (Abandoned) Jforrester: Added scribunto to shared gate quibble job [integration/config] - https://gerrit.wikimedia.org/r/449949 (https://phabricator.wikimedia.org/T200976) (owner: WMDE-leszek)
[17:09:27] hey anyone from releng around?
[17:09:33] And now 8 months.
[17:09:39] fsero: Yes.
[17:09:45] i need somebody to help me rebuild some releng images
[17:09:59] specifically this one releng/quibble-jessie
[17:10:38] Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO (201907), Reading-Infrastructure-Team-Backlog: Implement CI rules for new kartotherian repo - https://phabricator.wikimedia.org/T228170 (Jdforrester-WMF) OK, our bit here is done.
[17:10:47] Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO (201907), Reading-Infrastructure-Team-Backlog: Implement CI rules for new kartotherian repo - https://phabricator.wikimedia.org/T228170 (Jdforrester-WMF) a:Jdforrester-WMF→MSantos
[17:11:56] fsero: I don't think we use that any more. We deleted it a while ago.
[17:12:36] fsero: What do you need it for?
[17:15:01] fsero: Obviously, we can bring the images back if they're needed. :-)
[17:17:29] RECOVERY - Puppet errors on integration-slave-jessie-1001 is OK: OK: Less than 1.00% above the threshold [2.0]
[17:17:31] James_F: it seems that there are some layers missing in docker-registry docker pull docker-registry.wikimedia.org/releng/quibble-jessie
[17:17:53] at least that one and docker-registry.wikimedia.org/wikimedia/wikibase-termbox:2019-07-12-144625-production
[17:17:58] how can i trigger a rebuild if it's needed?
[17:18:25] fsero: Well, we undefined it as a useless image no-one should be using, so…
[17:18:47] Termbox is a production service, right? So it must be based off an SRE image and not a RelEng one?
[17:19:50] RECOVERY - Puppet errors on integration-slave-jessie-1004 is OK: OK: Less than 1.00% above the threshold [2.0]
[17:21:28] To bring it back, we'd revert https://gerrit.wikimedia.org/r/c/integration/config/+/518302 and push that to RelEng production and rebuild the images.
[17:22:22] James_F: i don't have more details right now, tarrow brought this up for me on #operations and i'm digging into it :)
[17:22:33] tarrow: Hey.
[17:22:39] he is out, will come back later
[17:22:46] OK.
[17:22:53] * James_F reads operations scrollback
[17:29:51] (CR) Mholloway: [C: +1] Unit tests: PosOf InsertElement [blubber] - https://gerrit.wikimedia.org/r/523748 (owner: Thcipriani)
[17:31:15] FWIW, docker-registry.wikimedia.org/wikimedia/wikibase-termbox:2019-07-12-144625-production was built by https://integration.wikimedia.org/ci/job/service-pipeline-test-and-publish/387/
[17:32:06] rebuilding will get you a different -production tag (since there's a different timestamp) but the 2b6053ca3b8575c43fc5e7f2a1d01b734475ea52 tag should update
[17:35:49] Also that.
[17:40:26] thcipriani: please rebuild, current image is unusable :)
[17:41:26] fsero: rebuilding now https://integration.wikimedia.org/ci/job/service-pipeline-test-and-publish/393/console
[17:43:10] ty
[17:44:39] Hmm, it failed. Not obvious why?
[17:47:40] looks like it failed on 39/60: FROM docker-registry.wikimedia.org/nodejs10-slim AS production
[17:47:48] which would probably be something to do with that image?
[17:49:46] That's an SRE image, right?
[17:49:46] a second ago I got: Error response from daemon: manifest for docker-registry.wikimedia.org/nodejs10-slim:latest not found
[17:50:05] now I get: Status: Downloaded newer image for docker-registry.wikimedia.org/nodejs-slim:latest
[17:50:23] that is indeed an SRE image generated from their docker-pkg repo
[17:50:36] https://tools.wmflabs.org/dockerregistry/nodejs10-slim/tags/ shows a latest tag.
[17:51:29] although I don't see it here(?): https://gerrit.wikimedia.org/r/plugins/gitiles/operations/docker-images/production-images/+/master/images/
[17:52:22] this is what was talking about it seems some content has been lost
[17:52:36] so do you say that nodejs-slim is an SRE image?
[17:52:53] It's certainly not a RelEng one. All ours start with releng/
[17:53:05] yeah, there's no prefix (e.g., releng/ or dev/) so I would think it is
[17:53:39] fsero: https://gerrit.wikimedia.org/r/c/operations/docker-images/production-images/+/518468 was the last change to the repo that looks relevant.
[17:54:09] no is not related to that, it seems that deleting a swift container we have affected somehow the production one
[17:54:22] hrm, it seems "nodejs" is "nodejs-slim" according to the changelog: https://gerrit.wikimedia.org/r/plugins/gitiles/operations/docker-images/production-images/+/master/images/nodejs10/changelog#1
[17:54:34] since this is an SRE image i think i can republish it
[17:55:12] RECOVERY - Puppet errors on integration-slave-jessie-1002 is OK: OK: Less than 1.00% above the threshold [2.0]
[17:56:16] https://phabricator.wikimedia.org/P8755
[18:01:11] it was my understanding that there is no latest by principle
[18:01:14] docker pull docker-registry.wikimedia.org/nodejs10-slim:0.0.1 exist
[18:04:12] hang on
[18:05:48] thcipriani: https://phabricator.wikimedia.org/T228196
[18:06:18] seems that there was a latest: https://tools.wmflabs.org/dockerregistry/nodejs10-slim/tags/
[18:06:28] yes just republished it
[18:06:29] my bad
[18:06:33] ah, k
[18:06:36] the way docker-pkg works
[18:06:41] can you try a republish please?
[18:06:44] * thcipriani does
[18:06:45] i have some republish to do
[18:07:21] going now: https://integration.wikimedia.org/ci/job/service-pipeline-test-and-publish/395/console
[18:13:04] im trying to get a list of affected images
[18:13:12] but in case of doubt you should rebuild/republish
[18:15:09] ocker-registry.wikimedia.org/wikimedia/wikibase-termbox:2b6053ca3b8575c43fc5e7f2a1d01b734475ea52 is valid now, FYI
[18:15:21] er...docker-registry.wikimedia.org/wikimedia/wikibase-termbox:2b6053ca3b8575c43fc5e7f2a1d01b734475ea52
[18:54:17] Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO (201907), Release, Train Deployments: 1.34.0-wmf.14 deployment blockers - https://phabricator.wikimedia.org/T220739 (Jdforrester-WMF)
[19:00:32] Project-Admins: Create #ci-test-error tag for tracking Gerrit repos failing tests - https://phabricator.wikimedia.org/T227992 (Jdforrester-WMF) Open→Resolved a:Jdforrester-WMF Done. #ci-test-error created, #shared-build-failure points to it in the description, #jenkins-failure has been migrated.
[19:28:26] Release-Engineering-Team, MediaWiki-Containers, Operations, Core Platform Team Workboards (Done with CPT), and 4 others: FY2017/18 Program 6 - Outcome 2 - Objective 3: Integrated, container-based development environment - https://phabricator.wikimedia.org/T170456 (Pchelolo)
[19:34:43] Release-Engineering-Team (Pipeline), Release-Engineering-Team-TODO, MediaWiki-Containers, Core Platform Team Backlog (Designing), and 3 others: RFC: Container path conventions - https://phabricator.wikimedia.org/T169998 (Pchelolo) Open→Resolved Given that we now have #blubber and default...
[19:34:47] Release-Engineering-Team, MediaWiki-Containers, Operations, Core Platform Team Workboards (Done with CPT), and 4 others: FY2017/18 Program 6 - Outcome 2 - Objective 3: Integrated, container-based development environment - https://phabricator.wikimedia.org/T170456 (Pchelolo)
[19:34:51] Release-Engineering-Team, Release-Engineering-Team-TODO, Operations, Category, and 3 others: FY2017/18 Program 6: Streamlined Service delivery - https://phabricator.wikimedia.org/T170453 (Pchelolo)
[19:41:38] (CR) Thcipriani: [C: +2] Unit tests: PosOf InsertElement [blubber] - https://gerrit.wikimedia.org/r/523748 (owner: Thcipriani)
[19:42:04] no more deployment servers on jessie, also not in deployment-prep. correct?
[19:42:23] as in "no more PHP5 support" https://gerrit.wikimedia.org/r/c/operations/puppet/+/523735/1/modules/profile/manifests/mediawiki/deployment/server.pp
[19:44:11] (Merged) jenkins-bot: Unit tests: PosOf InsertElement [blubber] - https://gerrit.wikimedia.org/r/523748 (owner: Thcipriani)
[19:46:24] (CR) PipelineBot: "pipeline-dashboard: blubber-pipeline-publish" [blubber] - https://gerrit.wikimedia.org/r/523748 (owner: Thcipriani)
[19:46:26] (CR) PipelineBot: "pipeline-dashboard: blubber-pipeline-publish" [blubber] - https://gerrit.wikimedia.org/r/523748 (owner: Thcipriani)
[19:46:28] (CR) jenkins-bot: Unit tests: PosOf InsertElement [blubber] - https://gerrit.wikimedia.org/r/523748 (owner: Thcipriani)
[19:53:36] mutante: Yes.
[19:54:07] mutante: The only things we have left running php5 in production are things like contint1001 itself.
[19:54:35] James_F: thanks for confirming. i checked deployment-deploy* in deployment-prep
[19:54:46] and ..merged removing support for it
[19:55:25] Cool.
[19:57:24] Gerrit: Support posting screenshots in Gerrit - https://phabricator.wikimedia.org/T228084 (kostajh) > I don't see how/why this has been prioritized hence resetting @Aklapper looks like the ["Create task" form](https://phabricator.wikimedia.org/maniphest/task/edit/form/47/) defaults to "Normal" for priority;...
[20:01:00] Continuous-Integration-Infrastructure: contint1001 spurious disk space alarms - https://phabricator.wikimedia.org/T227605 (Dzahn) @hashar Cool, saw your change. Merged it. Better now to resolve again?
[20:23:58] thcipriani: James_F please check out the task https://phabricator.wikimedia.org/T228196 there are several releng images that should be rebuilt
[20:24:06] if you tell me how ill help
[20:24:34] fsero: I'll run a re-build on the server.
[20:25:09] we also probably have these images on various machines
[20:25:11] !log Docker: Running a general rebuild for all missing RelEng images T228196
[20:25:13] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[20:25:14] T228196: docker-registry: some layers has been corrupted due to deleting other swift containers - https://phabricator.wikimedia.org/T228196
[20:25:18] if needed.
[20:26:20] Yeah, I've got a total local build of RelEng/* locally.
[20:26:57] thcipriani: unless im wrong just a push wouldnt work since the manifest is still on registry and what is missing is underlying content, for that work i need to delete images first and i'd prefer not do that a retag would work better
[20:27:53] So you think we need to bump every docker image's version number, rebuilding them all, and re-configure every job to use the new image numbers?
[20:28:17] (FWIW, the publish step is publishing a number of images.)
[20:28:23] Release-Engineering-Team, Gerrit-Privilege-Requests: Gerrit manager rights for Ottomata - https://phabricator.wikimedia.org/T226724 (Ottomata) > i.e., for your new project inheriting rights from mediawiki may restrict you somewhat.
Ah shoot, this bit me. I just created https://gerrit.wikimedia.org/r/a...
[20:28:36] are they pinned to a tag? or just using latest?
[20:28:43] thcipriani: ^^ can you help when you have a sec?
[20:28:54] if they are using latest as part of the retag latest will point to the latest version
[20:28:59] fsero: Pinned to a tag.
[20:29:11] fsero: We don't use :latest as that can break things unexpectedly.
[20:29:17] and is good
[20:29:23] i do agree just checking
[20:29:41] how hard it is the reconfiguration?
[20:29:58] Release-Engineering-Team-TODO (201907), Operations, serviceops, Wikimedia-Incident: docker-registry: some layers has been corrupted due to deleting other swift containers - https://phabricator.wikimedia.org/T228196 (Jdforrester-WMF) ` [contint1001.wikimedia.org] out: == Step 0: scanning /etc/zuul...
[20:30:06] Release-Engineering-Team-TODO (201907), Operations, serviceops, Wikimedia-Incident: docker-registry: some layers has been corrupted due to deleting other swift containers - https://phabricator.wikimedia.org/T228196 (Jdforrester-WMF)
[20:30:10] It's tedious. It'll take me a couple of hours to do well.
[20:30:58] (We have 97 images currently, and > 450 jobs that use them.)
[20:31:55] try pulling this image docker pull docker-registry.wikimedia.org/releng/tox-poolcounter
[20:31:58] it would work
[20:32:19] while docker-registry.wikimedia.org/releng/civicrm
[20:32:23] it will not
[20:32:36] Indeed.
[20:33:06] is because tox-poolcounter has been rebuilt and the others probably just a retag
[20:33:16] Also docker-registry.wikimedia.org/releng/civicrm:0.1.1-s1 breaks too.
[20:33:30] because is a retag probably
[20:33:45] Presumably all the images are still sitting on contint1001's docker image partition.
[20:35:10] ottomata: give it a try now
[20:35:59] I don't know about that re:all images on contint1001's image partition
[20:36:10] * James_F sighs.
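The pull checks traded above (tox-poolcounter pulling cleanly while civicrm does not) could be automated roughly as follows. This is only a sketch, not a tool anyone in the channel used: `docker_pull` and `find_broken` are hypothetical helper names, and the only real interface assumed is the `docker pull` command itself. Note that `docker pull` skips layers already present on the local machine, so a check like this is only meaningful from a host with a clean image cache.

```python
import subprocess
from typing import Callable, Iterable, List


def docker_pull(ref: str) -> bool:
    """Return True if `docker pull <ref>` exits successfully (hypothetical helper)."""
    result = subprocess.run(
        ["docker", "pull", ref],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def find_broken(refs: Iterable[str],
                pull: Callable[[str], bool] = docker_pull) -> List[str]:
    """Return the image references whose pull fails, in input order."""
    return [ref for ref in refs if not pull(ref)]


if __name__ == "__main__":
    # References taken from the conversation above; the outcome depends on
    # the state of the registry at the time of the run.
    candidates = [
        "docker-registry.wikimedia.org/releng/tox-poolcounter",
        "docker-registry.wikimedia.org/releng/civicrm:0.1.1-s1",
    ]
    for ref in find_broken(candidates):
        print("cannot pull:", ref)
```

Passing the pull function in as a parameter keeps the filtering logic testable without a Docker daemon.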
[20:36:12] since we move to /mnt/docker
[20:36:14] hm thcipriani i see that it says it inherits from All-Projects now
[20:36:15] *moved
[20:36:20] but i still can't edit anything on gerrit
[20:36:53] what are you trying to edit?
[20:37:36] oh wait, defaults to refs/heads/* rather than refs/*, one sec.
[20:37:55] James_F: where are you running that docker-pkg?
[20:38:04] i guess permissions?
[20:38:10] i'd like to set a description too :)
[20:38:16] fsero: contint1001.
[20:38:33] fsero: (Which is the only thing with publish permissions.)
[20:39:18] ottomata: ok, Gerrit Managers now own refs/* that should let you modify permissions on the projects (refs/meta/config)
[20:39:31] s/projects/project/
[20:39:55] hrm, it looks like /dev images are on this list; i can probably do a rebuild on those (not that i think many are using them at the moment).
[20:40:26] Hm, thcipriani doesn't seem so. I can edit the description and Project Options now
[20:40:32] but I can't change access rights
[20:44:44] hrm, I don't understand why that would be, "Being project owner means that you own a project in Gerrit. Technically this is expressed by having the Owner access right on refs/* on that project. As project owner you have the permission to edit the access control list and the project settings of the project."
[20:44:57] via: https://gerrit.wikimedia.org/r/Documentation/intro-project-owner.html
[20:48:31] there's no edit here? https://gerrit.wikimedia.org/r/#/admin/projects/eventgate-wikimedia,access
[20:48:47] https://usercontent.irccloud-cdn.com/file/Nn4TFqoi/Screen%20Shot%202019-07-16%20at%2016.48.17.png
[20:49:18] oh, what about in the old ui?
[20:49:21] hm
[20:49:31] that seems to work!
[20:49:31] > This is currently in read only mode. To modify content, go to the Old UI
[20:49:37] is what I see in the new UI
[20:49:48] ¯\_(ツ)_/¯
[20:49:59] ok great!
[20:50:15] I don't see that warning
[20:50:17] but the old UI works
[20:50:19] thank you!
[20:50:22] yw!
[20:50:22] I was able to push
[20:50:26] great
[20:53:33] hrm, doing a nightly docker-pkg won't work either: we'd have to bump all the images in jjb there as well
[20:54:31] fsero: what happens when we try to re-push the same image with the same tags? explosion?
[20:55:53] (PS1) Ottomata: Trigger service-pipeline docker image builds of eventgate-wikimedia [integration/config] - https://gerrit.wikimedia.org/r/523804 (https://phabricator.wikimedia.org/T226668)
[20:57:48] it would not upload anything as registry, have a listed manifest with that version
[20:57:56] so docker would refuse to upload anything
[20:58:10] another option is to completely remove images from registry then a re push would work
[20:58:35] but i'd prefer to avoid that if possible
[20:58:39] Wow, we have two RelEng docker images not based on ci-stretch or ci-jessie.
[20:58:58] Oh, wait, one of them is but is a mistake.
[21:01:58] hrm, if we can't push the same version then we'd have to bump all docker-pkg images, and then bump all the jjb to accommodate
[21:02:11] which would take a bit.
[21:02:16] thcipriani: Yes. ^^
[21:02:25] also: are all versions for these images now corrupt?
[21:02:28] Bah, my timing sucks.
[21:02:34] yeah they are corrupt
[21:02:52] (PS1) Jforrester: dockerfiles: Bump all 94 images for docker-registry consistency [integration/config] - https://gerrit.wikimedia.org/r/523807 (https://phabricator.wikimedia.org/T228196)
[21:03:05] OK, ^^ bumps all 94 of our RelEng images.
[21:04:21] (CR) Fsero: [C: +1] dockerfiles: Bump all 94 images for docker-registry consistency [integration/config] - https://gerrit.wikimedia.org/r/523807 (https://phabricator.wikimedia.org/T228196) (owner: Jforrester)
[21:04:34] some of these are bumped twice, although, I guess it doesn't really matter.
[21:04:43] thcipriani: Oh, lame. One moment.
[21:04:50] ottomata: ping
[21:04:57] checking that gerrit repo for you
[21:05:04] I think you should be able to push?
[21:05:16] Owner: Gerrit Managers
[21:05:21] Push: Gerrit Managers
[21:05:27] Inherit from: All-Projects
[21:05:44] hauskatze: he got it figured out, new ui wasn't letting him edit.
[21:06:04] new UI is tae cra*** for now
[21:06:07] :)
[21:06:17] I think they made pretty nice fixes in 2.16 et later
[21:06:23] (PS2) Jforrester: dockerfiles: Bump all 94 images for docker-registry consistency [integration/config] - https://gerrit.wikimedia.org/r/523807 (https://phabricator.wikimedia.org/T228196)
[21:06:27] :)
[21:06:31] thcipriani: so no need to do anything?
[21:06:35] thanks hauskatze i will note the thcipriani helped.
[21:06:40] and it works now with old UI
[21:06:59] perfect, there's nothing that cats love more than watch and lazing around :)
[21:07:16] Release-Engineering-Team, Gerrit-Privilege-Requests: Gerrit manager rights for Ottomata - https://phabricator.wikimedia.org/T226724 (Ottomata) Ah! Tyler helped me. He fixed some perms for Gerrit Managers, but then I had to switch to the old UI to edit access permissions. Works for me!
[21:09:41] (CR) Fsero: dockerfiles: Bump all 94 images for docker-registry consistency (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/523807 (https://phabricator.wikimedia.org/T228196) (owner: Jforrester)
[21:09:47] (CR) Thcipriani: [C: +2] dockerfiles: Bump all 94 images for docker-registry consistency (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/523807 (https://phabricator.wikimedia.org/T228196) (owner: Jforrester)
[21:10:16] (CR) Jforrester: dockerfiles: Bump all 94 images for docker-registry consistency (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/523807 (https://phabricator.wikimedia.org/T228196) (owner: Jforrester)
[21:10:34] ah, crap.
[21:10:44] (PS3) Jforrester: dockerfiles: Bump all 94 images for docker-registry consistency [integration/config] - https://gerrit.wikimedia.org/r/523807 (https://phabricator.wikimedia.org/T228196)
[21:12:23] I'm trying to write a clever sed script to bump all the jjb entries.
[21:12:35] James_F: sorry, I have an itchy merge finger sometimes :\
[21:15:32] thcipriani: I just want to make the jjb change first. :-)
[21:24:19] (PS1) Jforrester: jjb: Point all docker references to new images post-T228196 [integration/config] - https://gerrit.wikimedia.org/r/523813 (https://phabricator.wikimedia.org/T228196)
[21:24:27] (CR) Jforrester: [C: +2] dockerfiles: Bump all 94 images for docker-registry consistency [integration/config] - https://gerrit.wikimedia.org/r/523807 (https://phabricator.wikimedia.org/T228196) (owner: Jforrester)
[21:24:44] What, me, a hack?! Never! ;-)
[21:25:09] Release-Engineering-Team (Code Health), Release-Engineering-Team-TODO (201907), Code-Stewardship-Reviews, Graphoid, and 3 others: graphoid: Code stewardship request - https://phabricator.wikimedia.org/T211881 (Jrbranaa) p:Normal→High
[21:25:19] Release-Engineering-Team (Code Health), Release-Engineering-Team-TODO (201907), Code-Stewardship-Reviews, Graphoid, and 3 others: graphoid: Code stewardship request - https://phabricator.wikimedia.org/T211881 (greg) >>! In T211881#5332195, @akosiaris wrote: > the hardware and the Operating System...
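The jjb re-pointing mentioned at 21:12 (rewriting every pinned image reference to a bumped version) might look something like the sketch below. It is not the sed script James_F wrote, which does not appear in this log; the function name `repoint`, the regular expression, and the new version string `0.1.1-s2` are all assumptions made for illustration. Only the `releng/` prefix and the `civicrm:0.1.1-s1` reference come from the conversation.

```python
import re
from typing import Dict

# Matches pinned references such as
# docker-registry.wikimedia.org/releng/civicrm:0.1.1-s1
IMAGE_REF = re.compile(
    r"(docker-registry\.wikimedia\.org/releng/)"  # registry and namespace
    r"([A-Za-z0-9._-]+)"                          # image name
    r":([A-Za-z0-9._-]+)"                         # pinned tag
)


def repoint(text: str, new_versions: Dict[str, str]) -> str:
    """Replace the tag of every known image in `text`; leave others untouched."""
    def swap(match: "re.Match[str]") -> str:
        prefix, name, old_tag = match.groups()
        return f"{prefix}{name}:{new_versions.get(name, old_tag)}"

    return IMAGE_REF.sub(swap, text)


if __name__ == "__main__":
    line = "image: docker-registry.wikimedia.org/releng/civicrm:0.1.1-s1"
    # "0.1.1-s2" is a made-up target version for the example.
    print(repoint(line, {"civicrm": "0.1.1-s2"}))
```

Driving the rewrite from an explicit name-to-version mapping, rather than incrementing whatever tag is found, avoids the double-bump problem noted at 21:04:34.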
[21:25:23] Release-Engineering-Team (Code Health), Code-Stewardship-Reviews, MediaWiki-extensions-FlaggedRevs: FlaggedRevs: code stewardship review - https://phabricator.wikimedia.org/T185664 (Jrbranaa) p:Normal→High
[21:25:46] Release-Engineering-Team (Code Health), Code-Stewardship-Reviews, MediaWiki-extensions-ShortUrl: Code Stewardship Review: ShortUrl Extension - https://phabricator.wikimedia.org/T187045 (greg) p:Triage→Normal
[21:26:01] (Merged) jenkins-bot: dockerfiles: Bump all 94 images for docker-registry consistency [integration/config] - https://gerrit.wikimedia.org/r/523807 (https://phabricator.wikimedia.org/T228196) (owner: Jforrester)
[21:26:24] James_F / thcipriani -- translatewiki.net CI queue seems broken: https://integration.wikimedia.org/ci/job/translatewiki-composer-hhvm-docker/1184/console
[21:27:06] 23:24:39 docker: error pulling image configuration: image config verification failed for digest sha256:596213af1072e08a7dd68cd770441ac3401b4f2fad74f1f7f19e56485154dd8e.
[21:27:17] hauskatze: Yup, that's the UBN we're fixing now.
[21:27:25] alrighty
[21:27:28] hauskatze: See T228196 :-(
[21:27:28] T228196: docker-registry: some layers has been corrupted due to deleting other swift containers - https://phabricator.wikimedia.org/T228196
[21:27:45] Random images won't be available and will be corrupted on fetch.
[21:27:45] delete all the things
[21:29:23] !log Docker: Publishing a whole new set of RelEng images for T228196
[21:29:25] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[21:30:17] (From my experience, this will take about an hour.)
[21:30:50] Release-Engineering-Team (Code Health), Code-Stewardship-Reviews, MediaWiki-extensions-Nuke: Nuke Extension: Code Stewardship Review - https://phabricator.wikimedia.org/T221155 (greg) p:Triage→Normal
[21:30:57] * James_F rebuilds the world locally, too.
[21:34:12] Release-Engineering-Team (Code Health), Code-Stewardship-Reviews, MediaWiki-extensions-LiquidThreads: LiquidThreads: code stewardship review - https://phabricator.wikimedia.org/T187487 (greg) p:Triage→Normal
[21:35:38] OK, we're six minutes in and have done 7 of 94.
[21:36:06] one each minute-ish
[21:38:09] Is there some way to trigger a rebuild without pushing a changeset?
[21:38:58] Or maybe a better question is to clarify whether any individual action is required for T228196
[21:38:59] T228196: docker-registry: some layers has been corrupted due to deleting other swift containers - https://phabricator.wikimedia.org/T228196
[21:39:30] Release-Engineering-Team-TODO (201907), Operations, serviceops, Patch-For-Review, Wikimedia-Incident: docker-registry: some layers has been corrupted due to deleting other swift containers - https://phabricator.wikimedia.org/T228196 (bd808) https://integration.wikimedia.org/ci/job/labs-strike...
[21:39:54] James_F: will your big push catch tox-labs-striker?
[21:40:32] (PS1) MarcoAurelio: [Convert2Wiki] Archive extension [integration/config] - https://gerrit.wikimedia.org/r/523818 (https://phabricator.wikimedia.org/T228198)
[21:40:40] bd808: Yup.
[21:40:57] * bd808 will twiddle thumbs then :)
[21:41:41] It'll probably be an hour or more, sorry. :-(
[21:42:47] no biggie. I was going to do a quick follow up deploy to squelch some deprecation log noise in Striker. I can do it later tonight
[21:44:52] urandom: you should rebuild sessionstore and point to the new built image, the old image is still cached but if is wiped from cache it would be impossible to launch a pod in k8s
[21:45:01] but you can do it later or tomorrow
[21:45:14] fsero: gotcha
[21:45:32] fsero: and I do this by just pushing a code change of some kind?
[21:45:41] or is there some way to simply trigger a rebuild?
[21:46:14] you can always rebuild the last jenkins job that created your image, it will publish a new image
[21:46:36] Release-Engineering-Team (Deployment services), Release-Engineering-Team-TODO, Operations, SRE-Access-Requests: Request access to deployment cluster for Alaa Sarhan - https://phabricator.wikimedia.org/T223698 (alaa_wmde) thanks @greg, @MoritzMuehlenhoff and @akosiaris (and @Tarrow for the ping)...
[21:46:56] fsero: is that going to work if the last build was just pushing a tag? Does it matter that the resulting name will be the same?
[21:47:48] no, it needs to be a new image and i was under the impression that triggering the pipeline would always create a IMGNAME:TIMESTAMP tag
[21:48:17] yeah, it will, but I used the image corresponding to the tagname on the last deploy
[21:48:33] I can create a new tag though
[21:49:18] if you can pull it from your laptop then is ok to use
[21:50:41] https://www.irccloud.com/pastebin/4S1tjUt0/
[21:51:01] So yeah, I'll push another tag and deploy tomorrow
[21:51:37] I think that image only made it as far as staging anyway
[21:51:54] fsero: thanks; I'll leave you be now!
[21:57:09] * hauskatze deleting GitHub mirror of Convert2Wiki
[21:58:42] !log GitHub: deleted `wikimedia/mediawiki-extensions-Convert2Wiki` refs. T228198
[21:58:45] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[21:58:46] T228198: Archive the Convert2Wiki extension - https://phabricator.wikimedia.org/T228198
[21:59:44] (CR) Hashar: "Do not do that. It is going to break everything :-\" [integration/config] - https://gerrit.wikimedia.org/r/523807 (https://phabricator.wikimedia.org/T228196) (owner: Jforrester)
[22:00:17] (CR) Jforrester: "> Patch Set 3:" [integration/config] - https://gerrit.wikimedia.org/r/523807 (https://phabricator.wikimedia.org/T228196) (owner: Jforrester)
[22:02:58] (CR) Hashar: "We have all our containers on contint1001.wikimedia.org so we can just push from there!
And in the lucky case there are some missing, the" [integration/config] - https://gerrit.wikimedia.org/r/523807 (https://phabricator.wikimedia.org/T228196) (owner: Jforrester)
[22:07:58] (CR) Jforrester: "> Patch Set 3:" [integration/config] - https://gerrit.wikimedia.org/r/523807 (https://phabricator.wikimedia.org/T228196) (owner: Jforrester)
[22:16:23] Status update: Building the 44th image of 94 now after 47 minutes. Images will be published after all builds have finished.
[22:16:46] It should get faster towards the end. Hopefully.
[22:37:27] Hi! guessing this talk of rebuilding images is related to the 'error pulling image configuration' stuff we're seeing in multiple repo tests?
[22:43:03] ejegg: most probably.
[22:43:33] ejegg: Yes.
[22:43:51] k, thanks!
[22:44:17] Status update: Building the 60th image of 94 now after 85 minutes. Images will be published after all builds have finished.
[22:57:10] thcipriani: James_F i might be wrong but seems that rescuing blobs from backup it worked
[22:57:12] ➜ ~ docker pull -a docker-registry.wikimedia.org/dev/mediawiki-xdebug (⎈ |helmmanagement:kube-system)
[22:57:12] 0.0.1-1: Pulling from dev/mediawiki-xdebug
[22:57:12] 5c86276767f3: Already exists
[22:57:12] 0445fd56950a: Pull complete
[22:57:12] 4edf12b6f5f0: Pull complete
[22:57:12] 39961bb3a06a: Pull complete
[22:57:12] ff1bd3a20a43: Pull complete
[22:57:13] c8d5cc47c664: Pull complete
[22:57:13] f94c56cc4983: Pull complete
[22:57:14] d4879eaa91af: Pull complete
[22:57:14] d4726684c460: Pull complete
[22:57:15] a8ae8fa08136: Pull complete
[22:57:38] so maybe we dont need to merge the jjb change after all
[22:58:23] bd808: could you please rerun your CI job?
[23:00:10] hrm, well, did the pull from https://integration.wikimedia.org/ci/job/translatewiki-composer-hhvm-docker/1184/console on docker-1058 and it worked, so that's positive
[23:02:59] same with docker-1059 on https://integration.wikimedia.org/ci/job/labs-striker-tox-docker/152/console
[23:05:28] according to my test no image is failing during pulls
[23:06:36] Hmm.
[23:09:27] We're now ~15 images away from finishing the set.
[23:09:53] (CR) Jforrester: [C: -1] "May not be necessary." [integration/config] - https://gerrit.wikimedia.org/r/523813 (https://phabricator.wikimedia.org/T228196) (owner: Jforrester)
[23:10:08] I'll just let it land.
[23:11:45] fsero: mine seems to work now
[23:24:25] (Ping bd808 and ejegg, world should be fixed now including your bits.)
[23:24:48] * bd808 checks bits, finds them mostly in working order
[23:24:59] Quite. :-)
[23:31:44] (New images now being published.)
[23:34:45] thanks James_F! CiviCRM tests are looking good now
[23:34:54] Excellent.
[23:53:43] in case you were wondering, it takes 51 minutes to clone all of Gerrit
[23:55:36] legoktm: Is that all? It's taken me over two hours to rebuild/push all the RelEng docker images, and it's not finished yet. :-(
[23:55:53] Which is much less code. So… well done gerrit?
[23:57:58] I'm pleasantly surprised
[23:58:36] Also, you're reminding me that we need to provide read-only replicas of gerrit so your massive query load goes elsewhere. :-)
[23:58:51] I was doing it inside cloud services, so the bottleneck (I think) was the CPU of the instance
[23:58:57] I'm setting up a replica!
[23:59:03] Right.
[23:59:03] That's why I had to clone everything :p
[23:59:09] Excuses. ;-)
[23:59:10] https://ggmirror.wmflabs.org/cgit/
[23:59:26] Is codesearch going to use that now?