[00:00:33] 10Release-Engineering-Team, 10Scap, 10Growth-Team, 10MediaWiki-Recent-changes, and 3 others: Scap deployments are not purging MessageBlobStore (was: Stale localized messages) - https://phabricator.wikimedia.org/T222539 (10Jdforrester-WMF) [00:49:05] RECOVERY - Mathoid on deployment-mathoid is OK: HTTP OK: HTTP/1.1 200 OK - 925 bytes in 0.026 second response time [00:55:04] PROBLEM - Mathoid on deployment-mathoid is CRITICAL: connect to address 172.16.5.73 and port 10042: Connection refused [01:05:04] RECOVERY - Mathoid on deployment-mathoid is OK: HTTP OK: HTTP/1.1 200 OK - 925 bytes in 0.037 second response time [01:11:04] PROBLEM - Mathoid on deployment-mathoid is CRITICAL: connect to address 172.16.5.73 and port 10042: Connection refused [01:25:06] (03PS1) 10Catrope: Clear MessageBlobStore after syncing i18n data [tools/scap] - 10https://gerrit.wikimedia.org/r/508488 (https://phabricator.wikimedia.org/T222539) [01:26:46] (03CR) 10Catrope: "I wasn't able to test this patch and I don't really know how to write Python, so I cargo-culted heavily. 
Please review this patch with a l" [tools/scap] - 10https://gerrit.wikimedia.org/r/508488 (https://phabricator.wikimedia.org/T222539) (owner: 10Catrope) [01:27:35] 10Release-Engineering-Team, 10Scap, 10MediaWiki-ResourceLoader, 10Patch-For-Review, and 2 others: Scap deployments are not purging MessageBlobStore (was: Stale localized messages) - https://phabricator.wikimedia.org/T222539 (10Catrope) [01:27:43] (03CR) 10PipelineBot: "pipeline-dashboard: service-pipeline-test" [tools/scap] - 10https://gerrit.wikimedia.org/r/508488 (https://phabricator.wikimedia.org/T222539) (owner: 10Catrope) [01:27:49] (03PS2) 10Catrope: Clear MessageBlobStore after syncing i18n data [tools/scap] - 10https://gerrit.wikimedia.org/r/508488 (https://phabricator.wikimedia.org/T222539) [01:28:19] 10Release-Engineering-Team, 10Scap, 10MediaWiki-ResourceLoader, 10Patch-For-Review, and 2 others: Scap deployments are not purging MessageBlobStore (was: Stale localized messages) - https://phabricator.wikimedia.org/T222539 (10Catrope) a:03Catrope [01:29:25] (03PS3) 10Catrope: Clear MessageBlobStore after syncing i18n data [tools/scap] - 10https://gerrit.wikimedia.org/r/508488 (https://phabricator.wikimedia.org/T222539) [01:29:57] 10Deployments, 10Release-Engineering-Team (Kanban), 10MediaWiki-Internationalization, 10Patch-For-Review, and 2 others: Post-mortem "MWException: No localisation cache found for English." - https://phabricator.wikimedia.org/T217719 (10Catrope) Wrong bug, sorry. Neither of these patches addresses this bug,... 
[01:30:33] (03CR) 10PipelineBot: "pipeline-dashboard: service-pipeline-test" [tools/scap] - 10https://gerrit.wikimedia.org/r/508488 (https://phabricator.wikimedia.org/T222539) (owner: 10Catrope) [01:31:36] (03CR) 10PipelineBot: "pipeline-dashboard: service-pipeline-test" [tools/scap] - 10https://gerrit.wikimedia.org/r/508488 (https://phabricator.wikimedia.org/T222539) (owner: 10Catrope) [02:05:53] PROBLEM - Free space - all mounts on deployment-fluorine02 is CRITICAL: CRITICAL: deployment-prep.deployment-fluorine02.diskspace._srv.byte_percentfree (<10.00%) [03:51:03] RECOVERY - Mathoid on deployment-mathoid is OK: HTTP OK: HTTP/1.1 200 OK - 925 bytes in 0.038 second response time [03:57:05] PROBLEM - Mathoid on deployment-mathoid is CRITICAL: connect to address 172.16.5.73 and port 10042: Connection refused [04:02:06] RECOVERY - Mathoid on deployment-mathoid is OK: HTTP OK: HTTP/1.1 200 OK - 925 bytes in 0.036 second response time [04:08:04] PROBLEM - Mathoid on deployment-mathoid is CRITICAL: connect to address 172.16.5.73 and port 10042: Connection refused [04:18:04] RECOVERY - Mathoid on deployment-mathoid is OK: HTTP OK: HTTP/1.1 200 OK - 925 bytes in 0.046 second response time [06:06:05] PROBLEM - Mathoid on deployment-mathoid is CRITICAL: connect to address 172.16.5.73 and port 10042: Connection refused [06:50:53] RECOVERY - Free space - all mounts on deployment-fluorine02 is OK: OK: All targets OK [08:11:50] (03PS1) 10Hashar: WMF: backport Fix reject clauses in the abscence of approvals [integration/zuul] (patch-queue/debian/jessie-wikimedia) - 10https://gerrit.wikimedia.org/r/508509 (https://phabricator.wikimedia.org/T105474) [08:15:02] (03PS2) 10Hashar: patch: Don't call merger for non live item [integration/zuul] (debian/jessie-wikimedia) - 10https://gerrit.wikimedia.org/r/508390 (https://phabricator.wikimedia.org/T140297) [08:15:04] (03PS1) 10Hashar: patch: Fix reject clauses in the absence of approvals [integration/zuul] (debian/jessie-wikimedia) - 
10https://gerrit.wikimedia.org/r/508511 (https://phabricator.wikimedia.org/T105474) [08:20:23] <_joe_> hashar: I'm refactoring scap/role::deployment_server in puppet. Who is the best person in your team to give me a feedback on the patches? [08:20:32] <_joe_> tyler I'd guess? [08:20:40] <_joe_> or you? [08:21:36] (03PS1) 10Hashar: zuul: skip test/test-prio for CR+2 changes [integration/config] - 10https://gerrit.wikimedia.org/r/508512 (https://phabricator.wikimedia.org/T105474) [08:25:25] (03PS2) 10Hashar: zuul: skip test/test-prio for CR+2 changes [integration/config] - 10https://gerrit.wikimedia.org/r/508512 (https://phabricator.wikimedia.org/T105474) [08:26:46] 10Continuous-Integration-Infrastructure, 10Jenkins: Jenkins jobs regularly being queued while resources appear to be readily available - https://phabricator.wikimedia.org/T218458 (10hashar) [08:26:50] (03CR) 10jerkins-bot: [V: 04-1] zuul: skip test/test-prio for CR+2 changes [integration/config] - 10https://gerrit.wikimedia.org/r/508512 (https://phabricator.wikimedia.org/T105474) (owner: 10Hashar) [08:27:38] (03CR) 10Hashar: [V: 03+2 C: 03+2] "Tested added in integration/config to confirm a patch without any approval is still accepted:" [integration/zuul] (patch-queue/debian/jessie-wikimedia) - 10https://gerrit.wikimedia.org/r/508509 (https://phabricator.wikimedia.org/T105474) (owner: 10Hashar) [08:29:24] (03CR) 10Hashar: "recheck due to https://gerrit.wikimedia.org/r/#/c/integration/zuul/+/508509/ being merged." 
(031 comment) [integration/config] - 10https://gerrit.wikimedia.org/r/508512 (https://phabricator.wikimedia.org/T105474) (owner: 10Hashar) [08:30:48] (03CR) 10jerkins-bot: [V: 04-1] zuul: skip test/test-prio for CR+2 changes [integration/config] - 10https://gerrit.wikimedia.org/r/508512 (https://phabricator.wikimedia.org/T105474) (owner: 10Hashar) [08:32:56] (03PS3) 10Hashar: zuul: skip test/test-prio for CR+2 changes [integration/config] - 10https://gerrit.wikimedia.org/r/508512 (https://phabricator.wikimedia.org/T105474) [08:35:41] (03CR) 10Hashar: [C: 04-2] "Requires Zuul to be bumped in production." [integration/config] - 10https://gerrit.wikimedia.org/r/508512 (https://phabricator.wikimedia.org/T105474) (owner: 10Hashar) [08:38:54] 10Continuous-Integration-Config, 10Release-Engineering-Team (Kanban), 10Zuul, 10Patch-For-Review, 10Upstream: 'recheck' on a CR+2 patch should trigger gate-and-submit, not test - https://phabricator.wikimedia.org/T105474 (10hashar) I have backported the upstream patch https://review.opendev.org/#/c/58976... [08:39:04] (03PS1) 10Hashar: Allow passing args to flake8 tox env [integration/config] - 10https://gerrit.wikimedia.org/r/508514 [08:41:56] _joe_: I have missed your ping sorry. I would guess 20after4 , Tyler, Dduval and potentially Krenair as well [08:42:07] <_joe_> ok thanks [08:42:20] <_joe_> it's an unholy mess [08:42:32] <_joe_> and it hurts my sense of aestethics [08:42:41] <_joe_> also makes it incredibly confusing to fix things [08:42:42] _joe_: I'm familiar with that mess [08:43:17] I might be able to help untangle some of it [08:46:36] hmm no the role is actually newer than anything I was involved with.... [08:47:33] but yeah what I do understand is that deployment server is a messy one. 
It's a bastardized mediawiki node plus various random things glued on [08:51:41] <_joe_> yeah trying to slowly untangle it [08:51:54] <_joe_> I'll add you as a rewiewer to all the patches [08:52:25] <_joe_> it's a loong path :/ [08:54:33] _joe_: so you are tackling all of the deployment_server related stuff? [08:55:36] hashar identified the right people. I think the most familiar group would be tyler, myself and Krenair with a little bit of Krinkle for good measure. [08:56:51] the "main" piece is probably class scap::master [08:59:02] <_joe_> no it's the unclear distinction between scap2, scap3, their ties into mediawiki [08:59:13] <_joe_> and the fact they share the configuration [09:02:24] (03CR) 10Hashar: [C: 03+2] patch: Don't call merger for non live item [integration/zuul] (debian/jessie-wikimedia) - 10https://gerrit.wikimedia.org/r/508390 (https://phabricator.wikimedia.org/T140297) (owner: 10Hashar) [09:02:29] (03CR) 10Hashar: [C: 03+2] patch: Fix reject clauses in the absence of approvals [integration/zuul] (debian/jessie-wikimedia) - 10https://gerrit.wikimedia.org/r/508511 (https://phabricator.wikimedia.org/T105474) (owner: 10Hashar) [09:05:22] (03Merged) 10jenkins-bot: patch: Don't call merger for non live item [integration/zuul] (debian/jessie-wikimedia) - 10https://gerrit.wikimedia.org/r/508390 (https://phabricator.wikimedia.org/T140297) (owner: 10Hashar) [09:05:26] (03Merged) 10jenkins-bot: patch: Fix reject clauses in the absence of approvals [integration/zuul] (debian/jessie-wikimedia) - 10https://gerrit.wikimedia.org/r/508511 (https://phabricator.wikimedia.org/T105474) (owner: 10Hashar) [09:08:21] <_joe_> I also fear I trusted the docs in the code a bit too much :/ [09:08:53] (03PS1) 10Hashar: 2.5.1-wmf8: bugs fix following incident [integration/zuul] (debian/jessie-wikimedia) - 10https://gerrit.wikimedia.org/r/508520 (https://phabricator.wikimedia.org/T105474) [09:23:45] (03CR) 10Hashar: [C: 03+2] 2.5.1-wmf8: bugs fix following incident 
[integration/zuul] (debian/jessie-wikimedia) - 10https://gerrit.wikimedia.org/r/508520 (https://phabricator.wikimedia.org/T105474) (owner: 10Hashar) [09:26:40] (03Merged) 10jenkins-bot: 2.5.1-wmf8: bugs fix following incident [integration/zuul] (debian/jessie-wikimedia) - 10https://gerrit.wikimedia.org/r/508520 (https://phabricator.wikimedia.org/T105474) (owner: 10Hashar) [09:32:14] 10Continuous-Integration-Infrastructure, 10Zuul, 10Operations, 10Wikimedia-Incident: Upload zuul_2.5.1-wmf8 to apt.wikimedia.org - https://phabricator.wikimedia.org/T222689 (10hashar) [09:32:52] 10Continuous-Integration-Infrastructure, 10Zuul, 10Operations, 10Wikimedia-Incident: Upload zuul_2.5.1-wmf8 to apt.wikimedia.org - https://phabricator.wikimedia.org/T222689 (10hashar) [09:32:55] 10Continuous-Integration-Config, 10Release-Engineering-Team (Kanban), 10Zuul, 10Patch-For-Review, 10Upstream: 'recheck' on a CR+2 patch should trigger gate-and-submit, not test - https://phabricator.wikimedia.org/T105474 (10hashar) [10:11:16] 10Continuous-Integration-Infrastructure, 10Zuul, 10Operations, 10Wikimedia-Incident: Upload zuul_2.5.1-wmf8 to apt.wikimedia.org - https://phabricator.wikimedia.org/T222689 (10jbond) this has been uploaded let me know if there are any issues [10:11:31] 10Continuous-Integration-Infrastructure, 10Zuul, 10Operations, 10Wikimedia-Incident: Upload zuul_2.5.1-wmf8 to apt.wikimedia.org - https://phabricator.wikimedia.org/T222689 (10jbond) 05Open→03Resolved [10:11:34] 10Continuous-Integration-Config, 10Release-Engineering-Team (Kanban), 10Zuul, 10Patch-For-Review, 10Upstream: 'recheck' on a CR+2 patch should trigger gate-and-submit, not test - https://phabricator.wikimedia.org/T105474 (10jbond) [10:19:16] 10Continuous-Integration-Infrastructure, 10Operations-Software-Development, 10Patch-For-Review, 10cloud-services-team (Kanban): puppet broken on integration WMCS instances due to openstack Debian packages - https://phabricator.wikimedia.org/T218559 (10hashar) Cleanup 
complete, Zuul in production now uses D... [10:57:47] !log Upgraded Zuul to 2.5.1-wmf8 # T105474 T140297 [10:57:50] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [10:57:50] T105474: 'recheck' on a CR+2 patch should trigger gate-and-submit, not test - https://phabricator.wikimedia.org/T105474 [11:07:30] (03CR) 10Hashar: "recheck" [integration/config] - 10https://gerrit.wikimedia.org/r/508514 (owner: 10Hashar) [11:07:36] (03CR) 10jerkins-bot: [V: 04-1] Allow passing args to flake8 tox env [integration/config] - 10https://gerrit.wikimedia.org/r/508514 (owner: 10Hashar) [11:16:34] (03CR) 10Hashar: "recheck" [integration/config] - 10https://gerrit.wikimedia.org/r/508514 (owner: 10Hashar) [11:18:07] PROBLEM - Citoid on deployment-sca02 is CRITICAL: connect to address 172.16.5.112 and port 1970: Connection refused [11:21:05] RECOVERY - Mathoid on deployment-mathoid is OK: HTTP OK: HTTP/1.1 200 OK - 925 bytes in 0.027 second response time [11:27:05] PROBLEM - Mathoid on deployment-mathoid is CRITICAL: connect to address 172.16.5.73 and port 10042: Connection refused [11:48:07] RECOVERY - Citoid on deployment-sca02 is OK: HTTP OK: HTTP/1.1 200 OK - 921 bytes in 0.029 second response time [12:10:32] (03CR) 10Alexandros Kosiaris: [C: 03+1] grant access to Javamelody Monitoring for ldap/ops [All-Projects] (refs/meta/config) - 10https://gerrit.wikimedia.org/r/508068 (owner: 10Paladox) [12:12:46] (03CR) 10Hashar: [V: 03+2 C: 03+2] grant access to Javamelody Monitoring for ldap/ops [All-Projects] (refs/meta/config) - 10https://gerrit.wikimedia.org/r/508068 (owner: 10Paladox) [12:31:05] (03PS2) 10Hashar: Allow passing args to flake8 tox env [integration/config] - 10https://gerrit.wikimedia.org/r/508514 [12:32:39] (03CR) 10Hashar: [C: 03+2] Allow passing args to flake8 tox env [integration/config] - 10https://gerrit.wikimedia.org/r/508514 (owner: 10Hashar) [12:34:34] (03Merged) 10jenkins-bot: Allow passing args to flake8 tox env 
[integration/config] - 10https://gerrit.wikimedia.org/r/508514 (owner: 10Hashar) [12:48:24] 10Release-Engineering-Team (Kanban), 10User-greg: Improve the effectiveness of #releng related workboards/process - https://phabricator.wikimedia.org/T222496 (10Aklapper) I'm curious, so if you could lay out best practices in public that might be something for https://www.mediawiki.org/wiki/Phabricator/Project... [12:51:46] 10Project-Admins: Setup Phabricator tag for all Wikidata Documentation tasks - https://phabricator.wikimedia.org/T222711 (10NavinoEvans) [12:53:08] 10Project-Admins: Project for Wikimedia Finland - https://phabricator.wikimedia.org/T222442 (10Aklapper) Requested public project #WMFI has been created: https://phabricator.wikimedia.org/project/profile/4034/ Please encourage interested people to visit the project and to join the project as {icon users} member... [13:11:31] 10Project-Admins: Setup Phabricator tag for all Wikidata Documentation tasks - https://phabricator.wikimedia.org/T222711 (10Aklapper) This already exists by searching for open tasks which are both in #documentation and #wikidata? https://phabricator.wikimedia.org/maniphest/query/XlQA_7JitF26/#R Of course this w... [13:37:17] 10Gerrit: Unable to login to gerrit with my credential due to duplication entry - https://phabricator.wikimedia.org/T222715 (10Jitrixis) [13:52:50] 10Project-Admins: Setup Phabricator tag for all Wikidata Documentation tasks - https://phabricator.wikimedia.org/T222711 (10NavinoEvans) Thanks for that suggestion @Aklapper - The dedicated workboard is actually quite critical for us to organise it properly. The columns I had in mind were: * Tours (new tours or... [14:44:30] 10Gerrit, 10Release-Engineering-Team (Watching / External), 10Operations: Add prometheus exporter to Gerrit - https://phabricator.wikimedia.org/T184086 (10crusnov) >>! In T184086#5162119, @Paladox wrote: > @crusnov we could use your help, yup. 
We need to create a prometheusBearerToken [plugin.javamelody.prom... [15:02:38] 10Release-Engineering-Team (Kanban), 10Release, 10Train Deployments: 1.34.0-wmf.4 deployment blockers - https://phabricator.wikimedia.org/T220729 (10alaa_wmde) [15:33:03] (03CR) 10Krinkle: Clear MessageBlobStore after syncing i18n data (031 comment) [tools/scap] - 10https://gerrit.wikimedia.org/r/508488 (https://phabricator.wikimedia.org/T222539) (owner: 10Catrope) [15:39:35] 10Beta-Cluster-Infrastructure: Migrate away from Debian Jessie to Debian Stretch - https://phabricator.wikimedia.org/T218729 (10Krenair) [15:49:09] RECOVERY - Mathoid on deployment-mathoid is OK: HTTP OK: HTTP/1.1 200 OK - 925 bytes in 0.032 second response time [15:53:57] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Testing, 10MW-1.34-notes (1.34.0-wmf.4; 2019-05-07), 10Patch-For-Review: Stop using jsonlint (as it's abandonware) and instead use eslint-plugin-json for the linting - https://phabricator.wikimedia.org/T220036 (10Jdforrester-WMF) [15:54:10] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Testing, 10MW-1.34-notes (1.34.0-wmf.4; 2019-05-07), 10Patch-For-Review: Stop using jsonlint (as it's abandonware) and instead use eslint-plugin-json for the linting - https://phabricator.wikimedia.org/T220036 (10Jdforrester-WMF) [16:17:10] 10Gerrit, 10Repository-Admins, 10Shape Expressions, 10Wikidata, and 2 others: rename repository for WikibaseSchema - https://phabricator.wikimedia.org/T221946 (10Ladsgroup) The integration patch has been merged by the old repo still exist in beta cluster instead. I'm trying to find out where config for tha... [16:18:07] i'm going to pretend you are all experts on mw-config/InitialiseSettings.php :) If i have something like ['default' => 1, 'group0' => 2, 'private' => 1], is it by pure chance that private wikis in group0 have the value of 1, or is that a guarantee? 
[16:18:23] i wrote a test case that asserts it is set as expected, but not sure i trust that that is a guarantee.. [16:19:04] eventually it will be ['default'=>2, 'private'=>1], but currently rolling out slowly [16:28:10] well 'private' is the first thing added to wikiTags if it's applicable [16:28:44] I would assume that means it'd take precedence over all defaults and other groups and stuff, just not individual DB names [16:28:52] but I could be wrong [16:29:16] if it could have security implications for private wikis, request security review [16:30:08] the whole idea of a private wiki being in group0 sounds questionable to me. [16:42:37] 10Gerrit, 10Shape Expressions, 10Wikidata, 10User-Ladsgroup, 10Wikidata-Campsite (Wikidata-Campsite-Iteration-∞): Replace WikibaseSchema repository content with message pointing to EntitySchema - https://phabricator.wikimedia.org/T222192 (10Ladsgroup) a:03Ladsgroup [16:50:05] 10Gerrit, 10Shape Expressions, 10Wikidata, 10User-Ladsgroup, 10Wikidata-Campsite (Wikidata-Campsite-Iteration-∞): Replace WikibaseSchema repository content with message pointing to EntitySchema - https://phabricator.wikimedia.org/T222192 (10Ladsgroup) ` To ssh://gerrit.wikimedia.org:29418/mediawiki/exten... [16:55:04] 10Gerrit, 10LDAP: Gerrit login failure for user tk-999 - https://phabricator.wikimedia.org/T222186 (10thcipriani) 05Open→03Resolved a:03thcipriani Updated your account in the DB, please reopen if that does not resolve your issue. [16:55:27] 10Gerrit, 10LDAP: Gerrit: Cannot assign user name "vladi2016" to account XXXX; name already in use. - https://phabricator.wikimedia.org/T220867 (10thcipriani) 05Open→03Resolved a:03thcipriani Updated your account in the DB, please reopen if that does not resolve your issue. 
[16:55:32] 10Gerrit: Gerrit: cannot assign username "aldnonymous" to account XXX; name already in use - https://phabricator.wikimedia.org/T221440 (10thcipriani) 05Open→03Resolved a:03thcipriani Updated your account in the DB, please reopen if that does not resolve your issue. [17:15:13] 10Gerrit, 10Shape Expressions, 10Wikidata, 10User-Ladsgroup, 10Wikidata-Campsite (Wikidata-Campsite-Iteration-∞): Replace WikibaseSchema repository content with message pointing to EntitySchema - https://phabricator.wikimedia.org/T222192 (10WMDE-leszek) repo is read only, so behaves as expected :) Made i... [17:25:08] 10Gerrit, 10Shape Expressions, 10Wikidata, 10Patch-For-Review, and 2 others: Replace WikibaseSchema repository content with message pointing to EntitySchema - https://phabricator.wikimedia.org/T222192 (10Ladsgroup) >>! In T222192#5165175, @WMDE-leszek wrote: > repo is read only, so behaves as expected :) >... [17:26:05] PROBLEM - Mathoid on deployment-mathoid is CRITICAL: connect to address 172.16.5.73 and port 10042: Connection refused [17:39:23] (03PS1) 10Ladsgroup: Rename JADE to Jade [tools/release] - 10https://gerrit.wikimedia.org/r/508643 (https://phabricator.wikimedia.org/T212182) [17:54:03] (03PS1) 10Dduvall: doc: Generate documentation with groovydoc [integration/pipelinelib] - 10https://gerrit.wikimedia.org/r/508656 (https://phabricator.wikimedia.org/T222199) [17:55:47] (03CR) 10PipelineBot: "pipeline-dashboard: service-pipeline-test" [integration/pipelinelib] - 10https://gerrit.wikimedia.org/r/508656 (https://phabricator.wikimedia.org/T222199) (owner: 10Dduvall) [17:56:00] 10Gerrit, 10LDAP: Gerrit: Cannot assign user name "vladi2016" to account XXXX; name already in use. - https://phabricator.wikimedia.org/T220867 (10TK-999) Thank you! It's working fine now :) [17:56:32] thcipriani ^ :) !!! 
[17:56:36] (03PS9) 10Dduvall: doc: Publish documentation for pipelinelib [integration/config] - 10https://gerrit.wikimedia.org/r/507871 (https://phabricator.wikimedia.org/T222199) [18:05:13] heya releng folks, i think i may have either a patch error or a CI error and i think the latter? https://gerrit.wikimedia.org/r/c/operations/dns/+/508650/ [18:05:25] https://integration.wikimedia.org/ci/job/operations-dns-lint-docker/600/console [18:05:32] 17:51:08 ERROR: InvocationError: '/srv/workspace/dnslint/.tox/py35-tests/bin/python /srv/workspace/dnslint/utils/deploy-check.py' [18:05:37] robh 18:51:07 E102|TOO_MANY_MGMT_NAMES: Found 3 name(s) for PTR '91.3.65.10.in-addr.arpa.', expected 2 (hostname, wmfNNNN): [18:05:45] paladox: those checks dont cause failures though [18:05:50] oh [18:05:57] 17:51:08 RESULT: 4 Errors, 2180 Warnings, 0 Ignored violations, 0 Ignored lines [18:06:04] i mean, it throws those [18:06:10] but we dont fail the build was my understanding [18:06:14] the later errors are what i dont get [18:06:24] ah [18:06:24] 18:51:08 ERROR: InterpreterNotFound: python3.7 [18:06:36] "ERROR: InvocationError: '/srv/workspace/dnslint/.tox/py35-tests/bin/python /srv/workspace/dnslint/utils/deploy-check.py'" onwards [18:06:37] and 18:51:08 ERROR: InterpreterNotFound: python3.6 [18:06:46] due to python mismatches i suppose? [18:07:00] It seemed like soething I should ask about ;D [18:07:23] (this may be an SRE issue not a RelEng issue, but my first instinct is to ask releng about this stuff ;) [18:08:13] PROBLEM - Work requests waiting in Zuul Gearman server on contint1001 is CRITICAL: CRITICAL: 35.71% of data above the critical threshold [140.0] https://grafana.wikimedia.org/dashboard/db/zuul-gearman?panelId=10&fullscreen&orgId=1 [18:08:37] oh wait, this may be a linter failu not a CI fail and i parsed it wrong [18:11:42] yeah, nm, this is a failure on patchset, disregard me sorry. 
[18:14:55] RECOVERY - Work requests waiting in Zuul Gearman server on contint1001 is OK: OK: Less than 30.00% above the threshold [90.0] https://grafana.wikimedia.org/dashboard/db/zuul-gearman?panelId=10&fullscreen&orgId=1 [18:32:56] robh: happens all the time ("oh no! CI is broken!"..... "nvm, semicolon") [18:33:04] hehe [18:33:09] guilty. [18:36:06] RECOVERY - Mathoid on deployment-mathoid is OK: HTTP OK: HTTP/1.1 200 OK - 925 bytes in 0.024 second response time [18:42:05] PROBLEM - Mathoid on deployment-mathoid is CRITICAL: connect to address 172.16.5.73 and port 10042: Connection refused [18:47:05] RECOVERY - Mathoid on deployment-mathoid is OK: HTTP OK: HTTP/1.1 200 OK - 925 bytes in 0.024 second response time [18:58:05] PROBLEM - Mathoid on deployment-mathoid is CRITICAL: connect to address 172.16.5.73 and port 10042: Connection refused [19:34:43] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team, 10Discovery-Search: quibble-vendor-mysql-hhvm-docker for WikibaseCirrusSearch takes over 40 minutes - https://phabricator.wikimedia.org/T222757 (10Smalyshev) [19:39:54] (03CR) 10Jforrester: "Will this work? Don't we need to add the new name, wait for the cut-over after a fortnight of sync updates, and then remove the old one, t" [tools/release] - 10https://gerrit.wikimedia.org/r/508643 (https://phabricator.wikimedia.org/T212182) (owner: 10Ladsgroup) [19:47:26] 10Continuous-Integration-Config, 10Release-Engineering-Team (Backlog), 10JavaScript, 10Patch-For-Review: Switch quibble-based CI jobs from node6 to node10 - https://phabricator.wikimedia.org/T222406 (10Jdforrester-WMF) [19:48:49] Krinkle: For T222406 did you mean that https://phabricator.wikimedia.org/transactions/detail/PHID-XACT-TASK-cjxwqlun5d3w3ph/ is sufficient? 
[19:48:50] T222406: Switch quibble-based CI jobs from node6 to node10 - https://phabricator.wikimedia.org/T222406 [19:50:15] James_F: LGTM, there's numerous ways to disable it, but that seems fine indeed [19:51:20] Thanks for the help. :-) [20:13:03] RECOVERY - Mathoid on deployment-mathoid is OK: HTTP OK: HTTP/1.1 200 OK - 925 bytes in 0.024 second response time [20:15:03] (03CR) 10Ladsgroup: "> Patch Set 1:" [tools/release] - 10https://gerrit.wikimedia.org/r/508643 (https://phabricator.wikimedia.org/T212182) (owner: 10Ladsgroup) [20:18:41] RECOVERY - Puppet staleness on deployment-logstash2 is OK: OK: Less than 1.00% above the threshold [3600.0] [20:19:06] PROBLEM - Mathoid on deployment-mathoid is CRITICAL: connect to address 172.16.5.73 and port 10042: Connection refused [20:31:26] (03CR) 10Dduvall: [C: 03+2] doc: Generate documentation with groovydoc [integration/pipelinelib] - 10https://gerrit.wikimedia.org/r/508656 (https://phabricator.wikimedia.org/T222199) (owner: 10Dduvall) [20:32:41] (03CR) 10PipelineBot: "pipeline-dashboard: service-pipeline-test" [integration/pipelinelib] - 10https://gerrit.wikimedia.org/r/508656 (https://phabricator.wikimedia.org/T222199) (owner: 10Dduvall) [20:32:43] (03Merged) 10jenkins-bot: doc: Generate documentation with groovydoc [integration/pipelinelib] - 10https://gerrit.wikimedia.org/r/508656 (https://phabricator.wikimedia.org/T222199) (owner: 10Dduvall) [20:38:31] (03CR) 10Thcipriani: [C: 03+1] "lgtm" (032 comments) [tools/scap] - 10https://gerrit.wikimedia.org/r/508488 (https://phabricator.wikimedia.org/T222539) (owner: 10Catrope) [20:47:56] (03PS1) 10Hashar: debian: depend on git [integration/zuul] (debian/jessie-wikimedia) - 10https://gerrit.wikimedia.org/r/508709 [20:47:58] (03PS1) 10Hashar: debian: remove override_dh_auto_test we dont use it [integration/zuul] (debian/jessie-wikimedia) - 10https://gerrit.wikimedia.org/r/508710 [20:48:00] (03PS1) 10Hashar: debian: remove openstack-pkg-tools [integration/zuul] 
(debian/jessie-wikimedia) - 10https://gerrit.wikimedia.org/r/508711 [20:48:03] (03PS1) 10Hashar: debian: add back test dependencies [integration/zuul] (debian/jessie-wikimedia) - 10https://gerrit.wikimedia.org/r/508712 [20:48:05] (03PS1) 10Hashar: debian: have testr use the virtualenv [integration/zuul] (debian/jessie-wikimedia) - 10https://gerrit.wikimedia.org/r/508713 [20:52:37] (03PS1) 10Hashar: integration/zuul: do run tests! [integration/config] - 10https://gerrit.wikimedia.org/r/508715 [20:56:37] 10Gerrit, 10Release-Engineering-Team (Backlog), 10Patch-For-Review: Upgrade to Gerrit 2.16.7 - https://phabricator.wikimedia.org/T200739 (10Paladox) [20:57:43] 10Gerrit, 10Release-Engineering-Team (Kanban), 10Patch-For-Review: Upgrade Gerrit to 2.15.12 - https://phabricator.wikimedia.org/T218515 (10Paladox) 05Open→03Resolved We have been on gerrit 2.15.12 for a couple of weeks now, closing as resolved! :) [21:00:20] 10Release-Engineering-Team, 10MediaWiki-Core-Testing, 10Patch-For-Review, 10Wikimedia-production-error (Shared Build Failure), 10phan: phan 1.2.6 is OOMing on MediaWiki core - https://phabricator.wikimedia.org/T219114 (10hashar) 05Resolved→03Open I am opening this one again since it also affects exte... [21:03:12] (03CR) 10Hashar: [C: 04-1] "might be better to just create a "nocheck" variant of the job which would be voting then add debian-glue-non-voting which would fail until" [integration/config] - 10https://gerrit.wikimedia.org/r/508715 (owner: 10Hashar) [21:23:28] 10Deployments, 10MediaWiki-Internationalization, 10Performance-Team (Radar): Use static php array files for l10n cache instead of CDB - https://phabricator.wikimedia.org/T99740 (10Krinkle) Deployment plan, as recycled from 2015: * Enable `array` format on testwiki on Beta Cluster. (We used test2wiki in 2015... 
[21:23:39] 10Gerrit, 10Release-Engineering-Team (Backlog), 10Patch-For-Review: Upgrade to Gerrit 2.16.7 - https://phabricator.wikimedia.org/T200739 (10hashar) {T211139} I don't think it is blocker / subtask? Seems it should instead by blocked on the upgrade of Gerrit 2.16 upgrade. Zuul has been upgraded and should wo... [21:25:58] 10Gerrit: Deploy gerrit master (then 2.16) to gerrit.git.wmflabs.org - https://phabricator.wikimedia.org/T205346 (10Paladox) [21:26:02] 10Gerrit, 10Release-Engineering-Team (Backlog), 10Patch-For-Review: Upgrade to Gerrit 2.16.7 - https://phabricator.wikimedia.org/T200739 (10Paladox) [21:26:08] 10Gerrit, 10Operations, 10serviceops, 10Patch-For-Review: Convert Gerrit to use H2 as the database - https://phabricator.wikimedia.org/T211139 (10Paladox) [21:26:12] 10Gerrit, 10Release-Engineering-Team (Backlog), 10Patch-For-Review: Upgrade to Gerrit 2.16.7 - https://phabricator.wikimedia.org/T200739 (10Paladox) [21:27:35] (03PS1) 10Hashar: jjb: remove explict defaults: global [integration/config] - 10https://gerrit.wikimedia.org/r/508721 [21:28:10] 10Release-Engineering-Team, 10MediaWiki-Core-Testing, 10Patch-For-Review, 10Wikimedia-production-error (Shared Build Failure), 10phan: phan 1.2.6 is OOMing on MediaWiki core - https://phabricator.wikimedia.org/T219114 (10Daimona) @hashar For what concerns the progress bar, Lego's fix in phan was included... [21:28:32] (03CR) 10Hashar: "integration-config-jjb-diff-docker should show no difference in the jobs :]" [integration/config] - 10https://gerrit.wikimedia.org/r/508721 (owner: 10Hashar) [21:30:42] 10Gerrit, 10Release-Engineering-Team (Backlog), 10Patch-For-Review: Upgrade to Gerrit 2.16.7 - https://phabricator.wikimedia.org/T200739 (10Dzahn) >>! In T200739#5165755, @hashar wrote: > I think the last blockers are: No there are more blockers, at least T218844 and https://gerrit.wikimedia.org/r/c/operati... 
[21:39:41] 10Gerrit, 10Release-Engineering-Team (Backlog), 10Patch-For-Review: Upgrade to Gerrit 2.16.7 - https://phabricator.wikimedia.org/T200739 (10Paladox) @hashar +1 to after the hackathon, i've announced that PolyGerrit is becoming the default UI here https://lists.wikimedia.org/pipermail/wikitech-l/2019-May/0920... [22:07:00] (03PS10) 10Dduvall: doc: Publish documentation for pipelinelib [integration/config] - 10https://gerrit.wikimedia.org/r/507871 (https://phabricator.wikimedia.org/T222199) [22:09:03] (03CR) 10Dduvall: [C: 03+1] "Latest patchset fixes missing git clone and setting of working directory to /src during the gradle container run. I've created the `integr" [integration/config] - 10https://gerrit.wikimedia.org/r/507871 (https://phabricator.wikimedia.org/T222199) (owner: 10Dduvall) [22:09:58] thcipriani: ^ doc generation for pipelinelib tested successfully. could use a review/merge if you have time [22:12:24] marxarelli: very nice [22:12:31] * thcipriani reads [22:13:19] oh and https://gerrit.wikimedia.org/r/c/integration/docroot/+/507873 :) [22:14:03] 10Release-Engineering-Team, 10MediaWiki-Core-Testing, 10Patch-For-Review, 10Wikimedia-production-error (Shared Build Failure), 10phan: phan 1.2.6 is OOMing on MediaWiki core - https://phabricator.wikimedia.org/T219114 (10Jdforrester-WMF) >>! In T219114#5165773, @Daimona wrote: > @hashar For what concerns... [22:14:56] marxarelli: did you deploy the job? [22:15:10] i did, to test it [22:18:04] nice https://doc.wikimedia.org/pipelinelib/ [22:18:39] yeah [22:18:50] quite lackluster at the moment [22:19:04] but there nonetheless! [22:19:30] extant! the first step to luster. 
[22:19:33] (CR) Thcipriani: [C: +2] doc: Publish documentation for pipelinelib [integration/config] - https://gerrit.wikimedia.org/r/507871 (https://phabricator.wikimedia.org/T222199) (owner: Dduvall)
[22:19:42] i'm hoping i can get the PipelineStage docs to show up somewhere on the default page
[22:19:48] haha
[22:19:53] Need to… hah, thcipriani got there firsst.
[22:20:12] not sure how to deploy the docroot patch though :\
[22:20:39] does that have to be manually submitted?
[22:20:51] Also https://gerrit.wikimedia.org/r/c/integration/docroot/+/507873
[22:20:52] looks like it, considering it got a +2 and is still waiting :)
[22:21:07] I think int-config needs a magic push into production too.
[22:21:36] needs a zuul reload
[22:21:39] yep, I've got the magic for that one
[22:21:42] (Merged) jenkins-bot: doc: Publish documentation for pipelinelib [integration/config] - https://gerrit.wikimedia.org/r/507871 (https://phabricator.wikimedia.org/T222199) (owner: Dduvall)
[22:21:45] Cool.
[22:23:04] !log reloading zuul to deploy https://gerrit.wikimedia.org/r/507871
[22:23:06] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[22:23:58] thcipriani: thanks!
[22:24:45] seem promising https://integration.wikimedia.org/ci/job/integration-docroot-deploy/
[22:25:25] yep, that's a postmerge job
[22:26:25] hrm...so it seems like this repo has gate and submit jobs
[22:26:41] (CR) Thcipriani: [C: +2] doc: Link to pipelinelib documentation [integration/docroot] - https://gerrit.wikimedia.org/r/507873 (https://phabricator.wikimedia.org/T222199) (owner: Dduvall)
[22:27:07] that kicked off some jobs in zuul it seems
[22:27:15] oh interesting. maybe it's a separate whitelist
[22:27:22] (Merged) jenkins-bot: doc: Link to pipelinelib documentation [integration/docroot] - https://gerrit.wikimedia.org/r/507873 (https://phabricator.wikimedia.org/T222199) (owner: Dduvall)
[22:27:26] or maybe something was broken last time :)
[22:27:36] (CR) jenkins-bot: doc: Link to pipelinelib documentation [integration/docroot] - https://gerrit.wikimedia.org/r/507873 (https://phabricator.wikimedia.org/T222199) (owner: Dduvall)
[22:27:42] whaaaaa
[22:27:44] never!
[22:29:44] hrm...output message tells me this needs a manual pull
[22:33:40] > error: unable to unlink old 'org/wikimedia/doc/default.html': Permission denied
[22:33:46] so that's fun
[22:33:49] :(
[22:34:12] I don't really get why, file is rw by wikidev...I'm in that group...
[22:37:21] marxarelli: edited the file...and that worked..now the git repo is clean.../me files task
[22:37:58] ah ok. that's very strange
[22:38:02] thanks for the merge/deploy!
[22:38:40] oh, or did it not fully deploy on account of the botched git pull?
[22:41:31] Continuous-Integration-Infrastructure: integration/docroot error: unable to unlink old 'org/wikimedia/doc/default.html': Permission denied - https://phabricator.wikimedia.org/T222767 (thcipriani)
[22:42:32] Continuous-Integration-Infrastructure: integration/docroot error: unable to unlink old 'org/wikimedia/doc/default.html': Permission denied - https://phabricator.wikimedia.org/T222767 (thcipriani) I did update the file by manually replacing it with the version in master.
[22:42:42] marxarelli: it should be deployed. I see it
[22:43:10] pipelinelib link on https://doc.wikimedia.org/
[22:43:30] grr. i don't see it
[22:43:44] maybe i'm hitting a different varnish?
[22:48:47] https://doc.wikimedia.org/?cache-busted=1 ?
[22:49:21] thcipriani: oh werd. thar it is
[22:49:24] http://tyler.zone/pipelinelib-is-glorious.png
[22:49:37] :D
[22:49:54] is it glorious though?
[22:50:38] it inspires only an "it'll do" in me, mostly because of its groovy-ness
[22:51:09] I refer you to my "extant" comment
[22:51:26] extant is a marker on the path to glory.
[22:51:54] Release-Engineering-Team (Kanban), MediaWiki-Core-Testing, MW-1.34-notes (1.34.0-wmf.4; 2019-05-07), Patch-For-Review: Stop using jsonlint (as it's abandonware) and instead use eslint-plugin-json for the linting - https://phabricator.wikimedia.org/T220036 (Jdforrester-WMF)
[22:52:27] Pipelinius Maximus!
[23:10:25] Release-Engineering-Team (Backlog), ORES, Operations, Release Pipeline, and 2 others: Execution of the deployment pipeline should be configurable via .pipeline/config.yaml - https://phabricator.wikimedia.org/T210267 (dduvall)
[23:10:29] Continuous-Integration-Config, Release-Engineering-Team (Kanban), Release Pipeline, Patch-For-Review: Post generated docs for pipelinelib - https://phabricator.wikimedia.org/T222199 (dduvall) Open→Resolved API documentation for pipelinelib is now available at https://doc.wikimedia.org/pip...
[23:17:11] i need a Firefox extension that simply does the exact same thing that happens if you click "save page as" in Firefox.. except that i need to feed it a list of a LOT of URLs to do that with. you would think that's easy to find .. but actually not.. everything is either outdated or doesn't do the same thing, or doesn't save it with working CSS/images / can't login and so on
[23:17:25] if i just do the manual "save as" that does everything as i need it
[23:17:42] i also already tried wget/curl/httrack and a bunch of extensions..nothing is the same
[23:19:04] RECOVERY - Mathoid on deployment-mathoid is OK: HTTP OK: HTTP/1.1 200 OK - 925 bytes in 0.040 second response time
[23:23:03] mutante, might be easiest to script interaction with the FF UI :)
[23:24:08] Krenair: yea, i alrady considered automating the mouse .. i would need it to get URL from text file, copy/paste and click the button
[23:24:11] mutante: http or https ?
[23:24:35] Platonides: https (http redirects)
[23:24:57] ... somehow I suspect anything capable of doing what mutante wants is going to be able to handle some TLS
[23:25:06] PROBLEM - Mathoid on deployment-mathoid is CRITICAL: connect to address 172.16.5.73 and port 10042: Connection refused
[23:26:59] the site is behind a login builds the HTML from multiple parts using js or whatnot.. so even if i handle the login with curl/wget i didn't get what i wanted
[23:27:40] the browser does it right and makes it look the same when offline
[23:27:59] saves the additional files in a subdir etc
[23:30:07] RECOVERY - Mathoid on deployment-mathoid is OK: HTTP OK: HTTP/1.1 200 OK - 925 bytes in 0.025 second response time
[23:34:08] hmm, puppeteer doesn't seem to have a method to call the save as dialog
[23:35:06] Platonides, can you simulate the keyboard shortcut or click the buttons on screen?
[23:36:07] PROBLEM - Mathoid on deployment-mathoid is CRITICAL: connect to address 172.16.5.73 and port 10042: Connection refused
[23:36:37] not sure
[23:36:50] here you would want to do that on the browser chrome
[23:37:02] so sending keypress events probably wouldn't do
[23:37:39] it should be easy to save pages by sending keypresses
[23:37:50] Ctrl +S followed by Enter should do it
[23:38:01] the problematic piece will be to wait the right amount of time between them
[23:39:16] hmm.*nod*
[23:46:53] PROBLEM - Free space - all mounts on deployment-fluorine02 is CRITICAL: CRITICAL: deployment-prep.deployment-fluorine02.diskspace._srv.byte_percentfree (<20.00%)