[00:09:40] 10Beta-Cluster-Infrastructure, 10Collaboration-Team-Triage, 10Maps, 10Collaboration-Feature-Rollouts (Collaboration-Maps): [betalabs] Uncaught TypeError: Cannot read property 'lat' of null - when 'Edit layer' option is used. - https://phabricator.wikimedia.org/T194440#4244740 (10Etonkovidova) [00:10:48] 10Beta-Cluster-Infrastructure, 10Collaboration-Team-Triage, 10Discovery, 10Maps, 10Collaboration-Feature-Rollouts (Collaboration-Maps): [betalabs] Uncaught TypeError: Cannot read property 'lat' of null - when 'Edit layer' option is used. - https://phabricator.wikimedia.org/T194440#4244743 (10Etonkovidova... [00:13:11] well, the scap deploy --init errors on deployment-deploy-01 that exist now seem like problems with the repos themselves and not with running scap deploy --init [00:13:34] the problem (aside from the lock file) is that we got rid of the trebuchet group in ldap [00:13:59] and! we added scap --init to scap_source, so this patch https://gerrit.wikimedia.org/r/#/c/361796/ was out-of-date [00:15:25] * thcipriani follows up on task [00:22:23] (03PS3) 10Krinkle: Add Hagar Shilo to the CI whitelist [integration/config] - 10https://gerrit.wikimedia.org/r/434722 (owner: 10Catrope) [00:22:26] (03CR) 10Krinkle: [C: 032] Add Hagar Shilo to the CI whitelist [integration/config] - 10https://gerrit.wikimedia.org/r/434722 (owner: 10Catrope) [00:22:40] Krenair there's no shinken package for stretch [00:22:41] 10Beta-Cluster-Infrastructure, 10Patch-For-Review, 10Puppet: Puppet broken on deployment-mx due to systemd on trusty - https://phabricator.wikimedia.org/T184244#4244761 (10Krenair) a:03Krenair [00:23:03] https://packages.debian.org/jessie/shinken [00:24:25] (03Merged) 10jenkins-bot: Add Hagar Shilo to the CI whitelist [integration/config] - 10https://gerrit.wikimedia.org/r/434722 (owner: 10Catrope) [00:29:19] (03PS7) 10Krinkle: Added dependency "ExtJSBase" to "BlueSpiceFoundation" [integration/config] - 10https://gerrit.wikimedia.org/r/389412 (owner: 
10Robert Vogel) [00:29:21] (03PS8) 10Krinkle: Added dependency "ExtJSBase" to "BlueSpiceFoundation" [integration/config] - 10https://gerrit.wikimedia.org/r/389412 (owner: 10Robert Vogel) [00:32:02] (03CR) 10Krinkle: [C: 032] Added dependency "ExtJSBase" to "BlueSpiceFoundation" [integration/config] - 10https://gerrit.wikimedia.org/r/389412 (owner: 10Robert Vogel) [00:32:52] !log Reloading Zuul to deploy https://gerrit.wikimedia.org/r/434722 [00:32:55] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [00:33:39] (03Merged) 10jenkins-bot: Added dependency "ExtJSBase" to "BlueSpiceFoundation" [integration/config] - 10https://gerrit.wikimedia.org/r/389412 (owner: 10Robert Vogel) [00:40:46] !log Apply 'role::webperf' to deployment-webperf01, T195314 [00:40:48] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [00:40:48] T195314: Set up webperf node in Beta Cluster - https://phabricator.wikimedia.org/T195314 [00:41:06] !log Reloading Zuul to deploy https://gerrit.wikimedia.org/r/389412 [00:41:08] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [00:45:05] 10Beta-Cluster-Infrastructure, 10Performance-Team: Set up webperf node in Beta Cluster - https://phabricator.wikimedia.org/T195314#4244790 (10Krinkle) a:03Krinkle [00:45:29] 10Beta-Cluster-Infrastructure, 10Performance-Team: Set up webperf node in Beta Cluster - https://phabricator.wikimedia.org/T195314#4222943 (10Krinkle) p:05Triage>03Normal [00:45:32] 10Beta-Cluster-Infrastructure, 10Release-Engineering-Team, 10Operations, 10Patch-For-Review: Upgrade deployment-prep deployment servers to stretch - https://phabricator.wikimedia.org/T192561#4244792 (10thcipriani) >>! In T192561#4183530, @thcipriani wrote: > Broken stuff > ========= > 3. iegreview has inv... 
[00:50:01] PROBLEM - Puppet errors on deployment-webperf01 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [00:51:44] RECOVERY - Puppet errors on deployment-deploy-01 is OK: OK: Less than 1.00% above the threshold [0.0] [01:19:32] !log Add web proxy in Horizon for 'performance-beta', mapping to deployment-webperf01 port 80 - T195314 [01:19:34] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [01:19:35] T195314: Set up webperf node in Beta Cluster - https://phabricator.wikimedia.org/T195314 [01:19:59] 10Beta-Cluster-Infrastructure, 10Performance-Team, 10Patch-For-Review: Set up webperf node in Beta Cluster - https://phabricator.wikimedia.org/T195314#4244824 (10Krinkle) The web server is now reachable at [01:28:39] 10Beta-Cluster-Infrastructure, 10Performance-Team, 10Patch-For-Review: Set up webperf node in Beta Cluster - https://phabricator.wikimedia.org/T195314#4244842 (10Krinkle) [01:41:21] thcipriani: It seems the scap source for statsv/statsv is missing on beta's tin. I guess this is because of it being missing from scap::sources between the default one in Hiera for deployment_server and the override in deployment-prep/common.yaml. [01:41:28] Do you know why that override exists? [01:41:34] It seems to be a duplicate. [01:43:14] IIRC beta doesn't have the role hierarchy in its hiera config, so the beta puppetmaster can't find that configuration (or that was the case at some point, I haven't looked into it in quite a while) [01:44:32] You mean the role can be applied in beta, but then hiera will not consider the yaml files for that role in puppet? [01:44:41] interesting [01:46:43] https://github.com/wikimedia/puppet/blob/production/modules/puppetmaster/files/labs.hiera.yaml#L16 [01:48:13] yeah, you can apply a role via horizon, but it won't find the hiera data for the role.
I have a vague memory of a phab ticket about this is the way it is, but it was probably from around a few years ago [01:48:34] about *why* this is...etc [01:49:32] ah, that's right, production has the role backend for hiera data and beta doesn't because... [01:50:00] https://github.com/wikimedia/puppet/blob/production/modules/puppetmaster/files/labs.hiera.yaml#L1-L4 vs https://github.com/wikimedia/puppet/blob/production/modules/puppetmaster/files/production.hiera.yaml#L1-L3 [01:51:46] Hm.. I wonder what would happen if we just add it to the labs one as well. [01:52:43] that's a good question. The most relevant ticket I could find in phab was: https://phabricator.wikimedia.org/T127771 [01:53:50] seems like horizon is role aware at this point? [01:54:14] Indeed [01:54:34] It is capable of applying classes, both directly via the instance/puppet config tab, and via prefix matching. [01:54:47] Which is used on deployment-prep to apply role classes as well [01:54:53] including e.g. deployment server for deployment-tin [01:57:31] 10Beta-Cluster-Infrastructure, 10Operations, 10Performance-Team, 10Puppet: Include role/common in beta-cluster hieradata hierarchy - https://phabricator.wikimedia.org/T196034#4244858 (10Krinkle) [02:00:46] 10Beta-Cluster-Infrastructure, 10Performance-Team, 10Patch-For-Review: Set up webperf node in Beta Cluster - https://phabricator.wikimedia.org/T195314#4244876 (10Krinkle) Summary from the initial puppet run(s): ```name=webperf01 syslog Could not retrieve catalog from remote server: Error 500 on SERVER: Serv...
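Editor's note on the hiera exchange above: a minimal sketch of why a key defined only at the role level is invisible when the lookup hierarchy has no role level, and why a hand-maintained copy at another level can drift. This is a toy model, not Puppet or Hiera code; the level names, the `lookup` helper, and the `other/repo` entry are all hypothetical, and only `scap::sources` and `statsv/statsv` come from the log.

```python
# Toy model of a hierarchical key lookup (NOT the real Hiera implementation).
# Level names and data below are illustrative assumptions.

def lookup(key, hierarchy, data):
    """Return the value of `key` from the first level in `hierarchy` that defines it."""
    for level in hierarchy:
        if key in data.get(level, {}):
            return data[level][key]
    return None


data = {
    # Value maintained alongside the role definition.
    "role": {"scap::sources": ["statsv/statsv", "other/repo"]},
    # Hand-maintained duplicate for the project; it has fallen behind.
    "project": {"scap::sources": ["other/repo"]},
}

with_role_level = ["host", "role", "common"]
without_role_level = ["host", "project", "common"]

prod_sources = lookup("scap::sources", with_role_level, data)
beta_sources = lookup("scap::sources", without_role_level, data)

print(prod_sources)
print(beta_sources)
```

In this model the second lookup never consults the role level, so it returns the stale duplicate, which matches the symptom described in the chat (a source missing on beta only).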
[02:16:55] (03PS1) 1020after4: Updated 'blockers' scap plugin [tools/release] - 10https://gerrit.wikimedia.org/r/436441 [02:20:46] 10Release-Engineering-Team (Kanban), 10Patch-For-Review, 10Release, 10Train Deployments: 1.32.0-wmf.6 deployment blockers - https://phabricator.wikimedia.org/T191052#4244898 (10mmodell) [02:20:48] 10Release-Engineering-Team (Kanban), 10Patch-For-Review, 10Release, 10Train Deployments: 1.32.0-wmf.5 deployment blockers - https://phabricator.wikimedia.org/T191051#4244899 (10mmodell) [02:42:31] 10Release-Engineering-Team (Kanban), 10Patch-For-Review, 10Release, 10Train Deployments: 1.32.0-wmf.6 deployment blockers - https://phabricator.wikimedia.org/T191052#4244926 (10Krinkle) [02:43:48] Project beta-scap-eqiad build #209823: 04FAILURE in 0.56 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/209823/ [02:59:00] Yippee, build fixed! [02:59:00] Project beta-scap-eqiad build #209824: 09FIXED in 5 min 15 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/209824/ [04:37:19] 10Beta-Cluster-Infrastructure, 10Operations, 10Performance-Team, 10Patch-For-Review, 10Puppet: Include role/common in beta-cluster hieradata hierarchy - https://phabricator.wikimedia.org/T196034#4244992 (10MoritzMuehlenhoff) p:05Triage>03Normal [05:09:34] PROBLEM - Free space - all mounts on deployment-fluorine02 is CRITICAL: CRITICAL: deployment-prep.deployment-fluorine02.diskspace._srv.byte_percentfree (<40.00%) [06:34:36] PROBLEM - Puppet errors on integration-slave-k8s-1018 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [06:35:39] PROBLEM - Puppet errors on integration-slave-docker-1007 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [06:37:18] PROBLEM - Puppet errors on saucelabs-03 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [06:40:13] PROBLEM - Puppet errors on integration-slave-docker-1005 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] 
[06:40:25] PROBLEM - Puppet errors on integration-slave-docker-1010 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [06:41:18] uh [06:41:38] PROBLEM - Puppet errors on integration-slave-docker-1008 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [06:41:57] !log depooled integration-slave-docker-1010 in jenkins for testing [06:41:58] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [06:42:15] PROBLEM - Puppet errors on saucelabs-01 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [06:47:26] PROBLEM - Puppet errors on integration-slave-docker-1009 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [06:49:01] PROBLEM - Puppet errors on integration-slave-docker-1017 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [06:49:01] PROBLEM - Puppet errors on integration-slave-docker-1014 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [06:49:07] PROBLEM - Puppet errors on integration-slave-jessie-1001 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [06:49:36] PROBLEM - Puppet errors on integration-slave-docker-1021 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [06:49:42] PROBLEM - Puppet errors on integration-slave-jessie-1004 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [06:49:44] PROBLEM - Puppet errors on integration-slave-docker-1003 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [06:50:13] PROBLEM - Puppet errors on saucelabs-02 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [06:50:19] PROBLEM - Puppet errors on integration-slave-docker-1002 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [06:52:24] PROBLEM - Puppet errors on webperformance is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [06:52:48] PROBLEM - Puppet 
errors on integration-slave-docker-1006 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [06:52:53] PROBLEM - Puppet errors on integration-slave-docker-1001 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [06:53:54] PROBLEM - Puppet errors on integration-slave-docker-1011 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [06:54:32] PROBLEM - Puppet errors on castor02 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [06:54:41] 10Continuous-Integration-Config, 10Continuous-Integration-Infrastructure, 10Quibble, 10Patch-For-Review: Only run phpcs, parallel-lint, and other(?) slow, non-variant tasks in PHP7, not also in HHVM - https://phabricator.wikimedia.org/T195984#4245081 (10Legoktm) I tried rolling this out today, and all new... [06:55:42] PROBLEM - Puppet errors on integration-cumin is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [06:56:43] PROBLEM - Puppet errors on integration-r-lang-01 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [06:57:34] PROBLEM - Puppet errors on integration-slave-docker-1016 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [06:57:46] PROBLEM - Puppet errors on integration-slave-jessie-1003 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [06:58:50] PROBLEM - Puppet errors on integration-puppetmaster01 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [06:59:06] PROBLEM - Puppet errors on integration-slave-jessie-1002 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [06:59:17] PROBLEM - Puppet errors on integration-slave-docker-1012 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [06:59:36] RECOVERY - Free space - all mounts on deployment-fluorine02 is OK: OK: All targets OK [07:02:11] PROBLEM - Puppet errors on integration-slave-docker-1013 is CRITICAL: CRITICAL: 44.44% of 
data above the critical threshold [0.0] [07:02:52] PROBLEM - Puppet errors on integration-slave-docker-1020 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [07:02:52] PROBLEM - Puppet errors on jenkinstest is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [07:02:52] PROBLEM - Puppet errors on integration-slave-docker-1004 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [07:03:45] PROBLEM - Puppet errors on integration-slave-docker-1015 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [07:04:37] PROBLEM - Puppet errors on integration-publishing is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [07:15:27] (03PS1) 10Legoktm: Bump Quibble jobs to 0.0.15 (try #2) [integration/config] - 10https://gerrit.wikimedia.org/r/436470 [07:15:28] (03PS1) 10Legoktm: Move "composer-test" to separate job for MediaWiki core (try #2) [integration/config] - 10https://gerrit.wikimedia.org/r/436471 [07:16:21] !log repooled integration-slave-docker-1010 [07:16:23] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [07:16:34] (03CR) 10Legoktm: [C: 032] Bump Quibble jobs to 0.0.15 (try #2) [integration/config] - 10https://gerrit.wikimedia.org/r/436470 (owner: 10Legoktm) [07:16:39] (03CR) 10Legoktm: [C: 032] Move "composer-test" to separate job for MediaWiki core (try #2) [integration/config] - 10https://gerrit.wikimedia.org/r/436471 (owner: 10Legoktm) [07:18:08] (03Merged) 10jenkins-bot: Bump Quibble jobs to 0.0.15 (try #2) [integration/config] - 10https://gerrit.wikimedia.org/r/436470 (owner: 10Legoktm) [07:18:33] (03Merged) 10jenkins-bot: Move "composer-test" to separate job for MediaWiki core (try #2) [integration/config] - 10https://gerrit.wikimedia.org/r/436471 (owner: 10Legoktm) [07:42:03] ok so 0.0.15 deployed well [07:42:09] now for moving it to a separate job [07:42:38] !log deployed https://gerrit.wikimedia.org/r/436471 [07:42:40] 
Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [07:50:26] ffs [07:52:15] (03PS1) 10Legoktm: More robust check for a quibble based job [integration/config] - 10https://gerrit.wikimedia.org/r/436477 [07:52:25] (03CR) 10Legoktm: [C: 032] More robust check for a quibble based job [integration/config] - 10https://gerrit.wikimedia.org/r/436477 (owner: 10Legoktm) [07:53:59] (03Merged) 10jenkins-bot: More robust check for a quibble based job [integration/config] - 10https://gerrit.wikimedia.org/r/436477 (owner: 10Legoktm) [07:54:23] !log deployed https://gerrit.wikimedia.org/r/436477 [07:54:24] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [07:57:26] RECOVERY - Puppet errors on integration-slave-docker-1009 is OK: OK: Less than 1.00% above the threshold [0.0] [07:57:53] RECOVERY - Puppet errors on integration-slave-docker-1001 is OK: OK: Less than 1.00% above the threshold [0.0] [07:58:59] RECOVERY - Puppet errors on integration-slave-docker-1017 is OK: OK: Less than 1.00% above the threshold [0.0] [07:59:03] RECOVERY - Puppet errors on integration-slave-docker-1014 is OK: OK: Less than 1.00% above the threshold [0.0] [07:59:10] RECOVERY - Puppet errors on integration-slave-jessie-1001 is OK: OK: Less than 1.00% above the threshold [0.0] [07:59:36] RECOVERY - Puppet errors on integration-slave-docker-1021 is OK: OK: Less than 1.00% above the threshold [0.0] [07:59:42] RECOVERY - Puppet errors on integration-slave-jessie-1004 is OK: OK: Less than 1.00% above the threshold [0.0] [07:59:42] RECOVERY - Puppet errors on integration-slave-docker-1003 is OK: OK: Less than 1.00% above the threshold [0.0] [08:00:12] RECOVERY - Puppet errors on saucelabs-02 is OK: OK: Less than 1.00% above the threshold [0.0] [08:00:19] RECOVERY - Puppet errors on integration-slave-docker-1002 is OK: OK: Less than 1.00% above the threshold [0.0] [08:02:24] RECOVERY - Puppet errors on webperformance is OK: OK: Less than 1.00% 
above the threshold [0.0] [08:02:44] RECOVERY - Puppet errors on integration-slave-jessie-1003 is OK: OK: Less than 1.00% above the threshold [0.0] [08:02:48] RECOVERY - Puppet errors on integration-slave-docker-1006 is OK: OK: Less than 1.00% above the threshold [0.0] [08:03:51] RECOVERY - Puppet errors on integration-puppetmaster01 is OK: OK: Less than 1.00% above the threshold [0.0] [08:03:56] RECOVERY - Puppet errors on integration-slave-docker-1011 is OK: OK: Less than 1.00% above the threshold [0.0] [08:04:35] RECOVERY - Puppet errors on castor02 is OK: OK: Less than 1.00% above the threshold [0.0] [08:05:41] RECOVERY - Puppet errors on integration-cumin is OK: OK: Less than 1.00% above the threshold [0.0] [08:06:44] RECOVERY - Puppet errors on integration-r-lang-01 is OK: OK: Less than 1.00% above the threshold [0.0] [08:07:09] RECOVERY - Puppet errors on integration-slave-docker-1013 is OK: OK: Less than 1.00% above the threshold [0.0] [08:07:10] (03PS2) 10Legoktm: Run 'mediawiki-core-hhvmlint' job [integration/config] - 10https://gerrit.wikimedia.org/r/436403 [08:07:31] RECOVERY - Puppet errors on integration-slave-docker-1016 is OK: OK: Less than 1.00% above the threshold [0.0] [08:09:09] RECOVERY - Puppet errors on integration-slave-jessie-1002 is OK: OK: Less than 1.00% above the threshold [0.0] [08:09:16] RECOVERY - Puppet errors on integration-slave-docker-1012 is OK: OK: Less than 1.00% above the threshold [0.0] [08:09:17] (03CR) 10jerkins-bot: [V: 04-1] Run 'mediawiki-core-hhvmlint' job [integration/config] - 10https://gerrit.wikimedia.org/r/436403 (owner: 10Legoktm) [08:09:37] RECOVERY - Puppet errors on integration-publishing is OK: OK: Less than 1.00% above the threshold [0.0] [08:10:38] RECOVERY - Puppet errors on integration-slave-docker-1007 is OK: OK: Less than 1.00% above the threshold [0.0] [08:10:45] (03CR) 10Legoktm: "It's just that hhvm startup is much slower... 
https://github.com/JakubOnderka/PHP-Parallel-Lint/issues/47" [integration/config] - 10https://gerrit.wikimedia.org/r/436403 (owner: 10Legoktm) [08:11:39] (03PS3) 10Legoktm: Run 'mediawiki-core-hhvmlint' job [integration/config] - 10https://gerrit.wikimedia.org/r/436403 [08:12:52] RECOVERY - Puppet errors on integration-slave-docker-1020 is OK: OK: Less than 1.00% above the threshold [0.0] [08:12:54] RECOVERY - Puppet errors on jenkinstest is OK: OK: Less than 1.00% above the threshold [0.0] [08:12:54] RECOVERY - Puppet errors on integration-slave-docker-1004 is OK: OK: Less than 1.00% above the threshold [0.0] [08:13:29] (03CR) 10jerkins-bot: [V: 04-1] Run 'mediawiki-core-hhvmlint' job [integration/config] - 10https://gerrit.wikimedia.org/r/436403 (owner: 10Legoktm) [08:13:44] RECOVERY - Puppet errors on integration-slave-docker-1015 is OK: OK: Less than 1.00% above the threshold [0.0] [08:14:35] RECOVERY - Puppet errors on integration-slave-k8s-1018 is OK: OK: Less than 1.00% above the threshold [0.0] [08:15:15] RECOVERY - Puppet errors on integration-slave-docker-1005 is OK: OK: Less than 1.00% above the threshold [0.0] [08:15:23] RECOVERY - Puppet errors on integration-slave-docker-1010 is OK: OK: Less than 1.00% above the threshold [0.0] [08:15:28] (03PS4) 10Legoktm: Run 'mediawiki-core-hhvmlint' job [integration/config] - 10https://gerrit.wikimedia.org/r/436403 [08:17:18] RECOVERY - Puppet errors on saucelabs-01 is OK: OK: Less than 1.00% above the threshold [0.0] [08:17:18] RECOVERY - Puppet errors on saucelabs-03 is OK: OK: Less than 1.00% above the threshold [0.0] [08:19:32] (03CR) 10Legoktm: [C: 032] Run 'mediawiki-core-hhvmlint' job [integration/config] - 10https://gerrit.wikimedia.org/r/436403 (owner: 10Legoktm) [08:21:19] (03Merged) 10jenkins-bot: Run 'mediawiki-core-hhvmlint' job [integration/config] - 10https://gerrit.wikimedia.org/r/436403 (owner: 10Legoktm) [08:21:39] RECOVERY - Puppet errors on integration-slave-docker-1008 is OK: OK: Less than 
1.00% above the threshold [0.0] [08:21:40] !log deployed https://gerrit.wikimedia.org/r/436403 [08:21:42] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [08:24:57] (03PS1) 10Legoktm: HHVM does not support `-v` as shorthand for `--version` [integration/config] - 10https://gerrit.wikimedia.org/r/436480 [08:25:31] sigh [08:25:49] :( [08:26:14] (03CR) 10Hashar: [C: 031] HHVM does not support `-v` as shorthand for `--version` [integration/config] - 10https://gerrit.wikimedia.org/r/436480 (owner: 10Legoktm) [08:29:20] (03CR) 10Legoktm: [C: 032] "INFO:jenkins_jobs.builder:Reconfiguring jenkins job mediawiki-core-hhvmlint" [integration/config] - 10https://gerrit.wikimedia.org/r/436480 (owner: 10Legoktm) [08:32:21] * legoktm crosses fingers [08:32:43] (03Merged) 10jenkins-bot: HHVM does not support `-v` as shorthand for `--version` [integration/config] - 10https://gerrit.wikimedia.org/r/436480 (owner: 10Legoktm) [08:39:25] (03PS1) 10Legoktm: Only run hhvm/php70lint if a PHP file was touched [integration/config] - 10https://gerrit.wikimedia.org/r/436482 [08:40:40] (03PS2) 10Legoktm: Only run hhvm/php70lint if a PHP file was touched [integration/config] - 10https://gerrit.wikimedia.org/r/436482 [08:41:31] (03CR) 10Legoktm: [C: 032] Only run hhvm/php70lint if a PHP file was touched [integration/config] - 10https://gerrit.wikimedia.org/r/436482 (owner: 10Legoktm) [08:42:51] (03Merged) 10jenkins-bot: Only run hhvm/php70lint if a PHP file was touched [integration/config] - 10https://gerrit.wikimedia.org/r/436482 (owner: 10Legoktm) [08:43:06] !log deployed https://gerrit.wikimedia.org/r/436482 [08:43:08] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [08:49:15] 10Scap: Scap required manual 'git update-server-info' on first run - https://phabricator.wikimedia.org/T196046#4245236 (10Volans) [08:50:27] success!!! 
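Editor's note on "Only run hhvm/php70lint if a PHP file was touched": a minimal sketch of the idea, assuming the decision is made from a list of changed file paths. The actual change lives in integration/config and is not shown in this log; the `touches_php` helper and the choice of `.php` as the only relevant extension are assumptions for illustration.

```python
# Hypothetical helper: decide whether a lint job is worth running for a change.
# The extension list is an assumption, not taken from integration/config.

def touches_php(changed_files, extensions=(".php",)):
    """Return True if any changed path ends with one of the given extensions."""
    return any(path.endswith(extensions) for path in changed_files)


print(touches_php(["includes/Foo.php", "README"]))
print(touches_php(["README", "resources/a.js"]))
```

A change that only edits documentation or JavaScript would skip the lint job under this rule, which is the saving the patch title describes.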
[08:50:29] they merged [08:52:44] 10Continuous-Integration-Config, 10Continuous-Integration-Infrastructure, 10Quibble, 10Patch-For-Review: Only run phpcs, parallel-lint, and other(?) slow, non-variant tasks in PHP7, not also in HHVM - https://phabricator.wikimedia.org/T195984#4245253 (10Legoktm) 05Open>03Resolved The Chromium problem w... [09:05:10] 10Scap: Scap required manual 'git update-server-info' on first run - https://phabricator.wikimedia.org/T196046#4245310 (10Volans) [09:32:00] jdlrobson: will be around in a few. I need a coffee break [09:32:13] hashar: np [09:32:15] im joining now [09:32:39] jdlrobson: in like 5 minutes ~ [09:37:20] jdlrobson: back and joining [09:37:52] bah hangouts doesn't work anymore on firefox [09:39:07] it's google MEET now hashar [10:24:38] (03PS2) 10Hashar: git-changed-in-head did not detect renames [integration/jenkins] - 10https://gerrit.wikimedia.org/r/423676 [10:53:27] 10Continuous-Integration-Infrastructure (shipyard), 10Release-Engineering-Team (Kanban), 10releng-201718-q3, 10Epic, 10Patch-For-Review: [EPIC] Migrate Mediawiki jobs from Nodepool to Docker - https://phabricator.wikimedia.org/T183512#4245564 (10hashar) [11:07:10] 10Continuous-Integration-Infrastructure (shipyard), 10MediaWiki-extensions-PropertySuggester, 10MediaWiki-extensions-WikibaseRepository, 10Wikidata: [PropertySuggester] PHP Fatal error: Class 'Wikibase\Repo\Tests\Api\WikibaseApiTestCase' not found in exten... - https://phabricator.wikimedia.org/T196062#4245591 [11:07:21] 10Continuous-Integration-Infrastructure (shipyard), 10MediaWiki-extensions-PropertySuggester, 10MediaWiki-extensions-WikibaseRepository, 10Wikidata: [PropertySuggester] PHP Fatal error: Class 'Wikibase\Repo\Tests\Api\WikibaseApiTestCase' not found in exten... 
- https://phabricator.wikimedia.org/T196062#4245603 [11:07:25] 10Continuous-Integration-Infrastructure (shipyard), 10MediaWiki-extensions-WikibaseRepository, 10Wikidata: Get Wikibase + dependencies to run with Quibble - https://phabricator.wikimedia.org/T196013#4245602 (10hashar) [11:08:02] 10Continuous-Integration-Infrastructure (shipyard), 10MediaWiki-extensions-WikibaseRepository, 10Wikidata: Get Wikibase + dependencies to run with Quibble - https://phabricator.wikimedia.org/T196013#4244189 (10hashar) T196062 is similar. The repo/autoload.php is not loaded when testing an extension having Wi... [11:21:55] 10Continuous-Integration-Infrastructure (shipyard), 10MediaWiki-extensions-PropertySuggester, 10MediaWiki-extensions-WikibaseRepository, 10Wikidata: [PropertySuggester] PHP Fatal error: Class 'Wikibase\Repo\Tests\Api\WikibaseApiTestCase' not found in exten... - https://phabricator.wikimedia.org/T196062#4245644 [11:25:50] hashar: https://gerrit.wikimedia.org/r/436327 [11:25:52] tricked it.. 
[11:32:04] (03CR) 10Hashar: More robust check for a quibble based job (031 comment) [integration/config] - 10https://gerrit.wikimedia.org/r/436477 (owner: 10Legoktm) [11:36:27] (03PS1) 10Hashar: zuul: fix deps for Quibble jobs [integration/config] - 10https://gerrit.wikimedia.org/r/436510 (https://phabricator.wikimedia.org/T196062) [11:36:37] (03CR) 10Hashar: "https://gerrit.wikimedia.org/r/436510 zuul: fix deps for Quibble jobs" [integration/config] - 10https://gerrit.wikimedia.org/r/436477 (owner: 10Legoktm) [11:38:34] (03CR) 10Hashar: [C: 032] zuul: fix deps for Quibble jobs [integration/config] - 10https://gerrit.wikimedia.org/r/436510 (https://phabricator.wikimedia.org/T196062) (owner: 10Hashar) [11:40:46] (03Merged) 10jenkins-bot: zuul: fix deps for Quibble jobs [integration/config] - 10https://gerrit.wikimedia.org/r/436510 (https://phabricator.wikimedia.org/T196062) (owner: 10Hashar) [11:41:50] !log Fixed EXT_DEPENDENCIES no more being injected on quibble jobs T196062 [11:41:52] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [11:41:52] T196062: [PropertySuggester] PHP Fatal error: Class 'Wikibase\Repo\Tests\Api\WikibaseApiTestCase' not found in extensions/PropertySuggester/tests/phpunit/PropertySuggester/GetSuggestionsTest.php on line 24 - https://phabricator.wikimedia.org/T196062 [11:42:18] 10Continuous-Integration-Infrastructure (shipyard), 10MediaWiki-extensions-PropertySuggester, 10MediaWiki-extensions-WikibaseRepository, 10Wikidata, 10Patch-For-Review: EXT_DEPENDENCIES no more being injected on quibble jobs - https://phabricator.wikimedia.org/T196062#4245666 (10hashar) [11:42:36] 10Continuous-Integration-Infrastructure (shipyard), 10MediaWiki-extensions-WikibaseRepository, 10Wikidata: Get Wikibase + dependencies to run with Quibble - https://phabricator.wikimedia.org/T196013#4245669 (10hashar) [11:42:41] 10Continuous-Integration-Infrastructure (shipyard), 10MediaWiki-extensions-PropertySuggester, 
10MediaWiki-extensions-WikibaseRepository, 10Wikidata, 10Patch-For-Review: EXT_DEPENDENCIES no more being injected on quibble jobs - https://phabricator.wikimedia.org/T196062#4245591 (10hashar) 05Open>03Resol... [11:42:59] hashar: will that fix the quibble job on https://gerrit.wikimedia.org/r/#/c/435976/ ? [11:49:57] jdlrobson: yes :) [11:56:13] 10Phabricator, 10Project-Admins, 10Security-Team: Create Tag for WMDE Fundraising Security issues - https://phabricator.wikimedia.org/T194286#4245727 (10Aklapper) Correct. [12:45:04] 10MediaWiki-Codesniffer: Add sniff to detect spaces before comma in argument lists - https://phabricator.wikimedia.org/T168970#4245808 (10thiemowmde) p:05Triage>03Normal More edge-cases I found by scanning all my local codebases: ```lang=php function foo( $a /* comment */, $b ) list( $a, , $b ) list( $a, $b,... [13:00:09] 10MediaWiki-Codesniffer: Add sniff to detect class_exists( 'string' ) - https://phabricator.wikimedia.org/T188144#4245820 (10thiemowmde) p:05Triage>03Normal Can we expand this ticket to all other cases with the same problem? * Strings like `'TheClass::theMethod'` can be replaced with either `[ TheClass::clas... [13:03:20] 10MediaWiki-Codesniffer: MediaWiki.Commenting.FunctionComment.MissingParamTag should handle "@param $var type" a little better - https://phabricator.wikimedia.org/T141412#2497580 (10thiemowmde) The problem I see here is that the sniff can't really know if the word following the variable name is documentation (wh... [13:11:49] 10Project-Admins, 10wikiba.se: permit a phabricator page for the FactGrid project - https://phabricator.wikimedia.org/T193071#4245850 (10Olaf_Simons) Hi, I confess I asked for this with a look at future tasks (and while I am officially out of office for another 10 days). I am not quite sure whether the Phabric... 
[13:36:33] PROBLEM - Puppet errors on deployment-eventlog05 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [13:40:58] zeljkof: thanks for improving the ci config for math. I am currently looking at https://gerrit.wikimedia.org/r/#/c/434199/ which had +2 but was not merged (since it was dependent on another patch, or the test pipeline failed). Now, after rebasing, the test pipeline succeeded and jenkins bot did its +2 verification vote. However, on https://integration.wikimedia.org/zuul/ I don't see a gate-and-submit job for this change... [13:40:59] Is this the intended behavior? Should I do something to trigger the submit? [13:42:02] andrewbogott: was it you who pinged re deployment calendar being empty for the future? well, I just added up until the end of June :) [13:42:20] greg-g: cool, I will add my thing :) [13:55:10] physikerwelt: sorry, not working today, it's a holiday here [13:57:01] zeljkof: enjoy [14:01:35] 10Release-Engineering-Team (Kanban), 10Jenkins, 10Security: Jenkins security release - 2018-06-04 - https://phabricator.wikimedia.org/T196074#4245961 (10greg) [14:03:23] I was talking to leszek_wmde about that, he was not sure if it's the intended behavior. But he votes +2 to trigger the gate-and-submit pipeline. If that's the intended behavior it should probably be added to https://www.mediawiki.org/wiki/Gerrit/%2B2 ?
[14:11:10] (03PS1) 10Hashar: Always pass os.environ to run commands [integration/quibble] - 10https://gerrit.wikimedia.org/r/436545 [14:52:00] (03PS1) 10Hashar: Remove dummy test class [integration/quibble] - 10https://gerrit.wikimedia.org/r/436558 [14:52:02] (03PS1) 10Hashar: Spawn DevWebserver with OS environment variables [integration/quibble] - 10https://gerrit.wikimedia.org/r/436559 [15:13:53] (03PS1) 10Hashar: Set LOG_DIR environment variable [integration/quibble] - 10https://gerrit.wikimedia.org/r/436564 [15:15:41] (03PS1) 10Hashar: Add missing test file [integration/quibble] - 10https://gerrit.wikimedia.org/r/436565 (https://phabricator.wikimedia.org/T195634) [15:16:23] (03CR) 10Hashar: [C: 032] ""tox -e integration" failed" [integration/quibble] - 10https://gerrit.wikimedia.org/r/436565 (https://phabricator.wikimedia.org/T195634) (owner: 10Hashar) [15:17:13] (03Merged) 10jenkins-bot: Add missing test file [integration/quibble] - 10https://gerrit.wikimedia.org/r/436565 (https://phabricator.wikimedia.org/T195634) (owner: 10Hashar) [15:17:16] (03PS2) 10Hashar: Always pass os.environ to run commands [integration/quibble] - 10https://gerrit.wikimedia.org/r/436545 [15:17:18] (03PS2) 10Hashar: Remove dummy test class [integration/quibble] - 10https://gerrit.wikimedia.org/r/436558 [15:17:20] (03PS2) 10Hashar: Spawn DevWebserver with OS environment variables [integration/quibble] - 10https://gerrit.wikimedia.org/r/436559 [15:17:22] (03PS2) 10Hashar: Set LOG_DIR environment variable [integration/quibble] - 10https://gerrit.wikimedia.org/r/436564 [15:17:43] (03CR) 10jenkins-bot: Add missing test file [integration/quibble] - 10https://gerrit.wikimedia.org/r/436565 (https://phabricator.wikimedia.org/T195634) (owner: 10Hashar) [15:18:39] 10Continuous-Integration-Infrastructure (shipyard), 10Release-Engineering-Team (Kanban), 10releng-201718-q3, 10Epic, 10Patch-For-Review: [EPIC] Migrate Mediawiki jobs from Nodepool to Docker - 
https://phabricator.wikimedia.org/T183512#4246175 (10hashar) [15:26:10] (03PS1) 10Hashar: Migrate Linter / MultiLanguageManager to Quibble [integration/config] - 10https://gerrit.wikimedia.org/r/436568 (https://phabricator.wikimedia.org/T183512) [15:27:46] PROBLEM - Puppet errors on deployment-redis01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [15:30:16] (03CR) 10Hashar: [C: 032] Migrate Linter / MultiLanguageManager to Quibble [integration/config] - 10https://gerrit.wikimedia.org/r/436568 (https://phabricator.wikimedia.org/T183512) (owner: 10Hashar) [15:30:58] 10Continuous-Integration-Infrastructure (shipyard), 10Release-Engineering-Team (Kanban), 10releng-201718-q3, 10Epic, 10Patch-For-Review: [EPIC] Migrate Mediawiki jobs from Nodepool to Docker - https://phabricator.wikimedia.org/T183512#4246208 (10hashar) [15:32:48] (03Merged) 10jenkins-bot: Migrate Linter / MultiLanguageManager to Quibble [integration/config] - 10https://gerrit.wikimedia.org/r/436568 (https://phabricator.wikimedia.org/T183512) (owner: 10Hashar) [15:33:42] PROBLEM - Puppet errors on deployment-redis02 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [15:50:06] 10Beta-Cluster-Infrastructure, 10Operations, 10Performance-Team, 10Patch-For-Review, 10Puppet: Include role/common in beta-cluster hieradata hierarchy - https://phabricator.wikimedia.org/T196034#4246255 (10Krinkle) [15:59:09] 10Beta-Cluster-Infrastructure, 10Release-Engineering-Team (Next), 10Cloud-Services, 10Puppet, 10User-Joe: Re-think puppet management for deployment-prep - https://phabricator.wikimedia.org/T161675#3139285 (10Krinkle) @Joe I support the idea of not allowing sharing of role-related hieradata between prod a... 
[16:00:12] 10Beta-Cluster-Infrastructure, 10Operations, 10Performance-Team, 10Puppet: Include role/common in beta-cluster hieradata hierarchy - https://phabricator.wikimedia.org/T196034#4246279 (10Krinkle) [16:03:33] (03CR) 10Krinkle: Set LOG_DIR environment variable (031 comment) [integration/quibble] - 10https://gerrit.wikimedia.org/r/436564 (owner: 10Hashar) [16:06:47] 10Beta-Cluster-Infrastructure, 10Operations, 10Performance-Team, 10Puppet: Define scap::sources in a way that is shared between prod and beta - https://phabricator.wikimedia.org/T196034#4246302 (10Krinkle) [16:42:07] (03CR) 10Hashar: "I am not sure LOG_DIR is still needed, but it probably doesn't hurt to have it set in addition of MW_LOG_DIR :]" (031 comment) [integration/quibble] - 10https://gerrit.wikimedia.org/r/436564 (owner: 10Hashar) [16:43:37] (03CR) 10Hashar: "That might have various side effects, but mostly it is going to solve a lot of issues since env variable set by setup_environments() were " [integration/quibble] - 10https://gerrit.wikimedia.org/r/436545 (owner: 10Hashar) [16:44:29] (03CR) 10Hashar: "Same as https://gerrit.wikimedia.org/r/#/c/436545/ which dealt with the run* commands. 
That should let us get the MediaWiki debug logs i" [integration/quibble] - 10https://gerrit.wikimedia.org/r/436559 (owner: 10Hashar) [16:46:29] 10Beta-Cluster-Infrastructure, 10Analytics, 10Analytics-Kanban, 10Puppet: deployment-eventlog05 puppet error about missing mysql heartbeat.heartbeat table - https://phabricator.wikimedia.org/T191109#4093870 (10Nuria) a:03elukey [17:01:57] 10Beta-Cluster-Infrastructure, 10Performance-Team, 10Patch-For-Review: Set up webperf node in Beta Cluster - https://phabricator.wikimedia.org/T195314#4246542 (10Krinkle) [17:02:11] 10Beta-Cluster-Infrastructure, 10Performance-Team, 10Patch-For-Review: Set up webperf node in Beta Cluster - https://phabricator.wikimedia.org/T195314#4222943 (10Krinkle) [17:02:26] (03CR) 10Legoktm: "Gah :( thanks, that's what I get for trying to do that at 1am" [integration/config] - 10https://gerrit.wikimedia.org/r/436510 (https://phabricator.wikimedia.org/T196062) (owner: 10Hashar) [17:12:16] (03CR) 10Krinkle: [C: 031] Always pass os.environ to run commands [integration/quibble] - 10https://gerrit.wikimedia.org/r/436545 (owner: 10Hashar) [17:12:21] (03CR) 10Krinkle: [C: 032] Remove dummy test class [integration/quibble] - 10https://gerrit.wikimedia.org/r/436558 (owner: 10Hashar) [17:39:16] 10Beta-Cluster-Infrastructure, 10Performance-Team, 10Patch-For-Review: Set up webperf node in Beta Cluster - https://phabricator.wikimedia.org/T195314#4246697 (10Krinkle) [17:42:48] (03CR) 10Legoktm: [C: 031] git-changed-in-head did not detect renames [integration/jenkins] - 10https://gerrit.wikimedia.org/r/423676 (owner: 10Hashar) [17:43:52] I wonder why scap is not complaining about deployment-snapshot01 being on the mediawiki-installation dsh group but not having an SSH known host entry for it [17:45:01] ah, a sneaky ~jenkins-deploy/.ssh/known_hosts file [17:45:46] but no entry for snapshot [17:45:46] hmm [17:48:15] but there is an entry for its IP [17:49:17] 10Release-Engineering-Team (Kanban), 10Patch-For-Review, 
10Release, 10Train Deployments: 1.32.0-wmf.6 deployment blockers - https://phabricator.wikimedia.org/T191052#4246717 (10Legoktm) [17:53:47] PROBLEM - Puppet errors on integration-slave-docker-1006 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [17:53:50] thcipriani, does scap ignore known host errors? [17:54:38] PROBLEM - Puppet errors on deployment-jobrunner03 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [17:55:33] PROBLEM - Puppet errors on castor02 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [17:55:42] PROBLEM - Puppet errors on deployment-apertium02 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [17:56:40] PROBLEM - Puppet errors on integration-cumin is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [17:56:40] PROBLEM - Puppet errors on deployment-puppetdb01 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [17:57:17] PROBLEM - Puppet errors on deployment-mediawiki-09 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [17:57:18] what [17:57:26] what'd you do?! 
:P [17:57:45] PROBLEM - Puppet errors on integration-r-lang-01 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [17:58:24] PROBLEM - Puppet errors on deployment-pdfrender02 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [17:58:25] PROBLEM - Puppet errors on deployment-zotero01 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [17:58:31] PROBLEM - Puppet errors on integration-slave-docker-1016 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [17:58:31] PROBLEM - Puppet errors on deployment-poolcounter04 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [17:58:45] PROBLEM - Puppet errors on integration-slave-jessie-1003 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [17:58:52] PROBLEM - Puppet errors on deployment-ores01 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [17:58:59] oh [17:59:04] is this: [17:59:12] CRITICAL: Puppet has 1 failures. Last run 2 minutes ago with 1 failures. Failed resources (up to 3 shown): File[/usr/local/share/ca-certificates/Puppet_ [17:59:17] CRITICAL: Puppet has 1 failures. Last run 2 minutes ago with 1 failures. 
Failed resources (up to 3 shown): File[/usr/local/share/ca-certificates/Puppet_Internal_CA.crt] [17:59:23] some body updated that cert [17:59:52] PROBLEM - Puppet errors on deployment-fluorine02 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [18:00:06] PROBLEM - Puppet errors on integration-slave-jessie-1002 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [18:00:08] PROBLEM - Puppet errors on deployment-prometheus01 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [18:00:16] see -cloud-admin [18:00:17] PROBLEM - Puppet errors on integration-slave-docker-1012 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [18:00:37] PROBLEM - Puppet errors on integration-publishing is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [18:00:43] PROBLEM - Puppet errors on deployment-ms-fe02 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [18:00:46] PROBLEM - Puppet errors on deployment-memc04 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [18:00:46] PROBLEM - Puppet errors on deployment-certcentral is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [18:01:03] PROBLEM - Puppet errors on deployment-conf03 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [18:01:14] PROBLEM - Puppet errors on deployment-restbase01 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [18:02:22] PROBLEM - Puppet errors on deployment-parsoid09 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [18:02:34] PROBLEM - Puppet errors on deployment-mcs01 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [18:02:57] PROBLEM - Puppet errors on deployment-mediawiki-07 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [18:02:59] PROBLEM - Puppet errors on deployment-certcentral-testclient is CRITICAL: CRITICAL: 60.00% 
of data above the critical threshold [0.0] [18:03:08] !log removed scap::sources override from Horizon Puppet config for deployment-prep - T195314 [18:03:09] PROBLEM - Puppet errors on integration-slave-docker-1013 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [18:03:11] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [18:03:11] T195314: Set up webperf node in Beta Cluster - https://phabricator.wikimedia.org/T195314 [18:03:17] PROBLEM - Puppet errors on deployment-mx02 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [18:03:23] PROBLEM - Puppet errors on deployment-logstash2 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [18:03:30] PROBLEM - Puppet errors on deployment-cache-upload04 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [18:03:52] PROBLEM - Puppet errors on integration-slave-docker-1020 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [18:03:52] PROBLEM - Puppet errors on jenkinstest is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [18:03:54] PROBLEM - Puppet errors on integration-slave-docker-1004 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [18:04:00] PROBLEM - Puppet errors on deployment-kafka-main-1 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [18:04:08] PROBLEM - Puppet errors on deployment-elastic07 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [18:04:10] PROBLEM - Puppet errors on deployment-cassandra3-01 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [18:04:39] !log un-cherry-pick https://gerrit.wikimedia.org/r/#/c/436581/ [18:04:41] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [18:04:46] PROBLEM - Puppet errors on integration-slave-docker-1015 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] 
[18:05:13] PROBLEM - Puppet errors on deployment-kafka-main-2 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [18:05:33] PROBLEM - Puppet errors on integration-slave-k8s-1018 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [18:05:46] PROBLEM - Puppet errors on deployment-redis06 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [18:05:48] PROBLEM - Puppet errors on deployment-aqs03 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [18:05:58] PROBLEM - Puppet errors on deployment-tin is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [18:06:04] PROBLEM - Puppet errors on deployment-db03 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [18:06:12] PROBLEM - Puppet errors on integration-slave-docker-1005 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [18:06:39] PROBLEM - Puppet errors on integration-slave-docker-1007 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [18:07:21] PROBLEM - Puppet errors on deployment-cache-text04 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [18:07:24] PROBLEM - Puppet errors on deployment-memc07 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [18:07:54] PROBLEM - Puppet errors on deployment-redis05 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [18:08:16] PROBLEM - Puppet errors on saucelabs-01 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [18:08:22] PROBLEM - Puppet errors on saucelabs-03 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [18:08:53] PROBLEM - Puppet errors on deployment-sca02 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [18:08:59] PROBLEM - Puppet errors on deployment-elastic05 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [18:10:48] PROBLEM - Puppet 
errors on deployment-imagescaler02 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [18:11:26] PROBLEM - Puppet errors on integration-slave-docker-1010 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [18:12:07] PROBLEM - Puppet errors on deployment-sca01 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [18:12:09] PROBLEM - Puppet errors on deployment-memc06 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [18:12:09] PROBLEM - Puppet errors on deployment-db04 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [18:12:37] PROBLEM - Puppet errors on integration-slave-docker-1008 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [18:12:43] PROBLEM - Puppet errors on deployment-deploy-01 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [18:13:09] PROBLEM - Puppet errors on deployment-kafka-jumbo-2 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [18:13:34] !log deployment-prep integration : revert puppet change https://gerrit.wikimedia.org/r/#/c/436600/ | T187622 [18:13:36] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [18:13:37] T187622: role::puppet::self referenced in puppet_ssldir.rb - https://phabricator.wikimedia.org/T187622 [18:15:01] PROBLEM - Puppet errors on deployment-urldownloader is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [18:15:03] PROBLEM - Puppet errors on deployment-elastic06 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [18:15:15] PROBLEM - Puppet errors on deployment-cassandra3-02 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [18:15:44] PROBLEM - Puppet errors on integration-slave-docker-1003 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [18:15:47] dammit hashar stop changing things under me [18:16:12] PROBLEM - Puppet 
errors on deployment-aqs02 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [18:16:12] PROBLEM - Puppet errors on deployment-kafka-jumbo-1 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [18:16:13] PROBLEM - Puppet errors on deployment-mira is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [18:17:30] 10Phabricator (Upstream), 10Upstream: Option to Turn Off Status Updates in Phabricator Task-Threads - https://phabricator.wikimedia.org/T195728#4246803 (10Aklapper) [18:18:12] PROBLEM - Puppet errors on deployment-mathoid is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [18:18:16] Krenair: andrewbogott: I have reverted the patch about puppetmaster::self that broke the puppetmaster for CI and beta ( https://gerrit.wikimedia.org/r/#/c/436600/ ) [18:19:00] PROBLEM - Puppet errors on deployment-sentry01 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [18:19:18] hashar: I think we have a slightly better fix, stay tuned [18:19:24] we are talking in -cloud-admin, please stop touching it [18:21:17] PROBLEM - Puppet errors on deployment-chromium01 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [18:23:42] okay [18:23:53] we think we have a fix and it's applied now [18:24:10] puppet is looking okay on -jobrunner03 [18:24:30] shinken should begin to pick up recoveries later [18:25:42] RECOVERY - Puppet errors on integration-slave-docker-1003 is OK: OK: Less than 1.00% above the threshold [0.0] [18:29:39] RECOVERY - Puppet errors on deployment-jobrunner03 is OK: OK: Less than 1.00% above the threshold [0.0] [18:32:16] RECOVERY - Puppet errors on deployment-mediawiki-09 is OK: OK: Less than 1.00% above the threshold [0.0] [18:33:11] 10Project-Admins, 10wikiba.se: permit a phabricator page for the FactGrid project - https://phabricator.wikimedia.org/T193071#4246819 (10Aklapper) 05Open>03stalled No problem. 
:) I'l set the status to `stalled` for the time being; feel free to reset once there have been more discussions. :) [18:33:32] RECOVERY - Puppet errors on integration-slave-docker-1016 is OK: OK: Less than 1.00% above the threshold [0.0] [18:33:44] RECOVERY - Puppet errors on integration-slave-jessie-1003 is OK: OK: Less than 1.00% above the threshold [0.0] [18:33:48] RECOVERY - Puppet errors on integration-slave-docker-1006 is OK: OK: Less than 1.00% above the threshold [0.0] [18:33:58] 10Gerrit, 10Phabricator, 10Release-Engineering-Team (Kanban): Move Scap development to Gerrit - https://phabricator.wikimedia.org/T191373#4246821 (10demon) a:05demon>03None [18:34:07] Krenair: andrewbogott \o/ :] [18:35:16] RECOVERY - Puppet errors on integration-slave-docker-1012 is OK: OK: Less than 1.00% above the threshold [0.0] [18:35:30] RECOVERY - Puppet errors on castor02 is OK: OK: Less than 1.00% above the threshold [0.0] [18:35:39] RECOVERY - Puppet errors on deployment-apertium02 is OK: OK: Less than 1.00% above the threshold [0.0] [18:35:47] RECOVERY - Puppet errors on deployment-memc04 is OK: OK: Less than 1.00% above the threshold [0.0] [18:35:47] RECOVERY - Puppet errors on deployment-certcentral is OK: OK: Less than 1.00% above the threshold [0.0] [18:36:05] RECOVERY - Puppet errors on deployment-conf03 is OK: OK: Less than 1.00% above the threshold [0.0] [18:36:13] RECOVERY - Puppet errors on deployment-restbase01 is OK: OK: Less than 1.00% above the threshold [0.0] [18:36:28] Krenair: and it works on CI puppetmaster just fine [18:36:40] RECOVERY - Puppet errors on integration-cumin is OK: OK: Less than 1.00% above the threshold [0.0] [18:36:40] RECOVERY - Puppet errors on deployment-puppetdb01 is OK: OK: Less than 1.00% above the threshold [0.0] [18:37:02] ok good to hear [18:37:32] RECOVERY - Puppet errors on deployment-mcs01 is OK: OK: Less than 1.00% above the threshold [0.0] [18:37:44] RECOVERY - Puppet errors on integration-r-lang-01 is OK: OK: Less than 
1.00% above the threshold [0.0] [18:37:58] RECOVERY - Puppet errors on deployment-mediawiki-07 is OK: OK: Less than 1.00% above the threshold [0.0] [18:37:58] RECOVERY - Puppet errors on deployment-certcentral-testclient is OK: OK: Less than 1.00% above the threshold [0.0] [18:38:13] RECOVERY - Puppet errors on integration-slave-docker-1013 is OK: OK: Less than 1.00% above the threshold [0.0] [18:38:23] RECOVERY - Puppet errors on deployment-logstash2 is OK: OK: Less than 1.00% above the threshold [0.0] [18:38:25] RECOVERY - Puppet errors on deployment-pdfrender02 is OK: OK: Less than 1.00% above the threshold [0.0] [18:38:25] RECOVERY - Puppet errors on deployment-zotero01 is OK: OK: Less than 1.00% above the threshold [0.0] [18:38:29] RECOVERY - Puppet errors on deployment-cache-upload04 is OK: OK: Less than 1.00% above the threshold [0.0] [18:38:31] RECOVERY - Puppet errors on deployment-poolcounter04 is OK: OK: Less than 1.00% above the threshold [0.0] [18:38:33] 10Release-Engineering-Team (Kanban), 10Technical-Debt: Review Platform Tech Debt backlog - https://phabricator.wikimedia.org/T196093#4246827 (10Jrbranaa) [18:38:54] RECOVERY - Puppet errors on deployment-ores01 is OK: OK: Less than 1.00% above the threshold [0.0] [18:39:08] RECOVERY - Puppet errors on deployment-elastic07 is OK: OK: Less than 1.00% above the threshold [0.0] [18:39:22] PROBLEM - Puppet errors on deployment-mx is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [18:39:53] RECOVERY - Puppet errors on deployment-fluorine02 is OK: OK: Less than 1.00% above the threshold [0.0] [18:40:06] RECOVERY - Puppet errors on integration-slave-jessie-1002 is OK: OK: Less than 1.00% above the threshold [0.0] [18:40:10] RECOVERY - Puppet errors on deployment-prometheus01 is OK: OK: Less than 1.00% above the threshold [0.0] [18:40:12] RECOVERY - Puppet errors on deployment-kafka-main-2 is OK: OK: Less than 1.00% above the threshold [0.0] [18:40:33] RECOVERY - Puppet errors on 
integration-slave-k8s-1018 is OK: OK: Less than 1.00% above the threshold [0.0] [18:40:37] RECOVERY - Puppet errors on integration-publishing is OK: OK: Less than 1.00% above the threshold [0.0] [18:40:41] RECOVERY - Puppet errors on deployment-ms-fe02 is OK: OK: Less than 1.00% above the threshold [0.0] [18:41:06] RECOVERY - Puppet errors on deployment-db03 is OK: OK: Less than 1.00% above the threshold [0.0] [18:41:37] RECOVERY - Puppet errors on integration-slave-docker-1007 is OK: OK: Less than 1.00% above the threshold [0.0] [18:42:21] 10Release-Engineering-Team (Kanban), 10Technical-Debt: Define an approach for tracking/managing tech debt for PEP - https://phabricator.wikimedia.org/T196096#4246874 (10Jrbranaa) [18:42:22] RECOVERY - Puppet errors on deployment-cache-text04 is OK: OK: Less than 1.00% above the threshold [0.0] [18:42:22] RECOVERY - Puppet errors on deployment-parsoid09 is OK: OK: Less than 1.00% above the threshold [0.0] [18:42:26] RECOVERY - Puppet errors on deployment-memc07 is OK: OK: Less than 1.00% above the threshold [0.0] [18:42:56] RECOVERY - Puppet errors on deployment-redis05 is OK: OK: Less than 1.00% above the threshold [0.0] [18:43:16] RECOVERY - Puppet errors on deployment-mx02 is OK: OK: Less than 1.00% above the threshold [0.0] [18:43:52] RECOVERY - Puppet errors on integration-slave-docker-1020 is OK: OK: Less than 1.00% above the threshold [0.0] [18:43:54] RECOVERY - Puppet errors on jenkinstest is OK: OK: Less than 1.00% above the threshold [0.0] [18:43:54] RECOVERY - Puppet errors on integration-slave-docker-1004 is OK: OK: Less than 1.00% above the threshold [0.0] [18:43:54] RECOVERY - Puppet errors on deployment-sca02 is OK: OK: Less than 1.00% above the threshold [0.0] [18:44:00] RECOVERY - Puppet errors on deployment-kafka-main-1 is OK: OK: Less than 1.00% above the threshold [0.0] [18:44:09] RECOVERY - Puppet errors on deployment-cassandra3-01 is OK: OK: Less than 1.00% above the threshold [0.0] [18:44:45] RECOVERY - 
Puppet errors on integration-slave-docker-1015 is OK: OK: Less than 1.00% above the threshold [0.0] [18:45:42] RECOVERY - Puppet errors on deployment-redis06 is OK: OK: Less than 1.00% above the threshold [0.0] [18:45:48] RECOVERY - Puppet errors on deployment-aqs03 is OK: OK: Less than 1.00% above the threshold [0.0] [18:45:59] RECOVERY - Puppet errors on deployment-tin is OK: OK: Less than 1.00% above the threshold [0.0] [18:46:12] RECOVERY - Puppet errors on integration-slave-docker-1005 is OK: OK: Less than 1.00% above the threshold [0.0] [18:46:22] RECOVERY - Puppet errors on integration-slave-docker-1010 is OK: OK: Less than 1.00% above the threshold [0.0] [18:47:04] RECOVERY - Puppet errors on deployment-sca01 is OK: OK: Less than 1.00% above the threshold [0.0] [18:47:08] RECOVERY - Puppet errors on deployment-memc06 is OK: OK: Less than 1.00% above the threshold [0.0] [18:48:09] RECOVERY - Puppet errors on deployment-kafka-jumbo-2 is OK: OK: Less than 1.00% above the threshold [0.0] [18:48:17] RECOVERY - Puppet errors on saucelabs-01 is OK: OK: Less than 1.00% above the threshold [0.0] [18:48:17] RECOVERY - Puppet errors on saucelabs-03 is OK: OK: Less than 1.00% above the threshold [0.0] [18:49:00] RECOVERY - Puppet errors on deployment-elastic05 is OK: OK: Less than 1.00% above the threshold [0.0] [18:50:48] RECOVERY - Puppet errors on deployment-imagescaler02 is OK: OK: Less than 1.00% above the threshold [0.0] [18:51:12] RECOVERY - Puppet errors on deployment-aqs02 is OK: OK: Less than 1.00% above the threshold [0.0] [18:51:13] RECOVERY - Puppet errors on deployment-mira is OK: OK: Less than 1.00% above the threshold [0.0] [18:51:15] RECOVERY - Puppet errors on deployment-kafka-jumbo-1 is OK: OK: Less than 1.00% above the threshold [0.0] [18:52:08] RECOVERY - Puppet errors on deployment-db04 is OK: OK: Less than 1.00% above the threshold [0.0] [18:52:41] RECOVERY - Puppet errors on integration-slave-docker-1008 is OK: OK: Less than 1.00% above the 
threshold [0.0] [18:54:09] 10Release-Engineering-Team (Kanban), 10Technical-Debt: Define an approach for tracking/managing tech debt for PEP - https://phabricator.wikimedia.org/T196096#4246919 (10Jrbranaa) p:05Triage>03Normal [18:54:30] 10Release-Engineering-Team (Kanban), 10Technical-Debt: Review Platform Tech Debt backlog - https://phabricator.wikimedia.org/T196093#4246920 (10Jrbranaa) p:05Triage>03Normal [18:55:01] RECOVERY - Puppet errors on deployment-urldownloader is OK: OK: Less than 1.00% above the threshold [0.0] [18:55:03] RECOVERY - Puppet errors on deployment-elastic06 is OK: OK: Less than 1.00% above the threshold [0.0] [18:55:14] RECOVERY - Puppet errors on deployment-cassandra3-02 is OK: OK: Less than 1.00% above the threshold [0.0] [18:56:16] RECOVERY - Puppet errors on deployment-chromium01 is OK: OK: Less than 1.00% above the threshold [0.0] [18:58:13] RECOVERY - Puppet errors on deployment-mathoid is OK: OK: Less than 1.00% above the threshold [0.0] [18:58:59] RECOVERY - Puppet errors on deployment-sentry01 is OK: OK: Less than 1.00% above the threshold [0.0] [19:01:39] 10Release-Engineering-Team, 10Epic: FY2017/18 Program 3 Outcome 2: Organizational technical debt is reduced. - https://phabricator.wikimedia.org/T174089#4246942 (10Jrbranaa) [19:01:43] 10Release-Engineering-Team (Kanban), 10Epic: FY2017/18 Program 3 Outcome 2 Objective 3: Promote and surface important technical debt topics at large gatherings of Wikimedia developers (e.g., DevSummit and Hackathon(s)) - https://phabricator.wikimedia.org/T174096#4246940 (10Jrbranaa) 05Open>03Resolved Altho... 
[19:01:53] PROBLEM - SSH on integration-slave-docker-1020 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:02:04] 10Release-Engineering-Team (Kanban), 10Wikimedia-Hackathon-2018, 10User-Ryasmeen: Quality Assurance SIG (special interest group) meetup - https://phabricator.wikimedia.org/T194937#4246943 (10Jrbranaa) 05Open>03Resolved [19:06:45] RECOVERY - SSH on integration-slave-docker-1020 is OK: SSH OK - OpenSSH_6.7p1 Debian-5+deb8u4 (protocol 2.0) [19:14:10] PROBLEM - App Server Main HTTP Response on deployment-mediawiki-09 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:20:44] 10Release-Engineering-Team (Kanban), 10Surveys: Survey for Beta Cluster use cases - https://phabricator.wikimedia.org/T194818#4246990 (10Jrbranaa) Survey has been sent out. The survey will close on June 15th. [19:24:09] RECOVERY - App Server Main HTTP Response on deployment-mediawiki-09 is OK: HTTP OK: HTTP/1.1 200 OK - 46944 bytes in 9.899 second response time [19:35:20] thcipriani: o/ have a minute for a scap question? [19:36:20] mdholloway: sure, doing the train at the moment, but go ahead and ask and I'll reply when I get a sec [19:36:46] ok, thanks! here's what i'm seeing: [19:36:47] 19:31:43 Started deploy [tilerator/deploy@UNKNOWN] (cleartables) [19:37:01] any idea offhand i would be seeing the 'UNKNOWN' in place of a commit id? [19:37:05] *why [19:37:29] command was `scap deploy --environment cleartables -l maps-test2004.codfw.wmnet` [19:38:23] hashar: I think we fixed the project local puppetmaster bug upstream. If you still have that cherry-pick you might try removing it and updating to see if things work in beta cluster [19:38:44] there was a sneaky nil that wasn't being checked for properly [19:39:08] actually, i'm going to try one other thing...
[19:40:56] ok, looks like if i create a branch on tin itself and try to deploy from it, it fools scap somehow [19:41:17] hrm, UNKNOWN is probably our git_disclosable_head function giving up :) [19:43:16] mdholloway: were you in a detached head state or on a branch? [19:43:38] thcipriani: i was on a branch [19:43:54] local branch or did it have an upstream? [19:43:59] bd808: yup seems krenair/andrew fixed it :) [19:44:19] thcipriani: local to tin [19:44:22] bd808: that seems to be working all fine now. [19:44:26] yeah [19:44:30] I was testing the fix on the beta cluster [19:47:27] no_justification https://gerrit-review.googlesource.com/c/gerrit/+/182230 es 5 support! [19:50:47] mdholloway: I think that was the problem, git_disclosable_head is trying to figure out what part of your branch is public in case you have security patches applied on top so we don't broadcast the sha1 of a patch that isn't public yet. Since you had a local-only branch it gave up somewhere I'd guess. If you file a task I can take a deeper dive/play with that function a bit to see if I can do more [19:50:50] sane-making. [19:51:09] aha, that makes sense [19:51:41] thanks. now i'm having some other issues but i don't think they relate to scap ;) [19:52:17] heh, well that's...well it's not good, but it's something :) [19:53:10] 10Continuous-Integration-Infrastructure (shipyard), 10Release-Engineering-Team (Kanban), 10releng-201718-q3, 10Epic, 10Patch-For-Review: [EPIC] Migrate Mediawiki jobs from Nodepool to Docker - https://phabricator.wikimedia.org/T183512#4247052 (10hashar) [19:53:42] 10Continuous-Integration-Infrastructure (shipyard), 10Release-Engineering-Team (Kanban), 10releng-201718-q3, 10Epic, 10Patch-For-Review: [EPIC] Migrate Mediawiki jobs from Nodepool to Docker - https://phabricator.wikimedia.org/T183512#4174523 (10hashar) I have removed extensions that are archived and the...
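For readers following the git_disclosable_head exchange at 19:50:47, here is a rough sketch of the idea as thcipriani describes it: report the newest commit on the deployed branch that is already public, and fall back to UNKNOWN when nothing qualifies. This is not scap's actual code. The function name `disclosable_head`, its arguments, and the choice to pass commit lists in directly (instead of calling git) are all assumptions made for illustration.

```python
from typing import Iterable, Sequence

# Placeholder reported when no public commit can be identified (assumed to
# mirror the "UNKNOWN" seen in the deploy log line quoted in the channel).
UNKNOWN = "UNKNOWN"


def disclosable_head(head_ancestry: Sequence[str],
                     public_commits: Iterable[str]) -> str:
    """Return the newest commit in head_ancestry that is also public.

    head_ancestry is ordered newest first (HEAD, then its parents).
    public_commits is the set of commits known to exist on a public
    remote. Both are hypothetical inputs for this sketch.
    """
    public = set(public_commits)
    for sha in head_ancestry:
        if sha in public:
            # Everything above this commit is local-only (for example
            # unpublished security patches) and is never reported.
            return sha
    # Nothing on the branch is known to be public, so give up.
    return UNKNOWN


if __name__ == "__main__":
    # Two local security patches on top of two public commits.
    print(disclosable_head(["sec2", "sec1", "pub2", "pub1"], {"pub1", "pub2"}))
    # A branch with no known public commits at all.
    print(disclosable_head(["local2", "local1"], []))
```

Under these assumptions, a branch with no upstream information would end up at the fallback, which would be consistent with the `tilerator/deploy@UNKNOWN` line above, though the real cause was only guessed at in the channel.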
[19:56:51] (03PS1) 10Hashar: Archive WikivoteMapsYandex [integration/config] - 10https://gerrit.wikimedia.org/r/436622 (https://phabricator.wikimedia.org/T193844) [19:58:56] 10Continuous-Integration-Config, 10TestMe: fix or mark as inactive extensions currently failing CI - https://phabricator.wikimedia.org/T134090#4247070 (10hashar) 05Open>03Resolved a:03hashar I am closing this. There are still extensions that are non voting but I will make a point of either archiving them... [20:00:04] (03CR) 10Hashar: [C: 032] Archive WikivoteMapsYandex [integration/config] - 10https://gerrit.wikimedia.org/r/436622 (https://phabricator.wikimedia.org/T193844) (owner: 10Hashar) [20:00:07] 10Continuous-Integration-Infrastructure (shipyard), 10Release-Engineering-Team (Kanban), 10releng-201718-q3, 10Epic, 10Patch-For-Review: [EPIC] Migrate Mediawiki jobs from Nodepool to Docker - https://phabricator.wikimedia.org/T183512#4247107 (10hashar) [20:01:42] (03Merged) 10jenkins-bot: Archive WikivoteMapsYandex [integration/config] - 10https://gerrit.wikimedia.org/r/436622 (https://phabricator.wikimedia.org/T193844) (owner: 10Hashar) [20:02:57] (03PS1) 10Hashar: Migrate CloseWikis to Quibble [integration/config] - 10https://gerrit.wikimedia.org/r/436623 (https://phabricator.wikimedia.org/T183512) [20:04:08] (03CR) 10Hashar: [C: 032] Migrate CloseWikis to Quibble [integration/config] - 10https://gerrit.wikimedia.org/r/436623 (https://phabricator.wikimedia.org/T183512) (owner: 10Hashar) [20:06:16] (03Merged) 10jenkins-bot: Migrate CloseWikis to Quibble [integration/config] - 10https://gerrit.wikimedia.org/r/436623 (https://phabricator.wikimedia.org/T183512) (owner: 10Hashar) [20:11:00] 10Beta-Cluster-Infrastructure, 10Patch-For-Review, 10Puppet: Set up puppet exported resources to collect ssh host keys for beta - https://phabricator.wikimedia.org/T72792#4247163 (10Krenair) It's working but it's very loud - I've made https://gerrit.wikimedia.org/r/#/c/436624/ to deal with that Also probably... 
[20:14:07] (PS1) Hashar: Archive Genderize extension [integration/config] - https://gerrit.wikimedia.org/r/436630 (https://phabricator.wikimedia.org/T196108)
[20:14:28] (CR) Hashar: [C: 2] Archive Genderize extension [integration/config] - https://gerrit.wikimedia.org/r/436630 (https://phabricator.wikimedia.org/T196108) (owner: Hashar)
[20:15:42] (Merged) jenkins-bot: Archive Genderize extension [integration/config] - https://gerrit.wikimedia.org/r/436630 (https://phabricator.wikimedia.org/T196108) (owner: Hashar)
[20:25:23] no_justification https://gerrit.git.wmflabs.org/r/q/status:open !
[20:25:27] gerrit 2.15
[20:26:01] and i see notedb.config
[20:26:34] (PS1) MGChecker: Add possibility to change allowed prefixes [tools/codesniffer] - https://gerrit.wikimedia.org/r/436633 (https://phabricator.wikimedia.org/T191812)
[20:27:26] (CR) jerkins-bot: [V: -1] Add possibility to change allowed prefixes [tools/codesniffer] - https://gerrit.wikimedia.org/r/436633 (https://phabricator.wikimedia.org/T191812) (owner: MGChecker)
[20:27:33] Continuous-Integration-Config, Release-Engineering-Team (Watching / External), Discovery, Product-Analytics, and 2 others: Add lint/CI to all wikimedia/discovery analytics repositories - https://phabricator.wikimedia.org/T153856#4247216 (mpopov) Open>declined My team was in disarray and r...
[20:29:08] no_justification seems the font was not applied?
[20:29:13] (PS1) Hashar: Migrate ext needing git submodule to Quibble [integration/config] - https://gerrit.wikimedia.org/r/436676 (https://phabricator.wikimedia.org/T183512)
[20:29:24] and also (2.15.1-202-g7e37a6c479-dirty)
[20:29:35] (CR) Hashar: [C: 2] Migrate ext needing git submodule to Quibble [integration/config] - https://gerrit.wikimedia.org/r/436676 (https://phabricator.wikimedia.org/T183512) (owner: Hashar)
[20:30:58] Continuous-Integration-Infrastructure (shipyard), Release-Engineering-Team (Kanban), releng-201718-q3, Epic, Patch-For-Review: [EPIC] Migrate Mediawiki jobs from Nodepool to Docker - https://phabricator.wikimedia.org/T183512#4247225 (hashar)
[20:31:44] (Merged) jenkins-bot: Migrate ext needing git submodule to Quibble [integration/config] - https://gerrit.wikimedia.org/r/436676 (https://phabricator.wikimedia.org/T183512) (owner: Hashar)
[20:34:36] (PS2) MGChecker: Add possibility to change allowed prefixes [tools/codesniffer] - https://gerrit.wikimedia.org/r/436633 (https://phabricator.wikimedia.org/T191812)
[20:35:30] Continuous-Integration-Infrastructure (shipyard), MediaWiki-extensions-WikibaseRepository, Wikidata: Get Wikibase + dependencies to run with Quibble - https://phabricator.wikimedia.org/T196013#4247247 (hashar)
[20:37:43] (CR) jerkins-bot: [V: -1] Add possibility to change allowed prefixes [tools/codesniffer] - https://gerrit.wikimedia.org/r/436633 (https://phabricator.wikimedia.org/T191812) (owner: MGChecker)
[20:40:57] (PS3) MGChecker: Add possibility to change allowed prefixes [tools/codesniffer] - https://gerrit.wikimedia.org/r/436633 (https://phabricator.wikimedia.org/T191812)
[20:41:09] Continuous-Integration-Infrastructure (shipyard), MediaWiki-extensions-WikibaseRepository, Wikidata: Get Wikibase + dependencies to run with Quibble - https://phabricator.wikimedia.org/T196013#4247252 (hashar) I think the issue is the Wikibase.php entry point
relies on `$wgWikimediaJenkinsCI` being s...
[20:44:30] addshore hi, wondering if i can ask you a wikibase question please?
[20:44:41] (CR) jerkins-bot: [V: -1] Add possibility to change allowed prefixes [tools/codesniffer] - https://gerrit.wikimedia.org/r/436633 (https://phabricator.wikimedia.org/T191812) (owner: MGChecker)
[20:44:48] we are currently getting this Wikimedia\Assert\ParameterTypeException from line 89 of /srv/mediawiki/w/extensions/Wikibase/vendor/wikimedia/assert/src/Assert.php: Bad value for parameter $maxSerializedEntitySizeInBytes: must be a integer
[20:44:52] on mediawiki 1.30
[20:47:13] (PS4) MGChecker: Add possibility to change allowed prefixes [tools/codesniffer] - https://gerrit.wikimedia.org/r/436633 (https://phabricator.wikimedia.org/T191812)
[20:52:00] (PS5) MGChecker: Add possibility to change allowed prefixes [tools/codesniffer] - https://gerrit.wikimedia.org/r/436633 (https://phabricator.wikimedia.org/T191812)
[20:56:20] Release-Engineering-Team (Kanban), Wikimedia-Hackathon-2018: Quality Assurance SIG (special interest group) meetup - https://phabricator.wikimedia.org/T194937#4247274 (Ryasmeen)
[20:57:34] (CR) jerkins-bot: [V: -1] Add possibility to change allowed prefixes [tools/codesniffer] - https://gerrit.wikimedia.org/r/436633 (https://phabricator.wikimedia.org/T191812) (owner: MGChecker)
[20:58:03] Beta-Cluster-Infrastructure, Collaboration-Team-Triage, Discovery, Maps, Collaboration-Feature-Rollouts (Collaboration-Maps): [betalabs] Uncaught TypeError: Cannot read property 'lat' of null - when 'Edit layer' option is used.
- https://phabricator.wikimedia.org/T194440#4247276 (Etonkovidova)
[20:58:16] (PS6) MGChecker: Add possibility to change allowed prefixes [tools/codesniffer] - https://gerrit.wikimedia.org/r/436633 (https://phabricator.wikimedia.org/T191812)
[21:02:06] (PS7) MGChecker: Add possibility to change allowed prefixes [tools/codesniffer] - https://gerrit.wikimedia.org/r/436633 (https://phabricator.wikimedia.org/T191812)
[21:04:42] MediaWiki-Codesniffer, MediaWiki-extensions-Variables, Patch-For-Review: Allow configuring MediaWiki.NamingConventions.ValidGlobalName.wgPrefix to allow additional prefixes - https://phabricator.wikimedia.org/T191812#4247283 (MGChecker) p:Triage>Normal Should I catch the case that someone exp...
[21:04:45] (CR) MGChecker: "Sorry for the many patchsets, composer test didn't run for me because of" [tools/codesniffer] - https://gerrit.wikimedia.org/r/436633 (https://phabricator.wikimedia.org/T191812) (owner: MGChecker)
[21:07:43] (CR) jerkins-bot: [V: -1] Add possibility to change allowed prefixes [tools/codesniffer] - https://gerrit.wikimedia.org/r/436633 (https://phabricator.wikimedia.org/T191812) (owner: MGChecker)
[21:10:10] Continuous-Integration-Config, Release-Engineering-Team (Watching / External), Discovery, Product-Analytics, and 2 others: Add lint/CI to all wikimedia/discovery analytics repositories - https://phabricator.wikimedia.org/T153856#4247299 (Legoktm) declined>Open :( Regardless, it's always...
[21:11:41] Continuous-Integration-Infrastructure (shipyard), MediaWiki-extensions-WikibaseRepository, Wikidata, Patch-For-Review: Get Wikibase + dependencies to run with Quibble - https://phabricator.wikimedia.org/T196013#4247301 (hashar) install.php --with-extensions does complete. The LocalSettings.php th...
[21:13:46] hashar: now that I've spent a day thinking about it, it probably would have been easier to just bump the jenkins timeout for quibble + hhvm to an hour or something
[21:14:15] legoktm: have you ended up reverting?
[21:14:28] nope
[21:14:28] legoktm: there is no harm trying optimizing anyway
[21:14:40] I feel like I would just break more things in trying to revert it all
[21:14:44] it is not like we are dealing with human lives or trying to land airplanes!
[21:14:49] :p
[21:15:13] we can always fix it up later on
[21:15:18] Release-Engineering-Team (Kanban), User-greg: Write JD for DevProd position - https://phabricator.wikimedia.org/T193502#4247308 (greg) Open>Resolved We worked on this at our team offsite in Barcelona this month. I've added the language to that google doc and am now waiting on getting the go-ahead...
[21:15:23] I will probably need the git changed in head trick for Wikibase as well
[21:15:35] php -l + phpcs against 1500 files is a bit slow
[21:17:17] ah, yep :( they have a lot of files too
[21:17:43] but anyway, update.php doesn't even work yet :]
[21:18:01] I gotta figure out a way to inject $wgJenkinsWikimediaCI = true; BEFORE require_once( Wikibase.php )
[21:18:17] or maybe try to add an extension.json with a callback
[21:19:30] (CR) Legoktm: "That's alright, I can help with fixing up the tests :) I'll review in more detail later today hopefully." [tools/codesniffer] - https://gerrit.wikimedia.org/r/436633 (https://phabricator.wikimedia.org/T191812) (owner: MGChecker)
[21:19:43] anyway I am off to bed.
I am tired
[21:19:59] at least I got some patches for quibble and managed to migrate a few more extensions
[21:22:10] Release-Engineering-Team (Kanban), Quibble, Patch-For-Review: Error: 1071 Specified key was too long; max key length is 767 bytes - https://phabricator.wikimedia.org/T193222#4247329 (jcrespo) @hashar The consensus seems to be going towards abandoning utf8/utf8mb4 (the mysql config) and embracing UTF-...
[21:37:34] thcipriani: so, more on that (probably) scap-unrelated issue i mentioned a couple of hours ago: i'm seeing new error output in the service i've been deploying (tilerator) that persists even when i reset to the previous semi-working commit on tin (on a real branch that exists in the repo on gerrit) and redeploy. could there be some bad state cached somewhere or something?
[21:39:04] there could be, what are you seeing?
[21:39:31] the deploy log looks fine, but this shows up in the tilerator logs:
[21:39:33] you're resetting and redeploying and the error output looks like it didn't redeploy?
[21:40:07] https://www.irccloud.com/pastebin/n2kcFVtv/
[21:41:12] besides being not super easy to diagnose, that's the same error that was getting thrown when i was trying to deploy tilerator@UNKNOWN
[21:41:57] er, when scap was trying to deploy an unknown commit, anyway
[21:43:01] as best i can recall, these were my steps
[21:43:26] (note to readers, this is on a dedicated test server ;))
[21:43:33] so it looks like the current HEAD of /srv/deployment/tilerator/deploy on tin is in the same code as is in /srv/deployment/tilerator/deploy on maps-test2004, FWIW
[21:43:55] scap didn't restart the service with the very last deploy since it thought that the current revision was already live on that server
[21:44:46] ah, that's why that happened. that makes sense. though, having seen that, i restarted manually, and it didn't seem to help.
[21:45:34] hrm
[21:46:11] the config for tilerator and tileratorui also seems to be linked to the correct revision
[21:46:26] this is what i did in a nutshell: (1) created a new 'stretch-hacks' branch from stretch on tin with a new commit and deployed, got the tilerator@UNKNOWN issue, and the new error pasted above
[21:47:00] (2) cherry-picked the new commit from stretch-hacks onto the real stretch branch, deployed, same result
[21:47:31] (3) `git reset --hard HEAD^` to the last "good" commit and deployed, same result
[21:49:28] well, i should say that in (2) and (3) i didn't see the UNKNOWN thing, scap identified a commit, but the new error output (with basically the most common error in JS and no file or line number) remains.
[21:49:54] for the record, T195476 is what i was trying to debug in the first place
[21:49:54] T195476: Unable to create source "v3"self._closeAsync is not a function error - https://phabricator.wikimedia.org/T195476
[21:50:43] hrm, afaict the current HEAD on tin:/srv/deployment/tilerator/deploy is what's deployed to the server. IIRC there was some fiddling with scap configuration files happening, so maybe the actual content of the config files is incorrect?
[21:50:58] I don't know what they should look like, though, so I can't say for sure
[21:51:13] the lack of line number in the error output is, yeah, very unhelpful here :)
[21:54:13] hmm, i certainly wouldn't swear to the correctness of our scap config files, but fwiw i didn't touch any here
[21:54:45] I just remember there being a question yesterday about scap --environment
[21:55:21] and i ended up inadvertently doing one or two no-op deploys before creating the stretch-hacks branch and making things worse
[21:56:00] after those no-op deploys, the tilerator logs were still as seen in T195476
[21:56:01] T195476: Unable to create source "v3"self._closeAsync is not a function error - https://phabricator.wikimedia.org/T195476
[21:56:03] were you using --environment cleartables?
[21:56:07] yep
[21:56:25] `scap deploy --environment cleartables -l maps-test2004.codfw.wmnet` each time
[21:57:07] could you try: scap deploy --force --environment cleartables -l maps-test2004.codfw.wmnet
[21:57:25] I don't know that that will change anything, but it should re-deploy everything from scratch for that machine
[21:57:26] our `scap.cfg` almost certainly is wrong, but i guess things are nonetheless working well enough to deploy code :D
[21:57:42] aha, i'll give it a shot
[21:58:19] the --force flag means scap won't try to be smart about redeploying what's already been deployed, restarting services, etc
[21:58:33] I'll watch the deploy log
[22:00:11] hmm, this is exhibiting something else i'd noticed but neglected to mention -- it gets stuck on the promote and restart_service stage
[22:00:42] looking at the logs it seems like it's waiting for Port 6534 not up
[22:01:10] and failed since it didn't come up in 2 minutes
[22:02:08] hmm, that's something. this is actually the first time i've waited the 2 minutes rather than aborting it.
[22:02:47] what is 6534?
[22:03:19] I see 6533 is up
[22:04:19] 6534 is tileratorui
[22:06:03] hrm, well that service is getting restarted ok so it seems, but that port is not coming back up
[22:06:47] * pnorman waves
[22:11:05] I don't know why that port is not coming back up. I guess probably because of the fatal you posted earlier.
[22:12:11] Anything I can help with?
[22:14:26] mdholloway: FWIW, the one other maps-test server I checked 2001 is not running the version you're trying to deploy to 2004, you have 3 or so commits on top, is that a known thing?
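(editor's note) The port check discussed above, where the deploy waits for a port and gives up after 2 minutes, can be reproduced by hand with a small wait-for-port loop. The following is only a rough Python sketch and is not scap's actual code; `wait_for_port`, its parameters, and the default values are hypothetical names chosen for illustration.

```python
import socket
import time


def wait_for_port(host, port, timeout=120.0, interval=1.0):
    """Poll a TCP port until it accepts a connection or the timeout expires.

    Hypothetical helper for illustration; not part of scap.
    Returns True if a connection succeeded, False if the deadline passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: retry until the deadline.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

Something like `wait_for_port("localhost", 6534, timeout=120)` run on the target host right after a manual service restart would show whether the port ever comes up, independently of the deploy tooling.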
[22:14:44] maps-test2004 is running the new cleartables style
[22:15:08] hmm, i thought tilerator was 6533 and tileratorui was 6534, but it looks like, in fact, tilerator is 6534 and tileratorui is 6535
[22:16:00] pnorman: so test2004 is having some trouble during deployment, it's restarting servers, but the tilerator port is not coming back up it seems after restart
[22:16:22] which may be due to the error mdholloway posted earlier: https://www.irccloud.com/pastebin/n2kcFVtv/
[22:16:53] length of undefined, yep. I think that's where we were at before I left for food
[22:17:07] * mdholloway nods
[22:17:13] but I don't really know how to troubleshoot that error. All I can say is that scap looks like it's deployed everything normally.
[22:17:21] I don't think there's a scap issue.
[22:17:50] yeah, doesn't seem so.
[22:18:05] especially after the redeploy with `--force`.
[22:23:52] well, thanks for the sanity check, thcipriani
[22:30:28] PROBLEM - Host deployment-puppetmaster02 is DOWN: CRITICAL - Host Unreachable (10.68.21.200)
[22:32:11] mdholloway: any time :)
[22:49:45] (CR) Legoktm: "I noticed recently when trying to debug that quibble doesn't create the junit reports for jenkins "test result"...is it possible to merge " [integration/quibble] - https://gerrit.wikimedia.org/r/426742 (owner: Legoktm)
[22:57:00] PROBLEM - Puppet errors on deployment-tin is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[23:00:55] paladox: sure
[23:01:07] addshore thanks, we resolved the issue
[23:01:09] now!
[23:01:19] What was the question? :P
[23:01:25] I'm intreguied!
[23:01:50] we got this error
[23:01:51] Wikimedia\Assert\ParameterTypeException from line 89 of /srv/mediawiki/w/extensions/Wikibase/vendor/wikimedia/assert/src/Assert.php: Bad value for parameter $maxSerializedEntitySizeInBytes: must be a integer
[23:02:03] Oooh
[23:02:08] but later found it was caused by but fixed by https://git.io/vhcLT
[23:02:34] Aaah
[23:02:36] Coolio!
[23:02:46] :)
[23:12:15] PROBLEM - Puppet errors on deployment-mira is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[23:47:48] (PS1) Reedy: Swap ObjectFactory to use composer-test-php55-package not composer-test-package [integration/config] - https://gerrit.wikimedia.org/r/436701
[23:52:58] (PS2) Reedy: Swap ObjectFactory to use composer-test-php55-package not composer-test-package [integration/config] - https://gerrit.wikimedia.org/r/436701
[23:53:02] (CR) Reedy: [C: 2] Swap ObjectFactory to use composer-test-php55-package not composer-test-package [integration/config] - https://gerrit.wikimedia.org/r/436701 (owner: Reedy)
[23:54:29] (Merged) jenkins-bot: Swap ObjectFactory to use composer-test-php55-package not composer-test-package [integration/config] - https://gerrit.wikimedia.org/r/436701 (owner: Reedy)
[23:55:00] !log Reloading Zuul to deploy https://gerrit.wikimedia.org/r/436701
[23:55:02] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL