[02:24:34] PROBLEM - Puppet staleness on deployment-kafka01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [43200.0] [03:11:43] Project mediawiki-core-code-coverage build #3002: 04FAILURE in 11 min: https://integration.wikimedia.org/ci/job/mediawiki-core-code-coverage/3002/ [03:12:49] Trying to @cover or @use not existing method "Sanitizer::urlEscapeId" [03:14:12] commented on https://gerrit.wikimedia.org/r/375099 [03:34:06] (03PS1) 10KartikMistry: Add apertium-crh, -tur and -crh-tur packages [integration/config] - 10https://gerrit.wikimedia.org/r/377391 (https://phabricator.wikimedia.org/T174765) [04:46:48] PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [10.0] [04:56:49] PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [10.0] [05:36:48] PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [10.0] [07:25:34] (03PS1) 10KartikMistry: Add apertium-cat-srd [integration/config] - 10https://gerrit.wikimedia.org/r/377408 (https://phabricator.wikimedia.org/T174987) [07:36:40] (03CR) 10Hashar: "In composer.json, you might want to add:" [tools/codesniffer] - 10https://gerrit.wikimedia.org/r/377042 (owner: 10Legoktm) [07:48:35] 10Release-Engineering-Team (Kanban), 10Operations, 10Release Pipeline: Provision Docker >= 17.05 on contint1001 - https://phabricator.wikimedia.org/T175293#3599497 (10hashar) In apt.wikimedia.org we have: | docker.io | 1.6.2~dfsg1-1~bpo8+1 | http://mirrors.wikimedia.org/debian/ jessie-backports/main amd64 P... [07:53:54] 10Release-Engineering-Team (Next), 10Release Pipeline: Secret storage on contint1001 for Docker registry password - https://phabricator.wikimedia.org/T175298#3599501 (10hashar) We have bunch of credentials in https://integration.wikimedia.org/ci/credentials/ , they can then be exported as environment variable... 
[07:55:20] hashar: Bonjour, Is scap okay today? Can I add things to SWAT? [07:55:29] (If not, how can I help?) [07:55:35] I have no clue [07:55:47] potentially yeah should have been fixed. It seems some stuff got deployed yesterday [07:56:42] okay, let me add those :) [07:58:30] (03CR) 10Hashar: [C: 032] Add apertium-crh, -tur and -crh-tur packages [integration/config] - 10https://gerrit.wikimedia.org/r/377391 (https://phabricator.wikimedia.org/T174765) (owner: 10KartikMistry) [07:58:36] (03CR) 10Hashar: [C: 032] Add apertium-cat-srd [integration/config] - 10https://gerrit.wikimedia.org/r/377408 (https://phabricator.wikimedia.org/T174987) (owner: 10KartikMistry) [07:59:29] (03Merged) 10jenkins-bot: Add apertium-crh, -tur and -crh-tur packages [integration/config] - 10https://gerrit.wikimedia.org/r/377391 (https://phabricator.wikimedia.org/T174765) (owner: 10KartikMistry) [07:59:31] (03CR) 10jenkins-bot: [V: 04-1] Add apertium-cat-srd [integration/config] - 10https://gerrit.wikimedia.org/r/377408 (https://phabricator.wikimedia.org/T174987) (owner: 10KartikMistry) [08:00:30] (03PS2) 10Hashar: Add apertium-cat-srd [integration/config] - 10https://gerrit.wikimedia.org/r/377408 (https://phabricator.wikimedia.org/T174987) (owner: 10KartikMistry) [08:02:00] (03CR) 10Hashar: [C: 032] Add apertium-cat-srd [integration/config] - 10https://gerrit.wikimedia.org/r/377408 (https://phabricator.wikimedia.org/T174987) (owner: 10KartikMistry) [08:02:56] (03Merged) 10jenkins-bot: Add apertium-cat-srd [integration/config] - 10https://gerrit.wikimedia.org/r/377408 (https://phabricator.wikimedia.org/T174987) (owner: 10KartikMistry) [08:29:54] 10Gerrit, 10Developer-Relations, 10Documentation: [[mw:Gerrit/Tutorial]] is way too much information for new contributors - https://phabricator.wikimedia.org/T161901#3599535 (10Qgil) p:05Triage>03Low Let me insist. :) >>! In T161901#3196709, @Qgil wrote: > Is https://www.mediawiki.org/wiki/Gerrit/Getti... 
[08:35:10] 10Beta-Cluster-Infrastructure: Deployment wiki is flooded by spam and should be cleaned up, perhaps even restricted more - https://phabricator.wikimedia.org/T175197#3585793 (10Sau226) Mainframe do you need or want steward rights across the entire network so you can stop spam cross-wiki? Please tell me your opini... [08:51:52] RECOVERY - Mediawiki Error Rate on graphite-labs is OK: OK: Less than 1.00% above the threshold [1.0] [09:30:13] 10Release-Engineering-Team (Kanban), 10Release Pipeline (Blubber): Package Blubber - https://phabricator.wikimedia.org/T175609#3599700 (10Joe) `dh-make-golang` is what I'd use for creating a debian package from scratch, as it will also prepare packages for any dependency (read: any library dependency that stil... [09:44:21] (03PS6) 10MarcoAurelio: Whitelist Dvorapa on Zuul CI [integration/config] - 10https://gerrit.wikimedia.org/r/375765 [09:44:37] (03CR) 10MarcoAurelio: "@hashar: Got your approval? Regards." [integration/config] - 10https://gerrit.wikimedia.org/r/375765 (owner: 10MarcoAurelio) [10:46:23] (03PS1) 10Hashar: dib: migrate out of ::packages [integration/config] - 10https://gerrit.wikimedia.org/r/377424 [10:47:12] (03CR) 10Hashar: [C: 032] "graphoid::packages , mathoid::packages, trendingedits::packages all disappeared in favor of adding the packages directly in the profile." 
[integration/config] - 10https://gerrit.wikimedia.org/r/377424 (owner: 10Hashar) [10:48:06] (03Merged) 10jenkins-bot: dib: migrate out of ::packages [integration/config] - 10https://gerrit.wikimedia.org/r/377424 (owner: 10Hashar) [10:48:56] !log nodepool: force updating jessie image to grab php5.5-luasandbox - T161882 T174972 [10:49:02] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [10:49:02] T161882: Migrate PHP5.5 jobs from Trusty to Jessie - https://phabricator.wikimedia.org/T161882 [10:49:02] T174972: Package php modules for Zend 5.5 on Jessie - https://phabricator.wikimedia.org/T174972 [10:50:51] 10Continuous-Integration-Infrastructure (phase-out-trusty), 10Release-Engineering-Team (Kanban), 10Patch-For-Review: Package php modules for Zend 5.5 on Jessie - https://phabricator.wikimedia.org/T174972#3578458 (10hashar) And I have packaged php-luasandbox. The bulk of the work is done, what is left is mayb... [11:09:04] !log Image snapshot-ci-jessie-1505213295 in wmflabs-eqiad is ready [11:09:08] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [11:26:47] (03PS1) 10Aude: Bump Wikidata to wmf/1.30.0-wmf.18 [tools/release] - 10https://gerrit.wikimedia.org/r/377431 [11:27:13] (03CR) 10Aude: [C: 032] Bump Wikidata to wmf/1.30.0-wmf.18 [tools/release] - 10https://gerrit.wikimedia.org/r/377431 (owner: 10Aude) [11:30:05] (03Merged) 10jenkins-bot: Bump Wikidata to wmf/1.30.0-wmf.18 [tools/release] - 10https://gerrit.wikimedia.org/r/377431 (owner: 10Aude) [11:55:35] (03CR) 10Phedenskog: Run WebPageTest tests from Asia to verify the new cache pop. (031 comment) [integration/config] - 10https://gerrit.wikimedia.org/r/362972 (https://phabricator.wikimedia.org/T168416) (owner: 10Phedenskog) [12:08:13] (03PS5) 10Phedenskog: Run WebPageTest tests from Asia to verify the new cache pop. 
[integration/config] - 10https://gerrit.wikimedia.org/r/362972 (https://phabricator.wikimedia.org/T168416) [12:09:58] (03PS6) 10Phedenskog: Run WebPageTest tests from Asia to verify the new cache pop. [integration/config] - 10https://gerrit.wikimedia.org/r/362972 (https://phabricator.wikimedia.org/T168416) [12:32:33] (03CR) 10Hashar: [C: 032] Migrate mwext-textextension-php55 to jessie [integration/config] - 10https://gerrit.wikimedia.org/r/377448 (https://phabricator.wikimedia.org/T161882) (owner: 10Hashar) [12:33:33] (03Merged) 10jenkins-bot: Migrate mwext-textextension-php55 to jessie [integration/config] - 10https://gerrit.wikimedia.org/r/377448 (https://phabricator.wikimedia.org/T161882) (owner: 10Hashar) [12:39:15] (03PS1) 10Hashar: Strip '-jessie' prefix from a Zuul template [integration/config] - 10https://gerrit.wikimedia.org/r/377450 [12:43:51] (03CR) 10Hashar: [C: 032] "Noop in Zuul :]" [integration/config] - 10https://gerrit.wikimedia.org/r/377450 (owner: 10Hashar) [12:44:04] (03PS1) 10Hashar: mwext-testextension-php55 to jessie [integration/config] - 10https://gerrit.wikimedia.org/r/377452 (https://phabricator.wikimedia.org/T161882) [12:45:41] (03Merged) 10jenkins-bot: Strip '-jessie' prefix from a Zuul template [integration/config] - 10https://gerrit.wikimedia.org/r/377450 (owner: 10Hashar) [12:46:29] (03CR) 10Hashar: [C: 032] mwext-testextension-php55 to jessie [integration/config] - 10https://gerrit.wikimedia.org/r/377452 (https://phabricator.wikimedia.org/T161882) (owner: 10Hashar) [12:47:26] (03Merged) 10jenkins-bot: mwext-testextension-php55 to jessie [integration/config] - 10https://gerrit.wikimedia.org/r/377452 (https://phabricator.wikimedia.org/T161882) (owner: 10Hashar) [12:51:33] (03PS1) 10Hashar: Remove some trusty jobs [integration/config] - 10https://gerrit.wikimedia.org/r/377453 (https://phabricator.wikimedia.org/T161882) [12:52:17] hashar: IS there a task for the infra/jenkins issue? Is that also the reason for no wmf branch this week? 
[12:52:29] (per wikitech:Deployments) [12:52:43] Krinkle: which issue? [12:53:25] hoaoo [12:53:29] hashar: Deployments cancelled due to "Jenkins/infra issue" [12:53:40] scap was not able to ssh to the mediawiki app servers due to some ssh config [12:53:46] but that one should be solved [12:53:47] solve [12:53:52] Okay [12:53:58] So wmf.18 will still happen? [12:54:02] for the evening swat, I have no idea [12:54:07] yeah wmf.18 should happen [12:54:22] we cut the branch directly in Gerrit [12:54:53] Oh, right. Tuesday hasn't happened yet in SF [12:54:59] Okay, I misread [12:55:00] Thanks :) [12:56:11] Krinkle: I made a task for that and greg-g marked it as resolved [12:56:37] btw Krinkle now that you're here, I wonder if you could review some centralauth patches (2) ? [12:59:20] (03PS1) 10Hashar: mwext-Wikibase-client-tests-mysql-php55 to Jessie [integration/config] - 10https://gerrit.wikimedia.org/r/377457 (https://phabricator.wikimedia.org/T161882) [13:01:29] (03CR) 10Hashar: [C: 04-2] "Pending tests on https://gerrit.wikimedia.org/r/#/c/70395/19" [integration/config] - 10https://gerrit.wikimedia.org/r/377457 (https://phabricator.wikimedia.org/T161882) (owner: 10Hashar) [13:09:28] !log nodepool: deleting alien instance: openstack server delete ci-jessie-wikimedia-815477 [13:09:32] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [13:11:35] 10Release-Engineering-Team (Watching / External), 10Operations, 10monitoring, 10Tracking, 10Wikimedia-Incident: Tracking: Monitoring and alerts for "business" metrics - https://phabricator.wikimedia.org/T140942#3600174 (10Gilles) [13:44:39] (03CR) 10Hashar: [C: 032] "Tested and each flavor has:" [integration/config] - 10https://gerrit.wikimedia.org/r/377457 (https://phabricator.wikimedia.org/T161882) (owner: 10Hashar) [13:44:47] (03CR) 10Hashar: [C: 032] Remove some trusty jobs [integration/config] - 
10https://gerrit.wikimedia.org/r/377453 (https://phabricator.wikimedia.org/T161882) (owner: 10Hashar) [13:46:42] (03Merged) 10jenkins-bot: Remove some trusty jobs [integration/config] - 10https://gerrit.wikimedia.org/r/377453 (https://phabricator.wikimedia.org/T161882) (owner: 10Hashar) [13:46:45] (03Merged) 10jenkins-bot: mwext-Wikibase-client-tests-mysql-php55 to Jessie [integration/config] - 10https://gerrit.wikimedia.org/r/377457 (https://phabricator.wikimedia.org/T161882) (owner: 10Hashar) [13:47:10] Yippee, build fixed! [13:47:11] Project selenium-VisualEditor » firefox,beta,Linux,BrowserTests build #520: 09FIXED in 3 min 9 sec: https://integration.wikimedia.org/ci/job/selenium-VisualEditor/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/520/ [13:52:10] 10Continuous-Integration-Infrastructure (phase-out-trusty), 10Release-Engineering-Team (Kanban), 10Patch-For-Review: Migrate PHP5.5 jobs from Trusty to Jessie - https://phabricator.wikimedia.org/T161882#3600334 (10hashar) We no more have any jobs on nodepool Trusty instances (label: ci-trusty-wikimedia) `\o/` [13:56:33] hashar :) [13:56:42] does that mean we can go to java 8 now? :) [13:56:53] PROBLEM - Puppet errors on deployment-mira is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [14:00:41] paladox: there are still some jobs running on the permanent Trusty slaves [14:00:47] oh [14:00:59] hashar: any news on the issue we had with eu swat yesterday? 
[14:01:38] Zppix: yes, Pacific Electric didn't receive WMF's paycheck so they cut the power supplies :P [14:01:59] tabbycat: i feel sarcasm [14:02:21] yes: the real issue was scap not being able to ssh to some nodes [14:02:24] Zppix i think they fixed that [14:02:26] at least on EU SWAT [14:02:32] was due to a puppet change for scap [14:02:38] on evening swat, I think jenkins had some issues [14:02:46] jenkins stalled [14:02:54] in any case, congrats to those who helped resolve the issue [14:03:02] paladox: so i can reschedule my patch? [14:03:22] Zppix i think so. [14:03:47] ok morning swat it is [14:04:38] (03PS1) 10Hashar: Move some jobs from Trusty to Jessie [integration/config] - 10https://gerrit.wikimedia.org/r/377467 (https://phabricator.wikimedia.org/T161882) [14:05:18] !log Deleting integration-slave-trusty-1004 - T161882 [14:05:22] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [14:05:22] T161882: Migrate PHP5.5 jobs from Trusty to Jessie - https://phabricator.wikimedia.org/T161882 [14:08:53] (03CR) 10Hashar: [C: 032] Move some jobs from Trusty to Jessie [integration/config] - 10https://gerrit.wikimedia.org/r/377467 (https://phabricator.wikimedia.org/T161882) (owner: 10Hashar) [14:09:50] PROBLEM - Host integration-slave-trusty-1004 is DOWN: CRITICAL - Host Unreachable (10.68.23.143) [14:10:01] (03Merged) 10jenkins-bot: Move some jobs from Trusty to Jessie [integration/config] - 10https://gerrit.wikimedia.org/r/377467 (https://phabricator.wikimedia.org/T161882) (owner: 10Hashar) [14:13:13] hashar i wonder can this https://github.com/wikimedia/integration-config/blob/ec07802d8bfd44294411cda543fcf1887339c372/jjb/php.yaml#L35 be moved to nodepool? 
[14:22:30] 10Continuous-Integration-Infrastructure (phase-out-trusty), 10Release-Engineering-Team (Kanban): Remove trusty from dib and clean out puppet - https://phabricator.wikimedia.org/T175696#3600481 (10hashar) [14:22:44] 10Continuous-Integration-Infrastructure (phase-out-trusty), 10Release-Engineering-Team (Kanban): Remove trusty from dib and clean out puppet - https://phabricator.wikimedia.org/T175696#3600497 (10hashar) [14:22:47] 10Continuous-Integration-Infrastructure (phase-out-trusty), 10Release-Engineering-Team (Kanban), 10Patch-For-Review: Migrate PHP5.5 jobs from Trusty to Jessie - https://phabricator.wikimedia.org/T161882#3600498 (10hashar) [14:22:53] 10Continuous-Integration-Infrastructure (phase-out-trusty), 10Release-Engineering-Team (Kanban): Remove trusty from dib and clean out puppet - https://phabricator.wikimedia.org/T175696#3600481 (10hashar) a:03hashar [14:23:49] (03PS1) 10Hashar: dib: remove Ubuntu Trusty [integration/config] - 10https://gerrit.wikimedia.org/r/377475 (https://phabricator.wikimedia.org/T175696) [14:24:24] 10Continuous-Integration-Infrastructure (phase-out-trusty), 10Release-Engineering-Team (Kanban), 10Patch-For-Review: Remove trusty from Nodepool and clean out puppet - https://phabricator.wikimedia.org/T175696#3600507 (10hashar) [14:27:39] (03CR) 10Hashar: [C: 032] dib: remove Ubuntu Trusty [integration/config] - 10https://gerrit.wikimedia.org/r/377475 (https://phabricator.wikimedia.org/T175696) (owner: 10Hashar) [14:30:29] (03Merged) 10jenkins-bot: dib: remove Ubuntu Trusty [integration/config] - 10https://gerrit.wikimedia.org/r/377475 (https://phabricator.wikimedia.org/T175696) (owner: 10Hashar) [14:30:53] (03PS1) 10Hashar: Migrate composer-validate jobs to Jessie [integration/config] - 10https://gerrit.wikimedia.org/r/377477 (https://phabricator.wikimedia.org/T161882) [14:31:52] RECOVERY - Puppet errors on deployment-mira is OK: OK: Less than 1.00% above the threshold [0.0] [14:33:51] Yippee, build fixed! 
[14:33:52] Project selenium-WikiLove » firefox,beta,Linux,BrowserTests build #514: 09FIXED in 1 min 50 sec: https://integration.wikimedia.org/ci/job/selenium-WikiLove/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/514/ [14:34:15] (03CR) 10Hashar: [C: 032] Migrate composer-validate jobs to Jessie [integration/config] - 10https://gerrit.wikimedia.org/r/377477 (https://phabricator.wikimedia.org/T161882) (owner: 10Hashar) [14:35:23] (03Merged) 10jenkins-bot: Migrate composer-validate jobs to Jessie [integration/config] - 10https://gerrit.wikimedia.org/r/377477 (https://phabricator.wikimedia.org/T161882) (owner: 10Hashar) [14:38:45] PROBLEM - Puppet errors on integration-slave-jessie-1003 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [14:39:09] PROBLEM - Puppet errors on integration-slave-jessie-1004 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [14:39:19] (03PS1) 10Hashar: integration-composer-check-php55 to Nodepool and Jessie [integration/config] - 10https://gerrit.wikimedia.org/r/377478 (https://phabricator.wikimedia.org/T161882) [14:40:17] (03CR) 10Hashar: [C: 032] integration-composer-check-php55 to Nodepool and Jessie [integration/config] - 10https://gerrit.wikimedia.org/r/377478 (https://phabricator.wikimedia.org/T161882) (owner: 10Hashar) [14:41:26] (03Merged) 10jenkins-bot: integration-composer-check-php55 to Nodepool and Jessie [integration/config] - 10https://gerrit.wikimedia.org/r/377478 (https://phabricator.wikimedia.org/T161882) (owner: 10Hashar) [14:46:39] !log provisionning integration-slave-jessie-1003 and integration-slave-jessie-1004 to move php55lint to them. 
NOT READY YET - T161882 [14:46:45] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [14:46:45] T161882: Migrate PHP5.5 jobs from Trusty to Jessie - https://phabricator.wikimedia.org/T161882 [14:47:24] 10Continuous-Integration-Infrastructure (phase-out-trusty), 10Release-Engineering-Team (Kanban), 10Patch-For-Review: Migrate PHP5.5 jobs from Trusty to Jessie - https://phabricator.wikimedia.org/T161882#3600613 (10hashar) Still have to switch the jobs using the label phpflavor-php55 labs-tools-ZppixBot-php5... [14:48:17] 10Continuous-Integration-Infrastructure (phase-out-trusty), 10Release-Engineering-Team (Kanban), 10Patch-For-Review: Migrate PHP5.5 jobs from Trusty to Jessie - https://phabricator.wikimedia.org/T161882#3600615 (10hashar) [14:48:49] The last jobs to migrate are the php55lint ones: labs-tools-ZppixBot-php55lint , mediawiki-core-php55lint , mwgate-php55lint , operations-mw-config-php55lint , php55lint [14:49:03] which potentially can later be moved to Docker :] [14:51:12] you rang? [14:51:15] oh nevermind [14:57:47] !log Deleted all left over jenkins jobs having ci-trusty-wikimedia label. - T161882 [14:57:51] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [14:57:51] T161882: Migrate PHP5.5 jobs from Trusty to Jessie - https://phabricator.wikimedia.org/T161882 [15:00:09] 10Continuous-Integration-Infrastructure (phase-out-trusty), 10Release-Engineering-Team (Kanban), 10Patch-For-Review: Migrate PHP5.5 jobs from Trusty to Jessie - https://phabricator.wikimedia.org/T161882#3600691 (10hashar) And I have deleted some left over jobs that had the `ci-trusty-wikimedia` label althoug... 
[15:04:09] RECOVERY - Puppet errors on integration-slave-jessie-1004 is OK: OK: Less than 1.00% above the threshold [0.0] [15:06:59] PROBLEM - Puppet errors on deployment-kafka-jumbo-2 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [15:07:25] PROBLEM - Puppet errors on deployment-kafka-jumbo-1 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [15:08:46] RECOVERY - Puppet errors on integration-slave-jessie-1003 is OK: OK: Less than 1.00% above the threshold [0.0] [15:09:18] 10Continuous-Integration-Infrastructure (phase-out-trusty), 10Release-Engineering-Team (Kanban), 10Patch-For-Review: Package php modules for Zend 5.5 on Jessie - https://phabricator.wikimedia.org/T174972#3600734 (10hashar) https://gerrit.wikimedia.org/r/#/c/377469/2/modules/aptrepo/files/distributions-wikime... [15:15:12] idle & [15:16:21] Project mediawiki-core-code-coverage build #3003: 04STILL FAILING in 16 min: https://integration.wikimedia.org/ci/job/mediawiki-core-code-coverage/3003/ [15:39:18] PROBLEM - Puppet errors on integration-slave-docker-1001 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [16:13:08] PROBLEM - Puppet errors on deployment-mathoid is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [16:19:39] PROBLEM - Puppet errors on deployment-kafka05 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [16:22:30] PROBLEM - Puppet errors on deployment-sca02 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [16:22:50] PROBLEM - Puppet errors on deployment-cache-text04 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [16:27:29] yarrrr [16:27:33] deployment-puppetmaster02 not rebasing well [16:27:48] error: could not apply 9974c0f79a... 
swift: use implicit /dev/swift prefix for swift devices [16:27:49] Recorded preimage for 'modules/swift/manifests/init_device.pp' [16:27:49] Could not pick 9974c0f79ab7c8a1474849f91a616d656cde859d [16:27:49] Rebase failed! See error messages above. [16:27:49] Reverting rebase attempt [16:28:35] ah that's godog [16:28:37] ^ [16:28:54] godog: i think you have a patch applied in deployment prep that won't rebase anymore [16:29:13] yeah I'll poke at it tomorrow, gehel was mentioning it too, sorry today I can't [16:29:41] hm, ok, it means that puppetmaster in deployment prep can't update puppet, can we get rid of it from there? [16:30:20] godog: ^? [16:30:45] PROBLEM - Puppet errors on deployment-kafka03 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [16:30:46] un-cherry pick it? [16:31:03] PROBLEM - Puppet errors on deployment-trending01 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [16:32:46] PROBLEM - Puppet errors on deployment-eventlog02 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [16:32:52] ottomata: I'm doing puppet swat now, you can un cherry-pick it, please watch ms-be and ms-fe if they break [16:33:04] in deployment-prep? 
[16:33:30] PROBLEM - Puppet errors on deployment-eventlogging04 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [16:33:42] these are me ^ [16:33:48] will fix asap, all the sudden a meeting started too [16:33:48] ahhh [16:34:38] yeah in deployment-prep ottomata [16:39:44] PROBLEM - Puppet errors on deployment-kafka04 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [16:40:08] PROBLEM - Puppet errors on deployment-cache-upload04 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [16:42:27] PROBLEM - Puppet errors on deployment-changeprop is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [16:44:03] PROBLEM - Puppet errors on deployment-sca03 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [16:45:21] PROBLEM - Puppet errors on deployment-logstash2 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [16:45:26] PROBLEM - Puppet errors on deployment-sca01 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [16:53:07] RECOVERY - Puppet errors on deployment-mathoid is OK: OK: Less than 1.00% above the threshold [0.0] [17:14:19] RECOVERY - Puppet errors on integration-slave-docker-1001 is OK: OK: Less than 1.00% above the threshold [0.0] [17:25:50] 10Beta-Cluster-Infrastructure: Deployment wiki is flooded by spam and should be cleaned up, perhaps even restricted more - https://phabricator.wikimedia.org/T175197#3601220 (10Mainframe98) While it would be nice, I don't see an immediate need for it. Granted, I'm only looking at deployment wiki for now, so I don... [17:30:19] 10Gerrit, 10Patch-For-Review, 10User-Ladsgroup: Make gerrit use the new WMF logo - https://phabricator.wikimedia.org/T174576#3601241 (10greg) 05Open>03Resolved Merged, deployed, done. 
[17:45:07] RECOVERY - Puppet errors on deployment-cache-upload04 is OK: OK: Less than 1.00% above the threshold [0.0] [17:47:46] RECOVERY - Puppet errors on deployment-eventlog02 is OK: OK: Less than 1.00% above the threshold [0.0] [17:48:31] RECOVERY - Puppet errors on deployment-eventlogging04 is OK: OK: Less than 1.00% above the threshold [0.0] [17:49:03] RECOVERY - Puppet errors on deployment-sca03 is OK: OK: Less than 1.00% above the threshold [0.0] [17:49:39] RECOVERY - Puppet errors on deployment-kafka05 is OK: OK: Less than 1.00% above the threshold [0.0] [17:49:45] RECOVERY - Puppet errors on deployment-kafka04 is OK: OK: Less than 1.00% above the threshold [0.0] [17:50:22] RECOVERY - Puppet errors on deployment-logstash2 is OK: OK: Less than 1.00% above the threshold [0.0] [17:50:27] RECOVERY - Puppet errors on deployment-sca01 is OK: OK: Less than 1.00% above the threshold [0.0] [17:50:45] RECOVERY - Puppet errors on deployment-kafka03 is OK: OK: Less than 1.00% above the threshold [0.0] [17:51:02] RECOVERY - Puppet errors on deployment-trending01 is OK: OK: Less than 1.00% above the threshold [0.0] [17:52:26] RECOVERY - Puppet errors on deployment-changeprop is OK: OK: Less than 1.00% above the threshold [0.0] [17:57:27] RECOVERY - Puppet errors on deployment-sca02 is OK: OK: Less than 1.00% above the threshold [0.0] [17:57:50] RECOVERY - Puppet errors on deployment-cache-text04 is OK: OK: Less than 1.00% above the threshold [0.0] [18:17:16] PROBLEM - Free space - all mounts on deployment-tin is CRITICAL: CRITICAL: deployment-prep.deployment-tin.diskspace._mnt.byte_percentfree (No valid datapoints found)deployment-prep.deployment-tin.diskspace._srv.byte_percentfree (<10.00%) [18:21:01] uh oh [18:21:10] I'm migrating locations, bbiab [18:21:53] hasharAway: thcipriani fyi that trusty removal landed in nodepool now [18:25:26] chasemp: ah, cool, that should make nodepool's logic a bit simpler. Thanks! 
[18:32:01] lol animoji [18:32:03] woops [18:32:06] wrong place [18:39:48] FYI godog, i rebased and resolved conflicts, hopefully correctly. [18:43:27] PROBLEM - Puppet errors on deployment-changeprop is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [18:49:43] !log nodepool: deleted image image-ci-trusty_old_20170804 Keeping image-ci-trusty just in case [18:49:47] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [19:00:04] 10Deployment-Systems, 10Tracking: Trebuchet blockers for MediaWiki (tracking) - https://phabricator.wikimedia.org/T45338#3601699 (10demon) [19:02:25] RECOVERY - Puppet errors on deployment-kafka-jumbo-1 is OK: OK: Less than 1.00% above the threshold [0.0] [19:03:55] 10Deployment-Systems, 10Release-Engineering-Team (Backlog), 10Operations: Trebuchet targets for test/testrepo are out of date - https://phabricator.wikimedia.org/T149180#3601711 (10demon) 05Open>03declined Nobody cares anymore. [19:04:44] 10Beta-Cluster-Infrastructure, 10Continuous-Integration-Infrastructure: Ensure /srv/deployment/integration/slave-scripts is latest master on deployment-tin - https://phabricator.wikimedia.org/T97324#3601714 (10demon) 05Open>03declined Nobody cares anymore. [19:05:15] 10Scap (Scap3-Adoption-Phase1), 10Wikimedia-Wikimania-Scholarships, 10Patch-For-Review, 10User-bd808: Deploy scholarships with scap3 - https://phabricator.wikimedia.org/T129134#3601716 (10mmodell) {icon check color=green} Done? 
[19:06:56] RECOVERY - Puppet errors on deployment-kafka-jumbo-2 is OK: OK: Less than 1.00% above the threshold [0.0] [19:09:27] Project selenium-MinervaNeue » chrome,beta,Linux,BrowserTests build #115: 04FAILURE in 20 min: https://integration.wikimedia.org/ci/job/selenium-MinervaNeue/BROWSER=chrome,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/115/ [19:10:12] PROBLEM - Puppet errors on integration-slave-jessie-1004 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [19:18:27] RECOVERY - Puppet errors on deployment-changeprop is OK: OK: Less than 1.00% above the threshold [0.0] [19:20:24] PROBLEM - Puppet errors on integration-slave-jessie-1002 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [19:24:44] PROBLEM - Puppet errors on integration-slave-jessie-1003 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [19:25:28] PROBLEM - Puppet errors on integration-slave-jessie-1001 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [19:37:49] no_justification, sorry for ping just wanted to say i think in gerrit 2.15, it will upgrade the mysql connector which i did as it does java 8 https://gerrit-review.googlesource.com/#/c/gerrit/+/107230/ to 6.x. which means we wont be able to use the debian repo. 
[19:40:05] !log hacked integration-slave-jessie hosts to ship them php5.5 [19:40:09] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [19:49:47] RECOVERY - Puppet errors on integration-slave-jessie-1003 is OK: OK: Less than 1.00% above the threshold [0.0] [19:50:09] RECOVERY - Puppet errors on integration-slave-jessie-1004 is OK: OK: Less than 1.00% above the threshold [0.0] [19:50:23] RECOVERY - Puppet errors on integration-slave-jessie-1002 is OK: OK: Less than 1.00% above the threshold [0.0] [19:50:25] RECOVERY - Puppet errors on integration-slave-jessie-1001 is OK: OK: Less than 1.00% above the threshold [0.0] [20:16:05] k puppet is happy [20:20:24] hashar: Happy might not be the right word. Puppet is never /happy/ :p [20:20:33] Maybe just: "not currently raging in anger?" [20:20:34] Hehehehe [20:21:48] (03PS1) 10Hashar: Migrate phplint jobs to Jessie [integration/config] - 10https://gerrit.wikimedia.org/r/377556 (https://phabricator.wikimedia.org/T161882) [20:30:59] (03PS2) 10Hashar: Migrate phplint jobs to Jessie [integration/config] - 10https://gerrit.wikimedia.org/r/377556 (https://phabricator.wikimedia.org/T161882) [20:32:22] (03CR) 10Hashar: [C: 032] Migrate phplint jobs to Jessie [integration/config] - 10https://gerrit.wikimedia.org/r/377556 (https://phabricator.wikimedia.org/T161882) (owner: 10Hashar) [20:33:30] (03Merged) 10jenkins-bot: Migrate phplint jobs to Jessie [integration/config] - 10https://gerrit.wikimedia.org/r/377556 (https://phabricator.wikimedia.org/T161882) (owner: 10Hashar) [20:35:45] !log pooling integration-slave-jessie-1003 and integration-slave-jessie-1004 [20:35:48] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [20:37:06] i get everything to work with soy, then it gets stuck on the tests. [20:37:22] they should create phpunits for java, i really want to ignore some things. 
[20:38:22] I am a job away from phasing out trusty entirely :] [20:38:31] :) [20:38:36] what job are you left on? [20:38:38] :) [20:38:45] php-compile-php55 [20:38:55] will deal with it tomorrow [20:39:06] ok [20:40:30] 10Continuous-Integration-Infrastructure (phase-out-trusty), 10Release-Engineering-Team (Kanban), 10Patch-For-Review: Migrate PHP5.5 jobs from Trusty to Jessie - https://phabricator.wikimedia.org/T161882#3602189 (10hashar) php-compile-php55 is the last job still on Trusty. [20:41:12] paladox: and the Nodepool trusty instances are gone :] [20:41:20] :) [20:41:20] :) [20:41:27] so as soon as I manage to migrate that last job, that means that every single CI slave will be running Jessie [20:41:49] hashar that means we can go onto java 8 and jenkins 2.60 soon :) [20:42:24] yeahhhh :] [20:42:45] all since you mentionned it now requires java8 which is not on trusty [20:42:51] that was a good incentive to phase out trusty [20:43:01] at the price of spending a couple days figuring out how to package php5.5 for jessie [20:43:01] yeh [20:43:05] which is more or less done [20:43:08] :) [20:43:19] the port is terribly dirty though [20:43:23] but the tests pass [20:43:36] I even found a bug in mediawiki/core :] [20:43:56] oh [20:44:01] https://gerrit.wikimedia.org/r/#/c/377232/ [20:44:15] on some edge case an exception is thrown, which prevent some state from being restored [20:44:18] :) [20:44:31] that potentially cause later tests to fail [20:44:32] ! 
[20:44:44] tldr [20:44:46] I got a fatal error [20:47:17] PROBLEM - Free space - all mounts on deployment-tin is CRITICAL: CRITICAL: deployment-prep.deployment-tin.diskspace._mnt.byte_percentfree (No valid datapoints found)deployment-prep.deployment-tin.diskspace._srv.byte_percentfree (<11.11%) [20:47:41] thcipriani: deployment-tin, we should probably rebuild it with more disk space [20:47:48] twentyafterfour was hinting at that iirc [20:48:07] Scap, Analytics-Kanban, EventBus, Patch-For-Review, User-Elukey: eventlogging-service-eventbus scap deployments should depool/pool during deployment - https://phabricator.wikimedia.org/T171506#3602262 (Nuria) Open>Resolved [20:48:31] ah yeah [20:48:32] https://phabricator.wikimedia.org/T166492 [20:48:35] m1.large [20:48:48] IF ONLY WE COULD JUST CHANGE THE FLAVOR [20:50:11] yeah, larger disk would be nicer on that machine. we're definitely not very careful about disk space on deployment hosts. [20:50:35] well [20:50:51] removed the /srv/mediawiki/.git dir on deployment-tin since it will be regenerated and it's not used in deployment. [20:51:03] a few TBytes of disk in RAID is cheaper than us trying to figure out a way to save disk? :D [20:51:07] so /srv is at 57% usage now [20:51:16] !!! [20:51:53] need to figure out how to manage that repo better before it's ready for prime time. We've got it turned off behind a feature flag in production. [21:02:15] RECOVERY - Free space - all mounts on deployment-tin is OK: OK: deployment-prep.deployment-tin.diskspace._mnt.byte_percentfree (No valid datapoints found) [21:36:17] sleep & [21:37:53] O/ hashar [21:57:58] so fun thing ... building a new mwv box. Tried multiple times but each time it tries to clone core it dies. Ran it manually, same thing. it gets as far as "remote: Compressing objects: 99% (676275/676276)" before "error: RPC failed; curl 56 GnuTLS recv error (-110): The TLS connection was non-properly terminated.". 
If anyone has ideas before i blow the next hour figuring out what it's [21:58:04] doing .. [22:08:43] Release-Engineering-Team (Kanban): wmf.14 Blocker - Post Mortem - Cannot flush pre-lock snapshot because writes are pending - https://phabricator.wikimedia.org/T173477#3602431 (Jrbranaa) Open>Resolved Hello @aaron, Per our chat. I'd like to get some more info from you if possible. We could do it a... [22:09:32] Release-Engineering-Team (Kanban): wmf.14 Blocker - Post Mortem - Cannot flush pre-lock snapshot because writes are pending - https://phabricator.wikimedia.org/T173477#3602438 (Jrbranaa) [22:24:59] ebernhardson: there's an open bug about that. it's some form of git + RNG hell [23:23:24] Continuous-Integration-Infrastructure: integration-slave-jessie-1003 (and others?) missing jsduck executable - https://phabricator.wikimedia.org/T175764#3602645 (Jdforrester-WMF) [23:24:52] Continuous-Integration-Infrastructure: integration-slave-jessie-1003 (and others?) missing jsduck executable - https://phabricator.wikimedia.org/T175764#3602659 (Jdforrester-WMF) Also integration-slave-jessie-1002 in https://integration.wikimedia.org/ci/job/mwext-VisualEditor-jsduck/6494/console [23:25:41] Continuous-Integration-Infrastructure: integration-slave-jessie-1003 (and others?) missing jsduck executable - https://phabricator.wikimedia.org/T175764#3602663 (Jdforrester-WMF) Aha. Last successful pass was on integration-slave-*trusty*-1001 with https://integration.wikimedia.org/ci/job/mwext-VisualEditor-...