[00:00:11] maintenance-disconnect-full-disks build 19720 integration-slave-docker-1021: OFFLINE due to disk space
[00:25:11] maintenance-disconnect-full-disks build 19725 integration-slave-docker-1021: OFFLINE due to disk space
[00:50:11] maintenance-disconnect-full-disks build 19730 integration-slave-docker-1021: OFFLINE due to disk space
[01:15:11] maintenance-disconnect-full-disks build 19735 integration-slave-docker-1021: OFFLINE due to disk space
[01:40:13] maintenance-disconnect-full-disks build 19740 integration-slave-docker-1021: OFFLINE due to disk space
[02:05:10] maintenance-disconnect-full-disks build 19745 integration-slave-docker-1021: OFFLINE due to disk space
[02:12:03] PROBLEM - Puppet staleness on deployment-maps05 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [43200.0]
[02:30:10] maintenance-disconnect-full-disks build 19750 integration-slave-docker-1021: OFFLINE due to disk space
[02:55:11] maintenance-disconnect-full-disks build 19755 integration-slave-docker-1021: OFFLINE due to disk space
[03:20:10] maintenance-disconnect-full-disks build 19760 integration-slave-docker-1021: OFFLINE due to disk space
[03:45:10] maintenance-disconnect-full-disks build 19765 integration-slave-docker-1021: OFFLINE due to disk space
[04:10:11] maintenance-disconnect-full-disks build 19770 integration-slave-docker-1021: OFFLINE due to disk space
[04:35:10] maintenance-disconnect-full-disks build 19775 integration-slave-docker-1021: OFFLINE due to disk space
[05:00:09] maintenance-disconnect-full-disks build 19780 integration-slave-docker-1021: OFFLINE due to disk space
[05:25:11] maintenance-disconnect-full-disks build 19785 integration-slave-docker-1021: OFFLINE due to disk space
[05:50:10] maintenance-disconnect-full-disks build 19790 integration-slave-docker-1021: OFFLINE due to disk space
[06:15:10] maintenance-disconnect-full-disks build 19795 integration-slave-docker-1021: OFFLINE due to disk space
[06:40:10] maintenance-disconnect-full-disks build 19800 integration-slave-docker-1021: OFFLINE due to disk space
[07:05:11] maintenance-disconnect-full-disks build 19805 integration-slave-docker-1021: OFFLINE due to disk space
[07:13:14] 10Project-Admins: Phabricator project for an extension LangCodeOverride - https://phabricator.wikimedia.org/T209231 (10Aklapper) https://www.mediawiki.org/wiki/Extension:LangCodeOverride states that the code is located at https://github.com/jeblad/LangCodeOverride/ and that page already offers an "Issues" tab fo...
[07:30:10] maintenance-disconnect-full-disks build 19810 integration-slave-docker-1021: OFFLINE due to disk space
[07:55:10] maintenance-disconnect-full-disks build 19815 integration-slave-docker-1021: OFFLINE due to disk space
[08:20:09] maintenance-disconnect-full-disks build 19820 integration-slave-docker-1021: OFFLINE due to disk space
[08:28:54] 10MediaWiki-Releasing, 10MediaWiki-Installer, 10Epic, 10MW-1.33-release: Expand the set of bundled extensions and skins in MediaWiki 1.33 - https://phabricator.wikimedia.org/T209220 (10Catrope)
[08:45:11] maintenance-disconnect-full-disks build 19825 integration-slave-docker-1021: OFFLINE due to disk space
[09:10:10] maintenance-disconnect-full-disks build 19830 integration-slave-docker-1021: OFFLINE due to disk space
[09:12:11] thats lots of nodes :P
[09:16:05] (03PS1) 10Hashar: PhpTags: enable php7.0, disable php7.1 [integration/config] - 10https://gerrit.wikimedia.org/r/473163 (https://phabricator.wikimedia.org/T188585)
[09:17:23] (03PS2) 10Hashar: PhpTags: enable php7.0, disable php7.1 [integration/config] - 10https://gerrit.wikimedia.org/r/473163 (https://phabricator.wikimedia.org/T188585)
[09:17:41] oh no.. i was looking at the build number :P
[09:19:08] (03CR) 10jerkins-bot: [V: 04-1] PhpTags: enable php7.0, disable php7.1 [integration/config] - 10https://gerrit.wikimedia.org/r/473163 (https://phabricator.wikimedia.org/T188585) (owner: 10Hashar)
[09:26:01] addshore: I will clean up that docker slave :)
[09:26:08] already doing it ;)
[09:26:10] i beat you ;)
[09:26:22] {done}
[09:26:27] great
[09:26:36] I guess it will automagically come back online?
[09:26:41] I have no idea
[09:26:56] lets see :D
[09:26:57] !log integration-slave-docker-1021:/# docker rmi $(docker images | grep " months " |grep -v " [1-5] months " | awk '{print $3}')
[09:26:59] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[09:27:24] I usually also prune container and images via: sudo docker container prune -f; sudo docker image prune -f
[09:27:36] that probably gets rid of all images though no?
[09:27:40] we also do not garbage collect images that are no more used
[09:27:41] so yeah
[09:28:00] get rid of all of them :) they will be redownloaded as jobs run on it
[09:28:15] yeh, but i guess there isn't any point in getting rid of the very recent ones :P
[09:28:32] tox for example is 4 months ago
[09:28:35] sudo docker images|awk '{ print $3 }'|xargs docker rmi
[09:28:36] :)
[09:28:44] hah
[09:29:00] releng/tox is old since we probably haven't refreshed the container image since then
[09:29:32] !log integration-slave-docker-1021:/# docker rmi $(docker images | grep " months " |grep -v " [1-2] months " | awk '{print $3}')
[09:29:33] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[09:29:38] okay, i just left 2 months of images ;)
[09:30:44] (03PS3) 10Hashar: PhpTags: enable php7.0, disable php7.1 [integration/config] - 10https://gerrit.wikimedia.org/r/473163 (https://phabricator.wikimedia.org/T188585)
[09:31:04] (03CR) 10Hashar: "PS2 was wrong" [integration/config] - 10https://gerrit.wikimedia.org/r/473163 (https://phabricator.wikimedia.org/T188585) (owner: 10Hashar)
[09:31:43] !log manually brought integration-slave-docker-1021 back online
[09:31:45] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[09:32:23] (03CR) 10jerkins-bot: [V: 04-1] PhpTags: enable php7.0, disable php7.1 [integration/config] - 10https://gerrit.wikimedia.org/r/473163 (https://phabricator.wikimedia.org/T188585) (owner: 10Hashar)
[09:33:39] (03PS4) 10Hashar: PhpTags: enable php7.0, disable php7.1 [integration/config] - 10https://gerrit.wikimedia.org/r/473163 (https://phabricator.wikimedia.org/T188585)
[09:35:49] (03CR) 10Hashar: [C: 032] "Finally. Sorry for the spam." [integration/config] - 10https://gerrit.wikimedia.org/r/473163 (https://phabricator.wikimedia.org/T188585) (owner: 10Hashar)
[09:37:35] (03Merged) 10jenkins-bot: PhpTags: enable php7.0, disable php7.1 [integration/config] - 10https://gerrit.wikimedia.org/r/473163 (https://phabricator.wikimedia.org/T188585) (owner: 10Hashar)
[09:41:42] addshore: the slave is back online. I guess Jenkins noticed some disk space went free
[09:52:38] hashar: nah i manually poked it ;)
[09:53:00] it looked like it checked every 25 mins though, so I guess it would have come back
[10:08:08] 10Gerrit, 10GitHub-Mirrors, 10Release Pipeline (Blubber): https://gerrit.wikimedia.org/r/#/projects/blubber should be mirrored to github - https://phabricator.wikimedia.org/T209280 (10hashar)
[10:09:56] 10Gerrit, 10GitHub-Mirrors, 10Release Pipeline (Blubber): https://gerrit.wikimedia.org/r/#/projects/blubber should be mirrored to github - https://phabricator.wikimedia.org/T209280 (10hashar)
[10:12:14] 10Gerrit, 10GitHub-Mirrors, 10Release Pipeline (Blubber): https://gerrit.wikimedia.org/r/#/projects/blubber should be mirrored to github - https://phabricator.wikimedia.org/T209280 (10hashar) ` $ ssh -p 29418 gerrit.wikimedia.org replication start blubber --wait Error: Cannot replicate to git@github.com:wiki...
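[Editor's note] The `!log`'d cleanup above selects image IDs by parsing the human-readable CREATED column of `docker images`. A minimal sketch of just that text-filtering step, run against hypothetical output (the repository names, IDs, and sizes below are invented for illustration; this does not talk to a Docker daemon):

```shell
#!/bin/sh
# Hypothetical `docker images` output (header plus three image rows).
sample='REPOSITORY       TAG     IMAGE-ID  CREATED       SIZE
releng/tox       latest  aaa111    4 months ago  500MB
releng/quibble   latest  bbb222    2 months ago  1.2GB
releng/npm-test  latest  ccc333    3 weeks ago   300MB'

# Mimics: docker images | grep " months " | grep -v " [1-2] months " | awk '{print $3}'
# i.e. keep only rows whose age is in months, drop 1-2 month old ones,
# and print the third whitespace-separated field (the image ID).
old_ids=$(printf '%s\n' "$sample" | grep " months " | grep -v " [1-2] months " | awk '{print $3}')
echo "$old_ids"
```

Only the 4-month-old image ID survives the filter. Parsing the age column this way is fragile (it breaks on "weeks"/"years" rows and on locale changes); recent Docker releases also accept an age filter directly, e.g. `docker image prune -a --filter "until=1440h"`, which avoids the text parsing entirely.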
[10:14:05] 10Gerrit, 10GitHub-Mirrors, 10Release Pipeline (Blubber): https://gerrit.wikimedia.org/r/#/projects/blubber should be mirrored to github - https://phabricator.wikimedia.org/T209280 (10hashar) 05Open>03Resolved a:03hashar I have manually created the GitHub repository at https://github.com/wikimedia/blub...
[10:29:21] (03CR) 10Hashar: [C: 032] Remove redundant/confusing comments from Wikibase job [integration/config] - 10https://gerrit.wikimedia.org/r/473004 (owner: 10Thiemo Kreuz (WMDE))
[10:32:43] (03Merged) 10jenkins-bot: Remove redundant/confusing comments from Wikibase job [integration/config] - 10https://gerrit.wikimedia.org/r/473004 (owner: 10Thiemo Kreuz (WMDE))
[10:36:03] PROBLEM - Puppet errors on deployment-mediawiki-07 is CRITICAL: CRITICAL: 2.25% of data above the critical threshold [3.0]
[10:48:47] (03PS1) 10Hashar: Migrate DonationInterface to Docker job (non voting) [integration/config] - 10https://gerrit.wikimedia.org/r/473181 (https://phabricator.wikimedia.org/T203084)
[10:49:45] hashar: is it ok for me to restart nodepool on labnodepool1001.eqiad.wmnet to pick up graphite dns changes?
[10:50:20] (03CR) 10jerkins-bot: [V: 04-1] Migrate DonationInterface to Docker job (non voting) [integration/config] - 10https://gerrit.wikimedia.org/r/473181 (https://phabricator.wikimedia.org/T203084) (owner: 10Hashar)
[10:51:46] godog: yeah can be restarted at any time :)
[10:52:04] godog: and hopefully I will have dropped Nodepool by end of this week :)
[10:52:16] oh, nice to know hashar !
[10:52:44] yeah looks like nodepool was the only thing still sending statsd to the old address
[10:53:52] (03PS2) 10Hashar: Migrate DonationInterface to Docker job (non voting) [integration/config] - 10https://gerrit.wikimedia.org/r/473181 (https://phabricator.wikimedia.org/T203084)
[10:54:11] (03CR) 10Hashar: [C: 032] "I really need to phase out the legacy Nodepool infrastructure, so I am hereby removing the old legacy job and replacing it with a Docker b" [integration/config] - 10https://gerrit.wikimedia.org/r/473181 (https://phabricator.wikimedia.org/T203084) (owner: 10Hashar)
[10:54:17] one less job on Nodepool
[10:57:16] (03Merged) 10jenkins-bot: Migrate DonationInterface to Docker job (non voting) [integration/config] - 10https://gerrit.wikimedia.org/r/473181 (https://phabricator.wikimedia.org/T203084) (owner: 10Hashar)
[10:57:20] PROBLEM - Puppet errors on deployment-jobrunner03 is CRITICAL: CRITICAL: 5.56% of data above the critical threshold [3.0]
[11:00:39] PROBLEM - Puppet errors on deployment-mwmaint01 is CRITICAL: CRITICAL: 6.74% of data above the critical threshold [3.0]
[11:04:46] 10Continuous-Integration-Infrastructure (shipyard), 10Operations, 10Kubernetes: set a harbor registry for testing - https://phabricator.wikimedia.org/T209271 (10fselles) After looking into it a little bit, packaging harbor would be challenging. Harbor is a set of microservices published as containers. The in...
[11:05:23] PROBLEM - Puppet errors on deployment-deploy02 is CRITICAL: CRITICAL: 5.62% of data above the critical threshold [3.0]
[11:05:39] PROBLEM - Puppet errors on deployment-deploy01 is CRITICAL: CRITICAL: 4.55% of data above the critical threshold [3.0]
[11:05:48] PROBLEM - Puppet errors on deployment-mediawiki-09 is CRITICAL: CRITICAL: 2.22% of data above the critical threshold [3.0]
[11:23:03] Project beta-scap-eqiad build #227627: 04FAILURE in 12 min: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/227627/
[11:23:04] Project beta-update-databases-eqiad build #29745: 04FAILURE in 3 min 4 sec: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/29745/
[11:37:37] Yippee, build fixed!
[11:37:37] Project beta-scap-eqiad build #227628: 09FIXED in 12 min: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/227628/
[12:02:08] 10Project-Admins: Estonian version of "needs-volunteer" tag - https://phabricator.wikimedia.org/T209354 (10reosarevok)
[12:03:34] 10MediaWiki-Codesniffer, 10Patch-For-Review: Allow deprecated @type in MediaWiki.Commenting.FunctionAnnotations.UnrecognizedAnnotation - https://phabricator.wikimedia.org/T203922 (10thiemowmde) I do miss a bit of context here. In all code I have seen so far all `@type` can be replaced with `@var`, or just remo...
[12:06:02] RECOVERY - Puppet errors on deployment-mediawiki-07 is OK: OK: Less than 1.00% above the threshold [2.0]
[12:22:42] Yippee, build fixed!
[12:22:42] Project beta-update-databases-eqiad build #29746: 09FIXED in 2 min 41 sec: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/29746/
[12:35:47] RECOVERY - Puppet errors on deployment-mediawiki-09 is OK: OK: Less than 1.00% above the threshold [2.0]
[12:38:03] 10Continuous-Integration-Config, 10Release-Engineering-Team (Kanban), 10Wikidata, 10wikidata-tech-focus, and 2 others: Move Wikibase to using the normal mediawiki extension (quibble) jobs - https://phabricator.wikimedia.org/T188717 (10hashar) Status: I need a few patches for Wikibase to be merged in in ord...
[12:42:33] 10Continuous-Integration-Infrastructure (shipyard), 10Operations, 10Kubernetes: set a harbor registry for testing - https://phabricator.wikimedia.org/T209271 (10Joe) I would say that this sounds like a better direction to go into, yes. What we still miss is a clear idea of how we want our registry infrastru...
[12:55:41] RECOVERY - Puppet errors on deployment-mwmaint01 is OK: OK: Less than 1.00% above the threshold [2.0]
[12:57:24] RECOVERY - Puppet errors on deployment-jobrunner03 is OK: OK: Less than 1.00% above the threshold [2.0]
[13:00:21] RECOVERY - Puppet errors on deployment-deploy02 is OK: OK: Less than 1.00% above the threshold [2.0]
[13:00:34] RECOVERY - Puppet errors on deployment-deploy01 is OK: OK: Less than 1.00% above the threshold [2.0]
[13:16:30] (03PS1) 10Hashar: Migrate Wikibase client/repo to Docker + cleanup [integration/config] - 10https://gerrit.wikimedia.org/r/473197 (https://phabricator.wikimedia.org/T188717)
[13:23:08] (03PS1) 10Hashar: Archive operations/debs/nodepool [integration/config] - 10https://gerrit.wikimedia.org/r/473199 (https://phabricator.wikimedia.org/T190097)
[13:27:28] (03PS1) 10Hashar: Stop running rake test for integration/config [integration/config] - 10https://gerrit.wikimedia.org/r/473201 (https://phabricator.wikimedia.org/T190097)
[13:31:32] 10Continuous-Integration-Infrastructure (shipyard), 10Operations, 10Kubernetes: set a harbor registry for testing - https://phabricator.wikimedia.org/T209271 (10akosiaris) >>! In T209271#4741841, @fselles wrote: > After looking into it a little bit, packaging harbor would be challenging. Harbor is a set of m...
[13:32:30] (03PS1) 10Hashar: Remove Nodepool / diskimage-builder material [integration/config] - 10https://gerrit.wikimedia.org/r/473202 (https://phabricator.wikimedia.org/T190097)
[13:37:11] 10Continuous-Integration-Config, 10Release-Engineering-Team (Kanban), 10Wikidata, 10Wikidata-Campsite, and 3 others: Move Wikibase to using the normal mediawiki extension (quibble) jobs - https://phabricator.wikimedia.org/T188717 (10Addshore)
[13:38:25] (03CR) 10jerkins-bot: [V: 04-1] Remove Nodepool / diskimage-builder material [integration/config] - 10https://gerrit.wikimedia.org/r/473202 (https://phabricator.wikimedia.org/T190097) (owner: 10Hashar)
[13:45:24] (03PS2) 10Hashar: Drop integration-config-dib-jessie [integration/config] - 10https://gerrit.wikimedia.org/r/473201 (https://phabricator.wikimedia.org/T190097)
[13:52:16] (03CR) 10Hashar: [C: 032] Drop integration-config-dib-jessie [integration/config] - 10https://gerrit.wikimedia.org/r/473201 (https://phabricator.wikimedia.org/T190097) (owner: 10Hashar)
[13:53:35] (03CR) 10Hashar: [C: 032] Archive operations/debs/nodepool [integration/config] - 10https://gerrit.wikimedia.org/r/473199 (https://phabricator.wikimedia.org/T190097) (owner: 10Hashar)
[13:55:10] (03Merged) 10jenkins-bot: Drop integration-config-dib-jessie [integration/config] - 10https://gerrit.wikimedia.org/r/473201 (https://phabricator.wikimedia.org/T190097) (owner: 10Hashar)
[13:59:17] (03PS2) 10Hashar: Archive operations/debs/nodepool [integration/config] - 10https://gerrit.wikimedia.org/r/473199 (https://phabricator.wikimedia.org/T190097)
[13:59:24] (03CR) 10Hashar: [C: 032] Archive operations/debs/nodepool [integration/config] - 10https://gerrit.wikimedia.org/r/473199 (https://phabricator.wikimedia.org/T190097) (owner: 10Hashar)
[14:01:08] (03Merged) 10jenkins-bot: Archive operations/debs/nodepool [integration/config] - 10https://gerrit.wikimedia.org/r/473199 (https://phabricator.wikimedia.org/T190097) (owner: 10Hashar)
[14:03:43] 10Continuous-Integration-Config, 10Release-Engineering-Team (Kanban), 10Wikidata, 10wikidata-tech-focus, and 3 others: Move Wikibase to using the normal mediawiki extension (quibble) jobs - https://phabricator.wikimedia.org/T188717 (10Addshore)
[14:08:06] (03CR) 10Hashar: [C: 04-1] "Need to migrate the Wikibase jobs first: https://gerrit.wikimedia.org/r/473197" [integration/config] - 10https://gerrit.wikimedia.org/r/473202 (https://phabricator.wikimedia.org/T190097) (owner: 10Hashar)
[14:22:59] 10Continuous-Integration-Infrastructure (shipyard), 10Release-Engineering-Team (Kanban), 10Operations, 10cloud-services-team, 10Nodepool: Phase out Nodepool from production - https://phabricator.wikimedia.org/T209361 (10hashar)
[14:23:18] 10Release-Engineering-Team, 10Epic, 10Patch-For-Review: Migrate all CI jobs from Nodepool, deprecate its use - https://phabricator.wikimedia.org/T190097 (10hashar)
[14:23:22] 10Continuous-Integration-Infrastructure (shipyard), 10Release-Engineering-Team (Kanban), 10Operations, 10cloud-services-team, 10Nodepool: Phase out Nodepool from production - https://phabricator.wikimedia.org/T209361 (10hashar)
[14:29:10] 10Continuous-Integration-Infrastructure (shipyard), 10Release-Engineering-Team (Kanban), 10Operations, 10cloud-services-team, 10Nodepool: Phase out Nodepool from production - https://phabricator.wikimedia.org/T209361 (10hashar)
[14:49:06] 10Release-Engineering-Team (Kanban), 10Release, 10Train Deployments: 1.33.0-wmf.4 deployment blockers - https://phabricator.wikimedia.org/T206658 (10hashar) Deployment notes: {F27212402} To be pasted on https://www.mediawiki.org/wiki/MediaWiki_1.33/wmf.4/Changelog
[14:49:36] 10Phabricator, 10Release-Engineering-Team (Kanban), 10Security-Team, 10User-MModell: Create a security issue task type with additional attributes - https://phabricator.wikimedia.org/T204160 (10chasemp) >>! In T204160#4737053, @Krenair wrote: >>>! In T204160#4737052, @Tgr wrote: >> Subscribers are the most...
[14:56:46] 10Release-Engineering-Team (Watching / External), 10MediaWiki-Commenting, 10Core Platform Team ( Code Health (TEC13)), 10Core Platform Team Kanban (Doing), and 3 others: Deploy refactored comment storage - https://phabricator.wikimedia.org/T166733 (10Anomie) >>! In T166733#4740371, @Anomie wrote: > All wik...
[16:52:49] 10MediaWiki-Codesniffer, 10MediaWiki-General-or-Unknown: Enforce usage of phpdoc's @api and @internal tags to indicate public and private structural elements - https://phabricator.wikimedia.org/T209394 (10dbarratt)
[17:24:34] 10Continuous-Integration-Infrastructure (shipyard), 10Operations, 10Kubernetes: set a harbor registry for testing - https://phabricator.wikimedia.org/T209271 (10fselles) Regarding @Joe 's comment. The last picture should be something similar to this, this is quite complex and is build up from the idea that...
[17:24:34] 10Continuous-Integration-Infrastructure (shipyard), 10Operations, 10Kubernetes: set a harbor registry for testing - https://phabricator.wikimedia.org/T209271 (10fselles) Regarding @Joe 's comment. The last picture should be something similar to this, this is quite complex and is build up from the idea that... [17:29:26] 10Continuous-Integration-Infrastructure (shipyard), 10Operations, 10Kubernetes: improve docker registry architecture - https://phabricator.wikimedia.org/T209271 (10fselles) [17:29:26] 10Continuous-Integration-Infrastructure (shipyard), 10Operations, 10Kubernetes: improve docker registry architecture - https://phabricator.wikimedia.org/T209271 (10fselles) [17:42:08] 10Gerrit, 10Release-Engineering-Team, 10Patch-For-Review: Upgrade gerrit to 2.15.6 - https://phabricator.wikimedia.org/T205784 (10hashar) [17:42:08] 10Gerrit, 10Release-Engineering-Team, 10Patch-For-Review: Upgrade gerrit to 2.15.6 - https://phabricator.wikimedia.org/T205784 (10hashar) [17:43:09] 10Gerrit, 10Release-Engineering-Team, 10Patch-For-Review: Upgrade gerrit to 2.15.6 - https://phabricator.wikimedia.org/T205784 (10hashar) 05Open>03Resolved a:03thcipriani We went with Gerrit 2.15.6 since all the preparation work has been done already and it was deemed stable for production deployment.... [17:43:09] 10Gerrit, 10Release-Engineering-Team, 10Patch-For-Review: Upgrade gerrit to 2.15.6 - https://phabricator.wikimedia.org/T205784 (10hashar) 05Open>03Resolved a:03thcipriani We went with Gerrit 2.15.6 since all the preparation work has been done already and it was deemed stable for production deployment.... 
[18:14:20] PROBLEM - Host integration-slave-docker-1034 is DOWN: CRITICAL - Host Unreachable (10.68.23.35)
[18:17:57] thcipriani hi, a user reported a gerrit error in #wikimedia-dev (error occurs when they save their email)
[19:20:21] Krenair and/or whoever: I have another round of deployment-prep VMs that I want to move today. For starters, deployment-elastic05 and deployment-elastic07. We moved deployment-elastic06 recently but don't remember if it just worked or if we had to depool or something
[19:30:22] Release-Engineering-Team (Kanban), Cloud-Services, Epic: Migrate the Integration cloud project to eqiad1-r - https://phabricator.wikimedia.org/T208803 (Andrew) I just moved integration-slave-docker-1038 and integration-slave-docker-1034 to eqiad1-r and repooled them. They look OK to me, so far.
[19:30:47] greg-g: o/ just reading over https://wikitech.wikimedia.org/wiki/Deployments#Upcoming. weeks of 10th and 17th december aren't mentioned. are they service as usual (train, swats) or an omission?
[19:31:15] phuedx: correct
[19:31:24] er the prior :)
[19:31:41] they are business as usual (to finally be explicit)
[19:35:21] woah. is that unusual?
[19:38:19] no, we have always done the first couple weeks of december and not the last
[19:38:41] Beta-Cluster-Infrastructure, Release-Engineering-Team, Cloud-Services, Epic: Migrate deployment-prep to eqiad1 - https://phabricator.wikimedia.org/T208101 (Andrew)
[19:39:09] Beta-Cluster-Infrastructure, Release-Engineering-Team, Cloud-Services, Epic: Migrate deployment-prep to eqiad1 - https://phabricator.wikimedia.org/T208101 (Andrew) ...is anybody there?
[19:39:11] great! i'm totally misremembering the last couple of years
[19:39:16] a non-zero number of people seem to remember us not deploying at all in December, that's never been the case since I've been here and in charge of it :) (Feb '13) :)
[19:39:37] PROBLEM - SSH on deployment-sca04 is CRITICAL: (Service Check Timed Out)
[19:39:54] PROBLEM - SSH on deployment-kafka-main-1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:40:19] PROBLEM - SSH on deployment-mx02 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:40:22] PROBLEM - SSH on deployment-jobrunner03 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:40:23] PROBLEM - SSH on deployment-poolcounter04 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:40:28] PROBLEM - SSH on deployment-conf03 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:40:28] PROBLEM - SSH on deployment-restbase02 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:40:29] PROBLEM - SSH on deployment-deploy01 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:40:32] PROBLEM - SSH on deployment-mwmaint01 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:40:32] PROBLEM - SSH on deployment-sentry01 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:40:35] PROBLEM - SSH on deployment-db04 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:40:36] PROBLEM - SSH on deployment-eventlog05 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:40:39] PROBLEM - SSH on deployment-mediawiki-07 is CRITICAL: (Service Check Timed Out)
[19:40:48] PROBLEM - SSH on deployment-redis06 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:40:55] PROBLEM - SSH on deployment-changeprop is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:41:09] PROBLEM - SSH on deployment-etcd-01 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:41:10] PROBLEM - SSH on deployment-db03 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:41:13] PROBLEM - SSH on deployment-rd3-cptest-master01 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:41:18] PROBLEM - SSH on deployment-memc04 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:41:18] PROBLEM - SSH on deployment-memc05 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:41:19] PROBLEM - SSH on deployment-zotero01 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:41:24] PROBLEM - SSH on deployment-imagescaler01 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:41:33] PROBLEM - SSH on deployment-dumps-puppetmaster02 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:41:40] PROBLEM - SSH on deployment-mathoid is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:41:41] PROBLEM - SSH on deployment-fluorine02 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:41:46] PROBLEM - SSH on deployment-ms-be03 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:41:46] PROBLEM - SSH on deployment-webperf12 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:41:55] PROBLEM - SSH on deployment-elastic07 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:42:00] PROBLEM - SSH on deployment-maps03 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:42:11] PROBLEM - SSH on deployment-cache-upload04 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:42:12] PROBLEM - SSH on deployment-webperf11 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:42:12] PROBLEM - SSH on deployment-puppetdb02 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:42:14] PROBLEM - SSH on deployment-elastic06 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:42:18] PROBLEM - SSH on deployment-mediawiki-09 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:42:19] PROBLEM - SSH on deployment-redis05 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:42:22] PROBLEM - SSH on deployment-kafka-jumbo-1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:42:29] PROBLEM - SSH on deployment-restbase01 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:42:41] PROBLEM - SSH on deployment-cumin is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:42:46] PROBLEM - SSH on deployment-chromium02 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:42:47] PROBLEM - SSH on deployment-logstash2 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:42:47] PROBLEM - SSH on deployment-kafka-jumbo-2 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:42:50] PROBLEM - SSH on deployment-sca01 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:42:50] PROBLEM - SSH on deployment-ms-be04 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:42:55] PROBLEM - SSH on deployment-maps04 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:43:02] PROBLEM - SSH on deployment-aqs02 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:43:06] PROBLEM - SSH on deployment-ores01 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:43:07] PROBLEM - SSH on deployment-snapshot01 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:43:20] PROBLEM - SSH on deployment-maps05 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:43:24] i assume this is wmcs moving the deployment-prep project to different IPs
[19:43:25] PROBLEM - SSH on deployment-elastic05 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:43:25] PROBLEM - SSH on deployment-sca02 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:43:32] PROBLEM - SSH on deployment-aqs03 is CRITICAL: (Service Check Timed Out)
[19:43:46] PROBLEM - SSH on deployment-mcs01 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:43:46] PROBLEM - SSH on deployment-cache-text04 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:43:46] PROBLEM - SSH on deployment-pdfrender02 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:43:51] PROBLEM - SSH on deployment-urldownloader02 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:44:02] might have been nice to downtime shinken ?
[19:44:14] PROBLEM - SSH on deployment-memc07 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:44:17] PROBLEM - SSH on deployment-ms-fe02 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:44:17] PROBLEM - SSH on deployment-ircd is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:44:33] PROBLEM - SSH on deployment-chromium01 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:44:47] PROBLEM - SSH on deployment-cpjobqueue is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:44:48] PROBLEM - SSH on deployment-prometheus01 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:44:48] PROBLEM - SSH on deployment-imagescaler02 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:44:52] PROBLEM - Content Translation Server on deployment-sca01 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:45:02] PROBLEM - SSH on deployment-aqs01 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:45:02] PROBLEM - SSH on deployment-deploy02 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:45:04] PROBLEM - SSH on deployment-puppetmaster03 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:45:04] PROBLEM - SSH on deployment-memc06 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:45:04] PROBLEM - SSH on deployment-zookeeper02 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:45:14] PROBLEM - SSH on deployment-parsoid09 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:45:15] PROBLEM - SSH on deployment-kafka-main-2 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:45:46] hi releng!
[19:46:12] someone on my team pushed a bad tag value to our SmashPig repo, and now we can't delete it
[19:46:20] can anyone here help?
[19:46:32] The repo is gerrit.wikimedia.org:29418/wikimedia/fundraising/SmashPig
[19:46:48] and the tag we'd like to delete is v0.5.9.3
[19:47:18] greg-g: i am no longer a member of that set of folk
[19:47:53] \o/
[19:47:55] alright
[19:47:56] OUT
[19:54:37] mutante: the issue is with shinken itself I think
[19:55:00] phuedx|afk: have a good one :)
[19:55:54] ejegg: I temporarily added "Delete Reference" for refs/tags/* for the SmashPig repo, you should be able to delete the tag now
[19:56:10] thanks thcipriani, that did the trick!
[19:56:34] ejegg: all done? Can I revert that permission?
[19:57:43] yep thcipriani, all done
[19:58:42] cool, thanks
[19:59:59] thank you
[20:03:08] gotcha, Andrew, ack
[20:06:48] hmph, this is fixed with a patch but the deployment-prep puppetmaster has stopped merging
[20:07:12] thcipriani, have a minute to resolve rebase conflicts on deployment-puppetmaster03? I can try but I'm getting pretty far out of my lane :)
[20:08:01] sure I can take a look
[20:08:20] thanks
[20:10:47] hrm, this one seems to be what's tripping it up https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/439774/
[20:11:42] ah a change was merged
[20:11:47] that touches the labs config for that file
[20:12:19] yes https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/473263/
[20:13:45] Phabricator: Add a tag removes another one - https://phabricator.wikimedia.org/T209411 (Framawiki)
[20:14:40] so I guess I will just add the new line to minimal and wrap it in an if $::realm block and let Krenair take a look at that later to see if that's how he wants it done
[20:15:15] Phabricator: Add a tag removes another one - https://phabricator.wikimedia.org/T209411 (Peachey88) That is working as intended afaik, Data Services is a Subproject of Cloud Services.
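The SmashPig tag fix above (temporarily grant "Delete Reference" on refs/tags/*, then delete the tag from a clone) can be sketched with standard git commands. This is a self-contained demo against a scratch "origin" so it is safe to run anywhere; for the real repo, origin would be ssh://gerrit.wikimedia.org:29418/wikimedia/fundraising/SmashPig, and the push would only succeed once the ACL change is in place.

```shell
#!/bin/sh
# Demo of deleting a bad tag from a remote, using a scratch bare repo
# as a stand-in for the Gerrit-hosted SmashPig repo.
set -e
tmp=$(mktemp -d)
git init -q --bare "$tmp/origin.git"
git clone -q "$tmp/origin.git" "$tmp/clone"
cd "$tmp/clone"
git -c user.name=demo -c user.email=demo@example.org \
    commit -q --allow-empty -m 'initial commit'
git push -q origin HEAD
git tag v0.5.9.3                           # the bad tag
git push -q origin refs/tags/v0.5.9.3

# The actual fix: delete the tag on the remote, then the local copy so
# a later `git push --tags` does not resurrect it.
git push -q origin --delete refs/tags/v0.5.9.3
git tag -d v0.5.9.3
```

After this, `git ls-remote --tags origin` prints nothing: the tag is gone on both sides.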
[20:17:15] and I guess https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/470865/ will be merged and can be un-cherry-picked
[20:19:13] Phabricator: Add a tag removes another one - https://phabricator.wikimedia.org/T209411 (Peachey88) Open>Invalid
[20:22:44] andrewbogott: git surgery complete
[20:23:03] thanks! Looks like the ferm change I was hoping for is getting rolled out
[20:24:39] Krenair: whenever you're around I modified your patch on beta https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/439774/ to incorporate https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/473263/ by wrapping the new code in an <% if @realm == 'labs' %> as you did elsewhere in that file. It's on the HEAD of the puppet repo on deployment-puppetmaster03, but not yet pushed to gerrit, FYI.
[21:04:43] ohi
[21:05:33] thcipriani, thanks!
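The "git surgery" above follows the usual beta puppetmaster pattern: local cherry-picks are kept rebased on top of operations/puppet, and the rebase stops when an upstream change conflicts with one of them. A minimal reproduction on scratch repos (file names and commit subjects are illustrative, not the real puppet changes):

```shell
#!/bin/sh
# Reproduce a cherry-pick rebase conflict and resolve it, mirroring
# the deployment-puppetmaster03 workflow described above.
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q repo && cd repo
g() { git -c user.name=demo -c user.email=demo@example.org "$@"; }
main=$(git symbolic-ref --short HEAD)   # default branch, whatever it is named

echo 'base'        > site.pp && g add site.pp && g commit -qm 'upstream base'
g checkout -q -b cherry-picks           # beta-only patches live here
echo 'labs tweak'  > site.pp && g commit -qam 'local cherry-pick'
g checkout -q "$main"
echo 'prod change' > site.pp && g commit -qam 'upstream change'

# Rebasing the cherry-picks onto the new upstream stops on the conflict;
# resolve by hand (keep both hunks) and continue.
g checkout -q cherry-picks
g rebase -q "$main" || true
printf 'prod change\nlabs tweak\n' > site.pp
g add site.pp
GIT_EDITOR=true g rebase --continue
```

After the rebase continues, the local cherry-pick sits back on top of the upstream change and the puppetmaster can resume merging.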
[21:05:51] realm branches are much better in the middle of files rather than covering entire files
[21:07:10] Beta-Cluster-Infrastructure, Release-Engineering-Team, Cloud-Services, Epic: Migrate deployment-prep to eqiad1 - https://phabricator.wikimedia.org/T208101 (mmodell) @andrew: I won't be around on those days to help out as I'm on vacation from the 26th through the 30th. Maybe @dan or @hashar will...
[21:07:19] It would be good to know why that patch was required on the labs version but not the prod one, and this has highlighted the discrepancy.
[21:08:31] 10Beta-Cluster-Infrastructure, 10Release-Engineering-Team, 10Cloud-Services, 10Epic: Migrate deployment-prep to eqiad1 - https://phabricator.wikimedia.org/T208101 (10Krenair) I think you mean @Dduvall :) [21:08:31] 10Beta-Cluster-Infrastructure, 10Release-Engineering-Team, 10Cloud-Services, 10Epic: Migrate deployment-prep to eqiad1 - https://phabricator.wikimedia.org/T208101 (10Krenair) I think you mean @Dduvall :) [21:09:13] 10Beta-Cluster-Infrastructure, 10Release-Engineering-Team, 10Cloud-Services, 10Epic: Migrate deployment-prep to eqiad1 - https://phabricator.wikimedia.org/T208101 (10mmodell) hahah good catch @krenair, thanks [21:09:13] 10Beta-Cluster-Infrastructure, 10Release-Engineering-Team, 10Cloud-Services, 10Epic: Migrate deployment-prep to eqiad1 - https://phabricator.wikimedia.org/T208101 (10mmodell) hahah good catch @krenair, thanks [21:39:31] Krenair: can you advise re: moving deployment-elastic05 and deployment-elastic07? Seems like you diagnosed an issue with search failover but I don't recall if you fixed it or not [21:39:31] Krenair: can you advise re: moving deployment-elastic05 and deployment-elastic07? Seems like you diagnosed an issue with search failover but I don't recall if you fixed it or not [21:40:36] I think I chatted with someone in -discovery who realised that only one of them was working because MediaWiki was trying to connect on ports that for some reason the other two weren't listening on yet [21:40:36] I think I chatted with someone in -discovery who realised that only one of them was working because MediaWiki was trying to connect on ports that for some reason the other two weren't listening on yet [21:40:48] (some manual restart required after puppet ran to make them start listening or something IIRC) [21:40:48] (some manual restart required after puppet ran to make them start listening or something IIRC) [21:41:26] ok, and the only one that was working is o6, the one we moved before. 
So… probably 05 and 07 aren't doing anything anyway? [21:41:26] ok, and the only one that was working is o6, the one we moved before. So… probably 05 and 07 aren't doing anything anyway? [21:41:56] well [21:41:56] well [21:42:11] that thing was fixed [21:42:11] that thing was fixed [21:42:16] so they could well be in active use by MW now [21:42:16] so they could well be in active use by MW now [21:42:40] I'd just move one at a time and see what happens [21:42:40] I'd just move one at a time and see what happens [21:43:36] ok :) [21:43:36] ok :) [21:44:12] here goes 05 [21:44:12] here goes 05 [21:47:09] Krenair, the other VMs of interest to me today are deployment-urldownloader02, deployment-webperf12, deployment-dumps-puppetmaster02 and deployment-deploy02 [21:47:09] Krenair, the other VMs of interest to me today are deployment-urldownloader02, deployment-webperf12, deployment-dumps-puppetmaster02 and deployment-deploy02 [21:48:07] urldownloader02 going down make break some probably obscure functionality [21:48:07] urldownloader02 going down make break some probably obscure functionality [21:48:12] hopefully nothing too serious [21:48:12] hopefully nothing too serious [21:48:21] no idea what webperf does [21:48:21] no idea what webperf does [21:48:32] dumps-puppetmaster does what it says on the tin, should be good to go [21:48:32] dumps-puppetmaster does what it says on the tin, should be good to go [21:49:30] ok, I'll do that one first then :) [21:49:30] ok, I'll do that one first then :) [21:49:35] deploy02 is the equivalent of mira in the old mira vs. tin setup and is probably fairly safe to move but I'd check with someone who knows scap's internals better how that affects the scap process [21:49:35] deploy02 is the equivalent of mira in the old mira vs. tin setup and is probably fairly safe to move but I'd check with someone who knows scap's internals better how that affects the scap process [21:50:13] e.g. 
may want to take it off some list, do the move, then re-sync and put it back on [21:50:13] e.g. may want to take it off some list, do the move, then re-sync and put it back on [21:52:29] * andrewbogott wonders who knows about scap internals and then just pings thcipriani like always [21:52:29] ACTION wonders who knows about scap internals and then just pings thcipriani like always  [21:53:53] worst likely case is you may break the beta-code-update stuff for a while [21:53:53] worst likely case is you may break the beta-code-update stuff for a while [21:54:06] not a massive deal imo [21:54:06] not a massive deal imo [21:55:07] hrm, yeah, taking it off of the scap-masters list would prevent beta-code-update from freaking out (although it should be a soft error, so I'd say just move it) [21:55:07] hrm, yeah, taking it off of the scap-masters list would prevent beta-code-update from freaking out (although it should be a soft error, so I'd say just move it) [21:55:57] thcipriani: I'll just move it, starting in maybe 5 minutes. OK with you? [21:55:57] thcipriani: I'll just move it, starting in maybe 5 minutes. OK with you? [21:56:00] "soft error" meaning the deployment will still proceed, it'll just report an error in the output and exit non-0 [21:56:00] "soft error" meaning the deployment will still proceed, it'll just report an error in the output and exit non-0 [21:56:05] andrewbogott: yep, works for me [21:56:05] andrewbogott: yep, works for me [21:58:06] happen to know what webperf does? [21:58:06] happen to know what webperf does? [21:58:26] Timo is away :/ [21:58:26] Timo is away :/ [22:00:26] * thcipriani removes deploy02 temporarily from scap-masters [22:00:26] ACTION removes deploy02 temporarily from scap-masters [22:00:28] https://wikitech.wikimedia.org/wiki/Webperf [22:00:28] https://wikitech.wikimedia.org/wiki/Webperf [22:00:33] "webperf is a set of scripts that aggregate data from EventLogging to statsd and Graphite." 
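(Editor's note: the "soft error" semantics described at 21:56:00, where a sync failure to one co-master is reported but the deployment still proceeds and merely exits non-0, can be sketched as below. This is an illustrative pattern only, not scap's actual code; `push_to`, `sync_to_masters`, and the `reachable` set are hypothetical names.)

```python
import sys

def push_to(host, reachable):
    # Hypothetical stand-in for the real rsync/ssh transport to a co-master.
    if host not in reachable:
        raise ConnectionError(f"cannot reach {host}")

def sync_to_masters(masters, reachable):
    """Treat a per-host failure as a "soft error": record it and carry on
    with the remaining hosts, then exit non-0 so callers still notice."""
    errors = []
    for host in masters:
        try:
            push_to(host, reachable)
        except ConnectionError as exc:
            errors.append(str(exc))  # record, but do not abort the run
    for msg in errors:
        print(f"ERROR: {msg}", file=sys.stderr)  # report in the output
    return 1 if errors else 0  # non-zero exit signals the soft failure

# With deploy02 unreachable mid-migration, the sync finishes but returns 1.
```

This is why removing deploy02 from the scap-masters list was optional: the beta-code-update job would have complained and gone red, but the deployment itself would still have completed.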
[22:02:24] hm
[22:02:47] !log disable puppet on deployment-deploy01 temporarily while deployment-deploy02 is migrating to preserve dsh files
[22:02:49] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[22:03:12] that webperf node will finish moving in 5 seconds :)
[22:03:12] ^ andrewbogott let me know when deploy02 is done migrating and I'll undo my temporary scap appeasement
[22:03:26] Project beta-scap-eqiad build #227667: FAILURE in 23 min: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/227667/
[22:04:30] oh good :)
[22:05:49] it's in the mediawiki-installation file as well I guess
[22:06:25] although for whatever reason I can't reach deployment-deploy01 now :\
[22:07:18] I can barely reach anything — my home network seems to be dying
[22:07:59] hrm, in my case I am having some trouble hitting primary.bastion.wmflabs.org
[22:09:25] I can't reach production hosts either, so either something very bad is happening, or something very local
[22:09:44] thcipriani: how about now?
[22:09:53] andrewbogott: there's a he.net problem based on -operations
[22:09:58] so depending on your isp routing
[22:10:01] that would do it
[22:10:53] FWIW I was able to get back into labs bastion now
[22:19:35] Yippee, build fixed!
[22:19:35] Project beta-scap-eqiad build #227668: FIXED in 14 min: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/227668/
[22:37:51] thcipriani: deploy02 is moved and running.
[22:38:05] andrewbogott: okie doke, thanks!
[22:39:30] !log reenabling and running puppet on deployment-deploy01
[22:39:31] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[22:45:18] Krenair: all done — labvirt1016 is empty now
[22:45:29] There'll be another round of this next week for 1015 but there are fewer things there
[22:49:21] RECOVERY - SSH on deployment-jobrunner03 is OK: SSH OK - OpenSSH_7.4p1 Debian-10+deb9u4 (protocol 2.0)
[22:49:22] RECOVERY - SSH on deployment-maps05 is OK: SSH OK - OpenSSH_7.4p1 Debian-10+deb9u4 (protocol 2.0)
[22:49:22] RECOVERY - SSH on deployment-mwmaint01 is OK: SSH OK - OpenSSH_7.4p1 Debian-10+deb9u4 (protocol 2.0)
[22:49:22] RECOVERY - SSH on deployment-mediawiki-07 is OK: SSH OK - OpenSSH_7.4p1 Debian-10+deb9u4 (protocol 2.0)
[23:01:48] Release-Engineering-Team (Kanban), Release, Train Deployments: 1.33.0-wmf.4 deployment blockers - https://phabricator.wikimedia.org/T206658 (hashar)
[23:02:49] Release-Engineering-Team (Kanban), Release, Train Deployments: 1.33.0-wmf.4 deployment blockers - https://phabricator.wikimedia.org/T206658 (hashar) T209429 is about memcached errors appearing with 1.33.0-wmf.4 on group0
[23:21:42] PROBLEM - SSH on integration-slave-docker-1021 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[23:26:32] RECOVERY - SSH on integration-slave-docker-1021 is OK: SSH OK - OpenSSH_6.7p1 Debian-5+deb8u7 (protocol 2.0)
[23:57:43] PROBLEM - SSH on integration-slave-docker-1021 is CRITICAL: CRITICAL - Socket timeout after 10 seconds