[00:24:10] Gerrit, Release-Engineering-Team, Operations, Patch-For-Review: Investigate why gerrit slowed down on 17/10/2016 / 18/10/2016 / 21/10/2016 - https://phabricator.wikimedia.org/T148478#2750748 (Paladox) Ive been talking to @dzahn about this. What were thinking is of testing the different gc availa...
[00:32:39] Gerrit, Release-Engineering-Team, Operations, Patch-For-Review: Investigate why gerrit slowed down on 17/10/2016 / 18/10/2016 / 21/10/2016 - https://phabricator.wikimedia.org/T148478#2750773 (Dzahn) I read some of http://blog.takipi.com/garbage-collectors-serial-vs-parallel-vs-cms-vs-the-g1-and-w...
[00:54:06] Continuous-Integration-Infrastructure, Wikipedia-Android-App-Backlog, Patch-For-Review: Puppet on CI Trusty slaves: Duplicate declaration: Exec[jenkins-deploy kvm membership] - https://phabricator.wikimedia.org/T149294#2750826 (Niedzielski) Ah, thanks for the summary!
[00:59:43] Continuous-Integration-Infrastructure, Wikipedia-Android-App-Backlog, Patch-For-Review: Puppet on CI Trusty slaves: Duplicate declaration: Exec[jenkins-deploy kvm membership] - https://phabricator.wikimedia.org/T149294#2747883 (Dzahn) I merged the xdummy change a little while ago.
[01:08:43] Continuous-Integration-Infrastructure, Wikipedia-Android-App-Backlog, Patch-For-Review: Puppet on CI Trusty slaves: Duplicate declaration: Exec[jenkins-deploy kvm membership] - https://phabricator.wikimedia.org/T149294#2750857 (Niedzielski) @Dzahn, @hashar thanks!
[01:13:02] PROBLEM - SSH on deployment-sca02 is CRITICAL: Server answer
[01:58:00] RECOVERY - SSH on deployment-sca02 is OK: SSH OK - OpenSSH_6.7p1 Debian-5+deb8u3 (protocol 2.0)
[03:26:22] Continuous-Integration-Infrastructure, MediaWiki-Codesniffer: CodeSniffer Generic.Formatting.SpaceAfterCast.NoSpace is incorrect - https://phabricator.wikimedia.org/T50450#520636 (Samwilson) Is this really resolved?
Surely if the coding standards say no-space, then it makes sense for phpcs to check that?...
[03:30:02] PROBLEM - SSH on deployment-sca02 is CRITICAL: Server answer
[04:05:00] RECOVERY - SSH on deployment-sca02 is OK: SSH OK - OpenSSH_6.7p1 Debian-5+deb8u3 (protocol 2.0)
[04:11:00] PROBLEM - SSH on deployment-sca02 is CRITICAL: Server answer
[06:47:58] Project selenium-Wikibase » chrome,beta,Linux,contintLabsSlave && UbuntuTrusty build #155: FAILURE in 2 hr 7 min: https://integration.wikimedia.org/ci/job/selenium-Wikibase/BROWSER=chrome,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/155/
[07:01:00] RECOVERY - SSH on deployment-sca02 is OK: SSH OK - OpenSSH_6.7p1 Debian-5+deb8u3 (protocol 2.0)
[07:07:01] PROBLEM - SSH on deployment-sca02 is CRITICAL: Server answer
[08:06:28] Continuous-Integration-Infrastructure, MediaWiki-Codesniffer: CodeSniffer Generic.Formatting.SpaceAfterCast.NoSpace is incorrect - https://phabricator.wikimedia.org/T50450#2751250 (hashar) @Samwilson well the discussion on this ticket went with "lets not enforce any rule for space or non space after cast...
[08:06:49] Continuous-Integration-Config, Tool-Labs-tools-stewardbots, Patch-For-Review: Implement jenkins tests on labs/tools/stewardbots - https://phabricator.wikimedia.org/T128503#2751251 (MarcoAurelio) Okay, so, to summarize, we have PHP lint tests running. We still lack (IMO): * python checks * and maybe...
[08:07:04] Continuous-Integration-Config, Tool-Labs-tools-stewardbots: Implement jenkins tests on labs/tools/stewardbots - https://phabricator.wikimedia.org/T128503#2751252 (MarcoAurelio)
[08:21:59] RECOVERY - SSH on deployment-sca02 is OK: SSH OK - OpenSSH_6.7p1 Debian-5+deb8u3 (protocol 2.0)
[08:27:59] PROBLEM - SSH on deployment-sca02 is CRITICAL: Server answer
[08:38:33] hashar: hey! I wonder if you know what's the quickest way to get a request for a pending Git repo approved?
I'm currently blocked and waiting on one to submit some code (and about to use Github instead)
[08:39:35] jdlrobson: noooo github!!!!!!! :]
[08:39:45] ;)
[08:39:50] that freaks me out !!
[08:39:58] good it had the right effect then hehe
[08:40:08] can you use bitbucket.org instead?
[08:40:27] don't you have access to the Gerrit creation page at https://gerrit.wikimedia.org/r/#/admin/create-project/ ?
[08:40:48] if so I can assist
[08:40:58] else, link me to the request / a repo name and I will handle the creation
[08:42:21] https://www.mediawiki.org/wiki/Git/New_repositories/Requests
[08:42:39] hashar: it's right at the bottom `mediawiki/services/trending-edits`
[08:43:04] it would be useful if this page gave some kind of SLA of how long the process is expected to take and who to poke if it takes longer than that
[08:43:05] I hate that wiki page
[08:43:26] I will make it inherit from mediawiki/services
[08:43:54] https://gerrit.wikimedia.org/r/#/admin/projects/mediawiki/services/trending-edits,access
[08:43:57] created
[08:44:03] and owned by the "mediawiki" gerrit group
[08:44:21] and mediawiki-services
[08:47:46] jdlrobson: should be good now
[08:47:52] hashar: <3
[08:47:59] repo is now owned by the group mediawiki-services-trending-edits
[08:48:00] proof https://gerrit.wikimedia.org/r/#/admin/projects/mediawiki/services/trending-edits,access
[08:48:01] lemme give it a spin
[08:48:19] I have made you a member of it https://gerrit.wikimedia.org/r/#/admin/groups/1032,members
[08:48:24] hashar: who has +2 to it ?
[08:48:27] the group including all of mediawiki-services
[08:48:35] all members?
[08:48:41] and the repo inherits right from mediawiki/services which is owned by... mediawiki-services
[08:49:05] ahh so i cant even +2? interesting :)
[08:49:05] ultimately it inherits from the 'mediawiki' group https://gerrit.wikimedia.org/r/#/admin/projects/mediawiki,access
[08:49:12] so a lot should be able to CR+2
[08:50:37] yep. Thanks hashar !
[08:52:36] jdlrobson: will want to add a project in Phabricator as well
[08:52:40] and obviously some CI glue :]
[08:57:27] already is a project
[08:57:58] RECOVERY - SSH on deployment-sca02 is OK: SSH OK - OpenSSH_6.7p1 Debian-5+deb8u3 (protocol 2.0)
[09:01:08] hashar: having issues with using git review
[09:01:14] https://www.irccloud.com/pastebin/3OVXj32Q/
[09:04:59] PROBLEM - SSH on deployment-sca02 is CRITICAL: Server answer
[09:05:30] jdlrobson hi, its qchris
[09:06:02] hi paladox have you encountered the above before?
[09:06:07] Yes
[09:06:11] It can take time
[09:06:29] jdlrobson if you want it quicker you can ask him in -devtools
[09:06:49] jdlrobson qchris is the creator of gerrit
[09:07:42] paladox: i dont think i need the creator of gerrit to fix this :-) seems like an issue with my setup that should be easily fixed im currently googling
[09:07:47] oh
[09:08:00] QChris contracted for the WMF a few years ago
[09:08:06] and participated in Gerrit development / maintenance
[09:08:27] he eventually moved to analytics team and left :(
[09:08:27] jdlrobson what problem are you having?
[09:08:48] hashar but qchris was one of the founders of gerrit :)
[09:08:49] i've just setup a new repo and im having issues sending the first commit via git review
[09:08:58] Oh
[09:09:03] jdlrobson: have you cloned it?
[09:09:04] jdlrobson try plain git
[09:09:17] Or use the inline edit
[09:09:29] what is the output of: git remote -v && git fetch --all
[09:09:37] hashar did you add the .gitreview file?
[09:09:47] na it is an empty repo
[09:09:52] Oh thats why
[09:09:58] qchris always adds the file
[09:10:08] let me try and add it
[09:10:24] i have a local .gitreview file
[09:10:39] oh, there doesn't seem to be a branch here https://gerrit.wikimedia.org/r/#/admin/projects/mediawiki/services/trending-edits,branches
[09:10:43] no master
[09:12:17] warning: remote HEAD refers to nonexistent ref, unable to checkout.
[09:12:22] jdlrobson hashar ^^
[09:13:31] running git branch -a hasn't found any branches
[09:14:09] hashar when you created the repo did you click Create initial empty commit ?
[09:15:11] that did it. Thanks paladox
[09:15:28] Oh, jdlrobson not sure what did it?
[09:15:40] i added a branch
[09:15:41] and you're welcome :)
[09:15:42] master
[09:15:45] ah :)
[09:17:50] jdlrobson it seems to have refs/meta/config in branch master
[09:18:53] jdlrobson: all set ?
[09:19:02] paladox: I always create the repos empty
[09:19:17] maybe? depends on whether refs/meta/config should be in master
[09:19:39] hashar oh, you can select the create initial commit option
[09:19:58] It should create a blank repo but have a commit there.
[09:20:21] jdlrobson, ive https://gerrit.wikimedia.org/r/318508
[09:21:31] jdlrobson im not sure about the diffusion repo, hashar do you have permission to create the diffusion repo?
[09:28:00] Diffusion I have no idea
[09:28:11] I think they are created/synced manually from time to time
[09:51:34] PROBLEM - Puppet run on deployment-apertium01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[09:57:16] Oh
[10:09:04] Gerrit, Release-Engineering-Team, Operations, Patch-For-Review: Investigate why gerrit slowed down on 17/10/2016 / 18/10/2016 / 21/10/2016 - https://phabricator.wikimedia.org/T148478#2751520 (ArielGlenn) Let's look at the logs (I'm doing so). We still don't have logs for a slowdown event, but I...
[10:35:28] Continuous-Integration-Config, Tool-Labs-tools-stewardbots: Implement jenkins tests on labs/tools/stewardbots - https://phabricator.wikimedia.org/T128503#2751559 (hashar) The CI definition in integration/config.git zuul/layout.yaml is: ``` - name: labs/tools/stewardbots template: - name: comp...
[11:37:59] Continuous-Integration-Config, Tool-Labs-tools-stewardbots, Patch-For-Review: Implement jenkins tests on labs/tools/stewardbots - https://phabricator.wikimedia.org/T128503#2751642 (hashar) stalled>Open https://gerrit.wikimedia.org/r/318521 is a first pass and should be a good base to build up...
[12:00:53] (PS1) Hashar: Drop one use of zuul-cloner-extdeps [integration/config] - https://gerrit.wikimedia.org/r/318524
[12:02:36] (CR) Hashar: [C: 2] Drop one use of zuul-cloner-extdeps [integration/config] - https://gerrit.wikimedia.org/r/318524 (owner: Hashar)
[12:03:35] (Merged) jenkins-bot: Drop one use of zuul-cloner-extdeps [integration/config] - https://gerrit.wikimedia.org/r/318524 (owner: Hashar)
[12:18:53] (PS1) Hashar: Normalize prepare-mediawiki-zuul macro to use deps.txt [integration/config] - https://gerrit.wikimedia.org/r/318526
[13:01:29] (PS2) Hashar: Fix dirty VisualEditor submodule [integration/config] - https://gerrit.wikimedia.org/r/297126 (https://phabricator.wikimedia.org/T121479) (owner: JanZerebecki)
[13:02:38] (CR) Hashar: [C: 2] "I got rid of the parameters in the ve-submodule-update macro. The list of dependencies is always in /deps.txt . Had to move the macro ca" [integration/config] - https://gerrit.wikimedia.org/r/297126 (https://phabricator.wikimedia.org/T121479) (owner: JanZerebecki)
[13:02:42] (CR) Hashar: [C: 2] Normalize prepare-mediawiki-zuul macro to use deps.txt [integration/config] - https://gerrit.wikimedia.org/r/318526 (owner: Hashar)
[13:03:18] Continuous-Integration-Infrastructure, Patch-For-Review: Zuul-cloner fails in mediawiki-extensions-hhvm job due to dirty VisualEditor submodule - https://phabricator.wikimedia.org/T121479#2751770 (hashar) Open>Resolved a:hashar Should be good now.
[13:04:53] (Merged) jenkins-bot: Normalize prepare-mediawiki-zuul macro to use deps.txt [integration/config] - https://gerrit.wikimedia.org/r/318526 (owner: Hashar)
[13:05:26] (Merged) jenkins-bot: Fix dirty VisualEditor submodule [integration/config] - https://gerrit.wikimedia.org/r/297126 (https://phabricator.wikimedia.org/T121479) (owner: JanZerebecki)
[13:27:06] (PS1) Robert Vogel: Add jenkins tests [integration/config] - https://gerrit.wikimedia.org/r/318536
[13:32:00] (PS2) Robert Vogel: Add jenkins tests [integration/config] - https://gerrit.wikimedia.org/r/318536
[13:40:00] RECOVERY - SSH on deployment-sca02 is OK: SSH OK - OpenSSH_6.7p1 Debian-5+deb8u3 (protocol 2.0)
[13:46:00] PROBLEM - SSH on deployment-sca02 is CRITICAL: Server answer
[13:52:43] (PS1) Hashar: Simplify zuul-cloner-extdeps [integration/config] - https://gerrit.wikimedia.org/r/318541
[13:54:01] (CR) Hashar: [C: 2] Simplify zuul-cloner-extdeps [integration/config] - https://gerrit.wikimedia.org/r/318541 (owner: Hashar)
[13:54:49] (Merged) jenkins-bot: Simplify zuul-cloner-extdeps [integration/config] - https://gerrit.wikimedia.org/r/318541 (owner: Hashar)
[14:04:12] (PS1) Hashar: Wikibase: hardcode ext-name [integration/config] - https://gerrit.wikimedia.org/r/318544
[14:06:42] (CR) Tobias Gritschacher: [C: 1] Pin selenium-webdriver < 3 [selenium] - https://gerrit.wikimedia.org/r/318311 (https://phabricator.wikimedia.org/T149319) (owner: Hashar)
[14:10:43] (PS1) Hashar: zuul-cloner-extdeps drop 'ext-name' parameter [integration/config] - https://gerrit.wikimedia.org/r/318546
[14:10:55] (CR) Hashar: [C: 2] Wikibase: hardcode ext-name [integration/config] - https://gerrit.wikimedia.org/r/318544 (owner: Hashar)
[14:11:52] (Merged) jenkins-bot: Wikibase: hardcode ext-name [integration/config] - https://gerrit.wikimedia.org/r/318544 (owner: Hashar)
[14:13:18] (CR) Hashar: [C: 2] zuul-cloner-extdeps
drop 'ext-name' parameter [integration/config] - https://gerrit.wikimedia.org/r/318546 (owner: Hashar)
[14:15:32] (Merged) jenkins-bot: zuul-cloner-extdeps drop 'ext-name' parameter [integration/config] - https://gerrit.wikimedia.org/r/318546 (owner: Hashar)
[14:26:01] (PS3) Hashar: Add jenkins tests [integration/config] - https://gerrit.wikimedia.org/r/318536 (owner: Robert Vogel)
[14:26:06] (CR) Hashar: [C: 2] Add jenkins tests [integration/config] - https://gerrit.wikimedia.org/r/318536 (owner: Robert Vogel)
[14:26:18] (CR) Hashar: "Thanks Robert :]" [integration/config] - https://gerrit.wikimedia.org/r/318536 (owner: Robert Vogel)
[14:28:05] (Merged) jenkins-bot: Add jenkins tests [integration/config] - https://gerrit.wikimedia.org/r/318536 (owner: Robert Vogel)
[14:46:35] PROBLEM - Puppet run on zuul-dev-jessie is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[14:53:11] PROBLEM - Host deployment-pdf02 is DOWN: CRITICAL - Host Unreachable (10.68.16.129)
[14:54:29] PROBLEM - Host deployment-conftool is DOWN: CRITICAL - Host Unreachable (10.68.20.30)
[15:21:59] RECOVERY - nodepoold running on labnodepool1001 is OK: PROCS OK: 1 process with UID = 113 (nodepool), regex args ^/usr/bin/python /usr/bin/nodepoold -d
[15:25:05] PROBLEM - Puppet run on deployment-zotero01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[15:25:12] PROBLEM - Puppet run on deployment-elastic06 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[15:25:17] PROBLEM - Puppet run on integration-slave-trusty-1012 is CRITICAL: CRITICAL: 83.33% of data above the critical threshold [0.0]
[15:25:17] PROBLEM - Puppet run on deployment-ores-redis is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[15:25:18] Project beta-scap-eqiad build #126359: FAILURE in 17 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/126359/
[15:26:33] PROBLEM - Puppet run
on deployment-mira is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[15:26:37] PROBLEM - Puppet run on deployment-redis02 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[15:26:41] PROBLEM - Puppet run on deployment-apertium02 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[15:26:41] PROBLEM - Puppet run on deployment-mathoid is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[15:26:45] PROBLEM - Puppet run on deployment-mx is CRITICAL: CRITICAL: 75.00% of data above the critical threshold [0.0]
[15:26:47] PROBLEM - Puppet run on integration-slave-jessie-android is CRITICAL: CRITICAL: 37.50% of data above the critical threshold [0.0]
[15:26:48] PROBLEM - Puppet run on deployment-kafka04 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[15:26:52] PROBLEM - Puppet run on integration-slave-trusty-1004 is CRITICAL: CRITICAL: 42.86% of data above the critical threshold [0.0]
[15:27:30] PROBLEM - Puppet run on deployment-poolcounter04 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[15:27:58] PROBLEM - Puppet run on deployment-eventlogging03 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[15:28:00] PROBLEM - Puppet run on deployment-puppetmaster is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[15:29:50] Project beta-scap-eqiad build #126360: STILL FAILING in 20 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/126360/
[15:30:58] PROBLEM - Puppet run on deployment-pdf01 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[15:31:08] PROBLEM - Puppet run on deployment-tin is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[15:31:14] PROBLEM - Puppet run on deployment-sca01 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[15:31:21] PROBLEM - Puppet run on integration-slave-trusty-1006 is
CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[15:31:25] PROBLEM - Puppet run on deployment-kafka03 is CRITICAL: CRITICAL: 37.50% of data above the critical threshold [0.0]
[15:31:35] PROBLEM - Puppet run on integration-publisher is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[15:31:35] PROBLEM - Puppet run on deployment-secureredirexperiment is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[15:31:43] PROBLEM - Puppet run on deployment-ms-be01 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[15:32:11] PROBLEM - Puppet run on deployment-ircd is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[15:32:18] PROBLEM - Puppet run on deployment-memc04 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[15:32:31] PROBLEM - Puppet run on deployment-tmh01 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[15:32:51] PROBLEM - Puppet run on deployment-imagescaler01 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[15:32:53] PROBLEM - Puppet run on deployment-db03 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[15:33:02] PROBLEM - Puppet run on deployment-kafka01 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[15:33:02] PROBLEM - Puppet run on deployment-sentry01 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[15:33:23] Scap3, Parsoid: Limited deploy still wants to deploy to canaries - https://phabricator.wikimedia.org/T149128#2752255 (thcipriani) Open>Resolved
[15:34:52] RECOVERY - Host Generic Beta Cluster is UP: PING OK - Packet loss = 0%, RTA = 0.51 ms
[15:36:55] Yippee, build fixed!
[15:36:56] Project beta-scap-eqiad build #126361: FIXED in 1 min 56 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/126361/
[15:37:52] RECOVERY - Puppet run on deployment-mediawiki04 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:39:48] PROBLEM - Puppet run on deployment-phab02 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[15:42:51] Scap3, Parsoid: Limited deploy still wants to deploy to canaries - https://phabricator.wikimedia.org/T149128#2752310 (thcipriani) This fix will go out with the next release, FYI.
[15:45:55] RECOVERY - Puppet run on deployment-memc05 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:46:45] RECOVERY - Puppet run on deployment-cache-upload04 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:47:09] RECOVERY - Puppet run on integration-slave-trusty-1011 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:47:23] RECOVERY - Puppet run on deployment-ms-be02 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:47:24] RECOVERY - Puppet run on deployment-mediawiki06 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:47:54] RECOVERY - Puppet run on deployment-elastic05 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:48:35] RECOVERY - Puppet run on deployment-db04 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:51:04] PROBLEM - Puppet run on deployment-phab01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[15:51:30] RECOVERY - Puppet run on deployment-cache-text04 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:51:42] RECOVERY - Puppet run on deployment-mathoid is OK: OK: Less than 1.00% above the threshold [0.0]
[15:51:50] RECOVERY - Puppet run on deployment-kafka04 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:52:03] RECOVERY - Puppet run on integration-slave-jessie-1001 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:52:59] RECOVERY - Puppet run on repository is OK: OK: Less than 1.00%
above the threshold [0.0]
[15:53:47] Project selenium-MobileFrontend » firefox,beta,Linux,contintLabsSlave && UbuntuTrusty build #207: FAILURE in 31 min: https://integration.wikimedia.org/ci/job/selenium-MobileFrontend/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/207/
[15:55:05] RECOVERY - Puppet run on deployment-zotero01 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:55:52] RECOVERY - Puppet run on deployment-logstash2 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:56:19] RECOVERY - Puppet run on deployment-changeprop is OK: OK: Less than 1.00% above the threshold [0.0]
[15:56:20] RECOVERY - Puppet run on deployment-salt02 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:56:36] RECOVERY - Puppet run on deployment-redis02 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:00:12] RECOVERY - Puppet run on integration-slave-trusty-1012 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:00:14] RECOVERY - Puppet run on deployment-elastic06 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:00:14] RECOVERY - Puppet run on integration-slave-trusty-1016 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:00:15] RECOVERY - Puppet run on deployment-ores-redis is OK: OK: Less than 1.00% above the threshold [0.0]
[16:00:22] RECOVERY - Puppet run on castor is OK: OK: Less than 1.00% above the threshold [0.0]
[16:00:31] RECOVERY - Puppet run on integration-slave-trusty-1001 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:00:37] RECOVERY - Puppet run on deployment-prometheus01 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:00:37] RECOVERY - Puppet run on deployment-jobrunner02 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:00:39] RECOVERY - Puppet run on deployment-kafka05 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:01:15] RECOVERY - Puppet run on integration-puppetmaster01 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:01:31] RECOVERY - Puppet run on deployment-mira is OK: OK: Less than 1.00% above the threshold [0.0]
[16:01:41] RECOVERY - Puppet run on deployment-apertium02 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:01:45] RECOVERY - Puppet run on deployment-mx is OK: OK: Less than 1.00% above the threshold [0.0]
[16:01:46] RECOVERY - Puppet run on deployment-sca03 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:01:47] RECOVERY - Puppet run on integration-slave-jessie-android is OK: OK: Less than 1.00% above the threshold [0.0]
[16:03:03] RECOVERY - Puppet run on deployment-puppetmaster is OK: OK: Less than 1.00% above the threshold [0.0]
[16:05:44] RECOVERY - Puppet run on integration-slave-jessie-1003 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:05:48] RECOVERY - Puppet run on deployment-aqs01 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:05:56] RECOVERY - Puppet run on deployment-pdf01 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:06:10] RECOVERY - Puppet run on deployment-tin is OK: OK: Less than 1.00% above the threshold [0.0]
[16:06:16] RECOVERY - Puppet run on deployment-sca01 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:06:22] RECOVERY - Puppet run on deployment-stream is OK: OK: Less than 1.00% above the threshold [0.0]
[16:06:34] RECOVERY - Puppet run on deployment-secureredirexperiment is OK: OK: Less than 1.00% above the threshold [0.0]
[16:06:50] RECOVERY - Puppet run on integration-slave-trusty-1004 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:07:51] RECOVERY - Puppet run on deployment-imagescaler01 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:11:21] RECOVERY - Puppet run on integration-slave-trusty-1006 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:11:26] RECOVERY - Puppet run on deployment-kafka03 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:11:35] RECOVERY - Puppet run on integration-publisher is OK: OK: Less than 1.00% above the threshold
[0.0]
[16:11:46] RECOVERY - Puppet run on deployment-ms-be01 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:12:22] RECOVERY - Puppet run on deployment-memc04 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:12:34] RECOVERY - Puppet run on deployment-tmh01 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:12:54] RECOVERY - Puppet run on deployment-db03 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:13:02] RECOVERY - Puppet run on deployment-kafka01 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:21:23] RECOVERY - Puppet run on integration-slave-precise-1002 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:48:03] ohai shinken-wm
[16:52:30] RECOVERY - Puppet run on integration-slave-trusty-1003 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:55:43] (PS4) Andrew Bogott: Puppet doc now ignore /bin files [integration/config] - https://gerrit.wikimedia.org/r/309332 (https://phabricator.wikimedia.org/T143233) (owner: Hashar)
[16:58:25] hasharAway: care to merge ^ ?
[16:59:38] or… anyone else who has +2 in that repo?
[17:54:52] greg-g, how many changes per hour do we have?
[17:55:44] Krenair: I'd have to do some sleuthing
[17:56:01] I need to run an errand, bbiab
[17:56:08] ah, nevermind then
[18:00:22] (CR) Paladox: [C: 1] Puppet doc now ignore /bin files [integration/config] - https://gerrit.wikimedia.org/r/309332 (https://phabricator.wikimedia.org/T143233) (owner: Hashar)
[18:57:23] Continuous-Integration-Infrastructure, Ladies-That-FOSS-MediaWiki, Patch-For-Review: Jenkins: Set up PHPUnit testing on PostgreSQL backend - https://phabricator.wikimedia.org/T39602#2752709 (saper)
[19:52:42] (CR) Hashar: [C: -1] "I thought I had that one fixed!
From T143233 it is not quite right yet, the HTML links use absolute path and are thus all broken :(" [integration/config] - https://gerrit.wikimedia.org/r/309332 (https://phabricator.wikimedia.org/T143233) (owner: Hashar)
[20:22:21] (PS5) Hashar: Puppet doc now deletes bin directories [integration/config] - https://gerrit.wikimedia.org/r/309332 (https://phabricator.wikimedia.org/T143233)
[20:23:11] unable to convert "\xB0" from ASCII-8BIT to UTF-8 for files/misc/geoiplogtag, skipping
[20:23:20] bah
[20:33:15] andrewbogott: I am trying to get the puppet doc generation job fixed :/
[20:40:23] hasharAway: thanks! Apparently it's not so simple :(
[20:45:18] andrewbogott: I am tired and not paying attention :D
[20:45:21] that is the main cause
[20:45:38] (PS6) Hashar: Puppet doc now deletes bin directories [integration/config] - https://gerrit.wikimedia.org/r/309332 (https://phabricator.wikimedia.org/T143233)
[20:45:43] ok — it's not urgent, just messy
[20:45:51] cause really: rm 'modules/*/bin'
[20:45:57] that escapes the wildcard :D
[20:47:35] (CR) Andrew Bogott: [C: 1] "Seems reasonable to me..." [integration/config] - https://gerrit.wikimedia.org/r/309332 (https://phabricator.wikimedia.org/T143233) (owner: Hashar)
[20:48:29] https://integration.wikimedia.org/ci/job/operations-puppet-doc/27478/console
[20:48:36] lets see what goes on there
[20:50:00] (PS1) Hashar: Fix puppet-doc base dir for operations-puppet [integration/config] - https://gerrit.wikimedia.org/r/318642
[20:58:42] andrewbogott: ah it completed something :)
[21:01:01] (CR) Hashar: [C: 2] "The job pass now. The output is complete crap though but at least it is no more failing."
[integration/config] - https://gerrit.wikimedia.org/r/309332 (https://phabricator.wikimedia.org/T143233) (owner: Hashar)
[21:01:03] (CR) Hashar: [C: 2] Fix puppet-doc base dir for operations-puppet [integration/config] - https://gerrit.wikimedia.org/r/318642 (owner: Hashar)
[21:02:43] (Merged) jenkins-bot: Puppet doc now deletes bin directories [integration/config] - https://gerrit.wikimedia.org/r/309332 (https://phabricator.wikimedia.org/T143233) (owner: Hashar)
[21:02:45] (Merged) jenkins-bot: Fix puppet-doc base dir for operations-puppet [integration/config] - https://gerrit.wikimedia.org/r/318642 (owner: Hashar)
[22:42:23] twentyafterfour: i'm just adding new stuff for phab2001 but not touching the existing iridium one .. because Friday etc :)
[22:42:32] so no worries about the current prod
[22:47:18] mutante: ok cool
[22:47:41] I'm just following along, trying to learn a bit more about our network architecture
[22:49:52] twentyafterfour: yes, the networks are like this https://phabricator.wikimedia.org/T143363#2753335
[22:50:06] and i look them up like this:
[22:50:13] /wmf/dns/templates$ grep public 153.80.208.in-addr.arpa
[22:50:13] ; 208.80.153.0/27 (public1-a-codfw)
[22:50:13] ; 208.80.153.32/27 (public1-b-codfw)
[22:50:14] ..
[22:50:48] twentyafterfour: what i am wondering next is that we have:
[22:50:55] 1 ssh::server::listen_address: "10.64.32.150"
[22:51:09] for the existing one in eqiad
[22:51:21] but that is not iridium-vcs.eqiad.wmnet
[22:51:43] that is 10.64.32.186
[22:52:07] 31.150 is iridium.eqiad.wmnet itself
[22:52:24] does that make sense for the ssh::server::listen_address ?
[22:52:36] mutante: there should be two ssh servers
[22:52:47] the main ssh server which should be on the iridium ip
[22:52:56] and the git ssh server which should be a separate ip
[22:52:57] yes, so i expected this variable
[22:53:02] to mean the second SSH server
[22:53:09] but i am wrong it seems
[22:53:27] ssh::server::listen_address is the main one, then
[22:53:45] I'm pretty sure the second ssh server is configured in phabricator::vcs puppet module
[22:54:16] role/eqiad/phabricator/main.yaml:phabricator::vcs::listen_addresses:
[22:54:41] ah, yea, that is this change https://gerrit.wikimedia.org/r/#/c/317295/
[22:54:53] we can merge that now, it only affects codfw
[22:55:05] then we should be able to re-enable puppet again on phab2001
[22:55:10] without getting that duplicate IP
[22:55:18] but instead getting the right one now
[23:01:15] mutante: sweet!
[23:03:27] awesome, mutante
[23:04:56] mutante: I rebased your patch, should be ready to submit now
[23:06:23] yep, in one minute, multi-tasking with papaul
[23:11:00] :)
[23:15:17] twentyafterfour: re-enabled puppet.. running it
[23:15:48] it needed that for other stuff too. for example i see it gets now that Icinga is not on neon anymore and stuff
[23:16:33] it is not starting ssh-phab yet.. but that is for later
[23:16:42] just looking at that IP now
[23:17:01] cool
[23:17:16] hmm. not quite there yet
[23:17:21] we might need another puppet change
[23:17:31] if we get the git ssh port working I can clusterize all of our repos and then we'll have geographically distributed, master-master replicated git
[23:17:36] gotta check where we do the hiera lookup
[23:17:40] hmm
[23:17:42] it _should_ have happened automatically
[23:17:52] since ./eqiad/ vs ./codfw/ in hiera
[23:18:07] but we still have the 10.64.32.186 on both
[23:18:19] i'm removing that again
[23:18:43] and disabling puppet again
[23:19:00] but still we got some other changes applied, heh
[23:20:37] hmm I see interface::ip in the role is hardcoded?
[23:21:14] yes, so we have this:
[23:21:20] modules/role/manifests/phabricator/main.pp: address => '10.64.32.186',
[23:21:22] * twentyafterfour needs to update puppet
[23:21:23] modules/role/manifests/phabricator/main.pp: address => '2620:0:861:103:10:64:32:186',
[23:21:26] but we already have this:
[23:21:29] hieradata/role/eqiad/phabricator/main.yaml: - "10.64.32.186"
[23:21:29] hieradata/role/eqiad/phabricator/main.yaml: - "[2620:0:861:103:10:64:32:186]"
[23:21:37] we want it to look that up in hiera
[23:21:46] i just assumed it does because that was already there
[23:22:39] does hiera work like that?
[23:22:49] I thought it only worked on class parameters like that
[23:23:01] but not resources which interface::ip is a resource not a class
[23:23:30] it's just not automatic, we have to tell it to look it up , like ..
[23:23:55] hiera()
[23:24:20] $servers = hiera('puppetmaster::servers', {})
[23:24:31] similar to that and then use the variable, yes
[23:25:21] look, i have to bring back a rental car by 5pm
[23:25:27] because my car broke, heh
[23:25:34] mutante: no prob, I can submit a patch for this
[23:25:37] but you get the idea.. we want a lookup
[23:25:40] cool, olk
[23:25:42] ok
[23:25:47] :)
[23:25:55] :) cu later then
[23:26:01] thanks!
[23:26:09] yw
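
A minimal Puppet sketch of the lookup discussed above: the hardcoded `address` values in modules/role/manifests/phabricator/main.pp are replaced by explicit `hiera()` calls, so each site's yaml (./eqiad/ vs ./codfw/) can supply its own IP. This is not the patch that was eventually submitted. The two hiera keys and the resource titles are hypothetical, and the `interface` and `prefixlen` parameters and their values are assumptions, since the log only shows the `address` parameter. Separate keys are used here because the existing `phabricator::vcs::listen_addresses` list carries the IPv6 entry in brackets, which would probably not be usable as an interface address as-is.

```puppet
# Sketch only. The two hiera keys and the resource titles are hypothetical;
# 'eth0' and the prefix lengths are placeholders, not values from the log.
# Called without a default, hiera() makes compilation fail when the key is
# missing, so a site without its own value fails loudly instead of silently
# inheriting another site's IP.
$vcs_ipv4 = hiera('phabricator::vcs::address_v4')
$vcs_ipv6 = hiera('phabricator::vcs::address_v6')

# interface::ip is a defined resource, so automatic class-parameter lookup
# does not apply to it; the looked-up variables are passed in explicitly.
interface::ip { 'phabricator-vcs-ipv4':
    interface => 'eth0',
    address   => $vcs_ipv4,
    prefixlen => '21',
}

interface::ip { 'phabricator-vcs-ipv6':
    interface => 'eth0',
    address   => $vcs_ipv6,
    prefixlen => '128',
}
```

With something like this in place, hieradata/role/eqiad/phabricator/main.yaml and its codfw counterpart would each define the two keys with that site's addresses.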