[01:26:06] PROBLEM - Puppet staleness on integration-slave-trusty-1013 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [43200.0]
[04:18:43] Yippee, build fixed!
[04:18:43] Project selenium-MultimediaViewer » firefox,beta,Linux,contintLabsSlave && UbuntuTrusty build #125: FIXED in 22 min: https://integration.wikimedia.org/ci/job/selenium-MultimediaViewer/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/125/
[04:54:33] Beta-Cluster-Infrastructure, Labs: puppet::self hosts now have two servers set - https://phabricator.wikimedia.org/T144108#2589909 (yuvipanda) @mmodell https://phabricator.wikimedia.org/T120159
[04:55:50] Beta-Cluster-Infrastructure, Labs: puppet::self hosts now have two servers set - https://phabricator.wikimedia.org/T144108#2589910 (yuvipanda) role::puppet::self for puppet *clients* is doubly terrible. I'll spend next week getting rid of that across labs - see https://phabricator.wikimedia.org/T120159#2...
[05:34:07] RECOVERY - Long lived cherry-picks on puppetmaster on deployment-puppetmaster is OK: OK: Less than 100.00% above the threshold [0.0]
[06:06:13] Beta-Cluster-Infrastructure, ContentTranslation-Deployments, MediaWiki-extensions-ContentTranslation: Beta: cxserver is not updated - https://phabricator.wikimedia.org/T144149#2589947 (KartikMistry)
[06:12:20] Beta-Cluster-Infrastructure, ContentTranslation-Deployments, MediaWiki-extensions-ContentTranslation: Beta: cxserver is not updated - https://phabricator.wikimedia.org/T144149#2589959 (KartikMistry) Open>Invalid No. We use sca01, so this is invalid.
[06:50:00] Project selenium-Wikibase » firefox,test,Linux,contintLabsSlave && UbuntuTrusty build #91: FAILURE in 2 hr 9 min: https://integration.wikimedia.org/ci/job/selenium-Wikibase/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=test,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/91/
[07:50:01] !log integration-slave-trusty-1013 puppet.conf certname was set to 'undef' breaking puppet
[07:50:05] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master
[07:55:43] RECOVERY - SSH on integration-slave-trusty-1012 is OK: SSH OK - OpenSSH_6.6.1p1 Ubuntu-2ubuntu2.8 (protocol 2.0)
[07:56:44] !log hard rebooting integration-slave-trusty-1012 via horizon and restarting puppet manually
[07:56:47] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master
[08:00:35] PROBLEM - Puppet staleness on integration-slave-trusty-1012 is CRITICAL: CRITICAL: 37.50% of data above the critical threshold [43200.0]
[08:05:36] RECOVERY - Puppet staleness on integration-slave-trusty-1012 is OK: OK: Less than 1.00% above the threshold [3600.0]
[08:06:04] RECOVERY - Puppet staleness on integration-slave-trusty-1013 is OK: OK: Less than 1.00% above the threshold [3600.0]
[08:17:10] Beta-Cluster-Infrastructure, ContentTranslation-Deployments, MediaWiki-extensions-ContentTranslation: Beta: cxserver is not updated - https://phabricator.wikimedia.org/T144149#2590122 (Krenair) Invalid>Open No, it definitely has cxserver: krenair@bastion-01:~$ ldapsearch -x dc=deployment-sca0...
[08:23:25] Continuous-Integration-Infrastructure, Operations, Zuul: Upgrade Zuul on gallium to 2.5.0-8-gcbc7f62-wmf1precise1 - https://phabricator.wikimedia.org/T144088#2590129 (hashar) Got Zuul upgraded on gallium (the Zuul server). Dependencies upgrades I failed to add in the changelog: | Module | Old | New...
[08:25:42] Continuous-Integration-Infrastructure, Upstream, WorkType-Maintenance, Zuul: Zuul deadlocks if unknown repo has activity in Gerrit - https://phabricator.wikimedia.org/T128569#2590133 (hashar) I have upgraded Zuul server on gallium with `zuul_2.5.0-8-gcbc7f62-wmf1precise1`. It includes upstream pa...
[08:31:04] Continuous-Integration-Infrastructure, Upstream, WorkType-Maintenance, Zuul: Zuul deadlocks if unknown repo has activity in Gerrit - https://phabricator.wikimedia.org/T128569#2590140 (hashar) Open>Resolved The main cause has been fixed by upstream commit: 0deaaadac7143692961b9d28abee8cea5...
[08:36:30] Deployment-Systems, Scap3: Update Debian Package for Scap3 - https://phabricator.wikimedia.org/T127762#2590155 (fgiunchedi) Open>Resolved @thcipriani for sure! package is built/uploaded
[08:39:05] godog: do you also mass upgrade the salt package on the fleet?
[08:39:25] err
[08:39:33] godog: do you also mass upgrade the **scap** package on the fleet?
[08:39:52] hashar: nope that'll be taken care of by https://gerrit.wikimedia.org/r/#/c/307028/ once it is merged, likely later today or tomorrow's puppet swat at the latest
[08:40:11] ah the version is pinned in puppet
[08:40:12] neat
[08:58:47] RECOVERY - Puppet run on deployment-changeprop is OK: OK: Less than 1.00% above the threshold [0.0]
[09:11:43] Beta-Cluster-Infrastructure, ContentTranslation-Deployments, MediaWiki-extensions-ContentTranslation: Beta: cxserver is not updated - https://phabricator.wikimedia.org/T144149#2590235 (hashar) On beta, cxserver has been moved from a dedicated instance to the deployment-scaXX machine. That has been do...
[09:12:17] kart_: I have added a few more details to the "cxserver does not update" task
[09:49:58] Continuous-Integration-Infrastructure, WorkType-NewFunctionality, Zuul: On slaves, install zuul-cloner from Gerrit in a venv instead of using a deb package - https://phabricator.wikimedia.org/T113538#2590308 (hashar) Open>declined The debian packaging dance is good enough for now. Might revis...
[10:08:35] Continuous-Integration-Infrastructure, Packaging, Zuul: Package / puppetize zuul-clear-refs.py - https://phabricator.wikimedia.org/T103529#2590324 (hashar) a:hashar
[10:25:45] Continuous-Integration-Config, Regression, Zuul: integration-zuul-layoutdiff claims difference when there is none - https://phabricator.wikimedia.org/T143966#2590330 (hashar) p:Triage>Normal
[10:29:16] (PS1) Hashar: Add __repr__ to a few classes [integration/zuul] (patch-queue/debian/precise-wikimedia) - https://gerrit.wikimedia.org/r/307258
[10:29:18] (PS1) Hashar: Don't merge queues if the common job is 'noop' [integration/zuul] (patch-queue/debian/precise-wikimedia) - https://gerrit.wikimedia.org/r/307259
[10:30:04] (CR) Hashar: [C: 2 V: 2] "That is for T143966 "integration-zuul-layoutdiff claims difference when there is none"" [integration/zuul] (patch-queue/debian/precise-wikimedia) - https://gerrit.wikimedia.org/r/307258 (owner: Hashar)
[10:30:31] (CR) Hashar: [C: 2 V: 2] "Cherry pick from PS2 of upstream patch.
Should fix the issue, will get it included in the next zuul.deb" [integration/zuul] (patch-queue/debian/precise-wikimedia) - https://gerrit.wikimedia.org/r/307259 (owner: Hashar)
[10:37:54] (PS1) Hashar: Couple patches pending upstream merge [integration/zuul] (debian/precise-wikimedia) - https://gerrit.wikimedia.org/r/307264 (https://phabricator.wikimedia.org/T143966)
[10:37:56] (PS1) Hashar: New release 2.5.0-8-gcbc7f62-wmf2precise1 [integration/zuul] (debian/precise-wikimedia) - https://gerrit.wikimedia.org/r/307265
[10:39:02] (CR) Paladox: [C: 1] New release 2.5.0-8-gcbc7f62-wmf2precise1 [integration/zuul] (debian/precise-wikimedia) - https://gerrit.wikimedia.org/r/307265 (owner: Hashar)
[10:40:43] Continuous-Integration-Config, Patch-For-Review, Regression, Upstream, Zuul: integration-zuul-layoutdiff claims difference when there is none - https://phabricator.wikimedia.org/T143966#2590368 (hashar) p:Normal>Low Fixed by cherry picking PS2 of https://review.openstack.org/#/c/36106...
[10:41:19] hashar: okay!
[10:43:03] Beta-Cluster-Infrastructure, ContentTranslation-Deployments, MediaWiki-extensions-ContentTranslation: Beta: cxserver is not updated - https://phabricator.wikimedia.org/T144149#2590374 (KartikMistry) @Krenair Beta can't load balance, so only using sca01 instance.
[10:46:12] (CR) Hashar: [C: 2 V: 2] Couple patches pending upstream merge [integration/zuul] (debian/precise-wikimedia) - https://gerrit.wikimedia.org/r/307264 (https://phabricator.wikimedia.org/T143966) (owner: Hashar)
[10:46:18] (CR) Hashar: [C: 2 V: 2] New release 2.5.0-8-gcbc7f62-wmf2precise1 [integration/zuul] (debian/precise-wikimedia) - https://gerrit.wikimedia.org/r/307265 (owner: Hashar)
[10:50:43] Continuous-Integration-Infrastructure, Operations, Zuul: Upgrade Zuul on gallium to 2.5.0-8-gcbc7f62-wmf2precise1 - https://phabricator.wikimedia.org/T144088#2590397 (hashar)
[10:51:51] Continuous-Integration-Infrastructure, Upstream, WorkType-Maintenance, Zuul: Zuul deadlocks if unknown repo has activity in Gerrit - https://phabricator.wikimedia.org/T128569#2590405 (hashar)
[10:51:53] Continuous-Integration-Infrastructure, Operations, Zuul: Upgrade Zuul on gallium to 2.5.0-8-gcbc7f62-wmf2precise1 - https://phabricator.wikimedia.org/T144088#2588273 (hashar) Open>Resolved I have build yet another package to cherry pick a few more patches we needed. New package is on gallium...
[10:52:39] Continuous-Integration-Infrastructure, Upstream, WorkType-Maintenance, Zuul: Zuul deadlocks if unknown repo has activity in Gerrit - https://phabricator.wikimedia.org/T128569#2079165 (hashar)
[10:52:41] Continuous-Integration-Infrastructure, Operations, Zuul: Upgrade Zuul on gallium to 2.5.0-8-gcbc7f62-wmf2precise1 - https://phabricator.wikimedia.org/T144088#2590406 (hashar) Resolved>Open did not meant to resolve it sorry. Still need upload to apt.wm.o
[10:55:05] Continuous-Integration-Infrastructure, Packaging, Zuul: Package / puppetize zuul-clear-refs.py - https://phabricator.wikimedia.org/T103529#2590427 (hashar) I have included the script in our Debian package for Precise which ship it as `/usr/bin/zuul-clear-ref`. That is meant to be used on zuul-merger...
[11:15:30] Continuous-Integration-Config, Operations, Operations-Software-Development: Flake8 for python files without extension in puppet repo - https://phabricator.wikimedia.org/T144169#2590514 (Volans)
[11:38:21] Continuous-Integration-Config, Operations, Operations-Software-Development: Flake8 for python files without extension in puppet repo - https://phabricator.wikimedia.org/T144169#2590544 (hashar) A bit of context for the CI part: The Jenkins job `operations-puppet-tox` is pretty simple, it basically:...
[12:03:51] (PS1) Paladox: Merge branch 'debian/precise-wikimedia' into debian/jessie-wikimedia [integration/zuul] (debian/jessie-wikimedia) - https://gerrit.wikimedia.org/r/307276
[12:03:53] (PS1) Paladox: New release 2.5.0-8-gcbc7f62-wmf2jessie1 [integration/zuul] (debian/jessie-wikimedia) - https://gerrit.wikimedia.org/r/307277
[12:04:03] (Abandoned) Paladox: 2.5.0-8-gcbc7f62-wmf1jessie1 [integration/zuul] (debian/jessie-wikimedia) - https://gerrit.wikimedia.org/r/307063 (owner: Paladox)
[12:04:11] (Abandoned) Paladox: Merge remote-tracking branch 'origin/debian/precise-wikimedia' into debian/jessie-wikimedia [integration/zuul] (debian/jessie-wikimedia) - https://gerrit.wikimedia.org/r/307066 (owner: Paladox)
[12:14:55] Deployment-Systems, Scap3, scap, Operations: Make keyholder work with systemd - https://phabricator.wikimedia.org/T144043#2590601 (MoritzMuehlenhoff) a:MoritzMuehlenhoff I'll look into that.
[12:30:57] (CR) Mobrovac: [C: 1] Parsoid's tool and roundtrip tests should be on node v4 [integration/config] - https://gerrit.wikimedia.org/r/306710 (owner: Arlolra)
[12:43:29] Beta-Cluster-Infrastructure, Operations, Traffic, Patch-For-Review, WorkType-Maintenance: On beta cluster varnish stats process points to production statsd - https://phabricator.wikimedia.org/T116898#2590674 (hashar) I have rebased the Puppet patch https://gerrit.wikimedia.org/r/#/c/249490/...
[12:46:07] Beta-Cluster-Infrastructure, Operations, Traffic, Patch-For-Review, WorkType-Maintenance: On beta cluster varnish stats process points to production statsd - https://phabricator.wikimedia.org/T116898#2590718 (hashar) The patch has been cherry picked on beta cluster for quite a while already s...
[12:55:39] Continuous-Integration-Config, Operations, Operations-Software-Development: Flake8 for python files without extension in puppet repo - https://phabricator.wikimedia.org/T144169#2590769 (Volans)
[12:57:59] Continuous-Integration-Config, Operations, Operations-Software-Development: Flake8 for python files without extension in puppet repo - https://phabricator.wikimedia.org/T144169#2590775 (Volans) >>! In T144169#2590544, @hashar wrote: > So for 1 //Fix the Jenkins job to search for those files and inclu...
[12:58:08] Deployment-Systems, Labs, Tool-Labs: Add release engineering people to tools.jouncebot user group - https://phabricator.wikimedia.org/T144175#2590777 (hashar)
[13:35:33] Deployment-Systems, Labs, Tool-Labs: Add release engineering people to tools.jouncebot user group - https://phabricator.wikimedia.org/T144175#2590935 (chasemp) p:Triage>Normal 18 members of https://phabricator.wikimedia.org/project/members/20/, some of whom I don't recognize. Please provide...
[13:47:14] hashar: hey yo, I've got like an hour an half want to revert the smallest of https://phabricator.wikimedia.org/T143938
[13:48:14] Project selenium-VisualEditor » firefox,beta,Linux,contintLabsSlave && UbuntuTrusty build #129: FAILURE in 4 min 12 sec: https://integration.wikimedia.org/ci/job/selenium-VisualEditor/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/129/
[13:52:06] Deployment-Systems, Labs, Tool-Labs: Add release engineering people to tools.jouncebot user group - https://phabricator.wikimedia.org/T144175#2590981 (hashar)
[13:52:32] Deployment-Systems, Labs, Tool-Labs: Add release engineering people to tools.jouncebot user group - https://phabricator.wikimedia.org/T144175#2590777 (hashar) My bad sorry. Edited with list of each of our labs shell accounts: ``` dduvall demon gjg hashar twentyafterfour thcipriani zfilipin ```
[13:53:27] hiy ok! i just merged https://gerrit.wikimedia.org/r/#/c/307288/, so that should go out with this week's deploy train, ja?
[13:53:40] let me know if I need to do anything else (e.g. create the wmf version branch)
[13:54:30] ottomata: that will be included when I cut the branch tomorrow around 10-11am UTC
[13:55:07] and if needed, can be cherry pick to the current version ( wmf.16 ) and added in the swat deploy
[13:56:07] Beta-Cluster-Infrastructure, Release-Engineering-Team: Beta puppetmaster cherry-pick process - https://phabricator.wikimedia.org/T135427#2591000 (hashar)
[13:56:09] Beta-Cluster-Infrastructure, Operations, Traffic, Patch-For-Review, WorkType-Maintenance: On beta cluster varnish stats process points to production statsd - https://phabricator.wikimedia.org/T116898#2590998 (hashar) Open>Resolved Has been kindly reviewed and merged in by @BBlack . On...
[13:56:45] Beta-Cluster-Infrastructure, Release-Engineering-Team: Beta puppetmaster cherry-pick process - https://phabricator.wikimedia.org/T135427#2298408 (hashar) >>! In T135427#2588200, @hashar wrote: > |@hashar|cache: vary statsd_server with hiera|https://gerrit.wikimedia.org/r/#/c/249490/ Got reviewed, merged...
[13:59:48] nope, not needed
[13:59:51] should go on regular train hashar
[14:00:05] i'll test that stuff on wikitech when it goes out tomorrow (right?)
[14:00:15] and then be around to babysit when it goes out to other wikis on wednesday (right?)
[14:00:56] Continuous-Integration-Config, Patch-For-Review: Add Python validation to operations/software repo - https://phabricator.wikimedia.org/T143559#2591012 (Volans) Open>Resolved a:Volans
[14:05:21] ottomata: yeah that is about right
[14:05:30] ottomata: it is also already on beta cluster :)
[14:08:27] oh great
[14:12:02] I want to land this patch of mine in my tools repo
[14:12:03] https://phabricator.wikimedia.org/D326
[14:12:09] How can I do it?
[14:12:23] It doesn't let me add myself as reviewer
[14:14:10] Oh, I also really don't mind if it doesn't show my IP https://phabricator.wikimedia.org/diffusion/pushlog/?repositories=R1957
[14:19:57] Amir1: at the bottom of https://phabricator.wikimedia.org/D326 you have to "Accept Revision"
[14:20:16] then at the top that will shows something like: Next step: arc land D326
[14:20:30] which you will have to run on your machine to get the commit merged locally and pushed over the http api
[14:20:33] or something like that
[14:20:58] Strangely I don't see that option
[14:21:15] maybe becuase I pushed it already
[14:37:49] Deployment-Systems, Labs, Tool-Labs: Add release engineering people to tools.jouncebot user group - https://phabricator.wikimedia.org/T144175#2591094 (chasemp) It's possibly early morning fugue state but I don't see: demon (I do see chad) twentyafterfour gjg I added (to tools if necessary as well):...
[14:47:16] Amir1 i have accepted it for you
[14:47:30] oh thanks Platonides
[14:47:35] hashar you carnt accept your own patches, you need to set an option in phabricator to do this
[14:47:38] paladox:
[14:47:44] Your welcome
[14:47:49] typo :D
[14:48:32] hashar Amir1 https://phabricator.wikimedia.org/T131622
[14:49:55] Deployment-Systems, Labs, Tool-Labs: Add release engineering people to tools.jouncebot user group - https://phabricator.wikimedia.org/T144175#2590777 (Paladox) @chasemp that would be ^demon
[14:52:49] Gerrit-Migration, Differential, Phabricator, Patch-For-Review: Enable differential.allow-self-accept in phabricator - https://phabricator.wikimedia.org/T131622#2591131 (Paladox) We should have some way of letting project authors merge there own patches without needing to have someone accept it fo...
[14:53:49] Amir1 you do arc land
[14:53:56] arc land master
[14:57:24] paladox: it's already pushed so basically I can't do anything
[14:57:58] Amir1 oh so you landed it, i guess you can now close https://phabricator.wikimedia.org/D326
[14:58:29] PROBLEM - Puppet run on deployment-sca02 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[14:58:48] done
[14:59:15] PROBLEM - Puppet run on deployment-sca01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[14:59:43] Thanks
[14:59:44] :)
[15:04:02] (PS1) Paladox: Merge branch 'upstream' [integration/zuul] - https://gerrit.wikimedia.org/r/307308
[15:08:34] (PS2) Paladox: Merge branch 'upstream' [integration/zuul] - https://gerrit.wikimedia.org/r/307308
[15:11:12] (PS1) Paladox: Merge upstream into master [integration/zuul] - https://gerrit.wikimedia.org/r/307311
[15:12:05] (Abandoned) Paladox: Merge upstream into master [integration/zuul] - https://gerrit.wikimedia.org/r/307311 (owner: Paladox)
[15:12:17] Deployment-Systems, Tool-Labs-tools-Other: Jouncebot not joining #wikimedia-operations - 
https://phabricator.wikimedia.org/T144189#2591190 (bd808)
[15:20:55] Deployment-Systems, Tool-Labs-tools-Other: Jouncebot not joining #wikimedia-operations - https://phabricator.wikimedia.org/T144189#2591190 (Paladox) Could it be someone quieted the ip of stashbot?
[15:22:40] hashar https://gerrit.wikimedia.org/r/307308 :)
[15:38:48] Yippee, build fixed!
[15:38:49] Project selenium-MobileFrontend » chrome,beta,Linux,contintLabsSlave && UbuntuTrusty build #137: FIXED in 16 min: https://integration.wikimedia.org/ci/job/selenium-MobileFrontend/BROWSER=chrome,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/137/
[15:46:14] Yippee, build fixed!
[15:46:15] Project selenium-MobileFrontend » firefox,beta,Linux,contintLabsSlave && UbuntuTrusty build #137: FIXED in 24 min: https://integration.wikimedia.org/ci/job/selenium-MobileFrontend/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/137/
[15:52:18] Deployment-Systems, Tool-Labs-tools-Other, Patch-For-Review: Jouncebot not joining #wikimedia-operations - https://phabricator.wikimedia.org/T144189#2591368 (bd808) I patched in some additional logging, but am not seeing any clear reason why things aren't working yet: ``` 2016-08-29 15:47:58,387 - IN...
[15:57:39] Deployment-Systems, Tool-Labs-tools-Other, Patch-For-Review: Jouncebot not joining #wikimedia-operations - https://phabricator.wikimedia.org/T144189#2591407 (bd808) ``` [09:56] jouncebot has userhost tools.joun@instance-tools-exec-1404.tools.wmflabs.org and real name “https://github.com/mattofak/joun...
[16:00:05] (PS36) Zfilipin: WIP Run language screenshots script for VisualEditor in Jenkins [integration/config] - https://gerrit.wikimedia.org/r/300035 (https://phabricator.wikimedia.org/T139613)
[16:14:15] Deployment-Systems, Tool-Labs-tools-Other, Patch-For-Review: Jouncebot not joining #wikimedia-operations - https://phabricator.wikimedia.org/T144189#2591190 (Platonides) Fixed by temporarily removing the inheritance from #wikimedia-bans Most probably, it was affected by the “ban everyone not register...
[16:25:10] Deployment-Systems, Tool-Labs-tools-Other, Patch-For-Review: Jouncebot not joining #wikimedia-operations - https://phabricator.wikimedia.org/T144189#2591699 (bd808) >>! In T144189#2591536, @Platonides wrote: > Fixed by temporarily removing the inheritance from #wikimedia-bans Most probably, it was af...
[16:46:05] greg-g: can I start piggy backing on the standing services deploy windows for striker deploys?
[16:59:49] bd808: yup
[17:05:15] Beta-Cluster-Infrastructure, Release-Engineering-Team, DBA, Patch-For-Review, WorkType-Maintenance: Upgrade mariadb in deployment-prep from Precise/MariaDB 5.5 to Jessie/MariaDB 5.10 - https://phabricator.wikimedia.org/T138778#2591932 (dduvall) I think we're ready to roll with this on the pup...
[17:06:44] Project beta-scap-eqiad build #117658: FAILURE in 1 min 57 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/117658/
[17:07:04] twentyafterfour hi, what do i put in phabricator_active_server
[17:07:06] please?
[17:07:58] greg-g: <# thx
[17:07:59] <3
[17:08:50] Im going to make phabricator_active_server optional on labs.
[17:10:26] RECOVERY - Host deployment-parsoid05 is UP: PING OK - Packet loss = 0%, RTA = 0.59 ms
[17:11:41] Beta-Cluster-Infrastructure, Release-Engineering-Team, DBA, Patch-For-Review, WorkType-Maintenance: Upgrade mariadb in deployment-prep from Precise/MariaDB 5.5 to Jessie/MariaDB 5.10 - https://phabricator.wikimedia.org/T138778#2591952 (jcrespo) @dduval, assuming everyhing is prepared (the mac...
[17:13:04] Actually maybe we can keep it as it is
[17:16:28] Project beta-scap-eqiad build #117659: STILL FAILING in 1 min 53 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/117659/
[17:16:37] Im thinking maybe i should make it or should not, but not sure.
[17:16:45] phabricator_active_server: 'iridium' thats what set in production
[17:16:50] doint know what to set it in labs.
[17:16:51] ?
[17:18:43] paladox: it's used to set up alerting
[17:18:54] so it has no use in labs
[17:19:02] Ok, i will disable it in labs
[17:19:23] thanks
[17:20:06] hrm, deployment-tmh01 is asking me for a password on login :\
[17:23:46] that's...weird. There appear to be no servers in /etc/ldap.yaml on that host
[17:24:07] twentyafterfour https://gerrit.wikimedia.org/r/307335 but im not sure if that is the correct syntax to use?
[17:26:35] Project beta-scap-eqiad build #117660: STILL FAILING in 1 min 53 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/117660/
[17:27:22] ^^ 17:26:32 17:26:32 sudo -u mwdeploy -n -- /usr/bin/rsync -l deployment-tin.eqiad.wmflabs::common/wikiversions*.{json,php} /srv/mediawiki on deployment-tmh01.deployment-prep.eqiad.wmflabs returned [255]: Permission denied (publickey,keyboard-interactive).
[17:27:28] PROBLEM - Host deployment-parsoid05 is DOWN: PING CRITICAL - Packet loss = 100%
[17:27:42] Beta-Cluster-Infrastructure, ContentTranslation-Deployments, MediaWiki-extensions-ContentTranslation: Beta: cxserver is not updated - https://phabricator.wikimedia.org/T144149#2592051 (mobrovac) Open>Invalid a:KartikMistry @KartikMistry, when we were switching CXServer to scap3 we agreed...
[17:28:07] yeah, beta-scap-eqiad failing due to /etc/ldap.yaml not having a server list on *tmh01*
[17:28:26] oh
[17:28:45] gonna see if a puppet run fixes it for whatever reason
[17:29:25] ok thanks
[17:30:52] sweet, should be fixed!
[17:31:16] ostriches hi, could you merge https://gerrit.wikimedia.org/r/#/c/307071/ please?
[17:31:52] Yippee, build fixed!
[17:31:52] Project beta-scap-eqiad build #117661: FIXED in 1 min 56 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/117661/
[17:32:02] thanks
[17:32:31] PROBLEM - Puppet run on deployment-tmh01 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[17:32:59] I can but I have zero clue how to deploy it or what to do after it merges :p
[17:53:54] oh
[17:55:10] ostriches https://www.mediawiki.org/wiki/Wikibugs
[17:55:27] https://www.mediawiki.org/wiki/Wikibugs#Deploying_changes
[17:55:54] I don't have fabric installed, nor do I think I have access to that bot.
[17:57:14] oh ok
[17:57:57] twentyafterfour hi, could you merge https://gerrit.wikimedia.org/r/#/c/307071/ and deploy it please?
[17:58:10] it says here https://tools.wmflabs.org/?tool=wikibugs that you have access to the bot
[18:02:07] * paladox does this http://www.neowin.net/news/apple-is-being-sued-over-iphone-66-plus-touch-disease means i have a free reapir even though my phone hasent broke yet (iphone 6 plus)
[18:12:28] RECOVERY - Puppet run on deployment-tmh01 is OK: OK: Less than 1.00% above the threshold [0.0]
[18:24:13] Deployment-Systems, scap: handle logstash timeouts separately from spikes in errors reported by logstash - https://phabricator.wikimedia.org/T144033#2592255 (greg) p:Unbreak!>High
[18:28:25] Release-Engineering-Team (Deployment-Blockers), Release: MW-1.28.0-wmf.17 deployment blockers - https://phabricator.wikimedia.org/T142117#2592271 (greg)
[18:29:04] Release-Engineering-Team (Deployment-Blockers), Release: MW-1.28.0-wmf.17 deployment blockers - https://phabricator.wikimedia.org/T142117#2523354 (greg)
[18:29:29] Release-Engineering-Team (Deployment-Blockers), Release: MW-1.28.0-wmf.17 deployment blockers - https://phabricator.wikimedia.org/T142117#2523354 (greg)
[18:44:55] ostriches twentyafterfour hi, did you see https://gerrit.wikimedia.org/r/#/c/306413/ ?, Thiemo Mättig (WMDE) managed to finally fix the expanding phabricator links in gerrit.
[18:44:56] :)
[18:47:59] I see everything. All seeing, all knowing :)
[18:48:21] Oh :)
[18:48:51] All tested and plain tasks and https links works, i see no breakage either.
[18:50:08] Yay my bt smart hub has now enabled ipv6 addresses for me, /me no longer shows my ipv4 address
[18:52:19] paladox: I haven't ever deployed changes to wikibugs before...
[18:52:29] Oh
[18:52:40] twentyafterfour https://www.mediawiki.org/wiki/Wikibugs#Deploying_changes
[18:52:50] and thanks for merging
[18:53:14] twentyafterfour could you also re review https://gerrit.wikimedia.org/r/#/c/307335/ please since i made some more changes?
[18:53:36] Im also wondering do i do it here https://github.com/wikimedia/operations-puppet/blob/2bfe0f97682cda19d647a2f6874f04a2c299e494/modules/phabricator/manifests/monitoring.pp#L10 too
[18:53:36] ?
[18:58:17] paladox: I don't think we want labs to be running the dump script
[18:58:27] Ok
[18:59:03] Ok i have changed it to not running in labs now :)
[19:01:00] it's the same as the else { } block so you can just leave it out entirely
[19:01:09] Oh but it fails
[19:01:18] twentyafterfour with
[19:01:19] Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find data item phabricator_active_server in any Hiera data file and no default supplied at /etc/puppet/modules/phabricator/manifests/init.pp:250 on node phab-03.phabricator.eqiad.wmflabs
[19:01:19] Warning: Not using cache on failed catalog
[19:01:19] Error: Could not retrieve catalog; skipping run
[19:01:26] if/else blocks are evil :)
[19:01:35] Instead should vary ensure => on hiera
[19:01:38] so if i do labs first then the rest it shoulden fail i think
[19:01:43] oh
[19:01:46] ostriches: this is true
[19:02:11] but we wanted a single quick switch to flip the active server between eqiad and codfw
[19:02:37] paladox: provide a hiera value in the labs hiera config
[19:02:48] a value for phab_active_server
[19:02:52] twentyafterfour oh, but what do i put in it
[19:02:55] twentyafterfour: The switch is fine, but if/else isn't the right way to use them imho :)
[19:02:56] phab-03
[19:03:05] Also: all hiera values should have some default
[19:03:14] phab-03.phabricator.eqiad.wmflabs
[19:03:18] "Could not find data item" is an unacceptable failure imho
[19:03:21] paladox: do what ostriches said: give it a default value
[19:03:41] Oh, do you mean phab_active_server ""
[19:04:45] $phabricator_active_server = hiera('phabricator_active_server', 'iridium')
[19:04:56] or something like that
[19:04:58] No no, that's even worse!
:)
[19:05:05] hiera() calls are evillllll
[19:05:14] I've been meaning to fix all of this crap
[19:05:22] ostriches: talk to dzahn
[19:05:35] ;)
[19:05:36] twentyafterfour now fails with
[19:05:36] Error: Could not retrieve catalog from remote server: Error 400 on SERVER: is not an integer, but is used as an index of an array at /etc/puppet/modules/role/manifests/phabricator/main.pp:48 on node phab-03.phabricator.eqiad.wmflabs
[19:05:36] Warning: Not using cache on failed catalog
[19:05:36] Error: Could not retrieve catalog; skipping run
[19:06:01] ostriches: I have no clue
[19:06:10] trusted_proxies => $cache_misc_nodes[$::site],
[19:07:23] paladox: you are going to run into a lot of errors. the production phabricator role is very different from labs and I think it will take a lot of work to get through the errors
[19:07:39] twentyafterfour yeh, im fixing these errors
[19:07:47] as part of https://phabricator.wikimedia.org/T144112
[19:07:54] to match how we did with gerrit
[19:11:16] Deployment-Systems, scap: handle logstash timeouts separately from spikes in errors reported by logstash - https://phabricator.wikimedia.org/T144033#2592498 (mmodell) Sorry I didn't mean to do that. Creating a sub-task should not clone to parent's priority ;)
[19:11:58] twentyafterfour should i trusted_proxies => $cache_misc_nodes[$::site], disable that on labs?
[19:14:00] paladox: I don't know, it's going to make a mess of the phabricator role if you add a bunch of conditional stuff in there
[19:14:11] I prefer separate labs role honestly
[19:14:16] Oh ok
[19:14:18] the whole thing needs a lot of refactoring
[19:14:24] in order to make it work for both
[19:14:45] * ostriches has a first patch
[19:14:49] Incoming!
[19:14:53] Oh :)
[19:14:59] ostriches: link plz
[19:15:47] https://gerrit.wikimedia.org/r/#/c/307354/
[19:16:02] :)
[19:16:19] * ostriches will run through compiler next
[19:16:23] Make sure it's a no-op
[19:16:39] How to clean up puppet manifests 101: small incremental no-op changes.
[19:17:11] ostriches i added Bug: T144112 to your commit msg
[19:17:22] Please don't edit my commits.
[19:17:26] Oh sorry
[19:19:42] ostriches gerrit now correctly tells wikibug who the actual author of the patch is https://phabricator.wikimedia.org/T144112#2592532
[19:19:45] :)
[19:25:00] shouldn't that really be in hiera too? ;)
[19:25:31] twentyafterfour: What should be?
[19:30:00] ostriches twentyafterfour https://gerrit.wikimedia.org/r/#/c/307357/ ?
[19:31:32] Um no, why would you do that?
[19:31:39] Roles belong in roles/, not in the module
[19:32:03] Because then it is under one roof, instead of mutiple
[19:32:10] That's not how it works
[19:32:13] Abandon plz
[19:32:14] Oh
[19:32:20] Ok
[19:33:07] ostriches are you working on the phabricator refactor? Ie changing it so it works better accross production and labs?
[19:33:54] labs is a side goal yes
[19:34:22] mainly I just want it so I can look at the manifests without stabbing myself
[19:34:33] I'm tired of stabbing myself
[19:34:36] Ok thanks
[19:34:39] It hurts
[19:34:43] Oh
[19:34:47] * bd808 sticks a cork on ostriches fork
[19:34:55] lol
[19:35:41] bd808: they say humor is the most mature defense mechanism :)
[19:36:21] Thats going to be alot of work
[19:37:14] https://youtu.be/eF8QAeQm3ZM?t=348
[19:37:18] puppet is easy
[19:37:33] Oh, i find php easy :)
[19:37:47] LOL
[19:38:02] bd808: bahahahaha
[19:38:28] safety first!
[19:38:34] * paladox gets reminders of updates in bash yay
[19:38:35] 50 packages can be updated.
[19:38:35] 5 updates are security updates.
[19:38:55] I don't even have to run apt-get update, Microsoft does it for me on a regular basis [19:40:00] :) [19:40:11] twentyafterfour: when do you plan to update phab? That's a disadvantage of the milestone system: we can't comment [19:40:59] twentyafterfour if we update, there was a change upstream with the logo and a new config was added for customisation of the logo and phabricator text [19:41:21] I.e. instead of PHABRICATOR it is now Phabricator, we could possibly change it to WM Phabricator [19:55:34] LUKE081515: can't comment? [19:56:13] I haven't updated in a while because there haven't been any changes that seemed helpful and there are some which will be disruptive (like the one paladox just mentioned) [19:56:15] twentyafterfour: I think you can't add a comment to a project? My comment in that case would be: are there plans for when we make that update [19:56:18] ? [19:56:44] I don't know, probably not this week, maybe next week [19:57:15] twentyafterfour could you pull from upstream [19:57:20] and merge into the repo please [19:57:24] why do you need to add a comment to a project?
[19:57:25] for testing in phab-01 [19:57:30] paladox: ok [19:58:03] thanks :) [19:58:42] twentyafterfour: before we had tasks for updating, there I could add a comment, now we have a subproject for that, so I can't [19:59:38] ahh [19:59:52] I understand now ;) [20:05:46] twentyafterfour :) [20:23:04] PROBLEM - Puppet run on deployment-ms-be01 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [20:25:17] twentyafterfour when applying the labs phabricator role I'm getting this error [20:25:18] Error: Execution of '/usr/bin/scap deploy-local --repo phabricator/deployment -D log_json:False' returned 70: [20:25:18] Error: /Stage[main]/Phabricator/Scap::Target[phabricator/deployment]/Package[phabricator/deployment]/ensure: change from absent to present failed: Execution of '/usr/bin/s [20:25:19] PROBLEM - Puppet run on deployment-memc04 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [20:25:59] PROBLEM - Puppet run on deployment-memc05 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [20:26:21] PROBLEM - Puppet run on deployment-mediawiki03 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [20:27:51] PROBLEM - Puppet run on deployment-zookeeper01 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [20:28:19] I guess fixing the production phabricator role should be a priority, and also making the scap deployment optional so that users can choose to deploy it through git instead of scap. [20:28:29] since it requires a floating ip [20:29:03] I'm wondering whether I should change the status of the task to high priority. [20:29:11] twentyafterfour ostriches ^^ [20:29:23] Not now.
[20:29:30] Ok [20:30:00] * paladox goes and watches TV now :) [20:32:00] or better yet not require an ip in scap [20:38:04] RECOVERY - Puppet run on deployment-ms-be01 is OK: OK: Less than 1.00% above the threshold [0.0] [20:40:18] RECOVERY - Puppet run on deployment-memc04 is OK: OK: Less than 1.00% above the threshold [0.0] [20:45:57] RECOVERY - Puppet run on deployment-memc05 is OK: OK: Less than 1.00% above the threshold [0.0] [20:54:28] Release-Engineering-Team (Deployment-Blockers), Patch-For-Review, Release: MW-1.28.0-wmf.16 deployment blockers - https://phabricator.wikimedia.org/T141551#2592784 (greg) [20:57:48] RECOVERY - Puppet run on deployment-zookeeper01 is OK: OK: Less than 1.00% above the threshold [0.0] [20:59:25] Release-Engineering-Team (Deployment-Blockers), Patch-For-Review, Release: MW-1.28.0-wmf.16 deployment blockers - https://phabricator.wikimedia.org/T141551#2592798 (greg) Open>Resolved [21:01:18] RECOVERY - Puppet run on deployment-mediawiki03 is OK: OK: Less than 1.00% above the threshold [0.0] [21:31:48] PROBLEM - Puppet run on deployment-aqs01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [21:48:41] twentyafterfour hi, have you managed to merge from upstream into the repo please? [21:49:13] paladox: working on it [21:49:29] twentyafterfour thanks :) :) [21:49:51] twentyafterfour guessing we could use the new phabricator logo config for setting our logo. [21:50:10] twentyafterfour also what about WM phabricator as the text next to the logo :) [21:50:18] ? [21:53:08] Deployment-Systems, Labs, Tool-Labs: Add release engineering people to tools.jouncebot user group - https://phabricator.wikimedia.org/T144175#2592987 (AlexMonk-WMF) Open>Resolved a:chasemp @chasemp, if you added through wikitech then it's possible you were trying to use the provided uids...
[22:12:02] scap: Tab completion doesn't work well for directories - https://phabricator.wikimedia.org/T144244#2593079 (Catrope) [22:21:27] paladox: pushed to wmf/dev branch [22:21:37] twentyafterfour thanks :) [22:22:18] twentyafterfour could you press the update button for https://phabricator.wikimedia.org/diffusion/PHAB/ please [22:22:22] https://phabricator.wikimedia.org/diffusion/PHAB/ [22:22:25] ? [22:25:46] * paladox deploys https://github.com/wikimedia/phabricator/commits/wmf/dev to phab-01 [22:25:47] :) [22:27:23] twentyafterfour could you also update arcanist and libphutil please? [22:27:26] repos [22:31:52] https://i.imgur.com/9bRQdPP.png [22:32:19] oh [22:32:23] yeh [22:32:32] LOL [22:32:35] greg-g ^^ [22:33:04] paladox: pushed arcanist and libphutil, both also using a branch named wmf/dev [22:33:12] twentyafterfour thanks :) [22:33:34] * twentyafterfour 🎔 unicode [22:33:40] twentyafterfour could you also press the update button in those repos in phabricator please [22:33:42] and oh [22:33:47] https://phabricator.wikimedia.org/diffusion/PHAB/manage/status/ [22:33:53] :) [22:34:25] thank you very much twentyafterfour :) [22:34:42] done [22:34:49] thanks [22:34:50] Continuous-Integration-Infrastructure: Frivolous Jenkins failures for Selenium due to DB error - https://phabricator.wikimedia.org/T144247#2593143 (Catrope) [22:35:00] paladox: what? [22:35:10] greg-g https://i.imgur.com/9bRQdPP.png [22:35:36] scap: Tab completion doesn't work well for directories - https://phabricator.wikimedia.org/T144244#2593159 (thcipriani) p:Triage>Normal a:thcipriani Blerg. My doing. I've also noticed a slight delay that I'd like to get rid of if possible.
[22:35:42] new customisation support [22:36:07] greg-g: phabricator finally made the logo and wordmark customizable in a proper way [22:36:13] ah, sweet [22:36:22] now the question is, do we customize the wordmark or keep 'phabricator' as before [22:36:43] I'd go with: the least amount of change for now [22:38:01] yes of course.. [22:38:05] I was just playing ;) [22:38:10] 🎔🎔🎔 [22:38:16] scap: Tab completion doesn't work well for directories - https://phabricator.wikimedia.org/T144244#2593177 (thcipriani) [22:38:18] scap: scap sync-(file|dir) breaks tab complete - https://phabricator.wikimedia.org/T142548#2593179 (thcipriani) [22:38:34] twentyafterfour ok deployed to phab-01 :) [22:38:34] twentyafterfour: <3 [22:38:59] We will need to update the config for the lgoo [22:39:04] lgoo = logo [22:39:50] twentyafterfour ^^, i.e. set it somewhere in puppet [22:45:10] twentyafterfour we no longer need a sprite image, we need to update our logo [22:45:15] https://phabricator.wikimedia.org/diffusion/PHAB/browse/wmf%252Fstable/webroot/rsrc/image/sprite-menu.png [22:45:31] would you be able to update it please? [22:46:17] Beta-Cluster-Infrastructure, ContentTranslation-Deployments, MediaWiki-extensions-ContentTranslation: Beta: cxserver is not updated - https://phabricator.wikimedia.org/T144149#2593209 (Krenair) >>! In T144149#2590374, @KartikMistry wrote: > @Krenair Beta can't load balance, so only using sca01 instan... [22:47:06] greg-g I wonder, could we have it as WMF phabricator?
[22:47:12] site logo text [22:56:16] Beta-Cluster-Infrastructure, MediaWiki-extensions-ORES, Revision-Scoring-As-A-Service, Patch-For-Review, User-Ladsgroup: Switch beta to use the proper wiki models for scoring (rather than "testwiki") - https://phabricator.wikimedia.org/T143567#2593238 (Halfak) Open>Resolved [22:59:38] twentyafterfour Will need replacing with something similar to https://phabricator.wikimedia.org/diffusion/PHAB/browse/wmf%252Fdev/webroot/rsrc/image/logo/light-eye.png but with the wmf logo [23:02:07] with the wmf logo here https://phabricator.wikimedia.org/diffusion/PHAB/browse/wmf%252Fstable/webroot/rsrc/image/sprite-menu.png but [23:02:17] don't need to have them all, just the one wmf logo [23:02:57] it's not WMF Phabricator. It's Wikimedia Phabricator. [23:03:25] Oh [23:03:27] ok [23:03:59] there's a difference :) [23:04:05] WMF != Wikimedia [23:04:39] Oh, I thought Wikimedia and Wikimedia Foundation are the same thing [23:06:18] see https://www.mediawiki.org/wiki/Differences_between_Wikipedia,_Wikimedia,_MediaWiki,_and_wiki#Wikimedia [23:07:05] Ok, thanks [23:09:17] "wikimedia phabricator" doesn't fit [23:09:20] in the space allowed [23:09:59] Oh, I guess we could use wm phabricator or wmf phabricator? [23:11:23] twentyafterfour ^^ [23:15:19] let's not decide this here [23:16:21] Ok [23:34:25] I'm just gonna keep it saying "Phabricator" [23:41:19] twentyafterfour how did you manage to get the wmf logo on [23:41:19] https://i.imgur.com/9bRQdPP.png [23:41:19] since reusing https://phabricator.wikimedia.org/diffusion/PHAB/browse/wmf%252Fstable/webroot/rsrc/image/sprite-menu.png doesn't work [23:41:20] I'm trying to manually create a .png file using the wmf logo but it isn't working [23:41:29] paladox: [23:41:39] https://i.imgur.com/tgC0Djk.png [23:42:46] twentyafterfour thanks :) [23:51:22] twentyafterfour could you review and land https://phabricator.wikimedia.org/D327 please?