[03:18:43] Project browsertests-MobileFrontend-en.m.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #817: FAILURE in 36 min: https://integration.wikimedia.org/ci/job/browsertests-MobileFrontend-en.m.wikipedia.beta.wmflabs.org-linux-firefox-sauce/817/
[05:23:59] Project browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_7-internet_explorer-11-sauce build #541: FAILURE in 21 min: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_7-internet_explorer-11-sauce/541/
[08:03:03] ;)
[08:09:13] laughing out loud
[08:25:11] :D
[08:28:36] Deployment-Systems, Documentation: update wikitech trebuchet instructions which still mention deployment::target - https://phabricator.wikimedia.org/T90571#1636406 (ArielGlenn) Open>Resolved a:ArielGlenn This is done; reference to the now dead 'grain' entry in the config have also been removed, a...
[08:31:21] Deployment-Systems: Trebuchet blockers (tracking) - https://phabricator.wikimedia.org/T45338#1636414 (ArielGlenn) Given that the deployment group has decided to move ahead with scap3 rather than fix up trebuchet, is this bug still valid?
[08:34:42] zeljkof-meeting: to read https://wikitech.wikimedia.org/wiki/Help:Self-hosted_puppetmaster
[08:38:31] Deployment-Systems: Remove "php-" from wiki version numbers - https://phabricator.wikimedia.org/T63733#1636427 (Legoktm) According to Tim in T45340#457954, this is just a historical leftover: > [00:24:40] the php- prefix predates the name "mediawiki", it annoys me that it was kept through hetde...
[08:42:41] !log rebased integration puppetmaster 61870d1..8cf247f
[08:42:44] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master
[08:48:01] zeljkof-meeting: git rebase --interactive origin/production
[08:48:01] :D
[08:51:04] Continuous-Integration-Infrastructure: Migrate all jobs to labs slaves - https://phabricator.wikimedia.org/T86659#1636466 (hashar)
[08:51:30] Continuous-Integration-Infrastructure: Migrate all jobs to labs slaves - https://phabricator.wikimedia.org/T86659#1636467 (hashar) p:Normal>Low Almost completed. There is still a few jobs on gallium.wikimedia.org though.
[09:00:40] hashar: did I freeze again?
[10:53:40] Deployment-Systems: scap / deployment of branches should get rid of old caches on tin /var/lib/l10nupdate/caches - https://phabricator.wikimedia.org/T112508#1636695 (hashar) NEW
[11:02:21] Deployment-Systems, Release-Engineering-Team: Trebuchet should repack / pack-refs git repos under /srv/deployment - https://phabricator.wikimedia.org/T112509#1636723 (hashar) NEW
[11:03:16] Gerrit-Migration, Wikidata: [Task] move git repositories that are dependencies of wikidata to gerrit - https://phabricator.wikimedia.org/T74907#1636729 (JeroenDeDauw) I agree, it makes sense to have all repos WMDE is owner of under the WMDE org. That's not everything Wikidata related though. For instance...
[11:19:24] Beta-Cluster, Labs, MediaWiki-General-or-Unknown, operations, Patch-For-Review: Create a poolcounter instance in deployment-prep - https://phabricator.wikimedia.org/T112501#1636779 (Krenair)
[11:19:47] Beta-Cluster, Labs, MediaWiki-General-or-Unknown, operations: Create a poolcounter instance in deployment-prep - https://phabricator.wikimedia.org/T112501#1636375 (Krenair)
[11:37:03] Deployment-Systems, Salt: [Trebuchet] fetch/checkout timeout should be configurable per repo - https://phabricator.wikimedia.org/T67601#1636831 (ArielGlenn)
[12:03:21] Beta-Cluster, operations, HHVM: Convert work machines (tin, terbium) to Trusty and hhvm usage - https://phabricator.wikimedia.org/T87036#1636917 (Krenair) Trusty replacement for tin = mira?
[12:40:35] (CR) Hashar: [C: 2] "I have created the Jenkins jobs and will deploy on Zuul. Note flake8 fails :(" [integration/config] - https://gerrit.wikimedia.org/r/237993 (owner: Ladsgroup)
[12:42:57] (Merged) jenkins-bot: Add pywikibot/wikibase [integration/config] - https://gerrit.wikimedia.org/r/237993 (owner: Ladsgroup)
[12:50:16] (CR) Zfilipin: "All green :) https://gerrit.wikimedia.org/r/#/c/201148/" [integration/config] - https://gerrit.wikimedia.org/r/237638 (https://phabricator.wikimedia.org/T1361) (owner: Zfilipin)
[12:55:41] Project beta-scap-eqiad build #69972: FAILURE in 1 min 40 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/69972/
[12:56:49] Yippee, build fixed!
[12:56:50] Project beta-scap-eqiad build #69973: FIXED in 1 min 6 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/69973/
[13:00:08] Yippee, build fixed!
[13:00:09] Project browsertests-MobileFrontend-en.m.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #784: FIXED in 28 min: https://integration.wikimedia.org/ci/job/browsertests-MobileFrontend-en.m.wikipedia.beta.wmflabs.org-linux-chrome-sauce/784/
[13:24:26] (PS1) Hashar: Migrate puppet-lint from Precise to Trusty [integration/config] - https://gerrit.wikimedia.org/r/238140
[13:24:49] zeljkof: from this morning, https://gerrit.wikimedia.org/r/238140 migrates puppet lint jobs to Trusty boxes
[13:36:53] (CR) Ladsgroup: "Thanks, I will fix flake8 issues ASAP" [integration/config] - https://gerrit.wikimedia.org/r/237993 (owner: Ladsgroup)
[13:52:30] (CR) Zfilipin: [C: 1] Migrate puppet-lint from Precise to Trusty [integration/config] - https://gerrit.wikimedia.org/r/238140 (owner: Hashar)
[13:53:14] hashar: thanks :)
[14:07:25] (CR) Hashar: [C: 2] "Updated jobs:" [integration/config] - https://gerrit.wikimedia.org/r/238140 (owner: Hashar)
[14:08:27] Continuous-Integration-Infrastructure, Labs, Labs-Infrastructure: mediawiki-core-phplint clone the whole repo from Zuul and times out - https://phabricator.wikimedia.org/T110512#1637305 (hashar) Open>Resolved I have manually cloned the mediawiki/core repo on that slave for that job. That has le...
[14:08:30] Continuous-Integration-Infrastructure, Labs, Labs-Infrastructure: integration-slave-trusty-1014 and integration-slave-trusty-1017 instances can't boot anymore, ended up corrupted. Need rebuild - https://phabricator.wikimedia.org/T110052#1637307 (hashar)
[14:09:49] (Merged) jenkins-bot: Migrate puppet-lint from Precise to Trusty [integration/config] - https://gerrit.wikimedia.org/r/238140 (owner: Hashar)
[14:11:27] Continuous-Integration-Config, Labs, Tool-Labs: Job labs-toollabs-debian-glue is failing for labs/toollabs repository - https://phabricator.wikimedia.org/T110939#1637318 (hashar) a:hashar Assigning to myself since I am probably the only one going/willing to fix this.
[14:26:26] Continuous-Integration-Config: Create Jenkins bot for mediawiki/extensions/SafeDelete - https://phabricator.wikimedia.org/T112370#1637366 (hashar) p:Triage>Normal You might want to add: * An extension.json file https://www.mediawiki.org/wiki/Manual:Extension_registration * The i18n files validator htt...
[14:28:35] Continuous-Integration-Config, Labs, Tool-Labs: Don't know what to put in setup.py/requirements.txt to satisfy both dpkg-buildpackage and tox-flake8 - https://phabricator.wikimedia.org/T110445#1637368 (hashar) Open>Resolved a:hashar Seems that is a work in progress on https://gerrit.wikimedia...
[14:47:49] (PS4) TasneemLo: Add IfElseStructureSniff to handle else structures [tools/codesniffer] - https://gerrit.wikimedia.org/r/237733 (https://phabricator.wikimedia.org/T101311)
[14:55:11] Beta-Cluster, Labs, MediaWiki-General-or-Unknown, operations: Create a poolcounter instance in deployment-prep - https://phabricator.wikimedia.org/T112501#1637443 (Andrew) I see this problem and can reproduce it on another instance. No idea as to the cause yet.
[14:56:11] hashar and/or other beta people: Is this a familiar issue with creating a new beta instance? https://phabricator.wikimedia.org/T112501
[14:56:50] andrewbogott: maybe there is some ferm rule active
[14:56:54] or the puppet master is dead
[14:57:09] I take it that’s a ‘no’ :)
[14:57:23] the issue only happens on new instances, not on existing ones.
[14:57:42] :-(
[14:57:52] some network routing issue maybe?
[14:58:03] yeah, seems like
[14:59:20] I asked moritz in -operations earlier
[14:59:39] Krenair: hmm, it's not caused by ferm rules on either deployment-puppetmaster nor deployment-poolcounter01 (they don't have any), maybe related to the openstack update? I'll have a look at the logs
[14:59:39] thanks
[14:59:39] I checked and other hosts were successfully connecting to that port
[14:59:39] <_joe_> moritzm, Krenair I'm on it
[14:59:40] <_joe_> it's clearly a higher-level problem in the cloud network
[14:59:41] <_joe_> my guess is andrewbogott might know something more about that
[15:02:31] ok, I think I may have a fix
[15:05:44] great. what was wrong?
[15:05:48] Beta-Cluster, Labs, MediaWiki-General-or-Unknown, operations: Create a poolcounter instance in deployment-prep - https://phabricator.wikimedia.org/T112501#1637501 (Andrew) This appears to be yet another issue with the nova rolling-upgrade process. The new instance, deployment-puppetmaster, was run...
[15:06:20] Beta-Cluster, Labs, MediaWiki-General-or-Unknown, operations: Create a poolcounter instance in deployment-prep - https://phabricator.wikimedia.org/T112501#1637502 (Andrew)
[15:06:31] Krenair: ^
[15:08:27] thanks
[15:09:58] Beta-Cluster, Labs, MediaWiki-General-or-Unknown, operations: Create a poolcounter instance in deployment-prep - https://phabricator.wikimedia.org/T112501#1637526 (Krenair) Signed and puppet successfully ran on deployment-poolcounter01.deployment-prep.eqiad.wmflabs
[15:10:59] Too bad I already wore my ‘I broke it and then I fixed it’ shirt this week and it’s in the laundry
[15:15:14] Anyone know why deployment-logstash2.eqiad.wmflabs has no signed puppet certificate? there's a request sitting open on the master...
[15:15:47] root@deployment-puppetmaster:~# find /var/lib/puppet/server/ssl/ca -name deployment-logstash*
[15:15:47] /var/lib/puppet/server/ssl/ca/requests/deployment-logstash2.eqiad.wmflabs.pem
[15:15:47] /var/lib/puppet/server/ssl/ca/signed/deployment-logstash2.deployment-prep.eqiad.wmflabs.pem
[15:18:23] I guess _808db might know
[15:22:51] I should probably file a ticket
[15:23:53] (PS3) Hashar: (WIP) Generic debian-glue (WIP) [integration/config] - https://gerrit.wikimedia.org/r/226911
[15:24:22] Beta-Cluster: deployment-logstash2 puppet certificate - https://phabricator.wikimedia.org/T112537#1637567 (Krenair) NEW
[15:25:08] Beta-Cluster: deployment-logstash2 puppet certificate - https://phabricator.wikimedia.org/T112537#1637575 (Krenair)
[15:28:11] andrewbogott, so how's the migration going in general?
[15:28:20] (PS1) Hashar: Tweak pywikibot/wikibase jobs [integration/config] - https://gerrit.wikimedia.org/r/238159
[15:28:21] most of the labs hosts now on kilo?
[15:28:47] Yep - I have three more virt hosts that I’m doing right now...
[15:29:01] That just leaves Horizon which will probably wait for another day (unless it breaks)
[15:29:13] Does anyone actually use Horizon yet?
[15:29:21] (PS2) Hashar: Tweak pywikibot/wikibase jobs [integration/config] - https://gerrit.wikimedia.org/r/238159
[15:29:29] We have horizon?
[15:29:44] (CR) Hashar: [C: 2] "We can make the py3 jobs voting when they start passing. Same for the 'doc' env which does not exist right now." [integration/config] - https://gerrit.wikimedia.org/r/238159 (owner: Hashar)
[15:29:51] JohnFLewis, https://horizon.wikimedia.org/
[15:30:14] Krenair: not really, which is why it will wait :)
[15:30:23] JohnFLewis: horizon.wikimedia.org
[15:30:31] JohnFLewis: use your shell name
[15:30:32] (CR) jenkins-bot: [V: -1] Tweak pywikibot/wikibase jobs [integration/config] - https://gerrit.wikimedia.org/r/238159 (owner: Hashar)
[15:30:43] Nice.
[15:31:16] (PS3) Hashar: Tweak pywikibot/wikibase jobs [integration/config] - https://gerrit.wikimedia.org/r/238159
[15:31:24] (CR) Hashar: [C: 2] Tweak pywikibot/wikibase jobs [integration/config] - https://gerrit.wikimedia.org/r/238159 (owner: Hashar)
[15:31:30] Deployment-Systems: Trebuchet blockers (tracking) - https://phabricator.wikimedia.org/T45338#1637594 (greg) Well. I guess I should have clarified this task to be able deploying MediaWiki with trebuchet, but sure.
[15:31:51] Deployment-Systems: Trebuchet blockers for MediaWiki (tracking) - https://phabricator.wikimedia.org/T45338#1637597 (greg) Open>stalled
[15:31:58] andrewbogott: deployment-prep looks nice, shown in the interface :)
[15:32:47] JohnFLewis: it really needs 2fa before I throw in too many features
[15:33:14] (Merged) jenkins-bot: Tweak pywikibot/wikibase jobs [integration/config] - https://gerrit.wikimedia.org/r/238159 (owner: Hashar)
[15:33:45] What other features are there? It seems feature-complete (in theory for a user at least) to me?
[15:34:53] JohnFLewis: dns is the main one I’m interested in.
[15:35:15] And it doesn’t do any kind of user/group management at the moment
[15:35:21] So it can’t really replace wikitech
[15:35:27] Ah the move away from LDAP DNS
[15:35:31] so it allows all project members to administer projects?
[15:35:48] No, it just only allows project admins to do anything at all
[15:36:02] So you can't actually use it for any project other than one you admin. great....
[15:36:14] Openstack doesn’t really have a concept of a non-admin member, that’s a thing we rolled ourselves.
[15:37:01] Krenair: I noticed that since it only shows 3 projects :)
[15:37:25] Beta-Cluster, Labs, MediaWiki-General-or-Unknown, operations: Create a poolcounter instance in deployment-prep - https://phabricator.wikimedia.org/T112501#1637622 (Andrew)
[15:37:37] Beta-Cluster, Labs, MediaWiki-General-or-Unknown, operations: Create a poolcounter instance in deployment-prep - https://phabricator.wikimedia.org/T112501#1637624 (Andrew) Open>Resolved
[15:38:32] Or even just viewing other people's projects
[15:38:51] I assume it has some sort of cloudadmin concept, andrewbogott?
[15:42:18] Krenair: sort of - a lot of their group management was redone in Kilo and I haven’t caught up yet
[15:45:12] Krenair: https://phabricator.wikimedia.org/T91830
[15:48:15] Browser-Tests, Patch-For-Review, Reading-Web: Failed Jenkins job sets Sauce Labs job to passed - https://phabricator.wikimedia.org/T105589#1637677 (Jdlrobson) This is still happening.... https://integration.wikimedia.org/ci/job/browsertests-MobileFrontend-en.m.wikipedia.beta.wmflabs.org-linux-chrome-sa...
[16:01:57] (PS4) Hashar: (WIP) Generic debian-glue (WIP) [integration/config] - https://gerrit.wikimedia.org/r/226911
[16:05:12] (PS1) Hashar: Migrate labs/toollabs to new debian glue job [integration/config] - https://gerrit.wikimedia.org/r/238171 (https://phabricator.wikimedia.org/T110939)
[16:05:52] (PS5) Hashar: Generic debian-glue [integration/config] - https://gerrit.wikimedia.org/r/226911
[16:05:54] (PS2) Hashar: Migrate labs/toollabs to new debian glue job [integration/config] - https://gerrit.wikimedia.org/r/238171 (https://phabricator.wikimedia.org/T110939)
[16:06:23] (CR) Hashar: [C: 2] "Job deployed" [integration/config] - https://gerrit.wikimedia.org/r/226911 (owner: Hashar)
[16:07:58] (CR) Hashar: [C: 2] Migrate labs/toollabs to new debian glue job [integration/config] - https://gerrit.wikimedia.org/r/238171 (https://phabricator.wikimedia.org/T110939) (owner: Hashar)
[16:08:31] (Merged) jenkins-bot: Generic debian-glue [integration/config] - https://gerrit.wikimedia.org/r/226911 (owner: Hashar)
[16:09:55] (Merged) jenkins-bot: Migrate labs/toollabs to new debian glue job [integration/config] - https://gerrit.wikimedia.org/r/238171 (https://phabricator.wikimedia.org/T110939) (owner: Hashar)
[16:13:58] Continuous-Integration-Config, Labs, Tool-Labs, Patch-For-Review: Job labs-toollabs-debian-glue is failing for labs/toollabs repository - https://phabricator.wikimedia.org/T110939#1637822 (hashar) Fixed as can be seen on https://gerrit.wikimedia.org/r/#/c/234934/ The new `debian-glue` jobs takes...
[16:14:01] (PS1) Jforrester: Add Thalia Chan to V+2'ers [integration/config] - https://gerrit.wikimedia.org/r/238176
[16:14:14] Continuous-Integration-Config, Labs, Tool-Labs, Patch-For-Review: Job labs-toollabs-debian-glue is failing for labs/toollabs repository - https://phabricator.wikimedia.org/T110939#1637823 (hashar) Open>Resolved
[16:18:09] (PS1) Ladsgroup: python3 tests passes in pywikibot/wikibase, mark them voting [integration/config] - https://gerrit.wikimedia.org/r/238178
[16:33:37] Beta-Cluster: Unable to connect to deployment-eventlogging02.eqiad.wmflabs - https://phabricator.wikimedia.org/T112540#1637900 (Krenair)
[16:34:07] (CR) Krinkle: [C: 2] Add Thalia Chan to V+2'ers [integration/config] - https://gerrit.wikimedia.org/r/238176 (owner: Jforrester)
[16:34:20] Beta-Cluster: Unable to connect to deployment-eventlogging02.eqiad.wmflabs - https://phabricator.wikimedia.org/T112540#1637657 (Krenair) > I'm able to connect to other WMF machines successfully Which?
[16:36:05] (PS1) JanZerebecki: Correct a few outdated comment regarding dependencies [integration/config] - https://gerrit.wikimedia.org/r/238180
[16:39:30] Beta-Cluster: Unable to connect to deployment-eventlogging02.eqiad.wmflabs - https://phabricator.wikimedia.org/T112540#1637926 (Krenair) ```root@deployment-eventlogging02:~# ldaplist -l passwd mholloway-shell | grep sshPublicKey sshPublicKey: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDOCdcpNkkSNQqxsgVQ7qifuIbry...
[16:40:40] (PS2) Krinkle: Add Thalia Chan to V+2'ers [integration/config] - https://gerrit.wikimedia.org/r/238176 (owner: Jforrester)
[16:40:47] (CR) Krinkle: [C: 2] Add Thalia Chan to V+2'ers [integration/config] - https://gerrit.wikimedia.org/r/238176 (owner: Jforrester)
[16:41:44] (Merged) jenkins-bot: Add Thalia Chan to V+2'ers [integration/config] - https://gerrit.wikimedia.org/r/238176 (owner: Jforrester)
[16:45:52] !log !log Reloading Zuul to deploy https://gerrit.wikimedia.org/r/238176
[16:45:56] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master
[16:47:00] Browser-Tests, QuickSurveys, Reading-Web-Sprint-56-4, Release-Engineering-Team: QA: Setup browser tests on beta cluster so we can share test articles - https://phabricator.wikimedia.org/T112204#1637984 (phuedx)
[16:48:51] Browser-Tests, QuickSurveys, Reading-Web-Sprint-56-4, Release-Engineering-Team: QA: Setup browser tests on beta cluster so we can share test articles - https://phabricator.wikimedia.org/T112204#1637992 (KLans_WMF)
[17:04:47] (PS1) JanZerebecki: Create composer job variants to test MW extensions [integration/config] - https://gerrit.wikimedia.org/r/238188
[17:06:06] (PS2) JanZerebecki: Create composer job variants to test MW extensions [integration/config] - https://gerrit.wikimedia.org/r/238188
[17:15:00] Browser-Tests, QuickSurveys, Reading-Web-Sprint-56-Four Lions, Release-Engineering-Team: QA: Setup browser tests on beta cluster so we can share test articles - https://phabricator.wikimedia.org/T112204#1638113 (KLans_WMF) p:Triage>Normal
[17:26:33] (PS3) JanZerebecki: Create composer job variants to test MW extensions [integration/config] - https://gerrit.wikimedia.org/r/238188
[17:30:39] Krinkle: I want to add a generic phpunit job and a generic qunit job with composer update to experimental to all extensions, wanna take a look before I deploy it?
[17:30:52] https://gerrit.wikimedia.org/r/#/c/238188/
[17:38:03] Continuous-Integration-Config: create php-composer-test-{phpflavor} - https://phabricator.wikimedia.org/T112551#1638231 (JanZerebecki) NEW
[17:39:25] Beta-Cluster: Unable to connect to deployment-eventlogging02.eqiad.wmflabs - https://phabricator.wikimedia.org/T112540#1638240 (Mholloway) >>! In T112540#1637900, @Krenair wrote: >> I'm able to connect to other WMF machines successfully > > Which? Specifically tin, scb1001, & scb1002.
[17:39:58] Beta-Cluster: Unable to connect to deployment-eventlogging02.eqiad.wmflabs - https://phabricator.wikimedia.org/T112540#1638242 (Krenair) >>! In T112540#1638240, @Mholloway wrote: >>>! In T112540#1637900, @Krenair wrote: >>> I'm able to connect to other WMF machines successfully >> >> Which? > > Specificall...
[17:41:56] Continuous-Integration-Config: create php-composer-test-{phpflavor} - https://phabricator.wikimedia.org/T112551#1638249 (Legoktm) Already exists in job-templates.yaml? ``` # Run composer update and composer test # Intended for libraries that are published as composer packages - job-template: name: 'php-c...
[17:43:22] Continuous-Integration-Config: create php-composer-test-{phpflavor} - https://phabricator.wikimedia.org/T112551#1638252 (JanZerebecki) Open>Invalid a:JanZerebecki Oops, yes, missed that.
[17:43:28] Beta-Cluster: Unable to connect to deployment-eventlogging02.eqiad.wmflabs - https://phabricator.wikimedia.org/T112540#1638255 (Mholloway) >>! In T112540#1637926, @Krenair wrote: > ```root@deployment-eventlogging02:~# ldaplist -l passwd mholloway-shell | grep sshPublicKey > sshPublicKey: ssh-rsa AAAAB3NzaC1...
[17:44:12] Continuous-Integration-Config: create php-composer-test-{phpflavor} - https://phabricator.wikimedia.org/T112551#1638258 (JanZerebecki)
[17:44:14] Beta-Cluster: Unable to connect to deployment-eventlogging02.eqiad.wmflabs - https://phabricator.wikimedia.org/T112540#1638260 (Krenair) Open>Resolved a:Krenair Great
[17:58:12] Scap3: Add documentation of the new scap3 features to the scap docs - https://phabricator.wikimedia.org/T112554#1638309 (mmodell) NEW
[17:58:36] Scap3: Add documentation of the new scap3 features to the scap docs - https://phabricator.wikimedia.org/T112554#1638321 (mmodell) a:mmodell
[17:59:59] thcipriani: {{ done }}
[18:00:53] mobrovac: nice, thanks!
[18:15:52] (CR) JanZerebecki: "Deployed to Jenkins: (['mediawiki-phpunit-hhvm-composer', 'mediawiki-phpunit-zend-composer', 'mwext-qunit-composer', 'mwext-testextension-" [integration/config] - https://gerrit.wikimedia.org/r/238188 (owner: JanZerebecki)
[18:18:10] (CR) JanZerebecki: "Reran some jobs:" [integration/config] - https://gerrit.wikimedia.org/r/238188 (owner: JanZerebecki)
[18:19:08] (CR) JanZerebecki: [C: 2] Create composer job variants to test MW extensions [integration/config] - https://gerrit.wikimedia.org/r/238188 (owner: JanZerebecki)
[18:21:05] (Merged) jenkins-bot: Create composer job variants to test MW extensions [integration/config] - https://gerrit.wikimedia.org/r/238188 (owner: JanZerebecki)
[18:22:40] !log reloading zuul for 72d41fb..bd97ce4
[18:22:44] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master
[18:23:23] Continuous-Integration-Scaling: Disposable VMs need a cache for package managers - https://phabricator.wikimedia.org/T112560#1638442 (hashar) NEW
[18:24:20] Continuous-Integration-Scaling: Evaluate angry-caching-proxy as a package managers cache - https://phabricator.wikimedia.org/T112561#1638453 (hashar) NEW
[18:25:27] !log deleted integration-zuul-server . Was to play test the zuul .deb package. Not needed anymore.
[18:25:30] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master
[18:25:38] (Abandoned) Thcipriani: Add config deployment [tools/scap] - https://gerrit.wikimedia.org/r/235385 (owner: Thcipriani)
[18:36:00] ostriches: I think you can do per host hiera on wikitech now as opposed to https://gerrit.wikimedia.org/r/#/c/235777/2
[18:36:07] may be easier to move to that thatn keeping this in ops/puppet
[18:36:15] but just an fyi
[18:37:44] andrewbogott: if you have the ops meeting today. Might want to talk about giving us the ability to force run puppet on hosts ( https://gerrit.wikimedia.org/r/#/c/234539/ ) :-}
[18:38:47] chasemp: Actually I think it should be std across all phab installs.
[18:38:52] Not needing any per-host mess
[18:40:44] hashar: oh, that one didn’t make it into the schedule :( The other sudo request did though!
[18:40:53] hashar: the ops meeting happens earlier now :)
[18:42:01] (PS1) Thcipriani: Add pattern-matching arg to limit deploy hosts [tools/scap] - https://gerrit.wikimedia.org/r/238208
[18:42:43] andrewbogott: ahhhh ;-}
[18:42:47] not a big deal
[18:44:21] Continuous-Integration-Infrastructure, MW-1.26-release, Patch-For-Review: Fetch dependencies using composer instead of cloning mediawiki/vendor for non-wmf branches - https://phabricator.wikimedia.org/T90303#1638559 (JanZerebecki) https://gerrit.wikimedia.org/r/#/c/238188/ created generic jobs with qun...
[18:44:55] marxarelli: I filled a task to find a way to cache npm/pip/gem etc package managers https://phabricator.wikimedia.org/T112560
[18:45:07] marxarelli: I will look at https://www.npmjs.com/package/angry-caching-proxy :D
[18:45:46] hashar: cool, i'll take a look in a bit
[18:47:02] I am creating an instance for that
[18:47:30] andrewbogott: can we get specialized flavors on labs? I think I could use a 1 CPU / 100 GB box :-}
[18:47:48] (PS1) Thcipriani: Add --environment flag to cli.Application [tools/scap] - https://gerrit.wikimedia.org/r/238211
[18:48:18] Continuous-Integration-Scaling: Evaluate angry-caching-proxy as a package managers cache - https://phabricator.wikimedia.org/T112561#1638604 (hashar) I have created `angry-caching-proxy` in the integration project for that.
[18:48:30] hashar: yeah, I can add project-specific flavors. Open a phab task for me?
[18:48:44] andrewbogott: is there any specific impact on labs infra ?
[18:49:06] my idea is to save CPU / memory
[18:49:10] it’s harmless, I’ve done it before for other projects.
[18:49:21] okok
[18:49:32] will do whenever I have an idea of how much disk space is actually needed :D
[18:49:38] It will probably save some resources. Disk space won’t matter, but memory/cpu might.
[18:49:52] (It doesn’t really save disk space since things are copy-on-write anyway)
[18:56:16] Scap3: Add documentation of the new scap3 features to the scap docs - https://phabricator.wikimedia.org/T112554#1638630 (mmodell) p:Triage>High
[18:56:18] Continuous-Integration-Infrastructure, MW-1.26-release: Fetch dependencies using composer instead of cloning mediawiki/vendor for non-wmf branches - https://phabricator.wikimedia.org/T90303#1638631 (JanZerebecki)
[19:00:29] (PS1) Thcipriani: Allow full path to hosts file [tools/scap] - https://gerrit.wikimedia.org/r/238213
[19:02:28] thcipriani: twentyafterfour ostriches should we do the beta cluster triage ?
[19:02:55] I can join if you're up for it
[19:02:58] zeljkof: that browser test is broken i never finished it up :)
[19:03:12] (PS1) JanZerebecki: Move the generic qunit-composer job to the qunit template [integration/config] - https://gerrit.wikimedia.org/r/238214
[19:03:23] hashar: I'm in the call now, we can make it a quick on hopefully :)
[19:03:30] jdlrobson: :) I was just wondering why the bot is running, not the jenkins job
[19:03:38] zeljkof: the bot?
[19:03:58] it's an old patch. the bot is no longer used.
[19:04:11] he's just still a reviewer
[19:04:56] jdlrobson: I see, ruby+rubocop is enabled for Gather, I am rechecking open commits that have ruby code, to see what will happen
[19:07:15] jdlrobson: if you have a few minutes, feel free to review open commits for Gather, all of them should be small and easy to review
[19:07:28] (CR) JanZerebecki: [C: 2] Move the generic qunit-composer job to the qunit template [integration/config] - https://gerrit.wikimedia.org/r/238214 (owner: JanZerebecki)
[19:07:45] ruby+rubocop is running as we speak, so there will be even more information about patches when the jobs finish
[19:07:55] ostriches: I think in prod it will be different with possibly ssh on 22 on all ports
[19:08:00] but the phab one only istening on the external ip
[19:08:03] it may be simpler even
[19:08:08] but I haven't worked it out fully yet
[19:08:15] (Merged) jenkins-bot: Move the generic qunit-composer job to the qunit template [integration/config] - https://gerrit.wikimedia.org/r/238214 (owner: JanZerebecki)
[19:08:38] Beta-Cluster: deployment-logstash2 puppet certificate - https://phabricator.wikimedia.org/T112537#1638653 (mmodell) p:Triage>High
[19:08:52] !log reloading zuul for bd97ce4..4abd32e
[19:08:55] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master
[19:13:18] Beta-Cluster: deployment-logstash2 puppet certificate - https://phabricator.wikimedia.org/T112537#1638661 (thcipriani) Open>Resolved a:thcipriani Seems like deployment-puppetmaster didn't autosign this certificate. Should be fixed now.
[19:14:59] chasemp: that's the system sshd on 2222. 22 will be git ssh
[19:15:18] I mean in teh case of prod where we have an ext ip only that is ssh it can be 22
[19:15:22] and 22 in the other case as well
[19:15:30] since only system level ssh will listen on the private ip
[19:15:40] long way of saying, the prod setup will be different
[19:15:45] as it's currently thought of
[19:15:56] Beta-Cluster: deployment-logstash2 puppet certificate - https://phabricator.wikimedia.org/T112537#1638672 (Krenair) >>! In T112537#1638661, @thcipriani wrote: > Seems like deployment-puppetmaster didn't autosign this certificate. Should be fixed now. It's not just that. There was already a signed certificat...
[19:16:01] I thought we couldn't do that because misc-lb?
[19:16:18] Since we want iridium to remain private
[19:17:08] well we could ssh via a bastion? to iridium ssh?
[19:17:13] well, thas horse has been beaten pretty well but I think we can
[19:17:40] Beta-Cluster: deployment-logstash2 puppet certificate - https://phabricator.wikimedia.org/T112537#1638679 (hashar) Resolved>Open So we used to have some lame cron job to brute sign certificates every minutes. That has been replaced by puppet signing itself instead: modules/puppet/manifests/self/c...
[19:20:57] Beta-Cluster, Continuous-Integration-Infrastructure, Patch-For-Review: deployment-logstash2 puppet certificate - https://phabricator.wikimedia.org/T112537#1638686 (hashar)
[19:21:01] Scap3, Documentation: Add documentation of the new scap3 features to the scap docs - https://phabricator.wikimedia.org/T112554#1638687 (Aklapper)
[19:22:11] the way brandon has outlined (as I understand is) is similar to how lvs works for the app servers
[19:22:29] essentially, the ext ip is on loopback and should be able to terminate requests from lvs
[19:22:43] the internal ip is the one dns has and is the ip that sshd (regular) is listening on
[19:22:47] in theory this all works
[19:24:03] (CR) Zfilipin: "Rubocop green, Ruby red, as expected. https://gerrit.wikimedia.org/r/#/c/225238/" [integration/config] - https://gerrit.wikimedia.org/r/237375 (https://phabricator.wikimedia.org/T1361) (owner: Zfilipin)
[19:25:42] Beta-Cluster, pywikibot-core: Link.langlinkUnsafe does not work on Beta-Cluster wikis - https://phabricator.wikimedia.org/T112006#1638715 (thcipriani) p:Triage>Normal
[19:28:50] Beta-Cluster: Output of wmf-beta-update-databases.py is not clear on errors - https://phabricator.wikimedia.org/T110407#1638732 (mmodell) p:Triage>Low
[19:29:07] Beta-Cluster, MediaWiki-JobRunner: Video transcode job runner on beta cluster runs 5 jobs even though configured for 2 - https://phabricator.wikimedia.org/T110916#1638736 (thcipriani) p:Triage>Normal
[19:29:19] Hey ostriches, who does dapatrick have to bug to get merge rights for new repo? Or to have jenkins merge on +2?
[19:30:26] Beta-Cluster, MediaWiki-JobRunner: Video transcode job runner on beta cluster runs 5 jobs even though configured for 2 - https://phabricator.wikimedia.org/T110916#1638744 (hashar) Seems the new instance is now deployment-tmh01.deployment-prep.eqiad.wmflabs /etc/jobrunner/jobrunner.conf has: ``` lang=j...
[19:32:50] is this wikimedia/security/* csteipp?
[19:33:10] automated-scanning [19:33:19] Yeah [19:33:24] He should already have +2 ther [19:33:26] there* [19:33:36] dapatrick: ^ [19:33:45] According to the ACLs he owns it... [19:33:51] He can definitely +2, but can't seem to merge [19:34:05] ah [19:34:08] 5Continuous-Integration-Scaling: Evaluate angry-caching-proxy as a package managers cache - https://phabricator.wikimedia.org/T112561#1638765 (10hashar) Example with npm: ``` $ http_proxy=http://10.68.19.184:8080 npm install --registry http://registry.npmjs.org/ jshint jshint@2.8.0 node_modules/jshint ├── stri... [19:34:12] Krenair: No merge happened, https://gerrit.wikimedia.org/r/#/c/238220/ [19:34:36] Do you not have a 'Publish and Submit' button in the review screen? [19:35:03] Nope, I don't see that. [19:35:09] huh [19:35:25] Hrm [19:35:27] Weird [19:35:43] It shows to me [19:36:55] (03CR) 1020after4: [C: 031] Add pattern-matching arg to limit deploy hosts [tools/scap] - 10https://gerrit.wikimedia.org/r/238208 (owner: 10Thcipriani) [19:37:13] Inherited submit from wikimedia/* probably [19:37:25] Oh, god dammit [19:37:27] It's ldap/wmf [19:37:28] Again [19:37:34] All-Projects doesn't set Submit for Project Owners, most hierarchies get this right. [19:37:36] Yeah [19:37:42] Lemme fix that for wikimedia/* [19:37:56] I hate ldap-based ACLs in gerrit [19:38:34] (03CR) 1020after4: [C: 031] Add commit-message-validator.py tool [integration/jenkins] - 10https://gerrit.wikimedia.org/r/237719 (https://phabricator.wikimedia.org/T109119) (owner: 10BryanDavis) [19:38:48] Krenair: fixed: https://git.wikimedia.org/commitdiff/wikimedia/0629d7edce54b1a2a05ce8c200780e9fc58bd75d [19:38:53] dapatrick: Try again, sir. [19:39:31] I'll do that next time this comes up [19:39:37] Now I have "Publish and Submit". [19:39:38] Thanks. [19:39:58] yw [19:40:10] thanks ostriches [19:40:14] Krenair: Have I mentioned how much our ACL structure is a complete clusterfsck? [19:40:30] ostriches, haha. 
you would be preaching to the choir :) [19:41:40] (03CR) 1020after4: [C: 032] Allow full path to hosts file [tools/scap] - 10https://gerrit.wikimedia.org/r/238213 (owner: 10Thcipriani) [19:54:24] 5Continuous-Integration-Scaling: Evaluate angry-caching-proxy as a package managers cache - https://phabricator.wikimedia.org/T112561#1638862 (10hashar) Same with pip, need the http version hence: ``` $ http_proxy=http://10.68.19.184:8080 pip install --index-url http://pypi.python.org/simple PyYAML Downloading/... [19:57:49] 5Continuous-Integration-Scaling: Evaluate angry-caching-proxy as a package managers cache - https://phabricator.wikimedia.org/T112561#1638873 (10hashar) The cache is pretty simple, under `/srv/angry-caching-proxy/cache` each package is associated with two files: * : the actual file * .json : angry... [19:58:31] 5Continuous-Integration-Scaling, 10Ops-Access-Requests, 6operations, 5Patch-For-Review: contint-admins can't start/stop nodepool (lack sudo) - https://phabricator.wikimedia.org/T111374#1638874 (10Dzahn) a:3Dzahn [19:59:59] 10Beta-Cluster, 10RESTBase, 10RESTBase-Cassandra: RESTBase broken in deployment-prep enwiki: Error in Cassandra table storage backend - https://phabricator.wikimedia.org/T112579#1638875 (10Krenair) 3NEW [20:03:33] 5Continuous-Integration-Scaling: Evaluate angry-caching-proxy as a package managers cache - https://phabricator.wikimedia.org/T112561#1638895 (10hashar) I haven't checked, but gem probably comes with an HTTPS index nowadays. So the soft works out of the box with no configuration needs and it is straightforward... [20:03:57] 5Continuous-Integration-Scaling, 10Ops-Access-Requests, 6operations, 5Patch-For-Review: contint-admins can't start/stop nodepool (lack sudo) - https://phabricator.wikimedia.org/T111374#1638900 (10Dzahn) In the ops meeting it has been said that this is approved in principle, but that we don't want to use sy... 
[20:13:59] 10Beta-Cluster, 10RESTBase, 10RESTBase-Cassandra: RESTBase broken in deployment-prep enwiki: Error in Cassandra table storage backend - https://phabricator.wikimedia.org/T112579#1638939 (10GWicke) 5Open>3Resolved a:3GWicke This was caused by an old cron job running nightly repairs on both cassandra hos... [20:31:30] 10Continuous-Integration-Infrastructure, 7Epic: Provide (pre-merge) code coverage reports on patchsets - https://phabricator.wikimedia.org/T101544#1639040 (10Krinkle) a:3Krinkle [20:33:31] 10Continuous-Integration-Config: JJB qunit macro ignores curl error exit code - https://phabricator.wikimedia.org/T99854#1639053 (10Krinkle) This cURL request is not part of any test. I added it last year for debugging information. It's a build artefact essentially. If there is an error, it will provide informa... [20:33:38] 10Continuous-Integration-Config: JJB qunit macro ignores curl error exit code - https://phabricator.wikimedia.org/T99854#1639055 (10Krinkle) p:5Low>3Lowest a:5Krinkle>3None [20:54:27] 5Continuous-Integration-Scaling, 10Ops-Access-Requests, 6operations, 5Patch-For-Review: contint-admins can't start/stop nodepool (lack sudo) - https://phabricator.wikimedia.org/T111374#1639162 (10hashar) Thanks @Dzahn, I followed up on @akosiaris comment and adjusted the sudo rule to use service instead of s... [21:05:03] (03PS2) 10Hashar: python3 tests passes in pywikibot/wikibase, mark them voting [integration/config] - 10https://gerrit.wikimedia.org/r/238178 (owner: 10Ladsgroup) [21:05:17] (03CR) 10Hashar: [C: 032] "Great, thanks Amir!" 
[integration/config] - 10https://gerrit.wikimedia.org/r/238178 (owner: 10Ladsgroup) [21:06:09] (03Merged) 10jenkins-bot: python3 tests passes in pywikibot/wikibase, mark them voting [integration/config] - 10https://gerrit.wikimedia.org/r/238178 (owner: 10Ladsgroup) [21:10:14] 5Continuous-Integration-Scaling, 10Ops-Access-Requests, 6operations, 5Patch-For-Review: contint-admins can't start/stop nodepool (lack sudo) - https://phabricator.wikimedia.org/T111374#1639211 (10Dzahn) @hashar looks all good. thanks. i will merge. i also saw on labnodepool1001 "nodepool" is now recognized... [21:19:05] 5Continuous-Integration-Scaling, 10Ops-Access-Requests, 6operations, 5Patch-For-Review: contint-admins can't start/stop nodepool (lack sudo) - https://phabricator.wikimedia.org/T111374#1639253 (10Dzahn) 14:17 < mutante> !log labnodepool1001 - re-enable puppet and nodepool 14:19 < mutante> hashar: %contint... [21:19:14] 5Continuous-Integration-Scaling, 10Ops-Access-Requests, 6operations, 5Patch-For-Review: contint-admins can't start/stop nodepool (lack sudo) - https://phabricator.wikimedia.org/T111374#1639254 (10Dzahn) 5Open>3Resolved [21:20:52] 5Continuous-Integration-Scaling, 10Ops-Access-Requests, 6operations: contint-admins can't start/stop nodepool (lack sudo) - https://phabricator.wikimedia.org/T111374#1602573 (10Dzahn) [21:23:24] 5Continuous-Integration-Scaling, 10Ops-Access-Requests, 6operations: contint-admins can't start/stop nodepool (lack sudo) - https://phabricator.wikimedia.org/T111374#1639264 (10hashar) Thanks. Wrote some lame notes on the wiki https://wikitech.wikimedia.org/w/index.php?title=Nodepool&diff=177626&oldid=177201 [21:58:20] 10Continuous-Integration-Infrastructure, 10Ops-Access-Requests, 6operations, 5Patch-For-Review: Let contint-admins force run puppet with /usr/local/sbin/puppet-run - https://phabricator.wikimedia.org/T110943#1639409 (10Dzahn) @Robh did we have an outcome today? 
[21:59:53] 10Continuous-Integration-Infrastructure, 10Ops-Access-Requests, 6operations, 5Patch-For-Review: Let contint-admins force run puppet with /usr/local/sbin/puppet-run - https://phabricator.wikimedia.org/T110943#1639413 (10RobH) Someone said it was approved in the meeting notes, but since I wasn't on clinic du... [22:01:20] 10Continuous-Integration-Infrastructure, 10Ops-Access-Requests, 6operations, 5Patch-For-Review: Let contint-admins force run puppet with /usr/local/sbin/puppet-run - https://phabricator.wikimedia.org/T110943#1639416 (10Dzahn) Ok, thanks, i'll take it then since i'm on duty and just did the other contint-ad... [22:03:51] 10Continuous-Integration-Infrastructure, 10Ops-Access-Requests, 6operations, 5Patch-For-Review: Let contint-admins force run puppet with /usr/local/sbin/puppet-run - https://phabricator.wikimedia.org/T110943#1639418 (10RobH) I was incorrect. That was a different task (Checking notes on https://office.wik... [22:04:23] 10Continuous-Integration-Infrastructure, 10Ops-Access-Requests, 6operations, 5Patch-For-Review: Let contint-admins force run puppet with /usr/local/sbin/puppet-run - https://phabricator.wikimedia.org/T110943#1639419 (10RobH) a:5RobH>3None [22:09:03] 10Continuous-Integration-Infrastructure, 10Ops-Access-Requests, 6operations, 5Patch-For-Review: Let contint-admins force run puppet with /usr/local/sbin/puppet-run - https://phabricator.wikimedia.org/T110943#1639428 (10Dzahn) a:3Dzahn [22:23:30] 10Continuous-Integration-Infrastructure, 10Ops-Access-Requests, 6operations, 5Patch-For-Review: Let contint-admins force run puppet with /usr/local/sbin/puppet-run - https://phabricator.wikimedia.org/T110943#1639470 (10Dzahn) 5Open>3stalled [22:24:57] anyone mind fixing https://phabricator.wikimedia.org/T112598 for me? just needs a quick edit on wikitech :) [22:26:34] should it really be "elasticsearch::cluster_hosts::eqiad": ? 
[22:26:43] ebernhardson: https://wikitech.wikimedia.org/w/index.php?title=Hiera%3ADeployment-prep&action=historysubmit&type=revision&diff=177630&oldid=176471 [22:33:51] chasemp: i think so? that's the key used in hiera, and it has to be quoted to have : with yaml [22:34:37] legoktm: thanks! [22:34:42] ebernhardson: I think it's messed up but may work out due to other things being messed up, I mean that in a productive way really. I have to hack some of this up to make it modular enough for multisite so if it works no worries [22:34:47] I'll be breaking it I'm sure :) [23:08:58] Project browsertests-Gather-en.m.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #261: FAILURE in 11 min: https://integration.wikimedia.org/ci/job/browsertests-Gather-en.m.wikipedia.beta.wmflabs.org-linux-chrome-sauce/261/