[00:08:38] legoktm, who should I poke about wikibugs still being AWOL? (it's only in 2 channels, afaict)
[00:09:10] quiddity: it shouldn't be running at all. redis is down, so everything is b0rked
[00:09:20] ah! k
[00:10:33] (sorry, I see scrollback now)
[00:15:18] Yippee, build fixed!
[00:15:18] Project beta-scap-eqiad build #44797: FIXED in 42 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/44797/
[03:56:08] Project browsertests-UploadWizard-commons.wikimedia.beta.wmflabs.org-linux-chrome-sauce build #527: FAILURE in 22 min: https://integration.wikimedia.org/ci/job/browsertests-UploadWizard-commons.wikimedia.beta.wmflabs.org-linux-chrome-sauce/527/
[04:47:00] Yippee, build fixed!
[04:47:00] Project browsertests-Flow-en.wikipedia.beta.wmflabs.org-linux-firefox-monobook-sauce build #360: FIXED in 48 min: https://integration.wikimedia.org/ci/job/browsertests-Flow-en.wikipedia.beta.wmflabs.org-linux-firefox-monobook-sauce/360/
[05:38:04] Yippee, build fixed!
[05:38:05] Project browsertests-VisualEditor-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #570: FIXED in 22 min: https://integration.wikimedia.org/ci/job/browsertests-VisualEditor-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce/570/
[06:14:34] Project beta-scap-eqiad build #44833: FAILURE in 41 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/44833/
[06:26:02] Yippee, build fixed!
[06:26:02] Project browsertests-Gather-en.m.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #21: FIXED in 5 min 20 sec: https://integration.wikimedia.org/ci/job/browsertests-Gather-en.m.wikipedia.beta.wmflabs.org-linux-chrome-sauce/21/
[06:34:37] Yippee, build fixed!
[06:34:37] Project beta-scap-eqiad build #44835: FIXED in 40 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/44835/
[06:35:14] RECOVERY - Free space - all mounts on deployment-bastion is OK: OK: All targets OK
[06:46:42] Yippee, build fixed!
[06:46:43] Project browsertests-UniversalLanguageSelector-commons.wikimedia.beta.wmflabs.org-linux-firefox-sauce build #509: FIXED in 15 min: https://integration.wikimedia.org/ci/job/browsertests-UniversalLanguageSelector-commons.wikimedia.beta.wmflabs.org-linux-firefox-sauce/509/
[06:56:41] PROBLEM - Puppet failure on deployment-bastion is CRITICAL: CRITICAL: 75.00% of data above the critical threshold [0.0]
[07:37:56] hello
[08:34:07] Beta-Cluster: Parser cache (memcached?) broken in Beta Cluster - https://phabricator.wikimedia.org/T91310#1108690 (yuvipanda) I haven't touched memcached.
[08:35:08] Beta-Cluster, operations, Puppet: Use keyholder for deploy key management - https://phabricator.wikimedia.org/T92367#1108691 (yuvipanda) NEW a:yuvipanda
[08:50:02] !gerrit
[08:50:06] stupid bot
[08:59:07] (CR) Hashar: [C: 1] "Hey Timo, I have missed reviewing this change apparently. Seems to be a good idea to split the templates a bit indeed. Would you mind red" [integration/config] - https://gerrit.wikimedia.org/r/177180 (owner: Krinkle)
[09:01:41] !log made Zuul clone on labs to use the master branch instead of the labs one. There is no point in keeping separate ones anymore
[09:01:46] !log https://gerrit.wikimedia.org/r/#/c/195287/
[09:01:48] Logged the message, Master
[09:01:52] Logged the message, Master
[09:07:13] !log Deleted refs/heads/labs branch in integration/zuul.git
[09:07:16] Logged the message, Master
[09:07:39] PROBLEM - Puppet failure on deployment-bastion is CRITICAL: CRITICAL: 25.00% of data above the critical threshold [0.0]
[09:08:06] Continuous-Integration, Patch-For-Review: Delete 'labs' branch from integration/zuul.git - https://phabricator.wikimedia.org/T91984#1108739 (hashar) Open>Resolved I have deleted refs/heads/labs branch in integration/zuul.git . Will probably want to prune the stalled branch in all git working copies...
[09:12:39] RECOVERY - Puppet failure on deployment-bastion is OK: OK: Less than 1.00% above the threshold [0.0]
[09:23:44] hashar: what can be reason for: https://integration.wikimedia.org/ci/job/cxserver-deploy-npm/53/console
[09:24:00] we've package.json up-to-date.
[09:24:04] (CR) Hashar: [C: -1] "Upstream already provide a Sniff that checks function arguments:" [tools/codesniffer] - https://gerrit.wikimedia.org/r/193766 (owner: Legoktm)
[09:25:03] kart_: something screwed up ? :D Can you please fill a Task and copy paste in it the console output from https://integration.wikimedia.org/ci/job/cxserver-deploy-npm/53/consoleText ?
[09:25:55] the job has been slightly changed as well https://integration.wikimedia.org/ci/job/cxserver-deploy-npm/jobConfigHistory/showDiffFiles?timestamp1=2015-02-09_13-08-57&timestamp2=2015-03-04_16-08-39
[09:26:05] might be a local hack I forgot to commit :(
[09:26:09] <-- blame
[09:34:18] hashar: :)
[09:37:58] kart_: https://gerrit.wikimedia.org/r/#/c/189473/ hasn't been merged. Gotta review it again
[09:38:06] kart_: can you fill the task please ?: )
[09:43:11] hashar: what should be title etc?
[09:46:52] Continuous-Integration: Fix npm oid jobs - https://phabricator.wikimedia.org/T92369#1108817 (KartikMistry) NEW a:hashar
[09:47:07] (PS3) KartikMistry: (WIP) Hack for npm oid jobs (WIP) [integration/config] - https://gerrit.wikimedia.org/r/189473 (https://phabricator.wikimedia.org/T92369) (owner: Hashar)
[09:47:36] hashar: done
[09:48:13] (PS4) KartikMistry: WIP: Hack for npm oid jobs [integration/config] - https://gerrit.wikimedia.org/r/189473 (https://phabricator.wikimedia.org/T92369) (owner: Hashar)
[10:04:17] Beta-Cluster, ContentTranslation-Deployments, MediaWiki-extensions-ContentTranslation, ContentTranslation-Release4, LE-Sprint-84: Setup new wikis in Beta Cluster for Content Translation - https://phabricator.wikimedia.org/T90683#1108850 (Arrbee) p:Normal>High
[10:24:47] Project beta-scap-eqiad build #44858: FAILURE in 43 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/44858/
[10:34:42] Yippee, build fixed!
[10:34:43] Project beta-scap-eqiad build #44859: FIXED in 39 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/44859/
[10:43:34] Hey, can I have access to beta.wmflabs? The user research people want to use en.wikipedia.beta.wmflabs.org for doing research on updated citation/reference functionality, and I was going to import some pages for them to test on.
[10:45:40] * werdna looks hopefully at hashar
[10:45:46] slash Reedy
[11:00:01] kart_: will look at it later in the afternoon. Have to investigate the side effects on the parsoid jobs :/
[11:00:09] werdna: fill a task please :]
[11:01:07] werdna: but for such testing you can probably just create a user account on http://en.wikipedia.beta.wmflabs.org/ can't you?
[11:02:51] !log created integration-zuul-packaged.eqiad.wmflabs to test out the Zuul debian package
[11:02:55] Logged the message, Master
[11:18:39] hashar: I read fill a glass please ;)
[11:27:57] hashar: I can't use Special:Import that way
[11:28:49] Beta-Cluster: SSH Access to Beta for Werdna - https://phabricator.wikimedia.org/T92371#1108922 (werdna) NEW
[11:28:57] hashar: https://phabricator.wikimedia.org/T92371 :)
[11:52:44] werdna, don't you just want to be added here? http://en.wikipedia.beta.wmflabs.org/w/index.php?title=Special:ListUsers&group=import
[11:53:08] I'm sure you're perfectly entitled to shell access like everyone else, but I'm not sure you technically need it for this :p
[11:53:18] Krenair: could do that, but it *is* a 12 MB dump of 700 pages, so I thought would be better to import on cmd lin
[11:53:19] e
[11:53:27] ok :)
[11:57:06] werdna, try sshing to deployment-bastion.eqiad.wmflabs
[11:57:42] werdna@deployment-bastion:~$
[11:58:33] Beta-Cluster: SSH Access to Beta for Werdna - https://phabricator.wikimedia.org/T92371#1108951 (Krenair) Open>Resolved a:Krenair
[11:58:42] <3 thx Krenair
[11:59:24] various docs at https://wikitech.wikimedia.org/wiki/Nova_Resource:Deployment-prep
[12:01:01] https://wikitech.wikimedia.org/wiki/Nova_Resource:Deployment-prep/Import_to_betawiki
[12:01:02] <3
[12:16:49] (PS5) KartikMistry: WIP: Hack for npm oid jobs [integration/config] - https://gerrit.wikimedia.org/r/189473 (https://phabricator.wikimedia.org/T92369) (owner: Hashar)
[12:18:32] werdna: it took few days to figure out importing process right :)
[12:24:22] Continuous-Integration, Wikidata, § Wikidata-Sprint-2015-02-03, § Wikidata-Sprint-2015-02-25: fix the qunit tests for wikidata: mwext-Wikibase-qunit - https://phabricator.wikimedia.org/T74184#1109009 (Tobi_WMDE_SW) Needs release of ValueView and $wgResourceLoaderMaxQueryLength set.
[12:24:41] Continuous-Integration, Wikidata, § Wikidata-Sprint-2015-02-03, § Wikidata-Sprint-2015-02-25, § Wikidata-Sprint-2015-03-11: fix the qunit tests for wikidata: mwext-Wikibase-qunit - https://phabricator.wikimedia.org/T74184#1109011 (Tobi_WMDE_SW)
[12:35:20] Project beta-scap-eqiad build #44871: FAILURE in 37 Sekunden: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/44871/
[12:36:05] Continuous-Integration, MediaWiki-ResourceLoader, MediaWiki-Vagrant, Wikidata, and 3 others: qunit test broken without explicitly setting $wgResourceLoaderMaxQueryLength - https://phabricator.wikimedia.org/T90453#1109063 (Tobi_WMDE_SW)
[12:37:14] Continuous-Integration, Technical-Debt: Delete old jobs not (or no longer) managed by JJB - https://phabricator.wikimedia.org/T91410#1109080 (hashar) Please do not blindly delete them, some of those jobs are triggered in Zuul such as the analytics* ones. The mwext-*-composer-{hhvm,zend} jobs have been c...
[12:37:34] PROBLEM - SSH on deployment-lucid-salt is CRITICAL: Connection refused
[12:43:59] Continuous-Integration, Technical-Debt: Delete old jobs not (or no longer) managed by JJB - https://phabricator.wikimedia.org/T91410#1109101 (zeljkofilipin) >>! In T91410#1109080, @hashar wrote: > The browsertests* we might have forget to delete them or they might have been deployed from a patch pending r...
[12:54:35] Yippee, build fixed!
[12:54:36] Project beta-scap-eqiad build #44873: FIXED in 40 Sekunden: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/44873/
[13:01:17] Staging, operations: mariadb puppet module doesn't start mysql service in labs (possibly anywhere) - https://phabricator.wikimedia.org/T91797#1109229 (Springle) >>! In T91797#1096926, @thcipriani wrote: > Maybe it would be a better to say that I expected that adding `role::mariadb` to a fresh server would...
[13:06:09] hashar: video froze, are you copying a big file?
[13:06:23] yeah
[13:16:29] Staging, operations: mariadb puppet module doesn't start mysql service in labs (possibly anywhere) - https://phabricator.wikimedia.org/T91797#1109268 (yuvipanda) >>! In T91797#1109229, @Springle wrote: > While I sympathize with the request, if this happens it needs to default to off or be put into a separ...
[13:19:24] twentyafterfour: ping when around, I’ll merge the fatalmonitor text
[13:19:27] *change
[13:27:59] Staging, operations: mariadb puppet module doesn't start mysql service in labs (possibly anywhere) - https://phabricator.wikimedia.org/T91797#1109283 (Springle) # Provision box, sign puppet, first run, etc # xtrabackup clone & prepare data from another server # Start MariaDB service, wait for replic...
[13:51:15] (Abandoned) Krinkle: WIP: Refactor zuul layout templates [integration/config] - https://gerrit.wikimedia.org/r/177180 (owner: Krinkle)
[13:53:31] Continuous-Integration, Technical-Debt: Delete old jobs not (or no longer) managed by JJB - https://phabricator.wikimedia.org/T91410#1109338 (Krinkle) >>! In T91410#1109080, @hashar wrote: > Please do not blindly delete them, some of those jobs are triggered in Zuul such as the analytics* ones. Why are...
[14:05:40] !log Jenkins web dashboard is in German
[14:05:46] Logged the message, Master
[14:07:03] Continuous-Integration, Technical-Debt: Delete old jobs not (or no longer) managed by JJB - https://phabricator.wikimedia.org/T91410#1109401 (hashar) >>! In T91410#1109338, @Krinkle wrote: >>>! In T91410#1109080, @hashar wrote: >> Please do not blindly delete them, some of those jobs are triggered in Zuul...
[14:08:58] (PS7) Hashar: Package python deps with dh-virtualenv [integration/zuul] (debian/precise-wikimedia) - https://gerrit.wikimedia.org/r/195272 (https://phabricator.wikimedia.org/T48552)
[14:09:15] (CR) Hashar: "check experimental" [integration/zuul] (debian/precise-wikimedia) - https://gerrit.wikimedia.org/r/195272 (https://phabricator.wikimedia.org/T48552) (owner: Hashar)
[14:17:11] Continuous-Integration: Generic phplint job is extremely slow for mediawiki/core - https://phabricator.wikimedia.org/T92042#1109429 (Krinkle) This one ran for master: https://integration.wikimedia.org/ci/job/phplint/828/ Only took 1 min 13 sec. >>! In T92042#1103698, @hashar wrote: > https://integration.wi...
[14:18:07] hashar: Any idea what might cause https://phabricator.wikimedia.org/T92351 ?
[14:18:29] This is blocking any and all jobs to run on oojs/core. Not much activity (low-level lib), but been blocking for a few days now
[14:18:45] It seems we can't establish HTTPS connection to saucelabs on trusty instances?
[14:19:57] YuviPanda: ok I'm here
[14:20:06] twentyafterfour: alright lets do it
[14:20:50] twentyafterfour: whewre does this go? tin?
[14:20:52] or fluorine?
[14:21:56] flourine
[14:22:57] twentyafterfour: done. try running?
[14:23:15] looks good
[14:23:24] twentyafterfour: \o/ cool
[14:23:55] thanks Yuvi! and thanks bd808 for improving my improvement
[14:37:32] twentyafterfour: do you have some time to look at some staging instances today?
[14:37:57] thcipriani: just a little, it's deployment day again ;)
[14:38:57] ah, right, I was digging though what still needs to be done. The rdb boxes look like they _should_ be somewhat straight forward.
[14:39:02] Release-Engineering, Browser-Tests: Investigate updating browser versions in Jenkins builds. - https://phabricator.wikimedia.org/T92005#1109500 (Cmcmahon)
[14:39:27] zeljkof: I added you to the Phab thing for se/browser versions https://phabricator.wikimedia.org/T92005
[14:39:39] chrismcmahon: great, thanks
[14:40:40] zeljkof: even though we specified Chrome 40, I don't believe we ever got it. Pretty sure Sauce gave us 35 each time.
[14:40:58] chrismcmahon: you can check at sauce labs job
[14:41:06] there is browser version in every job details
[14:42:58] yep. I haven't looked, but everything with an overlay started failing 4 March and I am 99% sure it's a se/Chrome version problem
[14:44:56] thcipriani: I'll take a look after deployment this afternoon. right now I'm working on further automation of deployment process
[14:45:38] (yay)
[14:45:56] also, it's weird not have overnight scrollback
[14:45:56] twentyafterfour: rock on.
[14:46:17] I've got the script for porting patches forward ...almost complete
[14:46:18] greg-g: especially after what everyone said about you :)
[14:46:28] thcipriani: I know right?
[14:46:54] WHAT DID EVERYONE SAY ABOUT GREG-G
[14:46:55] I MUST KNOW
[14:49:08] I deliberately shut down my work computer every evening in order to not actually actually do this 24/7. I get the occasional surprise, but it's worth it.
[14:49:18] YuviPanda: cheese it, he's back in the channel...
[14:49:37] aaaaaaaah. *that*. ok
[14:51:23] Staging: Create staging cluster (tracking) - https://phabricator.wikimedia.org/T88702#1109512 (yuvipanda) What to do when the goals conflict? I think ideally we should strive to make it automatic in *both* prod and staging :D but in cases when that isn’t desired (having puppet thrash around dbs does sound...
[14:52:06] Staging: Create staging cluster (tracking) - https://phabricator.wikimedia.org/T88702#1109514 (yuvipanda) T85279 for LDAP vs hiera. I definitely don't think we should use hiera classes for everything.
[14:58:14] Staging, Patch-For-Review: Create staging-db* (databases) - https://phabricator.wikimedia.org/T91545#1109522 (thcipriani) >>! In T91797#1109283, @Springle wrote: > # Provision box, sign puppet, first run, etc > # xtrabackup clone & prepare data from another server > # Start MariaDB service, wait for...
[15:01:55] Beta-Cluster: Parser cache (memcached?) broken in Beta Cluster - https://phabricator.wikimedia.org/T91310#1109555 (greg) p:Triage>High
[15:01:56] Staging, Patch-For-Review: Create staging-db* (databases) - https://phabricator.wikimedia.org/T91545#1109551 (yuvipanda) (Thoughts on how to handle manual steps for staging cluster at T88702#1109512)
[15:02:21] Beta-Cluster, Release-Engineering: Process accounting + deployments routinely fill up /var on deployment-bastion - https://phabricator.wikimedia.org/T91354#1109561 (greg) p:Triage>Normal
[15:06:19] Continuous-Integration, Technical-Debt: Delete old jobs not (or no longer) managed by JJB - https://phabricator.wikimedia.org/T91410#1109567 (Legoktm) >>! In T91410#1109080, @hashar wrote: > The mwext-*-composer-{hhvm,zend} jobs have been created fairly recently, they are not triggered and might have been...
[15:08:41] Staging, Patch-For-Review: Create staging-db* (databases) - https://phabricator.wikimedia.org/T91545#1109574 (greg) >>! In T91545#1109522, @thcipriani wrote: >>>! In T91797#1109283, @Springle wrote: >> # Provision box, sign puppet, first run, etc >> # xtrabackup clone & prepare data from another serve...
[15:10:58] Staging, Patch-For-Review: Create staging-db* (databases) - https://phabricator.wikimedia.org/T91545#1109576 (yuvipanda) @springle How would this be handled for master, where there's nowhere to clone from?
[15:10:59] !log Jenkins UI in German, again
[15:11:03] Logged the message, Master
[15:11:08] * hashar blames Germany
[15:19:38] Continuous-Integration: Pool new integration-slave14xx instances and delete old ones - https://phabricator.wikimedia.org/T91524#1109604 (Krinkle)
[15:19:43] Continuous-Integration: Pool new integration-slave14xx instances and delete old ones - https://phabricator.wikimedia.org/T91524#1089205 (Krinkle)
[15:19:44] Continuous-Integration: Fix "ImportError: Entry point ('console_scripts', 'tox') not found" on integration-slave12xx instances - https://phabricator.wikimedia.org/T91526#1109606 (Krinkle)
[15:21:00] Continuous-Integration: Pool new integration-slave14xx instances and delete old ones - https://phabricator.wikimedia.org/T91524#1089205 (Krinkle) Open>Resolved Blocking issues for Ubuntu Trusty resolved. Our instance creation process as documented now works cleanly. New instances were created and poole...
[15:21:34] Continuous-Integration: Fix "Entry point ('console_scripts', 'tox') not found" on new slaves running Ubuntu Precise - https://phabricator.wikimedia.org/T91526#1109611 (Krinkle)
[15:38:57] Continuous-Integration: Jenkins: Use node-jscs as checkstyle for javascript coding style - https://phabricator.wikimedia.org/T56218#1109636 (Krinkle) >>! In T56218#973118, @adrianheine wrote: > Cool, thanks. I added `npm test`, although I didn't go through grunt since we are not running any other grunt jobs,...
[15:39:52] Continuous-Integration, Flow: Jenkins reports test failures in current master: Cannot override frozen service "storage" - https://phabricator.wikimedia.org/T91951#1109644 (Krinkle) Has this been resolved? Has it happened recently?
[15:48:34] Continuous-Integration, Flow: Jenkins reports test failures in current master: Cannot override frozen service "storage" - https://phabricator.wikimedia.org/T91951#1109666 (EBernhardson) Open>Resolved This should have been resolved by Ic80b137e6ca2f8ba6ad34ce0bdbb1319b81c0a6e
[16:04:14] chrismcmahonbrb, but also others who may be around, do you know off the top of your head how to check if an element is in the viewport, in selenium?
[16:04:52] marktraceur: element.visible? maybe
[16:07:21] Continuous-Integration: Fetch dependencies using composer instead of cloning mediawiki/vendor repository for non-WMF deployment branches - https://phabricator.wikimedia.org/T90303#1109723 (Krinkle)
[16:07:45] I was hoping that was the case but I thought it might work if the element is visible, but not scrolled to.
[16:09:03] marktraceur: i'm actually not sure. looking at the selenium bindings...
[16:10:51] marktraceur: I usually use foo_element.when_present.action. when_present checks that the element is both visible and also can be manipulated
[16:11:04] marxarelli: ^^
[16:11:55] chrismcmahon: And presumably element.should be_present would work too?
[16:12:05] I don't need to *do* anything to it, just make sure it's scrolled to.
[16:12:10] marktraceur: there is a method deep in selenium called ScrollIntoView that is supposed to be called for each element, but I think Gilles found a bug with that.
[16:12:18] marktraceur: expect(element).to_be present, but yeah
[16:12:39] ...huh, I see different syntax in our tests.
[16:12:59] the result of any predicate method can be asserted using a `be_` matcher
[16:13:10] There we go.
[16:13:14] it's probably the old < 2.99 rspec syntax
[16:13:29] Great success.
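Note on the viewport question above: the open point in the discussion is that an element can count as visible while still being scrolled out of view. A minimal sketch of the purely geometric check, assuming the rectangle comes from the browser's `getBoundingClientRect()` and the viewport size from `window.innerWidth` / `window.innerHeight` (for example fetched through Selenium's script execution). `in_viewport` is a hypothetical helper name, not part of Selenium or of the test suites mentioned here.

```python
def in_viewport(rect, viewport_width, viewport_height, fully=True):
    """Return True if rect lies inside the viewport.

    rect is a dict with 'left', 'top', 'right', 'bottom' in viewport
    coordinates (the layout getBoundingClientRect() reports).
    With fully=False, any overlap with the viewport is enough.
    """
    if fully:
        return (rect["left"] >= 0
                and rect["top"] >= 0
                and rect["right"] <= viewport_width
                and rect["bottom"] <= viewport_height)
    # Partial visibility: the rectangle and the viewport overlap.
    return (rect["right"] > 0
            and rect["bottom"] > 0
            and rect["left"] < viewport_width
            and rect["top"] < viewport_height)


# An element below the fold of an 800x600 viewport is not "in view",
# even though a visibility check on it may still succeed.
below_fold = {"left": 10, "top": 700, "right": 110, "bottom": 760}
print(in_viewport(below_fold, 800, 600))
```

Keeping the geometry in a plain function means the check can be unit-tested without a browser; only fetching the rectangle needs a live session.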
[16:13:33] (PS1) 20after4: move auth to an untracked file, auth.php [tools/release] - https://gerrit.wikimedia.org/r/195941
[16:13:35] (PS1) 20after4: migrate-patch utility to retain security patches [tools/release] - https://gerrit.wikimedia.org/r/195942
[16:13:41] yes, I think UW is one of few places that still has old RSpec syntax, my bad
[16:14:16] We're bad, bad people.
[16:14:41] (Abandoned) Krinkle: Fix $wgTmpDirectory race condition [integration/jenkins] - https://gerrit.wikimedia.org/r/193832 (https://phabricator.wikimedia.org/T91070) (owner: Hashar)
[16:15:01] nah, there is no deadline for migrating from RSpec 2 to 3, but I did most of the repos back in Nov/Dec and just petered out before doing UW
[16:22:01] RECOVERY - Puppet failure on deployment-apertium01 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:22:46] Quality-Assurance, MediaWiki-extensions-Sentry, Multimedia, Multimedia-Sprint-2015-03-11, Patch-For-Review: Automated tests for Sentry error logging - https://phabricator.wikimedia.org/T88078#1109745 (Gilles)
[16:55:01] (PS2) Krinkle: scm: Enable shallow-clone for git-remoteonly-zuul [integration/config] - https://gerrit.wikimedia.org/r/195021
[16:59:23] Continuous-Integration, Jenkins: PHP fatal errors are not visible in jenkins output - https://phabricator.wikimedia.org/T92397#1109886 (daniel)
[16:59:54] Continuous-Integration, Jenkins: PHP fatal errors are not visible in jenkins output in Wikibase phpunit jobs - https://phabricator.wikimedia.org/T92397#1109889 (Legoktm)
[17:01:33] (CR) BryanDavis: [C: 1] "Great idea" [tools/release] - https://gerrit.wikimedia.org/r/195941 (owner: 20after4)
[17:27:42] Release-Engineering, Staging, releng-201415-Q3: Determine code update cycle/cadence for the staging cluster - https://phabricator.wikimedia.org/T91563#1110084 (yuvipanda) How is code going to be deployed / updated?
[17:29:33] Release-Engineering, Staging, releng-201415-Q3: Determine code update cycle/cadence for the staging cluster - https://phabricator.wikimedia.org/T91563#1110106 (yuvipanda) I'd say puppet run frequencey shouldn't change anything, and we should match prod on that (every 20mins). I'm not sure what advantag...
[17:30:09] Continuous-Integration, Wikimedia-Fundraising-CiviCRM: CI for Civi: provision and run tests under Jenkins/Zuul - https://phabricator.wikimedia.org/T86103#1110110 (Ejegg)
[17:33:01] Quality-Assurance, MediaWiki-extensions-Sentry, Multimedia, Multimedia-Sprint-2015-03-11, Patch-For-Review: Automated tests for Sentry error logging - https://phabricator.wikimedia.org/T88078#1110128 (Tgr)
[17:35:42] Release-Engineering, Fundraising Tech Backlog, Scrum-of-Scrums, Wikimedia-Fundraising-CiviCRM, and 2 others: Continuous integration - CiviCRM - https://phabricator.wikimedia.org/T78100#1110140 (Cmcmahon)
[17:37:14] Release-Engineering, Staging, releng-201415-Q3: Determine code update cycle/cadence for the staging cluster - https://phabricator.wikimedia.org/T91563#1110143 (greg) less frequent than continuously is what the "less frequent" refers to. I'd prefer continuously. We don't make production for MW code (wee...
[17:42:21] Release-Engineering, Staging, releng-201415-Q3: Determine code update cycle/cadence for the staging cluster - https://phabricator.wikimedia.org/T91563#1110153 (greg) >>! In T91563#1110084, @yuvipanda wrote: > How is code going to be deployed / updated? the same way (minus the humans) that it is done i...
[17:43:14] Release-Engineering, Staging, releng-201415-Q3: Determine code update cycle/cadence for the staging cluster - https://phabricator.wikimedia.org/T91563#1110154 (yuvipanda) >>! In T91563#1110153, @greg wrote: >>>! In T91563#1110084, @yuvipanda wrote: >> How is code going to be deployed / updated? > > th...
[17:43:21] Release-Engineering, Staging, releng-201415-Q3: Determine code update cycle/cadence for the staging cluster - https://phabricator.wikimedia.org/T91563#1110155 (yuvipanda) (i'd prefer a cron)
[17:44:16] Release-Engineering, Staging, releng-201415-Q3: Determine code update cycle/cadence for the staging cluster - https://phabricator.wikimedia.org/T91563#1110156 (greg) >>! In T91563#1110155, @yuvipanda wrote: > (i'd prefer a cron) cron doesn't give you easy history without building/collecting that info...
[17:45:52] Release-Engineering, Staging, releng-201415-Q3: Determine code update cycle/cadence for the staging cluster - https://phabricator.wikimedia.org/T91563#1110169 (yuvipanda) >>! In T91563#1110156, @greg wrote: > cron doesn't give you easy history without building/collecting that info yourself Hmm, that'...
[18:24:38] thcipriani: I’m ‘stuck’ on some really, *really* strange ssh key issues for mwdeploy, and could use a fresh pair of eyes. got a moment?
[18:24:53] YuviPanda: yeah, what's up?
[18:25:30] thcipriani: alright, so I’m trying to use keyholder for ssh keys in the deployment host
[18:25:34] that’s how it is done in prod
[18:25:46] look at the keyholder module in prod. it basically is a couple of ssh-agents
[18:25:52] so a root does keyholder arm
[18:25:57] and enters the passphrase
[18:26:06] and the key is available via ssh-agent to other mortals
[18:26:26] this lets people ssh as mwdeploy to other hosts without having access to the private key themselves.
[18:26:40] yup, I'm with you.
[18:26:52] thcipriani: so with https://gerrit.wikimedia.org/r/#/c/195865/ the new key is set up.
[18:27:08] thcipriani: so if you go to deployment-mediawiki01 and check the key, it matches
[18:27:21] thcipriani: and if you go to deployment-bastion and see the private key (/etc/keyholder.d/) it matches
[18:27:33] thcipriani: but when I try to ssh from deployment-bastion to deployment-mediawiki01
[18:27:36] it fails
[18:27:44] and ssh on mediawiki01 says ‘public key failure'
[18:27:49] and then gives me a fingerprint
[18:27:57] WHICH MATCHES THE FINGERPRINT OF THE KEY CORRECTLY
[18:28:02] so I’m completely confused as to wtf is happening
[18:28:19] it says key mismatch, but I go and verify the keys and they do match...
[18:28:57] oh, good. alright, lemme catch up with what you're seeing.
[18:29:34] thcipriani: ok
[18:30:43] (PS1) 20after4: Fix branched sub-submodule support [tools/release] - https://gerrit.wikimedia.org/r/195972
[18:34:56] YuviPanda: so, I am able to get to deployment-mediawiki01 from deployment-bastion as mwdeploy
[18:35:02] ...
[18:35:08] thcipriani: what’s the command you are using?
[18:35:55] (CR) Jforrester: "Seems sane." [tools/release] - https://gerrit.wikimedia.org/r/195972 (owner: 20after4)
[18:36:02] so, I just did: sudo su mwdeploy then did ssh -v deployment-mediawiki01.eqiad.wmflabs
[18:36:22] interesting
[18:36:31] > Mar 11 18:34:58 deployment-mediawiki01 sshd[20542]: Accepted publickey for mwdeploy from 10.68.16.58 port 58819 ssh2: RSA 84:c5:fd:fa:c9:a0:c8:ea:ba:48:2c:a6:bd:e6:03:05
[18:36:32] for yours
[18:36:40] > Mar 11 18:35:54 deployment-mediawiki01 sshd[20612]: Failed publickey for mwdeploy from 10.68.16.58 port 60831 ssh2: RSA f0:54:06:fa:17:27:97:a2:cc:69:a0:a7:df:4c:0a:e3
[18:36:41] for mine
[18:37:12] what commands were you using?
[18:37:19] SSH_AUTH_SOCK=/run/keyholder/proxy.sock ssh mwdeploy@deployment-mediawiki01
[18:37:22] that’s the one with the new key
[18:37:24] oh wait.
[18:37:48] yeah, I switched users.
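Note on the two sshd log lines above: the accepted key (84:c5:...) and the failed key (f0:54:...) have different fingerprints, so one way to narrow the problem down is to compute the fingerprint of every candidate public key and see which one is actually being offered. A minimal sketch, assuming the colon-separated hex printed by sshd here is the MD5 digest of the base64-decoded key blob (the legacy OpenSSH fingerprint format) and that each public key line has the plain `type base64 [comment]` layout with no options prefix. `md5_fingerprint` is a hypothetical helper name.

```python
import base64
import hashlib


def md5_fingerprint(pubkey_line):
    """Return the legacy colon-separated MD5 fingerprint of a public key line.

    Assumes the line looks like 'ssh-rsa AAAA... comment'; lines that
    start with authorized_keys options are not handled by this sketch.
    """
    fields = pubkey_line.split()
    blob = base64.b64decode(fields[1])       # raw key blob
    digest = hashlib.md5(blob).hexdigest()   # 32 hex characters
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


# Toy input: 'YWJj' is base64 for b'abc', not a real key.
print(md5_fingerprint("ssh-rsa YWJj example@host"))
```

Running this over the public half of the key under /etc/keyholder.d/ and over each key the target host accepts for mwdeploy would show directly whether the fingerprint in the "Failed publickey" line belongs to any of them.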
[18:38:10] thcipriani: so whatever I typed should work
[18:38:28] apparently
[18:38:33] hmm, let me try that in prod
[18:40:24] thcipriani: and it does
[18:40:25] > yuvipanda@tin:~$ SSH_AUTH_SOCK=/run/keyholder/proxy.sock ssh mwdeploy@mw1219
[18:40:28] mwdeploy@mw1219:~$
[18:42:33] YuviPanda: have you tried explicitly setting the file in keyholders.d as your identity file as root?
[18:43:00] thcipriani: yeah, that doesn’t work either.
[18:43:29] thcipriani: hmm, so there is a different key in /etc/ssh/userkeys/mwdeploy, and in /home/mwdeploy
[18:43:31] maybe that’s the problem
[18:43:41] thcipriani: I’m going to create a brand new node and see how that goes
[18:43:53] kk
[18:44:24] thcipriani: thanks for poking :D
[18:44:37] anytime.
[18:48:02] Project browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_8.1-internet_explorer-11-sauce build #364: FAILURE in 41 min: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_8.1-internet_explorer-11-sauce/364/
[18:54:36] Project beta-scap-eqiad build #44908: FAILURE in 40 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/44908/
[18:56:43] Quality-Assurance, MediaWiki-extensions-GettingStarted: Cucumber tests for GettingStarted - https://phabricator.wikimedia.org/T92156#1110527 (Mattflaschen)
[18:57:11] Quality-Assurance, MediaWiki-extensions-GettingStarted: Cucumber tests for GettingStarted - https://phabricator.wikimedia.org/T92156#1103051 (Mattflaschen)
[18:57:39] Quality-Assurance, MediaWiki-extensions-GettingStarted: Cucumber tests for GettingStarted - https://phabricator.wikimedia.org/T92156#1103051 (Mattflaschen)
[18:59:16] PROBLEM - Puppet failure on deployment-mediawiki03 is CRITICAL: CRITICAL: 57.14% of data above the critical threshold [0.0]
[19:02:01] Yippee, build fixed!
[19:02:02] Project browsertests-UploadWizard-commons.wikimedia.beta.wmflabs.org-linux-chrome-sauce build #528: FIXED in 29 min: https://integration.wikimedia.org/ci/job/browsertests-UploadWizard-commons.wikimedia.beta.wmflabs.org-linux-chrome-sauce/528/
[19:02:57] Staging, Patch-For-Review: Setup staging-tin as deployment host - https://phabricator.wikimedia.org/T88442#1110555 (yuvipanda) a:yuvipanda
[19:04:16] RECOVERY - Puppet failure on deployment-mediawiki03 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:05:01] (PS1) Legoktm: disable submodules for git-remote-zuul-shallow-clone [integration/config] - https://gerrit.wikimedia.org/r/195985 (https://phabricator.wikimedia.org/T92042)
[19:10:30] thcipriani: also, I had to remove the pubkey manually for salt even in deployment-prep, even after setting up hiera properly. on doing a diff with the new key + the key in hiera, the only difference was the newline at the end (hiera one didn’t have it)
[19:10:44] (just a fyi)
[19:10:56] yeah, thought I got that fixed. Did you just do that?
[19:11:42] oh, wait, deployment-prep
[19:12:19] YuviPanda: put a yaml key under the role::salt::minions::salt_master_key I think it's mediawiki chomping the newline
[19:12:33] aaaah
[19:12:36] intersting.
[19:12:49] yeah, that’s possible
[19:13:01] thcipriani: also, once we test them out, we should move things from wikitech on to ops/puppet I think
[19:13:33] YuviPanda: probably a good plan, for stability sake.
[19:13:38] thcipriani: also, once I’m done with staging-tin, I’m going to try to find a solution to the wikitech interface. basically, have an ENC that lets us specify regexes that match hostnames, and then just what roles they should get.
[19:13:57] let me find the bug for that
[19:14:02] that'd be awesome.
[19:14:12] thcipriani: https://phabricator.wikimedia.org/T85279
[19:14:28] thcipriani: so basically we’d have a YAML file, and bam, that’s it, really
[19:14:39] Yippee, build fixed!
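Note on the hostname-regex idea above: a minimal sketch of the matching step only, assuming the rules are an ordered list of (regex, roles) pairs as they might be loaded from the YAML file mentioned. The patterns, the role names and the `classes_for` helper are all hypothetical and are not taken from ops/puppet or from T85279.

```python
import re

# Hypothetical rules: (hostname regex, roles to apply). In the idea discussed
# above these would live in a YAML file; a literal keeps the sketch standalone.
RULES = [
    (r"^staging-tin$", ["role::example::deployment_host"]),
    (r"^staging-mw\d+$", ["role::example::appserver"]),
    (r"^staging-", ["role::example::base"]),
]


def classes_for(hostname, rules=RULES):
    """Collect the roles of every rule whose regex matches the hostname."""
    classes = []
    for pattern, roles in rules:
        if re.search(pattern, hostname):
            for role in roles:
                if role not in classes:   # keep rule order, drop duplicates
                    classes.append(role)
    return {"classes": classes}


print(classes_for("staging-tin"))
```

Letting every matching rule contribute, rather than stopping at the first match, is a design choice of this sketch: it allows a broad pattern to supply common roles while narrower patterns add host-specific ones.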
[19:14:39] Project beta-scap-eqiad build #44910: FIXED in 39 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/44910/ [19:14:42] nice. [19:14:49] recreating an instance just involves actually deleting it and creating a new one, and puppet should do most of the rest (unless there are manual steps needed) [19:15:36] PROBLEM - Puppet failure on deployment-restbase01 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [19:17:08] PROBLEM - App Server Main HTTP Response on deployment-mediawiki03 is CRITICAL: HTTP CRITICAL: HTTP/1.1 404 Not Found - string 'Wikipedia' not found on 'http://en.wikipedia.beta.wmflabs.org:80/wiki/Main_Page?debug=true' - 391 bytes in 0.001 second response time [19:17:40] PROBLEM - Puppet failure on deployment-stream is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [19:25:19] PROBLEM - Puppet failure on deployment-mediawiki03 is CRITICAL: CRITICAL: 75.00% of data above the critical threshold [0.0] [19:31:00] (PS1) Legoktm: Create separate mediawiki-core-phplint job [integration/config] - https://gerrit.wikimedia.org/r/195990 (https://phabricator.wikimedia.org/T92042) [19:39:20] (CR) Legoktm: [C: 2] Create separate mediawiki-core-phplint job [integration/config] - https://gerrit.wikimedia.org/r/195990 (https://phabricator.wikimedia.org/T92042) (owner: Legoktm) [19:42:44] RECOVERY - Puppet failure on deployment-stream is OK: OK: Less than 1.00% above the threshold [0.0] [19:44:51] (Merged) jenkins-bot: Create separate mediawiki-core-phplint job [integration/config] - https://gerrit.wikimedia.org/r/195990 (https://phabricator.wikimedia.org/T92042) (owner: Legoktm) [19:45:27] Project beta-scap-eqiad build #44913: FAILURE in 44 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/44913/ [19:47:12] !log deployed https://gerrit.wikimedia.org/r/195990 [19:47:15] Logged the message, Master [19:50:16] RECOVERY - Puppet failure on deployment-mediawiki03 is OK: OK: 
Less than 1.00% above the threshold [0.0] [19:51:21] Project browsertests-Flow-en.wikipedia.beta.wmflabs.org-linux-firefox-monobook-sauce build #361: FAILURE in 49 min: https://integration.wikimedia.org/ci/job/browsertests-Flow-en.wikipedia.beta.wmflabs.org-linux-firefox-monobook-sauce/361/ [19:54:34] Yippee, build fixed! [19:54:34] Project beta-scap-eqiad build #44914: FIXED in 36 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/44914/ [20:00:34] RECOVERY - Puppet failure on deployment-restbase01 is OK: OK: Less than 1.00% above the threshold [0.0] [20:05:34] Continuous-Integration, Patch-For-Review: Generic phplint job is extremely slow for mediawiki/core - https://phabricator.wikimedia.org/T92042#1110811 (Legoktm) p:High>Normal I recreated a specific "mediawiki-core-phplint" job for now that will have a workspace that it can re-use. Lowering priority a... [20:11:02] Yippee, build fixed! [20:11:03] Project browsertests-UploadWizard-commons.wikimedia.beta.wmflabs.org-linux-firefox-sauce build #533: FIXED in 38 min: https://integration.wikimedia.org/ci/job/browsertests-UploadWizard-commons.wikimedia.beta.wmflabs.org-linux-firefox-sauce/533/ [20:16:12] (PS1) Legoktm: Simplify deploying zuul with fabric [integration/config] - https://gerrit.wikimedia.org/r/196002 [20:22:18] legoktm: holy shit not yet another deployment system :( [20:22:55] :< [20:23:10] though that is run locally :D [20:23:32] yup, nothing needed server-side [20:26:33] legoktm: I think scap fits this use case better. [20:26:44] * legoktm slaps YuviPanda [20:26:45] shush [20:26:48] (CR) Yuvipanda: "better call scap" [integration/config] - https://gerrit.wikimedia.org/r/196002 (owner: Legoktm) [20:26:54] scap? [20:27:01] scap fist one and only one use case [20:27:16] he mean git-deploy/ryanlane/trebuchet/etc probably [20:27:17] hehe ‘fist' [20:27:38] trebuchet is ... 
an option [20:27:44] I can't stand Trebuchet myself, I find it terribly annoying [20:27:47] I was mostly making a https://en.wikipedia.org/wiki/Better_Call_Saul joke [20:27:50] but then that is the deployment system.. [20:27:53] I think fab is the right thing to do here [20:28:12] hashar: :( yeah [20:28:21] It has a lot of really good parts [20:28:28] and a few really really bad parts [20:28:47] mostly involving salt [20:29:19] and the way that salt doesn't support a synchronous command and control mode [20:29:56] I just need one command to use so I don't have to keep mistyping git commands over a laggy ssh connection [20:30:06] (CR) Yuvipanda: [C: 1] Simplify deploying zuul with fabric [integration/config] - https://gerrit.wikimedia.org/r/196002 (owner: Legoktm) [20:30:13] whatever underlying system it uses doesn't really matter to me as long as it works :P [20:34:08] $ sudo -ni -u zuul /bin/bash [20:34:08] sudo: sorry, a password is required to run sudo [20:34:08] :-((( [20:34:21] it should know [20:35:15] er, is your .ssh/config set up properly? [20:36:41] yeah, that is my sudo rule which does not let me sudo as zuul [20:38:34] er, but then how do you deploy zuul changes? [20:38:49] got root access [20:38:58] so I do something like: sudo su - zuul [20:39:36] right, I normally do "sudo -su zuul" as well [20:40:19] -n means non-interactive, and -i means use a login shell [20:40:33] so I don't see why those options would cause an issue... [20:43:18] (CR) Hashar: "At first I thought we should use scap, but it has proven to be annoying. 
In this case the fabric part is only running on our local machin" (3 comments) [integration/config] - https://gerrit.wikimedia.org/r/196002 (owner: Legoktm) [20:43:56] legoktm: it is just my sudo policy and/or sudo which are/is crazy [20:44:16] I should add myself as a contintadmin [20:44:46] hashar: I never figured out how to get my *.wmnet proxycommand to let me ssh directly into gallium.wm.o from the outside [20:44:46] :x [20:44:57] oh [20:45:32] I have a paste for that [20:45:37] ProxyCommand ssh -a -W %h:%p bast1001.wikimedia.org [20:45:48] https://phabricator.wikimedia.org/P281 [20:45:59] that should cover your ass [20:46:55] when you do: ssh somehost.there.tld [20:47:19] somehost.there.tld is passed through ssh config Host [20:47:27] and if it matches the pattern, it uses the commands [20:47:45] at the very bottom I have: [20:47:46] Host *.eqiad.wmnet people.wikimedia.org gallium.wikimedia.org [20:47:54] to set the ProxyCommand via bast1001.wikimedia.org [20:48:03] my config right now looks like http://fpaste.org/196855/26106872/raw/ [20:48:51] woot [20:48:53] got it [20:48:57] thanks hashar [20:49:12] legoktm: gallium ssh port is behind a firewall [20:49:28] so you have to adjust your Host *.eqiad.wmnet pattern [20:49:45] Host *.eqiad.wmnet gallium.wikimedia.org [20:50:23] that is the nicest trick I ever learned while at the wmf [20:50:41] (that and talking to ori or bd808 to get them to implement stuff hehe) [20:52:19] (PS2) Legoktm: Simplify deploying zuul with fabric [integration/config] - https://gerrit.wikimedia.org/r/196002 [20:52:26] (CR) Legoktm: Simplify deploying zuul with fabric (2 comments) [integration/config] - https://gerrit.wikimedia.org/r/196002 (owner: Legoktm) [21:01:30] (CR) Yuvipanda: "Lest I be accused of actually suggesting we use scap for this, I was mostly making a 'better call saul' reference." 
[integration/config] - https://gerrit.wikimedia.org/r/196002 (owner: Legoktm) [21:02:15] thcipriani: so, there’s this series of patches (https://gerrit.wikimedia.org/r/#/q/status:open+project:operations/puppet+branch:production+topic:ssh-userkey,n,z) that I’ll get merged tomorrow... [21:02:26] thcipriani: and then we can probably drop beta/scap \o/ [21:02:33] nice! [21:03:07] Staging, Patch-For-Review: Setup staging-tin as deployment host - https://phabricator.wikimedia.org/T88442#1111061 (yuvipanda) Alright, looks like I'll have to merge https://gerrit.wikimedia.org/r/#/q/status:open+project:operations/puppet+branch:production+topic:ssh-userkey,n,z to get this done without in... [21:03:17] thcipriani: you were right about tin being a rabbit hole. so far it’s taken us what… 2 days? 3? [21:03:20] YuviPanda: where does that put completion of tin? [21:03:32] thcipriani: basically tin is done once those series of patches are done :) [21:03:47] thcipriani: uh, actually, no... [21:03:50] thcipriani: it doesn’t. [21:04:02] thcipriani: there’s still some things missing - like no initial clone of /srv/mediawiki-staging, for example [21:04:26] I've been looking at other roles, a lot of them tend to have a lot of overlap with tin, so my _hope_ is that once tin falls there'll be a clearer path forward. [21:05:16] YuviPanda: are there any roles on tin you want to split up? Or are they all too interconnected? [21:05:58] I was just going to start plugging away on mx, since it's fairly discrete, near as I can tell. [21:06:13] thcipriani: I think that’s fine. tin is basically ‘deployment-master’ (not counting the other misc stuff going on there (like releases / labsdb) that don’t apply to us) [21:06:24] thcipriani: yeah, I think the mw roles would be simpler, for example, with tin... [21:06:49] thcipriani: another requirement is for scap to know where to pick things from. if it is in labs, scap tries to scap from deployment-bastion, regardless of which project it is in... 
[21:07:16] Continuous-Integration, Patch-For-Review: Generic phplint job is extremely slow for mediawiki/core - https://phabricator.wikimedia.org/T92042#1111075 (hashar) So mw/core master on Precise takes a good minute to do the shallow clone. With the old /per repositories/ jobs, the repo was already available so... [21:07:44] YuviPanda: scap::master::rsync_host have anything to do with that? [21:08:03] thcipriani: nope, it’s in scap.cfg in the scap repo [21:08:06] not in puppet [21:08:27] but I don’t know how that .cfg file picks values up - from a cursory look, it looks to be picking up based on hostname... [21:08:54] thcipriani: however, once the userkeys are fixed, I believe that the patch itself can be merged, and I will re-image deployment-prep stuff after that. This allows us to kill beta/scap code [21:08:56] so that’s good [21:09:15] (CR) Hashar: [C: 1] "Good to me. We can later on improve the fab file to add more commands such as checking gearman functions." [integration/config] - https://gerrit.wikimedia.org/r/196002 (owner: Legoktm) [21:09:43] YuviPanda: hooray! [21:10:09] thcipriani: I think at that point I’ll take a break (I smell of rabbit poop now), and work on the puppet ENC so we can start using yaml to specify roles / hostname mappings [21:11:18] Quality-Assurance, Release-Engineering, Phabricator, Phabricator-Sprint-Extension, and 2 others: Create Browser Tests for Phabricator - https://phabricator.wikimedia.org/T87359#1111086 (chasemp) p:High>Normal [21:11:18] yeah, that seems like a Good Thing™ to have. 
[21:11:29] Continuous-Integration, Phabricator, Phabricator-Sprint-Extension: Integrate Jenkins with Phabricator with Harbormaster - https://phabricator.wikimedia.org/T89714#1111087 (chasemp) p:High>Normal [21:11:36] Release-Engineering, Phabricator, Phabricator-Sprint-Extension, Patch-For-Review: Create a continuous integration plan for Wikimedia Phabricator patches - https://phabricator.wikimedia.org/T85123#1111088 (chasemp) p:High>Normal [21:11:44] thcipriani: yup, yup [21:12:20] Deployment-Systems, operations, Graphite: [scap] Deploy events aren't showing up in graphite/gdash - https://phabricator.wikimedia.org/T64667#1111094 (chasemp) p:High>Normal reducing priority to reflect the obvious back burner status [21:15:16] greg-g: btw, we’re starting to have goal discussions for next quarter inside ops. should figure out where the releng / ops stuff stands now. [21:15:21] and how far we’ve come. [21:15:28] yep [21:15:32] and how much farther we need to go for BaaS [21:15:38] "a ways" [21:15:45] heh [21:15:57] but yeah, bring me or whoever in whatever conversation makes sense [21:15:57] thcipriani: is churning through them pretty quickly :D [21:16:11] greg-g: yeah, will do. mostly over etherpad / email, at least for now. [21:16:19] which list? [21:20:35] BaaS? [21:20:55] BetaCluster as a Service [21:21:12] yeah [21:21:16] ie: destroy/recreate should be as simple as humanly possible, "one click" [21:21:43] greg-g: I think by end of this week, it *will* be almost that simple for quite some things (mw instances, memcached, etc?) [21:21:52] a lot of the grunt work is gone now. [21:22:00] * greg-g nods [21:22:01] puppetmaster, etc is set projectwide, and so is salt master... 
[21:22:27] yeah, tin being one of the big things to fall and then others will come along more easily (/me knocks on wood) [21:22:33] yeah [21:22:40] * greg-g was just told that by tyler at least :) [21:22:47] that patch has begat about 6 patches now, and those have been merged [21:23:04] now its biggest blocker is this series of 8 patches from paravoid, so I’m going to test and carefully merge them tomorrow :) [21:23:38] greg-g: oh, so if this works on deployment-prep tomorrow, I might merge it as well. Is there a point where nobody’s deploying so it is safe to merge this thing that should be a nop on tin but better be safe anyway? [21:23:40] * YuviPanda looks at calendar [21:24:10] on tin on production? [21:24:14] yeah. [21:24:28] look at the calendar but give yourself ample wiggle room [21:24:32] yeah [21:24:35] I’ll give myself 4h [21:24:41] but too early to decide now [21:24:44] * greg-g nods [21:24:50] since it isn’t done on deployment-prep yet [21:25:00] ideally I guess we'd do that on Friday [21:25:03] yeah [21:25:07] I don’t mind that [21:25:08] greg-g: Surely, BeCaaS (pronounced "because")? [21:25:15] heh [21:25:27] YuviPanda: let's assume Friday then, unless something else forces your hand [21:25:58] greg-g: yup, makes sense. [21:26:20] greg-g: I’ll actually put it on the calendar once the patch is to my satisfaction [21:27:44] YuviPanda: perfecto [21:28:06] there’s also etcd / zk work planned for sometime next quarter or something like that [21:28:12] that should also help a lot with BeCaaS [21:28:18] * greg-g doesn't know what zk is [21:28:24] (since we won’t need to manually put IP addresses / hostnames everywhere [21:28:26] ZooKeeper [21:28:37] so instead of having to put the IPs for all mw instances in varnish via hiera... 
[21:28:45] ahhh, glad it's not https://en.wikipedia.org/wiki/ZK_%28framework%29 [21:28:47] hehe [21:29:02] anyway, it’s just more dynamic, less hand-holding [21:29:07] (AFAIK, which could be really wrong) [21:29:24] yeah, that's what I understand it to provide, which would be great [21:30:44] yeah [21:30:59] so hopefully, by end of calendar year, we’ll actually have a turnkey BeCaaS [21:31:10] and rotate them around every week or two [21:31:26] be still my beating heart [21:31:37] heh [21:31:49] Aka YuviPanda you're awesome. [21:31:54] once staging is ‘done’ I think recreation time will be down to about… 6h? [21:31:58] http://www.phrases.org.uk/meanings/57000.html [21:32:23] greg-g: aha! that’s kind of what I guessed that meant, but wasn’t fully sure :D [21:32:35] :) [21:32:38] swift is still my biggest worry tho [21:32:53] let’s cross that bridge when we come to it :) [21:33:16] and maybe even terbium. [21:33:24] has a lot of stuff that I don’t know why it has [21:33:36] * greg-g nods [21:34:12] I’ll go to sleep now [21:34:20] night everyone [21:34:57] Continuous-Integration: Fix "Entry point ('console_scripts', 'tox') not found" on new slaves running Ubuntu Precise - https://phabricator.wikimedia.org/T91526#1111165 (hashar) Should be tried again on a fresh Precise instance. I am pretty sure that was caused by tox installation to be readable only by root (... 
[21:54:36] Project browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_7-internet_explorer-9-sauce build #360: FAILURE in 45 min: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_7-internet_explorer-9-sauce/360/ [22:12:09] PROBLEM - Free space - all mounts on deployment-bastion is CRITICAL: CRITICAL: deployment-prep.deployment-bastion.diskspace._var.byte_percentfree.value (<12.50%) [22:22:48] (PS1) Legoktm: make-wmf-branch: Branch SandboxLink [tools/release] - https://gerrit.wikimedia.org/r/196084 (https://phabricator.wikimedia.org/T72499) [22:23:57] greg-g: when can we schedule https://phabricator.wikimedia.org/T72499 to go out? [22:25:29] legoktm: Gosh, one of mine. :-) [22:25:42] :) [22:32:11] PROBLEM - Free space - all mounts on deployment-bastion is CRITICAL: CRITICAL: deployment-prep.deployment-bastion.diskspace._var.byte_percentfree.value (<11.11%) [22:59:32] (PS1) Legoktm: Include SandboxLink in the shared extension job [integration/config] - https://gerrit.wikimedia.org/r/196095 [23:03:11] (CR) Legoktm: "Good to go once Id16c090b18b3de4297dcd9274ceb4d62cfcc2b2e is merged." [integration/config] - https://gerrit.wikimedia.org/r/196095 (owner: Legoktm) [23:16:52] legoktm: can you clarify the title (titles with ? in them scare me) with what the actual plan is? [23:16:55] legoktm: of that task [23:18:03] alright :P [23:20:39] greg-g: updated with a specific list of wikis, we'll probably want to just put it on a testwiki to begin with [23:21:33] legoktm: that makes so much more sense now [23:22:40] so it's "just" extensionizing a gadget and deploying that and enabling it only on wikis that already had the gadget on by default, that's not scary at all (since the security review is done). [23:23:00] yup [23:25:46] legoktm: also, do you want to be the one to push the buttons (ie: in a one-off window not in SWAT or the train)? or do you want it to be done by mukunda? 
[23:26:41] greg-g: we should have a specific deployment window for this since we'll need time to go around to all the wikis and disable gadgets [23:27:42] dunno if MatmaRex has a script that can do that... [23:28:06] ahhhh [23:28:36] legoktm: is next week too soon to coordinate that stuff? [23:30:10] greg-g: fine with me, but we should check with MatmaRex [23:31:06] yeah [23:35:16] i don't, but one could be written [23:35:20] needs announcements first [23:36:04] I'm going to dip out soon-ish, replies on the ticket with suggestions welcome [23:54:05] MatmaRex: do we just need to notify them via tech news? or do they actually have to do something? [23:56:51] if we disable the gadgets ourselves, then hopefully not [23:57:01] but who knows [23:57:58] ambassadors list, tech news