[02:03:37] (CR) Legoktm: "What about older release branches?" [integration/config] - https://gerrit.wikimedia.org/r/225687 (owner: Florianschmidtwelzow)
[02:15:25] !log upgraded to elasticsearch-1.7.0.deb on deployment-logstash2
[02:15:29] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master
[02:30:24] bd808: Fingers crossed... I should be out of FL on the 1st August
[02:32:18] Sweet. And back home or on to somewhere else?
[02:33:31] Back home... Wedding in Norway on the 8th. On to Spain 21st September
[02:34:06] But when not in the USA, I can work again!
[02:34:30] Most excellent.
[02:34:57] I'm sure we can find some brokenness to point you at
[02:35:10] My bank balance? ;)
[02:35:34] Heh.
[02:35:51] I think my dad has a TODO list for me when I'm back for a while... But I should have quite a lot of time that I need to fill with something halfway productive
[02:36:53] Were you ever more than 1/2 productive? ;)
[02:37:39] If I can get my hour building done this week... I'm half wondering if I can find a way to spend a few days in SF before going back
[02:37:52] Half everything
[02:38:49] I hear the 3rd floor is mostly a ghost town these days. Lots of people working from home
[02:39:21] But you could probably talk folks into showing up
[02:39:42] * bd808 has to get on a plane again now
[02:41:58] Have to have a look what there is plane rental wise around SF and take people flying
[04:31:52] Continuous-Integration-Infrastructure: Write and implement tests for Wikimedia's Apache configuration (redirects.conf, etc.) - https://phabricator.wikimedia.org/T45266#1466254 (MZMcBride) I don't seem to have permission to view . Weird.
[06:56:28] RECOVERY - Free space - all mounts on deployment-videoscaler01 is OK All targets OK
[07:33:35] Beta-Cluster, Pywikibot-OAuth: Investigate process for setting up an OAuth client on the Beta cluster - https://phabricator.wikimedia.org/T104764#1466370 (VcamX) Thanks for your reply, @hashar! I've registered on deployment but don't have rights to propose. It told me that only users of group Autoconfir...
[08:40:55] Beta-Cluster, Pywikibot-OAuth: Investigate process for setting up an OAuth client on the Beta cluster - https://phabricator.wikimedia.org/T104764#1466391 (hashar) I have marked the global account `VcamX` as a confirmed user: > 08:40 UTC, 21 July 2015 Hashar (Talk | contribs | block) changed group member...
[08:47:54] (PS7) Paladox: Update tests in vector extension [integration/config] - https://gerrit.wikimedia.org/r/225029
[08:49:14] (PS6) Paladox: Update CheckUser tests [integration/config] - https://gerrit.wikimedia.org/r/225182
[08:50:33] (PS7) Paladox: Update CheckUser tests [integration/config] - https://gerrit.wikimedia.org/r/225182
[08:51:04] Beta-Cluster, Release-Engineering: Enable image rotation on beta for testing purposes - https://phabricator.wikimedia.org/T105877#1466397 (hashar) Bah I should have read the code :D CommonSettings.php has: ``` lang=php // T35186: turn off incomplete feature action=imagerotate $wgAPIModules['imagerotate']...
[08:51:22] Beta-Cluster, Release-Engineering: Enable image rotation on beta for testing purposes - https://phabricator.wikimedia.org/T105877#1466399 (hashar) a:hashar>None
[08:51:22] (PS6) Paladox: Update farmer tests [integration/config] - https://gerrit.wikimedia.org/r/225042
[08:52:14] (PS8) Paladox: Update SyntaxHighlight_GeSHi tests [integration/config] - https://gerrit.wikimedia.org/r/225035
[08:53:33] (PS15) Paladox: Update tests for Vector skin [integration/config] - https://gerrit.wikimedia.org/r/224824
[08:55:02] (CR) Hashar: "recheck" [integration/config] - https://gerrit.wikimedia.org/r/226034 (owner: Paladox)
[08:59:15] PROBLEM - Puppet failure on deployment-sca02 is CRITICAL 22.22% of data above the critical threshold [0.0]
[09:09:12] RECOVERY - Puppet failure on deployment-sca02 is OK Less than 1.00% above the threshold [0.0]
[09:17:41] (CR) Hashar: [C: 2] "My bad sorry. mwext-TimedMediaHandler-testextension-zend is still around at least :)" [integration/config] - https://gerrit.wikimedia.org/r/226034 (owner: Paladox)
[09:18:57] (Merged) jenkins-bot: Fixed TimedMediaHandler test [integration/config] - https://gerrit.wikimedia.org/r/226034 (owner: Paladox)
[09:21:22] (CR) Paladox: "Ok thanks." [integration/config] - https://gerrit.wikimedia.org/r/226034 (owner: Paladox)
[09:22:25] PROBLEM - Puppet failure on deployment-mx is CRITICAL 100.00% of data above the critical threshold [0.0]
[09:25:24] (PS6) Paladox: Update ConfirmAccount tests [integration/config] - https://gerrit.wikimedia.org/r/225311
[09:25:45] (PS8) Paladox: Update CheckUser tests [integration/config] - https://gerrit.wikimedia.org/r/225182
[09:36:09] Continuous-Integration-Infrastructure, Multimedia, operations, Patch-For-Review: Investigate impact of switching from ffmpeg to libav (ffmpeg is not in Jessie) - https://phabricator.wikimedia.org/T103335#1466477 (MoritzMuehlenhoff) >>! In T103335#1465013, @brion wrote: > As long as whatever we switc...
[09:43:44] Yippee, build fixed!
[09:43:44] Project browsertests-Echo-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #553: FIXED in 6 min 43 sec: https://integration.wikimedia.org/ci/job/browsertests-Echo-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce/553/
[10:06:04] Beta-Cluster, Pywikibot-OAuth: Investigate process for setting up an OAuth client on the Beta cluster - https://phabricator.wikimedia.org/T104764#1466506 (VcamX) @hashar I've proposed a [consumer](http://deployment.wikimedia.beta.wmflabs.org/w/index.php?title=Special:OAuthListConsumers/view/e0ddd3b776365...
[10:35:01] Beta-Cluster, Pywikibot-OAuth: Investigate process for setting up an OAuth client on the Beta cluster - https://phabricator.wikimedia.org/T104764#1466549 (hashar) Excellent! Thanks for the direct links that saved me a lot of time since I am not at all familiar with OAuth interface. Seems the v1.1 with t...
[10:36:01] Beta-Cluster, Pywikibot-OAuth: Investigate process for setting up an OAuth client on the Beta cluster - https://phabricator.wikimedia.org/T104764#1466553 (hashar) Hey @csteipp , for info the pywikibot framework is experimenting with OAuth on the beta cluster. I guess the framework will eventually get rid...
[10:37:16] PROBLEM - Puppet failure on deployment-jobrunner01 is CRITICAL 22.22% of data above the critical threshold [0.0]
[10:39:05] PROBLEM - Puppet failure on deployment-sentry2 is CRITICAL 55.56% of data above the critical threshold [0.0]
[10:42:03] PROBLEM - Puppet failure on deployment-mediawiki01 is CRITICAL 33.33% of data above the critical threshold [0.0]
[10:49:10] PROBLEM - Puppet failure on deployment-bastion is CRITICAL 66.67% of data above the critical threshold [0.0]
[10:50:10] PROBLEM - Puppet failure on deployment-logstash2 is CRITICAL 55.56% of data above the critical threshold [0.0]
[10:51:52] PROBLEM - Puppet failure on mira is CRITICAL 60.00% of data above the critical threshold [0.0]
[10:53:36] PROBLEM - Puppet failure on deployment-mediawiki02 is CRITICAL 30.00% of data above the critical threshold [0.0]
[10:54:44] PROBLEM - Puppet failure on deployment-mediawiki03 is CRITICAL 60.00% of data above the critical threshold [0.0]
[11:05:19] (CR) jenkins-bot: [V: -1] Update Maintenance extension tests [integration/config] - https://gerrit.wikimedia.org/r/225222 (owner: Paladox)
[11:09:02] (CR) jenkins-bot: [V: -1] Update interwiki tests [integration/config] - https://gerrit.wikimedia.org/r/225313 (owner: Paladox)
[11:09:19] (PS1) Mjbmr: Include Wikibase [tools/release] - https://gerrit.wikimedia.org/r/226059
[11:09:22] (CR) jenkins-bot: [V: -1] Fix MwEmbedSupport dependance of TimedMediaHandler [integration/config] - https://gerrit.wikimedia.org/r/226027 (owner: Paladox)
[11:16:54] (PS4) Paladox: Fix MwEmbedSupport dependance of TimedMediaHandler [integration/config] - https://gerrit.wikimedia.org/r/226027
[11:17:22] (PS5) Paladox: Update TwitterLogin tests [integration/config] - https://gerrit.wikimedia.org/r/225712
[11:18:09] (PS5) Paladox: Add check for json in TwitterLogin [integration/config] - https://gerrit.wikimedia.org/r/225711
[11:20:28] (PS10) Paladox: Update interwiki tests [integration/config] - https://gerrit.wikimedia.org/r/225313
[11:23:54] (CR) jenkins-bot: [V: -1] Fix MwEmbedSupport dependance of TimedMediaHandler [integration/config] - https://gerrit.wikimedia.org/r/226027 (owner: Paladox)
[11:24:37] (CR) jenkins-bot: [V: -1] Update interwiki tests [integration/config] - https://gerrit.wikimedia.org/r/225313 (owner: Paladox)
[11:29:01] (PS6) Paladox: Update Maintenance extension tests [integration/config] - https://gerrit.wikimedia.org/r/225222
[11:29:11] (PS7) Paladox: Update Maintenance extension tests [integration/config] - https://gerrit.wikimedia.org/r/225222
[11:33:01] (PS11) Paladox: Update interwiki tests [integration/config] - https://gerrit.wikimedia.org/r/225313
[11:33:35] (PS5) Paladox: Fix MwEmbedSupport dependance of TimedMediaHandler [integration/config] - https://gerrit.wikimedia.org/r/226027
[11:47:58] (PS4) Hashar: Migrate mediawiki-core-code-coverage job to labs [integration/config] - https://gerrit.wikimedia.org/r/225063 (https://phabricator.wikimedia.org/T93559)
[11:50:58] (PS5) Hashar: Migrate mediawiki-core-code-coverage job to labs [integration/config] - https://gerrit.wikimedia.org/r/225063 (https://phabricator.wikimedia.org/T93559)
[11:51:55] (CR) Hashar: [C: 2] "Job updated https://integration.wikimedia.org/ci/job/mediawiki-core-code-coverage/" [integration/config] - https://gerrit.wikimedia.org/r/225063 (https://phabricator.wikimedia.org/T93559) (owner: Hashar)
[11:53:07] Continuous-Integration-Infrastructure, Patch-For-Review: Migrate mediawiki-core-code-coverage job to labs - https://phabricator.wikimedia.org/T93559#1139948 (hashar) I have refreshed the job it now runs on labs albeit the run time went from 1 hour to 2 hours, it is not really a problem for a coverage repo...
[11:53:53] (Merged) jenkins-bot: Migrate mediawiki-core-code-coverage job to labs [integration/config] - https://gerrit.wikimedia.org/r/225063 (https://phabricator.wikimedia.org/T93559) (owner: Hashar)
[11:55:36] (PS9) Paladox: Update SyntaxHighlight_GeSHi tests [integration/config] - https://gerrit.wikimedia.org/r/225035
[11:58:58] (PS10) Paladox: Update SyntaxHighlight_GeSHi tests [integration/config] - https://gerrit.wikimedia.org/r/225035
[12:00:10] (PS11) Paladox: Update SyntaxHighlight_GeSHi tests [integration/config] - https://gerrit.wikimedia.org/r/225035
[12:07:30] PROBLEM - Free space - all mounts on deployment-videoscaler01 is CRITICAL deployment-prep.deployment-videoscaler01.diskspace._var.byte_percentfree (<30.00%)
[12:54:29] Project browsertests-GettingStarted-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #540: FAILURE in 27 sec: https://integration.wikimedia.org/ci/job/browsertests-GettingStarted-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/540/
[12:59:43] Project browsertests-MobileFrontend-en.m.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #723: FAILURE in 27 min: https://integration.wikimedia.org/ci/job/browsertests-MobileFrontend-en.m.wikipedia.beta.wmflabs.org-linux-chrome-sauce/723/
[13:37:34] !log upgraded Zuul on gallium from zuul_2.0.0-304-g685ca22-wmf1precise1 to zuul_2.0.0-306-g5984adc-wmf1precise1 . Uses a new version of GitPython
[13:37:37] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master
[14:01:40] need a wmf branch +2 to make the build for the upcoming swat: https://gerrit.wikimedia.org/r/#/c/226021/3
[14:16:48] manybubbles, marktraceur: I need a wmf branch +2 to make the build for the upcoming swat: https://gerrit.wikimedia.org/r/#/c/226021/3
[14:21:58] Krenair, ostriches: ^^
[14:32:34] Continuous-Integration-Infrastructure, Multimedia, operations, Patch-For-Review: Investigate impact of switching from ffmpeg to libav (ffmpeg is not in Jessie) - https://phabricator.wikimedia.org/T103335#1466954 (brion)
[14:32:51] Continuous-Integration-Infrastructure, Multimedia, operations, Patch-For-Review: Investigate impact of switching from ffmpeg to libav (ffmpeg is not in Jessie) - https://phabricator.wikimedia.org/T103335#1387291 (brion)
[14:34:27] Project browsertests-Echo-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #571: FAILURE in 8 min 25 sec: https://integration.wikimedia.org/ci/job/browsertests-Echo-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/571/
[14:35:46] (PS1) Hashar: Bump upstream 5984adc..3ebedde [integration/zuul] (debian/precise-wikimedia) - https://gerrit.wikimedia.org/r/226081
[14:39:50] (PS2) Hashar: Bump upstream 5984adc..3ebedde [integration/zuul] (debian/precise-wikimedia) - https://gerrit.wikimedia.org/r/226081
[14:42:36] Beta-Cluster, Pywikibot-OAuth: Investigate process for setting up an OAuth client on the Beta cluster - https://phabricator.wikimedia.org/T104764#1466992 (csteipp) Yes, beta, like production, is setup with a central OAuth server so clients only need one authorization to use OAuth on all wikis in the clus...
[15:02:45] (CR) Hashar: [C: 2] Bump upstream 5984adc..3ebedde [integration/zuul] (debian/precise-wikimedia) - https://gerrit.wikimedia.org/r/226081 (owner: Hashar)
[15:02:50] (CR) Hashar: [V: 2] Bump upstream 5984adc..3ebedde [integration/zuul] (debian/precise-wikimedia) - https://gerrit.wikimedia.org/r/226081 (owner: Hashar)
[15:04:00] !log upgraded Zuul on gallium from zuul_2.0.0-306-g5984adc-wmf1precise1_amd64.deb to zuul_2.0.0-327-g3ebedde-wmf1precise1_amd64.deb . now uses python-daemon 2.0.5
[15:04:03] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master
[15:08:35] jzerebecki: do you know that the wmf branch +2 will merge a submodule update to core now!?
[15:10:37] manybubbles: yes, currently talking with thcipriani about this in #-operations
[15:12:01] Beta-Cluster, MediaWiki-extensions-GettingStarted, operations: GettingStarted on Beta Cluster periodically loses its Redis index - https://phabricator.wikimedia.org/T100515#1467024 (fgiunchedi) indeed it looks like both beta redis are using aof persistence now, does still show up @mattflaschen ?
[15:23:28] PROBLEM - Puppet failure on deployment-cache-bits01 is CRITICAL 100.00% of data above the critical threshold [0.0]
[15:40:23] (PS6) Hashar: Replace python shebang with python2.7 [integration/zuul] (patch-queue/debian/precise-wikimedia) - https://gerrit.wikimedia.org/r/195540
[15:40:25] (PS6) Hashar: Merger: ensure_cloned() now looks for '.git' [integration/zuul] (patch-queue/debian/precise-wikimedia) - https://gerrit.wikimedia.org/r/195281
[15:40:27] (PS6) Hashar: wmf: soften requirements [integration/zuul] (patch-queue/debian/precise-wikimedia) - https://gerrit.wikimedia.org/r/195280
[15:40:29] (PS6) Hashar: Ensure the repository configuration lock is released [integration/zuul] (patch-queue/debian/precise-wikimedia) - https://gerrit.wikimedia.org/r/195283
[15:40:31] (PS6) Hashar: Update merge status after merge:merge is submitted [integration/zuul] (patch-queue/debian/precise-wikimedia) - https://gerrit.wikimedia.org/r/195282
[15:40:33] (PS3) Hashar: Cloner: Implement cache-no-hardlinks argument [integration/zuul] (patch-queue/debian/precise-wikimedia) - https://gerrit.wikimedia.org/r/207438
[15:40:45] (Abandoned) Hashar: Bump GitPython from 0.3.2.RC1 to 0.3.2.1 [integration/zuul] (patch-queue/debian/precise-wikimedia) - https://gerrit.wikimedia.org/r/225894 (owner: Hashar)
[15:46:25] arghhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh
[15:46:27] stupid gerrit
[15:48:24] :)
[15:52:45] Beta-Cluster, Labs, operations, Monitoring: Setup (simple) catchpoint monitoring and metrics for enwiki betacluster just like production - https://phabricator.wikimedia.org/T97865#1467105 (greg)
[16:13:19] Release-Engineering, MediaWiki-Maintenance-scripts, MediaWiki-Redirects, Patch-For-Review: namespaceDupes not handling deleted namespace redirects as desired - https://phabricator.wikimedia.org/T91401#1467244 (He7d3r) @FcoLeonSaudanha: I don't know. Only people with access to the database could che...
[16:24:36] Beta-Cluster, ContentTranslation-Deployments, MediaWiki-extensions-ContentTranslation: Setup new wikis in Beta Cluster for Content Translation - https://phabricator.wikimedia.org/T90683#1467327 (KartikMistry)
[16:24:50] Release-Engineering, MediaWiki-Maintenance-scripts, MediaWiki-Redirects, Patch-For-Review: namespaceDupes not handling deleted namespace redirects as desired - https://phabricator.wikimedia.org/T91401#1467329 (FcoLeonSaudanha) @He7d3r hunm, this case reminds me of the tragic change of software in D...
[16:27:48] Release-Engineering, operations, Database: Re-compress External Storage in production using trackBlobs.php and recompressTracked.php - https://phabricator.wikimedia.org/T106387#1467370 (Jdforrester-WMF) NEW
[16:28:07] Release-Engineering, operations, Database: Audit all existing code to ensure that any extension currently or previously adding blobs to ES has been registering a reference in the text table - https://phabricator.wikimedia.org/T106388#1467385 (Jdforrester-WMF) NEW
[16:28:16] Release-Engineering, operations, Database: Audit all existing code to ensure that any extension currently or previously adding blobs to ES has been registering a reference in the text table - https://phabricator.wikimedia.org/T106388#1467385 (Jdforrester-WMF)
[16:28:19] Release-Engineering, operations, Database: Re-compress External Storage in production using trackBlobs.php and recompressTracked.php - https://phabricator.wikimedia.org/T106387#1467392 (Jdforrester-WMF)
[16:28:28] Release-Engineering, operations, Database: Audit all existing code to ensure that any extension currently or previously adding blobs to ES has been registering a reference in the text table (and fix up if wrong) - https://phabricator.wikimedia.org/T106388#1467385 (Jdforrester-WMF)
[16:28:36] Release-Engineering, operations, Database: Audit all existing code to ensure that any extension currently or previously adding blobs to ES has been registering a reference in the text table (and fix up if wrong) - https://phabricator.wikimedia.org/T106388#1467385 (Jdforrester-WMF)
[16:32:02] Release-Engineering, operations, Database: Audit all existing code to ensure that any extension currently or previously adding blobs to ES has been registering a reference in the text table (and fix up if wrong) - https://phabricator.wikimedia.org/T106388#1467446 (Jdforrester-WMF)
[16:32:29] Release-Engineering, operations, Database: Re-compress External Storage in production using trackBlobs.php and recompressTracked.php - https://phabricator.wikimedia.org/T106387#1467452 (Jdforrester-WMF)
[16:34:24] is there a grafana instance for beta cluster? i'm not finding it
[16:38:52] Release-Engineering, MediaWiki-Maintenance-scripts, MediaWiki-Redirects, Patch-For-Review: namespaceDupes not handling deleted namespace redirects as desired - https://phabricator.wikimedia.org/T91401#1467495 (demon) They're there, I just missed updating the archive table before. Looking at that now.
[16:39:49] ebernhardson: there's definitely no beta box with role::grafana assigned, FWIW
[16:41:52] thcipriani: ok, thanks
[16:44:07] RECOVERY - Puppet failure on deployment-sentry2 is OK Less than 1.00% above the threshold [0.0]
[16:45:35] Release-Engineering, MediaWiki-Maintenance-scripts, MediaWiki-Redirects, Patch-For-Review: namespaceDupes not handling deleted namespace redirects as desired - https://phabricator.wikimedia.org/T91401#1467520 (demon) Archive tables fixed: ``` mysql:wikiadmin@db1024 [ptwiki]> select count(*) from ar...
[16:47:01] Release-Engineering, MediaWiki-Maintenance-scripts, MediaWiki-Redirects, Patch-For-Review: namespaceDupes not handling deleted namespace redirects as desired - https://phabricator.wikimedia.org/T91401#1467523 (FcoLeonSaudanha) @demon Okay. Already grateful :-)!
[16:47:19] RECOVERY - Puppet failure on deployment-jobrunner01 is OK Less than 1.00% above the threshold [0.0]
[16:50:48] greg-g: hey. can I convince you to add some of https://www.mediawiki.org/wiki/Requests_for_comment/Streamlining_Composer_usage for q2 goals?
[16:52:05] RECOVERY - Puppet failure on deployment-mediawiki01 is OK Less than 1.00% above the threshold [0.0]
[16:54:11] RECOVERY - Puppet failure on deployment-bastion is OK Less than 1.00% above the threshold [0.0]
[16:55:09] RECOVERY - Puppet failure on deployment-logstash2 is OK Less than 1.00% above the threshold [0.0]
[16:55:47] Browser-Tests: Inconsistent reports on sauce labs and integration.wikimedia.org - https://phabricator.wikimedia.org/T106390#1467589 (Jdlrobson) NEW
[16:59:42] RECOVERY - Puppet failure on deployment-mediawiki03 is OK Less than 1.00% above the threshold [0.0]
[17:01:50] RECOVERY - Puppet failure on mira is OK Less than 1.00% above the threshold [0.0]
[17:03:35] RECOVERY - Puppet failure on deployment-mediawiki02 is OK Less than 1.00% above the threshold [0.0]
[17:07:36] hi all! Is there any guide on how one deploys a service on deployment-prep?
[17:37:31] PROBLEM - Puppet failure on deployment-wdqs is CRITICAL 100.00% of data above the critical threshold [0.0]
[17:47:11] thcipriani: I added SMalyshev to 'staging' so he can test wdqs deploys too.
[17:47:34] ostriches: cool, sounds good.
[17:48:30] PROBLEM - Host deployment-wdqs is DOWN: CRITICAL - Host Unreachable (10.68.18.89)
[17:50:23] (PS2) Mjbmr: make-wmf-branch: include Wikibase and Wikidata extensions [tools/release] - https://gerrit.wikimedia.org/r/226059
[18:03:31] Continuous-Integration-Infrastructure, Zuul: zuul_2.0.0-327-g3ebedde-wmf1precise1 fails importing daemon pidfile/pidlockfile - https://phabricator.wikimedia.org/T106399#1467934 (hashar) NEW a:hashar
[18:03:55] (CR) JanZerebecki: [C: -1] "It is only branched every two weeks. We might be able to agree to change that." [tools/release] - https://gerrit.wikimedia.org/r/226059 (owner: Mjbmr)
[18:07:47] PROBLEM - Free space - all mounts on deployment-fluorine is CRITICAL deployment-prep.deployment-fluorine.diskspace.root.byte_percentfree (<100.00%)
[18:19:19] Browser-Tests: mediawiki selenium gem creates a new user for every page created - https://phabricator.wikimedia.org/T106343#1468013 (dduvall) Open>Invalid a:dduvall The user factory is opt-in; it's only enabled for environments where `user_factory: true`. In this particular case, I believe the issu...
[18:31:46] Browser-Tests: mediawiki selenium gem creates a new user for every page created - https://phabricator.wikimedia.org/T106343#1468226 (Jdlrobson) (also in general why does it need to do this for every page creation?) is a new user used for each test or for each run of cucumber)
[18:44:00] RECOVERY - Puppet failure on integration-zuul-server is OK Less than 1.00% above the threshold [0.0]
[18:55:42] thcipriani: How to get a new repo provisioned on staging-tin for deploy with trebuchet?
[18:57:03] ostriches: if you add it to https://wikitech.wikimedia.org/wiki/Hiera:Staging role::deployment::repo_config the deployment_server_init thing should add it in /srv/deployment
[18:57:16] SMalyshev: ^^ :)
[18:57:27] Ahhh, I see it
[18:57:39] I did. But so far it did not
[18:57:40] so add it there and do a puppet run, then you should be able to deploy it out to other hosts
[18:58:15] SMalyshev: lemme check something...
[19:00:38] looks like I had puppet runs disabled on staging-test-tin while I was working on something
[19:00:56] running now, that _should_ hopefully add it
[19:00:57] Herp derp
[19:02:20] SMalyshev: wdqs is on staging-test-tin now
[19:02:26] thcipriani: yeah, it's there now, thanks
[19:02:42] yup, sorry for the weirdness :)
[19:04:31] Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find class role::wdqs for staging-wdqs.staging.eqiad.wmflabs on node staging-wdqs.staging.eqiad.wmflabs
[19:04:45] how often is the puppetmaster updated from production branch?
[19:07:08] july 15? Ouch wtf.
[19:08:03] Unstashed changes
[19:08:07] thcipriani, they look like urs.
[19:08:41] looking...
[19:08:57] Browser-Tests: mediawiki selenium gem creates a new user for every page created - https://phabricator.wikimedia.org/T106343#1468292 (dduvall) Invalid>Open >>! In T106343#1468139, @Jdlrobson wrote: > I tried creating a new environment but it didn't work. The only way I can get barry to not use the user...
[19:08:59] stashed, pulled, applied again
[19:09:08] You should be fine, but we're outta sync with upstream
[19:09:56] Continuous-Integration-Infrastructure, Zuul: zuul_2.0.0-327-g3ebedde-wmf1precise1 fails importing daemon pidfile/pidlockfile - https://phabricator.wikimedia.org/T106399#1468300 (hashar) Loading zuul-merger with verbose: `/usr/share/python/zuul/bin/python -vv /usr/bin/zuul-merger` : ``` import daemon # pr...
[19:10:05] ostriches: thanks, yeah, those were the service restart permissions, committing now.
[19:11:19] twentyafterfour: about?
[19:11:24] same error-ish on iridium
[19:11:34] chasemp: really?
[19:11:34] you want to poke before I revert?
[19:11:41] yes please
[19:11:48] I don't get that, I tested on labs it was fine
[19:12:42] isn't this
[19:12:43] require => "${rootdir}/phabricator/src/extensions"
[19:12:44] meant to be
[19:12:51] require => File["${rootdir}/phabricator/src/extensions"],
[19:13:14] and
[19:13:14] path => "${$phabdir}/phabricator/src/extensions",
[19:13:17] is weird?
[19:13:27] require => File["${$phabdir}/phabricator/src/extensions"],
[19:13:29] too
[19:13:30] chasemp: hmm
[19:14:27] yeah the require is supposed to be File[]
[19:14:31] how did that work on labs?
[19:14:40] I don't get it
[19:14:45] uh yup no idea
[19:15:09] the var interpolation pretty sure won't do the right thing w/ double $
[19:15:26] double $?
[19:16:42] ${$phabdir}
[19:16:54] ${phabdir} I think is right
[19:19:33] methinks labs is tricking you somehow
[19:26:32] yeah maybe it isn't actually using the puppetmaster I thought it's using?
[19:33:55] chasemp: I'll fix it after I finish deploying 1.26wmf15
[19:34:04] sure
[19:34:10] I disabled puppet alert
[19:34:26] (PS1) Hashar: lockfile is no more used [integration/zuul] (debian/precise-wikimedia) - https://gerrit.wikimedia.org/r/226129 (https://phabricator.wikimedia.org/T106399)
[19:38:05] I really wish there was a saner way to test puppet changes.
[19:39:00] chasemp: any reason we couldn't just have a freakin test puppetmaster running a branch of operations/puppet ... and have it auto update to the tip of the test branch? and let the branch bypass gerrit review?
[19:39:16] uh where in labs?
[19:39:18] yeah
[19:39:21] I think audo update is standard now
[19:39:24] auto even
[19:39:30] so not to my knowledge
[19:39:41] phab-01 is maybe a bad beginning
[19:39:43] idk
[19:39:49] I didn't set it up so I'm not sure
[19:40:12] I'm for it? :)
[19:40:18] the important part was the bit about a 'testing' branch that bypasses review...
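Editorial note on the Puppet exchange at 19:12 to 19:16 above: the `require` metaparameter takes a resource reference such as `File[...]`, not a bare path string, and `${phabdir}` is the conventional way to interpolate a variable. The fragment below is a minimal sketch of that corrected shape, not the actual Phabricator module. Only the `require` value and the extensions path come from the conversation; the `$phabdir` value, the resource layout and the `Example.php` file name are assumptions made for illustration.

```puppet
# Hypothetical value for illustration; the real module defines this elsewhere.
$phabdir = '/srv/phab'

# The directory has to be a managed File resource so that the
# reference in the second resource can be resolved.
file { "${phabdir}/phabricator/src/extensions":
    ensure => directory,
}

# Hypothetical dependent resource; the file name is illustrative only.
file { "${phabdir}/phabricator/src/extensions/Example.php":
    ensure  => present,
    # A resource reference (File[...]) rather than a bare string,
    # using the plain ${phabdir} interpolation form.
    require => File["${phabdir}/phabricator/src/extensions"],
}
```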
[19:40:41] not just for phab but for all puppet testing
[19:41:05] it will turn into a mess in a hurry (the branch)
[19:41:30] bd808: it could allow force updates and just let it be a mess
[19:41:57] periodically replace it with whatever is at the tip of production
[19:42:16] The way I invented for deployment-prep has worked pretty well there and in a few other small projects I run -- https://wikitech.wikimedia.org/wiki/Nova_Resource:Deployment-prep/How_code_is_updated#Cherry-picking_a_patch_from_gerrit
[19:42:39] bd808: cherry picking?
[19:42:45] that is such a pain
[19:43:24] that is specifically the thing I was trying to avoid. it requires logging into root on the puppetmaster rather than a command I can run locally on my dev environment
[19:44:13] (CR) Hashar: [C: 2 V: 2] "That fixed the issue :)" [integration/zuul] (debian/precise-wikimedia) - https://gerrit.wikimedia.org/r/226129 (https://phabricator.wikimedia.org/T106399) (owner: Hashar)
[19:44:19] at deviantART we had our developer virtual machines running virtually the same puppet as production so it was easy to test locally by just running puppet apply
[19:44:26] having a force pushed, unreviewed branch per project in gerrit seems even more gross
[19:44:37] bd808 not per project
[19:44:39] just puppet
[19:45:05] Continuous-Integration-Infrastructure, Patch-For-Review, Zuul: zuul_2.0.0-327-g3ebedde-wmf1precise1 fails importing daemon pidfile/pidlockfile - https://phabricator.wikimedia.org/T106399#1468499 (hashar) Open>Resolved Issue fixed on labs by dropping lockfile and shipping python-daemon 2.x in the...
[19:45:07] how would you ever know what you were running?
[19:45:58] you'd be running whatever you just pushed?
[19:46:17] I guess it doesn't scale to many users
[19:46:23] like 1
[19:46:26] but having a puppetmaster sucks
[19:46:33] all the way around
[19:46:49] someday I'll find more time to work on https://gerrit.wikimedia.org/r/#/c/212294/
[19:47:01] I talked to Faidon about it a bit in MX
[19:47:23] he told me (and I promptly forgot) about the bootstrapping I was missing
[19:47:43] bd808: that would be awesome. That is a lot closer to the way we did it at deviantART and it worked wonderfully
[19:47:46] things that are baked in to prod and labs base images that aren't setup with puppet
[19:48:18] bd808: vagrant supports custom base images too, right?
[19:48:22] yeah
[19:48:22] !log Upgrading Zuul to zuul_2.0.0-327-g3ebedde-wmf2precise1 Previous version failed because python-daemon was too old, now shipped in the venv https://phabricator.wikimedia.org/T106399
[19:48:25] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master
[20:10:46] !log Zuul restarted with 2.0.0-327-g3ebedde-wmf2precise1
[20:10:49] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master
[20:11:17] Krinkle: I got rid of lanthanum you probably noticed
[20:11:31] will try to get rid of some more jobs still running on gallium
[20:16:03] hasharConfcall: cool
[20:17:42] PROBLEM - Puppet failure on deployment-salt is CRITICAL 60.00% of data above the critical threshold [0.0]
[20:21:25] bah I broke zuul
[20:23:32] !log Zuul no more reports back to Gerrit due to an error with the Gerrit label
[20:23:34] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master
[20:25:35] ostriches, thcipriani: is staging-palladium automatically updated from puppet git or something manual needs to be done?
[20:26:59] SMalyshev: I think it has to be fetched/rebased in /var/lib/git/operations/puppet
[20:27:21] something may have changed about that recently, but that's the way it has been in recent past
[20:27:27] (PS1) Hashar: zuul: fix Gerrit labels [integration/config] - https://gerrit.wikimedia.org/r/226220
[20:27:36] thcipriani: ok, thanks
[20:29:25] (CR) Hashar: [C: 2] zuul: fix Gerrit labels [integration/config] - https://gerrit.wikimedia.org/r/226220 (owner: Hashar)
[20:36:43] hashar: BTW, VE doesn't get auto-submodule-updated, so don't worry about them getting auto-deployed.
[20:37:47] greg-g: lesson of the day: I should really look at the deployment calendar before crashing Zuul :(((
[20:39:19] hashar: probably yes :)
[20:39:24] hashar: do it in your morning :P
[20:42:44] RECOVERY - Puppet failure on deployment-salt is OK Less than 1.00% above the threshold [0.0]
[20:47:22] thcipriani: so I try to deploy trebuchet package and it says: Notice: /Stage[main]/Wdqs::Service/Package[wdqs]/ensure: ensure changed 'purged' to 'present'
[20:47:22] but nothing gets actually deployed
[20:47:59] SMalyshev: on which host?
[20:48:13] thcipriani: staging-wdqs
[20:48:25] lemme take a quick look
[20:49:25] 0/0 minions completed checkout <- that's what git deploy sync says
[20:49:35] I suspect 0/0 is not what I'm looking for?
[20:51:22] Yeah, I think this is a known, git deploy start has to be run before the puppet provider works
[20:51:40] * greg-g grumbles about that annoying bug
[20:51:42] I just run a git deploy from staging-test-tin and it seemed to work
[20:52:40] thcipriani: I run git deploy start, it says "Deployment started"
[20:52:59] should deployment::target be configured somewhere?
[20:53:23] because I'm not sure I understand how it knows where to deploy stuff...
[20:53:26] SMalyshev: huh, from which box? staging-test-tin? That's the Trebuchet master for staging
[20:53:53] thcipriani: from staging-tin. Should I use staging-test-tin?
[20:54:08] deployment_target is a grain that gets set on each target box, you can check if a box has the grain by running: sudo salt-call grains.get deployment_target [20:54:12] puppet seems to go to staging-tin [20:54:38] SMalyshev: yeah, use staging-test-tin, puppet should use staging-test-tin as well (for the trebuchet provider) [20:54:39] root@staging-wdqs:~# salt-call grains.get deployment_target [20:54:39] local: [20:54:39] - wdqs/wdqs [20:55:16] thcipriani: I didn't tell puppet to use staging-tin, so I assume it's configured somewhere [20:55:16] yup, I call git deploy start and git deploy sync from staging-test-tin for wdqs and it worked for 1/1 minion [20:55:32] yeah, it's configured in hiera to use staging-test-tin [20:55:45] so where I configure it? [20:56:12] well, it should stay staging-test-tin for trebuchet use [20:56:21] staging-tin is probably out of date [20:57:00] but if you do your deploy work on staging-test-tin for deploying staging that should work fine [20:57:17] I need to delete the staging-tin instance [20:58:01] ok, looks like it deployed it now [20:58:05] thcipriani: thanks [20:59:14] greg-g: yeah I usually do the CI deploys in my mornings :D But I am on a tight schedule! [20:59:16] lame [21:01:55] hashar: live and learn [21:02:03] yeah [21:06:55] Beta-Cluster, Release-Engineering, Jenkins, Monitoring: Create metrics of Beta Cluster stability using a Jenkins job - https://phabricator.wikimedia.org/T106421#1468867 (greg) NEW [21:07:32] Beta-Cluster, Labs, operations, Monitoring: Setup (simple) catchpoint monitoring and metrics for enwiki betacluster just like production - https://phabricator.wikimedia.org/T97865#1468877 (greg) We talked about this on ops list: https://lists.wikimedia.org/mailman/private/ops/2015-July/049244.html...
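The grain check quoted at 20:54 can be scripted. What follows is a minimal sketch, assuming `salt-call grains.get deployment_target` prints a `local:` header followed by one `- repo` entry per line, as in the paste from staging-wdqs. `parse_grain_list` and `is_deploy_target` are hypothetical helper names, not part of Salt or Trebuchet.

```python
def parse_grain_list(output):
    """Extract list entries from `salt-call grains.get <grain>` text output.

    Assumes the plain-text layout seen in the log: a `local:` header,
    then one `- value` line per entry.
    """
    entries = []
    for line in output.splitlines():
        stripped = line.strip()
        if stripped.startswith("- "):
            entries.append(stripped[2:].strip())
    return entries


def is_deploy_target(output, repo):
    """Return True if `repo` appears in the deployment_target grain output."""
    return repo in parse_grain_list(output)


if __name__ == "__main__":
    # Sample text modelled on the staging-wdqs paste above.
    sample = "local:\n    - wdqs/wdqs\n"
    print(is_deploy_target(sample, "wdqs/wdqs"))
```

A box that lacks the grain would presumably produce no `- ` entries, so `is_deploy_target` would return False for it.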
[21:18:26] PROBLEM - Puppet failure on deployment-parsoidcache02 is CRITICAL 100.00% of data above the critical threshold [0.0] [21:26:15] PROBLEM - Puppet failure on deployment-cache-text03 is CRITICAL 100.00% of data above the critical threshold [0.0] [21:36:19] Continuous-Integration-Infrastructure, Patch-For-Review: Migrate mediawiki-core-code-coverage job to labs - https://phabricator.wikimedia.org/T93559#1468987 (hashar) Job ran and published the doc at https://integration.wikimedia.org/cover/mediawiki-core/master/php/ [21:36:35] umm, is zuul busted? https://gerrit.wikimedia.org/r/#/c/195088/ for example [21:36:43] hashar: ^? [21:37:47] Continuous-Integration-Infrastructure, Patch-For-Review: Migrate all jobs to labs slaves - https://phabricator.wikimedia.org/T86659#1468993 (hashar) [21:37:48] Continuous-Integration-Infrastructure, Patch-For-Review: Migrate mediawiki-core-code-coverage job to labs - https://phabricator.wikimedia.org/T93559#1468992 (hashar) Open>Resolved [21:40:23] (Abandoned) Hashar: Add tests for WikibaseRepository [integration/config] - https://gerrit.wikimedia.org/r/225086 (https://phabricator.wikimedia.org/T75863) (owner: Paladox) [21:42:23] (Abandoned) Hashar: Fix MwEmbedSupport dependance of TimedMediaHandler [integration/config] - https://gerrit.wikimedia.org/r/226027 (owner: Paladox) [21:45:49] bed [21:46:59] hmm, yeah zuul still seems down [21:48:52] !log Zuul not responding [21:48:55] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [21:49:54] (CR) Paladox: "Hi this test also should be done because the updated test I submitted for review removes the old mwembedsupport test.
And so this should b" [integration/config] - https://gerrit.wikimedia.org/r/226027 (owner: Paladox) [21:57:22] ...so [22:00:34] So FWIW it looks like the zuul service is running on gallium (looking at https://www.mediawiki.org/wiki/Continuous_integration/Zuul?redirect=no#Debugging) [22:00:54] okay so [22:00:55] all I see [22:00:56] 2015-07-21 22:00:44,218 INFO zuul.Gerrit: Updating information for 226227,1 [22:01:00] when I +2 something [22:01:07] it is responding to other pipelines [22:01:10] but not gate-and-submit [22:02:53] I see a lot of "ERROR zuul.IndependentPipelineManager: Unable to find change queue for project [whatever]" in the zuul log [22:03:02] those are expected [22:03:30] kk [22:03:42] running zuul enqueue --trigger gerrit --pipeline gate-and-submit --project mediawiki/extensions/Echo --change 226227,1 looks like it is working properly [22:05:00] ideally we should downgrade zuul, but I have no idea how to do that [22:08:11] digging through the debug log... [22:09:40] 2015-07-21 21:21:41,581 DEBUG zuul.Scheduler: Adding trigger event: [22:10:59] 2015-07-21 21:24:08,188 DEBUG zuul.Scheduler: Adding trigger event: [22:11:06] okay, so why didn't that trigger gate-and-submit? [22:12:40] > approval This is only used for comment-added events. It only matches if the event has a matching approval associated with it. Example: code-review: 2 matches a +2 vote on the code review category. Multiple approvals may be listed. [22:19:42] legoktm, thcipriani: Any idea how quickly this'll be fixed? [22:21:10] James_F: no estimate offhand: not very familiar with this portion of the system. [22:21:35] thcipriani: Thanks. :-( I really don't want us to manually-merge things for SWAT, though. [22:21:43] indeed.
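The approval filter quoted at 22:12 can be modelled in a few lines. This is a simplified illustration, not Zuul's code: the event shape (a list of dicts with `type` and `value` keys) and the exact, case-sensitive label comparison are assumptions of the sketch, and `approval_matches` is a hypothetical name. It only shows how a filter of this kind can silently stop matching when the configured label name and the label on the event differ.

```python
def approval_matches(event_approvals, required):
    """Return True if every required label/value pair appears on the event.

    event_approvals: list of {"type": label, "value": vote} dicts
                     (assumed shape, for illustration only).
    required:        mapping such as {"code-review": 2}.
    """
    for label, value in required.items():
        found = any(
            a.get("type") == label and int(a.get("value", 0)) == value
            for a in event_approvals
        )
        if not found:
            return False
    return True


if __name__ == "__main__":
    plus_two = [{"type": "code-review", "value": "2"}]
    # Same label name on both sides: the +2 matches.
    print(approval_matches(plus_two, {"code-review": 2}))
    # Label spelled differently in the filter: the same +2 no longer matches.
    print(approval_matches(plus_two, {"Code-Review": 2}))
```

Under these assumptions a comment-added event with a +2 would still be logged as a trigger event while no pipeline picks it up, which is at least consistent with the symptom described above.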
[22:23:43] Continuous-Integration-Infrastructure, Release-Engineering: automatically build and commit mediawiki/vendor (composer) - https://phabricator.wikimedia.org/T101123#1469124 (JanZerebecki) [22:23:44] Continuous-Integration-Infrastructure, MW-1.26-release, Patch-For-Review: Fetch dependencies using composer instead of cloning mediawiki/vendor for non-wmf branches - https://phabricator.wikimedia.org/T90303#1469123 (JanZerebecki) [22:23:50] Continuous-Integration-Infrastructure, Release-Engineering: automatically build and commit mediawiki/vendor (composer) - https://phabricator.wikimedia.org/T101123#1330244 (JanZerebecki) [22:23:51] Continuous-Integration-Infrastructure, MW-1.26-release, Patch-For-Review: Fetch dependencies using composer instead of cloning mediawiki/vendor for non-wmf branches - https://phabricator.wikimedia.org/T90303#1055749 (JanZerebecki) [22:25:07] Continuous-Integration-Infrastructure: CR +2 events not triggering gate-and-submit pipeline in zuul - https://phabricator.wikimedia.org/T106436#1469134 (Legoktm) NEW a:hashar [22:25:18] Continuous-Integration-Infrastructure, MW-1.26-release, Patch-For-Review: Fetch dependencies using composer instead of cloning mediawiki/vendor for non-wmf branches - https://phabricator.wikimedia.org/T90303#1055749 (JanZerebecki) [22:25:32] James_F: idk. I don't know how to downgrade zuul either [22:26:14] legoktm: Does anyone? [22:26:21] hashar probably [22:26:27] well there is https://www.mediawiki.org/wiki/Continuous_integration/Zuul#upgrading [22:26:30] Ha. [22:28:07] ehm anyone else have an error on that url ?
[22:28:10] File not found: /srv/mediawiki/php-1.26wmf15/../wmf-config/ExtensionMessages-1.26wmf15.php [22:29:11] ah, ops channel i see [22:30:34] PROBLEM - Free space - all mounts on integration-slave-trusty-1015 is CRITICAL integration.integration-slave-trusty-1015.diskspace._mnt.byte_percentfree (<10.00%) [22:33:09] thcipriani: those instructions look out of date, I'm pretty sure we use a deb now [22:33:26] legoktm: yeah, I was just looking at that in https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [22:33:47] "Zuul restarted with 2.0.0-327-g3ebedde-wmf2precise1" implies deb [22:34:46] so dpkg says version 2.0.0-327-g3ebedde-wmf2precise1 is currently installed [22:35:48] and we previously had zuul_2.0.0-304-g685ca22-wmf1precise1 [22:35:59] looks like there are two versions in apt available, the currently installed and 2.0.0-304-g685ca22-wmf1precise1 [22:36:14] use that one :) [22:36:55] I could try apt-get remove and apt-get install =version (not sure if I have those permissions) [22:37:08] stop zuul and zuul-merger, downgrade, revert the integration-config change, and bring it back up? [22:38:43] thcipriani: if you can sudo apt-get update you should? right? [22:40:17] hmm, asks for password for both zuul and me for apt-get update, I feel like sudoers permissions are a bit more granular. There is no /etc/sudoers.d/zuul [22:40:43] and I can't see the suoders files anyway [22:41:06] time to ask for ops help [22:41:12] there's contint-admins which has limited access [22:42:52] can someone break down the version needed and command they think should work [22:42:53] chase is avail to help [22:44:23] sudo apt-get install zuul=2.0.0-304-g685ca22-wmf1precise1 [22:44:26] yes? [22:45:01] yeah, so, I think we need to stop zuul and zuul-merger [22:45:14] run the apt-get install of that version [22:45:30] chasemp: not 2.0.0-306-g5984adc-wmf1precise1 ? 
[22:45:33] that's what i see in dpkg.log [22:45:47] then git reset --hard 9588d0a6844fc9cc68372f4bf3e1eda3cffc8138 in /etc/zuul/wikimedia [22:45:51] my imipression is they want to downgrade from that jgage [22:46:42] thcipriani: why are we doing this? [22:46:44] what's the issue? [22:46:51] ah i see, It's Complicated. 304 -> 306 -> 327 -> 306 -> 327 [22:47:12] there was an upgrade today and now zuul is no longer triggering gate-and-submit jobs [22:47:28] service zuul stop && service zuul-merger stop && sudo apt-get install zuul=2.0.0-304-g685ca22-wmf1precise1 [22:47:31] agreed? [22:47:47] ^ legoktm greg-g any better ideas? [22:47:53] none from me [22:47:56] chasemp: uh, maybe run the service stop commands first and make sure they've stopped before upgrading? [22:47:56] other than calling antoine [22:48:07] except for missing sudo on the service commands ;) [22:48:10] if the stop doesn't succeed no further commands will run [22:48:14] okay [22:48:17] mmm && [22:48:35] unless the init is insane already [22:48:40] which could be [22:48:47] ok off we go [22:50:05] "then git reset --hard 9588d0a6844fc9cc68372f4bf3e1eda3cffc8138 in /etc/zuul/wikimedia" [22:50:06] yes? [22:50:10] yup [22:50:39] ok done [22:50:42] now how do we test? [22:51:08] someone +2 a change [22:51:30] though, the different queues aren't showing up for me at all on https://integration.wikimedia.org/zuul/ [22:52:10] https://integration.wikimedia.org/zuul/ looks better [22:52:12] I'm getting Error: Could not retrieve catalog from remote server: wrong header line format from puppet on staging suddenly. Anybody has any idea what could be the issue? [22:52:15] woot [22:52:25] root@gallium:/etc/zuul/wikimedia# dpkg -s zuul [22:52:25] Package: zuul [22:52:25] Status: install ok installed [22:52:25] chasemp: working! 
[22:52:26] Priority: optional [22:52:29] Section: python [22:52:30] Installed-Size: 14122 [22:52:32] Maintainer: Paul Belanger [22:52:35] Architecture: amd64 [22:52:36] !pastebin ;) [22:52:36] Version: 2.0.0-304-g685ca22-wmf1precise1 [22:52:37] yeah, I see a gate and submit working on mediawiki-core [22:52:47] ok, now we just need to re-+2 a bunch of stuff [22:53:02] SMalyshev: have you gotten that more than once? i've gotten transient errors from the puppetmaster before. [22:53:18] jgage: yes, getting it now constantly from puppet... didn't happen before [22:53:23] hmm [22:53:39] looks like something broke on staging-palladium [22:53:50] !log 22:47 < chasemp> service zuul stop && service zuul-merger stop && sudo apt-get install zuul=2.0.0-304-g685ca22-wmf1precise1 [22:53:53] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [22:54:01] !log 22:50 < chasemp> "then git reset --hard 9588d0a6844fc9cc68372f4bf3e1eda3cffc8138 in /etc/zuul/wikimedia" [22:54:03] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [22:54:09] SMalyshev: http://bitcube.co.uk/content/puppet-errors-explained is always helpful, looks like maybe a template problem [22:54:10] I didn't realize there was separate qa bot [22:54:12] tx [22:54:25] yeppers [22:54:30] thcipriani: I didn't change anything... [22:54:38] how meta is it that I relogged your log of my missing log to another logbot to match the log bots [22:54:48] I just installed a new host and suddenly puppet refuses to work [22:54:57] hmm, which host? [22:55:21] legoktm: hey, do you have a bot I can hack into sending either A) diff between two revisions or B) contents of a section of a wiki to an email? [22:55:36] thcipriani && greg-g you guys cool now? 
[22:55:44] * greg-g wants to send the last day's worth of RelEng/SAL to the team list [22:55:45] we should make a task of this for the eventual upgrade with a history [22:55:48] chasemp: I think so [22:55:51] chasemp: I think so, thanks for your help! [22:55:56] sure thing [22:57:08] thcipriani: staging-wdqs-test [22:57:17] * thcipriani looks [22:57:19] thcipriani: staging-wdqs-test.staging.eqiad.wmflabs [22:57:26] greg-g: no, but it wouldn't be hard to write. Why not use an RSS feed though? [22:57:37] I +2'd all the missed patches [22:58:36] legoktm: I was thinking of sending the summary to our team list, not asking people to add it to their personal rss reader :) [22:58:41] legoktm: any chance you can do some server-side uploads ? [22:58:42] Continuous-Integration-Infrastructure: CR +2 events not triggering gate-and-submit pipeline in zuul - https://phabricator.wikimedia.org/T106436#1469238 (Legoktm) @chasemp has downgraded zuul to 2.0.0-304-g685ca22-wmf1precise1 for now. [22:58:44] legoktm: thanks for re +2'ing [22:59:01] legoktm: https://integration.wikimedia.org/ci/job/wikidata-query-rdf/493/console complains about Unable to locate the Javac Compiler [22:59:13] legoktm: and yeah, not too hard to write, it just takes me longer to bootstrap myself... :) [22:59:14] something is broked [23:00:32] PROBLEM - Free space - all mounts on integration-slave-trusty-1015 is CRITICAL integration.integration-slave-trusty-1015.diskspace._mnt.byte_percentfree (<30.00%) [23:01:13] matanya: I could yeah, file a bug and assign it to me? [23:01:24] thanks legoktm [23:02:21] SMalyshev: I'm going to restart your instance if that's ok [23:02:31] thcipriani: sure go ahead [23:03:16] greg-g: I could write something, sure. File a bug and assign it to me? ;) [23:03:23] since this happens so early in the puppet run, I'm guessing this instance is just a little whacky.
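The downgrade sequence agreed at 22:47 (stop both services, then install the pinned package version, aborting if any step fails) can be written out as a small helper. This is only a sketch: `downgrade_commands` and `run_all` are hypothetical names, and putting `sudo` on every step is a change from the logged command, following the remark that the service commands were missing it. The later `git reset --hard` in /etc/zuul/wikimedia is not covered here.

```python
import subprocess


def downgrade_commands(package, version, services):
    """Build the stop-then-install sequence, with sudo on every step."""
    commands = [["sudo", "service", name, "stop"] for name in services]
    # `apt-get install pkg=version` selects a specific available version.
    commands.append(["sudo", "apt-get", "install", f"{package}={version}"])
    return commands


def run_all(commands, runner=subprocess.call):
    """Run commands in order and stop at the first non-zero exit code.

    This mirrors chaining with `&&` in a shell: a failed step prevents
    the steps after it from running.
    """
    for argv in commands:
        code = runner(argv)
        if code != 0:
            return code
    return 0


if __name__ == "__main__":
    # Print the plan only; nothing is executed here.
    plan = downgrade_commands(
        "zuul", "2.0.0-304-g685ca22-wmf1precise1", ["zuul", "zuul-merger"]
    )
    for argv in plan:
        print(" ".join(argv))
```

Passing a different `runner` makes it possible to dry-run the sequence before touching a real host.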
[23:03:50] legoktm: you're awesome, and not obliged, I can't give you orders ;) [23:04:06] SMalyshev: umm, I have no idea why the job is failing. [23:05:14] legoktm: very strange, it says there's no java compiler... anything changed on that machine that could have messed with the compiler? [23:06:05] Release-Engineering: Send email of last day's SAL entries to releng@ - https://phabricator.wikimedia.org/T106443#1469278 (greg) NEW a:Legoktm [23:06:32] Release-Engineering: Send email of last day's SAL entries to releng@ - https://phabricator.wikimedia.org/T106443#1469287 (greg) [23:09:09] marxarelli: around? [23:09:22] legoktm: https://phabricator.wikimedia.org/T106444 [23:10:48] SMalyshev: as silly as it is, you have to *only* comment "recheck". If you change your CR vote, it ignores it [23:11:05] jdlrobson: sure am [23:11:59] legoktm: oh. ok, I'll remember that, thanks! :) [23:12:11] matanya: hmm, I don't think I can reach labs from production, so I'll have to download it locally and copy it to terbium? [23:12:31] that is right legoktm [23:12:53] marxarelli: hey [23:12:58] you can copy from terbium iirc [23:13:11] so... https://gerrit.wikimedia.org/r/226242 sets up an env without user factory [23:13:32] but it doesn't seem to have any impact as i still get the error "You're not allowed to edit interface messages (protectednamespace-interface) (MediawikiApi::ApiError)" [23:14:11] SMalyshev: did puppet ever run on this box successfully (other than the cert setup)? [23:14:18] thcipriani: yes [23:14:20] jdlrobson: weird. you had to explicitly disable it? [23:15:03] thcipriani: before I enabled the role role::wdqs it had a successful run. But I run that role on another host just fine before... [23:15:06] that's weird, nothing has changed on staging-palladium since this morning when I committed some uncommitted changes. [23:15:07] SMalyshev: I'm not sure why it's still failing. File a bug in #contint-infrastructure ?
[23:15:49] marxarelli: even if i don't have that line same problem [23:15:59] it's not using MEDIAWIKI_USERNAME [23:16:10] jdlrobson: oh, ok. so maybe it's not related to the user factory then [23:16:40] jdlrobson: are you sure there isn't some kind of permission that the mediawiki_user lacks? [23:16:56] Continuous-Integration-Infrastructure: CI run complains about missing javac - https://phabricator.wikimedia.org/T106446#1469321 (Smalyshev) NEW [23:17:04] legoktm: done: https://phabricator.wikimedia.org/T106446 [23:17:17] marxarelli: it does create a new user though [23:17:25] http://gather-browser-tests.wmflabs.org/w/index.php?title=Special:RecentChanges [23:17:25] jdlrobson: i can try to repro it. what's the feature/scenario? [23:17:35] thcipriani: is there any way to see what part it has the problem with? [23:17:37] ui_links.feature [23:18:05] SMalyshev: puppet agent -t --debug is sometimes helpful [23:18:35] thcipriani: I tried to do that, nothing helpful unfortunately [23:19:14] I tried to do puppet master --compile staging-wdqs-test.staging.eqiad.wmflabs on staging-palladium but I get a bunch of python errors from that [23:20:09] Error: Failed to execute generator /usr/local/bin/sshknowngen: Execution of '/usr/local/bin/sshknowngen' returned 1: Traceback (most recent call last): [23:20:11] etc. [23:20:21] matanya: can I wget it from a url? Or do I have to scp? do I have access to that labs instance? [23:20:48] legoktm: rsync, i guess. you have access, i granted it to you [23:20:52] ConfigParser.NoOptionError: No option 'dbuser' in section: 'master' [23:20:52] at /etc/puppet/modules/ssh/manifests/client.pp:9 on node staging-wdqs-test.staging.eqiad.wmflabs [23:21:04] thcipriani: ^ not linked to any new stuff per chance?
[23:21:56] SMalyshev: not anything new that I'm aware of, staging is mostly quiet in terms of work being done on it: I poke at it occasionally but that's really the only changes that happen AFAIK [23:22:58] might try adding another new box: see if it falls over in the same way. I'm wondering (since it's failing in or just after gathering facts) if something just didn't spin up quite right on here. [23:23:19] what's weird that code is under if $::realm == 'production' so how it runs at all? [23:24:08] matanya: ok, it's downloading now...should be ~15min [23:24:18] thanks legoktm [23:24:38] thcipriani: looks like it thinks it's production and tries to retrieve some passwords from somewhere for some reason... not sure why [23:26:03] maybe the name matches some over-broad regexp or something? [23:26:24] well that's interesting, looks at: ldapsearch -LLL -x -d1 -D 'cn=proxyagent,ou=profile,dc=wikimedia,dc=org' -w $(grep -Po "(?<=bindpw).*" /etc/ldap.conf) -b 'ou=h [23:26:26] osts,dc=wikimedia,dc=org' "associatedDomain=staging-wdqs-test*" [23:26:48] it doesn't show a puppetVar: realm=labs for that instance [23:27:02] so for whatever reason that instance didn't get its realm set in ldap [23:27:30] so that would explain why it's failing so early [23:27:46] yeah, I'd say just spin up a new instance and see if it happens again [23:27:59] it was probably a bootstrapping fluke [23:30:30] thcipriani: ok, spinning up new one [23:31:23] thcipriani: yeah the same thing happened [23:31:34] that is weird. [23:31:35] thcipriani: staging-wdqs2 instance [23:32:19] ok, I have to go for a train, will be online a bit later again [23:32:24] SMalyshev: I'm asking in -labs [23:35:18] jdlrobson: strange. i can verify in a debugger that the user factory is skipped, even when removing the explicit `user_factory: false` [23:36:23] jdlrobson: can you give me the full set of environment variables that barry sets? [23:37:10] marxarelli: sure... 
[23:37:40] MW_SERVER,MW_SCRIPT_PATH,MEDIAWIKI_URL,MEDIAWIKI_USER,MEDIAWIKI_PASSWORD,MEDIAWIKI_API_URL,MEDIAWIKI_LOAD_URL and BROWSER [23:37:52] jdlrobson: need values, too :) [23:38:04] oh okay let me paste [23:38:07] excluding pass [23:38:22] cool. im that over maybe [23:38:30] https://gist.github.com/jdlrobson/b68141c0a825fef03fc8 < marxarelli [23:40:18] jdlrobson: ah. so you need to set MEDIAWIKI_ENVIRONMENT=barry [23:41:04] jdlrobson: to instruct mw-selenium to fetch that set of configuration from environments.yml [23:41:16] including your user_factory: false [23:41:24] ahh stupid me [23:41:31] otherwise, it's going to load the `default` one, an alias to mw-vagrant-host [23:41:32] matanya: can you post the sha1 of the file on the bug? [23:42:37] jdlrobson: we should probably add some message to the Cucumber output that says "using configuration for environment [name]" [23:43:38] SMalyshev: whenever you're back around, looks like that problem should be fixed, just spun up staging-wdqs-test2 and it has the "realm" assigned in ldap: https://tools.wmflabs.org/watroles/variable/instancename/staging-wdqs-test2 [23:45:57] posted legoktm [23:46:02] marxarelli: yeh that would be helpful [23:46:11] marxarelli: i still have concerns about the user factory in general [23:46:14] matanya: ok, it's going to be an hour to upload to terbium :/ [23:46:28] would be great to talk about them over a coffee but not urgent if ican get barry passing again [23:46:42] https://gerrit.wikimedia.org/r/226242 < would be great if you could +2 if barry gives his blessing [23:47:10] jdlrobson: sure thing. always down for a chat over coffee [23:47:40] jdlrobson: you can probably remove the `user_factory: false` [23:47:50] i'd rather be explicit... 
[23:47:56] cool [23:47:57] we might want to change in future [23:48:36] legoktm: i think i'll go to sleep, but i'll file a request to allow this specific instance to access prod [23:49:45] good night :) [23:49:55] matanya: or lets just get you access to do this in prod ;) [23:49:58] jdlrobson: btw, the user factory thing could be a good topic for the testing workshop/discussion on Monday [23:50:05] jdlrobson: do you think you can make it? [23:50:23] should be able to yeh :) [23:50:29] just remind me in the mroning :) [23:50:31] (just announced it on eng@ today) [23:50:33] cool! [23:51:00] yeah, i hope i'm awake for it [23:51:43] in retrospect, the early timeslot should have at least been in the middle of the week
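The environment selection described at 23:40 (set MEDIAWIKI_ENVIRONMENT to pick a named block from environments.yml, otherwise fall back to `default`, an alias of mw-vagrant-host) can be sketched as follows. This is an illustration in Python, not mw-selenium's implementation: the `ENVIRONMENTS` dict stands in for environments.yml, the `user_factory` value shown for mw-vagrant-host is an assumption, and `select_environment` is a hypothetical name. The printed line follows the suggestion made at 23:42.

```python
import os

# Hypothetical stand-in for environments.yml. Only `user_factory: false`
# for barry comes from the chat; the mw-vagrant-host value is assumed.
ENVIRONMENTS = {
    "mw-vagrant-host": {"user_factory": True},
    "barry": {"user_factory": False},
}
# `default` is an alias for mw-vagrant-host, as described at 23:41.
ENVIRONMENTS["default"] = ENVIRONMENTS["mw-vagrant-host"]


def select_environment(env_vars, environments=ENVIRONMENTS):
    """Pick the configuration block named by MEDIAWIKI_ENVIRONMENT.

    Falls back to `default` when the variable is not set.
    """
    name = env_vars.get("MEDIAWIKI_ENVIRONMENT", "default")
    if name not in environments:
        raise KeyError(f"unknown environment: {name}")
    return name, environments[name]


if __name__ == "__main__":
    name, config = select_environment(os.environ)
    print(f"using configuration for environment {name}")
```

With the variable unset, the sketch returns the `default` block, which matches the behaviour reported in the chat where the `barry` settings were ignored.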