[00:00:44] Release-Engineering-Team (Kanban): Address proximity of service deployments to train deployments problem - https://phabricator.wikimedia.org/T182733#3832934 (Jrbranaa)
[00:02:17] Beta-Cluster-Infrastructure: Request to test centralauth operations on a test account - https://phabricator.wikimedia.org/T180757#3832946 (greg) stalled>declined Declining per email thread.
[00:38:27] !log deployed mobileapps@bfc3588 to BC
[00:38:31] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[02:06:17] PROBLEM - Free space - all mounts on deployment-fluorine02 is CRITICAL: CRITICAL: deployment-prep.deployment-fluorine02.diskspace._srv.byte_percentfree (<22.22%)
[02:58:13] Continuous-Integration-Infrastructure, MediaWiki-Core-Tests, MediaWiki-Platform-Team (MWPT-Q3-Jan-Mar-2018): Figure out how to accurately backfill MediaWiki core test code coverage data - https://phabricator.wikimedia.org/T182750#3833278 (Legoktm)
[03:01:58] Continuous-Integration-Infrastructure, MediaWiki-Platform-Team (MWPT-Q3-Jan-Mar-2018): Migrate https://tools.wmflabs.org/coverage/mediawiki/ to CI infrastructure - https://phabricator.wikimedia.org/T182751#3833289 (Legoktm) Open>stalled
[04:18:06] Project selenium-MultimediaViewer » firefox,beta,Linux,BrowserTests build #606: FAILURE in 22 min: https://integration.wikimedia.org/ci/job/selenium-MultimediaViewer/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/606/
[04:18:40] Project selenium-MultimediaViewer » chrome,beta,OS X 10.9,BrowserTests build #606: FAILURE in 22 min: https://integration.wikimedia.org/ci/job/selenium-MultimediaViewer/BROWSER=chrome,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=OS%20X%2010.9,label=BrowserTests/606/
[04:26:01] Continuous-Integration-Infrastructure, MediaWiki-Platform-Team (MWPT-Q3-Jan-Mar-2018): Generate code coverage reports for extensions - https://phabricator.wikimedia.org/T71685#3833313 (Legoktm)
[04:49:52] PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [10.0]
[04:59:28] PROBLEM - Puppet staleness on deployment-restbase01 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [43200.0]
[05:14:53] PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [10.0]
[05:22:25] PROBLEM - Puppet staleness on deployment-restbase02 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [43200.0]
[06:52:51] Yippee, build fixed!
[06:52:52] Project selenium-Wikibase » chrome,beta,Linux,BrowserTests build #573: FIXED in 2 hr 12 min: https://integration.wikimedia.org/ci/job/selenium-Wikibase/BROWSER=chrome,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/573/
[06:56:19] RECOVERY - Free space - all mounts on deployment-fluorine02 is OK: OK: All targets OK
[07:37:03] PROBLEM - Free space - all mounts on integration-slave-jessie-1003 is CRITICAL: CRITICAL: integration.integration-slave-jessie-1003.diskspace._srv.byte_percentfree (<11.11%)
[08:47:52] Gerrit: Zuul: Gerrit's ssh event stream unavailable - https://phabricator.wikimedia.org/T48917#3833444 (hashar)
[08:47:54] Gerrit, Release-Engineering-Team (Kanban), Upstream: Excessive timeouts over ssh [mina sshd] - https://phabricator.wikimedia.org/T49004#3833447 (hashar)
[08:47:56] Continuous-Integration-Infrastructure: Zuul: Gerrit's ssh event stream unavailable - https://phabricator.wikimedia.org/T51330#3833450 (hashar)
[08:54:12] PROBLEM - Long lived cherry-picks on puppetmaster on deployment-puppetmaster02 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[08:55:40] Gerrit: Zuul: Gerrit's ssh event stream unavailable - https://phabricator.wikimedia.org/T48917#3833456 (hashar)
[09:17:07] Release-Engineering-Team (Kanban), Patch-For-Review, User-greg, User-zeljkofilipin: Create #wikimedia-releng-feed and move bots there - https://phabricator.wikimedia.org/T181582#3833512 (zeljkofilipin) a:zeljkofilipin
[09:18:08] Release-Engineering-Team (Kanban), Patch-For-Review, User-greg, User-zeljkofilipin: Create #wikimedia-releng-feed and move bots there - https://phabricator.wikimedia.org/T181582#3794965 (zeljkofilipin) p:Triage>Low
[09:43:07] (CR) Hashar: [C: 1] "Yup sounds good to me. There are a few times when that helps catch issue/recovery on beta cluster, but overall that is really just spam t" [integration/config] - https://gerrit.wikimedia.org/r/397971 (https://phabricator.wikimedia.org/T181582) (owner: Greg Grossmeier)
[09:43:15] zeljkof: https://gerrit.wikimedia.org/r/#/c/397971/1 is good to go imho :)
[09:43:25] (removes selenium-* jobs IRC ping)
[09:44:07] hashar: will merge it in a few minutes, in the middle of something else
[09:44:14] thanks for the review! :D
[09:47:04] PROBLEM - Free space - all mounts on integration-slave-jessie-1003 is CRITICAL: CRITICAL: integration.integration-slave-jessie-1003.diskspace._srv.byte_percentfree (<44.44%)
[09:52:58] (CR) Zfilipin: [C: 2] JJB: Remove browsertests-irc [integration/config] - https://gerrit.wikimedia.org/r/397971 (https://phabricator.wikimedia.org/T181582) (owner: Greg Grossmeier)
[09:54:20] (Merged) jenkins-bot: JJB: Remove browsertests-irc [integration/config] - https://gerrit.wikimedia.org/r/397971 (https://phabricator.wikimedia.org/T181582) (owner: Greg Grossmeier)
[09:58:16] (CR) Zfilipin: "Updated jobs:" [integration/config] - https://gerrit.wikimedia.org/r/397971 (https://phabricator.wikimedia.org/T181582) (owner: Greg Grossmeier)
[09:59:22] Release-Engineering-Team (Kanban), User-greg, User-zeljkofilipin: Create #wikimedia-releng-feed and move bots there - https://phabricator.wikimedia.org/T181582#3833591 (zeljkofilipin)
[10:00:09] Release-Engineering-Team (Kanban), User-greg, User-zeljkofilipin: Create #wikimedia-releng-feed and move bots there - https://phabricator.wikimedia.org/T181582#3794965 (zeljkofilipin) Updated jobs: language-screenshots-VisualEditor selenium-CentralAuth selenium-CentralNotice selenium-CirrusSearch se...
[10:00:40] Release-Engineering-Team (Kanban), User-greg, User-zeljkofilipin: Create #wikimedia-releng-feed and move bots there - https://phabricator.wikimedia.org/T181582#3833595 (zeljkofilipin) a:zeljkofilipin>None
[10:10:11] (PS2) Zfilipin: Ignore Gemfile.lock [tools/release] - https://gerrit.wikimedia.org/r/397832 (https://phabricator.wikimedia.org/T182401)
[10:11:29] Release-Engineering-Team, Scap, ORES, Operations, Scoring-platform-team: Connection timeout from tin to new ores servers - https://phabricator.wikimedia.org/T181661#3833644 (mmodell) 15d5283b7422919d85203b5ba907027f9356e421 doesn't exist in the editquality repo. Somehow the submodule pointer...
[10:18:23] Release-Engineering-Team (Kanban), Phabricator, monitoring, Browser-Tests, User-zeljkofilipin: Develop tests for phabricator search to detect regressions / search quality issues - https://phabricator.wikimedia.org/T182160#3833672 (mmodell) @zeljkofilipin You now have push on that repo. I have...
[10:42:52] Release-Engineering-Team, Scap, ORES, Operations, Scoring-platform-team: Connection timeout from tin to new ores servers - https://phabricator.wikimedia.org/T181661#3833707 (akosiaris) But it does exist on tin ``` akosiaris@tin:/srv/deployment/ores/deploy/.git/modules/submodules/editquality$...
[10:45:22] Release-Engineering-Team, Scap, ORES, Operations, Scoring-platform-team: Connection timeout from tin to new ores servers - https://phabricator.wikimedia.org/T181661#3833708 (mmodell) Another thing: I'm having difficulty just cloning the editquality submodule. It's so large that git pack-obje...
[10:47:48] Release-Engineering-Team, Scap, ORES, Operations, Scoring-platform-team: Connection timeout from tin to new ores servers - https://phabricator.wikimedia.org/T181661#3833717 (akosiaris) @mmodell is on to something though with the comment about that commit not being in the repo ``` akosiaris@t...
[10:48:25] (PS1) Zfilipin: WIP Create selenium-CirrusSearch-jessie Jenkins job [integration/config] - https://gerrit.wikimedia.org/r/398030 (https://phabricator.wikimedia.org/T175179)
[10:49:13] (CR) jerkins-bot: [V: -1] WIP Create selenium-CirrusSearch-jessie Jenkins job [integration/config] - https://gerrit.wikimedia.org/r/398030 (https://phabricator.wikimedia.org/T175179) (owner: Zfilipin)
[10:49:14] Release-Engineering-Team, Scap, ORES, Operations, Scoring-platform-team: Connection timeout from tin to new ores servers - https://phabricator.wikimedia.org/T181661#3833719 (mmodell) Hmm, indeed, if the object does not exist on any branch or tag then it likely won't be fetched by the "dumb" g...
[10:52:25] Release-Engineering-Team, Scap, ORES, Operations, Scoring-platform-team: Connection timeout from tin to new ores servers - https://phabricator.wikimedia.org/T181661#3833721 (mmodell) Just fetching this one repo (editquality) from phabricator is causing inordinate load on the server. It's noth...
[10:53:25] (PS2) Zfilipin: WIP Create selenium-CirrusSearch-jessie Jenkins job [integration/config] - https://gerrit.wikimedia.org/r/398030 (https://phabricator.wikimedia.org/T175179)
[11:00:38] Release-Engineering-Team, Scap, ORES, Operations, Scoring-platform-team: Connection timeout from tin to new ores servers - https://phabricator.wikimedia.org/T181661#3833730 (akosiaris) Behavior is erratic as well ``` akosiaris@bast1001:~$ git clone https://phabricator.wikimedia.org/source/ed...
[11:02:42] Release-Engineering-Team, Scap, ORES, Operations, Scoring-platform-team: Connection timeout from tin to new ores servers - https://phabricator.wikimedia.org/T181661#3833733 (mmodell) Yeah that repo is 334M in the current workdir but the .git is 2.1 gigs. That doesn't seem too unreasonable bu...
[13:05:37] Release-Engineering-Team (Kanban), Discovery, Discovery-Search (Current work), Patch-For-Review, User-zeljkofilipin: Run selenium-EXTENSION-jessie Jenkins job for CirrusSearch - https://phabricator.wikimedia.org/T175179#3834001 (zeljkofilipin) I have added readme files to tests/selenium and t...
[13:36:10] Release-Engineering-Team (Kanban), Discovery, Discovery-Search (Current work), Patch-For-Review, User-zeljkofilipin: Run selenium-EXTENSION-jessie Jenkins job for CirrusSearch - https://phabricator.wikimedia.org/T175179#3834036 (dcausse) The error `EADDRINUSE /tmp/cirrussearch-integration-tag...
[14:18:11] (PS1) Ricordisamoa: Update PHP_CodeSniffer to 3.2.0 [tools/codesniffer] - https://gerrit.wikimedia.org/r/398050
[15:07:07] Release-Engineering-Team, Scap, ORES, Operations, Scoring-platform-team: Connection timeout from tin to new ores servers - https://phabricator.wikimedia.org/T181661#3834330 (akosiaris) A fresh clone of `http://tin.eqiad.wmnet/ores/deploy/.git/modules/submodules/editquality` on bast1001 does n...
[15:20:01] Release-Engineering-Team, Scap, ORES, Operations, Scoring-platform-team: Connection timeout from tin to new ores servers - https://phabricator.wikimedia.org/T181661#3834452 (mmodell) @akosiaris: scap //should// be getting the hash from the submodule pointers contained at `HEAD` of `tin.eqiad...
[15:20:53] Release-Engineering-Team, Scap, ORES, Operations, Scoring-platform-team: Connection timeout from tin to new ores servers - https://phabricator.wikimedia.org/T181661#3834455 (akosiaris) the ores submodule btw is in the exact same state and also fails to checkout ``` akosiaris@tin:/srv/deploym...
[15:26:22] Release-Engineering-Team, Scap, ORES, Operations, Scoring-platform-team: Connection timeout from tin to new ores servers - https://phabricator.wikimedia.org/T181661#3834476 (mmodell) ah ha! I figured _something_ out at least! The 15d5283b commit is in origin/master it just hasn't been merged...
[15:26:30] Release-Engineering-Team, Scap, ORES, Operations, Scoring-platform-team: Connection timeout from tin to new ores servers - https://phabricator.wikimedia.org/T181661#3834478 (akosiaris) >>! In T181661#3834452, @mmodell wrote: > @akosiaris: scap //should// be getting the hash from the submodule...
[15:30:05] Release-Engineering-Team, Scap, ORES, Operations, Scoring-platform-team: Connection timeout from tin to new ores servers - https://phabricator.wikimedia.org/T181661#3834498 (mmodell) >>! In T181661#3834478, @akosiaris wrote: >>>! In T181661#3834452, @mmodell wrote: >> @akosiaris: scap //shoul...
[15:35:43] Release-Engineering-Team, Scap, ORES, Operations, Scoring-platform-team: Connection timeout from tin to new ores servers - https://phabricator.wikimedia.org/T181661#3834505 (mmodell) so @awight, can you enlighten me about your scap.cfg? Is git_rev: origin/master intentional? If then I think...
[15:37:34] Release-Engineering-Team (Kanban), Phabricator, monitoring, Browser-Tests, User-zeljkofilipin: Develop tests for phabricator search to detect regressions / search quality issues - https://phabricator.wikimedia.org/T182160#3834507 (zeljkofilipin) Ok, I can push. I have pushed a small commit co...
[15:37:54] Release-Engineering-Team, Scap, ORES, Operations, Scoring-platform-team: Connection timeout from tin to new ores servers - https://phabricator.wikimedia.org/T181661#3834510 (akosiaris) Aha! nice find. It looks like it's been there since the very beginning. See fd1067ff4da. It has undergone a...
[15:40:59] Release-Engineering-Team, Scap, ORES, Operations, Scoring-platform-team: Connection timeout from tin to new ores servers - https://phabricator.wikimedia.org/T181661#3834521 (mmodell) I think I should add a NOTICE to scap that says something along the lines of "Deploying from non-default origi...
[15:46:10] Release-Engineering-Team, Scap, ORES, Operations, Scoring-platform-team: Connection timeout from tin to new ores servers - https://phabricator.wikimedia.org/T181661#3834531 (akosiaris) > I've crafted a commit on tin removing that line and retrying a scap deploy from tin just for ores1004. O...
[15:56:02] Release-Engineering-Team, Scap, ORES, Operations, Scoring-platform-team: Connection timeout from tin to new ores servers - https://phabricator.wikimedia.org/T181661#3834565 (akosiaris) >>! In T181661#3834531, @akosiaris wrote: >> I've crafted a commit on tin removing that line and retrying a...
[16:08:51] Release-Engineering-Team, Scap, ORES, Operations, and 2 others: Connection timeout from tin to new ores servers - https://phabricator.wikimedia.org/T181661#3834600 (awight) @mmodell Tangential note, I've been happy using `git clone --depth 1` on personal projects. Would that make any sense for s...
[16:28:41] Release-Engineering-Team, Scap, ORES, Operations, and 2 others: Connection timeout from tin to new ores servers - https://phabricator.wikimedia.org/T181661#3834679 (mmodell) @awight: from what I understand, git has to do a lot of extra work on the server side in order to build to shallow clone. I...
[16:32:03] PROBLEM - Free space - all mounts on integration-slave-jessie-1003 is CRITICAL: CRITICAL: integration.integration-slave-jessie-1003.diskspace._srv.byte_percentfree (<11.11%)
[16:35:10] (CR) Paladox: [C: 1] Update PHP_CodeSniffer to 3.2.0 [tools/codesniffer] - https://gerrit.wikimedia.org/r/398050 (owner: Ricordisamoa)
[16:42:32] (CR) Reedy: "https://github.com/squizlabs/PHP_CodeSniffer/compare/3.1.1...3.2.0" [tools/codesniffer] - https://gerrit.wikimedia.org/r/398050 (owner: Ricordisamoa)
[16:53:49] Release-Engineering-Team, Scap, Scoring-platform-team: Scap is unhappy about deploying from a branch other than master - https://phabricator.wikimedia.org/T182498#3834771 (mmodell) Open>Invalid As we found out in T181661, `git_rev=origin/master` was set in the scap.cfg for ores. This was prob...
[16:58:48] Release-Engineering-Team, Scap, ORES, Operations, and 2 others: Connection timeout from tin to new ores servers - https://phabricator.wikimedia.org/T181661#3834785 (mmodell)
[17:09:30] Release-Engineering-Team, Scap, ORES, Operations, Scoring-platform-team: New, mysterious scap failure - https://phabricator.wikimedia.org/T182801#3834842 (awight) p:Triage>High
[17:10:50] Release-Engineering-Team, Scap, ORES, Operations, and 2 others: Connection timeout from tin to new ores servers - https://phabricator.wikimedia.org/T181661#3834855 (mmodell) >>! In T181661#3834679, @mmodell wrote: > @awight: from what I understand, git has to do a lot of extra work on the server...
[17:12:47] Release-Engineering-Team, Scap, ORES, Operations, and 2 others: Connection timeout from tin to new ores servers - https://phabricator.wikimedia.org/T181661#3834860 (awight)
[17:13:11] Release-Engineering-Team, Scap, ORES, Operations, Scoring-platform-team: New, mysterious scap failure - https://phabricator.wikimedia.org/T182801#3834858 (awight) Open>Invalid /srv is full. Strange that there was no error message during deployment, though...
[17:13:37] Release-Engineering-Team, Scap, ORES, Operations, Scoring-platform-team: New, mysterious scap failure - https://phabricator.wikimedia.org/T182801#3834862 (mmodell) strange indeed. Full disk can cause all sorts of weird behaviors though.
[17:14:54] Release-Engineering-Team, Scap, ORES, Operations, Scoring-platform-team: New, mysterious scap failure - https://phabricator.wikimedia.org/T182801#3834866 (awight) >>! In T182801#3834862, @mmodell wrote: > strange indeed. Full disk can cause all sorts of weird behaviors though. +1 This might n...
[17:16:44] no_justification hi, i wonder should we do this https://gerrit.wikimedia.org/r/#/c/395048/2/scap/scap.cfg for the gerrit repo too?
[17:16:50] twentyafterfour: Thanks for the CR, and for implementing the cache_revs feature!
[17:17:17] paladox: :D I’ll let you know how it goes in practice.
[17:24:20] Project beta-scap-eqiad build #186188: FAILURE in 41 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/186188/
[17:24:57] RECOVERY - Free space - all mounts on deployment-sca03 is OK: OK: All targets OK
[17:34:01] heh
[17:34:19] Project beta-scap-eqiad build #186189: STILL FAILING in 41 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/186189/
[17:35:22] Release-Engineering-Team, Scap, ORES, Operations, and 2 others: Connection timeout from tin to new ores servers - https://phabricator.wikimedia.org/T181661#3834939 (awight) Looks like I'm getting the same error. > commit b67bba77acb7c0ffc678201c9f3f54f198da6650 > > scap deploy -v -l "ores*" "(no...
[17:44:20] Project beta-scap-eqiad build #186190: STILL FAILING in 41 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/186190/
[17:44:59] https://integration.wikimedia.org/ci/job/beta-scap-eqiad/186188/console
[17:46:59] damn
[17:47:28] twentyafterfour: I guess your change needs a tweak ^
[17:52:14] thcipriani: https://phabricator.wikimedia.org/D915
[17:54:19] Project beta-scap-eqiad build #186191: STILL FAILING in 40 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/186191/
[17:55:34] twentyafterfour: accepted
[18:04:21] Project beta-scap-eqiad build #186192: STILL FAILING in 42 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/186192/
[18:12:27] (PS3) Thcipriani: operations-puppet: install mtail [integration/config] - https://gerrit.wikimedia.org/r/394551 (https://phabricator.wikimedia.org/T181794) (owner: Filippo Giunchedi)
[18:12:29] (PS1) Thcipriani: docker-pkg: use `run` to update docker image [integration/config] - https://gerrit.wikimedia.org/r/398086
[18:14:37] (CR) Thcipriani: [C: 2] operations-puppet: install mtail [integration/config] - https://gerrit.wikimedia.org/r/394551 (https://phabricator.wikimedia.org/T181794) (owner: Filippo Giunchedi)
[18:15:45] (Merged) jenkins-bot: operations-puppet: install mtail [integration/config] - https://gerrit.wikimedia.org/r/394551 (https://phabricator.wikimedia.org/T181794) (owner: Filippo Giunchedi)
[18:17:00] Yippee, build fixed!
[18:17:01] Project beta-scap-eqiad build #186193: FIXED in 3 min 23 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/186193/
[18:18:45] !log Updating docker-pkg files on contint1001 for https://gerrit.wikimedia.org/r/#/c/394551/
[18:18:49] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[18:25:10] !log failed Updating docker-pkg files on contint1001 for https://gerrit.wikimedia.org/r/#/c/394551/ permissions errors with fabfile.py
[18:25:15] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[18:36:13] Continuous-Integration-Infrastructure (shipyard), Release-Engineering-Team (Kanban), Operations, Release Pipeline, and 2 others: Icinga disk space alert when a Docker container is running on an host - https://phabricator.wikimedia.org/T178454#3835328 (Dzahn) Resolved>Open
[18:37:33] Continuous-Integration-Infrastructure (shipyard), Release-Engineering-Team (Kanban), Operations, Release Pipeline, and 2 others: Icinga disk space alert when a Docker container is running on an host - https://phabricator.wikimedia.org/T178454#3692868 (Dzahn) sorry to say, but there is one of thes...
[18:54:28] PROBLEM - Puppet errors on deployment-secureredirexperiment is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[18:54:44] (CR) Thcipriani: [C: -1] "ugh. Won't work because contint-admins are not automagically added to the docker group." [integration/config] - https://gerrit.wikimedia.org/r/398086 (owner: Thcipriani)
[19:10:20] paladox: I mean I guessssss....but our filesizes are so tiny that old revisions won't matter much :)
[19:10:30] ah ok. :)
[19:11:22] Oh wait.
[19:11:37] The default (5?) should suffice for us
[19:12:20] ah ok
[19:16:02] (PS1) Thcipriani: Docker: use new operations-puppet image [integration/config] - https://gerrit.wikimedia.org/r/398103
[19:23:57] Release-Engineering-Team (Kanban), User-greg: Explain to TechComm (Daniel K) part of learnings from ORES post-mortem re arch reviews - https://phabricator.wikimedia.org/T182635#3835463 (greg) a:greg>Jrbranaa
[19:24:03] Release-Engineering-Team (Kanban): Explain to TechComm (Daniel K) part of learnings from ORES post-mortem re arch reviews - https://phabricator.wikimedia.org/T182635#3829546 (greg)
[19:30:40] (CR) Thcipriani: [C: 2] "Already live. Watched a few jobs run: seems to be working." [integration/config] - https://gerrit.wikimedia.org/r/398103 (owner: Thcipriani)
[19:32:21] PROBLEM - Free space - all mounts on integration-slave-jessie-1003 is CRITICAL: CRITICAL: integration.integration-slave-jessie-1003.diskspace._srv.byte_percentfree (<11.11%)
[19:33:06] (Merged) jenkins-bot: Docker: use new operations-puppet image [integration/config] - https://gerrit.wikimedia.org/r/398103 (owner: Thcipriani)
[19:38:49] Continuous-Integration-Infrastructure, Release-Engineering-Team (Someday), Patch-For-Review: Disable xdebug for phpunit/composer unless needed - https://phabricator.wikimedia.org/T175028#3835508 (Legoktm) What if we had our `php` wrapper script automatically add `-d zend_extension=xdebug.so` if `PHP_...
[19:51:09] PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [10.0]
[20:10:36] (CR) Chad: [C: 2] Ignore Gemfile.lock [tools/release] - https://gerrit.wikimedia.org/r/397832 (https://phabricator.wikimedia.org/T182401) (owner: Zfilipin)
[20:12:19] (Merged) jenkins-bot: Ignore Gemfile.lock [tools/release] - https://gerrit.wikimedia.org/r/397832 (https://phabricator.wikimedia.org/T182401) (owner: Zfilipin)
[20:19:35] no_justification your fork of gerrit may actually come in handy :).
[20:20:54] PROBLEM - Puppet errors on deployment-ms-be04 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[20:28:09] paladox: I was hoping it would ;-)
[20:28:19] :).
[20:28:27] Project selenium-Wikibase-chrome » chrome,beta,Linux,DebianJessie && contintLabsSlave build #41: FAILURE in 41 min: https://integration.wikimedia.org/ci/job/selenium-Wikibase-chrome/BROWSER=chrome,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=DebianJessie%20&&%20contintLabsSlave/41/
[20:30:18] no_justification im thinking we may need to do a sshd upgrade in 2.14 from 1.4 to 1.6. This includes a fix for ecdsa to get it working.
[20:30:52] I'd rather not upgrade a bundled library out of sync with upstream.
[20:30:58] ok
[20:31:00] Is there something we *need* to upgrade Mina for?
[20:31:07] Other than ecdsa?
[20:31:21] no_justification not really. just the ecdsa.
[20:31:28] Yeah, not worth it
[20:31:37] ok
[20:41:09] RECOVERY - Mediawiki Error Rate on graphite-labs is OK: OK: Less than 1.00% above the threshold [1.0]
[20:42:33] Continuous-Integration-Infrastructure (shipyard), Release-Engineering-Team (Kanban), Operations, Release Pipeline, and 2 others: Icinga disk space alert when a Docker container is running on an host - https://phabricator.wikimedia.org/T178454#3835695 (hashar) @Dzahn that is on lawrencium . Can yo...
[20:46:31] PROBLEM - Puppet errors on deployment-ms-be03 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[20:49:11] PROBLEM - Puppet errors on deployment-mediawiki06 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[20:51:10] Continuous-Integration-Infrastructure, Release-Engineering-Team (Someday), Patch-For-Review: Disable xdebug for phpunit/composer unless needed - https://phabricator.wikimedia.org/T175028#3835725 (Legoktm) Also having xdebug enabled is not a problem for composer anymore, it has some really scary magic...
[20:55:14] PROBLEM - Puppet errors on deployment-mediawiki05 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[20:56:11] PROBLEM - Puppet errors on deployment-mediawiki04 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[22:08:56] !log deployed mobileapps@ddddebb to BC
[22:09:02] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[22:10:49] mdholloway: omg, look at the sha1 ^, lol
[22:11:09] bearND: lol, i know!
[22:11:13] Lol
[22:12:28] nice.
[23:02:28] no_justification lol https://phabricator.wikimedia.org/rGERRIT699ad41086673532de69832bc3e22cdc78da3d31
[23:02:40] ah i now get it. it's notedb
[23:03:19] An empty commit?
[23:03:39] no_justification yeh because it's a commit that has metadata
[23:03:51] so a commit msg
[23:03:54] but empty content
[23:05:45] Well that's....not super useful :p
[23:06:17] heh. A workaround is to block refs/meta/* (and allow refs/meta/config).
[23:09:10] Continuous-Integration-Infrastructure, Readers-Web-Backlog, Browser-Tests: Popups browser tests failing as new summary endpoint returns 500 on Main Page - https://phabricator.wikimedia.org/T182465#3836109 (Pchelolo)
[23:13:17] lol me again https://phabricator.wikimedia.org/rGERRITd50e0632216e05975eb91aa786c2c32cfc73a15e
[23:20:34] no_justification lol https://phabricator.wikimedia.org/diffusion/TGTR/
[23:22:23] Continuous-Integration-Infrastructure (shipyard), Release-Engineering-Team (Kanban), Operations, Release Pipeline, and 2 others: Icinga disk space alert when a Docker container is running on an host - https://phabricator.wikimedia.org/T178454#3836169 (Dzahn) @hashar fixed by adding the right chec...
[23:24:11] Continuous-Integration-Infrastructure (shipyard), Release-Engineering-Team (Kanban), Operations, Release Pipeline, and 2 others: Icinga disk space alert when a Docker container is running on an host - https://phabricator.wikimedia.org/T178454#3836171 (Dzahn) Open>Resolved Current Status:...
[23:42:26] PROBLEM - Puppet errors on deployment-mira is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[23:48:22] PROBLEM - Puppet errors on deployment-tin is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]