[00:17:10] Project selenium-Flow » firefox,beta,Linux,contintLabsSlave && UbuntuTrusty build #166: 04FAILURE in 1 min 9 sec: https://integration.wikimedia.org/ci/job/selenium-Flow/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/166/ [00:32:04] 10Gerrit, 06Operations, 13Patch-For-Review: setup/deploy cobalt as gerrit warm standby/replacement - https://phabricator.wikimedia.org/T147597#2698433 (10Dzahn) [00:32:39] 10Gerrit, 06Operations, 13Patch-For-Review: setup/deploy cobalt as gerrit warm standby/replacement - https://phabricator.wikimedia.org/T147597#2697753 (10Dzahn) OS installed, added to puppet, signed salt-key, gave access to gerrit-roots, gerrit server role commented out until tomorrow... [00:50:15] 10Gerrit, 06Operations, 13Patch-For-Review: setup/deploy cobalt as gerrit warm standby/replacement - https://phabricator.wikimedia.org/T147597#2698449 (10Dzahn) started bacula restored of lead data to cobalt /srv Run Restore job JobName: RestoreFiles Bootstrap: /var/lib/bacula/helium.eqiad.wmn... [00:51:34] 10Gerrit, 06Operations, 13Patch-For-Review: setup/deploy cobalt as gerrit warm standby/replacement - https://phabricator.wikimedia.org/T147597#2698453 (10Dzahn) oops, since Where: is a prefix, this is restoring it as /srv/srv/gerrit but we can simply move it when done.. and then we'll rsync the diff tomorrow. 
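The 00:51:34 message notes that the restore's "Where:" value acts as a prefix, so restoring /srv/gerrit with a prefix of /srv lands in /srv/srv/gerrit. A minimal Python sketch of that path arithmetic follows. It only models the prefixing and does not call Bacula; the helper name `restored_path` is hypothetical.

```python
import posixpath


def restored_path(where: str, original: str) -> str:
    """Hypothetical helper: model a restore that nests the original
    absolute path underneath the `where` prefix."""
    # Drop the leading slash so join() appends instead of replacing the prefix.
    return posixpath.join(where, original.lstrip("/"))


# Prefix /srv plus original path /srv/gerrit gives the doubled directory.
print(restored_path("/srv", "/srv/gerrit"))

# In this model, a prefix of / leaves the original path unchanged.
print(restored_path("/", "/srv/gerrit"))
```

In this model the doubled directory is just the prefix applied to an already absolute path, which is why moving the tree afterwards, as the message suggests, is enough to fix it.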
[00:55:44] 10Gerrit, 06Operations, 13Patch-For-Review: setup/deploy cobalt as gerrit warm standby/replacement - https://phabricator.wikimedia.org/T147597#2698458 (10Dzahn) [00:56:45] 10Gerrit, 06Operations, 13Patch-For-Review: setup/deploy cobalt as gerrit warm standby/replacement - https://phabricator.wikimedia.org/T147597#2697753 (10Dzahn) [00:58:14] 10Gerrit, 06Operations, 13Patch-For-Review: setup/deploy cobalt as gerrit warm standby/replacement - https://phabricator.wikimedia.org/T147597#2698464 (10Dzahn) [02:19:33] PROBLEM - Puppet staleness on deployment-pdfrender is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [43200.0] [03:43:24] Project mediawiki-core-code-coverage build #2308: 04STILL FAILING in 43 min: https://integration.wikimedia.org/ci/job/mediawiki-core-code-coverage/2308/ [06:15:41] 03Scap3, 10Parsoid, 06Services, 15User-Joe, 15User-mobrovac: Enable Scap3 config deploys for Parsoid - https://phabricator.wikimedia.org/T144596#2698656 (10Joe) [07:46:08] 10Gerrit, 06Repository-Admins: Rename the Semantic Forms extension to "Page Forms" - https://phabricator.wikimedia.org/T147582#2698785 (10Peachey88) [07:47:01] 10Gerrit, 06Repository-Admins: Rename the Semantic Forms extension to "Page Forms" - https://phabricator.wikimedia.org/T147582#2697328 (10Peachey88) Do you want "PageForms" (Defacto standard for extension naming) or "Page Forms"? [07:51:10] 10Gerrit, 06Repository-Admins: Rename the Semantic Forms extension to "Page Forms" - https://phabricator.wikimedia.org/T147582#2697328 (10Paladox) Hi, we carnt rename projects in gerrit but what we can do is set the repo to read only and then create a new repo and import it in there? [07:55:30] !log Upgrading Nodepool image for Jessie [07:55:34] qa-morebots: poke [07:55:35] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [07:55:35] I am a logbot running on tools-exec-1214. [07:55:35] Messages are logged to https://tools.wmflabs.org/sal/releng. 
[07:55:35] To log a message, type !log . [08:39:00] (03CR) 10Hashar: [C: 032] "Looks good to have it always enabled. Less surprises this way." [integration/config] - 10https://gerrit.wikimedia.org/r/314571 (owner: 10Hashar) [08:39:58] (03Merged) 10jenkins-bot: dib: run puppet with --debug [integration/config] - 10https://gerrit.wikimedia.org/r/314571 (owner: 10Hashar) [08:47:41] 10Beta-Cluster-Infrastructure, 06Operations, 13Patch-For-Review, 05Prometheus-metrics-monitoring: deploy prometheus node_exporter and server to deployment-prep - https://phabricator.wikimedia.org/T144502#2698891 (10fgiunchedi) [08:48:09] 10Beta-Cluster-Infrastructure, 06Operations, 13Patch-For-Review, 05Prometheus-metrics-monitoring: deploy prometheus node_exporter and server to deployment-prep - https://phabricator.wikimedia.org/T144502#2601885 (10fgiunchedi) Prometheus for beta is available at https://beta-prometheus.wmflabs.org/beta/gra... [09:14:39] 10Beta-Cluster-Infrastructure, 10Monitoring, 07Tracking: Setup monitoring for Beta Cluster (tracking) - https://phabricator.wikimedia.org/T53497#2698944 (10fgiunchedi) [09:14:41] 10Beta-Cluster-Infrastructure, 06Operations, 13Patch-For-Review, 05Prometheus-metrics-monitoring: deploy prometheus node_exporter and server to deployment-prep - https://phabricator.wikimedia.org/T144502#2698941 (10fgiunchedi) 05Open>03Resolved Dashboard for host overview: https://grafana-labs.wikimedi... 
[09:14:47] Beta-Cluster-Infrastructure, Operations, Patch-For-Review, Prometheus-metrics-monitoring: deploy prometheus node_exporter and server to deployment-prep - https://phabricator.wikimedia.org/T144502#2698945 (fgiunchedi)
[09:25:01] PROBLEM - Puppet run on deployment-eventlogging03 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[10:04:59] RECOVERY - Puppet run on deployment-eventlogging03 is OK: OK: Less than 1.00% above the threshold [0.0]
[10:23:31] I'm looking into migrating wasat/terbium to jessie and noticed that there's no terbium equivalent in deployment-prep, is there a particular reason for that or rather a case of "no one needed that so far"?
[10:35:29] moritzm: on beta we use deployment-tin
[10:35:42] historically we had deployment-bastion which was both the deploy server and running script host
[10:36:04] moritzm: for prod, maybe keep terbium and spin off a new jessie host ?
[10:36:27] notably mwscript is still set to use Zend PHP5
[10:36:40] and it might no more be installed on jessie hosts
[10:36:43] PROBLEM - Puppet run on deployment-ms-fe01 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[10:39:36] hashar: I think I'll rather setup a temporary deployment-terbium in beta for the initial tests/sorting out puppet and packages; we can check on what do on production later on
[10:39:37] Release-Engineering-Team, Wikimedia-Developer-Summit, Developer-Relations (Oct-Dec-2016): Developer Summit 2017: Work with TPG and RelEng on solution to event documenting - https://phabricator.wikimedia.org/T132400#2699017 (Rfarrand) Thanks @RobLa-WMF! Very helpful. I would like us to be able to ea...
[10:40:37] moritzm: sounds safe :) [10:54:24] (03PS1) 10Hashar: [PoolCounter] Migrate make job to Nodepool/Jessie [integration/config] - 10https://gerrit.wikimedia.org/r/314671 [11:11:42] RECOVERY - Puppet run on deployment-ms-fe01 is OK: OK: Less than 1.00% above the threshold [0.0] [11:28:45] (03PS1) 10Hashar: [PoolCounter] run the cucumber test suite [integration/config] - 10https://gerrit.wikimedia.org/r/314675 [11:36:33] zeljkof: I am way more familiar with cucumber nowadays https://integration.wikimedia.org/ci/job/mwext-PoolCounter-build-jessie/11/console :D [11:36:38] (03PS2) 10Hashar: [PoolCounter] run the cucumber test suite [integration/config] - 10https://gerrit.wikimedia.org/r/314675 [11:37:13] hashar: cool! :D [11:37:15] (03CR) 10Hashar: [C: 032] [PoolCounter] Migrate make job to Nodepool/Jessie [integration/config] - 10https://gerrit.wikimedia.org/r/314671 (owner: 10Hashar) [11:37:20] cucumbers are good for you ;) [11:37:43] yeah [11:37:53] just found that PoolCounter mw extension had a test suite [11:37:59] so I have made the job to run it [11:38:13] (03Merged) 10jenkins-bot: [PoolCounter] Migrate make job to Nodepool/Jessie [integration/config] - 10https://gerrit.wikimedia.org/r/314671 (owner: 10Hashar) [11:40:02] zeljkof: also yesterday you told me some stack was still using rspec 2.x ? [11:40:12] we should move to rspec 3.x [11:40:17] whatever that was [11:40:33] hashar: sure, not urgent or important, but we should, eventually [11:40:43] which component was it ? [11:40:57] for whatever reason, mediawiki_selenium locks rspec to 2.x [11:41:04] I had the issue on puppet rspec [11:41:09] reading doc for rspec 3 [11:41:10] I forgot the reason, probably something changed in the api [11:41:18] and wondering why the stuff did not work under rspec 2 :D [11:41:50] (03CR) 10Hashar: [C: 032] "It works! 
https://integration.wikimedia.org/ci/job/mwext-PoolCounter-build-jessie/12/console" [integration/config] - 10https://gerrit.wikimedia.org/r/314675 (owner: 10Hashar) [11:41:54] https://integration.wikimedia.org/ci/job/mwext-PoolCounter-build-jessie/12/console lovely [11:41:56] with colors [11:42:39] (03Merged) 10jenkins-bot: [PoolCounter] run the cucumber test suite [integration/config] - 10https://gerrit.wikimedia.org/r/314675 (owner: 10Hashar) [11:42:55] zeljkof: I imagine if we bump rspec, that is going to have a bunch of side effects in all extensions [11:43:01] might need a new minor version bump [11:43:12] probably [11:46:56] ahhh [11:47:11] yeah so rspec-core and rspec-expectations are added as runtime dependencies [11:51:37] bah Debian is so helpless [11:51:47] stable has rspec 2.14.1 [11:51:54] backports/jessie 3.4 [11:54:49] (03PS1) 10Hashar: test: raise_error() should have an explicit message [selenium] - 10https://gerrit.wikimedia.org/r/314676 [11:58:31] (03CR) 10jenkins-bot: [V: 04-1] test: raise_error() should have an explicit message [selenium] - 10https://gerrit.wikimedia.org/r/314676 (owner: 10Hashar) [12:00:50] (03PS2) 10Hashar: test: raise_error() should have an explicit message [selenium] - 10https://gerrit.wikimedia.org/r/314676 [12:09:08] 03Scap3, 10ContentTranslation-CXserver, 10MediaWiki-extensions-ContentTranslation, 06Services, 15User-mobrovac: Enable Scap3 config deploys for CXServer - https://phabricator.wikimedia.org/T147634#2699130 (10mobrovac) [12:09:59] 03Scap3, 10ContentTranslation-CXserver, 10MediaWiki-extensions-ContentTranslation, 06Services, 15User-mobrovac: Enable Scap3 config deploys for CXServer - https://phabricator.wikimedia.org/T147634#2699147 (10mobrovac) [12:10:01] 03Scap3, 06Services, 10service-runner, 10service-template-node, 15User-mobrovac: Enable config deploys for service::node services - https://phabricator.wikimedia.org/T144542#2602980 (10mobrovac) [12:34:08] 10Continuous-Integration-Infrastructure, 
06Release-Engineering-Team: Investigate again a central cache for package managers - https://phabricator.wikimedia.org/T147635#2699168 (10hashar) [12:36:04] 10Gerrit, 06Repository-Admins: Rename the Semantic Forms extension to "Page Forms" - https://phabricator.wikimedia.org/T147582#2699182 (10Yaron_Koren) @Peachey88 - "Page Forms"; all of my extensions' names contain spaces. Although as far as I know, on Git/Gerrit and Phabricator the spaces are taken out anyway,... [12:45:28] PROBLEM - Puppet run on deployment-cache-text04 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [13:19:10] 05Gitblit-Deprecate, 10Diffusion: Redirect git.wikimedia.org HEAD URLs to Diffusion - https://phabricator.wikimedia.org/T141965#2699284 (10Nemo_bis) [13:21:33] 06Release-Engineering-Team, 06Operations, 06Security-Team, 15User-greg: Determine a core set or a checklist of permissions for deployment purpose - https://phabricator.wikimedia.org/T140270#2699289 (10Dzahn) [13:25:02] 05Gitblit-Deprecate, 10Diffusion: Redirect git.wikimedia.org HEAD URLs to Diffusion - https://phabricator.wikimedia.org/T141965#2699303 (10Dzahn) Before it can be deployed it needs reviews on the Gerrit change from Phabricator/Gerrit maintainers. 
[13:25:28] RECOVERY - Puppet run on deployment-cache-text04 is OK: OK: Less than 1.00% above the threshold [0.0] [13:42:48] (03PS1) 10Hashar: Let castor save from publish pipeline [integration/config] - 10https://gerrit.wikimedia.org/r/314688 (https://phabricator.wikimedia.org/T119140) [13:48:07] (03PS2) 10Hashar: Let castor save from postmerge pipeline [integration/config] - 10https://gerrit.wikimedia.org/r/314688 (https://phabricator.wikimedia.org/T119140) [13:55:24] (03PS1) 10Hashar: Migrate unicodejs-publish to Nodepool [integration/config] - 10https://gerrit.wikimedia.org/r/314690 (https://phabricator.wikimedia.org/T119140) [13:55:38] (03CR) 10Hashar: [C: 032] Let castor save from postmerge pipeline [integration/config] - 10https://gerrit.wikimedia.org/r/314688 (https://phabricator.wikimedia.org/T119140) (owner: 10Hashar) [13:55:45] (03CR) 10Hashar: [C: 032] Migrate unicodejs-publish to Nodepool [integration/config] - 10https://gerrit.wikimedia.org/r/314690 (https://phabricator.wikimedia.org/T119140) (owner: 10Hashar) [13:56:35] (03Merged) 10jenkins-bot: Let castor save from postmerge pipeline [integration/config] - 10https://gerrit.wikimedia.org/r/314688 (https://phabricator.wikimedia.org/T119140) (owner: 10Hashar) [13:57:26] (03Merged) 10jenkins-bot: Migrate unicodejs-publish to Nodepool [integration/config] - 10https://gerrit.wikimedia.org/r/314690 (https://phabricator.wikimedia.org/T119140) (owner: 10Hashar) [14:01:06] (03PS1) 10Hashar: Debian glue for ops/debs/bdsync [integration/config] - 10https://gerrit.wikimedia.org/r/314691 [14:01:20] (03CR) 10Hashar: [C: 032] Debian glue for ops/debs/bdsync [integration/config] - 10https://gerrit.wikimedia.org/r/314691 (owner: 10Hashar) [14:01:54] (03Merged) 10jenkins-bot: Debian glue for ops/debs/bdsync [integration/config] - 10https://gerrit.wikimedia.org/r/314691 (owner: 10Hashar) [14:02:39] hey hashar you know how we said we were kick off some nodepool migrations monday, I forgot monday is a US holiday 
[14:02:42] tuesday? [14:07:40] chasemp: yeah Columbus day :D [14:07:49] and you told me that your monday are quite busy anyway [14:07:58] I have migrated a few misc jobs already yesterday [14:08:31] the whole thing is taking long, but I am quite happy since we have a bunch of nice metrics/graphs around now :] [14:21:53] (03PS1) 10Hashar: mwext-jsduck-publish to Nodepool [integration/config] - 10https://gerrit.wikimedia.org/r/314693 [14:23:29] (03CR) 10Hashar: [C: 032] mwext-jsduck-publish to Nodepool [integration/config] - 10https://gerrit.wikimedia.org/r/314693 (owner: 10Hashar) [14:25:29] (03Merged) 10jenkins-bot: mwext-jsduck-publish to Nodepool [integration/config] - 10https://gerrit.wikimedia.org/r/314693 (owner: 10Hashar) [14:31:16] (03PS1) 10Hashar: Migrate oojs publish jobs to Nodepool [integration/config] - 10https://gerrit.wikimedia.org/r/314697 [14:31:30] (03CR) 10Hashar: [C: 032] Migrate oojs publish jobs to Nodepool [integration/config] - 10https://gerrit.wikimedia.org/r/314697 (owner: 10Hashar) [14:32:33] (03Merged) 10jenkins-bot: Migrate oojs publish jobs to Nodepool [integration/config] - 10https://gerrit.wikimedia.org/r/314697 (owner: 10Hashar) [14:35:32] (03PS1) 10Hashar: Delete parsoidsvc-deploy-jsduck-publish [integration/config] - 10https://gerrit.wikimedia.org/r/314701 [14:36:32] (03PS1) 10Hashar: parsoidsvc-source-jsduck-publish to Nodepool [integration/config] - 10https://gerrit.wikimedia.org/r/314702 [14:36:48] (03CR) 10Hashar: [C: 032] parsoidsvc-source-jsduck-publish to Nodepool [integration/config] - 10https://gerrit.wikimedia.org/r/314702 (owner: 10Hashar) [14:36:53] (03CR) 10Hashar: [C: 032] Delete parsoidsvc-deploy-jsduck-publish [integration/config] - 10https://gerrit.wikimedia.org/r/314701 (owner: 10Hashar) [14:38:07] (03Merged) 10jenkins-bot: Delete parsoidsvc-deploy-jsduck-publish [integration/config] - 10https://gerrit.wikimedia.org/r/314701 (owner: 10Hashar) [14:38:31] (03Merged) 10jenkins-bot: parsoidsvc-source-jsduck-publish 
to Nodepool [integration/config] - 10https://gerrit.wikimedia.org/r/314702 (owner: 10Hashar) [15:03:21] PROBLEM - Puppet run on deployment-eventlogging04 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [15:04:48] 10Gerrit, 06Operations, 10hardware-requests: Allocate spare misc box in eqiad for gerrit replacement - https://phabricator.wikimedia.org/T147596#2699555 (10mark) 05Open>03Resolved Approved. [15:05:54] (03PS1) 10Hashar: Fix job oojs-core-publish [integration/config] - 10https://gerrit.wikimedia.org/r/314705 [15:08:19] (03CR) 10Hashar: [C: 032] Fix job oojs-core-publish [integration/config] - 10https://gerrit.wikimedia.org/r/314705 (owner: 10Hashar) [15:09:39] (03Merged) 10jenkins-bot: Fix job oojs-core-publish [integration/config] - 10https://gerrit.wikimedia.org/r/314705 (owner: 10Hashar) [15:23:39] PROBLEM - Puppet run on deployment-kafka05 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [15:30:56] 06Release-Engineering-Team, 10Phabricator, 13Patch-For-Review, 07Wikimedia-Incident: Contention on search phabricator database creating full phabricator outages - https://phabricator.wikimedia.org/T146673#2699613 (10mmodell) [15:38:00] Project selenium-MobileFrontend » chrome,beta,Linux,contintLabsSlave && UbuntuTrusty build #185: 04FAILURE in 15 min: https://integration.wikimedia.org/ci/job/selenium-MobileFrontend/BROWSER=chrome,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/185/ [15:43:19] RECOVERY - Puppet run on deployment-eventlogging04 is OK: OK: Less than 1.00% above the threshold [0.0] [15:45:11] Project selenium-MobileFrontend » firefox,beta,Linux,contintLabsSlave && UbuntuTrusty build #185: 04FAILURE in 23 min: https://integration.wikimedia.org/ci/job/selenium-MobileFrontend/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/185/ [15:46:00] Project mediawiki-core-code-coverage build #2309: 04STILL FAILING in 46 min: 
https://integration.wikimedia.org/ci/job/mediawiki-core-code-coverage/2309/
[16:03:38] RECOVERY - Puppet run on deployment-kafka05 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:51:33] hello, how can i see what is the current branch of mw deployed?
[16:52:23] nuria: for what purpose? programmatically? query special:version on the wiki you're interested in. Otherwise, there's this handle single use website: https://tools.wmflabs.org/versions/
[16:53:08] greg-g: ahhahaha
[16:53:52] handy*
[16:56:05] "how dare you enquire what versions of MediaWiki we're running!?|
[16:56:07] "
[16:58:58] greg-g: and forgive me if i'm a total newbie to this but ... is there anything i need to do to be sure this change (merged) deploys? https://gerrit.wikimedia.org/r/#/c/312561/
[17:00:01] nuria: you can check by clicking on "included in" on the top right and wait for it to load and then see which branch it's in. it's in wmf.21 so yeah, it's deployed
[17:00:46] or about to be
[17:00:47] ostriches: yo, i'm here, are we using -operations?
[17:01:16] Krenair: per the versions page, we're at wmf.21 everywhere :)
[17:01:21] or -devtools or ..
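The 16:52:23 reply suggests querying Special:Version to find the deployed branch programmatically. One way to do that, sketched here in Python under stated assumptions, is to read the `generator` field that the MediaWiki Action API returns for `meta=siteinfo` and pull the `wmf.N` suffix out of it. The helper names, the `/w/api.php` path, and the sample generator string are assumptions for illustration, and the sketch makes no network request.

```python
import re
from typing import Optional


def siteinfo_url(host: str) -> str:
    """Build a siteinfo query URL (the /w/api.php path is an assumption)."""
    return (
        f"https://{host}/w/api.php"
        "?action=query&meta=siteinfo&siprop=general&format=json"
    )


def wmf_branch(generator: str) -> Optional[str]:
    """Extract a 'wmf.N' suffix from a generator string, or None if absent."""
    match = re.search(r"wmf\.\d+", generator)
    return match.group(0) if match else None


# Illustrative value only; a real one would come from the JSON field
# query.general.generator in the siteinfo response.
sample = "MediaWiki 1.28.0-wmf.21"
print(wmf_branch(sample))
print(siteinfo_url("en.wikipedia.org"))
```

This reports which branch a wiki is running, not whether a particular change is included in it.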
[17:01:27] mutante: Probably best ya
[17:01:30] alright
[17:01:38] -operations is best for that type of thing, yeah
[17:01:39] Lemme have a smoke then we'll get started
[17:01:52] greg-g, sure but that doesn't mean every change listed as included in a branch is actually currently deployed
[17:02:06] Krenair: sure, if it was just backported, right
[17:02:16] greg-g: ok, got it for next time
[17:55:12] Beta-Cluster-Infrastructure, Labs, Patch-For-Review: Replace all class imports on Labs with role imports - https://phabricator.wikimedia.org/T147233#2699984 (Andrew)
[18:23:11] PROBLEM - Puppet run on integration-slave-trusty-1012 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[18:24:21] PROBLEM - Host integration-puppetmaster is DOWN: CRITICAL - Host Unreachable (10.68.16.42)
[18:24:29] yes yes
[18:57:33] is gerrit sick again?
[18:57:55] fatal: unable to access 'https://gerrit.wikimedia.org/r/p/operations/puppet/': Failed to connect to gerrit.wikimedia.org port 443: Connection refused
[18:58:29] andrewbogott hi, gerrit is about to go into maint mode
[18:58:39] See -operations
[18:59:20] andrewbogott we are migrating gerrit to a new server due to problems with cpu
[19:01:35] is gerrit ailing again?
[19:01:46] andrewbogott: read wikitech-l :) [19:01:54] we're moving servers due to bad hardware [19:02:01] Per ^^ [19:03:13] RECOVERY - Puppet run on integration-slave-trusty-1012 is OK: OK: Less than 1.00% above the threshold [0.0] [19:04:03] ok [19:04:27] ok :) [19:08:24] 06Release-Engineering-Team, 05Goal, 15User-greg: Redo some #RelEng -related project workboard columns - https://phabricator.wikimedia.org/T138884#2700310 (10greg) [19:12:40] PROBLEM - Puppet run on integration-slave-precise-1012 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [19:13:32] PROBLEM - Puppet run on integration-slave-trusty-1003 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [19:14:02] PROBLEM - Puppet run on integration-slave-jessie-1001 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [19:14:53] Project beta-code-update-eqiad build #124752: 04FAILURE in 1 min 52 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/124752/ [19:15:26] PROBLEM - Puppet run on integration-slave-precise-1002 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [19:17:52] PROBLEM - Puppet run on integration-slave-jessie-1002 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [19:20:30] PROBLEM - Puppet run on integration-slave-trusty-1001 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [19:22:13] PROBLEM - Puppet run on integration-slave-trusty-1016 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [19:25:09] PROBLEM - Puppet run on integration-slave-jessie-1005 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [19:33:02] PROBLEM - Puppet run on deployment-sentry01 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [19:35:29] Project beta-code-update-eqiad build #124753: 04STILL FAILING in 12 min: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/124753/ [19:35:56] 
PROBLEM - Puppet run on integration-slave-trusty-1017 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [19:39:12] PROBLEM - Puppet run on integration-slave-trusty-1012 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [19:39:18] PROBLEM - Puppet run on deployment-eventlogging04 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [19:39:54] PROBLEM - Puppet run on integration-slave-trusty-1013 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [19:40:48] PROBLEM - Puppet run on integration-slave-trusty-1004 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [19:42:23] PROBLEM - Puppet run on integration-slave-trusty-1006 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [19:43:45] PROBLEM - Puppet run on integration-slave-jessie-1004 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [19:43:45] PROBLEM - Puppet run on integration-slave-jessie-1003 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [19:46:49] PROBLEM - Puppet run on integration-slave-trusty-1014 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [19:48:21] Project beta-code-update-eqiad build #124754: 04STILL FAILING in 12 min: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/124754/ [19:51:07] PROBLEM - Puppet run on integration-slave-trusty-1018 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [19:52:09] PROBLEM - Puppet run on integration-slave-trusty-1011 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [19:53:44] PROBLEM - Puppet run on integration-slave-precise-1011 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [19:54:13] Project beta-code-update-eqiad build #124755: 04STILL FAILING in 6 min 9 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/124755/ [19:56:08] Project 
beta-code-update-eqiad build #124756: 04STILL FAILING in 1 min 55 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/124756/ [20:04:55] Project beta-code-update-eqiad build #124757: 04STILL FAILING in 1 min 54 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/124757/ [20:09:59] PROBLEM - Puppet run on repository is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [20:10:25] !log rebooting integration-puppetmaster01 [20:10:29] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [20:10:40] !log Created repository.integration.eqiad.wmflabs to play/Test Sonatype Nexus [20:10:43] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [20:14:55] Yippee, build fixed! [20:14:55] Project beta-code-update-eqiad build #124758: 09FIXED in 1 min 54 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/124758/ [20:22:26] (03CR) 10Hashar: "recheck" [selenium] - 10https://gerrit.wikimedia.org/r/314676 (owner: 10Hashar) [20:22:38] RECOVERY - Puppet run on integration-slave-precise-1012 is OK: OK: Less than 1.00% above the threshold [0.0] [20:24:01] RECOVERY - Puppet run on integration-slave-jessie-1001 is OK: OK: Less than 1.00% above the threshold [0.0] [20:27:51] RECOVERY - Puppet run on integration-slave-jessie-1002 is OK: OK: Less than 1.00% above the threshold [0.0] [20:30:07] RECOVERY - Puppet run on integration-slave-jessie-1005 is OK: OK: Less than 1.00% above the threshold [0.0] [20:30:31] RECOVERY - Puppet run on integration-slave-trusty-1001 is OK: OK: Less than 1.00% above the threshold [0.0] [20:32:15] RECOVERY - Puppet run on integration-slave-trusty-1016 is OK: OK: Less than 1.00% above the threshold [0.0] [20:34:14] RECOVERY - Puppet run on integration-slave-trusty-1012 is OK: OK: Less than 1.00% above the threshold [0.0] [20:34:54] RECOVERY - Puppet run on integration-slave-trusty-1013 is OK: OK: Less than 1.00% above 
the threshold [0.0]
[20:35:50] RECOVERY - Puppet run on integration-slave-trusty-1004 is OK: OK: Less than 1.00% above the threshold [0.0]
[20:35:54] RECOVERY - Puppet run on integration-slave-trusty-1017 is OK: OK: Less than 1.00% above the threshold [0.0]
[20:37:22] RECOVERY - Puppet run on integration-slave-trusty-1006 is OK: OK: Less than 1.00% above the threshold [0.0]
[20:38:02] RECOVERY - Puppet run on deployment-sentry01 is OK: OK: Less than 1.00% above the threshold [0.0]
[20:38:42] RECOVERY - Puppet run on integration-slave-jessie-1004 is OK: OK: Less than 1.00% above the threshold [0.0]
[20:38:46] RECOVERY - Puppet run on integration-slave-jessie-1003 is OK: OK: Less than 1.00% above the threshold [0.0]
[20:39:21] Release-Engineering-Team, User-greg: Create agenda outline for 2016 RelEng team offsite - https://phabricator.wikimedia.org/T138437#2700491 (greg) Open>Resolved a:greg Done, though the agenda will be in flux even during the event, of course.
[20:40:00] RECOVERY - Puppet run on repository is OK: OK: Less than 1.00% above the threshold [0.0]
[20:41:48] RECOVERY - Puppet run on integration-slave-trusty-1014 is OK: OK: Less than 1.00% above the threshold [0.0]
[20:43:14] if ( $wmfRealm == 'labs' && file_exists( '/etc/wikimedia-transcoding' ) ) {
[20:43:14] require( "$wmfConfigDir/transcoding-labs.org" );
[20:43:19] Where does that file come from?
[20:43:25] the transcoding-labs.org
[20:43:31] why is it not .php?
[20:44:21] RECOVERY - Puppet run on deployment-eventlogging04 is OK: OK: Less than 1.00% above the threshold [0.0]
[20:46:09] RECOVERY - Puppet run on integration-slave-trusty-1018 is OK: OK: Less than 1.00% above the threshold [0.0]
[20:47:09] RECOVERY - Puppet run on integration-slave-trusty-1011 is OK: OK: Less than 1.00% above the threshold [0.0]
[20:47:19] Reedy, I don't think that code is ever called
[20:47:21] the require, I mean
[20:47:25] lol
[20:47:31] /etc/wikimedia-transcoding doesn't seem to exist anywhere in beta
[20:47:35] Well, the file itself it's in wmf-config
[20:47:40] *isn't in
[20:47:47] so it'd just break
[20:48:32] Shall I just make a patch to remove the lot?
[20:48:37] krenair@deployment-salt02:~$ sudo salt '*' cmd.run --out=text 'ls -l /etc/wikimedia-transcoding' | grep -v "No such file"
[20:48:37] krenair@deployment-salt02:~$
[20:48:39] yep
[20:48:41] RECOVERY - Puppet run on integration-slave-precise-1011 is OK: OK: Less than 1.00% above the threshold [0.0]
[20:50:23] RECOVERY - Puppet run on integration-slave-precise-1002 is OK: OK: Less than 1.00% above the threshold [0.0]
[20:51:01] PROBLEM - Puppet run on repository is CRITICAL: CRITICAL: 10.00% of data above the critical threshold [0.0]
[20:53:34] RECOVERY - Puppet run on integration-slave-trusty-1003 is OK: OK: Less than 1.00% above the threshold [0.0]
[21:00:58] RECOVERY - Puppet run on repository is OK: OK: Less than 1.00% above the threshold [0.0]
[21:06:57] PROBLEM - Puppet run on repository is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[21:11:22] Gerrit, Operations, Patch-For-Review: setup/deploy cobalt as gerrit warm standby/replacement - https://phabricator.wikimedia.org/T147597#2700582 (Dzahn)
[21:12:00] Gerrit, Operations, Patch-For-Review: setup/deploy cobalt as gerrit warm standby/replacement - https://phabricator.wikimedia.org/T147597#2697753 (Dzahn) 20:30 bblack: lead.wikimedia.org: replaced by cobalt functionally,
please leave it untouched for now with puppet disabled! 19:46 mutante: deleted o... [21:16:58] RECOVERY - Puppet run on repository is OK: OK: Less than 1.00% above the threshold [0.0] [21:46:28] PROBLEM - Puppet run on deployment-cache-text04 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [22:04:13] 10Continuous-Integration-Infrastructure, 06Release-Engineering-Team, 13Patch-For-Review: Investigate again a central cache for package managers - https://phabricator.wikimedia.org/T147635#2700720 (10hashar) Did a basic puppet sprint and applied it on repository.integration.eqiad.wmflabs Nexus got installed... [22:21:29] RECOVERY - Puppet run on deployment-cache-text04 is OK: OK: Less than 1.00% above the threshold [0.0] [22:28:45] Yippee, build fixed! [22:28:46] Project selenium-MobileFrontend » chrome,beta,Linux,contintLabsSlave && UbuntuTrusty build #186: 09FIXED in 16 min: https://integration.wikimedia.org/ci/job/selenium-MobileFrontend/BROWSER=chrome,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/186/ [22:32:35] 06Release-Engineering-Team, 15User-greg: Create FY1617Q2 timespent spreadsheet - https://phabricator.wikimedia.org/T147675#2700755 (10greg) Next week is done, which is good enough for now, but need to create the following weeks (copy/paste, if this one makes sense) and summary tab. [22:35:51] Yippee, build fixed! 
[22:35:51] Project selenium-MobileFrontend » firefox,beta,Linux,contintLabsSlave && UbuntuTrusty build #186: 09FIXED in 23 min: https://integration.wikimedia.org/ci/job/selenium-MobileFrontend/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/186/ [22:43:36] PROBLEM - Keyholder status on deployment-tin is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [22:54:13] PROBLEM - Puppet run on deployment-ores-redis is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [22:56:00] PROBLEM - Puppet run on deployment-eventlogging03 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [23:27:41] (03PS1) 10Niedzielski: WIP: Android: upgrade emulator to API 24 [integration/config] - 10https://gerrit.wikimedia.org/r/314788 (https://phabricator.wikimedia.org/T133183) [23:30:59] RECOVERY - Puppet run on deployment-eventlogging03 is OK: OK: Less than 1.00% above the threshold [0.0] [23:31:13] (03PS2) 10Niedzielski: WIP: Android: upgrade emulator to API 24 [integration/config] - 10https://gerrit.wikimedia.org/r/314788 (https://phabricator.wikimedia.org/T133183) [23:34:15] RECOVERY - Puppet run on deployment-ores-redis is OK: OK: Less than 1.00% above the threshold [0.0]