[00:13:16] RECOVERY - Puppet errors on integration-slave-docker-1001 is OK: OK: Less than 1.00% above the threshold [0.0]
[00:39:17] PROBLEM - Puppet errors on integration-slave-docker-1001 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[01:14:19] RECOVERY - Puppet errors on integration-slave-docker-1001 is OK: OK: Less than 1.00% above the threshold [0.0]
[01:17:11] (Abandoned) Mholloway: Provide Android SDK location as an argument to non-periodic test scripts [integration/config] - https://gerrit.wikimedia.org/r/368238 (https://phabricator.wikimedia.org/T171811) (owner: Mholloway)
[02:10:17] PROBLEM - Puppet errors on integration-slave-docker-1001 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[03:03:18] Gerrit, Wikidata, User-Ladsgroup, Wikidata-Sprint-2016-03-01, Wikidata-Sprint-2016-04-12: [Task] Move DataTypes repository from Github to gerrit - https://phabricator.wikimedia.org/T127292#2038409 (Legoktm) https://github.com/wmde/DataTypes/blob/master/src/Modules/DataTypesModule.php Given t...
[04:04:13] Project selenium-MultimediaViewer » firefox,beta,Linux,BrowserTests build #516: FAILURE in 8 min 12 sec: https://integration.wikimedia.org/ci/job/selenium-MultimediaViewer/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/516/
[04:15:17] RECOVERY - Puppet errors on integration-slave-docker-1001 is OK: OK: Less than 1.00% above the threshold [0.0]
[04:36:16] PROBLEM - Puppet errors on integration-slave-docker-1001 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[05:16:17] RECOVERY - Puppet errors on integration-slave-docker-1001 is OK: OK: Less than 1.00% above the threshold [0.0]
[05:49:49] PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [10.0]
[06:07:16] PROBLEM - Puppet errors on integration-slave-docker-1001 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[06:19:48] PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [10.0]
[06:47:17] RECOVERY - Puppet errors on integration-slave-docker-1001 is OK: OK: Less than 1.00% above the threshold [0.0]
[06:49:49] PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [10.0]
[07:08:19] PROBLEM - Puppet errors on integration-slave-docker-1001 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[07:20:01] Release-Engineering-Team (Next), Release Pipeline, User-Joe: Prove helm as a potential k8s deployment tool - https://phabricator.wikimedia.org/T173129#3606960 (Joe) After the discussion the other day at the containers cabal meeting, I promised to come up with a proposal for helm chart development/man...
[07:30:35] hi, looking at the deployment page on wikitech I see no eu swat today, looks like it was removed for testing jouncebot, any objection if I restore the eu swat line for today?
[07:36:58] PROBLEM - Free space - all mounts on integration-slave-jessie-1001 is CRITICAL: CRITICAL: integration.integration-slave-jessie-1001.diskspace._mnt.byte_percentfree (No valid datapoints found)integration.integration-slave-jessie-1001.diskspace._srv.byte_percentfree (<10.00%)
[07:39:47] Continuous-Integration-Infrastructure, DNS, Operations, Traffic: CI: operations-dns-lint broken due to missing Maxmind DB file - https://phabricator.wikimedia.org/T175864#3606984 (hashar) That is related. As I migrated some jobs from Trusty to Jessie, I have added a couple Jessie instances. Tha...
[07:40:10] Project selenium-Wikibase » chrome,beta,Linux,BrowserTests build #483: ABORTED in 3 hr 0 min: https://integration.wikimedia.org/ci/job/selenium-Wikibase/BROWSER=chrome,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/483/
[07:40:28] fyi, I'll dirty the puppet repo on deployment-puppetmaster02 for 2 minutes to test something out
[07:46:29] ok, done, cleaned up
[08:31:59] PROBLEM - Puppet errors on deployment-aqs01 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[08:40:50] PROBLEM - Puppet errors on deployment-aqs03 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[08:41:04] PROBLEM - Puppet errors on deployment-aqs02 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[08:43:18] RECOVERY - Puppet errors on integration-slave-docker-1001 is OK: OK: Less than 1.00% above the threshold [0.0]
[08:46:59] PROBLEM - Free space - all mounts on integration-slave-jessie-1001 is CRITICAL: CRITICAL: integration.integration-slave-jessie-1001.diskspace._mnt.byte_percentfree (No valid datapoints found)integration.integration-slave-jessie-1001.diskspace._srv.byte_percentfree (<10.00%)
[09:14:03] Release-Engineering-Team, CheckUser: Checkuser on IPs is not working - https://phabricator.wikimedia.org/T175898#3607103 (Ladsgroup)
[09:14:14] Release-Engineering-Team, CheckUser: Checkuser on IPs is not working - https://phabricator.wikimedia.org/T175898#3607115 (Ladsgroup) p:Triage>Unbreak!
[09:24:46] Release-Engineering-Team, CheckUser, MW-1.30-release-notes (WMF-deploy-2017-09-12_(1.30.0-wmf.18)), Regression: Checkuser on IPs is not working - https://phabricator.wikimedia.org/T175898#3607167 (Peachey88)
[09:29:39] Release-Engineering-Team, CheckUser, MW-1.30-release-notes (WMF-deploy-2017-09-12_(1.30.0-wmf.18)), Regression: Checkuser on IP ranges produces no results, even if there are edits in that rage - https://phabricator.wikimedia.org/T175898#3607178 (Deskana)
[09:29:51] Release-Engineering-Team, CheckUser, MW-1.30-release-notes (WMF-deploy-2017-09-12_(1.30.0-wmf.18)), Regression: Checkuser on IP ranges produces no results, even if there are edits in that range - https://phabricator.wikimedia.org/T175898#3607103 (Deskana)
[09:31:36] Release-Engineering-Team, CheckUser, MW-1.30-release-notes (WMF-deploy-2017-09-12_(1.30.0-wmf.18)), Regression: Checkuser on IP ranges produces no results, even if there are edits in that range - https://phabricator.wikimedia.org/T175898#3607103 (Deskana)
[09:31:54] Release-Engineering-Team (Kanban), Release, Train Deployments: 1.30.0-wmf.18 deployment blockers - https://phabricator.wikimedia.org/T170636#3607198 (Ladsgroup)
[09:31:56] Release-Engineering-Team, CheckUser, MW-1.30-release-notes (WMF-deploy-2017-09-12_(1.30.0-wmf.18)), Regression: Checkuser on IP ranges produces no results, even if there are edits in that range - https://phabricator.wikimedia.org/T175898#3607199 (Ladsgroup)
[09:40:48] Continuous-Integration-Infrastructure, DNS, Operations, Traffic: CI: operations-dns-lint broken due to missing Maxmind DB file - https://phabricator.wikimedia.org/T175864#3607247 (hashar) I am trying to add the GeoIP files on the CI puppet master. Gotta fix some puppet madness with an undefined...
[09:51:46] Release-Engineering-Team, CheckUser, Stewards-and-global-tools, MW-1.30-release-notes (WMF-deploy-2017-09-12_(1.30.0-wmf.18)), Regression: Checkuser on IP ranges produces no results, even if there are edits in that range - https://phabricator.wikimedia.org/T175898#3607284 (MarcoAurelio)
[10:34:38] PROBLEM - Puppet errors on integration-puppetmaster01 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [0.0]
[10:35:42] !log CI puppet master: added class geoip::data::package and parameters: puppetmaster::geoip::fetch_private: false puppetmaster::geoip::use_proxy: false - T175864
[10:35:47] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[10:35:50] T175864: CI: operations-dns-lint broken due to missing Maxmind DB file - https://phabricator.wikimedia.org/T175864
[10:38:14] Deployment-Systems, Scap (Scap3-Adoption-Phase1), scap2, OCG-PDFRenderer, Services (watching): Deploy ocg with scap3 - https://phabricator.wikimedia.org/T129142#3607364 (Aklapper) As already announced in [[ https://meta.wikimedia.org/wiki/Tech/News/2017/37 | Tech News ]], OfflineContentGenera...
[10:38:47] !log cherry-pick https://gerrit.wikimedia.org/r/#/c/377753/7 on deployment-prep's puppetmaster02 to test it on the new kafka jumbo instances
[10:38:51] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[10:39:16] PROBLEM - Puppet errors on integration-slave-docker-1001 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[10:42:23] PROBLEM - Puppet errors on integration-slave-jessie-1002 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[10:44:39] RECOVERY - Puppet errors on integration-puppetmaster01 is OK: OK: Less than 1.00% above the threshold [0.0]
[10:44:58] Continuous-Integration-Infrastructure, Release-Engineering-Team (Kanban), DNS, Operations, and 2 others: CI: operations-dns-lint broken due to missing Maxmind DB file - https://phabricator.wikimedia.org/T175864#3607455 (hashar) a:hashar I have rebuild the jenkins build and it passed on the sl...
[10:45:42] Continuous-Integration-Infrastructure, Release-Engineering-Team (Kanban), DNS, Operations, and 2 others: CI: operations-dns-lint broken due to missing Maxmind DB file - https://phabricator.wikimedia.org/T175864#3607458 (hashar) p:Triage>Normal
[10:48:26] PROBLEM - Puppet errors on deployment-kafka-jumbo-1 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[10:48:34] Continuous-Integration-Infrastructure, Gerrit, Release-Engineering-Team (Next), Patch-For-Review, Zuul: Freshly provisionned zuul fails connecting to Gerrit due to ssh key host - https://phabricator.wikimedia.org/T157912#3607464 (hashar) Open>declined
[10:52:23] RECOVERY - Puppet errors on integration-slave-jessie-1002 is OK: OK: Less than 1.00% above the threshold [0.0]
[11:03:02] PROBLEM - Puppet errors on deployment-kafka-jumbo-2 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[11:04:56] the jumbos errors are mine, need to configure the scap repo
[11:13:28] RECOVERY - Puppet errors on deployment-kafka-jumbo-1 is OK: OK: Less than 1.00% above the threshold [0.0]
[11:14:17] RECOVERY - Puppet errors on integration-slave-docker-1001 is OK: OK: Less than 1.00% above the threshold [0.0]
[11:16:52] Gerrit, Wikidata, User-Ladsgroup, Wikidata-Sprint-2016-03-01, Wikidata-Sprint-2016-04-12: [Task] Move DataTypes repository from Github to gerrit - https://phabricator.wikimedia.org/T127292#3607545 (WMDE-leszek) This component seems a bit confusing. Indeed, there is a part there which clearly...
[11:19:14] Continuous-Integration-Config, Wiki-Loves-Monuments-Database: Add Shell linting to heritage repo - https://phabricator.wikimedia.org/T175906#3607562 (JeanFred)
[11:21:17] Continuous-Integration-Config, Wiki-Loves-Monuments-Database, Patch-For-Review: Add Shell linting to heritage repo - https://phabricator.wikimedia.org/T175906#3607566 (JeanFred) Digging a bit in what we could do with our existing entry-points (tox, npm and composer) I came across [bashate](https://do...
[11:40:16] PROBLEM - Puppet errors on integration-slave-docker-1001 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[11:42:57] RECOVERY - Puppet errors on deployment-kafka-jumbo-2 is OK: OK: Less than 1.00% above the threshold [0.0]
[12:06:08] Gerrit, Release-Engineering-Team (Backlog), Scap, Patch-For-Review: Deploy gerrit with scap3 - https://phabricator.wikimedia.org/T157414#3607701 (Paladox) @Chad I guess the final things to this task is to migrate over the plugins? Could you build the its-phabricator plugin please? Then it will b...
[12:15:18] RECOVERY - Puppet errors on integration-slave-docker-1001 is OK: OK: Less than 1.00% above the threshold [0.0]
[12:16:36] Release-Engineering-Team (Watching / External), MediaWiki-Containers, Kubernetes, Services (designing), User-mobrovac: RFC: Container path conventions - https://phabricator.wikimedia.org/T169998#3415931 (akosiaris) FTR, I would really prefer us to stay close to FHS. I 've had to debug contain...
[12:22:44] Project selenium-GettingStarted » firefox,beta,Linux,BrowserTests build #525: FAILURE in 43 sec: https://integration.wikimedia.org/ci/job/selenium-GettingStarted/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/525/
[12:36:15] PROBLEM - Puppet errors on integration-slave-docker-1001 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[12:48:50] hashar! i just had some nommy tomato soup
[12:52:57] addshore: !
[12:53:01] neat
[12:53:10] and I manage to build the docker multi stage image :]
[12:53:44] I am just wondering where the sources are for docker-registry.wikimedia.org/wikimedia-jessie
[13:03:09] addshore: any clue why we need to generate a locale ? :)
[13:16:16] RECOVERY - Puppet errors on integration-slave-docker-1001 is OK: OK: Less than 1.00% above the threshold [0.0]
[13:27:35] Beta-Cluster-Infrastructure: Beta cluster rights clarification - https://phabricator.wikimedia.org/T175917#3607866 (Sau226)
[13:32:42] Release-Engineering-Team (Watching / External), MediaWiki-Containers, Kubernetes, Services (designing), User-mobrovac: RFC: Container path conventions - https://phabricator.wikimedia.org/T169998#3607898 (mobrovac) >>! In T169998#3607722, @akosiaris wrote: > FTR, I would really prefer us to st...
[13:37:16] Release-Engineering-Team, CheckUser, Stewards-and-global-tools, MW-1.30-release-notes (WMF-deploy-2017-09-12_(1.30.0-wmf.18)), and 2 others: Checkuser on IP ranges produces no results, even if there are edits in that range - https://phabricator.wikimedia.org/T175898#3607103 (Huji) @Melos where di...
[13:39:34] Release-Engineering-Team, CheckUser, Stewards-and-global-tools, MW-1.30-release-notes (WMF-deploy-2017-09-12_(1.30.0-wmf.18)), and 2 others: Checkuser on IP ranges produces no results, even if there are edits in that range - https://phabricator.wikimedia.org/T175898#3607925 (Huji) It turns out it...
[13:43:53] Release-Engineering-Team, CheckUser, Stewards-and-global-tools, MW-1.30-release-notes (WMF-deploy-2017-09-12_(1.30.0-wmf.18)), and 2 others: Checkuser on IP ranges produces no results, even if there are edits in that range - https://phabricator.wikimedia.org/T175898#3607103 (hashar) Comes from ht...
[13:48:22] Release-Engineering-Team (Kanban), Release, Train Deployments: 1.30.0-wmf.18 deployment blockers - https://phabricator.wikimedia.org/T170636#3607956 (Huji)
[13:48:26] Release-Engineering-Team, CheckUser, MW-1.30-release-notes (WMF-deploy-2017-09-12_(1.30.0-wmf.18)), Regression: Checkuser on IP ranges produces no results, even if there are edits in that range - https://phabricator.wikimedia.org/T175898#3607952 (Huji) Open>Resolved a:Huji Yeah, betwe...
[13:50:08] Release-Engineering-Team, CheckUser, MW-1.30-release-notes (WMF-deploy-2017-09-12_(1.30.0-wmf.18)), Regression: Checkuser on IP ranges produces no results, even if there are edits in that range - https://phabricator.wikimedia.org/T175898#3607958 (Ladsgroup) Adding a regression test would be reall...
[13:51:21] Beta-Cluster-Infrastructure: Beta cluster rights clarification - https://phabricator.wikimedia.org/T175917#3607866 (Samtar) Which rights are you referring to?
[13:57:04] Release-Engineering-Team, CheckUser, MW-1.30-release-notes (WMF-deploy-2017-09-12_(1.30.0-wmf.18)), Regression: Checkuser on IP ranges produces no results, even if there are edits in that range - https://phabricator.wikimedia.org/T175898#3607103 (MusikAnimal) You have to admit the irony, though.....
[14:07:15] PROBLEM - Puppet errors on integration-slave-docker-1001 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[14:10:47] Release-Engineering-Team (Watching / External), MediaWiki-Containers, Kubernetes, Services (designing), User-mobrovac: RFC: Container path conventions - https://phabricator.wikimedia.org/T169998#3607989 (akosiaris) >> I am also not fully sold on the idea that we should not template some thin...
[14:16:57] (PS1) Hashar: docker: base image for CI images [integration/config] - https://gerrit.wikimedia.org/r/378033
[14:19:55] hashar: yeh, there is an operations/docker/* repos on gerrit
[14:20:05] (PS1) Hashar: docker: set DEBIAN_FRONTEND=noninteractive [integration/config] - https://gerrit.wikimedia.org/r/378034
[14:20:18] hashar: also, no idea about locale, think python needs it? thcipriani would know
[14:20:22] addshore: eventually I went proposing a base image for ci
[14:20:26] maybe python yeah
[14:20:39] I guess it is needed, and I have tweaked how it is generated in https://gerrit.wikimedia.org/r/378033
[14:20:47] (trying to have zero warning/error)
[14:20:56] So, with the base images we want to make we need to think of a way to make the build script build them in the right order :)
[14:21:18] I wish we had CI to build the docker images when we send a patch :]
[14:21:38] ah order
[14:21:39] grr
[14:22:01] 00_ci-jessie
[14:22:04] operations-puppet
[14:22:06] ;D
[14:22:26] I really don't like the images having anything Jenkins or WM specific in them
[14:23:02] If we want a user we should just call it CI, and the uid shouldn't matter, these images should be able to run locally with no issues too / no matching uid
[14:23:14] Also, should be able to work on windows (no uids)
[14:23:35] Yeh, could do basic numbering, I mean, we would only really need a couple of levels
[14:24:19] hashar: the CI to build probably wouldn't be that hard, but build and put where? Push to docker hub? Also, it would make the CI for int-config sloooe
[14:24:22] *sloww
[14:30:48] I am tempted to rewrite that build.sh with a Makefile :D
[14:30:54] (PS1) Hashar: docker: normalize build.sh argument [integration/config] - https://gerrit.wikimedia.org/r/378035
[14:31:50] addshore: well the uid/gid is set inside the container. So I am not sure what would happen for windows
[14:32:17] I think we need that uid because we mount a volume to capture logs
[14:32:52] maybe there is a way to map uid between the host and the container
[14:33:52] I guess we will want to do something like:
[14:35:09] addshore: but in short, I guess we can figure it out later :]
[14:35:26] i think the main scope for now is to ship some images to migrate the CI jobs to them
[14:35:54] hashar: just have to chmod the logs after creating them to 777 or whatever i guess
[14:36:00] if thats the only reason the user and ids are there :)
[14:38:29] potentially yeah
[14:38:41] gotta ask back to thcipriani|afk :]
[14:42:34] yup
[14:43:26] hashar: It might be an idea to have experimental jobs for each docker job? that reads from the tag 'experimental' from dockerhub to test things out?
[14:43:38] again, just another random though
[14:43:40] *thought
[14:44:12] (CR) Addshore: [C: 2] docker: normalize build.sh argument [integration/config] - https://gerrit.wikimedia.org/r/378035 (owner: Hashar)
[14:44:41] wait, let me try that first...
[14:46:23] or test against latest
[14:46:30] and once happy promote to stable, which CI would use
[14:46:37] but yeah I would love a multiple stage process
[14:47:00] and I guess when we push a patch, we could generate the docker images, boot the containers and run some tests in them
[14:47:07] but that is becoming meta test test test test
[14:47:45] hashar: indeed, I mean, the unstable could also run against all patches but as non voting? (If the unstable is different to the stable) if not just skip those jobs
[14:48:13] hashar: yeh, thats one of the reasons I added the example run bash scripts so I can quickly test the image before committing
[14:48:43] could actually run that as part of the build script, and could also make the build script auto push to dockerhub?
[14:49:05] +1 on running the tests after the build :D
[14:49:35] The next thing I want to do is decouple the getting the files from gerrit for the test and then the running of the test
[14:49:39] and I will try to look at how we can make all of that automatic in jenkins itself
[14:49:42] but defo no time today
[14:49:45] so we can just +2 :]
[14:50:07] yeah
[14:50:12] I think if we want to do that the docker files should probably be in a different repo?
[14:50:15] I gotta rush out myself. I got a board meeting this evening
[14:50:43] I thought: send a patch to update the dockerfiles, have CI build them, run a few tests in them to ensure they are proper
[14:50:45] +2
[14:50:46] on merge
[14:50:49] and there just needs to be 1 build server i guess, and it would be quite nice as during a "test" of a patch it would build it, and on gate-submit it would have to rebuild it as if it hasnt changed it would be cached :D
[14:50:56] on merge rebuild them, push to docker hub
[14:51:07] we can talk tomorrow or whenever :D
[14:51:11] sure thing
[14:51:15] I'm starting to think some phab tickets might be useful :D
[14:51:32] :=]]]
[14:51:45] anyway I am back tomorrow
[14:51:49] * hasharAway drives
[14:52:58] * thcipriani *waves*
[14:53:22] Release-Engineering-Team (Next), Release Pipeline, User-Joe: Prove helm as a potential k8s deployment tool - https://phabricator.wikimedia.org/T173129#3608056 (akosiaris) >>! In T173129#3606960, @Joe wrote: > After the discussion the other day at the containers cabal meeting, I promised to come up wi...
[14:55:37] hi thcipriani
[14:56:35] Hiya! Catching up on scrollback: so WRT locale stuff: ruby complains about that for whatever reason. WRT to uid stuff. I really want to ensure that we don't allow root file creation on the integration cluster, so the user creating needs to be unprivileged, and also jenkins needs to be able to read them: that's pretty much my only criteria.
[14:57:12] by way of explanation as to why I just went with uid/gid jenkins/wikidev
[14:58:03] ack! okay, I might poke the user stuff at some point and just chmod the log dir / files
[14:58:28] wfm :)
[14:58:57] I am also in favor of coordination via phab ticket :)
[14:59:19] * addshore still wants to nail this shallow clone and fetch stuff down too ;)
[14:59:28] no time this week anymore i dont think
[15:01:44] Release-Engineering-Team, CheckUser, MW-1.30-release-notes (WMF-deploy-2017-09-19 (1.30.0-wmf.19)), Regression: Checkuser on IP ranges produces no results, even if there are edits in that range - https://phabricator.wikimedia.org/T175898#3608108 (Huji) >>! In T175898#3607972, @MusikAnimal wrote:...
[15:07:14] Release-Engineering-Team (Watching / External), MediaWiki-Containers, Kubernetes, Services (designing), User-mobrovac: RFC: Container path conventions - https://phabricator.wikimedia.org/T169998#3608117 (mobrovac) >>! In T169998#3607989, @akosiaris wrote: > To me it is not, in fact this is th...
[15:25:58] Beta-Cluster-Infrastructure, Analytics-Kanban, Wikimedia-Stream, Patch-For-Review: Decom RCStream in Beta Cluster - https://phabricator.wikimedia.org/T172356#3608191 (Ottomata) Just made a patch to use EventBus for RCFeed instead of RCStream. If we merge that, we can remove the RCStream puppet m...
[15:40:11] no_justification great news upstream are removing parts of drafts in gerrit tomorrow. https://gerrit-review.googlesource.com/#/c/gerrit/+/104497/ :). (Thought you would like to know) :)
[15:47:18] RECOVERY - Puppet errors on integration-slave-docker-1001 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:02:38] Release-Engineering-Team (Kanban), Release, Train Deployments: 1.31-wmf.1 deployment blockers - https://phabricator.wikimedia.org/T174361#3608300 (greg)
[16:08:17] PROBLEM - Puppet errors on integration-slave-docker-1001 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[16:46:21] (CR) Thcipriani: [C: 2] docker: normalize build.sh argument [integration/config] - https://gerrit.wikimedia.org/r/378035 (owner: Hashar)
[16:47:20] (Merged) jenkins-bot: docker: normalize build.sh argument [integration/config] - https://gerrit.wikimedia.org/r/378035 (owner: Hashar)
[16:56:27] Release-Engineering-Team (Kanban), Release Pipeline (Blubber): Package Blubber - https://phabricator.wikimedia.org/T175609#3608409 (dduvall)
[17:19:43] Deployment-Systems, Scap (Scap3-Adoption-Phase1), scap2, Discovery, and 2 others: Deploy discovery-analytics with scap3 - https://phabricator.wikimedia.org/T129149#3608534 (Gehel)
[17:19:51] Scap (Scap3-Adoption-Phase1), releng-201516-q4, releng-201718-q1, Trebuchet: [keyresult] Migrate remaining trebuchet deployed services - https://phabricator.wikimedia.org/T129290#3608538 (thcipriani)
[17:19:54] Deployment-Systems, Scap (Scap3-Adoption-Phase1), scap2, Discovery, and 2 others: Deploy discovery-analytics with scap3 - https://phabricator.wikimedia.org/T129149#3608535 (thcipriani) Open>Resolved a:thcipriani Deployed! Thanks @Gehel !
[17:20:40] Deployment-Systems, Scap (Scap3-Adoption-Phase1), scap2, Discovery, and 2 others: Deploy discovery-analytics with scap3 - https://phabricator.wikimedia.org/T129149#3608543 (Gehel) deployment completed. @mpopov do you want to check that everything is working as expected?
[17:32:50] Gerrit, Analytics-Tech-community-metrics, Upstream: Gerrit patchset 99101 cannot be accessed: "500 Internal server error" - https://phabricator.wikimedia.org/T161206#3608614 (Paladox) I wonder if we can try https://gerrit-review.googlesource.com/Documentation/rest-api-changes.html#fix-change ?
[17:34:13] Gerrit, Analytics-Tech-community-metrics, Upstream: Gerrit patchset 99101 cannot be accessed: "500 Internal server error" - https://phabricator.wikimedia.org/T161206#3608618 (Paladox) Ah, running https://gerrit-review.googlesource.com/Documentation/rest-api-changes.html#check-change results in "pr...
[17:41:10] no_justification hi, i am wondering when you have time could you run https://gerrit-review.googlesource.com/Documentation/rest-api-changes.html#fix-change please on ^^.
[17:41:19] requires an admin or the change owner to do it
[17:41:33] curl --digest --user user:password -X POST https://gerrit.wikimedia.org/r/a/changes/99101/check -v
[17:42:17] paladox got a fix for this ? https://phabricator.wikimedia.org/T174362 Unhandled Exception ("AphrontParameterQueryException")
[17:42:34] uh
[17:42:36] "Array for %Ls conversion is empty. Query: task.phid in (%Ls)"
[17:42:48] nope
[17:42:55] Zppix could you file a task please?
[17:43:18] RECOVERY - Puppet errors on integration-slave-docker-1001 is OK: OK: Less than 1.00% above the threshold [0.0]
[17:44:08] PROBLEM - Puppet errors on deployment-mathoid is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[17:44:50] paladox: Zppix I've already poked mukunda about it, don't worry
[17:45:00] thanks
[17:45:11] greg-g: well i already created the task before i saw your message
[17:47:34] Release-Engineering-Team (Kanban), Phabricator, User-Zppix: Unhandled Exception ("AphrontParameterQueryException") when viewing T174362 - https://phabricator.wikimedia.org/T175942#3608674 (greg) p:Triage>Normal a:mmodell Yes, it's broken because of the intelligent pieces of the associate...
[17:47:46] paladox: as I have stated if you comment on a task there is no need to ping me here unless its urgent. I promise I read bugmail
[17:48:00] Oh sorry. did not realise you got the mail.
[17:53:13] Beta-Cluster-Infrastructure: Beta cluster rights clarification - https://phabricator.wikimedia.org/T175917#3608712 (Aklapper) @Sau226: Could you provide a specific example and link to those "certain groups"? Currently this might be too abstract to provide a good answer. (Whatever potentially comes out of thi...
[18:16:44] Release-Engineering-Team (Kanban), Release Pipeline (Blubber): Package Blubber - https://phabricator.wikimedia.org/T175609#3608777 (dduvall)
[18:19:06] RECOVERY - Puppet errors on deployment-mathoid is OK: OK: Less than 1.00% above the threshold [0.0]
[18:30:07] Release-Engineering-Team (Kanban), Phabricator, User-Zppix: Unhandled Exception ("AphrontParameterQueryException") when viewing T174362 - https://phabricator.wikimedia.org/T175942#3608817 (mmodell)
[18:35:35] (PS5) Umherirrender: Add missing unit test, npm jobs and make tests voting [integration/config] - https://gerrit.wikimedia.org/r/376761
[18:38:35] (CR) Thcipriani: [C: 2] docker: set DEBIAN_FRONTEND=noninteractive [integration/config] - https://gerrit.wikimedia.org/r/378034 (owner: Hashar)
[18:39:18] PROBLEM - Puppet errors on integration-slave-docker-1001 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[18:39:48] Release-Engineering-Team (Kanban), Release, Train Deployments: 1.31-wmf.2 deployment blockers - https://phabricator.wikimedia.org/T174362#3608842 (mmodell)
[18:40:05] (Merged) jenkins-bot: docker: set DEBIAN_FRONTEND=noninteractive [integration/config] - https://gerrit.wikimedia.org/r/378034 (owner: Hashar)
[18:41:19] (CR) jerkins-bot: [V: -1] Add missing unit test, npm jobs and make tests voting [integration/config] - https://gerrit.wikimedia.org/r/376761 (owner: Umherirrender)
[18:47:02] (PS6) Umherirrender: Add missing unit test, npm jobs and make tests voting [integration/config] - https://gerrit.wikimedia.org/r/376761
[18:47:07] Release-Engineering-Team (Kanban), Release, Train Deployments: MW-1.30.0-wmf.9 deployment blockers - https://phabricator.wikimedia.org/T167893#3608849 (mmodell)
[19:04:56] Release-Engineering-Team (Watching / External), Phlogiston (Requests): Adjust phlogiston configuration for Release Engineering - https://phabricator.wikimedia.org/T170359#3608897 (JAufrecht) @ksmith, could you please add back the footnote references so I can be sure I get the right projects?
[19:08:34] Project selenium-MinervaNeue » chrome,beta,Linux,BrowserTests build #120: FAILURE in 19 min: https://integration.wikimedia.org/ci/job/selenium-MinervaNeue/BROWSER=chrome,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/120/
[19:19:04] Project selenium-MinervaNeue » firefox,beta,Linux,BrowserTests build #120: FAILURE in 30 min: https://integration.wikimedia.org/ci/job/selenium-MinervaNeue/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/120/
[19:22:00] Release-Engineering-Team (Watching / External), Phlogiston (Requests): Adjust phlogiston configuration for Release Engineering - https://phabricator.wikimedia.org/T170359#3608944 (ksmith)
[19:22:27] Release-Engineering-Team (Watching / External), Phlogiston (Requests): Adjust phlogiston configuration for Release Engineering - https://phabricator.wikimedia.org/T170359#3428392 (ksmith) >>! In T170359#3608897, @JAufrecht wrote: > @ksmith, could you please add back the footnote references so I can be su...
[19:24:40] greg-g: what's the beta cluster equivalent of scap for mediawiki?
[19:25:10] or does scap sync just work there?
[19:26:36] it should be the same
[19:31:43] thx
[19:33:47] Project beta-scap-eqiad build #173041: FAILURE in 0.41 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/173041/
[19:43:26] Release-Engineering-Team (Kanban), Release Pipeline: Install Blubber on contint1001 - https://phabricator.wikimedia.org/T175296#3608997 (dduvall)
[19:43:28] Release-Engineering-Team (Kanban), Release Pipeline (Blubber): Package Blubber - https://phabricator.wikimedia.org/T175609#3608995 (dduvall) Open>Resolved
[19:46:10] Yippee, build fixed!
[19:46:11] Project beta-scap-eqiad build #173042: FIXED in 2 min 32 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/173042/
[20:18:05] PROBLEM - Free space - all mounts on deployment-kafka01 is CRITICAL: CRITICAL: deployment-prep.deployment-kafka01.diskspace.root.byte_percentfree (<44.44%)
[20:49:18] RECOVERY - Puppet errors on integration-slave-docker-1001 is OK: OK: Less than 1.00% above the threshold [0.0]
[21:47:40] Continuous-Integration-Infrastructure (shipyard): CI docker build should use an apt cache - https://phabricator.wikimedia.org/T175966#3609344 (hashar)
[21:48:10] Continuous-Integration-Infrastructure (shipyard): CI docker build should use an apt cache - https://phabricator.wikimedia.org/T175966#3609357 (hashar)
[22:11:02] Gerrit, Release-Engineering-Team (Backlog), Scap, Patch-For-Review: Deploy gerrit with scap3 - https://phabricator.wikimedia.org/T157414#3609396 (demon) The deb didn't do anything other than drop some files on a disk.
[22:11:04] Release-Engineering-Team (Kanban), Release Pipeline: Install Blubber on contint1001 - https://phabricator.wikimedia.org/T175296#3609397 (dduvall) @joe or @akosiaris, would love your review of the Debian package created in {T175609} when you get a chance. We're hoping to get it into apt.wikimedia.org and...
[22:14:44] Continuous-Integration-Infrastructure (shipyard): CI docker build should use a git cache - https://phabricator.wikimedia.org/T175968#3609403 (hashar)
[22:35:00] Release-Engineering-Team (Kanban), Phlogiston (Requests), User-greg: Adjust phlogiston configuration for Release Engineering - https://phabricator.wikimedia.org/T170359#3609431 (JAufrecht) a:JAufrecht>greg I've set up two reports on dev for review: Backlog report: http://phlogiston-dev.wmfla...
[22:50:58] Release-Engineering-Team (Watching / External), Phlogiston (Requests): Adjust phlogiston configuration for Release Engineering - https://phabricator.wikimedia.org/T170359#3609435 (greg) a:greg>None >>! In T170359#3609431, @JAufrecht wrote: > I've set up two reports on dev for review: > > Backlog...
[22:57:43] Release-Engineering-Team (Kanban), Release, Train Deployments: 1.31.0-wmf.1 deployment blockers - https://phabricator.wikimedia.org/T172806#3609452 (greg)
[22:57:48] Release-Engineering-Team (Kanban), Release, Train Deployments: 1.31.0-wmf.2 deployment blockers - https://phabricator.wikimedia.org/T174358#3609454 (greg)
[22:57:52] Release-Engineering-Team (Kanban), Release, Train Deployments: 1.31.0-wmf.3 deployment blockers - https://phabricator.wikimedia.org/T174359#3609456 (greg)
[22:57:55] Release-Engineering-Team (Kanban), Release, Train Deployments: 1.31.0-wmf.4 deployment blockers - https://phabricator.wikimedia.org/T174360#3609458 (greg)
[22:57:58] Release-Engineering-Team (Kanban), Release, Train Deployments: 1.31.0-wmf.5 deployment blockers - https://phabricator.wikimedia.org/T174361#3609460 (greg)
[22:58:02] Release-Engineering-Team (Kanban), Release, Train Deployments: 1.31.0-wmf.6 deployment blockers - https://phabricator.wikimedia.org/T174362#3609462 (greg)
[22:59:17] Release-Engineering-Team (Kanban), Release, Train Deployments: 1.31.0-wmf.5 deployment blockers - https://phabricator.wikimedia.org/T174361#3559324 (greg) (I was off by a month ;) )
[23:06:57] greg-g: Eurgh.
[23:08:01] I guess we can't move sub-projects still, right? So I need to delete https://phabricator.wikimedia.org/project/view/3003/ https://phabricator.wikimedia.org/project/view/3004/ https://phabricator.wikimedia.org/project/view/3005/ https://phabricator.wikimedia.org/project/view/3006/ and re-create them under https://phabricator.wikimedia.org/project/profile/3010/
[23:17:15] James_F: sorry!
[23:17:24] greg-g: Such is life. :-)
[23:21:05] Release-Engineering-Team (Watching / External), Phlogiston (Requests): Adjust phlogiston configuration for Release Engineering - https://phabricator.wikimedia.org/T170359#3609490 (JAufrecht) >> http://phlogiston-dev.wmflabs.org/rel_report.html > > That perspective (completed above the zero line, open bel...
[23:55:58] Release-Engineering-Team (Watching / External), Phlogiston (Requests): Adjust phlogiston configuration for Release Engineering - https://phabricator.wikimedia.org/T170359#3609564 (greg) Really, just two stacked area charts (open task grouped by column and closed tasks grouped by column) for the two proje...