[00:00:29] <Krenair>	 https://phabricator.wikimedia.org/T110556 I think
[00:00:35] <Krenair>	 but there's more to it than that
[00:01:50] <niedzielski>	 Krenair: thanks, that's a good option
[00:02:13] <legoktm>	 niedzielski, thcipriani: in the past we had lanthanum which was a production box and physical hardware, but that was phased out a year or two ago in favor of running everything in labs, which offers better isolation, etc. It can be done, would just require hardware request and some ops help, etc.
[00:02:56] <Krenair>	 legoktm, are you talking a labs-support type host, or something more like promethium.wikitextexp.eqiad.wmflabs ?
[00:03:10] <niedzielski>	 legoktm: thanks :)
[00:03:26] <legoktm>	 Krenair: it was a proper production server, lanthanum.eqiad.wmnet.
[00:03:35] <legoktm>	 https://phabricator.wikimedia.org/T86658
[00:03:44] <wikibugs>	 10Continuous-Integration-Infrastructure, 05Goal, 07Technical-Debt: All repositories should pass jshint test - https://phabricator.wikimedia.org/T62619#2631416 (10ashley)
[00:06:20] <Krenair>	 legoktm, oh right, you're talking about going back on 'phased out a year or two ago in favor of running everything in labs, which offers better isolation, etc.'
[00:06:31] <Krenair>	 while it's theoretically possible I can't guarantee it'd be allowed
[00:11:13] <legoktm>	 right, it'd probably be bare metal in labs 
[00:11:25] <legoktm>	 niedzielski: I'd recommend filing a ci-infra task for this, how urgent is it?
[00:11:51] <niedzielski>	 legoktm: it's not urgent. it was more a question of what my options are
[00:12:22] <legoktm>	 "it is possible, but requires more details and planning" ;)
[00:12:57] <niedzielski>	 legoktm: getting android ci to do everything we need it to, which seems startlingly basic, has unfortunately been a heavy time investment due to the android ecosystem. at a certain point, it seems like diminishing returns :|
[00:13:06] <niedzielski>	 legoktm: of course! :)
[00:13:29] <Krenair>	 First, I'd find out if it's required to be within labs isolation
[00:13:29] <legoktm>	 huh
[00:13:55] <legoktm>	 is that due to android stuff? or how because of how wikimedia runs CI?
[00:14:19] <Krenair>	 If it is, I'd go to labs ops and hashar
[00:14:42] <Krenair>	 If it's not, I'd go to hashar and ops hardware requests
[00:15:46] <niedzielski>	 legoktm: mostly android stuff. like the emulators are crazy slow so full tests take a long while to run to completion. the virtualization is always tricky with emulators but newer versions don't seem to support software rendering fully. the jenkins android emulator plugin kind of works and has gotten a lot better but is incompatible with the latest
[00:15:46] <niedzielski>	  android tooling
[00:16:56] <legoktm>	 do other people use something other than jenkins?
[00:17:03] <niedzielski>	 legoktm: it's easy for me to run the tests on a laptop but that's lame because it's very process oriented
[00:17:08] <niedzielski>	 it's not*
[00:17:18] * legoktm nods
[00:18:11] <niedzielski>	 legoktm: i'm not sure for android specifically. it's not really any one tool's problem and most of the issues lie in androidland
[00:18:22] <niedzielski>	 legoktm: mostly i'm just whining (sorry about that) :)
[00:19:21] <legoktm>	 haha this is all interesting to me :)
[00:20:03] <Krenair>	 niedzielski, so do you know where to go next with this?
[00:20:09] <niedzielski>	 legoktm: we have been using the api 15 (ice cream sandwich) emulator but it's no longer supproted by google
[00:21:15] <niedzielski>	 Krenair: not sure yet. i guess i'm thinking of either removing our screenshot tests that depend on webviews or trying to get virtualgl working
[00:21:24] <Krenair>	 no I mean
[00:21:48] <Krenair>	 oh
[00:21:59] <Krenair>	 so you'd just not use physical hardware?
[00:22:44] <Dereckson>	 Did you explore the possibility to test on Android x86 image? If so, you could get a pure VM, not an emulator as Jenkins slave
[00:22:46] <niedzielski>	 Krenair: yeah, i'm a little skeptical virtualgl will actually work even if i manage to get it configured right and if we removed the tests, we wouldn't need hardware.
[00:23:11] <Krenair>	 I'm not an android expert
[00:23:19] <Dereckson>	 http://thisismyeye.blogspot.be/2014/04/enabling-virtio-drivers-on-kernel-for.html seems to do it, see also https://sourceforge.net/projects/androidx86-openstack/
[00:23:36] <niedzielski>	 Dereckson: i haven't. that's an interesting idea. i tried running android-x86 in a virtualbox instance but opengl es was only supported for physical hardware last i checked
[00:23:39] <Krenair>	 But isn't WebView pretty important to the app?
[00:24:16] <niedzielski>	 Krenair: yes
[00:24:27] <Krenair>	 should probably be testing it
[00:24:44] <niedzielski>	 Krenair: but there are other issues with those tests. they're integration level tests (as opposed to unit) and quite flaky
[00:25:26] <niedzielski>	 Dereckson: we have some tests that depend on the latest version of android so i don't think this will quite work for us (or if it did, it would be another tradeoff). really neat idea though
[00:32:29] <niedzielski>	 i'll figure this out tomorrow. thanks all \o
[03:36:01] <greg-g>	 bd808: <3 <3
[03:58:22] <wmf-insecte>	 Yippee, build fixed!
[03:58:22] <wmf-insecte>	 Project selenium-MultimediaViewer » firefox,mediawiki,Linux,contintLabsSlave && UbuntuTrusty build #140: 09FIXED in 2 min 21 sec: https://integration.wikimedia.org/ci/job/selenium-MultimediaViewer/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=mediawiki,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/140/
[04:16:41] <shinken-wm>	 PROBLEM - Puppet run on deployment-mathoid is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[04:24:16] <shinken-wm>	 PROBLEM - Puppet run on deployment-ores-redis is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[04:55:43] <shinken-wm>	 PROBLEM - Puppet run on deployment-mx is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[05:04:14] <shinken-wm>	 RECOVERY - Puppet run on deployment-ores-redis is OK: OK: Less than 1.00% above the threshold [0.0]
[05:21:43] <shinken-wm>	 RECOVERY - Puppet run on deployment-mathoid is OK: OK: Less than 1.00% above the threshold [0.0]
[05:25:06] <greg-g>	 bd808: (I only had 3 seconds before Terran was going to start crying with me at the computer, but I saw the test of adding channel support to stashbot's task logging of !log :) )
[05:30:46] <shinken-wm>	 RECOVERY - Puppet run on deployment-mx is OK: OK: Less than 1.00% above the threshold [0.0]
[07:29:33] <elukey>	 hashar: morning! Let me know when you are ready to start mw04 deployment + zuul
[07:46:35] <gehel>	 !log upgrading elasticsearch to 2.3.5 on deployment-elastic0? - T145404
[07:46:39] <qa-morebots>	 Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master
[07:52:41] <hashar>	 elukey: good morning
[07:52:51] <elukey>	 o/
[07:53:09] <hashar>	 I was trying to test mediawiki04 with something like  curl -H 'Host: en.wikipedia.beta.wmflabs.org' http://127.0.0.1/
[07:53:16] <hashar>	 yields "unconfigured domain"
[07:53:25] <hashar>	 then I found out some documentation mentionning  "furl"
[07:53:38] <hashar>	 that does fcgi queries directly to localhost, and apparently yields some page :]
[07:54:49] <elukey>	 hashar: yeah I saw your updates yesterday but I don't see any mw04 entry in the scap dsh list on tin-deployment :(
[07:55:04] <hashar>	 eg: furl http://en.wikipedia.beta.wmflabs.org/wiki/Special:Version
[07:55:23] <hashar>	 ah for dsh
[07:55:28] <hashar>	 I guess we want to cherry pick the patch you made
[07:55:49] <hashar>	 https://gerrit.wikimedia.org/r/#/c/310034/
[07:56:32] <elukey>	 we can even merge it in puppet, as you wish
[07:56:36] <elukey>	 I don't see huge problems 
[07:56:37] <hashar>	 that should add mw04 to the dshl files
[07:56:40] <hashar>	 yeah
[07:56:52] <elukey>	 (brb coffee - need to activate more neurons before starting :)
[08:06:47] <elukey>	 back :)
[08:07:20] <hashar>	 still here :]
[08:07:32] <hashar>	 so I guess
[08:07:35] <hashar>	 add to dsh https://gerrit.wikimedia.org/r/310034
[08:07:52] <hashar>	 rebase the git repo on deployment-puppetmaster.deployment-prep.eqiad.wmflabs  /var/lib/git/operations/puppet 
[08:07:57] <hashar>	 run puppet on tin
[08:07:59] <hashar>	 run scap
[08:08:17] <hashar>	 and should be good :D
[08:08:43] <elukey>	 hashar: let's merge the CR rather than doing this ok?
[08:09:02] <hashar>	 yeah
[08:09:25] <hashar>	 I have the habit of cherry picking since I cant merge to puppet.git :D
[08:09:33] <elukey>	 I don't have your experience in doing manual puppetmaster changes and I'd avoid to cause a big mess :)
[08:09:49] <hashar>	 once merged I will rebase
[08:09:57] <hashar>	 got a terminal open
[08:14:15] <hashar>	 rebased
[08:14:20] <hashar>	 running puppet on deployment-tin
[08:14:40] <elukey>	 super :)
[08:15:12] <hashar>	  deployment-mediawiki03.deployment-prep.eqiad.wmflabs
[08:15:12] <hashar>	 +deployment-mediawiki04.deployment-prep.eqiad.wmflabs
[08:15:12] <hashar>	  deployment-tin.deployment-prep.eqiad.wmflabs
[08:15:13] <hashar>	 neat
[08:15:34] <hashar>	 then going to run scap job from the list of Beta cluster Jenkins job on https://integration.wikimedia.org/ci/view/Beta/
[08:15:35] <hashar>	 ah
[08:15:39] <hashar>	 one is already running
[08:16:20] <hashar>	 00:01:36.327 08:16:14 ['/usr/bin/scap', 'pull', '--no-update-l10n'] on deployment-mediawiki04.deployment-prep.eqiad.wmflabs returned [255]: Host key verification failed.
[08:16:21] <hashar>	 bah
[08:16:28] <wmf-insecte>	 Project beta-scap-eqiad build #119826: 04FAILURE in 1 min 49 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/119826/
[08:17:35] <elukey>	 I started a manual scap pull on 04 to see if it works
[08:17:42] <hashar>	 !log beta: manually accepted ssh host key for deployment-mediawiki04 as user mwdeploy on deployment-tin and mira T144006
[08:17:45] <qa-morebots>	 Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master
[08:18:00] <hashar>	 the ssh issue is a dispredancy with prod
[08:18:08] <hashar>	 on prod we have the ssh host fingerprints collected by puppet
[08:18:19] <hashar>	 and published on the deployment servers in /etc/ssh/known_hosts automagically
[08:18:33] <wmf-insecte>	 Project beta-scap-eqiad build #119827: 04STILL FAILING in 1 min 45 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/119827/
[08:19:08] <hashar>	 and I missed something :(
[08:23:38] <hashar>	 ah
[08:23:44] <grrrit-wm>	 (03Draft1) 10Paladox: [mediawiki/extensions/UploadWizard] Add dependancy on EventLogger [integration/config] - 10https://gerrit.wikimedia.org/r/310238 
[08:23:49] <hashar>	 maybe that is deployment-tin key that needs to be accepted on mw04
[08:24:29] <elukey>	 what is the current error?
[08:25:01] <grrrit-wm>	 (03PS2) 10Paladox: [mediawiki/extensions/UploadWizard] Add dependancy on EventLogger [integration/config] - 10https://gerrit.wikimedia.org/r/310238 
[08:25:02] <hashar>	 from https://integration.wikimedia.org/ci/view/Beta/job/beta-scap-eqiad/119827/console
[08:25:10] <hashar>	 00:01:33.056 sync-apaches:   0% (ok: 0; fail: 0; left: 6)                                    
[08:25:10] <hashar>	 00:01:33.099 08:18:21 ['/usr/bin/scap', 'pull', '--no-update-l10n'] on deployment-mediawiki04.deployment-prep.eqiad.wmflabs returned [255]: Host key verification failed.
[08:26:15] <hashar>	 !log  mwdeploy@deployment-mediawiki04  manually accepted ssh host key of deployment-tin  T144006
[08:26:18] <qa-morebots>	 Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master
[08:26:25] <hashar>	 looks like that solved it
[08:26:32] <hashar>	 arhh no
[08:26:38] <wmf-insecte>	 Project beta-scap-eqiad build #119828: 04STILL FAILING in 1 min 51 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/119828/
[08:28:21] <hashar>	 Sep 13 08:28:13 deployment-mediawiki04 sshd[6076]: Connection from 10.68.17.240 port 49169 on 10.68.19.128 port 22
[08:28:21] <hashar>	 Sep 13 08:28:13 deployment-mediawiki04 sshd[6076]: Connection closed by 10.68.17.240 [preauth]
[08:28:39] <hashar>	 so on deployment-tin I am running 'scap sync-wikiversions --verbose'
[08:28:45] <hashar>	 which does ssh to the mw host
[08:29:02] <hashar>	 then on deployment-mediawiki04  in /var/log/auth.log  there is a :  Connection closed by 10.68.16.210 [preauth]
[08:29:43] <hashar>	 I guess some ssh key is missing
[08:32:30] <hashar>	 so somehow
[08:32:44] <hashar>	 mwdeploy@deployment-mediawiki04  lacks a ~/.ssh/authorized_keys
[08:34:02] <hashar>	 there is one on deployment-mediawiki02 though
[08:34:10] <hashar>	 guess that got manually hacked and is not in puppet :((
[08:35:33] <elukey>	 mmmm
[08:35:52] <hashar>	 yeah welcome to a huge stack of mess :(
[08:36:02] <hashar>	 https://wikitech.wikimedia.org/wiki/Hiera:Deployment-prep  has a  "mediawiki::users::mwdeploy_pub_key"
[08:36:19] <wmf-insecte>	 Project beta-scap-eqiad build #119829: 04STILL FAILING in 1 min 47 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/119829/
[08:36:32] <hashar>	 which is also in puppet.git hieradata/labs/deployment-prep/common.yaml
[08:36:43] <hashar>	 but that pub key does not match what is on deployment-mediawiki02
[08:36:51] <hashar>	 and it is not referenced anywhere in puppet manifests
[08:39:03] <hashar>	 some puppet class is missing :(
[08:39:13] <hashar>	 beta::deployaccess at least
[08:39:23] <hashar>	 that tweaks /etc/security/access.conf to allow deployment-tin to ssh
[08:41:09] <elukey>	 so the mwdeploy ssh key in hiera is stored in /etc/ssh/userkeys/mwdeploy
[08:41:36] <hashar>	 (forget the /etc/security stuff, it is there actually :D )
[08:42:22] <hashar>	 ah the key at /etc/ssh/userkeys/mwdeploy
[08:42:35] <hashar>	 does not match the one on deployment-mediawiki02 /home/mwdeploy/.ssh/authorized_keys
[08:43:07] <hashar>	 and i have no idea whether it is actually used / looked up 
[08:46:21] <wmf-insecte>	 Project beta-scap-eqiad build #119830: 04STILL FAILING in 1 min 45 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/119830/
[08:46:26] <hashar>	 I am stuck :(
[08:46:45] <elukey>	 I am reading /etc/ssh/sshd_config
[08:46:49] <elukey>	 a bit weird
[08:46:50] <elukey>	 mmm
[08:48:31] <hashar>	 digging in the ssh pub keys mess :]
[08:48:42] <hashar>	 looks like mediawiki::users::mwdeploy_pub_key" is no more used
[08:49:10] <hashar>	 and instead we rely on  secret('keyholder/mwdeploy.pub')
[08:52:07] <hashar>	 so I got puppet class mediawiki::users
[08:52:20] <hashar>	 invoking  ssh::userkey { 'mwdeploy': content => secret('keyholder/mwdeploy.pub'), }
[08:52:41] <hashar>	 which seems to push the key to /etc/ssh/userkeys/mwdeploy
[08:53:07] <hashar>	 and on mediawiki04 that file has the content from the private puppet repo
[08:55:42] <elukey>	 hashar: can you retry the deployment?
[08:56:07] <hashar>	 something running on https://integration.wikimedia.org/ci/view/Beta/job/beta-scap-eqiad/119831/console
[08:56:22] <hashar>	 Connection closed by 10.68.17.240 [preauth]
[08:56:23] <wmf-insecte>	 Project beta-scap-eqiad build #119831: 04STILL FAILING in 1 min 49 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/119831/
[08:56:34] <hashar>	 could it be that /etc/security/ rules are not applied?
[08:58:11] <elukey>	 I am always confused with keyholder and ssh keys bootstrap
[08:58:22] <hashar>	 I am too :(
[08:58:54] <hashar>	 the idea is to have the key loaded by a daemon (keyholder)
[08:58:58] <hashar>	 which exposes a ssh proxy
[08:59:23] <hashar>	 at least:  SSH_AUTH_SOCK=/run/keyholder/proxy.sock ssh deployment-mediawiki04.deployment-prep.eqiad.wmflabs
[08:59:25] <hashar>	 works :]
[08:59:33] <hashar>	 as mwdeploy user
[09:00:35] <hashar>	 elukey: I think I found the issue
[09:01:15] <elukey>	 so with sudo -u mwdeploy SSH_AUTH_SOCK=/run/keyholder/proxy.sock ssh  deployment-mediawiki04.deployment-prep.eqiad.wmflabs -v it works
[09:01:25] <hashar>	 on deployment-tin:   sudo -u jenkins-deploy -H SSH_AUTH_SOCK=/run/keyholder/proxy.sock ssh mwdeploy@deployment-mediawiki04.deployment-prep.eqiad.wmflabs
[09:01:25] <elukey>	 ah yes :D
[09:01:29] <hashar>	 then manually accepted the key
[09:01:46] <elukey>	 ahhh the Jenkins identity!
[09:01:52] <elukey>	 makes senssseeeee
[09:02:03] <elukey>	 we were looking in the wrong way
[09:02:05] <hashar>	 !log on deployment-tin, accepted mediawiki04 host key for jenkins-deploy user : sudo -u jenkins-deploy -H SSH_AUTH_SOCK=/run/keyholder/proxy.sock ssh mwdeploy@deployment-mediawiki04.deployment-prep.eqiad.wmflabs  T144006 
[09:02:09] <qa-morebots>	 Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master
[09:02:10] <elukey>	 nice! 
[09:02:11] <hashar>	 took me a while 
[09:02:24] <wmf-insecte>	 Yippee, build fixed!
[09:02:24] <wmf-insecte>	 Project beta-scap-eqiad build #119832: 09FIXED in 1 min 47 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/119832/
[09:02:28] <elukey>	 makes complete sense when you get it :D
[09:02:29] <elukey>	 \o/
[09:02:39] * hashar updates the wikitech page
[09:03:20] <elukey>	 link please when you have finished :)
[09:04:17] <hashar>	 https://wikitech.wikimedia.org/wiki/Keyholder
[09:04:18] <hashar>	 bottom
[09:05:08] <elukey>	 oh keyholder's doc
[09:05:23] <elukey>	 We may want to build a doc for mw in deployment-prep
[09:05:24] <wikibugs>	 10Beta-Cluster-Infrastructure, 06Operations, 07HHVM, 13Patch-For-Review: Move the MW Beta appservers to Debian - https://phabricator.wikimedia.org/T144006#2632056 (10hashar)
[09:05:29] <hashar>	 and I have added the magic command on the task
[09:05:30] <elukey>	 but it is a stretch goal :)
[09:05:41] <hashar>	 yeah sorry :(
[09:05:47] <hashar>	 there are a lot of confusing bits
[09:06:20] <elukey>	 ah and I just discovered that I need to test http://en.wikipedia.beta.wmflabs.org/wiki/Main_Page
[09:06:27] <elukey>	 not wikimedia.org
[09:06:28] <hashar>	 I am creating a task for mediawiki::users::mwdeploy_pub_key
[09:06:32] <elukey>	 that MAKES SENSE but I didn't know
[09:06:47] <elukey>	 ok the a curl looks fine
[09:07:09] <elukey>	 hashar: we could replace 01 with 04 and see how it goes
[09:07:17] <elukey>	 I'll leave 01 up
[09:07:25] <elukey>	 and I'll nuke it tomorrow if nothing comes u
[09:07:27] <elukey>	 *up
[09:07:38] <elukey>	 or I can merge https://gerrit.wikimedia.org/r/#/c/310035/
[09:07:39] <elukey>	 wait
[09:07:50] <elukey>	 then replace 01 with 04
[09:07:53] <elukey>	 wait
[09:07:55] <elukey>	 nuke 01
[09:08:02] <elukey>	 repeat with 02/05 :D
[09:08:37] <elukey>	 let's do the second one
[09:08:38] <elukey>	 safer
[09:08:49] <elukey>	 hashar: ready to merge https://gerrit.wikimedia.org/r/#/c/310035/
[09:11:03] <elukey>	 done
[09:11:29] <hashar>	 filled https://phabricator.wikimedia.org/T145495
[09:11:29] <wikibugs>	 10Beta-Cluster-Infrastructure, 10Deployment-Systems: mediawiki::users::mwdeploy_pub_key hiera key should be purge - https://phabricator.wikimedia.org/T145495#2632078 (10hashar)
[09:11:32] <hashar>	 to clean up hiera
[09:12:07] <hashar>	 sorry for the delay elukey  :(
[09:12:28] <hashar>	 I have rebased puppet repo
[09:12:58] <elukey>	 thanks!!
[09:13:18] <hashar>	 so now that mw04 is added in hiera cache::text::apps
[09:13:34] <elukey>	 would you mind to add the steps to rebase deployment-tin in the phab description?
[09:13:35] <hashar>	 I am going to force run puppet on the varnish text cache deployment-cache-text04
[09:13:44] <elukey>	 okok!
[09:16:52] <wikibugs>	 10Beta-Cluster-Infrastructure, 06Operations, 07HHVM, 13Patch-For-Review: Move the MW Beta appservers to Debian - https://phabricator.wikimedia.org/T144006#2632096 (10hashar)
[09:17:00] <hashar>	 ran puppet
[09:17:06] <shinken-wm>	 PROBLEM - Puppet run on deployment-conf03 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[09:18:16] <hashar>	 looking at t-backend]/Varnish::Wikimedia_vcl[/usr/share/varnish/tests/wikimedia-common_text-backend.inc.vcl]/File[/usr/share/varnish/tests/wikimedia-common_text-backend.inc.vcl]/content: content changed '{md5}d94269f8184d7176a65599c90be53c71' to '{md5}e4e991826ed6b4f20e75455a40c092b3'
[09:18:16] <hashar>	 Notice: /Stage[main]/Role::Cache::2layer/Salt::Grain[varnish_version]/Exec[ensure_varnish_version_3]/returns: executed successfully
[09:18:16] <hashar>	 Notice: /Stage[main]/Role::Cache::Ssl::Unified/Tlsproxy::Localssl[unified]/Letsencrypt::Cert::Integrated[beta.wmflabs.org]/Exec[acme-setup-acme-beta_wmflabs_org]/returns: executed successfully
[09:18:16] <hashar>	 Notice: /Stage[main]/Role::Cache::Text/Role::Cache::Instances[text]/Varnish::Instance[text-backend]/Exec[load-new-vcl-file]: Triggered 'refresh' from 1 events
[09:18:16] <hashar>	 Notice: /Stage[main]/Confd/Base::Service_unit[confd]/Service[confd]/ensure: ensure changed 'stopp
[09:18:18] <hashar>	 azearazea
[09:19:18] <hashar>	 I am looking at https://logstash-beta.wmflabs.org/  to see whether some traffic is received
[09:22:40] <hashar>	 elukey: it works !!! :]
[09:22:42] <wikibugs>	 10Beta-Cluster-Infrastructure, 06Operations, 07HHVM, 13Patch-For-Review: Move the MW Beta appservers to Debian - https://phabricator.wikimedia.org/T144006#2632106 (10hashar) After some mess with scap mwdeploy keys solved by running on deployment-tin:      sudo -u jenkins-deploy -H SSH_AUTH_SOCK=/run/keyhol...
[09:23:01] <elukey>	  niceeeee
[09:23:45] <hashar>	 and I have updated the task description to list the commands
[09:23:48] <elukey>	 where are you looking (for curiosity) 
[09:23:56] <elukey>	 super
[09:23:57] <hashar>	 the one to manually accept the host key to please jenkins
[09:24:03] <hashar>	 bits for rebasing puppet
[09:24:12] <hashar>	 then the puppet run on varnish + service varnish reload
[09:25:08] <elukey>	 awesome!
[09:25:12] <wikibugs>	 10Beta-Cluster-Infrastructure, 06Operations, 07HHVM, 13Patch-For-Review: Move the MW Beta appservers to Debian - https://phabricator.wikimedia.org/T144006#2632108 (10hashar)
[09:25:25] <hashar>	 yeah
[09:25:40] <hashar>	 and since prod already has mw server on Jessie, at least the puppet part is straightforward :D
[09:26:02] <hashar>	 there are some oddities in the log though
[09:26:10] <hashar>	 Warning: failed to mkdir "/srv/mediawiki/php-master/images/thumb/2/20/Order_of_St_John_(UK)_ribbon.png" mode 0777 [Called from wfMkdirParents in /srv/mediawiki/php-master/includes/Glob
[09:26:51] <wikibugs>	 10Beta-Cluster-Infrastructure, 06Operations, 07HHVM, 13Patch-For-Review: Move the MW Beta appservers to Debian - https://phabricator.wikimedia.org/T144006#2586022 (10hashar) Noticed in logstash:      Warning: failed to mkdir "/srv/mediawiki/php-master/images/thumb/2/20/Order_of_St_John_(UK)_ribbon.png"...
[09:28:20] <hashar>	 most probably unrelated
[09:29:07] <elukey>	 looks a bit weird yes
[09:29:36] <hashar>	 yeah that is a different issue entirely
[09:29:42] <elukey>	 what is your preference? Wait a day and then remove 01 or proceed now?
[09:30:06] <elukey>	 I meant s/01/04/g leaving 01 running
[09:30:09] <elukey>	 in case of fire
[09:30:24] <hashar>	 yeah
[09:30:27] <hashar>	 sounds good
[09:32:09] <hashar>	 looks like the thumbnailling configuration is off
[09:34:02] <wikibugs>	 10Beta-Cluster-Infrastructure: beta cluster: Warning: failed to mkdir "/srv/mediawiki/php-master/images/thumb/... - https://phabricator.wikimedia.org/T145496#2632115 (10hashar)
[09:34:28] <wikibugs>	 10Beta-Cluster-Infrastructure, 06Operations, 07HHVM, 13Patch-For-Review: Move the MW Beta appservers to Debian - https://phabricator.wikimedia.org/T144006#2632143 (10hashar) >>! In T144006#2632108, @hashar wrote: > Noticed in logstash: >  >     Warning: failed to mkdir "/srv/mediawiki/php-master/images/thu...
[09:35:36] <elukey>	 hashar: https://gerrit.wikimedia.org/r/#/c/310256
[09:35:55] <shinken-wm>	 PROBLEM - Puppet run on deployment-db03 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[09:37:05] <hashar>	 lets do it :]
[09:37:22] <hashar>	 and shutdown deployment-mediawiki01
[09:39:28] <elukey>	 merged!
[09:39:48] <hashar>	 rebasing
[09:40:04] <hashar>	 puppet + varnish reload
[09:40:45] <elukey>	 \o/
[09:40:49] <hashar>	 !log Unpooled deployment-mediawiki01 from scap and varnish. Shutting down instance.  T144006
[09:40:52] <qa-morebots>	 Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master
[09:41:18] <hashar>	 the whole process is a bit tedious :(
[09:42:20] <elukey>	 I think that 02 will go a lot quicker since we know exactly what to do
[09:43:22] <shinken-wm>	 PROBLEM - Puppet staleness on deployment-db1 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [43200.0]
[09:44:04] <shinken-wm>	 PROBLEM - Host deployment-mediawiki01 is DOWN: CRITICAL - Host Unreachable (10.68.17.170)
[09:45:07] <elukey>	 mmm
[09:46:00] <hashar>	 I have shut it down
[09:46:04] <elukey>	 ahhahaha
[09:46:19] <elukey>	 okok I missed the shutdown part in the log message
[09:46:22] <hashar>	 keep it around just in case then it can be deleted :D
[09:47:54] <elukey>	 okok so let's do 02/05 tomorrow (maybe 03/06 too if we have time)
[09:48:08] <elukey>	 and the jobrunner/videoscaler the day after
[09:49:10] <hashar>	 oh also
[09:49:12] <hashar>	 about the quota
[09:49:24] <hashar>	 Dan / Jaime are going to migrate the beta cluster databases this afternoon
[09:49:34] <hashar>	 which will soon freeup 2 x 8 CPUs :]
[09:50:14] <elukey>	 very nice :)
[09:51:30] <hashar>	 ah
[09:51:32] <hashar>	 nutcracker
[09:52:05] <shinken-wm>	 RECOVERY - Puppet run on deployment-conf03 is OK: OK: Less than 1.00% above the threshold [0.0]
[09:54:45] <elukey>	 is there a special config file in puppet that I didn't get?
[09:55:03] <shinken-wm>	 PROBLEM - Puppet staleness on deployment-db2 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [43200.0]
[09:58:14] <elukey>	 hashar: I am about to upload zuul to reprepro with reprepro -C thirdparty include jessie-wikimedia ~elukey/zuul/zuul_2.5.0-8-gcbc7f62-wmf2jessie1/zuul_2.5.0-8-gcbc7f62-wmf2jessie1_amd64.changes
[09:58:24] <elukey>	 (on carbon)
[09:59:00] <wmf-insecte>	 Project beta-scap-eqiad build #119837: 04FAILURE in 14 min: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/119837/
[10:00:36] <hashar>	 !log Upgrading beta cluster jobrunner to catch up with upstream b952a7c..0dc341f  merely picking up a trivial log change ( https://gerrit.wikimedia.org/r/#/c/297935/ )
[10:00:40] <qa-morebots>	 Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master
[10:01:03] <hashar>	 that is when Trebuchet is going to explode entirely
[10:02:36] <hashar>	 !log Trebuchet is broken for /srv/deployment/jobrunner/jobrunner  cant reach the deploy minions somehow.  Did the update manually
[10:02:40] <qa-morebots>	 Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master
[10:04:39] <wmf-insecte>	 Project beta-scap-eqiad build #119838: 04STILL FAILING in 4 min 4 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/119838/
[10:06:15] <hashar>	 !log beta: manually updated  jobrunner install on deployment-jobrunner01 and deployment-tmh01 then reloaded the services with:  service jobchron reload
[10:06:19] <qa-morebots>	 Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master
[10:06:41] <hashar>	 elukey: sorry back around
[10:06:46] <hashar>	 so yeah we can upgrade zuul on scandium
[10:06:56] <hashar>	 there is a single service running there: zuul-merger
[10:07:03] <elukey>	 ok uploading to reprpreo
[10:07:07] <hashar>	 which is given a Gerrit patchset reference to be merged against the tip of the branc
[10:07:08] <elukey>	 *reprrepro
[10:07:10] <hashar>	 does the git merge
[10:07:21] <hashar>	 then report back with the resulting commit. That is then send to Jenkins for testing
[10:07:51] <hashar>	 that causes a slight delay in CI processing, but nothing to worry about
[10:07:55] <wmf-insecte>	 Project beta-scap-eqiad build #119839: 04STILL FAILING in 1 min 41 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/119839/
[10:09:24] <hashar>	 bah dsh group is out of date on tin. Running puppet
[10:10:23] <hashar>	 /etc/dsh/group/mediawiki-appserver-canaries:deployment-mediawiki01.deployment-prep.eqiad.wmflabs
[10:10:31] <hashar>	 elukey: mediawiki01 is still a canary apparently
[10:10:52] <shinken-wm>	 RECOVERY - Puppet run on deployment-db03 is OK: OK: Less than 1.00% above the threshold [0.0]
[10:11:12] <elukey>	 hashar: checking
[10:11:18] <elukey>	 also I discovered something weird
[10:11:37] <elukey>	 https://people.wikimedia.org/~hashar/debs/zuul_2.5.0-8-gcbc7f62-wmf2jessie1/ - the changelog lists precise-wikimedia, not jessie-wikimedia
[10:11:58] <elukey>	 so when I try to upload it asks me for the --ignore=wrong-distribution
[10:12:09] <elukey>	 that should be avoided if possible :)
[10:12:38] <hashar>	 https://gerrit.wikimedia.org/r/310264
[10:12:44] <hashar>	 will fix mw01
[10:12:45] <hashar>	 ah 
[10:12:54] <hashar>	 well it is definitely meant for jessie-wikimedia
[10:12:57] <hashar>	 and not for jessie
[10:13:08] <hashar>	 so I guess ignore it ?  Not sure what reprepro doc says about it
[10:14:02] <elukey>	 I know that you are going to kill me but can you please correct the distribution in the pkg?
[10:14:05] <hashar>	 can double check with  reprepro ls zuul
[10:14:09] <elukey>	 (merging the CR)
[10:14:52] * hashar shoots elukey :D
[10:15:27] <elukey>	 (merged :)
[10:16:20] <wmf-insecte>	 Project beta-scap-eqiad build #119840: 04STILL FAILING in 1 min 43 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/119840/
[10:16:33] <hashar>	 rebase + puppet run on tin
[10:17:33] <hashar>	 regarding the zuul package.  To build it I need  the packages from jessie-wikimedia 
[10:17:37] <hashar>	 it does not build against just jessie
[10:17:41] <shinken-wm>	 PROBLEM - Puppet run on deployment-mathoid is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[10:17:55] <hashar>	 cause we backported/added a bunch of python modules in jessie-wikimedia
[10:18:18] <hashar>	 "reprepro ls zuul"  should list precise-wikimedia
[10:18:25] <hashar>	 so I guess you want to --ignore=wrong-distribution
[10:18:35] <hashar>	 but maybe it is better to double check with the reprepro gurus :]
[10:20:20] <elukey>	 root@carbon:/srv/wikimedia# reprepro ls zuul
[10:20:20] <elukey>	 zuul | 2.1.0-60-g1cc37f7-wmf4precise1 | precise-wikimedia | amd64, source
[10:20:23] <elukey>	 zuul | 2.1.0-60-g1cc37f7-wmf4trusty1 |  trusty-wikimedia | amd64, source
[10:20:26] <elukey>	 zuul | 2.1.0-60-g1cc37f7-wmf4jessie1 |  jessie-wikimedia | amd64, source
[10:20:48] <wmf-insecte>	 Yippee, build fixed!
[10:20:48] <wmf-insecte>	 Project beta-scap-eqiad build #119841: 09FIXED in 1 min 45 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/119841/
[10:23:03] <shinken-wm>	 PROBLEM - Puppet run on deployment-zotero01 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[10:26:04] <hashar>	 scap is all good
[10:26:19] <hashar>	 elukey: so I guess  --ignore=wrong-distribution
[10:26:27] <hashar>	 not sure why the -wikimedia ones are not whitelisted though
[10:26:44] <shinken-wm>	 PROBLEM - Puppet run on deployment-mx is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[10:28:33] <hashar>	 "" If your package was specifically built for wikimedia and does not have a distribution of *-wikimedia listed in the .changes file, then you should force reprepro to accept a *-wikimedia distribution. (You'll probably want precise-wikimedia). Add the --ignore=wrongdistribution flag to the reprepro command to do so. """
[10:28:41] <hashar>	 that is from https://wikitech.wikimedia.org/wiki/Reprepro
[10:28:57] <elukey>	 okok will read it in a bit :)
[10:29:04] <hashar>	 so should be in .changes
[10:29:45] <hashar>	 Version: 2.5.0-8-gcbc7f62-wmf2jessie1
[10:29:45] <hashar>	 Distribution: precise-wikimedia
[10:29:46] <hashar>	 eek
[10:29:55] <elukey>	 but what is your problem of building with jessie-wikimedia? the access to copper to build?
[10:29:56] <hashar>	 looks like I screwed it up :D
[10:30:30] <hashar>	 oh I build on my local machine
[10:30:39] <hashar>	 with the magic hook from the package_builder puppet module
[10:30:41] <hashar>	 works like a charm
[10:31:39] <elukey>	 ah ok so the changes file looks weird
[10:31:50] <hashar>	 yeah
[10:31:54] <hashar>	 and I screwed it up sorry :(
[10:31:59] <hashar>	 the debian/changelog has the wrong distro
[10:32:14] <elukey>	 okok so I am not drunk :D
[10:32:24] <hashar>	 sorry I have misunderstood
[10:32:38] <hashar>	 I thought reprepro was complaining because of 'jessie-wikimedia' distribution name
[10:33:02] <shinken-wm>	 PROBLEM - Puppet run on deployment-elastic07 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[10:33:53] <grrrit-wm>	 (03PS1) 10Hashar: Merge commit 'debian/precise/wikimedia' into debian/jessie-wikimedia [integration/zuul] (debian/jessie-wikimedia) - 10https://gerrit.wikimedia.org/r/310267 
[10:34:51] <hashar>	 I am rebuilding it
[10:36:03] <elukey>	 ahh sorry I didn't explain myself correctly 
[10:37:02] <hashar>	 Version: 2.5.0-8-gcbc7f62-wmf2jessie1
[10:37:02] <hashar>	 Distribution: jessie-wikimedia
[10:37:03] <hashar>	 better
[10:37:07] <hashar>	 pushed that to people.wm.o
[10:38:30] <hashar>	 elukey: refreshed on https://people.wikimedia.org/~hashar/debs/zuul_2.5.0-8-gcbc7f62-wmf2jessie1/
[10:38:39] <hashar>	 and the rebuild is a noop according to debdiff \o/
[10:39:00] <hashar>	 I should probaqbly have bumped it to jessie2  but I am lazy :D
[10:40:09] <wikibugs>	 10Continuous-Integration-Infrastructure (phase-out-gallium), 06Operations: Upgrade Zuul on scandium.eqiad.wmnet  (Jessie zuul-merger) - https://phabricator.wikimedia.org/T145057#2632303 (10hashar)
[10:40:28] <grrrit-wm>	 (03PS2) 10Hashar: Merge commit 'debian/precise/wikimedia' into debian/jessie-wikimedia [integration/zuul] (debian/jessie-wikimedia) - 10https://gerrit.wikimedia.org/r/310267 (https://phabricator.wikimedia.org/T145057) 
[10:45:17] <elukey>	 hashar: done :)
[10:45:23] <elukey>	 now I am going to upgrade scandium
[10:45:24] <elukey>	 ok?
[10:45:33] <hashar>	 yup
[10:45:46] <hashar>	 I am watching its debug log  /var/log/zuul/merger-debug.log
[10:46:04] <hashar>	 iirc the package restart the service
[10:46:33] <hashar>	 sorry for the Distribution:  mess :-(
[10:47:18] <shinken-wm>	 RECOVERY - Host deployment-parsoid05 is UP: PING OK - Packet loss = 0%, RTA = 0.79 ms
[10:47:38] <grrrit-wm>	 (03CR) 10Hashar: [C: 032] Merge commit 'debian/precise/wikimedia' into debian/jessie-wikimedia [integration/zuul] (debian/jessie-wikimedia) - 10https://gerrit.wikimedia.org/r/310267 (https://phabricator.wikimedia.org/T145057) (owner: 10Hashar)
[10:47:50] <elukey>	 I am learning so it is all good experience :)
[10:47:58] <elukey>	 zuul upgraded!
[10:48:16] <hashar>	 hashar@scandium:~$ zuul --version
[10:48:16] <hashar>	 Zuul version: 2.5.0-8-gcbc7f62-wmf2jessie1
[10:48:40] <hashar>	 and hopefully the .deb install has restarted the service
[10:48:59] <hashar>	 it is processing a change
[10:49:47] <elukey>	 Sep 13 10:47:18 scandium systemd[1]: Starting LSB: Zuul...
[10:49:47] <elukey>	 Sep 13 10:47:18 scandium zuul[28898]: Zuul Server: /etc/default/zuul is not set to START_DAEMON=1: exiting: failed!
[10:49:50] <elukey>	 Sep 13 10:47:18 scandium systemd[1]: Started LSB: Zuul.
[10:49:54] <hashar>	 yeah that is a bit lame
[10:50:01] <hashar>	 the package install both services (server and merger)
[10:50:10] <hashar>	 and there is START_DAEMON to prevent the service from starting
[10:50:22] <hashar>	 so that zuul-server error is expected
[10:50:26] <elukey>	 super
[10:50:42] <hashar>	 one of ops told me that the package should probably be split in several binary packages
[10:50:49] <hashar>	 eg  zuul-common   zuul-merger  zuul-server etc
[10:50:57] <hashar>	 so scandium would just have "zuul-merger"
[10:51:22] <grrrit-wm>	 (03CR) 10Hashar: "recheck" [integration/zuul] (debian/jessie-wikimedia) - 10https://gerrit.wikimedia.org/r/310267 (https://phabricator.wikimedia.org/T145057) (owner: 10Hashar)
[10:51:40] <hashar>	 elukey: looks all fine to me. Thank you very much :]
[10:51:42] <wikibugs>	 10Continuous-Integration-Infrastructure, 10Packaging, 07Zuul: Package / puppetize zuul-clear-refs.py - https://phabricator.wikimedia.org/T103529#2632326 (10elukey)
[10:51:44] <wikibugs>	 10Continuous-Integration-Infrastructure (phase-out-gallium), 06Operations, 13Patch-For-Review: Upgrade Zuul on scandium.eqiad.wmnet  (Jessie zuul-merger) - https://phabricator.wikimedia.org/T145057#2632323 (10elukey) 05Open>03Resolved a:03elukey Package installed and uploaded to jessie-wikimedia/thirdp...
[10:52:09] <elukey>	 goooood
[10:52:26] <elukey>	 let's resync tomorrow morning for the other mw hosts :)
[10:52:26] <hashar>	 that was also a blocker to phase out gallium (Precise node)
[10:52:32] <elukey>	 thanks you for the help!
[10:52:35] <hashar>	 since I needed an up to date zuul package for jessie
[10:52:42] <shinken-wm>	 RECOVERY - Puppet run on deployment-mathoid is OK: OK: Less than 1.00% above the threshold [0.0]
[10:52:46] <elukey>	 nuke all the precises! 
[10:52:48] <elukey>	 :D
[10:54:27] <hashar>	 yeah gotta write the step-by-step migration plan :]
[10:55:35] <hashar>	 bah I screwed it up again :(
[10:56:07] <hashar>	 that will be for another day.  Nothing of importance
[10:58:05] <shinken-wm>	 RECOVERY - Puppet run on deployment-zotero01 is OK: OK: Less than 1.00% above the threshold [0.0]
[11:00:29] <wikibugs>	 03Scap3, 10Citoid, 06Services, 10VisualEditor, 15User-mobrovac: Enable Scap3 config deploys for Citoid - https://phabricator.wikimedia.org/T144597#2632347 (10mobrovac) 05Open>03Resolved
[11:01:46] <shinken-wm>	 RECOVERY - Puppet run on deployment-mx is OK: OK: Less than 1.00% above the threshold [0.0]
[11:03:53] <hashar>	 I love open source
[11:03:59] <hashar>	 # Copyright 2015 BMW Car IT GmbH
[11:06:23] <wikibugs>	 10Continuous-Integration-Infrastructure, 10Packaging, 07Zuul: Package / puppetize zuul-clear-refs.py - https://phabricator.wikimedia.org/T103529#2632352 (10hashar) 2.5.0-8-gcbc7f62-wmf2jessie1 has been deployed on scandium.eqiad.wmnet (the zuul-merger) but I screwed it up.  The python shebang needs to be adj...
[11:07:29] <aude>	 hashar: any idea why jenkins is not running on https://gerrit.wikimedia.org/r/#/c/308444/ ?
[11:09:45] <zeljkof>	 hashar: I am pretty sure the drawing is wrong here :( https://www.mediawiki.org/wiki/Selenium/mwext-mw-selenium_Jenkins_job
[11:10:02] <zeljkof>	 I am not not sure how the Jenkins infrastructure is set up internally
[11:11:36] <grrrit-wm>	 (03PS1) 10Hashar: New release 2.5.0-8-gcbc7f62-wmf3precise1 [integration/zuul] (debian/precise-wikimedia) - 10https://gerrit.wikimedia.org/r/310277 (https://phabricator.wikimedia.org/T103529) 
[11:12:32] <hashar>	 aude: dependency issue
[11:12:48] <hashar>	 aude: there is a cycle in the chain of Depends-On somewhere. I hav ecommented on one of the patches
[11:13:00] <shinken-wm>	 RECOVERY - Puppet run on deployment-elastic07 is OK: OK: Less than 1.00% above the threshold [0.0]
[11:13:41] <hashar>	 aude: ah yeah that is in https://gerrit.wikimedia.org/r/#/c/308422/
[11:13:53] <hashar>	 it depends on https://gerrit.wikimedia.org/r/#/c/308801/
[11:14:00] <hashar>	 and that patch depends back to 308422
[11:14:16] <aude>	 oh
[11:14:18] <hashar>	 given 308801 has been abandonned, it should be removed from list of Depends-On:
[11:14:24] <aude>	 k
[11:14:28] <hashar>	 the very confusing thing is that Zuul just skip entirely on such cycles
[11:14:31] <hashar>	 when really, it should report back
[11:16:19] <grrrit-wm>	 (03CR) 10Hashar: [C: 032] New release 2.5.0-8-gcbc7f62-wmf3precise1 [integration/zuul] (debian/precise-wikimedia) - 10https://gerrit.wikimedia.org/r/310277 (https://phabricator.wikimedia.org/T103529) (owner: 10Hashar)
[11:20:21] <shinken-wm>	 PROBLEM - Puppet run on deployment-salt02 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[11:21:15] <aude>	 seems good now
[11:25:06] <hashar>	 elukey: do you remember about a zuul.changes that was missing the orig.tar.gz  ?
[11:25:15] <hashar>	 that is actually normal behavior whenever the debian version is > 1
[11:25:25] <hashar>	 they are assuming that debian version 2  is a second upload
[11:25:30] <hashar>	 and thus the orig tarball is already in the repo
[11:25:38] <hashar>	 thus there is no point in uploading the orig.tar.gz
[11:26:02] <hashar>	 and I am hitting that because I first do a bunch of debian version 1..n  for precise
[11:26:07] <shinken-wm>	 PROBLEM - Host deployment-parsoid05 is DOWN: CRITICAL - Host Unreachable (10.68.16.120)
[11:26:08] <hashar>	 then build version n for jessie
[11:26:15] <elukey>	 ah ok
[11:26:17] <hashar>	 when jessie never had version 1
[11:26:21] <hashar>	 so it get confused
[11:26:44] <hashar>	 elukey: source https://www.logilab.org/ticket/22071 :]
[11:27:16] <hashar>	 and it must be somewhere in book 32 chapter 11 section 37 paragraph 12  of the Debian Policy (see page 39019 )
[11:27:19] <hashar>	 (I am ranting)
[11:29:11] <elukey>	 ahahha
[11:29:52] <elukey>	 I still didn't get the whole thing but probably I need to eat something and drink coffee
[11:34:17] <elukey>	 I am not getting why version > 2 should have the orig.tar.gz in the repo
[11:34:37] <elukey>	 anyhow, will research after lunch :)
[11:34:42] <elukey>	 thanks for the link
[11:36:02] <grrrit-wm>	 (03PS1) 10Hashar: Merge branch 'debian/precise-wikimedia' into debian/jessie-wikimedia [integration/zuul] (debian/jessie-wikimedia) - 10https://gerrit.wikimedia.org/r/310284 
[11:36:04] <grrrit-wm>	 (03PS1) 10Hashar: New release 2.5.0-8-gcbc7f62-wmf3jessie1 [integration/zuul] (debian/jessie-wikimedia) - 10https://gerrit.wikimedia.org/r/310285 (https://phabricator.wikimedia.org/T103529) 
[11:36:41] <hashar>	 yeah gotta lunch as well
[11:37:00] <grrrit-wm>	 (03PS1) 10Zfilipin: WIP Marionette [selenium] - 10https://gerrit.wikimedia.org/r/310286 (https://phabricator.wikimedia.org/T137540) 
[11:37:18] <hashar>	 then I will look at the Xkcd timeline of earth temperatures (  https://xkcd.com/1732/  )
[11:40:58] <grrrit-wm>	 (03CR) 10jenkins-bot: [V: 04-1] WIP Marionette [selenium] - 10https://gerrit.wikimedia.org/r/310286 (https://phabricator.wikimedia.org/T137540) (owner: 10Zfilipin)
[11:41:50] <grrrit-wm>	 (03CR) 10Hashar: [C: 032] Merge branch 'debian/precise-wikimedia' into debian/jessie-wikimedia [integration/zuul] (debian/jessie-wikimedia) - 10https://gerrit.wikimedia.org/r/310284 (owner: 10Hashar)
[11:41:53] <grrrit-wm>	 (03CR) 10Hashar: [C: 032] New release 2.5.0-8-gcbc7f62-wmf3jessie1 [integration/zuul] (debian/jessie-wikimedia) - 10https://gerrit.wikimedia.org/r/310285 (https://phabricator.wikimedia.org/T103529) (owner: 10Hashar)
[11:43:52] <wikibugs>	 10Continuous-Integration-Infrastructure, 10Packaging, 13Patch-For-Review, 07Zuul: Package / puppetize zuul-clear-refs.py - https://phabricator.wikimedia.org/T103529#2632489 (10hashar) I have rebased the shebang patch. Rebuild package for both Precise and Jessie:  https://people.wikimedia.org/~hashar/debs/z...
[11:51:48] <wikibugs>	 10Continuous-Integration-Infrastructure: jenkins debian-glue job should use Wikimedia debian mirror - https://phabricator.wikimedia.org/T145508#2632490 (10Legoktm)
[11:55:18] <shinken-wm>	 RECOVERY - Puppet run on deployment-salt02 is OK: OK: Less than 1.00% above the threshold [0.0]
[12:10:02] <wikibugs>	 03Scap3, 06Services, 15User-mobrovac: Allow per-environment scap.cfg overrides - https://phabricator.wikimedia.org/T134156#2632530 (10mobrovac)
[12:12:42] <wikibugs>	 03Scap3, 06Services, 15User-mobrovac: Allow per-environment scap.cfg overrides - https://phabricator.wikimedia.org/T134156#2632553 (10mobrovac)
[12:22:48] <wmf-insecte>	 Project selenium-GettingStarted » firefox,beta,Linux,contintLabsSlave && UbuntuTrusty build #143: 04FAILURE in 48 sec: https://integration.wikimedia.org/ci/job/selenium-GettingStarted/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/143/
[12:28:25] <wikibugs>	 10Continuous-Integration-Infrastructure: jenkins debian-glue job should use Wikimedia debian mirror - https://phabricator.wikimedia.org/T145508#2632490 (10hashar) That is hardcoded in the Jenkins debian glue script `piuparts_wrapper`: ``` lang=sh   if [ -n "${MIRROR:-}" ] ; then     echo "*** MIRROR variable is...
[12:29:26] <phuedx>	 hullo releng fellows!
[12:31:05] <wikibugs>	 03Scap3, 06Services, 15User-mobrovac: Scap config management: Jinja2 fills templates with Pythonic values - https://phabricator.wikimedia.org/T145510#2632657 (10mobrovac)
[12:31:15] <hashar>	 hallo
[12:32:01] <phuedx>	 i have a bit of a conundrum and i'm unsure how best to fix it
[12:32:04] <phuedx>	 hashar: o/
[12:32:19] <hashar>	 conundrum ? :D
[12:32:31] <phuedx>	 afaict, the zeroportal and zerobanner extensions need each other to be present in order for their tests to pass
[12:32:47] <phuedx>	 i.e. zerobanner will fail if zerobanner isn't present
[12:33:28] <hashar>	 ah
[12:33:39] <hashar>	 so how do you test it ?  *evil grin*
[12:34:03] <phuedx>	 one way is with unit tests ;) https://github.com/wikimedia/integration-config/blob/d735ce2bda218126640ac191b1bcc3794d0431de/zuul/parameter_functions.py#L168-L170
[12:34:23] <hashar>	 so yeah
[12:34:34] <hashar>	 a test for ZeroPortal  get ZeroBanner injected
[12:34:39] <hashar>	 but the inverse is not true
[12:34:40] <phuedx>	 but //i think// that zerobanner's tests fail without zeroportal present
[12:34:53] <phuedx>	 hashar: is it not-too-evil to do the reverse too?
[12:35:06] <phuedx>	 you have a cyclic dependency check there, right? ;)
[12:35:16] <hashar>	 maybe :D
[12:35:37] <phuedx>	 the real problem for me is that the zerobanner failures are causing mobilefrontend (among others) to fail
[12:35:51] <hashar>	 the thing is that both are in a job  extension-gate
[12:36:04] <hashar>	 that triggers jobs  mediawiki-extensions-*  which have a bunch of a deps
[12:36:07] <hashar>	 kind of a shared job
[12:36:35] <hashar>	 eg https://gerrit.wikimedia.org/r/#/c/310276/
[12:36:46] <hashar>	 if you look at the last test result
[12:36:56] <hashar>	 the first two jobs pass because they get both Zero extensions included
[12:37:26] <hashar>	 ditto for ZeroPortal https://gerrit.wikimedia.org/r/#/c/303378/
[12:38:55] <hashar>	 phuedx: I am just going to drop the more specific jobs
[12:39:03] <hashar>	 and just rely on the mediawiki-testextensions-*  one 
[12:39:30] <phuedx>	 hashar: i'm not sure that fixing it in ci world is the way to go, but it'd help move things along
[12:39:34] <hashar>	 should probably do the same to MobileFrontend
[12:39:51] <phuedx>	 there seems to be some config sharing between zeroportal and zerobanner
[12:39:54] <phuedx>	 which makes the latter hard to fix in isolation
[12:40:06] <hashar>	 yeah
[12:40:09] <hashar>	 same for MF
[12:40:13] <hashar>	 let me clear out a bunch of legacy stuff
[12:40:20] <phuedx>	 \o/
[12:40:45] <phuedx>	 hashar: could you tag commits with T145227
[12:40:49] <hashar>	 sure thing
[12:40:52] <phuedx>	 <3 <3 <3
[12:41:10] <phuedx>	 hashar: cool if i dump this conversation into the phab ticket for posterity?
[12:41:24] <phuedx>	 i'm also going to submit a patch to zeroportal
[12:45:05] <wikibugs>	 10Continuous-Integration-Config, 10MediaWiki-extensions-JsonConfig, 10MediaWiki-extensions-ZeroBanner, 13Patch-For-Review, and 2 others: Zero phpunit test failure (blocks merges to MobileFrontend) - https://phabricator.wikimedia.org/T145227#2632703 (10hashar) That is due to a fault in the CI config.
[12:46:30] <grrrit-wm>	 (03PS1) 10Hashar: ZeroPortal/Banner stop using specific jobs [integration/config] - 10https://gerrit.wikimedia.org/r/310294 (https://phabricator.wikimedia.org/T145227) 
[12:46:40] <hashar>	 phuedx: yeah please copy paste as needed :]
[12:48:24] <wikibugs>	 10Beta-Cluster-Infrastructure, 10Continuous-Integration-Config, 03Scap3, 06Revision-Scoring-As-A-Service, 15User-mobrovac: Deploy beta cluster services automatically via scap3 - https://phabricator.wikimedia.org/T131857#2632719 (10mobrovac)
[12:48:55] * hashar reviews the diff
[12:50:03] <phuedx>	 hashar: dropping zeroportal?
[12:50:19] * phuedx doesn't understand the change fully enough (obviously ;])
[12:52:31] <hashar>	 so yeah
[12:52:44] <shinken-wm>	 PROBLEM - Puppet run on deployment-mx is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[12:52:57] <hashar>	 there are three jobs named  mwext-testextentions-*  that have an hardcoded list of extensions 
[12:53:03] <hashar>	 it clones all those extensions then run tests
[12:53:13] <hashar>	 turns out it has mobilefrontend/zero* and all the required dependencies
[12:53:27] <hashar>	 so those jobs mwext-testextensions-  pass just fine on both repo
[12:53:50] <hashar>	 the other set of jobs mwext-testextension-  (singular)  have the list of deps listed in zuul parameter function
[12:53:57] <hashar>	 sorry the whole crap is really messy and confusing
[12:54:05] <phuedx>	 oic
[12:54:07] <hashar>	 in short, I am going to drop the singular versions
[12:54:21] <hashar>	 and keep the plural versions (which pass,  and those jobs should be renamed )
[12:55:30] <grrrit-wm>	 (03CR) 10Hashar: [C: 032] "Side effect: that adds jsonlint job, but that is not really a problem." [integration/config] - 10https://gerrit.wikimedia.org/r/310294 (https://phabricator.wikimedia.org/T145227) (owner: 10Hashar)
[12:55:42] <hashar>	 lets give it a try
[12:55:51] <hashar>	 once deployed, the faulty jobs will disappear
[12:56:30] <grrrit-wm>	 (03Merged) 10jenkins-bot: ZeroPortal/Banner stop using specific jobs [integration/config] - 10https://gerrit.wikimedia.org/r/310294 (https://phabricator.wikimedia.org/T145227) (owner: 10Hashar)
[12:58:30] <hashar>	 phuedx: so the Zero* extensions should be fine now
[12:58:35] <hashar>	 gotta fix up MobileFrontend now :(
[13:00:23] <hashar>	 phuedx: will clean up MF after the SWAT
[13:00:39] <phuedx>	 it's eu swat already?!
[13:01:00] <phuedx>	 hashar: from tomorrow, i'll be around to observe and help out
[13:01:04] <phuedx>	 as i've increased my hours
[13:01:52] <wikibugs>	 10Continuous-Integration-Config, 10MediaWiki-extensions-JsonConfig, 10MediaWiki-extensions-ZeroBanner, 13Patch-For-Review, and 2 others: Zero phpunit test failure (blocks merges to MobileFrontend) - https://phabricator.wikimedia.org/T145227#2624033 (10phuedx) @hashar reports that there's a little more work...
[13:02:39] <wikibugs>	 10Continuous-Integration-Config, 10MediaWiki-extensions-JsonConfig, 10MediaWiki-extensions-ZeroBanner, 13Patch-For-Review, and 2 others: Zero phpunit test failure (blocks merges to MobileFrontend) - https://phabricator.wikimedia.org/T145227#2632745 (10phuedx) From IRC:  ``` 1:32:31 PM <phuedx> afaict, the...
[13:08:25] <wikibugs>	 03Scap3, 10Parsoid, 06Services: Allow failures for a percentage of targets - https://phabricator.wikimedia.org/T145512#2632755 (10mobrovac)
[13:08:34] <wikibugs>	 03Scap3, 10Parsoid, 06Services, 15User-mobrovac: Allow failures for a percentage of targets - https://phabricator.wikimedia.org/T145512#2632767 (10mobrovac)
[13:32:43] <shinken-wm>	 RECOVERY - Puppet run on deployment-mx is OK: OK: Less than 1.00% above the threshold [0.0]
[14:08:16] <hashar>	 phuedx: so swat is done
[14:09:23] <hashar>	 gotta fix MF now
[14:47:14] <grrrit-wm>	 (03PS1) 10Matthias Mullie: Add EventLogging as UploadWizard dependency for mwext-testextension- [integration/config] - 10https://gerrit.wikimedia.org/r/310313 
[14:47:59] <grrrit-wm>	 (03CR) 10Matthias Mullie: "I don't know too much about our test infra, so please verify that this indeed does what I expect it to do (see commit msg) :)" [integration/config] - 10https://gerrit.wikimedia.org/r/310313 (owner: 10Matthias Mullie)
[14:49:47] <grrrit-wm>	 (03CR) 10Hashar: [C: 032] "That is exactly how we are handling it in CI. Kudos on figuring it out :]" [integration/config] - 10https://gerrit.wikimedia.org/r/310313 (owner: 10Matthias Mullie)
[14:50:22] <grrrit-wm>	 (03Merged) 10jenkins-bot: Add EventLogging as UploadWizard dependency for mwext-testextension- [integration/config] - 10https://gerrit.wikimedia.org/r/310313 (owner: 10Matthias Mullie)
[14:55:07] <grrrit-wm>	 (03CR) 10Hashar: "Deployed :)" [integration/config] - 10https://gerrit.wikimedia.org/r/310313 (owner: 10Matthias Mullie)
[15:01:14] <grrrit-wm>	 (03PS1) 10Hashar: MobileFrontend: stop using specific jobs [integration/config] - 10https://gerrit.wikimedia.org/r/310317 (https://phabricator.wikimedia.org/T145227) 
[15:01:17] <jynus>	 is marxarelli around?
[15:01:46] <hashar>	 marxarelli magically appears
[15:02:10] * jynus uses pokeball
[15:02:17] <grrrit-wm>	 (03CR) 10Hashar: [C: 032] MobileFrontend: stop using specific jobs [integration/config] - 10https://gerrit.wikimedia.org/r/310317 (https://phabricator.wikimedia.org/T145227) (owner: 10Hashar)
[15:02:57] <grrrit-wm>	 (03Merged) 10jenkins-bot: MobileFrontend: stop using specific jobs [integration/config] - 10https://gerrit.wikimedia.org/r/310317 (https://phabricator.wikimedia.org/T145227) (owner: 10Hashar)
[15:05:44] <shinken-wm>	 PROBLEM - Puppet run on deployment-ms-fe01 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[15:05:57] <paladox>	 hashar hi, could you merge https://gerrit.wikimedia.org/r/#/c/310238/ please?
[15:06:07] <paladox>	 It ubreaks the uploadwizzard test
[15:06:43] <hashar>	 paladox: matthias fixed it 
[15:06:52] <paladox>	 Oh 
[15:06:53] <hashar>	 https://gerrit.wikimedia.org/r/310313
[15:06:54] <hashar>	 :D
[15:07:18] <hashar>	 havent tested tough
[15:08:16] <paladox>	 Oh
[15:08:21] <paladox>	 Thanks
[15:09:14] <grrrit-wm>	 (03Abandoned) 10Paladox: [mediawiki/extensions/UploadWizard] Add dependancy on EventLogger [integration/config] - 10https://gerrit.wikimedia.org/r/310238 (owner: 10Paladox)
[15:10:41] <mlitn>	 just did, it works
[15:10:50] <mlitn>	 I hadn’t noticed you already had a patch, paladox :)
[15:11:04] <paladox>	 Oh, yep :)
[15:11:08] <paladox>	 thanks mlitn :)
[15:11:37] <mlitn>	 paladox: existing (failing) also need to be rebased on UW master, the test was also failing because of an issue withing UW
[15:11:47] <paladox>	 Oh
[15:12:46] <hashar>	 mlitn: hello :)
[15:13:06] <hashar>	 I haven't recheck any patch, is everything fine for UW ?
[15:13:06] <paladox>	 I actually fixed the whole commit https://gerrit.wikimedia.org/r/#/c/309852/ on an iphone this mornning
[15:13:14] <hashar>	 ah something is running
[15:13:16] <paladox>	 Since i wasent near a pc but handy gerrit inline editing
[15:13:20] <hashar>	 php55 / hhvm passed already
[15:13:44] <paladox>	 But it was a little diffilcult renaming but in the end i managed it through an iphone :)
[15:15:52] <wikibugs>	 03Scap3, 06Services, 15User-mobrovac: Allow per-environment scap.cfg overrides - https://phabricator.wikimedia.org/T134156#2256060 (10thcipriani) p:05Triage>03High a:03thcipriani
[15:17:58] <paladox>	 hashar i got another change upstream merged https://gerrit-review.googlesource.com/#/c/86011/ :)
[15:18:05] <paladox>	 polygerrit now works on internet explorer
[15:18:10] <paladox>	 and old android versions.
[15:18:49] <marxarelli>	 !log starting 2-hour read-only maintenance window for beta cluster migration
[15:18:52] <qa-morebots>	 Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master
[15:19:02] <wikibugs>	 03Scap3, 06Services, 15User-mobrovac: Scap config management: Jinja2 fills templates with Pythonic values - https://phabricator.wikimedia.org/T145510#2632657 (10thcipriani) p:05Triage>03Normal
[15:22:20] <wikibugs>	 03Scap3, 10Parsoid, 06Services, 15User-mobrovac: Allow failures for a percentage of targets - https://phabricator.wikimedia.org/T145512#2633167 (10thcipriani) p:05Triage>03Normal
[15:25:47] <grrrit-wm>	 (03PS1) 10Hashar: ZeroPortal/ZeroBanner depends on each other [integration/config] - 10https://gerrit.wikimedia.org/r/310326 (https://phabricator.wikimedia.org/T145227) 
[15:28:05] <mlitn>	 hashar: yeah, it works now
[15:29:30] <grrrit-wm>	 (03CR) 10Hashar: [C: 032] ZeroPortal/ZeroBanner depends on each other [integration/config] - 10https://gerrit.wikimedia.org/r/310326 (https://phabricator.wikimedia.org/T145227) (owner: 10Hashar)
[15:30:36] <grrrit-wm>	 (03Merged) 10jenkins-bot: ZeroPortal/ZeroBanner depends on each other [integration/config] - 10https://gerrit.wikimedia.org/r/310326 (https://phabricator.wikimedia.org/T145227) (owner: 10Hashar)
[15:30:46] <paladox>	 hashar yay and my final patch upstream was merged https://gerrit-review.googlesource.com/#/c/85340/ :)
[15:30:53] <paladox>	 Fixes one of our bugs/
[15:31:04] <paladox>	 Now we have to wait for the gerrit 2.12.5 release :)
[15:32:25] <icinga-wm>	 PROBLEM - puppet last run on gallium is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[15:32:39] <wikibugs>	 10Continuous-Integration-Config, 10MediaWiki-extensions-JsonConfig, 10MediaWiki-extensions-ZeroBanner, 13Patch-For-Review, and 2 others: Zero phpunit test failure (blocks merges to MobileFrontend) - https://phabricator.wikimedia.org/T145227#2633249 (10hashar) a:03hashar I have put ZeroPortal / ZeroBanner...
[15:32:46] <hashar>	 I am off for a while , be back in like 3+ hours
[15:34:11] <marxarelli>	 !log disabled beta jenkins builds while in maintenance mode
[15:34:15] <qa-morebots>	 Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master
[15:40:11] <wikibugs>	 03Scap3, 10Parsoid: Rollback failed when target is down - https://phabricator.wikimedia.org/T145460#2633278 (10thcipriani) p:05Triage>03Normal Hrm.  ``` rollback stage(s): 100% (ok: 6; fail: 1; left: 0)  ```  **tl;dr**: I'm mostly offloading working memory to ticket format.  How this works ---  So currentl...
[15:42:37] <icinga-wm>	 RECOVERY - puppet last run on gallium is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[15:44:29] <shinken-wm>	 PROBLEM - Puppet run on deployment-cache-text04 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[15:45:43] <shinken-wm>	 RECOVERY - Puppet run on deployment-ms-fe01 is OK: OK: Less than 1.00% above the threshold [0.0]
[15:52:55] <wikibugs>	 03Scap3: Scap3 config references to deployed directory - https://phabricator.wikimedia.org/T145437#2633352 (10thcipriani) p:05Triage>03Normal
[16:01:50] <wmf-insecte>	 Project selenium-CentralNotice » firefox,beta,Linux,contintLabsSlave && UbuntuTrusty build #147: 04FAILURE in 49 sec: https://integration.wikimedia.org/ci/job/selenium-CentralNotice/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/147/
[16:04:53] <shinken-wm>	 PROBLEM - Puppet run on deployment-redis01 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[16:16:52] <grrrit-wm>	 (03PS1) 10MarkTraceur: Add EL to FileAnnotations dependencies [integration/config] - 10https://gerrit.wikimedia.org/r/310339 
[16:17:06] <marktraceur>	 If someone could get that deployed they'd be my favourite person
[16:17:27] <paladox>	 marktraceur that would be legoktm or hasharAway
[16:17:49] <marktraceur>	 Yes, legoktm being my favourite person does seem pretty likely
[16:18:17] <grrrit-wm>	 (03CR) 10Paladox: [C: 031] Add EL to FileAnnotations dependencies [integration/config] - 10https://gerrit.wikimedia.org/r/310339 (owner: 10MarkTraceur)
[16:18:54] <wikibugs>	 10Gerrit, 13Patch-For-Review, 07Upstream: Gerrit in Microsoft Edge doesn't display the git commands in the download box - https://phabricator.wikimedia.org/T145130#2620802 (10Dzahn) @paladox can you show the difference on the labs instance?
[16:24:28] <shinken-wm>	 RECOVERY - Puppet run on deployment-cache-text04 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:42:21] <grrrit-wm>	 (03CR) 10Hashar: [C: 032] Add EL to FileAnnotations dependencies [integration/config] - 10https://gerrit.wikimedia.org/r/310339 (owner: 10MarkTraceur)
[16:43:26] <marktraceur>	 hasharAway takes the cake!
[16:44:17] <grrrit-wm>	 (03Merged) 10jenkins-bot: Add EL to FileAnnotations dependencies [integration/config] - 10https://gerrit.wikimedia.org/r/310339 (owner: 10MarkTraceur)
[16:44:29] <paladox>	 Yep
[16:44:34] <paladox>	 thanks hasharAway :0
[16:44:35] <paladox>	 :)
[16:44:54] <shinken-wm>	 RECOVERY - Puppet run on deployment-redis01 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:45:19] <hasharAway>	 marktraceur: deployed!  (I am out again :D )
[16:45:46] <marktraceur>	 hasharAway: <3
[16:48:02] <marktraceur>	 hasharAway: Remind me that I owe you a homebrewed beer at our next meeting
[16:49:54] <hasharAway>	 marktraceur: checked :]
[16:53:53] <paladox>	 I doint know why but i keep publishing patches upstream with my personal account, this time i carnt get it deleted but hopefully no one will no it's me :)
[16:54:32] <James_F>	 Is T114313 (scap3 for MW train deploys) now done?
[16:55:28] <James_F>	 (Going by hazy memory of Greg's e-mail re. SWAT.)
[16:59:50] <marxarelli>	 !log aborting beta cluster db migration due to time constraints and ops outage. will reschedule
[16:59:53] <qa-morebots>	 Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master
[17:00:34] <greg-g>	 sad
[17:00:56] <greg-g>	 James_F: nope
[17:02:33] <marxarelli>	 !log re-enabling beta cluster jenkins jobs following maintenance window
[17:02:36] <qa-morebots>	 Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master
[17:02:56] <bd808>	 greg-g: I love that that ticket for mw->scap3 is a 5 point story. :)(
[17:03:30] <bd808>	 so 1 point must be roughly one person-month of work on that scale
[17:04:36] <greg-g>	 it is?
[17:05:42] <bd808>	 https://phabricator.wikimedia.org/T114313
[17:05:51] * greg-g deletes
[17:05:54] <bd808>	 "5 sotry points"
[17:06:05] <jynus>	 marxarelli, please blame me
[17:06:12] <jynus>	 for taking too much time
[17:06:18] <jynus>	 because of the production issue
[17:06:28] <jynus>	 happening at the same time than the labs maintenance
[17:06:47] <marxarelli>	 greg-g: ^ :)
[17:07:37] <marxarelli>	 no worries, jynus. luckily greg-g is not a vindictive manager, not outwardly at least
[17:07:47] <greg-g>	 jynus: totally blaming you because you should have predicted a production issue arrising at the same time :P
[17:07:58] <jynus>	 well, I am interested on this getting done
[17:08:07] <jynus>	 unifying beta and production
[17:08:14] <greg-g>	 I do have a direct line to Santa Claus, so, yeah, expect coal
[17:08:32] <marxarelli>	 jynus: plus, you showed me all the wonderful percona toys
[17:08:36] <jynus>	 will be good for beta (better tools) and for production (better tests)
[17:08:53] <greg-g>	 word
[17:09:02] <jynus>	 greg-g, I showed him that with MAriadB 10's performance_schema 
[17:09:10] <jynus>	 you will be able to monitor in real time slow queries
[17:09:18] <jynus>	 and wartning and errors
[17:09:29] <greg-g>	 sweet
[17:09:43] <jynus>	 just do not be afraid to ask for help
[17:10:08] <greg-g>	 jynus: thanks
[17:10:08] <jynus>	 see you
[17:13:42] <wikibugs>	 10Continuous-Integration-Config, 10MediaWiki-extensions-JsonConfig, 10MediaWiki-extensions-ZeroBanner, 06Reading-Web-Backlog, and 3 others: Zero phpunit test failure (blocks merges to MobileFrontend) - https://phabricator.wikimedia.org/T145227#2624033 (10MBinder_WMF)
[17:14:13] <hasharAway>	 so now marxarelli is our DBA :]
[17:14:15] <hasharAway>	 \o/
[17:15:29] <shinken-wm>	 PROBLEM - Puppet run on deployment-cache-text04 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[17:20:43] <wikibugs>	 10Beta-Cluster-Infrastructure, 06Release-Engineering-Team, 10DBA, 13Patch-For-Review, 07WorkType-Maintenance: Upgrade mariadb in deployment-prep from Precise/MariaDB 5.5 to Jessie/MariaDB 5.10 - https://phabricator.wikimedia.org/T138778#2633663 (10dduvall) We had to abort the migration due to time constr...
[17:27:28] <wikibugs>	 10Beta-Cluster-Infrastructure, 06Release-Engineering-Team, 10DBA, 13Patch-For-Review, 07WorkType-Maintenance: Upgrade mariadb in deployment-prep from Precise/MariaDB 5.5 to Jessie/MariaDB 5.10 - https://phabricator.wikimedia.org/T138778#2633717 (10jcrespo) @dduvall I leave you here the full tutorial for...
[17:30:36] <wikibugs>	 10Beta-Cluster-Infrastructure, 06Release-Engineering-Team, 10DBA, 13Patch-For-Review, 07WorkType-Maintenance: Upgrade mariadb in deployment-prep from Precise/MariaDB 5.5 to Jessie/MariaDB 5.10 - https://phabricator.wikimedia.org/T138778#2633733 (10dduvall) >>! In T138778#2633717, @jcrespo wrote: > @dduva...
[17:36:27] <mobrovac>	 zeljkof: here?
[17:36:37] <zeljkof>	 mobrovac: yes
[17:36:47] <mobrovac>	 zeljkof: can https://phabricator.wikimedia.org/T145523 be closed then?
[17:36:55] <zeljkof>	 I'm just about to close T145523
[17:37:24] <zeljkof>	 mobrovac: sure, will do in a minute, just finished with updating my machine, everything works now
[17:37:32] <mobrovac>	 kk
[17:37:34] <mobrovac>	 awesome, thnx
[17:38:05] <zeljkof>	 mobrovac: thank you! (resolved)
[17:43:28] <wikibugs>	 10Gerrit, 13Patch-For-Review, 07Upstream: Gerrit in Microsoft Edge doesn't display the git commands in the download box - https://phabricator.wikimedia.org/T145130#2633780 (10Dzahn) 05Open>03Resolved a:03Dzahn
[17:50:29] <shinken-wm>	 RECOVERY - Puppet run on deployment-cache-text04 is OK: OK: Less than 1.00% above the threshold [0.0]
[18:00:01] <wikibugs>	 03Scap3, 10Citoid, 10ContentTranslation-CXserver, 10Graphoid, and 5 others: Depool and repool SCB services during deploys - https://phabricator.wikimedia.org/T144602#2633867 (10mobrovac)
[18:04:27] <paladox>	 RoanKattouw hi, this https://gerrit-review.googlesource.com/#/c/85340/  was merged upstream and will be released in gerrit 2.12.5 is release.
[18:04:35] <paladox>	 Not sure when that will be released though
[18:04:36] <paladox>	 :)
[18:07:12] <RoanKattouw>	 yay
[18:07:29] <paladox>	 Yep :)
[18:34:01] <shinken-wm>	 PROBLEM - Puppet run on deployment-elastic07 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[18:34:10] <wikibugs>	 10Continuous-Integration-Config, 10MediaWiki-extensions-JsonConfig, 10MediaWiki-extensions-ZeroBanner, 06Reading-Web-Backlog, and 3 others: Zero phpunit test failure (blocks merges to MobileFrontend) - https://phabricator.wikimedia.org/T145227#2634072 (10hashar) a:05hashar>03phuedx So @phuedx had the r...
[18:49:02] <shinken-wm>	 RECOVERY - Puppet run on deployment-elastic07 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:14:16] <hashar>	 Krenair: I still dont get filebackend configuration :D  Thank you to have found that instantCommons write to the mw app disk instead of swift ( was https://phabricator.wikimedia.org/T145496 )
[19:14:58] <Krenair>	 I'm just wondering if we should disable InstantCommons
[19:15:44] <Krenair>	 or see if we can let it store its thumbs in swift somehow
[19:20:02] <wmf-insecte>	 Project beta-update-databases-eqiad build #11321: 04FAILURE in 1.7 sec: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/11321/
[19:40:14] <wikibugs>	 03Scap3: Scap3 config references to deployed directory - https://phabricator.wikimedia.org/T145437#2634444 (10mmodell) The symlink swapping is annoying. I wish there was a mode that disabled that and simply rely on `git checkout $tag` to implement rollback - git checkout leaves .gitignore'd files in place, for e...
[19:44:03] <wikibugs>	 10Beta-Cluster-Infrastructure, 03Scap3: Fixup beta scap3 keyholder problems - https://phabricator.wikimedia.org/T144647#2634452 (10mmodell) what if we created a CA and signed all the host keys with that?  We could then have the clients verify the signature based on the CA's signing key instead of having to man...
[19:54:26] <wikibugs>	 10Continuous-Integration-Config, 06Brickimedia, 10MediaWiki-extensions-ArticleFeedbackv5: ArticleFeedbackv5 should pass jshint - https://phabricator.wikimedia.org/T63588#2634479 (10hashar)
[19:55:46] <wikibugs>	 10Continuous-Integration-Config, 06Brickimedia, 10MediaWiki-extensions-ArticleFeedbackv5: ArticleFeedbackv5 should pass jshint - https://phabricator.wikimedia.org/T63588#647960 (10hashar) Forgot, feel free to poke us on IRC channel `#wikimedia-releng`
[20:13:04] <grrrit-wm>	 (03CR) 10Hashar: "Nice! Note that mwext-mw-selenium still fails, but that might be due to another extension (eg MobileFrontend)." [integration/config] - 10https://gerrit.wikimedia.org/r/310141 (https://phabricator.wikimedia.org/T120715) (owner: 10Jdlrobson)
[20:21:12] <wmf-insecte>	 Yippee, build fixed!
[20:21:13] <wmf-insecte>	 Project beta-update-databases-eqiad build #11322: 09FIXED in 1 min 11 sec: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/11322/
[20:34:11] <wikibugs>	 10Beta-Cluster-Infrastructure, 03Scap3: Fixup beta scap3 keyholder problems - https://phabricator.wikimedia.org/T144647#2634619 (10bd808) >>! In T144647#2634452, @mmodell wrote: > what if we created a CA and signed all the host keys with that?  We could then have the clients verify the signature based on the C...
[20:39:59] <wikibugs>	 10Continuous-Integration-Config, 10MediaWiki-extensions-JsonConfig, 10MediaWiki-extensions-ZeroBanner, 06Reading-Web-Backlog, and 3 others: Zero phpunit test failure (blocks merges to MobileFrontend) - https://phabricator.wikimedia.org/T145227#2634643 (10phuedx) Thanks again @hashar!  AFAICT the ZeroBanner...
[20:58:06] <grrrit-wm>	 (03PS2) 10Jdlrobson: Run RelatedArticles browser tests on every commit [integration/config] - 10https://gerrit.wikimedia.org/r/310141 (https://phabricator.wikimedia.org/T120715) 
[20:58:18] <grrrit-wm>	 (03CR) 10Jdlrobson: [C: 031] "This can now be merged." [integration/config] - 10https://gerrit.wikimedia.org/r/310141 (https://phabricator.wikimedia.org/T120715) (owner: 10Jdlrobson)
[20:58:37] <wikibugs>	 10Continuous-Integration-Config, 10MediaWiki-extensions-RelatedArticles, 06Reading-Web-Backlog, 07Browser-Tests, and 2 others: RelatedArticles browser tests should run on a commit basis - https://phabricator.wikimedia.org/T120715#2634768 (10Jdlrobson) We just need to review/merge https://gerrit.wikimedia.o...
[21:05:20] <wikibugs>	 10Continuous-Integration-Config: BotPassword file for FLOSSbot - https://phabricator.wikimedia.org/T145331#2634778 (10hashar) I have seen that task on Sunday, been swamped with other duties though.  I am not sure how to handle the secret, then I am pretty sure #pywikibot-core had a similar use case (run tests ag...
[21:08:41] <wikibugs>	 10Continuous-Integration-Infrastructure (phase-out-gallium), 13Patch-For-Review: Firewall rules for labs support host to communicate with contint1001.wikimedia.org (new gallium) - https://phabricator.wikimedia.org/T137323#2634783 (10hashar) The rule for iridium has been removed properly ( https://gerrit.wikime...
[21:11:26] <wikibugs>	 06Release-Engineering-Team (Deployment-Blockers), 05Release: MW-1.28.0-wmf.19 deployment blockers - https://phabricator.wikimedia.org/T143328#2634790 (10hashar) a:05demon>03hashar Per discussion with @demon I will follow up and handle group 1 and group2.
[21:15:18] <wikibugs>	 06Release-Engineering-Team (Deployment-Blockers), 05Release: MW-1.28.0-wmf.19 deployment blockers - https://phabricator.wikimedia.org/T143328#2634799 (10hashar) I thought it would be nicer to revert in master and fiddles with the rights in the -labs.php files. Chad pointed that is way easier to just revert in...
[21:18:58] <wikibugs>	 06Release-Engineering-Team (Deployment-Blockers), 05Release: MW-1.28.0-wmf.19 deployment blockers - https://phabricator.wikimedia.org/T143328#2634806 (10demon) Already handling the revert: https://gerrit.wikimedia.org/r/#/c/310440/  Should be done shortly.
[21:21:27] <grrrit-wm>	 (03CR) 10Hashar: [C: 032] Run RelatedArticles browser tests on every commit [integration/config] - 10https://gerrit.wikimedia.org/r/310141 (https://phabricator.wikimedia.org/T120715) (owner: 10Jdlrobson)
[21:22:24] <grrrit-wm>	 (03Merged) 10jenkins-bot: Run RelatedArticles browser tests on every commit [integration/config] - 10https://gerrit.wikimedia.org/r/310141 (https://phabricator.wikimedia.org/T120715) (owner: 10Jdlrobson)
[21:35:40] <wikibugs>	 10Continuous-Integration-Config: BotPassword file for FLOSSbot - https://phabricator.wikimedia.org/T145331#2634851 (10dachary) @hashar I'm glad I did not miss anything obvious :-) I'll wait to read @jayvdb advice.
[21:39:21] <wikibugs>	 10Continuous-Integration-Config, 10MediaWiki-extensions-RelatedArticles, 06Reading-Web-Backlog, 07Browser-Tests, and 3 others: RelatedArticles browser tests should run on a commit basis - https://phabricator.wikimedia.org/T120715#2634871 (10hashar) https://gerrit.wikimedia.org/r/#/c/310141/ got merged  CI...
[21:39:46] <wikibugs>	 10Continuous-Integration-Config, 10MediaWiki-extensions-RelatedArticles, 06Reading-Web-Backlog, 07Browser-Tests, and 3 others: RelatedArticles browser tests should run on a commit basis - https://phabricator.wikimedia.org/T120715#1906189 (10hashar)
[22:18:50] <wikibugs>	 06Release-Engineering-Team, 15User-greg: Phabricator-ize q2-4 updated plans - https://phabricator.wikimedia.org/T145583#2634997 (10greg)
[22:20:02] <wmf-insecte>	 Project beta-update-databases-eqiad build #11324: 04FAILURE in 0.67 sec: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/11324/
[22:42:55] <wikibugs>	 10Beta-Cluster-Infrastructure, 10Continuous-Integration-Infrastructure, 10DBA, 10MediaWiki-Database, 07WorkType-NewFunctionality: Enable MariaDB/MySQL's Strict Mode - https://phabricator.wikimedia.org/T108255#2635181 (10AndyRussG)
[22:53:05] <shinken-wm>	 PROBLEM - Puppet run on deployment-conf03 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[22:54:56] <Krenair>	 ^ think it's just upset about a dns record not appearing yet, waiting for labs-ns1 to pick up the record labs-ns0 already has before re-checking
[23:01:02] <Krenair>	 ah, nope, I know what's up
[23:02:50] <wikibugs>	 10Continuous-Integration-Config, 10Fundraising-Backlog, 10MediaWiki-extensions-DonationInterface, 03Fundraising Sprint Qwerty Thwacking, and 3 others: Continuous integration: DonationInterface needs composer variant - https://phabricator.wikimedia.org/T141309#2635276 (10DStrine)
[23:13:05] <shinken-wm>	 RECOVERY - Puppet run on deployment-conf03 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:20:01] <wmf-insecte>	 Project beta-update-databases-eqiad build #11325: 04STILL FAILING in 0.65 sec: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/11325/