[00:00:20] Wouldn't be the worst idea :) [00:00:53] Reedy: Sadly the i18n team are against it. [00:02:15] en-gb.wikipedia.org [00:02:33] legoktm: Interface language, not content language. :-) [00:02:48] James_F: ostriches will use it too, because he's secretly british [00:03:25] That was one time! [00:03:41] Just get James_F to issue you with your passport [00:03:42] * James_F laughs. [00:03:52] I don't actually have passport-issuing authority, sorry. [00:03:58] Much as it pains me to admit it. [00:03:59] He can ask the queen next time he's in the UK [00:04:11] Reedy: … says someone currently in the UK. [00:04:22] James_F: I'm not a southerner [00:04:25] London is a long way away [00:04:39] Reedy: James and the queen are bffs? [00:05:05] I am but a loyal subject of Her Majesty. [00:05:59] Oh dear. [00:06:07] https://www.mediawiki.org/wiki/Extension:VisualEditor [00:06:42] Did someone break styling of infoboxes? [00:06:44] who broke the infobox? [00:07:50] Template:Extension last updated 2 days ago [00:09:06] I did a null edit on https://www.mediawiki.org/wiki/MediaWiki:Gadget-site.css but no effect. [00:09:31] RL likely? [00:09:36] >>> mw.loader.getState('ext.gadget.site'); [00:09:37] ready [00:09:38] ohhhhh [00:09:42] did my gadgets change just go out? [00:09:43] legoktm: Did you break it? [00:09:47] hm [00:10:16] https://gerrit.wikimedia.org/r/#/c/228781/ I'd blame that. [00:10:43] Likely. [00:11:01] Should we pin Gadgets to wmf.7 and see if that fixes it? [00:11:41] Hey RoanKattouw. [00:11:49] (A magic RoanKattouw appears.) 
[00:12:03] * RoanKattouw tries to use at the public log, but all topic links use a URL shortener that is down
[00:12:28] lemme see what else is in the changelog
[00:12:33] RoanKattouw: http://bots.wmflabs.org/~wm-bot/logs/%23wikimedia-releng/
[00:12:46] legoktm: https://gerrit.wikimedia.org/r/#/q/status:merged+project:mediawiki/extensions/Gadgets+-age:3w,n,z
[00:12:58] yeah, nothing super important
[00:13:00] legoktm: Only other change is https://gerrit.wikimedia.org/r/#/c/256021/
[00:13:12] RoanKattouw: TL;DR is look at https://www.mediawiki.org/wiki/Extension:VisualEditor and tell us what's broken :)
[00:13:13] (Actual change.)
[00:13:36] James_F: it would be nice to have that but not the end of the world
[00:13:49] I think we can just revert mine out of wmf.8 for now
[00:14:08] legoktm: Pin to wmf.7 first, then validate? Could be other things.
[00:14:32] Where do the styles for that infobox come from? Site styles? Gadgets?
[00:14:45] https://www.mediawiki.org/wiki/MediaWiki:Gadget-site.css
[00:16:14] I'm not seeing any load.php CSS calls with ext.gadget.* in them
[00:16:17] RL thinks it's executed that gadget
[00:17:18] But the styles obviously aren't there, and it looks like the JS wasn't executed either
[00:17:28] No mainpage class on the main page
[00:17:38] which should have been added by https://www.mediawiki.org/wiki/MediaWiki:Gadget-site.js
[00:18:29] I found it
[00:18:35] > var_dump(GadgetRepo::singleton()->getGadget('site')->getScriptsAndStyles());
[00:18:35] array(4) {
[00:18:35] [0]=>
[00:18:35] string(14) "Gadget-site.js"
[00:18:35] [1]=>
[00:18:36] string(18) "Gadget-NavFrame.js"
[00:18:38] [2]=>
[00:18:40] string(15) "Gadget-site.css"
[00:18:43] [3]=>
[00:18:45] string(19) "Gadget-NavFrame.css"
[00:18:47] }
[00:18:51] Those should all be "MediaWiki:Gadget-site.js"
[00:19:21] mw.loader.implement("ext.gadget.site",function($,jQuery){})
[00:19:24] Well there ya go
[00:19:37] Oh, hah
[00:19:39] A bunch of missing pages
[00:20:16] ok
[00:20:18] I think 
I fixed it [00:20:24] we just need to wait for RL cache to expire now [00:20:36] it just needed a cache purge [00:20:44] > var_dump(GadgetRepo::singleton()->purgeDefinitionCache()); [00:20:53] Did the way the definition cache was stored change or something? [00:21:05] Yes [00:21:23] RoanKattouw: https://gerrit.wikimedia.org/r/#/c/228781/18/includes/MediaWikiGadgetsDefinitionRepo.php,cm [00:22:11] RoanKattouw: +2 on https://gerrit.wikimedia.org/r/257785 ? [00:23:16] thanks [01:02:12] Oh, oops, was https://gerrit.wikimedia.org/r/#/c/253371/ meant to be active for tomorrow? Need it to be in tomorrow's SWAT? [01:10:33] James_F: as long as it's before the swtich to group1 it should be fine. cc thcipriani [01:10:40] is restbase known to be broken in beta? [01:10:47] Yes. [01:10:50] See -services. [01:11:06] greg-g: I'll schedule for the morning? [01:12:19] Also, can we do https://gerrit.wikimedia.org/r/#/c/257788/ now given that MW.org is broken without it. [01:13:26] yes [01:13:33] Kk. [01:39:44] Yippee, build fixed! [01:39:45] Project browsertests-Wikidata-SmokeTests-linux-firefox-sauce build #465: 09FIXED in 22 min: https://integration.wikimedia.org/ci/job/browsertests-Wikidata-SmokeTests-linux-firefox-sauce/465/ [02:12:20] 10Continuous-Integration-Infrastructure, 7Zuul: zuul-cloner fails with "fatal: You don't exist. Go away!" - https://phabricator.wikimedia.org/T120901#1864435 (10Krinkle) 3NEW [02:12:35] 10Continuous-Integration-Infrastructure, 7Zuul: zuul-cloner fails with "fatal: You don't exist. Go away!" 
- https://phabricator.wikimedia.org/T120901#1864442 (10Krinkle) [02:28:09] ...that's amusing [02:55:14] Project browsertests-MultimediaViewer-mediawiki.org-linux-firefox-sauce build #802: 04FAILURE in 13 sec: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-mediawiki.org-linux-firefox-sauce/802/ [03:04:44] Project beta-scap-eqiad build #81690: 04FAILURE in 0.45 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/81690/ [04:05:21] Yippee, build fixed! [04:05:22] Project beta-scap-eqiad build #81692: 09FIXED in 40 min: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/81692/ [04:19:51] Project browsertests-Wikidata-WikidataTests-linux-chrome-sauce build #234: 04STILL FAILING in 2 hr 50 min: https://integration.wikimedia.org/ci/job/browsertests-Wikidata-WikidataTests-linux-chrome-sauce/234/ [04:33:14] 10MediaWiki-Releasing: Ready-to-use Docker package for MediaWiki - https://phabricator.wikimedia.org/T92826#1864560 (10MarkAHershberger) >>! In T92826#1859027, @GWicke wrote: > There are still several config issues to be worked out before this is fully functional, but it does look promising so far. Thanks for t... [04:35:59] 10Continuous-Integration-Infrastructure, 10MediaWiki-extensions-ContentTranslation, 10Wikidata, 7Jenkins: Wikibase Qunit test fails with CX + ULS - https://phabricator.wikimedia.org/T120907#1864561 (10KartikMistry) 3NEW [04:37:54] Project browsertests-Wikidata-WikidataTests-linux-firefox-sauce build #450: 04STILL FAILING in 3 hr 15 min: https://integration.wikimedia.org/ci/job/browsertests-Wikidata-WikidataTests-linux-firefox-sauce/450/ [05:13:00] 10Deployment-Systems, 10Architecture, 10Wikimedia-Developer-Summit-2016-Organization, 7Availability: WikiDev 16 working area: Software engineering - https://phabricator.wikimedia.org/T119032#1864586 (10RobLa-WMF) Back in September, @cscott commented on {T96903} >>! In T96903#1659718, @cscott wrote: >[We s... [05:43:07] Yippee, build fixed! 
[05:43:07] Project browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-os_x_10.9-chrome-sauce build #279: 09FIXED in 27 min: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-os_x_10.9-chrome-sauce/279/ [08:09:53] (03CR) 10Polybuildr: [C: 031] "This looks okay to merge." [tools/codesniffer] - 10https://gerrit.wikimedia.org/r/255558 (owner: 10Paladox) [08:13:20] polybuildr: +2 it then ;) [08:13:54] legoktm: I would, but I'm not very sure of it. Which is why just a +1. :P It's a PHPUnit upgrade and the tests don't seem to fail, but no, me not sure enough. [08:14:29] (03PS2) 10Legoktm: Update phpunit to 4.8.18 [tools/codesniffer] - 10https://gerrit.wikimedia.org/r/255558 (owner: 10Paladox) [08:14:34] (03CR) 10Legoktm: [C: 032] Update phpunit to 4.8.18 [tools/codesniffer] - 10https://gerrit.wikimedia.org/r/255558 (owner: 10Paladox) [08:15:27] (03Merged) 10jenkins-bot: Update phpunit to 4.8.18 [tools/codesniffer] - 10https://gerrit.wikimedia.org/r/255558 (owner: 10Paladox) [08:16:15] polybuildr: oh btw, https://github.com/squizlabs/PHP_CodeSniffer/commit/6167224f7cad4450128c08f8979b3a8e31b9b48e#diff-34ae68f4adad56c25c5bc05dcb64794e [08:16:56] legoktm: oh wow, this is great! :D [08:17:14] It was super annoying when one . changed whether phpcs.xml was read or not -_- [08:18:03] (03CR) 10Legoktm: "We can get rid of the T_HASHBANG hack now since this release includes my upstream patch." [tools/codesniffer] - 10https://gerrit.wikimedia.org/r/255557 (owner: 10Paladox) [08:18:10] yeaaaaaaah :D [08:20:59] Yippee, build fixed! 
[08:21:00] Project browsertests-CirrusSearch-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #791: 09FIXED in 59 sec: https://integration.wikimedia.org/ci/job/browsertests-CirrusSearch-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/791/ [08:24:29] (03PS1) 10Legoktm: Revert "CharacterBeforePHPOpeningTagSniff: Support T_HASHBANG for HHVM >=3.5,<3.7" [tools/codesniffer] - 10https://gerrit.wikimedia.org/r/257836 [08:24:36] (03PS2) 10Legoktm: Revert "CharacterBeforePHPOpeningTagSniff: Support T_HASHBANG for HHVM >=3.5,<3.7" [tools/codesniffer] - 10https://gerrit.wikimedia.org/r/257836 [08:25:01] polybuildr: ^ quick CR? [08:26:02] legoktm: on it [08:27:19] (03CR) 10Polybuildr: [C: 032] Revert "CharacterBeforePHPOpeningTagSniff: Support T_HASHBANG for HHVM >=3.5,<3.7" [tools/codesniffer] - 10https://gerrit.wikimedia.org/r/257836 (owner: 10Legoktm) [08:28:00] (03Merged) 10jenkins-bot: Revert "CharacterBeforePHPOpeningTagSniff: Support T_HASHBANG for HHVM >=3.5,<3.7" [tools/codesniffer] - 10https://gerrit.wikimedia.org/r/257836 (owner: 10Legoktm) [08:28:27] ty [08:29:34] yw [08:35:14] (03CR) 10Paladox: "Thanks" [tools/codesniffer] - 10https://gerrit.wikimedia.org/r/255558 (owner: 10Paladox) [08:44:00] 10Deployment-Systems, 6Release-Engineering-Team, 5Patch-For-Review, 7user-notice: Move the train deployment from Thursday to Wednesday for some Wikipedia sites - https://phabricator.wikimedia.org/T115002#1864760 (10Luke081515) p:5Triage>3Normal [09:34:47] Project beta-scap-eqiad build #81722: 04FAILURE in 1 min 49 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/81722/ [09:43:49] Yippee, build fixed! 
[09:43:50] Project beta-scap-eqiad build #81723: 09FIXED in 7 min 10 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/81723/ [09:58:37] 5Testing-Initiative-2015, 10Browser-Tests-Infrastructure, 7JavaScript, 5Patch-For-Review: Experiment with browser testing in other software languages - https://phabricator.wikimedia.org/T108874#1864850 (10zeljkofilipin) [10:30:07] 10Deployment-Systems, 10Architecture, 10Wikimedia-Developer-Summit-2016-Organization, 7Availability: WikiDev 16 working area: Software engineering - https://phabricator.wikimedia.org/T119032#1864891 (10daniel) @RobLa-WMF The multi-lingual topic seems to fit better into {T119029} I think, though it certainl... [10:32:33] 10Continuous-Integration-Infrastructure, 7Zuul: zuul-cloner fails with "fatal: You don't exist. Go away!" - https://phabricator.wikimedia.org/T120901#1864908 (10hashar) 5Open>3Resolved a:3hashar I looked at Gerrit logs and nothing show up. The Zuul merges were on gallium.wikimedia.org and the git-daemon... [12:23:27] 10Continuous-Integration-Infrastructure, 10MediaWiki-extensions-ContentTranslation, 10Wikidata, 7Jenkins: Wikibase Qunit test fails with CX + ULS - https://phabricator.wikimedia.org/T120907#1865065 (10JanZerebecki) [12:27:38] 10Continuous-Integration-Infrastructure, 10MediaWiki-extensions-ContentTranslation, 10Wikidata, 7Jenkins: Wikibase Qunit test fails with CX + ULS - https://phabricator.wikimedia.org/T120907#1865072 (10JanZerebecki) When I mentioned that on IRC I was referring to T117886, which looks like a different bug. T... [12:57:53] !log Upgrading Jenkins Gearman plugin to grab upstream patch https://review.openstack.org/#/c/252768/ 'fix registration for jenkins master' should be noop [12:57:57] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [13:03:48] phedenskog: gilles: Krinkle: hi! I noticed the performance Jenkins job tasks have a 50 minutes timeout. Seems a bit high isn't it? 
[13:03:54] An example failure: https://integration.wikimedia.org/ci/job/performance-webpagetest-wpt-org/250/console
[13:04:36] Bangalore-real3g has a "timeout": 3000000, seems to be 50 mins
[13:04:53] and that fails the whole script; might want to mark it as failed and continue
[13:43:01] mobrovac: any idea why I am getting "Remote not found in /vagrant/mediawiki for branch 'master'" when running "vagrant git-update"?
[13:43:29] zeljkof: which repo?
[13:43:52] ah mediawiki duh
[13:44:14] I have checked, I do have remote set
[13:44:33] git remote -v
[13:44:40] origin ssh://zfilipin@gerrit.wikimedia.org:29418/mediawiki/core (fetch)
[13:44:47] origin ssh://zfilipin@gerrit.wikimedia.org:29418/mediawiki/core (push)
[13:44:55] zeljkof: ah, ssh!
[13:45:03] zeljkof: remotes need to be https
[13:45:07] I get the same for two other repos
[13:45:11] I see
[13:45:21] but my remotes are all ssh
[13:46:41] and vagrant only complains about a few
[13:47:00] mobrovac: but thanks, will change to https
[13:47:17] yeah try that first
[13:48:16] zeljkof: also, try running git pull manually in that repo, but from inside the guest
[13:48:31] might be that you're on detached HEAD or sth
[13:48:40] no, all repos are at master
[13:48:48] git pull works fine from host
[13:48:52] did not try from guest
[13:49:04] What. https://integration.wikimedia.org/ci/job/cxserver-deploy-npm/113/console
[13:49:08] hashar: ^
[13:50:45] zeljkof: that's the trick, you need to check from the guest, and since your remotes are ssh, it's possible your ssh key is not in the guest
[13:51:52] mobrovac: it is not for sure
[13:52:04] there you go then :)
[13:52:11] but how does it then work for other repos at all?
[13:52:33] which repos does git-update update? everything under mediawiki/?
[13:53:03] this so, yeah
[13:53:08] s/this/think/
[13:53:45] strange then that it works for other repos then
[13:53:47] 00:00:07.622 chmod: changing permissions of ‘/mnt/home/jenkins-deploy/tmpfs/jenkins-0’: Operation not permitted
[13:53:51] kart_: yeah that is annoying
[13:54:20] oh
[13:54:26] it belongs to www-data www-data
[14:01:50] hashar: this is testing on a real 3g connection in india, it's expected to have irregular behaviour, afaik
[14:02:04] gilles: I can imagine
[14:02:33] gilles: seems the jenkins jobs run a ton of tests and whenever one has a failure, the script waits for up to 50 minutes and then aborts, skipping the rest of the tests
[14:02:44] hashar: (sorry if you're already on it) but I have ‘/mnt/home/jenkins-deploy/tmpfs/jenkins-0’: Operation not permitted (https://gerrit.wikimedia.org/r/#/c/257839/)
[14:02:56] hashar: phedenskog would be the one to talk to :)
[14:03:03] hashar: yep let's change that. thanks
[14:03:05] gilles: might want to catch the failure and record it so the tests can continue
[14:03:15] dcausse: yeah kart_ mentioned it
[14:03:21] k thanks :)
[14:03:34] somehow the directory is created by apache
[14:03:43] when it should be owned by jenkins
[14:03:46] * hashar blames l10n cache
[14:06:48] !log integration-slave-trusty-1011: sudo rm -fR /mnt/home/jenkins-deploy/tmpfs/jenkins-0 ( https://phabricator.wikimedia.org/T120824 )
[14:06:52] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master
[14:07:38] Continuous-Integration-Infrastructure: Dozens of jobs failing on integration-slave-trusty-1012 because chmod fails for /tmp/jenkins-2 - https://phabricator.wikimedia.org/T120824#1865203 (hashar) Resolved>Open Happened again https://integration.wikimedia.org/ci/job/cxserver-deploy-npm/113/console which...
[14:12:45] kart_: dcausse the bug is https://phabricator.wikimedia.org/T120824 gotta dig into it
[14:12:45] hashar: thanks!
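The suggestion above (catch the failure, record it, and let the remaining tests run) can be sketched roughly as follows. This is a minimal illustration only, not the actual performance job: `run_all`, `flaky`, and the `other-location` label are hypothetical names, and only the `3000000` timeout value and the `Bangalore-real3g` label come from the log. It also checks the arithmetic that 3000000 ms is 50 minutes.

```python
# Hypothetical sketch: none of these function names come from the real Jenkins job.
TIMEOUT_MS = 3_000_000  # the "timeout" value quoted in the log


def timeout_minutes(timeout_ms):
    """Convert a millisecond timeout to minutes."""
    return timeout_ms / 1000 / 60


def run_all(tests):
    """Run every test, recording failures instead of aborting on the first one.

    `tests` maps a label to a zero-argument callable.
    """
    results = {}
    for label, test in tests.items():
        try:
            test()
            results[label] = "ok"
        except Exception as exc:  # record the failure and keep going
            results[label] = f"failed: {exc}"
    return results


def flaky():
    """Stand-in for a test location that times out."""
    raise TimeoutError("no response within timeout")


results = run_all({
    "Bangalore-real3g": flaky,       # label taken from the log
    "other-location": lambda: None,  # hypothetical second test
})
print(timeout_minutes(TIMEOUT_MS))
print(results)
```

With this shape a single slow or failing location is reported in the results while the other locations still run.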
[14:12:45] kart_: dcausse: in short the localization cache files are created by apache / www-data and later we can't chmod 777 as jenkins-bot user [14:12:45] init order trouble [14:13:03] * hashar looks for builds that were running on Dec 9 10:47 [14:14:04] 10Continuous-Integration-Infrastructure: Dozens of jobs failing on integration-slave-trusty-1012 because chmod fails for /tmp/jenkins-2 - https://phabricator.wikimedia.org/T120824#1865208 (10hashar) p:5Triage>3High Fixed with: ssh integration-slave-trusty-1011.integration.eqiad.wmflabs sudo rm -fR /mnt/... [14:14:06] 10Continuous-Integration-Config, 5Patch-For-Review: qunit jobs have Localisation cache under /tmp causing cache pollution between runs - https://phabricator.wikimedia.org/T120356#1851789 (10hashar) Might have caused T120824 [14:15:22] 10MediaWiki-Releasing, 6Developer-Relations, 10Wikimedia-Blog-Content, 3DevRel-December-2015, 5MW-1.26-release: Write blog post announcing MW 1.26 - https://phabricator.wikimedia.org/T112842#1865220 (10Qgil) @greg could you help assessing what is missing and/or when we are ready, please? [14:18:41] hashar: thanks! [14:28:55] so many builds [14:42:24] 10Continuous-Integration-Infrastructure: Dozens of jobs failing on integration-slave-trusty-1012 because chmod fails for /tmp/jenkins-2 - https://phabricator.wikimedia.org/T120824#1865302 (10JanZerebecki) Cleaned again jenkins-3 on integration-slave-trusty-1016.eqiad.wmflabs. [14:43:14] 5Testing-Initiative-2015, 10Browser-Tests-Infrastructure, 7JavaScript, 5Patch-For-Review: Experiment with browser testing in other software languages - https://phabricator.wikimedia.org/T108874#1865305 (10zeljkofilipin) [14:43:44] jzerebecki: somehow the tmpfs dir is created by Apache via qunit [14:43:54] I am trying to find which job does it [14:47:18] 3Scap3: Build a dependency graph resolver for deployment stages and tasks - https://phabricator.wikimedia.org/T120684#1865312 (10mmodell) @mobrovac: This will be a lot simpler than puppet. 
And 'among stages' probably doesn't need to use a graph resolver. You can see what I have so far in {D76} [14:56:40] 7Browser-Tests, 10Continuous-Integration-Config, 7Ruby: Cucumber linter should run for all repositories that contain Cucumber code - https://phabricator.wikimedia.org/T58251#1865340 (10zeljkofilipin) a:5zeljkofilipin>3None [15:15:42] thcipriani|afk or others: A lot of the ‘staging’ project is rotting due to puppet failures. The first I found is: [15:15:42] Could not find class ::scap::target for staging-restbase03.staging.eqiad.wmflabs [15:15:49] Can someone (who isn’t me) try to resolve those? [15:16:13] andrewbogott: yes, I can take a look at that. [15:16:29] mobrovac: now gerrit asks me for password when I do "git review" but says "remote: Unauthorized" when I type it :( [15:16:30] thcipriani: probably you can’t log in to any of the instances due to… rotting [15:16:43] * zeljkof shakes fist towards gerrit [15:16:45] but maybe you can fix via wikitech node definitions [15:17:25] andrewbogott: I can get to staging-palladium (salt master and puppetmaster) might be enough to fix all the things [15:17:35] zeljkof: that's normal, its' anonymous https :/ [15:17:38] great, thanks [15:17:49] mobrovac: but wait [15:18:03] how am I supposed to develop in a repo with anon https? [15:18:18] I mean, how do I push changes to gerrit? 
[15:18:53] I have changed the remote url to (example) https://zfilipin@gerrit.wikimedia.org/r/mediawiki/extensions/MultimediaViewer/
[15:19:33] zeljkof: git config insteadOf trick
[15:19:38] wait, looking for docs on that
[15:19:41] I have used HTTP, not anon HTTP
[15:19:56] zeljkof: https://www.mediawiki.org/wiki/MediaWiki-Vagrant#Pushing_commits
[15:20:27] mobrovac: thanks, looking
[15:20:44] zeljkof: you run that on the host, not in the guest
[15:20:58] this will keep vagrant git-update working, while allowing you to push changes to gerrit
[15:21:19] zeljkof: note that this implies you will be pushing changes from your host, not from the guest
[15:21:44] mobrovac: yes, that is my workflow
[15:21:50] perfect
[15:21:54] problem solved then :)
[15:23:43] still have to change all my https://zfilipin@gerrit... to git clone https://gerrit..., but that is doable :)
[15:25:41] (PS5) EBernhardson: Enable submodules for operations/mediawiki-config phpunit tests [integration/config] - https://gerrit.wikimedia.org/r/256979
[15:26:11] (PS6) EBernhardson: Enable submodules for operations/mediawiki-config phpunit tests [integration/config] - https://gerrit.wikimedia.org/r/256979
[15:32:26] mobrovac: thanks a lot, works great
[15:32:32] was not aware of that until now
[15:33:10] np zeljkof
[15:38:05] andrewbogott: fixed! Thanks for the heads up.
[15:38:27] thcipriani: hopefully you’ll be able to log in to your instances again soon :)
[15:39:00] forced a puppet run via salt; they're all coming back :)
[15:39:12] Continuous-Integration-Infrastructure: Dozens of jobs failing on integration-slave-trusty-1012 because chmod fails for /tmp/jenkins-2 - https://phabricator.wikimedia.org/T120824#1865542 (hashar) So seems mwext-Wikibase-qunit does a few curl requests to the web host before the tmpfs directory had a chance to...
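For readers unfamiliar with the "insteadOf trick" mentioned above: git's `url.<base>.insteadOf` setting rewrites any URL that starts with the configured prefix so that it starts with `<base>` instead, and when several prefixes match, the longest one is used. The sketch below only simulates that rewriting rule in Python to show why the remotes can stay anonymous https while pushes go over ssh. It is not git's implementation, `rewrite_url` is a hypothetical helper, and the rule shown is an assumption modelled on the URLs in this log (with `USERNAME` as a placeholder); the exact command to run is on the wiki page linked in the conversation.

```python
def rewrite_url(url, instead_of):
    """Simulate git's url.<base>.insteadOf rewriting.

    `instead_of` maps a replacement base to the prefix it stands in for.
    The longest matching prefix wins, as described in the git-config docs.
    """
    best_prefix, best_base = "", None
    for base, prefix in instead_of.items():
        if url.startswith(prefix) and len(prefix) > len(best_prefix):
            best_prefix, best_base = prefix, base
    if best_base is None:
        return url  # no rule applies, URL is used unchanged
    return best_base + url[len(best_prefix):]


# Assumed rule: the stored remote is anonymous https, traffic goes over ssh.
rules = {
    "ssh://USERNAME@gerrit.wikimedia.org:29418/": "https://gerrit.wikimedia.org/r/",
}

rewritten = rewrite_url("https://gerrit.wikimedia.org/r/mediawiki/core", rules)
untouched = rewrite_url("https://example.org/some/repo", rules)
print(rewritten)
print(untouched)
```

Because the rule lives in the host's git configuration, the guest still sees plain https remotes, which matches the advice above to run it on the host and push from there.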
[15:56:27] !log refreshing nodepool snapshot instance, need a new etcd version [15:56:31] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [15:57:19] ostriches: yup ^^^ :-} [16:03:36] (03PS1) 10Hashar: Add python-etcd to the latest version (for conftool) [integration/config] - 10https://gerrit.wikimedia.org/r/257906 [16:05:38] !log Image ci-jessie-wikimedia-1449676603 in wmflabs-eqiad is ready [16:05:41] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [16:06:42] (03CR) 10Giuseppe Lavagetto: [C: 031] Add python-etcd to the latest version (for conftool) [integration/config] - 10https://gerrit.wikimedia.org/r/257906 (owner: 10Hashar) [16:07:16] (03CR) 10Hashar: [C: 032] "Puppet patch got merged https://gerrit.wikimedia.org/r/#/c/257904/" [integration/config] - 10https://gerrit.wikimedia.org/r/257906 (owner: 10Hashar) [16:08:29] (03Merged) 10jenkins-bot: Add python-etcd to the latest version (for conftool) [integration/config] - 10https://gerrit.wikimedia.org/r/257906 (owner: 10Hashar) [16:15:25] !log Refreshing nodepool snapshots will hopefully grab python-etcd ( https://gerrit.wikimedia.org/r/257906 ) [16:15:28] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [16:18:53] (03CR) 10Hashar: "I have refreshed the snapshot:" [integration/config] - 10https://gerrit.wikimedia.org/r/257906 (owner: 10Hashar) [16:19:11] !log Image ci-jessie-wikimedia-1449677602 in wmflabs-eqiad is ready ( comes with python-etcd ) [16:19:15] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [16:25:40] 10Continuous-Integration-Infrastructure, 5Continuous-Integration-Scaling: Nodepool snapshot refresh should run apt-get upgrade - https://phabricator.wikimedia.org/T120961#1865688 (10hashar) 3NEW [16:27:40] 6Release-Engineering-Team, 7Ruby, 7Tracking: Fix easy problems reported by RuboCop - 
https://phabricator.wikimedia.org/T91485#1865703 (10zeljkofilipin) [16:30:38] 10Continuous-Integration-Infrastructure, 5Continuous-Integration-Scaling: Nodepool snapshot refresh should run apt-get upgrade - https://phabricator.wikimedia.org/T120961#1865719 (10hashar) Example from an instance: ``` $ apt-get -s upgrade NOTE: This is only a simulation! apt-get needs root privileges f... [16:32:35] 7Browser-Tests, 10Browser-Tests-Infrastructure, 10CirrusSearch, 6Discovery, 7Ruby: Upgrade CirrusSearch browser tests to use mediawiki_selenium 1.x - https://phabricator.wikimedia.org/T99653#1865728 (10zeljkofilipin) a:3zeljkofilipin [16:38:02] 10Continuous-Integration-Infrastructure, 5Continuous-Integration-Scaling: Nodepool instances have duplicate entry for jessie-backports/main - https://phabricator.wikimedia.org/T120963#1865746 (10hashar) 3NEW [16:43:58] 7Browser-Tests, 10Browser-Tests-Infrastructure, 10CirrusSearch, 6Discovery, 7Ruby: Upgrade CirrusSearch browser tests to use mediawiki_selenium 1.x - https://phabricator.wikimedia.org/T99653#1865773 (10zeljkofilipin) @dduvall: I think you have started work on this, but I can not find the patch: https://... [16:49:26] 10Continuous-Integration-Infrastructure: Phase out gallium.wikimedia.org - https://phabricator.wikimedia.org/T95757#1865792 (10hashar) So this need to happen. Precise is definitely legacy and we need to migrate straight up to Jessie. There is a bunch of challenges though since part of what is running on galliu... [16:50:24] 10Continuous-Integration-Infrastructure: Dozens of jobs failing on integration-slave-trusty-1012 because chmod fails for /tmp/jenkins-2 - https://phabricator.wikimedia.org/T120824#1865801 (10JanZerebecki) If the problem is in mwext-Wikibase-qunit but not in mwext-qunit-composer, the first can just be replaced by... [16:51:30] hashar: is that^^ the case? 
[16:54:42] Continuous-Integration-Infrastructure: Dozens of jobs failing on integration-slave-trusty-1012 because chmod fails for /tmp/jenkins-2 - https://phabricator.wikimedia.org/T120824#1865814 (hashar) Seems `mwext-Wikibase-qunit` has the issue. I haven't looked at mwext-qunit-composer though but it might well suf...
[16:54:42] jzerebecki: replied there
[16:54:42] mwext-qunit-composer might have the same issue
[16:54:43] I haven't looked closely at it
[16:54:43] one sure thing: we have too many of those slave scripts and need some kind of monolithic test runner
[16:54:43] would be easier to grasp what is really happening
[16:55:25] PROBLEM - Puppet failure on wmfbranch is CRITICAL: CRITICAL: 37.50% of data above the critical threshold [0.0]
[17:03:37] thcipriani: I think you are looking at updating staging for the ldap things?
[17:03:40] we also have
[17:03:42] https://etherpad.wikimedia.org/p/remaining-ldap
[17:03:58] these deployment VMs are in a bad state
[17:04:00] 1 deployment-eventlogging03.deployment-prep.eqiad.wmflabs.
[17:04:00] 1 deployment-kafka03.deployment-prep.eqiad.wmflabs.
[17:04:01] 1 deployment-mx.deployment-prep.eqiad.wmflabs.
[17:04:44] chasemp: staging should be mostly good, I can doublecheck here after a meeting and finishing SWAT.
[17:05:06] thcipriani: thanks -- i crossed off staging things and put your name there already :)
[17:05:14] I saw that :)
[17:05:15] just wanting to follow up on deployment stuffs but yep
[17:07:07] thcipriani: did someone switch off automatic submodule updates?
[17:07:45] jzerebecki: they should still work, I'm not sure they worked for SpecialExtensions (i.e. extensions not branched with core), did Wikidata work for .7?
[17:08:04] if so, I can take a look at the .gitmodules file, see if there are any differences.
[17:09:23] hmm, interesting. So it _does_ look like the .gitmodules for .7 for the wikidata submodule tracks a branch, whereas it does not for .8.
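The difference spotted above (a `branch` entry present in one `.gitmodules` and absent in the other) is the kind of thing that can be checked mechanically. The following is a rough sketch under stated assumptions: the two `.gitmodules` snippets are invented for illustration (the path, URL and branch name are placeholders, not the real wmf.7/wmf.8 contents), and `submodules_tracking_branch` is a hypothetical helper built on Python's standard `configparser`.

```python
import configparser


def submodules_tracking_branch(gitmodules_text):
    """Map each submodule section to its `branch` value (None if absent)."""
    # Strip indentation so configparser never treats a line as a continuation.
    cleaned = "\n".join(line.strip() for line in gitmodules_text.splitlines())
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cleaned)
    return {name: parser[name].get("branch") for name in parser.sections()}


# Invented example contents; placeholders only, not the real files.
WITH_BRANCH = """
[submodule "extensions/Wikidata"]
	path = extensions/Wikidata
	url = https://example.org/Wikidata.git
	branch = example-branch
"""

WITHOUT_BRANCH = """
[submodule "extensions/Wikidata"]
	path = extensions/Wikidata
	url = https://example.org/Wikidata.git
"""

tracked = submodules_tracking_branch(WITH_BRANCH)
untracked = submodules_tracking_branch(WITHOUT_BRANCH)
print(tracked)
print(untracked)
```

Comparing the two dictionaries shows at a glance which submodules lost their branch tracking between two versions of the file.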
[17:09:53] thcipriani: that is it, creating patch [17:10:02] jzerebecki: thank you! [17:15:41] 10Continuous-Integration-Infrastructure: Dozens of jobs failing on integration-slave-trusty-1012 because chmod fails for /tmp/jenkins-2 - https://phabricator.wikimedia.org/T120824#1865910 (10JanZerebecki) Yea all qunit jobs have that problem as they all use the qunit-karma macro. [17:16:10] How does salt work? [17:16:12] Actually, scratch that. [17:16:16] How does salt attempt to work? [17:16:28] thcipriani: https://gerrit.wikimedia.org/r/#/c/257932/ [17:21:03] chasemp, so I can't actually get into deployment-kafka03 [17:22:49] chasemp, same with deployment-mx [17:22:52] even as root [17:23:09] I suppose puppet is disabled on them [17:23:09] Krenair: me neither ha the second one I can [17:23:22] well it may have been, idk how stale these are tbh yet [17:24:42] I should be able to log in as root on anything that's run puppet in the last few hours [17:25:57] it's a problem [17:27:03] eventlogging03 seems fine so I struck that from the list [17:27:07] kk [17:27:25] I can make deployment-mx work via salt, one sec [17:29:13] why is it running all salt commands twice? [17:29:27] or sometimes running them twice [17:29:27] hm [17:35:08] (03PS1) 10Krinkle: Remove unused legacy wmfgrunt 'qunit-querystring' macros [integration/config] - 10https://gerrit.wikimedia.org/r/257936 [17:35:54] (03CR) 10Krinkle: [C: 032] Remove unused legacy wmfgrunt 'qunit-querystring' macros [integration/config] - 10https://gerrit.wikimedia.org/r/257936 (owner: 10Krinkle) [17:37:29] (03Merged) 10jenkins-bot: Remove unused legacy wmfgrunt 'qunit-querystring' macros [integration/config] - 10https://gerrit.wikimedia.org/r/257936 (owner: 10Krinkle) [17:39:49] Krenair: salt is a mystery :) [17:40:07] kafka03 is up and responding to ping [17:40:14] but not salt and does not seem to run puppet [17:40:33] can you get in as root? 
[17:41:02] console output makes it look very unhappy indeed [17:41:23] I cannot [17:41:28] (03PS1) 10Krinkle: Remove legacy wmfgrunt infrastructure [integration/jenkins] - 10https://gerrit.wikimedia.org/r/257939 [17:41:50] 2015-12-09T17:15:34.810119+00:00 deployment-kafka03 salt-minion[1214]: [ERROR ] This master address: 'None' was previously resolvable but now fails to resolve! The previously resolved ip addr will continue to be used [17:41:51] (03CR) 10Krinkle: [C: 032] Remove legacy wmfgrunt infrastructure [integration/jenkins] - 10https://gerrit.wikimedia.org/r/257939 (owner: 10Krinkle) [17:42:45] (03PS1) 10Hashar: nodepool: run apt-get upgrade while snapshoting [integration/config] - 10https://gerrit.wikimedia.org/r/257940 (https://phabricator.wikimedia.org/T120961) [17:42:58] 10Continuous-Integration-Infrastructure, 5Continuous-Integration-Scaling, 5Patch-For-Review: Nodepool snapshot refresh should run apt-get upgrade - https://phabricator.wikimedia.org/T120961#1865981 (10hashar) a:3hashar [17:43:00] (03Merged) 10jenkins-bot: Remove legacy wmfgrunt infrastructure [integration/jenkins] - 10https://gerrit.wikimedia.org/r/257939 (owner: 10Krinkle) [17:44:28] 10Continuous-Integration-Infrastructure, 5Continuous-Integration-Scaling, 5Patch-For-Review: Nodepool snapshot refresh should run apt-get upgrade - https://phabricator.wikimedia.org/T120961#1865688 (10hashar) Gotta manually refresh a snapshot to make sure it works as described on https://wikitech.wikimedia.o... 
[17:45:45] 10Continuous-Integration-Infrastructure, 5Continuous-Integration-Scaling, 5Patch-For-Review: Nodepool snapshot refresh should run apt-get upgrade - https://phabricator.wikimedia.org/T120961#1865996 (10hashar) p:5Triage>3Normal [17:46:15] Krenair: salt keeps breaking on beta for some reason [17:46:37] Krenair: I usually do something like: killall salt-minion ; rm /var/run/salt-minion.pid ; /etc/init.d/salt-minion start [17:46:44] and I run my salt commands with: [17:46:54] salt --timeout 20 --show-timeout ... [17:47:24] (03CR) 10Hashar: "Thanks to have taken the care of cleanup legacy stuff!" [integration/jenkins] - 10https://gerrit.wikimedia.org/r/257939 (owner: 10Krinkle) [17:47:43] yw :) [17:47:52] :-} [17:49:59] !log salt-key --delete deployment-sentry2.eqiad.wmflabs ( already have deployment-sentry2.deployment-prep.eqiad.wmflabs ) [17:50:02] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [17:50:25] good luck, I am out :-} [17:51:39] as soon as that instance starts, the console starts showing issues: [17:51:40] 2015-12-09T17:48:58.664385+00:00 deployment-kafka03 puppet-agent[674]: Could not request certificate: Connection refused - connect(2) for "" port 8140 [17:51:40] 2015-12-09T17:48:58.664699+00:00 deployment-kafka03 rc.local[364]: #033[1;31mError: Could not request certificate: Connection refused - connect(2) for "" port 8140#033[0m [17:51:40] 2015-12-09T17:48:59.067046+00:00 deployment-kafka03 salt-minion[647]: [ERROR ] This master address: 'None' was previously resolvable but now fails to resolve! The previously resolved ip addr will continue to be used [17:53:41] Hm. notifications from phab about https://phabricator.wikimedia.org/T120824 are not going to -dev and not to -releng [17:53:43] where do they go? 
[17:53:54] Continuous-Integration-Infrastructure: Dozens of jobs failing on integration-slave-trusty-1012 because chmod fails for /tmp/jenkins-2 - https://phabricator.wikimedia.org/T120824#1866040 (Krinkle) a:Krinkle>None [17:53:58] can you not get an interactive console from the actual host, chasemp? [17:54:02] OK. I guess it was down for a minute [17:55:53] * Krenair is being dragged off to socialise. back later [17:57:40] Krenair: I usually have to be dragged kicking and screaming. [18:00:00] thcipriani: any config changes deployed with the train yesterday? debugging an issue where all elasticsearch writes except phase0 look to have stopped at 23:30 UTC yesterday [18:00:01] nothing quite matches 23:30 in SAL, but train was around then [18:00:24] ....all writes have stopped? [18:00:31] yes :/ [18:00:44] ostriches: well, phase0 is still writing [18:00:58] mediawikiwiki and testwiki [18:02:21] config seems to be good for enwiki (wgCirrusSearchWriteClusters, wgCirrusSearchDefaultCluster, wgCirrusSearchClusters), tested with eval on terbium [18:02:59] PROBLEM - Free space - all mounts on deployment-bastion is CRITICAL: CRITICAL: deployment-prep.deployment-bastion.diskspace._var.byte_percentfree (<12.50%) [18:03:27] Are jobs being popped and executed or just piling up? [18:03:38] ostriches: job queue looks empty [18:03:47] Hmm, so not inserting them at all. Ok.... 
[18:03:54] Narrows it down a tad [18:04:10] strange because I can see jobs in runJobs.log (like cirrusSearchLinksUpdatePrioritized) [18:04:16] ostriches: i think some are being inserted, because i still see us in https://grafana.wikimedia.org/dashboard/db/job-queue-rate [18:04:38] ostriches: but incoming link count fell off the duplicate inserts (from 80% to 0%) at same time [18:05:06] ebernhardson: the only config changes that went out yesterday with the train were related to the Cards extension [18:05:13] (in meeting) [18:05:29] RECOVERY - Puppet failure on wmfbranch is OK: OK: Less than 1.00% above the threshold [0.0] [18:05:43] ostriches: the links update prioritized still makes up 30% of job queue completed though, so i think they are going in [18:05:58] ebernhardson: Different meeting, but ya. I'll start poking in a few if you guys don't figure it out first. [18:06:02] I'll catch any scrollback [18:06:03] ok thanks [18:06:10] will switch back to -discovery [18:20:53] Continuous-Integration-Config, Analytics: add CI for repos analytics/limn-*-data - https://phabricator.wikimedia.org/T117416#1866246 (Milimetric) Open>declined a:Milimetric we will focus on phasing out limn next year, as part of the project code named {frog} [18:21:48] Beta-Cluster-Infrastructure, EventBus, Beta-Cluster-reproducible: Beta cluster crashes on page save - https://phabricator.wikimedia.org/T120980#1866249 (Yurik) NEW [18:22:12] (CR) Paladox: "@Hashar this can be merged now the patch was merged." [integration/config] - https://gerrit.wikimedia.org/r/253287 (owner: Paladox) [18:22:49] (CR) Paladox: [C: -1] "Woops sorry nope this carn't be merged." 
[integration/config] - https://gerrit.wikimedia.org/r/253287 (owner: Paladox) [18:24:53] Continuous-Integration-Infrastructure: Dozens of jobs failing on integration-slave-trusty-1012 because chmod fails for /tmp/jenkins-2 - https://phabricator.wikimedia.org/T120824#1866266 (JanZerebecki) What I don't yet understand: mw-install-sqlite.sh -> mw-setup.sh -> global-setup.sh Which would mean that al... [18:25:41] (PS1) Paladox: [LiquidThreads] Update Jenkins tests [integration/config] - https://gerrit.wikimedia.org/r/257955 [18:26:33] (CR) Paladox: "This can be merged." [integration/config] - https://gerrit.wikimedia.org/r/257955 (owner: Paladox) [18:28:40] (PS1) Paladox: [ContentTranslation] Move jshint to check: [integration/config] - https://gerrit.wikimedia.org/r/257957 [18:29:28] (CR) Paladox: "This can be merged." [integration/config] - https://gerrit.wikimedia.org/r/257957 (owner: Paladox) [18:31:02] Yippee, build fixed! [18:31:02] Project browsertests-CentralNotice-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #527: FIXED in 1 min 0 sec: https://integration.wikimedia.org/ci/job/browsertests-CentralNotice-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/527/ [18:50:14] Release-Engineering-Team, Ruby, Tracking: Fix easy problems reported by RuboCop - https://phabricator.wikimedia.org/T91485#1866358 (Jdlrobson) [18:50:16] Browser-Tests, Browser-Tests-Infrastructure, MediaWiki-extensions-MultimediaViewer, Patch-For-Review, and 2 others: Fix easy problems reported by RuboCop in MultimediaViewer - https://phabricator.wikimedia.org/T117984#1866356 (Jdlrobson) Open>Resolved [19:22:54] Beta-Cluster-Infrastructure, EventBus, Beta-Cluster-reproducible: Beta cluster crashes on page save - https://phabricator.wikimedia.org/T120980#1866458 (Luke081515) Can't reproduce this: http://en.wikipedia.beta.wmflabs.org/w/index.php?title=DynamicGraph&type=revision&diff=295009&oldid=294386 What di... 
[19:50:11] Beta-Cluster-Infrastructure, EventBus, Beta-Cluster-reproducible: Beta cluster crashes on page save - https://phabricator.wikimedia.org/T120980#1866563 (Yurik) I either added a blank line or it was a null edit [20:31:05] Deployment-Systems, Architecture, Wikimedia-Developer-Summit-2016-Organization, Availability: WikiDev 16 working area: Software engineering - https://phabricator.wikimedia.org/T119032#1866753 (RobLa-WMF) >>! In T119032#1864891, @daniel wrote: > I personally think that factoring services out of cor... [20:33:42] MediaWiki-Releasing: Ready-to-use Docker package for MediaWiki - https://phabricator.wikimedia.org/T92826#1866757 (GWicke) A few tweaks later RESTBase and Parsoid are now working out of the box. I have also created a wikimedia/mediawiki image based on benhutchin's original Dockerfile. Next steps: - Hook up... [20:49:53] MediaWiki-Releasing: Ready-to-use Docker package for MediaWiki - https://phabricator.wikimedia.org/T92826#1866791 (Jdforrester-WMF) Nice! [21:13:36] (PS1) Thcipriani: Ensure changes to an extension's .gitreview file [tools/release] - https://gerrit.wikimedia.org/r/258030 [21:14:27] (CR) jenkins-bot: [V: -1] Ensure changes to an extension's .gitreview file [tools/release] - https://gerrit.wikimedia.org/r/258030 (owner: Thcipriani) [21:16:55] (PS2) Thcipriani: Ensure changes to an extension's .gitreview file [tools/release] - https://gerrit.wikimedia.org/r/258030 [21:17:36] Project browsertests-Flow-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #853: ABORTED in 51 min: https://integration.wikimedia.org/ci/job/browsertests-Flow-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce/853/ [21:25:58] Beta-Cluster-Infrastructure, EventBus, Beta-Cluster-reproducible: Beta cluster crashes on page save - https://phabricator.wikimedia.org/T120980#1866961 (Luke081515) @Yurik Can you replicate your issue again? I can't replicate it with a blank line or a null edit. 
:-/ [21:53:17] Jenkins is upgraded [21:53:25] to 1.625.3 [21:53:32] see the sec advisory at https://wiki.jenkins-ci.org/display/SECURITY/Jenkins+Security+Advisory+2015-12-09 [21:54:01] an impact is that Content-Security-Policy headers are now sent to prevent img/javascript/css from being loaded when one looks at an archived .html file [21:54:18] the only use case I know of is for the apps/android/wikipedia.git repo and I have told them about it [21:54:47] we will want to redirect to ci.wmfusercontent.org or something like that [21:58:20] Beta-Cluster-Infrastructure, EventBus, Beta-Cluster-reproducible: Beta cluster crashes on page save - https://phabricator.wikimedia.org/T120980#1867110 (Yurik) Open>Resolved a:Yurik nope, all works now, guess its a self-healing bug )) [22:07:02] Beta-Cluster-Infrastructure, EventBus, Beta-Cluster-reproducible: Beta cluster crashes on page save - https://phabricator.wikimedia.org/T120980#1867166 (hashar) Resolved>Open That is related to #EventBus ... I believe the error should be investigated by them. [22:10:37] RECOVERY - Puppet staleness on deployment-eventlogging03 is OK: OK: Less than 1.00% above the threshold [3600.0] [22:39:15] thcipriani: twentyafterfour deployment-kafka03 is hosed can we propose for delete? [22:39:17] no one can get in [22:39:17] etc [22:42:09] chasemp: Looks like a newish instance, not sure it even has any roles...trying to find some info about who created it. [22:43:16] thanks thcipriani [22:52:14] marxarelli: I'm curious if when we finish debian packaging of scap, we'd be able to put scap (the library) in the python libs so we can skip the relative imports in bin/. That'd let us just install our bin files directly to /usr/local/bin instead of making the symlinks too. [22:52:31] yeah, looks like deployment-kafka03 has no roles assigned via ldap, no hieradata, nothing in puppet at all on the instance. 
[22:52:32] Although, now that I write it out I wonder if that makes scap harder to use standalone/in-dev where you don't have a package yet. [22:54:01] ostriches: we should be able to package it without the bin stubs if that makes sense [22:54:18] though i'm not sure if it does without some complementary scap-bin package [22:54:59] i.e. a goal for packaging post-refactor might be to separate the core bits from the deploy-host side bits [22:55:15] Or we could package scap (the library) as libscap :) [22:55:25] scap-core, scap-deploy, scap-mediawiki, etc. [22:55:30] oh, werd [22:55:32] yuck [22:55:33] libscap :) [22:55:41] libscap3? [22:55:41] bd808: :P [22:55:43] ;-P [22:55:43] why make it all so dam complicated? [22:55:59] It's a simple tool ffs [22:56:34] I get the security issue but not a desire to split it into a zillion little parts [22:56:41] That's just idle chit-chat. [22:56:49] "Might" [22:56:50] "Maybe" [22:56:51] etc. [22:57:09] * bd808 is crabby about other shit today and should just stay off of irc ;) [22:57:09] I don't see us actually building more than one package. [22:57:16] /kick bd808 [22:57:43] bd808: because we're refactoring it into well defined parts that should be easily separated [22:59:36] marxarelli: is D70 actually WIP? It's accepted and land-able but commit summary misleads. [23:00:04] ostriches: yeah, `arc diff` isn't updating the summary for me [23:00:17] Edit it on phab. [23:00:24] It'll get squashed in when you land. [23:00:26] blerg. [23:06:01] * greg-g chest/belly bumps bd808 to cheer him up [23:41:06] !log delete deployment-kafka03 doesn't seem to be in-use yet and cannot be accessed via salt or ssh by root or anyone [23:41:10] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [23:44:43] PROBLEM - Host deployment-kafka03 is DOWN: CRITICAL - Host Unreachable (10.68.16.176)