[00:04:14] Release-Engineering: RelEng Roadmap April - June 2015 (Q4 2014/2015) - https://phabricator.wikimedia.org/T93955#1151750 (bd808)
[00:04:15] Release-Engineering, MediaWiki-Core-Team, Multimedia, Parsoid-Team, and 3 others: Prepare Platform/Ops April 2015 quarterly review presentation - https://phabricator.wikimedia.org/T91803#1151749 (bd808)
[00:04:54] Release-Engineering, MediaWiki-Core-Team, Multimedia, Parsoid-Team, and 3 others: Prepare Platform/Ops April 2015 quarterly review presentation - https://phabricator.wikimedia.org/T91803#1096558 (bd808)
[00:04:56] Release-Engineering: RelEng Roadmap April - June 2015 (Q4 2014/2015) - https://phabricator.wikimedia.org/T93955#1151060 (bd808)
[00:05:29] Project browsertests-VisualEditor-language-screenshot-os_x_10.10-firefox » om,contintLabsSlave && UbuntuTrusty build #30: FAILURE in 3 hr 57 min: https://integration.wikimedia.org/ci/job/browsertests-VisualEditor-language-screenshot-os_x_10.10-firefox/LANGUAGE_SCREENSHOT_CODE=om,label=contintLabsSlave%20&&%20UbuntuTrusty/30/
[00:08:30] Yippee, build fixed!
[00:08:30] Project browsertests-Gather-en.m.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #54: FIXED in 6 min 53 sec: https://integration.wikimedia.org/ci/job/browsertests-Gather-en.m.wikipedia.beta.wmflabs.org-linux-chrome-sauce/54/
[00:13:12] bd808, had a bit of a weird issue with sync-dir wmf-config earlier
[00:13:30] Krenair: oh?
[00:13:38] ran it again and it was fine
[00:13:41] but still :/
[00:14:01] so far you have not reported a bug I can do anything about ;)
[00:14:12] "a bit of a weird issue"?
[00:14:28] bd808, https://phabricator.wikimedia.org/P434
[00:14:39] I was busy checking the console output to be sure I could paste it
[00:15:01] hmm
[00:15:29] mw1010 was resyncing at the same time it was serving to mw1012 maybe?
[00:15:35] That's not supposed to happen
[00:18:14] I wonder if one list uses fqdn and the other is only hostname?
[00:18:18] * bd808 looks on tin
[00:18:51] frack. yup
[00:19:18] Krenair: this is a puppet config bug
[00:19:34] fun
[00:19:42] we deploy to frack...?
[00:20:24] we have one file that lists fqdns like mw1010.eqiad.wmnet and another that only uses hostnames like mw1010
[00:20:58] The first list is subtracted from the second but... since the names are different that doesn't stop problems
[00:21:26] Heh. "Frack" as in the BSG version of "fuck"
[00:21:56] right, not as in fundraising rack :p
[00:22:16] so the manually edited list in puppet needs to be updated to make the names fqdns
[00:22:36] because the other list is generated by puppet magic I think
[00:22:57] would you mind opening a bug for this?
[00:25:41] sure
[00:26:43] Deployment-Systems, operations: Random one-off deployment failure for one host - https://phabricator.wikimedia.org/T93983#1151789 (Krenair) NEW a:bd808
[00:38:03] Deployment-Systems, operations: Random one-off deployment failure for one host - https://phabricator.wikimedia.org/T93983#1151838 (bd808) It appears that mw1010.eqiad.wmnet was running rsync from another source at the same time as mw1012.eqiad.wmnet was syncing from the rsync server on mw1010.eqiad.wmnet....
[02:55:41] (PS1) Mattflaschen: Set wgFlowContentFormat to wikitext until we have Parsoid on Jenkins [integration/jenkins] - https://gerrit.wikimedia.org/r/199818
[03:02:13] (CR) Mattflaschen: "I'd like to get this deployed. I believe it's correct (given we don't have Parsoid here yet), and should fix the remaining issue with htt" [integration/jenkins] - https://gerrit.wikimedia.org/r/199818 (owner: Mattflaschen)
[03:54:44] Continuous-Integration: Post build exceptions from jUnit about old/invalid log files - https://phabricator.wikimedia.org/T93993#1152092 (Krinkle) NEW a:Krinkle
[03:56:08] thcipriani|afk, is tin:/srv/mediawiki-staging/silver.dblist yours?
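Editor's annotation on the 00:20 exchange above: the list mismatch bd808 describes can be sketched as a set subtraction that silently removes nothing. This is only a minimal illustration, assuming both lists are plain collections of host names. The variable names and the short_name helper are hypothetical and are not taken from scap or puppet. The fix proposed in the chat is to use FQDNs in both lists; the sketch normalises to short names instead, only because that needs no knowledge of the domain.

```python
def short_name(host):
    """Return the host name without its domain part (hypothetical helper)."""
    return host.split(".", 1)[0]


# Hypothetical data modelled on the host names mentioned in the chat.
proxies_fqdn = {"mw1010.eqiad.wmnet"}   # list that uses FQDNs
targets_short = {"mw1010", "mw1012"}    # list that uses bare hostnames

# Naive subtraction: nothing is removed because the strings never match.
naive = targets_short - proxies_fqdn

# Normalising both sides to the same form makes the subtraction effective.
normalised = {short_name(h) for h in targets_short} - {
    short_name(h) for h in proxies_fqdn
}

print(sorted(naive))
print(sorted(normalised))
```

In the naive case mw1010 stays in the target set even though it is also a sync source, which matches the overlap suspected at 00:15:29.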
[03:59:01] (PS1) Krinkle: mw-setup: Clear log directory earlier (setup instead of apply-settings) [integration/jenkins] - https://gerrit.wikimedia.org/r/199821
[03:59:10] (PS2) Krinkle: mw-setup: Clear log directory earlier (setup instead of apply-settings) [integration/jenkins] - https://gerrit.wikimedia.org/r/199821 (https://phabricator.wikimedia.org/T93993)
[03:59:44] Project browsertests-Core-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #561: FAILURE in 12 min: https://integration.wikimedia.org/ci/job/browsertests-Core-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/561/
[04:06:47] (CR) Krinkle: [C: 2] "Tested on integration-slave1405 with mediawiki-extensions-hhvm to verify." [integration/jenkins] - https://gerrit.wikimedia.org/r/199821 (https://phabricator.wikimedia.org/T93993) (owner: Krinkle)
[04:07:20] (Merged) jenkins-bot: mw-setup: Clear log directory earlier (setup instead of apply-settings) [integration/jenkins] - https://gerrit.wikimedia.org/r/199821 (https://phabricator.wikimedia.org/T93993) (owner: Krinkle)
[04:18:53] Continuous-Integration, Patch-For-Review: Post build exceptions from jUnit about old/invalid log files - https://phabricator.wikimedia.org/T93993#1152126 (Krinkle) Open>Resolved
[04:35:35] (CR) Legoktm: [C: -1] "I don't think this is a good idea. Instead you can do something similar to CirrusSearch, which checks if $wgWikimediaJenkinsCI === true, a" [integration/jenkins] - https://gerrit.wikimedia.org/r/199818 (owner: Mattflaschen)
[04:45:55] Release-Engineering, WMF-Legal, operations, Documentation: Sphinx generated documentation should state license in footer - https://phabricator.wikimedia.org/T94000#1152198 (Mattflaschen) NEW
[04:46:13] Krinkle: now that testextension-zend jobs are on labs slave, we can clean up the workspaces from lanthanum/gallium right?
[04:46:56] Krinkle: also, if a job can run on both labs and prod slaves, should it still be pinned to labs slaves for https://phabricator.wikimedia.org/T86659 ?
[04:53:16] legoktm: Yes and yes :)
[04:54:16] alright
[04:55:12] !log deleting mwext-*-testextension-zend workspaces from lanthanum
[04:55:58] haha
[04:55:59] [21:55:24] RECOVERY - Disk space on lanthanum is OK: DISK OK
[05:02:20] 8.1G --> 136G free
[05:02:23] damn
[05:04:00] a couple hundred full mediawiki clones will do that :D
[05:05:04] !log deleting mwext-*-testextension-zend workspaces from gallium
[06:17:08] Project browsertests-MultimediaViewer-mediawiki.org-linux-firefox-sauce build #543: FAILURE in 6 min 3 sec: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-mediawiki.org-linux-firefox-sauce/543/
[06:27:56] Deployment-Systems, operations: Use FQDNs for mediawiki-installation - https://phabricator.wikimedia.org/T93983#1152322 (greg)
[06:28:13] Deployment-Systems, operations: Use FQDNs for mediawiki-installation - https://phabricator.wikimedia.org/T93983#1151789 (greg) p:Triage>Normal
[06:29:07] 22:02 < legoktm> 8.1G --> 136G free
[06:29:07] 22:02 < legoktm> damn
[06:29:10] damn indeed
[06:32:03] (CR) Legoktm: [C: 2] Run "composer test" for WikiEditor, make phpcs voting [integration/config] - https://gerrit.wikimedia.org/r/199773 (owner: Legoktm)
[06:32:59] (CR) Legoktm: [C: 2] Add qunit job for WikiEditor [integration/config] - https://gerrit.wikimedia.org/r/199786 (owner: Legoktm)
[06:36:30] (Merged) jenkins-bot: Run "composer test" for WikiEditor, make phpcs voting [integration/config] - https://gerrit.wikimedia.org/r/199773 (owner: Legoktm)
[06:37:58] (Merged) jenkins-bot: Add qunit job for WikiEditor [integration/config] - https://gerrit.wikimedia.org/r/199786 (owner: Legoktm)
[06:39:30] !log deploying https://gerrit.wikimedia.org/r/199773 and https://gerrit.wikimedia.org/r/199786
[06:44:06] (PS1) Legoktm: Run 'yamllint' job only on labs slaves [integration/config] - https://gerrit.wikimedia.org/r/199832
[06:44:08] (PS1) Legoktm: Use standard jobs for WikiEditor [integration/config] - https://gerrit.wikimedia.org/r/199833
[06:44:44] (PS2) Legoktm: Use standard jobs for WikiEditor [integration/config] - https://gerrit.wikimedia.org/r/199833
[06:45:23] Release-Engineering, operations: Re-enable codfw scap targets - https://phabricator.wikimedia.org/T93958#1152382 (Joe) So, just to understand. We merged that change /before/ morning swat the other day, we had 2 swats and a train deploy who worked just fine before you disabled this (also, not disabling the...
[06:46:07] (CR) Legoktm: [C: 2] Use standard jobs for WikiEditor [integration/config] - https://gerrit.wikimedia.org/r/199833 (owner: Legoktm)
[06:47:25] (Merged) jenkins-bot: Use standard jobs for WikiEditor [integration/config] - https://gerrit.wikimedia.org/r/199833 (owner: Legoktm)
[06:47:54] !log deploying https://gerrit.wikimedia.org/r/199833
[06:47:59] Release-Engineering, operations: Re-enable codfw scap targets - https://phabricator.wikimedia.org/T93958#1152386 (MaxSem) Because eqiad hosts might've attempted to rsync from codfw or something?
[06:57:23] (PS2) Legoktm: Run 'yamllint' job only on labs slaves [integration/config] - https://gerrit.wikimedia.org/r/199832
[06:57:25] (PS1) Legoktm: Run ruby1.9.3lint jobs on trusty labs slaves [integration/config] - https://gerrit.wikimedia.org/r/199836
[06:58:16] (CR) Legoktm: [C: 2] Run 'yamllint' job only on labs slaves [integration/config] - https://gerrit.wikimedia.org/r/199832 (owner: Legoktm)
[07:01:01] (CR) Legoktm: [C: 2] Run ruby1.9.3lint jobs on trusty labs slaves [integration/config] - https://gerrit.wikimedia.org/r/199836 (owner: Legoktm)
[07:02:49] (Merged) jenkins-bot: Run 'yamllint' job only on labs slaves [integration/config] - https://gerrit.wikimedia.org/r/199832 (owner: Legoktm)
[07:05:25] (Merged) jenkins-bot: Run ruby1.9.3lint jobs on trusty labs slaves [integration/config] - https://gerrit.wikimedia.org/r/199836 (owner: Legoktm)
[07:10:39] (PS1) Legoktm: Use "composer test" to run phpcs for Parsoid extension [integration/config] - https://gerrit.wikimedia.org/r/199837
[07:17:00] (CR) Legoktm: [C: 2] Use "composer test" to run phpcs for Parsoid extension [integration/config] - https://gerrit.wikimedia.org/r/199837 (owner: Legoktm)
[07:21:24] (Merged) jenkins-bot: Use "composer test" to run phpcs for Parsoid extension [integration/config] - https://gerrit.wikimedia.org/r/199837 (owner: Legoktm)
[07:22:00] !log deploying https://gerrit.wikimedia.org/r/199837
[08:32:30] Project browsertests-VisualEditor-language-screenshot-os_x_10.10-firefox » cs,contintLabsSlave && UbuntuTrusty build #30: SUCCESS in 12 hr: https://integration.wikimedia.org/ci/job/browsertests-VisualEditor-language-screenshot-os_x_10.10-firefox/LANGUAGE_SCREENSHOT_CODE=cs,label=contintLabsSlave%20&&%20UbuntuTrusty/30/
[08:53:46] PROBLEM - Puppet failure on deployment-sca01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[08:54:24] PROBLEM - Puppet failure on deployment-cache-bits01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[08:55:12] Continuous-Integration, MediaWiki-Codesniffer, Possible-Tech-Projects, Google-Summer-of-Code-2015, Outreachy-Round-10: Improving static analysis tools for MediaWiki - https://phabricator.wikimedia.org/T89682#1152455 (Qgil)
[08:55:12] PROBLEM - Puppet failure on deployment-zotero01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[08:55:13] Continuous-Integration, MediaWiki-Codesniffer, Possible-Tech-Projects, Google-Summer-of-Code-2015: GSOC Project Proposal for the Idea : Improving static analysis tools for MediaWiki - https://phabricator.wikimedia.org/T93934#1152456 (Qgil)
[08:55:34] PROBLEM - Puppet failure on deployment-cache-text02 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[08:56:27] PROBLEM - Puppet failure on deployment-pdf01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[09:12:34] PROBLEM - Puppet staleness on deployment-cache-mobile03 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [43200.0]
[09:14:55] !lsal
[09:14:57] !sal
[09:14:57] https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[09:15:20] too many versions of myself
[09:16:52] !log reenabled puppet agent on deployment-cache-mobile03 . Was disabled with no reason given.
[09:17:01] pfff
[09:19:12] !log reenabled puppet agent on deployment-cache-mobile03 . Was disabled with no reason given.
[09:19:29] qa-morebots: ping
[09:19:32] qa-morebots: pok
[09:19:42] * hashar gives up
[09:21:01] I am a logbot running on tools-exec-11.
[09:21:01] Messages are logged to https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL.
[09:21:01] To log a message, type !log .
[09:21:01] I am a logbot running on tools-exec-11.
[09:21:01] Messages are logged to https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL.
[09:21:02] To log a message, type !log .
[09:21:47] Continuous-Integration, Browser-Tests: language screenshot job for Persian (fa) seems to run correctly, but marked as failure - https://phabricator.wikimedia.org/T93742#1152461 (zeljkofilipin) p:Triage>Normal
[09:25:08] Continuous-Integration, Release-Engineering, Browser-Tests: Move browser test alerts to responsible teams' channels from -releng - https://phabricator.wikimedia.org/T89375#1152471 (zeljkofilipin)
[09:26:02] zeljkof: hallo
[09:26:10] aharoni: hi
[09:26:37] I ran all the languages last night
[09:26:42] still waiting for it to end :)
[09:26:52] aharoni: :)
[09:32:31] RECOVERY - Puppet staleness on deployment-cache-mobile03 is OK: OK: Less than 1.00% above the threshold [3600.0]
[09:36:57] Staging, Patch-For-Review, releng-201415-Q3: [Quarterly Success Metric] Stable uptime metrics of the Staging cluster - https://phabricator.wikimedia.org/T88705#1152476 (mmodell) https://graphite.wmflabs.org/render?from=-2days&until=now&width=500&height=350&target=deployment-prep.deployment-cache-text02...
[10:06:14] (CR) Hashar: Package python deps with dh-virtualenv (1 comment) [integration/zuul] (debian/precise-wikimedia) - https://gerrit.wikimedia.org/r/195272 (https://phabricator.wikimedia.org/T48552) (owner: Hashar)
[10:06:16] (PS12) Hashar: Package python deps with dh-virtualenv [integration/zuul] (debian/precise-wikimedia) - https://gerrit.wikimedia.org/r/195272 (https://phabricator.wikimedia.org/T48552)
[10:10:26] Continuous-Integration, Browser-Tests: language screenshot job for Persian (fa) seems to run correctly, but marked as failure - https://phabricator.wikimedia.org/T93742#1152542 (zeljkofilipin) Looks like this is the problem: https://integration.wikimedia.org/ci/view/BrowserTests/view/VisualEditor/job/bro...
[10:14:04] (PS2) Hashar: Forward port precise dh-virtualenv to trusty [integration/zuul] (debian/trusty-wikimedia) - https://gerrit.wikimedia.org/r/197329 (https://phabricator.wikimedia.org/T48552)
[10:21:18] (PS3) Hashar: Forward port precise dh-virtualenv to trusty [integration/zuul] (debian/trusty-wikimedia) - https://gerrit.wikimedia.org/r/197329 (https://phabricator.wikimedia.org/T48552)
[10:31:56] (CR) Filippo Giunchedi: "couple of comments here and there, looks good overall" (11 comments) [integration/zuul] (debian/precise-wikimedia) - https://gerrit.wikimedia.org/r/195272 (https://phabricator.wikimedia.org/T48552) (owner: Hashar)
[10:50:17] Continuous-Integration, Browser-Tests: language screenshot job for Persian (fa) seems to run correctly, but marked as failure - https://phabricator.wikimedia.org/T93742#1152618 (zeljkofilipin) ``` $ bundle exec upload Uploading ./screenshots/VisualEditor_Formula_Insert_Menu-fa.png /Library/Ruby/Gems/2.0.0...
[10:51:16] Continuous-Integration, Browser-Tests: language screenshot job for Persian (fa) seems to run correctly, but marked as failure - https://phabricator.wikimedia.org/T93742#1152623 (zeljkofilipin) Looks like somebody redirected VisualEditor_Formula_Insert_Menu-fa.png to VisualEditor_Insert_Menu-fa https://co...
[11:00:20] PROBLEM - Puppet failure on deployment-memc02 is CRITICAL: CRITICAL: 42.86% of data above the critical threshold [0.0]
[11:00:30] PROBLEM - Puppet failure on deployment-rsync01 is CRITICAL: CRITICAL: 37.50% of data above the critical threshold [0.0]
[11:00:30] PROBLEM - Puppet failure on deployment-jobrunner01 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[11:01:10] PROBLEM - Puppet failure on deployment-elastic07 is CRITICAL: CRITICAL: 42.86% of data above the critical threshold [0.0]
[11:01:12] PROBLEM - Puppet failure on deployment-test is CRITICAL: CRITICAL: 62.50% of data above the critical threshold [0.0]
[11:01:14] PROBLEM - Puppet failure on deployment-upload is CRITICAL: CRITICAL: 42.86% of data above the critical threshold [0.0]
[11:01:34] PROBLEM - Puppet failure on deployment-sentry2 is CRITICAL: CRITICAL: 62.50% of data above the critical threshold [0.0]
[11:02:30] PROBLEM - Puppet failure on deployment-zookeeper01 is CRITICAL: CRITICAL: 62.50% of data above the critical threshold [0.0]
[11:02:34] PROBLEM - Puppet failure on deployment-bastion is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[11:03:18] PROBLEM - Puppet failure on deployment-memc04 is CRITICAL: CRITICAL: 85.71% of data above the critical threshold [0.0]
[11:06:02] PROBLEM - Puppet failure on deployment-mediawiki01 is CRITICAL: CRITICAL: 77.78% of data above the critical threshold [0.0]
[11:12:31] RECOVERY - Puppet failure on deployment-zookeeper01 is OK: OK: Less than 1.00% above the threshold [0.0]
[11:15:32] RECOVERY - Puppet failure on deployment-jobrunner01 is OK: OK: Less than 1.00% above the threshold [0.0]
[11:16:10] RECOVERY - Puppet failure on deployment-elastic07 is OK: OK: Less than 1.00% above the threshold [0.0]
[11:16:10] RECOVERY - Puppet failure on deployment-mediawiki01 is OK: OK: Less than 1.00% above the threshold [0.0]
[11:16:36] RECOVERY - Puppet failure on deployment-sentry2 is OK: OK: Less than 1.00% above the threshold [0.0]
[11:21:14] RECOVERY - Puppet failure on deployment-test is OK: OK: Less than 1.00% above the threshold [0.0]
[11:22:21] Continuous-Integration, Browser-Tests: language screenshot job for Persian (fa) seems to run correctly, but marked as failure - https://phabricator.wikimedia.org/T93742#1152651 (zeljkofilipin) a:zeljkofilipin>Amire80
[11:22:37] RECOVERY - Puppet failure on deployment-bastion is OK: OK: Less than 1.00% above the threshold [0.0]
[11:25:17] RECOVERY - Puppet failure on deployment-memc02 is OK: OK: Less than 1.00% above the threshold [0.0]
[11:25:29] RECOVERY - Puppet failure on deployment-rsync01 is OK: OK: Less than 1.00% above the threshold [0.0]
[11:28:25] RECOVERY - Puppet failure on deployment-memc04 is OK: OK: Less than 1.00% above the threshold [0.0]
[11:30:58] (PS13) Hashar: Package python deps with dh-virtualenv [integration/zuul] (debian/precise-wikimedia) - https://gerrit.wikimedia.org/r/195272 (https://phabricator.wikimedia.org/T48552)
[11:31:52] Yippee, build fixed!
[11:31:53] Project browsertests-VisualEditor-language-screenshot-os_x_10.10-firefox » om,contintLabsSlave && UbuntuTrusty build #31: FIXED in 22 min: https://integration.wikimedia.org/ci/job/browsertests-VisualEditor-language-screenshot-os_x_10.10-firefox/LANGUAGE_SCREENSHOT_CODE=om,label=contintLabsSlave%20&&%20UbuntuTrusty/31/
[11:32:43] (CR) Hashar: "PS11 addresses issues reported with a trick to stop embedding the python2.7 binary :)" (10 comments) [integration/zuul] (debian/precise-wikimedia) - https://gerrit.wikimedia.org/r/195272 (https://phabricator.wikimedia.org/T48552) (owner: Hashar)
[11:35:28] Continuous-Integration, operations: Get python-gear 0.5.5 to trusty-wikimedia and jessie-wikimedia - https://phabricator.wikimedia.org/T92684#1152660 (hashar)
[11:37:22] Continuous-Integration, Regression, Upstream: Manually starting builds in Jenkins throws "java.lang.IndexOutOfBoundsException: Index: 0, Size: 0" - https://phabricator.wikimedia.org/T93321#1152664 (hashar) p:Normal>Low
[11:37:52] Continuous-Integration, Browser-Tests: language screenshot job for Persian (fa) seems to run correctly, but marked as failure - https://phabricator.wikimedia.org/T93742#1152670 (zeljkofilipin) @amire80: I have assigned the task to you, since we have found the problem. Could you please assign it to someone...
[11:39:11] Continuous-Integration, Patch-For-Review: Have jenkins jobs logrotate their build history - https://phabricator.wikimedia.org/T91396#1152677 (hashar)
[11:39:12] Continuous-Integration: Write a JJB config tests to ensure all jobs logrotate their build history - https://phabricator.wikimedia.org/T93189#1152675 (hashar) Open>declined Per Timo comment on T91396. We have logrotate in the JJB global default.
[11:41:13] Continuous-Integration, Patch-For-Review: Have jenkins jobs logrotate their build history - https://phabricator.wikimedia.org/T91396#1152681 (hashar) You are right, the logrotate in the global defaults and in the other one as well. So a meta test would be overkill probably. I am keeping this bug around c...
[11:44:37] Deployment-Systems, Release-Engineering, operations, Patch-For-Review: /usr/local/bin/deploy2graphite broken on tin due to nc command syntax - https://phabricator.wikimedia.org/T1387#1152683 (fgiunchedi) thanks Bryan for looking into this! I've improved the script at https://gerrit.wikimedia.org/r/1...
[11:53:00] Yippee, build fixed!
[11:53:01] Project browsertests-VisualEditor-language-screenshot-os_x_10.10-firefox » fa,contintLabsSlave && UbuntuTrusty build #31: FIXED in 43 min: https://integration.wikimedia.org/ci/job/browsertests-VisualEditor-language-screenshot-os_x_10.10-firefox/LANGUAGE_SCREENSHOT_CODE=fa,label=contintLabsSlave%20&&%20UbuntuTrusty/31/
[11:54:25] Release-Engineering, Wikimania-Hackathon-2015, Wikimedia-Hackathon-2015, Browser-Tests, I18n: Load i18n messages from MediaWiki to browser tests - https://phabricator.wikimedia.org/T90577#1152695 (zeljkofilipin) p:Triage>Normal
[11:54:31] Release-Engineering, Wikimania-Hackathon-2015, Wikimedia-Hackathon-2015, Browser-Tests: Create pool of user accounts on beta cluster for browser test builds in Jenkins - https://phabricator.wikimedia.org/T90964#1152696 (zeljkofilipin) p:Triage>Normal
[11:54:51] Release-Engineering, Browser-Tests: Investigate updating browser versions in Jenkins builds. - https://phabricator.wikimedia.org/T92005#1152697 (zeljkofilipin) p:Triage>Normal
[11:56:55] Release-Engineering, Flow, Browser-Tests: Investigate updating browser versions in Jenkins builds. - https://phabricator.wikimedia.org/T92005#1152699 (zeljkofilipin)
[11:57:31] Continuous-Integration, VisualEditor, Browser-Tests: language screenshot job for Persian (fa) seems to run correctly, but marked as failure - https://phabricator.wikimedia.org/T93742#1152700 (zeljkofilipin)
[11:58:12] Continuous-Integration, Flow, Browser-Tests, Easy: send Flow browser test job notices to #wikimedia-corefeatures channel - https://phabricator.wikimedia.org/T66103#1152701 (zeljkofilipin)
[11:59:57] Continuous-Integration, Release-Engineering, Wikidata, Browser-Tests, Patch-For-Review: browsertest jobs should not be allowed to run for 10 hours - https://phabricator.wikimedia.org/T92275#1152702 (zeljkofilipin)
[12:03:01] (CR) Filippo Giunchedi: Package python deps with dh-virtualenv (3 comments) [integration/zuul] (debian/precise-wikimedia) - https://gerrit.wikimedia.org/r/195272 (https://phabricator.wikimedia.org/T48552) (owner: Hashar)
[12:11:35] Continuous-Integration, Browser-Tests: Have wmf-selenium-bot use color to make the reading of the scrollback easier - https://phabricator.wikimedia.org/T64573#1152733 (zeljkofilipin)
[12:13:02] Continuous-Integration, Browser-Tests: Cucumber linter should run for all repositories that contain Cucumber code - https://phabricator.wikimedia.org/T58251#1152736 (zeljkofilipin)
[12:13:17] Continuous-Integration, Browser-Tests: Passed Jenkins jobs should have links to Sauce Labs jobs - https://phabricator.wikimedia.org/T48890#1152737 (zeljkofilipin)
[12:14:18] Release-Engineering, Echo, Flow, Mobile-Web, Browser-Tests: Move user_agent string manipulation to the Ruby gem - https://phabricator.wikimedia.org/T678#1152739 (zeljkofilipin)
[12:27:04] (CR) Hashar: Package python deps with dh-virtualenv (3 comments) [integration/zuul] (debian/precise-wikimedia) - https://gerrit.wikimedia.org/r/195272 (https://phabricator.wikimedia.org/T48552) (owner: Hashar)
[12:27:21] (PS14) Hashar: Package python deps with dh-virtualenv [integration/zuul] (debian/precise-wikimedia) - https://gerrit.wikimedia.org/r/195272 (https://phabricator.wikimedia.org/T48552)
[12:29:01] Release-Engineering, Browser-Tests: mediawiki_selenium 0.4.0 does not timeout on sauce_api call - https://phabricator.wikimedia.org/T88221#1152765 (zeljkofilipin)
[12:30:40] (CR) Hashar: "* add sane defaults in the init script and no more die when the default file is missing. Puppet patch https://gerrit.wikimedia.org/r/19986" [integration/zuul] (debian/precise-wikimedia) - https://gerrit.wikimedia.org/r/195272 (https://phabricator.wikimedia.org/T48552) (owner: Hashar)
[12:32:31] Release-Engineering, Staging, Browser-Tests: Run browser test suite against staging cluster - https://phabricator.wikimedia.org/T806#1152767 (zeljkofilipin)
[12:32:56] PROBLEM - Content Translation Server on deployment-cxserver03 is CRITICAL: Connection refused
[12:33:40] Release-Engineering, Engineering-Community, Team-Practices, Wikimedia-Hackathon-2015, ECT-March-2015: RelEng team offsite - May 2015 - Pre Wikimedia Hackathon - https://phabricator.wikimedia.org/T89036#1152770 (zeljkofilipin)
[12:34:28] Continuous-Integration, Release-Engineering: Learn how Zuul works - https://phabricator.wikimedia.org/T1367#1152774 (zeljkofilipin)
[12:34:48] Continuous-Integration, Release-Engineering, Jenkins: Send beta cluster Jenkins alerts to betacluster-alert list - https://phabricator.wikimedia.org/T1125#1152776 (zeljkofilipin)
[12:35:03] Deployment-Systems, Release-Engineering, Puppet: Puppet failure on deployment-sentry2 - https://phabricator.wikimedia.org/T78411#1152777 (zeljkofilipin)
[12:35:22] Release-Engineering, MediaWiki-Vagrant: investigate use containers/docker for local test envs - https://phabricator.wikimedia.org/T65956#1152779 (zeljkofilipin)
[12:35:43] Release-Engineering, translatewiki.net, Browser-Tests: When translating a string at translatewiki.net, there should be a screenshot of the page where the string is used - https://phabricator.wikimedia.org/T1366#1152780 (zeljkofilipin)
[12:35:57] Release-Engineering, MediaWiki-Vagrant: Percentage of WMF production deployed extensions available in Vagrant - https://phabricator.wikimedia.org/T431#1152781 (zeljkofilipin)
[12:36:10] Continuous-Integration, Release-Engineering, Browser-Tests, Jenkins: Jenkins: browser test host performance issue for timed builds - https://phabricator.wikimedia.org/T68449#1152782 (zeljkofilipin)
[12:37:55] RECOVERY - Content Translation Server on deployment-cxserver03 is OK: HTTP OK: HTTP/1.1 200 OK - 1103 bytes in 0.022 second response time
[13:01:29] Continuous-Integration, VisualEditor, Browser-Tests: language screenshot job for Persian (fa) seems to run correctly, but marked as failure - https://phabricator.wikimedia.org/T93742#1152831 (Amire80) Open>Resolved One of the files was a redirect to another file. This confused the uploading API,...
[13:04:51] What's up Jenkins?
[13:04:56] https://integration.wikimedia.org/zuul/ stuck?
[13:05:04] hashar: ^^
[13:06:12] Where can I see the full log, https://integration.wikimedia.org/ci/job/mwext-ContentTranslation-jslint/2733/
[13:15:27] about to leave
[13:15:29] kart_: looking
[13:15:47] seems the job yamllint is stuck
[13:17:02] yeah
[13:17:07] there is nothing able to run yamllint :)
[13:17:32] !log Changes blocked because there is nothing able to run yamllint ( zuul-gearman.py status|grep build:yamllint , shows 8 jobs pending and no worker available)
[13:17:37] Logged the message, Master
[13:24:04] (PS1) Hashar: Make yamllint job runnable again [integration/config] - https://gerrit.wikimedia.org/r/199876
[13:24:51] (CR) Hashar: "That apparently prevented the job from running." [integration/config] - https://gerrit.wikimedia.org/r/199832 (owner: Legoktm)
[13:25:11] (CR) Hashar: [C: 2] Make yamllint job runnable again [integration/config] - https://gerrit.wikimedia.org/r/199876 (owner: Hashar)
[13:25:24] !log yamllint job fixed by altering the label https://gerrit.wikimedia.org/r/#/c/199876/
[13:25:28] Logged the message, Master
[13:25:29] kart_: fixed :)
[13:25:36] kart_: we will get rid of that yamllint job eventually
[13:26:28] kart_: in favor of having devs validate yaml via their own test / yaml implementation
[13:26:36] off for a few
[13:29:52] (Merged) jenkins-bot: Make yamllint job runnable again [integration/config] - https://gerrit.wikimedia.org/r/199876 (owner: Hashar)
[13:41:56] Beta-Cluster, Wikimedia-Labs-Infrastructure, Tracking: Log files on labs instance fill up disk (/var is only 2GB) (tracking) - https://phabricator.wikimedia.org/T71601#1152917 (coren) This is "fixed" only insofar as there is now more /var to fill before things break; but that provides no guard against...
[13:43:54] Beta-Cluster, Labs, operations: Core dumps fill up /var on labs instances - https://phabricator.wikimedia.org/T1259#1152920 (coren) a:coren>None The new partitioning scheme has more room in /var for stray core dumps; though this does not address the necessity of cleaning/collecting them as apropr...
[13:45:25] hasharAway: this is strange
[13:45:25] https://gerrit.wikimedia.org/r/#/c/199877/
[13:45:33] Change has been successfully merged into the git repository.
[13:45:39] but then: Post-merge build failed.
[13:48:46] Continuous-Integration: Pool new integration-slave14xx instances and delete old ones - https://phabricator.wikimedia.org/T91524#1152938 (coren)
[13:48:47] Continuous-Integration, Labs, Wikimedia-Labs-Infrastructure, operations: dnsmasq returns SERVFAIL for (some?) names that do not exist instead of NXDOMAIN - https://phabricator.wikimedia.org/T92351#1152934 (coren) Open>Resolved This has been worked around in beta, and the new DNS server (see T...
[13:51:15] hasharAway: thanks!
[14:13:23] Hi zeljkof! How's it going?
[14:13:31] AndyRussG: busy busy :)
[14:13:39] I understand :)
[14:14:40] AndyRussG: what's up?
[14:14:47] zeljkof: Just wondering if u saw this: https://integration.wikimedia.org/ci/view/BrowserTests/view/CentralNotice/job/browsertests-CentralNotice-en.m.wikipedia.beta.wmflabs.org-os_x_10.10-iphone-sauce/
[14:15:11] It's not that urgent if you have other priorities on your plate!
[14:15:19] AndyRussG: no, I do not usually have the time to look at failed jobs
[14:15:44] AndyRussG: please create a phab task
[14:15:50] so this does not get forgotten
[14:15:58] were you able to reproduce the problem on your machine?
[14:16:40] zeljkof: ahh... It's just the iphone on sauce, the same test is working on other platforms
[14:17:00] AndyRussG: ok, so the test is fine then
[14:17:06] but there is a problem with iphone
[14:17:06] I'll make a Phab task for sure
[14:17:19] do you know how to debug this?
[14:17:21] Yeah! I couldn't see what it is by looking at the log
[14:17:23] No I don't
[14:17:39] But I'm happy to if u give me some pointers
[14:17:44] AndyRussG: I would recommend a pairing session then! :)
[14:18:05] AndyRussG: could you create a meeting early in your day say tomorrow?
[14:18:19] zeljkof: sure you bet
[14:18:36] AndyRussG: great, see you tomorrow then :)
[14:18:44] zeljkof: thanks much!
[14:18:51] AndyRussG: no problem
[14:24:24] zeljkof: done! pls feel free to move the event around or modify it as need be of course
[14:25:01] zeljkof: that's it, no more screenshot patches for today :)
[14:25:04] AndyRussG: the time is perfect
[14:25:23] aharoni: I think I have +2d them all, let me check :)
[14:25:23] [ except the two already in Gerrit ]
[14:25:50] zeljkof: cool!
[14:26:35] aharoni: +2d all the commits! :)
[14:28:41] elmo delmo!
[14:28:43] thanks.
[14:37:35] aharoni: elmo delmo?
[14:38:40] aharoni: this? https://www.youtube.com/watch?v=tvvCY69oqjo
[14:42:17] dah. I forgot. What is the path for extensions in beta labs?
[14:42:23] ie location
[14:47:22] Project browsertests-Wikidata-SmokeTests-linux-firefox-sauce build #198: FAILURE in 30 min: https://integration.wikimedia.org/ci/job/browsertests-Wikidata-SmokeTests-linux-firefox-sauce/198/
[14:48:40] Krenair: 13 hour late response: tin:/srv/mediawiki-staging/silver.dblist must have been created when I ran refresh-dblist when deploying https://gerrit.wikimedia.org/r/#/c/171219/
[14:49:09] ^ what do I do when that happens? twentyafterfour ^d ?
[14:49:38] we could just commit it
[14:49:46] I think we might already have some stuff that relies on it?
[14:50:25] <^d> too many f'ing dblists
[14:50:34] I don't know
[14:50:37] or, maybe not
[14:50:58] Krenair: I _did_ deploy it a few weeks ago, but it wasn't any explicit part of that patch, it was just in all.dblist
[14:54:06] Release-Engineering, Browser-Tests: Things to do after Chris leaves - https://phabricator.wikimedia.org/T94032#1153118 (zeljkofilipin) NEW a:zeljkofilipin
[14:54:41] <^d> thcipriani: What about silver.dblist btw?
[14:55:05] er, sorry, in wmf-config/db-eqiad.php. ^d it's an uncommitted file on tin that I generated during swat.
[14:55:27] * ^d ssh's
[14:56:18] <^d> Ah, I see. I'd either commit it or nuke it :)
[14:56:29] <^d> If something's using it by now, I guess it'll be the former.
[14:56:36] <^d> Again: TOO MANY DBLISTS!!
[14:58:29] ^d: what's the problem with dblists?
[14:58:50] <^d> Too many to keep track of when things move about :)
[14:59:04] <^d> We should finish cleaning them up into a single directory though
[14:59:08] <^d> Stop dumping them in the root
[14:59:22] ori had a commit for it
[14:59:35] <^d> Yeah but we can't commit it all at once
[14:59:46] <^d> Some of the older dblists (all.dblist, etc) will need symlinks for a while
[14:59:51] <^d> Because puppet shit expects them there
[15:00:05] k, I'll create a patch and +2 it here after today's SWAT.
[15:00:50] https://gerrit.wikimedia.org/r/175007
[15:16:36] Release-Engineering, Browser-Tests: Things to do after Chris leaves - https://phabricator.wikimedia.org/T94032#1153195 (zeljkofilipin)
[15:27:22] Continuous-Integration, Release-Engineering, Browser-Tests: It takes about 20 seconds just to start a Sauce Labs browser - https://phabricator.wikimedia.org/T92613#1153268 (zeljkofilipin) I have pushed a few simple files that I have used to debug to GitHub: https://github.com/zeljkofilipin/page-object-...
[15:28:49] Continuous-Integration, Release-Engineering, Wikidata, Browser-Tests, Patch-For-Review: browsertest jobs should not be allowed to run for 10 hours - https://phabricator.wikimedia.org/T92275#1153284 (zeljkofilipin) Oops, sorry about that. I am looking at [[ https://integration.wikimedia.org/ci/vi...
[15:35:20] (PS1) Zfilipin: Abort browsertests* jobs if they do not complete in 4 hours [integration/config] - https://gerrit.wikimedia.org/r/199919 (https://phabricator.wikimedia.org/T92275)
[15:39:46] (CR) Zfilipin: "I have deployed browsertests-Wikidata-WikidataTests-linux-firefox-sauce job:" [integration/config] - https://gerrit.wikimedia.org/r/199919 (https://phabricator.wikimedia.org/T92275) (owner: Zfilipin)
[15:41:46] hasharAway: https://github.com/osyo-manga/vim-monster
[15:42:02] zeljkof: nice one :)
[15:54:24] Release-Engineering, MediaWiki-Maintenance-scripts, MediaWiki-Redirects, Patch-For-Review: namespaceDupes not handling deleted namespace redirects as desired - https://phabricator.wikimedia.org/T91401#1153383 (greg) a:demon
[16:00:04] Continuous-Integration, Release-Engineering, Wikidata, Browser-Tests, Patch-For-Review: browsertest jobs should not be allowed to run for 10 hours - https://phabricator.wikimedia.org/T92275#1153398 (zeljkofilipin) Open>Resolved
[16:22:34] hashar: https://git.wikimedia.org/tree/mediawiki%2Fextensions%2FContentTranslation.git/3f039e0f341a1fca8b00be4f2f51ba702c9e1454
[16:22:39] in beta.
[16:22:46] ie not updated since today.
[16:22:49] Known issue?
[16:22:58] greg-g: ^^
[16:28:50] (PS1) Hashar: Stop throttling SauceLabs jobs [integration/config] - https://gerrit.wikimedia.org/r/199932
[16:29:06] zeljkof: if you are still around, I got a change for tomorrow 1/1 :) https://gerrit.wikimedia.org/r/199932
[16:29:32] (CR) Hashar: "That is for Antoine/Zeljkof Friday pairing session." [integration/config] - https://gerrit.wikimedia.org/r/199932 (owner: Hashar)
[16:30:02] kart_: bah the beta update job is broken :(
[16:30:16] need to restart Jenkins
[16:30:20] but I must leave right now sorry :(
[16:30:23] kids!
[16:30:38] !log deadlock on deployment-bastion slave. Someone needs to restart Jenkins :(
[16:30:44] Logged the message, Master
[16:31:51] marxarelli: sorry, I was keeping zeljko long :)
[16:32:41] !log I'll start going through the checklist at https://www.mediawiki.org/wiki/Continuous_integration/Jenkins#Hung_beta_code.2Fdb_update
[16:32:45] Logged the message, Master
[16:33:19] (CR) jenkins-bot: [V: -1] Stop throttling SauceLabs jobs [integration/config] - https://gerrit.wikimedia.org/r/199932 (owner: Hashar)
[16:37:34] !log did that checklist once, jobs still not executing, doing again
[16:37:39] Logged the message, Master
[16:38:12] !log same, nothing executing
[16:38:16] Logged the message, Master
[16:38:24] !log nothing executing on deployment-bastion, that is
[16:38:28] Logged the message, Master
[16:39:08] !log going to do a safe-restart of Jenkins https://www.mediawiki.org/wiki/Continuous_integration/Jenkins#Restart_all_of_Jenkins
[16:39:13] Logged the message, Master
[16:39:45] now we just wait for those long ass zend "unit" tests to finish
[16:41:12] (PS1) Greg Grossmeier: Remove Chris from email alerts [integration/config] - https://gerrit.wikimedia.org/r/199934 (https://phabricator.wikimedia.org/T94032)
[16:42:42] Project browsertests-VisualEditor-language-screenshot-os_x_10.10-firefox » zh-hans,contintLabsSlave && UbuntuTrusty build #32: ABORTED in 1 hr 33 min: https://integration.wikimedia.org/ci/job/browsertests-VisualEditor-language-screenshot-os_x_10.10-firefox/LANGUAGE_SCREENSHOT_CODE=zh-hans,label=contintLabsSlave%20&&%20UbuntuTrusty/32/
[16:42:43] Project browsertests-VisualEditor-language-screenshot-os_x_10.10-firefox » it,contintLabsSlave && UbuntuTrusty build #32: ABORTED in 1 hr 33 min: https://integration.wikimedia.org/ci/job/browsertests-VisualEditor-language-screenshot-os_x_10.10-firefox/LANGUAGE_SCREENSHOT_CODE=it,label=contintLabsSlave%20&&%20UbuntuTrusty/32/
[16:43:07] ^^ was me, sorry
[16:46:45] Continuous-Integration, Scrum-of-Scrums, operations, Blocked-on-Operations: Jenkins is using php-luasandbox 1.9-1 for zend unit tests; precise should be upgraded to 2.0-8 or equivalent - https://phabricator.wikimedia.org/T88798#1153531 (Anomie)
[16:47:21] Continuous-Integration, Scrum-of-Scrums, operations, Blocked-on-Operations: Jenkins is using php-luasandbox 1.9-1 for zend unit tests; precise should be upgraded to 2.0-8 or equivalent - https://phabricator.wikimedia.org/T88798#1020616 (Anomie) (updating bug title since I see a newer version in [[ht...
[16:48:58] !log "Please wait while Jenkins is restarting..."
[16:49:02] Logged the message, Master
[16:52:58] !log Still.... "Please wait while Jenkins is restarting..."
[16:53:02] Logged the message, Master
[16:53:48] Jenkins startup is a pig
[16:53:49] Krinkle|detached: hashar I might have messed up
[16:54:20] It reads a ton of xml junk from disk to set everything up
[16:54:49] I don't know why it hasn't moved to a more sane and performant config system
[16:55:01] * bd808 works on MW, heh
[16:55:04] Release-Engineering, Browser-Tests, Patch-For-Review: Things to do after Chris leaves - https://phabricator.wikimedia.org/T94032#1153559 (greg)
[16:55:22] bd808: :) :)
[16:56:42] hmmm...
[16:56:56] not sure if it is having trouble stopping or starting
[16:57:01] :(
[16:57:23] gah, sorry about the yamllint thing
[16:57:38] I think it is not stopping
[16:57:55] the /var/log/jenkins/jenkins.log is getting gearman things written into it
[16:58:13] did you follow the instructions on https://wikitech.wikimedia.org/wiki/Jenkins#Restart_Jenkins ?
[16:58:18] INFO: Added task PRE_SLEEP to taskAwaiting list. List size = 1( Event was PRE_SLEEP)
[16:58:26] you have to manually kill it
[16:58:47] greg-g did the web restart thing
[16:59:10] yeah
[16:59:24] legoktm: where/what?
[16:59:29] how?
[16:59:42] just kill it from the cli I think
[16:59:45] I don't have root sudo on gallium
[17:00:12] me neither
[17:00:47] marktraceur: oh mark....
[17:01:00] https://github.com/wikimedia/operations-puppet/blob/production/modules/admin/data/data.yaml#L138 just hashar?
[17:01:03] marktraceur: you have sudo rights on gallium, can you help us unstick Jenkins?
[17:01:42] no, that's full root
[17:02:03] the section above is sudo with only certain access, which says you and bd808 have it
[17:02:42] right, isn't mark in the same group?
[17:02:56] Oh. I can try the init script
[17:03:09] the wiki page says it doesn't work for stopping though
[17:03:23] kill -9 the pid?
[17:04:12] It's owned by "jenkins" so that requires root
[17:04:19] I'll try the init script
[17:04:43] * Stopping Jenkins Continuous Integration Server jenkins
[17:04:47] ...
[17:04:53] ...
[17:05:31] greg-g: script returned but pid didn't change
[17:05:37] so I think it needs a root kill -9
[17:05:45] well, now I get a 503 at least
[17:08:32] !log 0:07 < robh> kill -9 and restarted per instructions
[17:08:37] Logged the message, Master
[17:09:44] greg-g: Yeah, for restarting jenkins one needs "only" contint-admin (sudo jenkins, not root). to kill and then init.d/jenkins start
[17:12:55] !log "Please wait while Jenkins is getting ready to work"
[17:13:00] Logged the message, Master
[17:13:36] jobs appear to be processing?
[17:13:38] https://integration.wikimedia.org/zuul/
[17:14:36] !log jobs appear to be processing according to zuul, the Jenkins UI just takes forever to load, apparently
[17:14:40] Logged the message, Master
[17:16:49] I notice a bunch of my config changes from last night are only just beginning to get post-merge deployment-prep syncs
[17:17:31] probably doesn't help that I did like 10 config changes
[17:19:37] (CR) jenkins-bot: [V: -1] Remove Chris from email alerts [integration/config] - https://gerrit.wikimedia.org/r/199934 (https://phabricator.wikimedia.org/T94032) (owner: Greg Grossmeier)
[17:38:56] (CR) Greg Grossmeier: "eh?" [integration/config] - https://gerrit.wikimedia.org/r/199934 (https://phabricator.wikimedia.org/T94032) (owner: Greg Grossmeier)
[17:42:05] (PS1) MaxSem: Add Gather [tools/release] - https://gerrit.wikimedia.org/r/199946
[17:43:13] Project browsertests-UploadWizard-commons.wikimedia.beta.wmflabs.org-linux-firefox-sauce-197975 build #16: ABORTED in 19 sec: https://integration.wikimedia.org/ci/job/browsertests-UploadWizard-commons.wikimedia.beta.wmflabs.org-linux-firefox-sauce-197975/16/
[17:45:00] zeljkof: it does execute that step twice!
[17:45:08] marxarelli: :)
[17:45:10] fun times
[17:45:18] why, oh why? :)
[17:45:20] i just reran it and continued on the first breakpoint
[17:50:44] Project beta-mediawiki-config-update-eqiad build #2184: FAILURE in 1.4 sec: https://integration.wikimedia.org/ci/job/beta-mediawiki-config-update-eqiad/2184/
[17:57:13] Beta-Cluster: error: unable to create file silver.dblist (Permission denied) - https://phabricator.wikimedia.org/T94054#1153787 (greg) NEW a:thcipriani
[17:57:41] thcipriani: ^ (though I guess since you're consistent in your names, you probably already got an IRC ping from that notification...)
[17:58:24] greg-g: yup, got it.
[17:58:36] * greg-g nods
[17:58:57] sorry if I was too quick to report the bug :)
[18:00:04] greg-g: heh, no, that's fine. I was just trying to look into what this was from the jenkins email.
[18:02:31] I think I ran into this issue before
[18:02:33] (CR) Krinkle: "The qunit template has evolved since. Required: Teardown must use mw-teardown-mysql instead of mw-teardown because 'prepare-mediawiki' (wh" [integration/config] - https://gerrit.wikimedia.org/r/180418 (https://phabricator.wikimedia.org/T86176) (owner: Adrian Lang)
[18:02:50] or some sort of permissions error in a beta update
[18:07:45] well, if zuul is the one running these jobs...doesn't look like that user is in the project-deployment-prep group...
[18:08:18] (CR) Krinkle: "I'm sure it'll come up, but beware of SauceLabs quotas. Having documentation about what kind of account(s) we have is imho mandatory for t" [integration/config] - https://gerrit.wikimedia.org/r/199932 (owner: Hashar)
[18:08:48] which, aside from mwdeploy is the only entity able to write to that srv/mediawiki-staging directory
[18:10:20] Project browsertests-UploadWizard-commons.wikimedia.beta.wmflabs.org-linux-firefox-sauce build #564: FAILURE in 27 min: https://integration.wikimedia.org/ci/job/browsertests-UploadWizard-commons.wikimedia.beta.wmflabs.org-linux-firefox-sauce/564/
[18:17:53] greg-g: Sorry I missed your ping
[18:18:29] I usually ignore pings from this channel, they tend to be browser test noise
[18:20:08] (CR) Mattflaschen: "> I don't think this is a good idea. Instead you can do something similar to CirrusSearch, which checks if $wgWikimediaJenkinsCI === true," [integration/jenkins] - https://gerrit.wikimedia.org/r/199818 (owner: Mattflaschen)
[18:20:53] Project beta-update-databases-eqiad build #8466: FAILURE in 53 sec: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/8466/
[18:20:59] marktraceur: fair
[18:26:43] I'm going to pack up here and go work from a coffee shop for the afternoon, bbiab
[18:32:56] Yippee, build fixed!
[18:32:56] Project browsertests-Core-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #562: FIXED in 13 min: https://integration.wikimedia.org/ci/job/browsertests-Core-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/562/
[19:02:56] bd808: would there be any scap problems on tin or deployment-prep with having mwdeploy user own /srv/mediawiki-staging?
[19:04:05] context: mwdeploy can't currently create a file in beta: https://integration.wikimedia.org/ci/view/Beta/job/beta-mediawiki-config-update-eqiad/2184/console
[19:04:31] thcipriani: No, but in prod (tin) it needs to be group writable by the mwdeploy group so that deployers can manage the git clones there
[19:04:53] Something weird happened in beta with the last round of puppet changes
[19:04:59] yeah, it'd stay group writable by "$deployment_group"
[19:05:17] bd808: yeah merging roles to work in 3 environments is hard :\
[19:05:19] The user/group permissions got all messed up
[19:05:47] I !logged a bunch of chmods I had to do yesterday
[19:06:02] the deployment::master role changed from being two rolls + ::common into being one role
[19:06:13] s/rolls/roles
[19:07:16] On tin the dir is g+S wikidev. With that it really doesn't matter who the owning user is
[19:11:34] bd808: yeah it's the same on deployment-prep, only problem is mwdeploy is not a part of the $deployment_group, which is the other option: add mwdeploy to that group
[19:12:05] The $deployment_group should be mwdeploy and then everything would work right
[19:12:13] That's what changed semi-randomly
[19:12:46] ah-ha, so this project-deployment-prep wasn't a thing before these changes?
[19:12:56] It wasn't used, no
[19:13:38] got it, ok...let me dig a bit and see if this is going to break anything else before I hiera. Thanks for your help!
[19:13:47] sure
[19:15:07] grumble. The !logs I did yesterday afternoon aren't in the SAL
[19:20:25] Yippee, build fixed!
[19:20:25] Project beta-update-databases-eqiad build #8467: FIXED in 24 sec: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/8467/
[19:22:49] !log Manually added missing !log entries from 2015-03-25 from my bouncer logs
[19:22:53] Logged the message, Master
[19:23:11] thcipriani: https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL#March_25 has the things I changed manually now
[19:24:21] (Abandoned) Mattflaschen: Set wgFlowContentFormat to wikitext until we have Parsoid on Jenkins [integration/jenkins] - https://gerrit.wikimedia.org/r/199818 (owner: Mattflaschen)
[19:25:07] bd808: got it, yeah, looks like this would be the root cause, digging through now https://gerrit.wikimedia.org/r/#/c/195340/ I think some sudoers stuff got conflated along the way...
[19:36:12] PROBLEM - Free space - all mounts on deployment-bastion is CRITICAL: CRITICAL: deployment-prep.deployment-bastion.diskspace._var.byte_percentfree.value (<12.50%)
[20:13:40] Release-Engineering: Create projects and tasks for RelEng 201415Q4 goals - https://phabricator.wikimedia.org/T94080#1154339 (greg) NEW a:greg
[20:14:13] (PS1) Dduvall: Fix double yield bug in `PageFactory#on` [selenium] - https://gerrit.wikimedia.org/r/199987 (https://phabricator.wikimedia.org/T94079)
[20:14:26] Release-Engineering: Create projects and tasks for RelEng 201415Q4 goals - https://phabricator.wikimedia.org/T94080#1154352 (greg) Used releng-201415-q3 last quarter, will continue with the theme... #releng-201415-q4
[20:17:43] (CR) Zfilipin: [C: 2] Fix double yield bug in `PageFactory#on` [selenium] - https://gerrit.wikimedia.org/r/199987 (https://phabricator.wikimedia.org/T94079) (owner: Dduvall)
[20:17:59] (Merged) jenkins-bot: Fix double yield bug in `PageFactory#on` [selenium] - https://gerrit.wikimedia.org/r/199987 (https://phabricator.wikimedia.org/T94079) (owner: Dduvall)
[20:18:15] marxarelli: feel free to release new version of mediawiki_selenium
[20:18:39] zeljkof: will do!
[20:19:25] man, i stared at that for a while before realizing "oh shit. why the hell am i calling tap there!?"
[20:19:45] (CR) Zfilipin: "recheck" [integration/config] - https://gerrit.wikimedia.org/r/199934 (https://phabricator.wikimedia.org/T94032) (owner: Greg Grossmeier)
[20:20:03] (CR) Zfilipin: "recheck" [integration/config] - https://gerrit.wikimedia.org/r/199932 (owner: Hashar)
[20:20:04] Continuous-Integration, Epic, releng-201415-Q3, releng-201415-q4: [Quarterly Success Metric] Jenkins: Run jobs in disposable VMs - https://phabricator.wikimedia.org/T47499#1154370 (greg)
[20:20:20] Release-Engineering, Staging, releng-201415-Q3, releng-201415-q4: [Quarterly Success Metric] Green nightly builds on the staging cluster (tracking) - https://phabricator.wikimedia.org/T88701#1154371 (greg)
[20:20:29] Release-Engineering, MW-1.25-release, releng-201415-Q3, releng-201415-q4: [Quarterly Success Metric] Release MediaWiki 1.25 - https://phabricator.wikimedia.org/T88709#1154372 (greg)
[20:20:33] zeljkof: we'll need to update the repos that use mediawiki_selenium 1.0
[20:20:46] marxarelli: but how nothing else failed?
[20:20:52] or we just did not notice
[20:21:05] I think only mediawiki/core is updated to 1.x
[20:21:39] there are only a couple of repos using it, and perhaps they don't use on(Page) with a block?
[20:27:12] Release-Engineering, Collaboration-Team, Editing, Engineering-Community, and 15 others: Create team projects for all teams participating in scrum of scrums - https://phabricator.wikimedia.org/T1211#1154422 (Yurik)
[20:27:31] (PS1) Dduvall: Releasing patch version 1.0.2 [selenium] - https://gerrit.wikimedia.org/r/199996
[20:27:42] zeljkof: ^
[20:28:00] Release-Engineering, Collaboration-Team, Editing, Engineering-Community, and 14 others: Create team projects for all teams participating in scrum of scrums - https://phabricator.wikimedia.org/T1211#20962 (Yurik)
[20:28:43] Beta-Cluster, Patch-For-Review: error: unable to create file silver.dblist (Permission denied) - https://phabricator.wikimedia.org/T94054#1154450 (thcipriani) When `role::deployment::deployment_server::{labs,production}` were combined there was an odd combination of two sudoers declarations: https://gith...
[20:29:59] marxarelli: https://phabricator.wikimedia.org/T94083
[20:30:10] zeljkof: thanks!
[20:30:45] (CR) Zfilipin: [C: 2] Releasing patch version 1.0.2 [selenium] - https://gerrit.wikimedia.org/r/199996 (owner: Dduvall)
[20:31:03] (Merged) jenkins-bot: Releasing patch version 1.0.2 [selenium] - https://gerrit.wikimedia.org/r/199996 (owner: Dduvall)
[20:31:18] marxarelli: ^
[20:32:28] zeljkof: wee! https://rubygems.org/gems/mediawiki_selenium/versions/1.0.2
[20:33:58] and with that, i'm off to lunch
[20:40:44] Continuous-Integration, Security: Upgrade Jenkins from v1.480.2 to v1.480.3 - https://phabricator.wikimedia.org/T47147#1154626 (csteipp)
[20:40:45] Continuous-Integration, Security: Jenkins run 'rake validate' on any ops/puppet change - https://phabricator.wikimedia.org/T44929#1154632 (csteipp)
[20:40:47] Continuous-Integration, Security: Jenkins security issue - https://phabricator.wikimedia.org/T45725#1154631 (csteipp)
[20:42:12] Yippee, build fixed!
[20:42:13] Project browsertests-MultimediaViewer-mediawiki.org-linux-firefox-sauce build #544: FIXED in 4 min 8 sec: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-mediawiki.org-linux-firefox-sauce/544/
[20:53:56] Project browsertests-Gather-en.m.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #56: FAILURE in 7 min 26 sec: https://integration.wikimedia.org/ci/job/browsertests-Gather-en.m.wikipedia.beta.wmflabs.org-linux-chrome-sauce/56/
[21:08:26] Project browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_7-internet_explorer-9-sauce build #389: FAILURE in 45 min: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_7-internet_explorer-9-sauce/389/
[21:15:24] what is the appropriate way to retrigger just a postmerge stask in jenkins?
[21:15:26] Project beta-scap-eqiad build #46451: FAILURE in 1 min 23 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/46451/
[21:15:35] s/stask/task
[21:16:23] I could just click 'rebuild' but I'm not sure if that would confuse zuul
[21:17:06] it won't confuse zuul, I don't believe
[21:17:19] not everything Jenkins does is known by zuul
[21:17:51] kk, here goes
[21:18:07] Yippee, build fixed!
[21:18:07] Project beta-mediawiki-config-update-eqiad build #2185: FIXED in 1.2 sec: https://integration.wikimedia.org/ci/job/beta-mediawiki-config-update-eqiad/2185/
[21:18:19] yay
[21:18:27] hot diggity.
[21:19:33] now just need to get that change merged to get the cherry pick out of deployment-salt...
[21:30:37] (CR) MaxSem: [C: 2] Add Gather [tools/release] - https://gerrit.wikimedia.org/r/199946 (owner: MaxSem)
[22:00:31] (Merged) jenkins-bot: Add Gather [tools/release] - https://gerrit.wikimedia.org/r/199946 (owner: MaxSem)
[22:05:53] Eh.. beta is 302 forcing https, but https is (for as long as I remember) broken for beta labs
[22:06:07] A rather strange header, too. Status Code:302 forced.302
[22:06:15] HTTP/1.1 302 forced.302
[22:06:25] http://en.wikipedia.beta.wmflabs.org/wiki/MediaWiki:Editnotice
[22:06:41] Hm.. doesn't affect when I'm logged out
[22:06:58] greg-g: ^
[22:07:22] Even plain hit on http://en.wikipedia.beta.wmflabs.org/ redirects the same way
[22:09:04] Cookie: forceHTTPS | 1 | .wikipedia.beta.wmflabs.org | 2015-04-22T05:37:16.029Z | Secure Yes
[22:09:14] Krinkle: it is? I can't reproduce
[22:09:20] I'm logged in
[22:09:29] Something set that cookie
[22:09:37] I've removed it and it's working now
[22:09:45] yeah... that happened before once....
[22:10:15] greg-g: It's doing it again when I log in
[22:10:15] I'm thinking of this: https://phabricator.wikimedia.org/T72145 which isn't the same thing
[22:10:23] lemme try logging out/in
[22:10:35] From Special:Login forwarded to https://login.wikimedia.beta.wmflabs.org/wiki/Special:CentralLogin/start?token=xxxxx
[22:10:48] still fine for me
[22:12:34] stupid effing ssl
[22:12:46] I still can't reproduce
[22:12:58] Krinkle: file a task, please?
[22:19:21] Krinkle: i also had a redirect there to www.en.wikipedia.beta.wmflabs
[22:19:23] greg-g: will do later, I'm already in 5th level mental 'else' branch from 'oh, that didn't work'
[22:19:31] before I lose track of what I was doing
[22:19:32] might be EU only ?
[22:19:36] Krinkle: :) :) sorry
[22:19:52] thedj: shouldn't be, wmflabs / beta cluster is eqiad only
[22:20:22] incl any caching and dns ?
[22:20:59] i don't know, something is wrong, i also noticed it over the past 2 days. but other ppl were unable to reproduce
[22:21:24] thedj: I'm 99% sure yes, caching and dns also
[23:00:39] (PS1) Mattflaschen: Revert "Temporarily remove Flow from mediawiki-extensions combo group" [integration/config] - https://gerrit.wikimedia.org/r/200062
[23:01:44] (PS2) Mattflaschen: Revert "Temporarily remove Flow from mediawiki-extensions combo group" [integration/config] - https://gerrit.wikimedia.org/r/200062
[23:06:26] (CR) jenkins-bot: [V: -1] Revert "Temporarily remove Flow from mediawiki-extensions combo group" [integration/config] - https://gerrit.wikimedia.org/r/200062 (owner: Mattflaschen)
[23:11:59] (PS1) Legoktm: LivingStyleGuide skin renamed to Blueprint [integration/config] - https://gerrit.wikimedia.org/r/200069
[23:26:07] Continuous-Integration: Pool new integration-slave14xx instances and delete old ones - https://phabricator.wikimedia.org/T91524#1155637 (scfc)
[23:26:08] Continuous-Integration, Labs, Wikimedia-Labs-Infrastructure, operations: dnsmasq returns SERVFAIL for (some?) names that do not exist instead of NXDOMAIN - https://phabricator.wikimedia.org/T92351#1155635 (scfc) Resolved>declined That may be, but this is certainly not //resolved//, and whethe...
[23:30:38] Continuous-Integration, Labs, Wikimedia-Labs-Infrastructure, operations: dnsmasq returns SERVFAIL for (some?) names that do not exist instead of NXDOMAIN - https://phabricator.wikimedia.org/T92351#1155653 (coren) That may still be an option, once he have at least //one// that actually works right....
[23:32:10] (CR) Legoktm: [C: 2] LivingStyleGuide skin renamed to Blueprint [integration/config] - https://gerrit.wikimedia.org/r/200069 (owner: Legoktm)
[23:46:34] (Merged) jenkins-bot: LivingStyleGuide skin renamed to Blueprint [integration/config] - https://gerrit.wikimedia.org/r/200069 (owner: Legoktm)
[23:47:05] legoktm: Do you know who removed/renamed the gerrit repo?
[23:47:12] I don't
[23:47:17] We should inform and mandate from that process to notify CI.
[23:47:53] !log deploying https://gerrit.wikimedia.org/r/200069
[23:47:58] Logged the message, Master
[23:47:59] * legoktm looks up who
[23:48:33] qchris https://phabricator.wikimedia.org/T93568
[23:48:35] I'll leave a comment
[23:52:04] Krinkle: https://phabricator.wikimedia.org/T93568#1155692
[23:54:37] legoktm: cool