[00:07:08] Continuous-Integration: Recreate integration-puppetmaster with new image (/var/ is too small) - https://phabricator.wikimedia.org/T87484#1072156 (Krinkle) p:Normal>High a:hashar
[00:12:57] Continuous-Integration: Recreate integration-puppetmaster with new image (/var/ is too small) - https://phabricator.wikimedia.org/T87484#1072162 (Krinkle) Ran into this error on local instances a bunch of times today. ``` Warning: Error 400 on SERVER: cannot generate tempfile `/var/lib/puppet/yaml/node/i-0...
[00:37:55] Continuous-Integration, Patch-For-Review, Puppet: Puppet is causing changed/added files in 'slave-scripts' git::clone on integration slaves in labs to become root read-only - https://phabricator.wikimedia.org/T87843#1072221 (Krinkle) Open>Resolved
[00:53:14] !log Finished provisioning of integration-slave12xx and slave14xx instances. Initial testing failed due to "/usr/local/bin/zuul-cloner: No such file or directory"
[00:53:18] Logged the message, Master
[00:56:25] Beta-Cluster: Beta labs are serving three-weeks-old content - https://phabricator.wikimedia.org/T90983#1072266 (matmarex) NEW
[00:56:48] ask me if i enjoyed finding that out ^
[00:59:15] Continuous-Integration: "/usr/local/bin/zuul-cloner: No such file or directory" on new instances - https://phabricator.wikimedia.org/T90984#1072280 (Krinkle) NEW a:hashar
[00:59:56] MatmaRex: Yeah, I noticed the same. It seems static debug mode responses are cached
[01:00:31] cached is a bit of an understatement
[01:00:51] MatmaRex: what about non-debug mode?
[01:01:02] it mostly works
[01:01:16] !log Keeping all integration-slave12xx and slave14xx instances depooled.
[01:01:18] but it's harder to visually tell whether something is out of date there
[01:01:20] Logged the message, Master
[01:01:43] and that file being stale is why we have two copies of content in VE in debug mode
[01:01:49] http://i.imgur.com/yTozcDn.png
[01:02:14] Yeah
[01:02:21] It's been that way for a few weeks
[01:02:24] debug mode being cached
[01:02:25] dunno
[01:02:28] beta snafu
[01:02:37] at least so I thought
[01:04:13] I'm out for the day.
[01:04:30] As I suspected. Re-creating the instances should be an easy 1-2 hour task, but instead it took all day and it's still not working.
[01:04:58] found and resolved 5 critical bugs, filed one more, and who knows what will fail once that one is fixed. They cascade so can't see what's behind this error.
[01:05:00] o/
[01:16:36] Continuous-Integration, Release-Engineering, MediaWiki-File-management, Multimedia: Parser tests intermittently failing on Zend due to unexpected thumbnail error - https://phabricator.wikimedia.org/T91016#1072481 (Krinkle)
[01:20:00] Beta-Cluster: Beta labs are serving three-weeks-old content - https://phabricator.wikimedia.org/T90983#1072554 (Krinkle) For the given url http://bits.beta.wmflabs.org/static-master/extensions/VisualEditor/modules/ve-mw/init/styles/ve.init.mw.ViewPageTarget.css ``` HTP 200 Accept-Ranges:bytes Access-Control-...
[01:20:40] Beta-Cluster: Beta labs bits should not cache static-master for three weeks - https://phabricator.wikimedia.org/T90983#1072555 (Krinkle)
[02:10:58] Continuous-Integration, Release-Engineering, MediaWiki-File-management, Multimedia: Parser tests intermittently failing on Zend due to unexpected thumbnail error - https://phabricator.wikimedia.org/T91016#1072734 (aaron) Are there FSFileBackend log entries?
[02:13:51] (PS4) Phoenix303: Sniff to detect unused global variables [tools/codesniffer] - https://gerrit.wikimedia.org/r/192248 (https://phabricator.wikimedia.org/T53279)
[02:28:40] PROBLEM - Host deployment-db1 is DOWN: PING CRITICAL - Packet loss = 100%
[02:29:13] PROBLEM - Host deployment-cxserver03 is DOWN: PING CRITICAL - Packet loss = 100%
[02:29:37] PROBLEM - Host deployment-sca01 is DOWN: PING CRITICAL - Packet loss = 100%
[02:29:51] PROBLEM - Host deployment-logstash1 is DOWN: PING CRITICAL - Packet loss = 100%
[02:31:58] PROBLEM - Host deployment-restbase02 is DOWN: PING CRITICAL - Packet loss = 100%
[02:32:08] PROBLEM - Host deployment-parsoid01-test is DOWN: PING CRITICAL - Packet loss = 100%
[02:32:44] PROBLEM - English Wikipedia Mobile Main page on beta-cluster is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 MediaWiki exception - 1872 bytes in 6.338 second response time
[02:34:08] PROBLEM - App Server Main HTTP Response on deployment-mediawiki02 is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 MediaWiki exception - 1611 bytes in 1.604 second response time
[02:36:03] PROBLEM - English Wikipedia Main page on beta-cluster is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 MediaWiki exception - 1867 bytes in 1.523 second response time
[02:36:57] PROBLEM - App Server Main HTTP Response on deployment-mediawiki01 is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 MediaWiki exception - 1611 bytes in 1.573 second response time
[02:48:39] PROBLEM - Host deployment-sca01 is DOWN: CRITICAL - Host Unreachable (10.68.17.54)
[03:00:59] RECOVERY - Host deployment-parsoid01-test is UP: PING OK - Packet loss = 0%, RTA = 0.91 ms
[03:01:07] RECOVERY - Host deployment-db1 is UP: PING OK - Packet loss = 0%, RTA = 0.86 ms
[03:01:07] RECOVERY - Host deployment-logstash1 is UP: PING OK - Packet loss = 0%, RTA = 0.79 ms
[03:03:02] RECOVERY - Host deployment-cxserver03 is UP: PING OK - Packet loss = 0%, RTA = 474.76 ms
[03:03:40] RECOVERY - Host deployment-sca01 is UP: PING OK - Packet loss = 0%, RTA = 0.94 ms
[03:05:46] RECOVERY - Host deployment-restbase02 is UP: PING OK - Packet loss = 0%, RTA = 0.84 ms
[03:06:44] Project browsertests-PageTriage-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #437: FAILURE in 43 sec: https://integration.wikimedia.org/ci/job/browsertests-PageTriage-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce/437/
[03:06:48] PROBLEM - Puppet failure on deployment-restbase02 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [0.0]
[03:07:40] PROBLEM - Puppet failure on deployment-sca01 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[03:16:43] RECOVERY - Puppet failure on deployment-restbase02 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:17:45] RECOVERY - Puppet failure on deployment-sca01 is OK: OK: Less than 1.00% above the threshold [0.0]
[03:19:41] Project browsertests-CentralNotice-en.wikipedia.beta.wmflabs.org-windows_7-internet_explorer-11-sauce build #146: FAILURE in 1 min 40 sec: https://integration.wikimedia.org/ci/job/browsertests-CentralNotice-en.wikipedia.beta.wmflabs.org-windows_7-internet_explorer-11-sauce/146/
[03:24:55] Project browsertests-CirrusSearch-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #476: FAILURE in 2 min 13 sec: https://integration.wikimedia.org/ci/job/browsertests-CirrusSearch-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/476/
[03:31:39] Project browsertests-Flow-en.wikipedia.beta.wmflabs.org-linux-chrome-monobook-sauce build #321: FAILURE in 6 min 43 sec: https://integration.wikimedia.org/ci/job/browsertests-Flow-en.wikipedia.beta.wmflabs.org-linux-chrome-monobook-sauce/321/
[03:35:15] Project browsertests-Math-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #428: FAILURE in 27 sec: https://integration.wikimedia.org/ci/job/browsertests-Math-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce/428/
[03:35:20] Project browsertests-Echo-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #393: FAILURE in 3 min 52 sec: https://integration.wikimedia.org/ci/job/browsertests-Echo-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/393/
[03:37:30] Project browsertests-CentralNotice-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #210: FAILURE in 2 min 14 sec: https://integration.wikimedia.org/ci/job/browsertests-CentralNotice-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/210/
[03:39:13] Project browsertests-WikiLove-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #468: FAILURE in 26 sec: https://integration.wikimedia.org/ci/job/browsertests-WikiLove-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/468/
[03:41:49] Project browsertests-PageTriage-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #433: FAILURE in 48 sec: https://integration.wikimedia.org/ci/job/browsertests-PageTriage-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/433/
[03:48:24] Project browsertests-CentralNotice-en.wikipedia.beta.wmflabs.org-os_x_10.9-safari-sauce build #146: FAILURE in 1 min 9 sec: https://integration.wikimedia.org/ci/job/browsertests-CentralNotice-en.wikipedia.beta.wmflabs.org-os_x_10.9-safari-sauce/146/
[03:51:31] Project browsertests-CentralNotice-en.wikipedia.beta.wmflabs.org-windows_7-internet_explorer-10-sauce build #146: FAILURE in 1 min 2 sec: https://integration.wikimedia.org/ci/job/browsertests-CentralNotice-en.wikipedia.beta.wmflabs.org-windows_7-internet_explorer-10-sauce/146/
[04:01:31] PROBLEM - Free space - all mounts on deployment-jobrunner01 is CRITICAL: CRITICAL: deployment-prep.deployment-jobrunner01.diskspace.root.byte_percentfree.value (<25.00%)
[04:04:03] Project browsertests-CentralNotice-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #211: FAILURE in 39 sec: https://integration.wikimedia.org/ci/job/browsertests-CentralNotice-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce/211/
[04:16:09] Project browsertests-CentralAuth-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #6: FAILURE in 1 min 18 sec: https://integration.wikimedia.org/ci/job/browsertests-CentralAuth-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/6/
[04:21:36] Yippee, build fixed!
[04:21:36] Project browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-os_x_10.9-safari-sauce build #493: FIXED in 1 hr 11 min: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-os_x_10.9-safari-sauce/493/
[04:23:51] Project browsertests-Math-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #436: FAILURE in 32 sec: https://integration.wikimedia.org/ci/job/browsertests-Math-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/436/
[04:24:41] Project browsertests-UniversalLanguageSelector-commons.wikimedia.beta.wmflabs.org-linux-firefox-sauce build #486: FAILURE in 3 min 5 sec: https://integration.wikimedia.org/ci/job/browsertests-UniversalLanguageSelector-commons.wikimedia.beta.wmflabs.org-linux-firefox-sauce/486/
[04:29:38] Yippee, build fixed!
[04:29:39] Project browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_8-internet_explorer-sauce build #491: FIXED in 35 min: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_8-internet_explorer-sauce/491/
[04:46:32] Project browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #492: FAILURE in 30 min: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce/492/
[04:46:58] Beta-Cluster: Beta cluster bits should not cache static-master for three weeks - https://phabricator.wikimedia.org/T90983#1072851 (greg)
[05:20:37] PROBLEM - Host deployment-sca01 is DOWN: PING CRITICAL - Packet loss = 100%
[05:20:50] PROBLEM - Host deployment-logstash1 is DOWN: PING CRITICAL - Packet loss = 100%
[05:21:13] PROBLEM - Host deployment-cxserver03 is DOWN: PING CRITICAL - Packet loss = 100%
[05:22:58] PROBLEM - Host deployment-restbase02 is DOWN: PING CRITICAL - Packet loss = 100%
[05:23:08] PROBLEM - Host deployment-parsoid01-test is DOWN: PING CRITICAL - Packet loss = 100%
[05:23:41] PROBLEM - Host deployment-db1 is DOWN: PING CRITICAL - Packet loss = 100%
[06:14:10] (PS4) Legoktm: Regularly run mediawiki-vendor-composer-security and notify on failures [integration/config] - https://gerrit.wikimedia.org/r/193057
[06:14:36] greg-g: I’ve not been able to do much betacluster work since coming back from vacation because labs has been on fire so often
[06:24:44] (CR) Legoktm: [C: 2] Regularly run mediawiki-vendor-composer-security and notify on failures [integration/config] - https://gerrit.wikimedia.org/r/193057 (owner: Legoktm)
[06:31:22] (Merged) jenkins-bot: Regularly run mediawiki-vendor-composer-security and notify on failures [integration/config] - https://gerrit.wikimedia.org/r/193057 (owner: Legoktm)
[06:35:36] PROBLEM - Puppet failure on deployment-sentry2 is CRITICAL: CRITICAL: 75.00% of data above the critical threshold [0.0]
[06:42:09] !log deployed https://gerrit.wikimedia.org/r/193057
[06:42:16] Logged the message, Master
[06:44:34] Deployment-Systems, Librarization, MediaWiki-Core-Team, Security, Patch-For-Review: Have a check for reported security issues in dependencies - https://phabricator.wikimedia.org/T74193#1072879 (Legoktm) Open>Resolved https://gerrit.wikimedia.org/r/193057 If there are any other repos commi...
[06:46:27] Project UploadWizard-api-commons.wikimedia.beta.wmflabs.org build #1546: FAILURE in 26 sec: https://integration.wikimedia.org/ci/job/UploadWizard-api-commons.wikimedia.beta.wmflabs.org/1546/
[06:46:33] RECOVERY - Free space - all mounts on deployment-jobrunner01 is OK: OK: All targets OK
[07:00:40] RECOVERY - Puppet failure on deployment-sentry2 is OK: OK: Less than 1.00% above the threshold [0.0]
[07:17:01] (PS1) Legoktm: Remove reference to removed oojs-ui-phpcs-HEAD job [integration/config] - https://gerrit.wikimedia.org/r/193336
[07:18:53] (CR) Legoktm: [C: 2] Remove reference to removed oojs-ui-phpcs-HEAD job [integration/config] - https://gerrit.wikimedia.org/r/193336 (owner: Legoktm)
[07:19:59] (Merged) jenkins-bot: Remove reference to removed oojs-ui-phpcs-HEAD job [integration/config] - https://gerrit.wikimedia.org/r/193336 (owner: Legoktm)
[07:50:57] Quality-Assurance, Wikimania-Hackathon-2015, Wikimedia-Hackathon-2015, Performance: Session: Help make Wikipedia fast without leaving your browser - https://phabricator.wikimedia.org/T90975#1072932 (Qgil)
[07:56:33] PROBLEM - Puppet failure on deployment-pdf01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[08:20:01] Continuous-Integration, Wikidata, § Wikidata-Sprint-2015-02-03, § Wikidata-Sprint-2015-02-25: fix the qunit tests for wikidata: mwext-Wikibase-qunit - https://phabricator.wikimedia.org/T74184#1072953 (adrianheine) The focusing tests fail. We should either try to detect if they could possibly pass...
[08:42:21] RECOVERY - Host deployment-sca01 is UP: PING OK - Packet loss = 0%, RTA = 0.78 ms
[08:43:02] RECOVERY - Host deployment-cxserver03 is UP: PING OK - Packet loss = 0%, RTA = 1.38 ms
[08:44:16] RECOVERY - Host deployment-logstash1 is UP: PING OK - Packet loss = 0%, RTA = 0.93 ms
[08:45:47] RECOVERY - Host deployment-restbase02 is UP: PING OK - Packet loss = 0%, RTA = 0.68 ms
[08:46:06] RECOVERY - Host deployment-db1 is UP: PING OK - Packet loss = 0%, RTA = 0.59 ms
[08:50:25] RECOVERY - Host deployment-parsoid01-test is UP: PING OK - Packet loss = 0%, RTA = 0.72 ms
[08:54:57] Continuous-Integration, Wikidata, § Wikidata-Sprint-2015-02-03, § Wikidata-Sprint-2015-02-25: fix the qunit tests for wikidata: mwext-Wikibase-qunit - https://phabricator.wikimedia.org/T74184#1072985 (Tobi_WMDE_SW) Regarding the focus tests, IIRC we talked about removing them quite some times. Th...
[09:07:17] PROBLEM - Puppet staleness on deployment-eventlogging02 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [43200.0]
[09:46:14] ah
[09:46:18] zeljkof: here I am at least
[09:46:24] hashar: welcome :)
[09:46:37] still have to fix up the colors etc though
[10:53:05] aharoni: looks like we found a real problem yesterday :) https://phabricator.wikimedia.org/T90858
[11:08:03] http://wikidata.beta.wmflabs.org/ is broken :(
[11:08:07] no database
[11:11:01] RECOVERY - English Wikipedia Main page on beta-cluster is OK: HTTP OK: HTTP/1.1 200 OK - 49573 bytes in 0.676 second response time
[11:11:57] RECOVERY - App Server Main HTTP Response on deployment-mediawiki01 is OK: HTTP OK: HTTP/1.1 200 OK - 49401 bytes in 0.511 second response time
[11:12:35] RECOVERY - English Wikipedia Mobile Main page on beta-cluster is OK: HTTP OK: HTTP/1.1 200 OK - 30286 bytes in 0.577 second response time
[11:13:48] Beta-Cluster: Crashed tables in deployment-db1 - https://phabricator.wikimedia.org/T91055#1073130 (yuvipanda) NEW
[11:14:07] RECOVERY - App Server Main HTTP Response on deployment-mediawiki02 is OK: HTTP OK: HTTP/1.1 200 OK - 49384 bytes in 0.534 second response time
[11:23:20] zeljkof: firefox hello works perfectly :)
[11:23:30] hashar: :)
[12:12:48] hashar: do you have a minute?
[12:12:59] is this builder used anywhere?
[12:13:00] https://github.com/wikimedia/integration-config/blob/master/jjb/macro.yaml#L106-L179
[12:13:03] or could it be deleted?
[12:15:06] zeljkof: I think it was used for the browser tests triggered when a patch is proposed in Gerrit
[12:15:23] hashar: but that is not done any more anywhere, right?
[12:15:23] there might be no such jobs anymore
[12:15:28] though elasticsearch had one iirc
[12:15:41] you can add to the macro some placeholder like FINDME
[12:15:51] hashar: I will test if it is used, if not I will delete it
[12:15:53] generate the XML files locally with jjb test -o output
[12:15:54] good idea
[12:16:01] then grep for xml files having the placeholder FINDME
[12:16:02] will do
[12:16:09] but just dropping the macro should be enough
[12:16:17] JJB should complain it can't find a 'browsertest' macro
[12:25:32] (PS1) Zfilipin: WIP cleanup [integration/config] - https://gerrit.wikimedia.org/r/193350
[12:28:40] (CR) jenkins-bot: [V: -1] WIP cleanup [integration/config] - https://gerrit.wikimedia.org/r/193350 (owner: Zfilipin)
[12:37:35] PROBLEM - SSH on deployment-lucid-salt is CRITICAL: Connection refused
[13:27:14] hashar: Hey :)
[13:43:20] Continuous-Integration, MediaWiki-ResourceLoader, MediaWiki-Vagrant, Wikidata, and 2 others: qunit test broken without explicitly setting $wgResourceLoaderMaxQueryLength - https://phabricator.wikimedia.org/T90453#1073327 (Krinkle) This change ([Ic416def846f361425c46f7b](https://gerrit.wikimedia.o...
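Editor's sketch of the placeholder check hashar describes at 12:15-12:16 above: after adding a marker such as FINDME to the macro and generating the job definitions into a directory (the log mentions `jjb test -o output`), scan the generated files for the marker. The function name and the directory layout are assumptions made for illustration; nothing here is taken from the integration/config repository.

```python
from pathlib import Path


def find_jobs_using_placeholder(output_dir, placeholder="FINDME"):
    """Return sorted relative paths of generated files that contain the placeholder.

    Hypothetical helper: output_dir is assumed to hold the generated job
    definitions, one file per job, possibly in subdirectories.
    """
    root = Path(output_dir)
    matches = []
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        # Read leniently so one oddly encoded file does not abort the scan.
        text = path.read_text(encoding="utf-8", errors="replace")
        if placeholder in text:
            matches.append(str(path.relative_to(root)))
    return sorted(matches)
```

An empty result would suggest that no generated job expands the macro, which matches the reasoning in the chat that the macro could then be dropped.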
[14:07:09] Continuous-Integration, MediaWiki-ResourceLoader, MediaWiki-Vagrant, Wikidata, and 2 others: qunit test broken without explicitly setting $wgResourceLoaderMaxQueryLength - https://phabricator.wikimedia.org/T90453#1073364 (JanZerebecki) Test run without that setting: https://integration.wikimedia....
[14:08:18] Yippee, build fixed!
[14:08:18] Project browsertests-CentralNotice-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #212: FIXED in 56 sec: https://integration.wikimedia.org/ci/job/browsertests-CentralNotice-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce/212/
[14:28:24] (PS1) Zfilipin: WIP Created the first Android centralNotice Jenkins job [integration/config] - https://gerrit.wikimedia.org/r/193361
[14:33:00] (CR) Zfilipin: "Tested the configuration here:" [integration/config] - https://gerrit.wikimedia.org/r/193361 (owner: Zfilipin)
[14:36:18] Project browsertests-MobileFrontend-SmokeTests-linux-chrome-sauce build #27: FAILURE in 8 min 17 sec: https://integration.wikimedia.org/ci/job/browsertests-MobileFrontend-SmokeTests-linux-chrome-sauce/27/
[15:00:52] (CR) Phuedx: [C: 1] "Agreed." [integration/config] - https://gerrit.wikimedia.org/r/191046 (https://phabricator.wikimedia.org/T74794) (owner: Hashar)
[15:36:48] Continuous-Integration, VisualEditor, § VisualEditor Q3 Blockers: Concurrent builds using local Chromium/Firefox browsers on Linux host fail - https://phabricator.wikimedia.org/T90673#1073556 (Krinkle) Today I investigated this on integration-slave1010. I set up a MediaWiki by running the build manual...
[15:43:00] (PS1) WMDE-Fisch: add Christoph Fischer to Wikidata notifications [integration/config] - https://gerrit.wikimedia.org/r/193379
[15:57:51] (PS2) WMDE-Fisch: add Christoph Fischer to Wikidata notifications [integration/config] - https://gerrit.wikimedia.org/r/193379
[16:00:18] YuviPanda|food: I've noticed :)
[16:04:10] Deployment-Systems, Release-Engineering, Wikimedia-Hackathon-2015: HHVM RepoAuthoritative Hackathon proof of concept - https://phabricator.wikimedia.org/T91074#1073661 (thcipriani) NEW
[16:08:03] (CR) Tobias Gritschacher: [C: 1] add Christoph Fischer to Wikidata notifications [integration/config] - https://gerrit.wikimedia.org/r/193379 (owner: WMDE-Fisch)
[16:16:57] greg-g: not fully sure how to deal with it. I'm prioritizing toollabs stability for now (next few days at least) since it definitely has waaaay more users than bets
[16:16:58] Beta
[16:17:33] greg-g: I do have some time set up with hashar next week to kill the last ::beta role (Parsoid)
[16:17:57] And my tracking bug is getting better and smaller
[16:18:12] what is your tracking bug again?
[16:20:15] YuviPanda|food: from http://etherpad.wikimedia.org/p/BetaClusterpriorityworksync :
[16:20:20] * We only have a small amount of Yuvi's time, so let's prioritize it and get all of the important things on his list. The list that Yuvi is working from is all the blockers of https://phabricator.wikimedia.org/T87220.
[16:20:24] I'm not sure that is a complete and sufficient list for this project to be deemed successful. I'm guessing much of the stuff in NEXT: Maint and NEXT: Feature (link above) will be needed as well.
[16:21:12] Should decide which of those things are and setup another blocked
[16:21:16] Blocker type thing
[16:21:29] https://phabricator.wikimedia.org/project/board/497/query/open/?order=priority
[16:21:33] we have projects for a reason :)
[16:21:58] Right.
[16:22:03] I don't want a "yuvi's bugs" tracking bug, no one else do that :)
[16:22:08] does*
[16:22:11] Maybe I should go through and assign things to me
[16:22:14] Yeah yeah
[16:22:17] #Yuvi's-Bugs
[16:22:20] 'Type thing'
[16:22:34] sure :)
[16:22:48] * greg-g needs to spend some time on that backlog again
[16:23:31] greg-g: I don't know what 'done' looks like for me.
[16:23:56] That tracking bug was useful because with that 'done' meant 'beta and prod do not differ unless absolutely necessary'
[16:24:06] YuviPanda|food: help make https://www.mediawiki.org/wiki/Beta_cluster/2014-15-Q3 happen
[16:24:09] While the work board was a loose collection of stuff that needed doing
[16:25:42] My food arrived
[16:25:43] Brb
[16:32:52] (CR) Hashar: [C: 1] add Christoph Fischer to Wikidata notifications [integration/config] - https://gerrit.wikimedia.org/r/193379 (owner: WMDE-Fisch)
[16:38:16] (CR) Hashar: [C: 1] "Excellent! Can't wait for MWV people to get used to rspec and testing." [integration/config] - https://gerrit.wikimedia.org/r/192857 (https://phabricator.wikimedia.org/T76627) (owner: Dduvall)
[16:43:41] (CR) Hashar: [C: -1] "Bah while reviewing the child change I noticed an issue. The check-voter pipeline should be replaced by 'check'. See inline diff, sorry." (2 comments) [integration/config] - https://gerrit.wikimedia.org/r/192857 (https://phabricator.wikimedia.org/T76627) (owner: Dduvall)
[16:44:43] Beta-Cluster, Labs, operations: Backport new salt-syndic packages - https://phabricator.wikimedia.org/T85442#1073725 (ArielGlenn) I've imported salt-syndic_2014.1.11 into our lucid/precise/trusty repos. All dependencies should be there already. Let me know if it wfy.
[16:44:57] (CR) Hashar: [C: -1] "Lovely cucumber. Need to use 'test' pipeline, see my comments on parent change https://gerrit.wikimedia.org/r/#/c/192857/ ." (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/192860 (https://phabricator.wikimedia.org/T89489) (owner: Dduvall)
[16:47:50] (CR) Hashar: [C: -1] "This change drops the chrome job, probably not what you wanted." (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/193361 (owner: Zfilipin)
[17:00:59] I can't join the hangout...
[17:01:45] wth: http://i.imgur.com/QlRhFbB.png
[17:02:28] (CR) Cmcmahon: "I am kind of surprised that this worked, but it's pretty neat. I'd like to see how SauceLabs captures a failure for this test." [integration/config] - https://gerrit.wikimedia.org/r/193361 (owner: Zfilipin)
[17:03:50] greg-g: might be Iceweasel. I find when I use Chromium I can't share my screen in Hangout, I have to use Chrome.
[17:05:48] greg-g: default disabled plugins, i think it can be allowed via the popin you get when you click on the lock icon
[17:09:45] I just restarted my browser and it worked, try the IT Dept method first :)
[17:10:51] ah yes, the old "turn it off and back on" solution. that's always worth trying.
[17:16:51] weird. Jenkins UI is in German for me, but looking at the settings it should be en-US. anyone know how to get it back to English?
[17:18:26] * andre__ hands over a babelfish
[17:20:49] "Fingerabdruck überprüfen" is my new jam
[17:26:56] chrismcmahon: https://wikitech.wikimedia.org/wiki/Release_Engineering/Argh#Jenkins_interface_language
[17:27:33] thanks legoktm, doing that now
[17:28:23] it worked
[17:33:24] greg-g: back.
[17:33:43] so none of that directly relates to what I’m doing. It’s just that without the solid base it’ll be hard to do those
[17:33:52] so my goal has always been in my head ‘minimize differences between beta and prod’
[17:33:58] which that tracking bug encapsulates perfectly
[17:51:00] (PS1) Jdlrobson: Setup Gather browser tests job [integration/config] - https://gerrit.wikimedia.org/r/193393 (https://phabricator.wikimedia.org/T91082)
[17:54:04] Quality-Assurance, Gather, Jenkins, Patch-For-Review: Setup Gather browser tests Jenkins job - https://phabricator.wikimedia.org/T91082#1073914 (Jdlrobson)
[17:59:47] Beta-Cluster: Cannot login to Betalabs - https://phabricator.wikimedia.org/T91084#1073939 (Ryasmeen) NEW
[18:06:56] Yippee, build fixed!
[18:06:57] Project browsertests-PageTriage-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #438: FIXED in 55 sec: https://integration.wikimedia.org/ci/job/browsertests-PageTriage-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce/438/
[18:08:35] !log Kicked deployment-bastion node in jenkins to try to fix jobs
[18:08:39] Logged the message, Master
[18:17:02] What's up with that? :/
[18:17:30] https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/7858/console is in progress but is "Waiting for next available executor on deployment-bastion.eqiad" for the hiwiki config
[18:18:04] but that's also "Waiting for next available executor on deployment-bastion.eqiad"
[18:18:46] I am starting to think I should make a ticket, except that the labs hardware issue might be the root cause. got another mysterious error just now: Database query error (internal_api_error_DBQueryError) (MediawikiApi::ApiError)
[18:19:04] any more details?
[18:19:59] Krenair: not really, I noticed it starting about maybe 36 hours ago, intermittent weird connection errors, sometimes to databases, sometimes apparently to other bits of beta cluster
[18:20:20] Krenair: if it continues to happen over the weekend I'll take more action
[18:24:01] (PS3) Dduvall: Make mediawiki-vagrant rspec job voting [integration/config] - https://gerrit.wikimedia.org/r/192857 (https://phabricator.wikimedia.org/T76627)
[18:24:06] chrismcmahon, we certainly have a few exception logs for that
[18:24:35] Yippee, build fixed!
[18:24:35] Project browsertests-CentralNotice-en.wikipedia.beta.wmflabs.org-windows_7-internet_explorer-11-sauce build #147: FIXED in 6 min 33 sec: https://integration.wikimedia.org/ci/job/browsertests-CentralNotice-en.wikipedia.beta.wmflabs.org-windows_7-internet_explorer-11-sauce/147/
[18:24:41] Error: 145 Table './centralauth/localnames' is marked as crashed and should be repaired (10.68.16.193)
[18:24:43] well, fuck
[18:24:59] Yeah that'll break things, chrismcmahon
[18:26:56] Do we not have an sql command in deployment-prep like we do in prod? :/
[18:30:22] Yippee, build fixed!
[18:30:23] Project browsertests-CirrusSearch-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #477: FIXED in 2 min 24 sec: https://integration.wikimedia.org/ci/job/browsertests-CirrusSearch-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/477/
[18:33:35] has someone repaired it..?
[18:34:44] no idea
[18:38:11] something is wrong with labs in general as https://wikitech.wikimedia.org/wiki/Nova_Resource:I-00000220.eqiad.wmflabs shows deployment-db1 is stopped (rebooting), yet I can SSH into it
[18:42:14] (PS4) Dduvall: Make mediawiki-vagrant rspec job voting [integration/config] - https://gerrit.wikimedia.org/r/192857 (https://phabricator.wikimedia.org/T76627)
[18:46:17] (PS2) Dduvall: Make mediawiki-vagrant cucumber job voting [integration/config] - https://gerrit.wikimedia.org/r/192860 (https://phabricator.wikimedia.org/T89489)
[18:46:35] Yippee, build fixed!
[18:46:35] Project UploadWizard-api-commons.wikimedia.beta.wmflabs.org build #1548: FIXED in 33 sec: https://integration.wikimedia.org/ci/job/UploadWizard-api-commons.wikimedia.beta.wmflabs.org/1548/
[18:47:29] Yippee, build fixed!
[18:47:29] Project browsertests-Echo-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #394: FIXED in 13 min: https://integration.wikimedia.org/ci/job/browsertests-Echo-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/394/
[18:48:18] chrismcmahon, last occurred 18:27:49
[18:48:39] going back further in the log shows it stopping for a while :/
[18:48:45] that was probably the test I just quoted from
[18:48:56] chrismcmahon, give me a ping if you see it again, perhaps?
[18:49:08] Krenair: will do
[18:49:41] Krenair: the daily browser test builds are all running now and will be for the next couple of hours
[18:49:52] Yippee, build fixed!
[18:49:52] Project browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_8.1-internet_explorer-11-sauce build #340: FIXED in 42 min: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_8.1-internet_explorer-11-sauce/340/
[18:49:57] Krenair: if you want to camp out on the log for the next little while that might be valuable
[18:55:55] Yippee, build fixed!
[18:55:55] Project browsertests-Math-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #429: FIXED in 1 min 5 sec: https://integration.wikimedia.org/ci/job/browsertests-Math-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce/429/
[18:57:07] Yippee, build fixed!
[18:57:07] Project browsertests-CentralNotice-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #211: FIXED in 1 min 11 sec: https://integration.wikimedia.org/ci/job/browsertests-CentralNotice-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/211/
[19:13:59] Release-Engineering, RESTBase: Update / maintain beta labs restbase cluster - https://phabricator.wikimedia.org/T91102#1074329 (GWicke) NEW a:GWicke
[19:26:48] Yippee, build fixed!
[19:26:49] Project browsertests-Flow-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #532: FIXED in 39 min: https://integration.wikimedia.org/ci/job/browsertests-Flow-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce/532/
[19:27:31] Yippee, build fixed!
[19:27:31] Project browsertests-WikiLove-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #469: FIXED in 3 min 19 sec: https://integration.wikimedia.org/ci/job/browsertests-WikiLove-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/469/
[19:27:59] Yippee, build fixed!
[19:28:00] Project browsertests-PageTriage-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #434: FIXED in 1 min 10 sec: https://integration.wikimedia.org/ci/job/browsertests-PageTriage-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/434/
[19:36:21] Yippee, build fixed!
[19:36:21] Project browsertests-Echo-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #376: FIXED in 10 min: https://integration.wikimedia.org/ci/job/browsertests-Echo-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce/376/
[19:37:56] PROBLEM - Puppet failure on deployment-cache-upload02 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[19:39:49] Yippee, build fixed!
[19:39:50] Project browsertests-CentralNotice-en.wikipedia.beta.wmflabs.org-os_x_10.9-safari-sauce build #147: FIXED in 2 min 3 sec: https://integration.wikimedia.org/ci/job/browsertests-CentralNotice-en.wikipedia.beta.wmflabs.org-os_x_10.9-safari-sauce/147/ [19:44:00] PROBLEM - Puppet failure on deployment-cache-mobile03 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [19:45:32] PROBLEM - Puppet failure on deployment-cache-text02 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [19:45:46] Yippee, build fixed! [19:45:46] Project browsertests-CentralNotice-en.wikipedia.beta.wmflabs.org-windows_7-internet_explorer-10-sauce build #147: FIXED in 1 min 37 sec: https://integration.wikimedia.org/ci/job/browsertests-CentralNotice-en.wikipedia.beta.wmflabs.org-windows_7-internet_explorer-10-sauce/147/ [19:54:19] PROBLEM - Puppet failure on deployment-cache-bits01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [20:04:59] Yippee, build fixed! 
[20:05:00] Project browsertests-VisualEditor-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #542: FIXED in 25 min: https://integration.wikimedia.org/ci/job/browsertests-VisualEditor-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce/542/ [20:30:34] Project browsertests-MultimediaViewer-mediawiki.org-linux-firefox-sauce build #493: FAILURE in 1 min 17 sec: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-mediawiki.org-linux-firefox-sauce/493/ [20:30:50] Release-Engineering, Engineering-Community, Team-Practices, Wikimedia-Hackathon-2015, and 2 others: RelEng team offsite - May 2015 - Pre Wikimedia Hackathon - https://phabricator.wikimedia.org/T89036#1074647 (Rfarrand) [20:41:24] Project browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_8-internet_explorer-sauce build #492: FAILURE in 39 min: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_8-internet_explorer-sauce/492/ [20:42:29] Yippee, build fixed! [20:42:29] Project browsertests-Math-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #437: FIXED in 1 min 3 sec: https://integration.wikimedia.org/ci/job/browsertests-Math-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/437/ [20:52:44] chrismcmalunch: hey, sorry again, I am unable to even look at a computer right now, i hate these migraines [21:02:18] Yippee, build fixed!
[21:02:18] Project browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #493: FIXED in 29 min: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce/493/ [21:08:01] Project browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_7-internet_explorer-9-sauce build #338: FAILURE in 44 min: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_7-internet_explorer-9-sauce/338/ [21:08:12] greg-g: fine by me if you want to cancel. [21:15:24] greg-g: sent email about one issue I wanted to bring up with you, but not super high priority [21:23:08] ahah [21:40:10] hashar: all fixed up i think https://gerrit.wikimedia.org/r/#/c/192860/ [21:40:31] and https://gerrit.wikimedia.org/r/#/c/192857/ [21:41:02] marxarelli: hey :) [21:41:15] the check-voter / check / test pipelines are slightly confusing :/ [21:41:43] hashar: yeah, i had to think about your explanation for a second, but it makes sense now [21:42:13] (CR) Hashar: [C: 1] "All great. Feel free to deploy whenever MWV repo participants are ready :-)" [integration/config] - https://gerrit.wikimedia.org/r/192857 (https://phabricator.wikimedia.org/T76627) (owner: Dduvall) [21:42:59] marxarelli: we have some kind of workflow documented at http://www.mediawiki.org/wiki/Continuous_integration/Workflow [21:43:03] should still be accurate [21:43:58] (CR) Hashar: [C: 1] "test + gate-and-submit for the win. The same goes for all other projects :-)" [integration/config] - https://gerrit.wikimedia.org/r/192860 (https://phabricator.wikimedia.org/T89489) (owner: Dduvall) [21:44:07] marxarelli: all good_ [21:44:14] hashar: nice. thanks! [21:44:41] it'll feel good to close out some mw-vagrant tasks today :) [21:44:44] have you ever deployed a zuul change ?
[21:45:30] hashar: yes [21:45:59] hashar: followed this https://www.mediawiki.org/wiki/Continuous_integration/Zuul#Deploy_configuration [21:46:25] (CR) Dduvall: [C: 2] Make mediawiki-vagrant rspec job voting [integration/config] - https://gerrit.wikimedia.org/r/192857 (https://phabricator.wikimedia.org/T76627) (owner: Dduvall) [21:46:32] (CR) Dduvall: [C: 2] Make mediawiki-vagrant cucumber job voting [integration/config] - https://gerrit.wikimedia.org/r/192860 (https://phabricator.wikimedia.org/T89489) (owner: Dduvall) [21:47:31] (Merged) jenkins-bot: Make mediawiki-vagrant rspec job voting [integration/config] - https://gerrit.wikimedia.org/r/192857 (https://phabricator.wikimedia.org/T76627) (owner: Dduvall) [21:47:37] (Merged) jenkins-bot: Make mediawiki-vagrant cucumber job voting [integration/config] - https://gerrit.wikimedia.org/r/192860 (https://phabricator.wikimedia.org/T89489) (owner: Dduvall) [21:49:36] PROBLEM - Host deployment-restbase02 is DOWN: CRITICAL - Host Unreachable (10.68.16.234) [21:49:40] !log Reloading Zuul to deploy I273270295fa5a29422a57af13f9e372bced96af1 and I81f5e785d26e21434cd66dc694b4cfe70c1fa494 [21:49:46] Logged the message, Master [21:50:00] PROBLEM - Host deployment-restbase03 is DOWN: CRITICAL - Host Unreachable (10.68.16.240) [21:53:41] marxarelli: congratulations_ [21:56:15] !log Job beta-update-databases-eqiad and node deployment-bastion.eqiad have been stuck for the past 4 hours [21:56:19] Logged the message, Master [21:58:20] !log Ragekilled all queued jobs related to beta and force restarted Jenkins slave agent on deployment-bastion.eqiad [21:58:24] Logged the message, Master [21:58:54] 21:58:44 Finished: SUCCESS [21:58:55] Yay [21:59:45] :) [22:00:03] Krinkle: I still have two unread emails from you :( [22:00:11] havent found the envy/time to process them yet [22:00:18] last two weeks have been messy [22:00:26] hashar: Also new policy as of now: Don't fix anything related to CI without
filing a task. We're too often forgetting how broken things are. [22:00:29] We need to keep tracking stuff [22:00:53] I mean things that aren't newly broken. E.g. I'm filing a task now that beta-update deployment job is shit and needs rewriting/fixing/whatever [22:01:01] It crashes every other day, unacceptable. [22:01:22] hehe [22:01:42] the job that wraps around scap? [22:01:43] Beta-Cluster, Patch-For-Review, Puppet: Puppet failures on deployment-bastion - https://phabricator.wikimedia.org/T75520#1074769 (Krinkle) Is this still an issue? [22:02:06] hashar: The race condition / problem /something/ thingy, that causes that job to loop back to itself and deadlock waiting for executors [22:02:18] yeah that is logged in phabricator [22:02:41] hashar: Which one? https://phabricator.wikimedia.org/T72597 ? [22:03:03] yup [22:03:09] havent followed up with upstream [22:14:05] hashar: Would it be feasible to rewrite the job to not trigger this bug? [22:14:28] I think you had some ideas during the hackathon for using bash or python instead [22:16:30] Krinkle: for the db update yeah [22:16:34] anyone aware of a problem with varnish in beta? http://en.wikipedia.beta.wmflabs.org/wiki/Special:MobileOptions gives a 503 [22:16:42] hehe [22:16:52] one would really need to write a "debug 503 errors" [22:17:46] chrismcmahon: that page / extension must throw some fatal error or exception [22:17:56] maybe logstash-beta has the info? [22:18:05] I cant login, my ssh keys are on some other computer [22:20:28] hashar: Also new policy as of now: Don't fix anything related to CI without filing a task. We're too often forgetting how broken things are. [22:21:03] so... don't try to run the troubleshooting stuff on mw.org?
[22:22:23] Krenair: Krinkle has a good point [22:22:29] !log has its limits :D [22:22:46] I am probably going to start some weekly CI triage meetings [22:22:59] I'm not trying to make a point [22:22:59] probably one during european afternoon and another for SF morning [22:23:04] hashar: just walked kaldari through debugging on beta and he thinks he found the issue for the 503 so :) [22:23:19] JohnFLewis: awesome!!! [22:23:21] If we're going to start asking people to file bugs before fixing them, OK, just that we should document it on the page [22:23:36] could raise that to the QA list maybe [22:23:40] I think it is a good idea [22:24:04] Phabricator makes it way easier to handle for sure [22:24:41] JohnFLewis: dont you get access on the beta cluster? [22:25:02] hashar: I have access, but same issue as you :p [22:25:09] :) [22:25:19] I have killed my main laptop on tuesday [22:25:38] Though it is laziness more than anything as I am like 5 metres away from my computer ;) [22:25:39] killed another comp on wednesday while ordering a new comp [22:25:56] hashar: step away from whatever computing device you are currently using [22:26:06] and just two hours ago I was still fighting with the new Intel NUC (mini computer) I bought [22:26:21] Reedy: oh this one is safe. It has no ssh key yet :) [22:26:46] hashar: "yet" don't make ops revoke more stuff :p [22:27:05] exactly [22:27:20] and I have had to change/revoke a hundred or so of passwords [22:27:29] Yikes [22:29:27] PROBLEM - Free space - all mounts on deployment-jobrunner01 is CRITICAL: CRITICAL: deployment-prep.deployment-jobrunner01.diskspace.root.byte_percentfree.value (<11.11%) [22:30:39] well I am off.
Time to sleep [22:53:14] Release-Engineering, Staging, releng-201415-Q3: [Quarterly Success Metric] By team test history - https://phabricator.wikimedia.org/T88706#1074918 (dduvall) [23:23:02] marxarelli: is there a nicer way of finding the domain name for a given wiki in vagrant than to check if it's called 'devwiki' and then use either role::mediawiki::hostname or mediawiki::multiwiki::base_domain based on that? [23:24:27] PROBLEM - Free space - all mounts on deployment-jobrunner01 is CRITICAL: CRITICAL: deployment-prep.deployment-jobrunner01.diskspace.root.byte_percentfree.value (<12.50%) [23:44:41] marxarelli: is there a way to raise the vagrant memory limit for the initial provisioning of a role? I get occasional segfaults while the pip packages for sentry are being compiled, probably due to memory limits [23:45:39] tgr: re wiki domain name, unfortunately, i don't think so [23:46:11] tgr: re memory, there _is_ a way for roles to bump the memory but it's a bit fragile at the moment [23:46:47] i believe cirrussearch does it [23:47:39] ah, I see [23:48:39] the problem is a `vagrant reload` is required to actually apply the change, and users don't tend to know that [23:48:46] it is done post-provision too [23:48:55] right [23:49:04] at one point it automagically called reload but I think that is broken now [23:49:42] I'll just ignore it then [23:49:45] @machine.action(:reload, {}) [23:49:56] that is supposed to do it (in the plugin) [23:49:56] we need to bump the base requirement anyway [23:50:03] to at least 1.5G [23:50:10] :( probably [23:50:14] eventually it should use precompiled python packages from our repo anyway [23:50:23] or stop running the jobrunner with hhvm [23:50:42] oh werd.
that's a big chunk o ram [23:50:46] or tune the hhvm that the jobrunner uses to grab less ram if possible [23:51:16] *nod* hhvm cache space for a script that is tiny [23:51:48] * bd808 knows where some of the bodies are buried [23:52:06] a combination of things is probably in order: greater base requirement, more efficient hhvm stuffs, and possibly role descriptions outside of puppet that can dictate tweaks to settings [23:52:47] hmmm... a role could change things outside of puppet now [23:52:56] e.g. a role_settings.yaml file or something [23:53:36] we'd just need some parseable metadata in the role header doc and a way to clean up after [23:53:56] that actually could be used for another problem, how to uninstall a role when it is disabled [23:55:26] re uninstall, maybe a convention like puppet/modules/roles/role_name/uninstall.pp? [23:55:54] you'd still have to know to run it, but yeah [23:55:55] * marxarelli second guesses that idea [23:56:26] most roles don't need it but a few do. like cirrus [23:56:45] which leaves elasticsearch installed [23:57:00] of course that's not a problem if you destroy your vm on a regular basis [23:57:22] * bd808 has also thought of something that gently whines at you if your VM is over 1 month old [23:57:24] i've run into it a number of times [23:57:39] though we're probably switching up roles more than most people [23:58:01] I have 4 different vagrant VMs running right now :) [23:58:05] on my laptop [23:58:16] haha. mw-vagrant needs to be a tamagotchi [23:58:29] yes! [23:58:51] should we really be destroying VMs that often? [23:58:51] Eloquence has described it like that before actually.
needs care and feeding or it dies [23:59:08] legoktm: I would recommend it, yes [23:59:21] that encourages you not to make custom hacks [23:59:40] hmm, I just don't want to have to re-populate the database again [23:59:41] and instead add new puppet to set things up [23:59:49] that's the rub [23:59:55] I have a bunch of test users and intentionally broken stuff in CA to match production :/