[00:00:02] err.. java [00:00:30] pre-commit hooks are annoying cuz you have to tell people to install them :\ [00:01:28] https://gerrit.googlesource.com/plugins/commit-validator-sample/+/master :) [00:01:51] Really is this easy, one file per rule: https://gerrit.googlesource.com/plugins/commit-validator-sample/+/master/src/main/java/com/googlesource/gerrit/plugins/validators/HelloWorldValidator.java [00:02:25] * bd808 shudders at remembering enough java to do useful work [00:02:50] those 7 years of my life were sooo long ago [00:03:59] In any case: pretty easy should we want to go that route [00:04:10] * RainbowSprinkles decidedly does not lick any cookies, but points to the jar [00:04:29] * bd808 waves hands and mumbles "manager" [00:04:38] :) [01:15:38] 10MediaWiki-Codesniffer: use code sniffer to check for unused imports - https://phabricator.wikimedia.org/T167694#3341293 (10Samwilson) This sounds like something that Phan would check, rather than phpcs (not that it currently does, I think; there's a mention [[ https://github.com/etsy/phan/pull/243 | here ]] th... [01:34:34] PROBLEM - Puppet staleness on deployment-aqs01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [43200.0] [05:32:08] 10MediaWiki-Codesniffer: use code sniffer to check for unused imports - https://phabricator.wikimedia.org/T167694#3341293 (10Legoktm) Probably more like something phan does, but it should also be doable in phpcs as well. [06:38:06] 10Beta-Cluster-Infrastructure, 10Release-Engineering-Team, 10Labs, 10Labs-Infrastructure: Create a new instance flavor for deployment-prep - https://phabricator.wikimedia.org/T167723#3342471 (10Paladox) What about using the xtra large image to future proof things? 
[06:42:22] Project selenium-Wikibase » chrome,beta,Linux,BrowserTests build #390: 04FAILURE in 2 hr 2 min: https://integration.wikimedia.org/ci/job/selenium-Wikibase/BROWSER=chrome,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/390/ [07:49:56] 10Beta-Cluster-Infrastructure, 10Release-Engineering-Team (Next): deployment-tin has disk space issues - https://phabricator.wikimedia.org/T166492#3343634 (10hashar) [07:50:01] 10Beta-Cluster-Infrastructure, 10Release-Engineering-Team (Kanban), 10Labs, 10Labs-Infrastructure: Create a new instance flavor for deployment-prep - https://phabricator.wikimedia.org/T167723#3343631 (10hashar) 05Open>03Resolved a:03hashar The new flavor I am requesting is the same as `m1.large` ! [07:50:28] 10Beta-Cluster-Infrastructure, 10Release-Engineering-Team (Next): deployment-tin has disk space issues - https://phabricator.wikimedia.org/T166492#3297749 (10hashar) 05stalled>03Open `m1.large` would do it. [08:25:45] 10Release-Engineering-Team (Kanban), 10Operations, 10HHVM, 10Upstream: HHVM 3.18 crashes in realloc() as exposed by luasandbox - https://phabricator.wikimedia.org/T165043#3343697 (10MoritzMuehlenhoff) [08:27:28] 10Release-Engineering-Team (Kanban), 10Operations, 10HHVM, 10Upstream: HHVM 3.18 crashes in realloc() as exposed by luasandbox - https://phabricator.wikimedia.org/T165043#3255165 (10MoritzMuehlenhoff) 05Open>03Resolved I've built new HHVM packages (3.18.2+wmf5) which include the upstream fix from https... [09:16:18] 10Browser-Tests-Infrastructure, 10Release-Engineering-Team (Kanban), 10Patch-For-Review, 10User-zeljkofilipin: Run WebdriverIO tests in CI for extensions - https://phabricator.wikimedia.org/T164721#3343900 (10zeljkofilipin) Apologies for the delay, I will be working on it this week. 
[09:19:55] 10Browser-Tests-Infrastructure, 10Release-Engineering-Team (Next), 10Wikidata, 10Patch-For-Review, and 3 others: Increase in failures caused by Saucelabs - https://phabricator.wikimedia.org/T152963#3343903 (10zeljkofilipin) @pmiazga apologies for the delay, I have to finish some selenium+node tasks before... [09:36:44] hey hey hashar [09:37:19] I appear to be having an issue in mediawiki phpunit tests where a new db connection starts getting used midway through a test and thus the temporary tables being used are no longer there [09:37:26] Appearing in https://gerrit.wikimedia.org/r/#/c/351645/11 [09:37:30] heard of something like this before? [09:41:08] addshore: yup the temp tables are per connection [09:41:40] could it be due to the SiteTableSiteLookup ? [09:41:54] or HashSiteStore [09:42:20] I know nothing about that code though :( [09:42:22] hmmmm [09:42:37] I have 2 stack traces of what gets the connections [09:43:12] but I think it might be something else that is resetting the loadbalancer or something between these 2 stack traces. [09:43:24] meh, need to get xdebug set up in my new dev environment... [09:43:45] :( [09:45:03] I am myself fighting having different results between "composer exec phpunit" vs ./vendor/bin/phpunit [09:45:04] grbmbmbm [09:46:44] hahaha [09:47:02] well, I'm not using docker for my development stuff, just need to figure out what i do with breakpoints now... 
[09:47:03] hah [09:50:52] !log big refactoring for zookeeper merged in operations/puppet - https://gerrit.wikimedia.org/r/#/c/354449 - ping the Analytics team for any issue [09:50:55] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [09:51:14] (hope that nothing will come up but let me know otherwise) [09:52:45] PROBLEM - Puppet errors on deployment-zookeeper02 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [09:58:37] PROBLEM - Puppet errors on deployment-zookeeper01 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [10:02:23] lol [10:02:27] checking [10:03:14] Error 400 on SERVER: Could not find class role::zookeeper::server [10:04:40] I am wondering if it makes sense to assign the profile directly [10:06:17] 10Release-Engineering-Team (Watching / External), 10Developer-Relations, 10MediaWiki-General-or-Unknown, 10WMF-Legal, 10RfC: Remove @author lines from code - https://phabricator.wikimedia.org/T139301#3344114 (10Nemo_bis) [10:13:34] (03PS1) 10Hashar: test: ensure consistent report width [tools/codesniffer] - 10https://gerrit.wikimedia.org/r/358546 [10:22:26] PROBLEM - Long lived cherry-picks on puppetmaster on deployment-puppetmaster02 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [10:36:58] !log delete deployment-zookeeper01 (old trusty instance, replaced with a jessie one) [10:37:01] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [10:38:10] 10Release-Engineering-Team (Kanban), 10CirrusSearch, 10Discovery, 10Discovery-Search, and 2 others: Figure out why browser tests can't create suggestion box - https://phabricator.wikimedia.org/T162966#3344255 (10zeljkofilipin) The above patch fixes the problem for phantomjs 2.1.1. The problem is still repr... 
[10:42:46] RECOVERY - Puppet errors on deployment-zookeeper02 is OK: OK: Less than 1.00% above the threshold [0.0] [10:43:40] PROBLEM - Host deployment-zookeeper01 is DOWN: CRITICAL - Host Unreachable (10.68.17.157) [10:44:38] yeah sorry zk01, so long and thanks for all the fish [10:46:39] (03PS1) 10Hashar: PHP CodeSniffer on CI should only lint HEAD [tools/codesniffer] - 10https://gerrit.wikimedia.org/r/358554 (https://phabricator.wikimedia.org/T158974) [10:53:04] (03CR) 10jerkins-bot: [V: 04-1] PHP CodeSniffer on CI should only lint HEAD [tools/codesniffer] - 10https://gerrit.wikimedia.org/r/358554 (https://phabricator.wikimedia.org/T158974) (owner: 10Hashar) [10:53:46] !log rolling restart of all kafka brokers to pick up the new zookeper change (only deployment-zookeeper02 available) [10:53:49] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [10:54:34] all right everything should be good [10:59:32] PROBLEM - Puppet errors on deployment-pdf01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [11:03:59] (03PS1) 10Hashar: Fix WhiteSpace/SpaceAfterClosureSniff [tools/codesniffer] - 10https://gerrit.wikimedia.org/r/358567 (https://phabricator.wikimedia.org/T149623) [11:04:35] (03CR) 10Hashar: Add sniff to enforce "function (" for closures (031 comment) [tools/codesniffer] - 10https://gerrit.wikimedia.org/r/355183 (https://phabricator.wikimedia.org/T149623) (owner: 10Legoktm) [11:05:58] (03PS2) 10Hashar: PHP CodeSniffer on CI should only lint HEAD [tools/codesniffer] - 10https://gerrit.wikimedia.org/r/358554 (https://phabricator.wikimedia.org/T158974) [11:10:27] (03PS1) 10Zfilipin: Fixed link to documentation [selenium] - 10https://gerrit.wikimedia.org/r/358568 (https://phabricator.wikimedia.org/T59841) [11:24:47] (03PS1) 10KartikMistry: Update Apertium packages [integration/config] - 10https://gerrit.wikimedia.org/r/358571 (https://phabricator.wikimedia.org/T167247) [11:49:16] (03PS4) 10Aude: Add 
composer-install & use in composer-test-mwextension [integration/config] - 10https://gerrit.wikimedia.org/r/354522 (https://phabricator.wikimedia.org/T165316) [12:03:06] (03CR) 10Hashar: [C: 032] "As the repositories get renamed, we will have to mark the old ones in Gerrit read-only / archived." [integration/config] - 10https://gerrit.wikimedia.org/r/358571 (https://phabricator.wikimedia.org/T167247) (owner: 10KartikMistry) [12:04:27] (03Merged) 10jenkins-bot: Update Apertium packages [integration/config] - 10https://gerrit.wikimedia.org/r/358571 (https://phabricator.wikimedia.org/T167247) (owner: 10KartikMistry) [12:10:39] (03PS1) 10Zfilipin: WIP Headless Chrome [selenium] - 10https://gerrit.wikimedia.org/r/358578 (https://phabricator.wikimedia.org/T167507) [12:11:55] 10Release-Engineering-Team, 10Page-Previews, 10Reading-Web-Backlog, 10Reading-Web-Kanban-Board: Create bot that automatically rebases and rebuilds patches to master - https://phabricator.wikimedia.org/T167181#3344408 (10ovasileva) [12:12:27] (03CR) 10jerkins-bot: [V: 04-1] WIP Headless Chrome [selenium] - 10https://gerrit.wikimedia.org/r/358578 (https://phabricator.wikimedia.org/T167507) (owner: 10Zfilipin) [12:19:29] (03CR) 10Hashar: Fixed link to documentation (031 comment) [selenium] - 10https://gerrit.wikimedia.org/r/358568 (https://phabricator.wikimedia.org/T59841) (owner: 10Zfilipin) [12:33:26] (03CR) 10Hashar: [C: 04-1] "Also need a test to cover the new option in spec/browser_factory/chrome_spec.rb" (032 comments) [selenium] - 10https://gerrit.wikimedia.org/r/358578 (https://phabricator.wikimedia.org/T167507) (owner: 10Zfilipin) [12:39:32] (03CR) 10Hashar: [C: 04-1] Fixed link to documentation [selenium] - 10https://gerrit.wikimedia.org/r/358568 (https://phabricator.wikimedia.org/T59841) (owner: 10Zfilipin) [12:40:49] (03Abandoned) 10Zfilipin: Fixed link to documentation [selenium] - 10https://gerrit.wikimedia.org/r/358568 (https://phabricator.wikimedia.org/T59841) (owner: 10Zfilipin) 
[12:47:32] (03PS2) 10Hashar: Add a skin specific selenium job [integration/config] - 10https://gerrit.wikimedia.org/r/358088 (https://phabricator.wikimedia.org/T167543) (owner: 10Jdlrobson) [12:48:30] 10Deployment-Systems, 10Operations: Have fallback communication channel when freenode has problems - https://phabricator.wikimedia.org/T127904#3344510 (10Marostegui) p:05High>03Normal [12:49:31] (03PS2) 10Zfilipin: WIP Headless Chrome [selenium] - 10https://gerrit.wikimedia.org/r/358578 (https://phabricator.wikimedia.org/T167507) [12:49:58] (03CR) 10Hashar: [C: 032] Add a skin specific selenium job [integration/config] - 10https://gerrit.wikimedia.org/r/358088 (https://phabricator.wikimedia.org/T167543) (owner: 10Jdlrobson) [12:51:03] (03Merged) 10jenkins-bot: Add a skin specific selenium job [integration/config] - 10https://gerrit.wikimedia.org/r/358088 (https://phabricator.wikimedia.org/T167543) (owner: 10Jdlrobson) [12:57:39] 10Release-Engineering-Team (Kanban), 10MinervaNeue, 10Reading-Web-Backlog, 10Patch-For-Review: Skins cannot run browser tests per commit - https://phabricator.wikimedia.org/T167543#3344527 (10hashar) 05Open>03Resolved a:05hashar>03Jdlrobson CI config aced by @Jdlrobson I have reopened/rebased the... 
[12:57:52] 10Release-Engineering-Team (Kanban), 10MinervaNeue, 10Reading-Web-Backlog, 10Patch-For-Review: Skins cannot run browser tests per commit - https://phabricator.wikimedia.org/T167543#3344530 (10hashar) 05Resolved>03Open [13:01:36] (03PS1) 10Hashar: Rename selenium skin job [integration/config] - 10https://gerrit.wikimedia.org/r/358585 (https://phabricator.wikimedia.org/T167543) [13:04:48] PROBLEM - Puppet errors on deployment-conf03 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [13:08:03] (03CR) 10Hashar: [C: 032] Rename selenium skin job [integration/config] - 10https://gerrit.wikimedia.org/r/358585 (https://phabricator.wikimedia.org/T167543) (owner: 10Hashar) [13:10:59] (03Merged) 10jenkins-bot: Rename selenium skin job [integration/config] - 10https://gerrit.wikimedia.org/r/358585 (https://phabricator.wikimedia.org/T167543) (owner: 10Hashar) [13:15:19] (03PS1) 10Hashar: Inject dependencies for mwskin-mw-selenium-jessie [integration/config] - 10https://gerrit.wikimedia.org/r/358594 (https://phabricator.wikimedia.org/T167543) [13:15:32] (03CR) 10Hashar: [C: 032] Inject dependencies for mwskin-mw-selenium-jessie [integration/config] - 10https://gerrit.wikimedia.org/r/358594 (https://phabricator.wikimedia.org/T167543) (owner: 10Hashar) [13:20:31] (03Merged) 10jenkins-bot: Inject dependencies for mwskin-mw-selenium-jessie [integration/config] - 10https://gerrit.wikimedia.org/r/358594 (https://phabricator.wikimedia.org/T167543) (owner: 10Hashar) [13:25:19] 10Release-Engineering-Team (Kanban), 10MinervaNeue, 10Reading-Web-Backlog, 10Patch-For-Review: Skins cannot run browser tests per commit - https://phabricator.wikimedia.org/T167543#3344561 (10hashar) So the CI cruft is configured. It find a feature `All good.` but no scenario: ``` 00:01:38.852 + bundle ex... [13:40:47] twentyafterfour: do you want to roll with https://gerrit.wikimedia.org/r/#/c/177181/ ? 
[13:46:17] Project selenium-VisualEditor » firefox,beta,Linux,BrowserTests build #427: 04FAILURE in 2 min 16 sec: https://integration.wikimedia.org/ci/job/selenium-VisualEditor/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/427/ [13:47:17] !log nodepool force running puppet for: lower min-ready for trusty [puppet] - https://gerrit.wikimedia.org/r/356466 [13:47:39] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [13:54:22] (03PS2) 10Zfilipin: WIP Cleanup Ruby jobs [integration/config] - 10https://gerrit.wikimedia.org/r/356413 (https://phabricator.wikimedia.org/T164479) [13:55:23] 10Gerrit, 10Developer-Relations, 10GitHub-Mirrors, 10Repository-Admins, and 2 others: Add CODE_OF_CONDUCT.md to Wikimedia repositories - https://phabricator.wikimedia.org/T165540#3323678 (10Yaron_Koren) I'm not sure I should wade into this controversy, but here goes... Huji somewhat made this point earlie... [14:00:57] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team (Kanban), 10Labs, 10Labs-Infrastructure, 10Nodepool: Lower rate of Nodepool requests to OpenStack API - https://phabricator.wikimedia.org/T167803#3344784 (10hashar) [14:01:31] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team (Kanban), 10Labs, 10Labs-Infrastructure, 10Nodepool: Lower rate of Nodepool requests to OpenStack API - https://phabricator.wikimedia.org/T167803#3344799 (10hashar) I have quickly talked to @chasemp about it. It is best done early in a... 
[14:03:29] PROBLEM - Free space - all mounts on deployment-phab01 is CRITICAL: CRITICAL: deployment-prep.deployment-phab01.diskspace.root.byte_percentfree (<100.00%) [14:15:52] (03PS3) 10Zfilipin: WIP Cleanup Ruby jobs [integration/config] - 10https://gerrit.wikimedia.org/r/356413 (https://phabricator.wikimedia.org/T164479) [14:19:00] 10Browser-Tests-Infrastructure, 10Release-Engineering-Team (Backlog), 10Tracking, 10User-zeljkofilipin: Migration of browsertests* Jenkins jobs to selenium* jobs cleanup and optional task - https://phabricator.wikimedia.org/T140235#3344853 (10zeljkofilipin) [14:21:06] 10Beta-Cluster-Infrastructure, 10Release-Engineering-Team (Kanban): deployment-phab01 spams FAIL170612 [ERROR] mysqld: The table 'daemon_logevent' is full - https://phabricator.wikimedia.org/T167688#3344854 (10hashar) p:05Triage>03High Also causes Shinken alarms since the `/` partition is full :( [14:21:09] 10Browser-Tests-Infrastructure, 10Release-Engineering-Team (Backlog), 10Goal, 10User-zeljkofilipin: Migration of browsertests* Jenkins jobs to selenium* jobs cleanup and optional task - https://phabricator.wikimedia.org/T140235#3344857 (10zeljkofilipin) [14:24:00] 10Browser-Tests-Infrastructure, 10Release-Engineering-Team (Backlog), 10Goal, 10User-zeljkofilipin: Migration of browsertests* Jenkins jobs to selenium* jobs cleanup and optional task - https://phabricator.wikimedia.org/T140235#3344868 (10zeljkofilipin) [14:24:04] 10Continuous-Integration-Config, 10Release-Engineering-Team (Kanban), 10User-zeljkofilipin: Refactor Jenkins job configuration for UploadWizard-api-commons.wikimedia.beta.wmflabs.org - https://phabricator.wikimedia.org/T140076#3344865 (10zeljkofilipin) 05Open>03Resolved a:03zeljkofilipin ``` ~/Document... 
[14:25:06] 10Browser-Tests-Infrastructure, 10Release-Engineering-Team (Backlog), 10Goal, 10User-zeljkofilipin: Migration of browsertests* Jenkins jobs to selenium* jobs cleanup and optional task - https://phabricator.wikimedia.org/T140235#2457586 (10zeljkofilipin) 05Open>03stalled [14:37:50] (03PS5) 10Zfilipin: WIP Run WebdriverIO tests in CI for extensions [integration/config] - 10https://gerrit.wikimedia.org/r/352602 (https://phabricator.wikimedia.org/T164721) [15:05:54] (03CR) 10Hashar: [C: 031] "As discussed as a first iteration that is good enough ™. That opens a lot of questions for the future though:" [integration/config] - 10https://gerrit.wikimedia.org/r/357741 (https://phabricator.wikimedia.org/T166888) (owner: 10Thcipriani) [15:25:29] (03PS6) 10Zfilipin: WIP Run WebdriverIO tests in CI for extensions [integration/config] - 10https://gerrit.wikimedia.org/r/352602 (https://phabricator.wikimedia.org/T164721) [15:29:09] 10Browser-Tests-Infrastructure, 10Release-Engineering-Team, 10MediaWiki-General-or-Unknown, 10Epic, and 6 others: Port Selenium tests from Ruby to Node.js - https://phabricator.wikimedia.org/T139740#3345062 (10zeljkofilipin) [15:37:27] o/ [15:37:43] I'm getting permission issues working with my repo on halfak@deployment-tin:/srv/deployment/ores/deploy [15:38:03] halfak: have you tried sudo -s? [15:38:40] wtf I can root? [15:39:32] it's labs [15:39:40] still :\ [15:39:42] :) [15:39:52] fixed perms and moving on. [15:39:53] Thanks Amir1 [15:41:01] New hash is 862aea9. 
Relevant task is T167223 [15:41:01] T167223: Early June ORES prod deploy - https://phabricator.wikimedia.org/T167223 [15:41:04] * halfak takes notes [15:43:04] (03PS7) 10Zfilipin: Run WebdriverIO tests in CI for extensions [integration/config] - 10https://gerrit.wikimedia.org/r/352602 (https://phabricator.wikimedia.org/T164721) [15:46:46] Project selenium-MobileFrontend » chrome,beta,Linux,BrowserTests build #453: 04FAILURE in 24 min: https://integration.wikimedia.org/ci/job/selenium-MobileFrontend/BROWSER=chrome,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/453/ [15:56:07] Project selenium-MobileFrontend » firefox,beta,Linux,BrowserTests build #453: 04FAILURE in 34 min: https://integration.wikimedia.org/ci/job/selenium-MobileFrontend/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/453/ [15:57:47] (03CR) 10Hashar: [C: 031] "A few random comments :]" (038 comments) [integration/config] - 10https://gerrit.wikimedia.org/r/357741 (https://phabricator.wikimedia.org/T166888) (owner: 10Thcipriani) [15:59:42] !log deployed ores-prod-deploy:862aea9 [15:59:45] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [16:04:17] !log cherry-picked 357985/4 on puppetmaster [16:04:20] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [16:24:34] (03PS1) 10Zfilipin: Run mediawiki-core-qunit-selenium-jessie job for RelatedArticles [integration/config] - 10https://gerrit.wikimedia.org/r/358630 (https://phabricator.wikimedia.org/T164721) [16:29:08] (03CR) 10Zfilipin: [C: 032] Run mediawiki-core-qunit-selenium-jessie job for RelatedArticles [integration/config] - 10https://gerrit.wikimedia.org/r/358630 (https://phabricator.wikimedia.org/T164721) (owner: 10Zfilipin) [16:30:28] (03Merged) 10jenkins-bot: Run mediawiki-core-qunit-selenium-jessie job for RelatedArticles [integration/config] - 10https://gerrit.wikimedia.org/r/358630 (https://phabricator.wikimedia.org/T164721) (owner: 
10Zfilipin) [17:00:28] !log hacking apache on mediawiki05 to test rewrite rules [17:00:31] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [17:05:44] PROBLEM - Puppet errors on swift is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [17:07:19] PROBLEM - Puppet errors on swift-storage-01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [17:10:29] 10Browser-Tests-Infrastructure, 10Release-Engineering-Team (Kanban), 10Patch-For-Review, 10User-zeljkofilipin: Run WebdriverIO tests in CI for extensions - https://phabricator.wikimedia.org/T164721#3345389 (10zeljkofilipin) TLDR: mediawiki-core-qunit-selenium-jessie runs for RelatedArticles in experimental... [17:30:53] 10Deployment-Systems, 10Operations: Have fallback communication channel when freenode has problems - https://phabricator.wikimedia.org/T127904#3345441 (10Dzahn) We could just agree on something like "if freenode is down we all switch to efnet, same channel names" and be done with it. vs. installing our own irc... [17:44:14] RainbowSprinkles: thaaaaank yoooou :D [17:44:24] harej: I'm being lazy [17:44:35] So messages go out in this deploy [17:44:37] Enabling is then easier [17:45:02] I'm not sure what a non-lazy deployment looks like so it looks good to me. [17:45:27] Chad just merged your patch; I think now you have to set up the messages and then scap? [17:45:58] Well, Chad is gonna scap for the train later today [17:46:07] So that means I don't need to scap to make it live on testwiki [17:46:19] I can just merge the patch, and sync the settings files [17:46:29] Should make the deploy window needed a lot smaller! [17:46:50] I'm scapping this minute actually, I like to load all the baggages on the train before it's time to depart the station :) [17:46:54] (choo choo) [17:47:27] Around what time will everything be up and ready? (Not to be a bother, just so I know when to check testwiki) [17:48:14] Oh, I'm not *enabling* it yet. 
Just sync'ing the code + messages into the cache :) [17:48:17] That's on Reedy [17:48:19] :) [17:48:40] harej: So we can just get a small window sometime soon [17:48:51] And we don't have to spend most of it waiting for scap to run [17:51:55] I'm happy for whenever, just let me know when it is :P [17:57:29] I guess we need to just pick a window and do it [18:00:01] I've got a meeting in a couple of hours... Probably not worth trying to fit it in between that and going to the airport [18:00:05] Tomorrow would be fine [18:02:04] deployment-ms-be04.deployment-prep.eqiad.wmflabs is gobbling CPU like mad, is something interesting happening with deployment-prep? And/or is that instance sick? [18:02:31] godog: ^ maybe yours? [18:09:14] 10Gerrit, 10Developer-Relations, 10GitHub-Mirrors, 10Repository-Admins, and 2 others: Add CODE_OF_CONDUCT.md to Wikimedia repositories - https://phabricator.wikimedia.org/T165540#3345594 (10Nuria) > It could have been done by anyone in the world, on any platform, and under any sort of social conditions, As... [18:10:42] PROBLEM - Puppet errors on deployment-etcd-01 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [18:12:19] PROBLEM - Puppet errors on deployment-cache-upload04 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [18:22:34] PROBLEM - Puppet errors on deployment-cache-text04 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [18:28:41] bblack: are those you, by chance? 
[18:28:47] https://www.irccloud.com/pastebin/DHUSiZvp/ [18:29:37] andrewbogott: probably [18:30:50] grrr at pastbin that has some kind of "don't allow Copy for Pasting" bullshit, and grrr harder that my browser actually obeys it :P [18:31:37] PROBLEM - Work requests waiting in Zuul Gearman server https://grafana.wikimedia.org/dashboard/db/zuul-gearman on contint1001 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [140.0] [18:32:48] andrewbogott: looking into why [18:33:14] great. it's not blocking anyone as far as I know, I just noticed while looking at a different thing. [18:34:00] andrewbogott: do those labs hosts use different versions of puppet/ruby sorts of things than prod? [18:34:19] should be the same, or nearly the same. No exported resources or puppetdb though [18:34:39] And deployment-prep has its own local puppetmaster that may not merge quite as often [18:35:27] oh hmmm [18:35:41] I think it has to do with fact stringify/type changes, maybe labs isn't caught up with prod on that [18:36:09] basically the problem in the ruby code in that template is that it does: [18:36:10] that box is using deployment-puppetmaster02.deployment-prep.eqiad.wmflabs as the master [18:36:12] ncpus = @facts['processorcount'] [18:36:23] and then later uses ncpus in a numeric math expression [18:36:40] we used to do more like: ncpus = @facts['processorcount'].to_i [18:37:03] because all facts were string data types, now they're really integers from the get-go and we've been stripping out all those excess .to_i type things [18:37:06] hm, pretty big diff on that puppetmaster [18:37:17] !log doing a git fetch and rebase for deployment-puppetmaster02 [18:37:20] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [18:38:17] !log restarting apache2 on deployment-puppetmaster02 [18:38:18] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [18:41:02] PROBLEM - Puppet errors on deployment-db04 is CRITICAL: 
CRITICAL: 30.00% of data above the critical threshold [0.0] [18:41:20] hm, all those syncs and updates don't seem to have made a difference [18:42:07] yeah [18:43:06] 10Beta-Cluster-Infrastructure: Request for Importers privilege on beta cluster - https://phabricator.wikimedia.org/T167823#3345675 (10pmiazga) [18:43:08] ok here's a guess [18:43:27] in https://phabricator.wikimedia.org/T166372 v.olans says: After having upgraded the fleet to facter version 2.4.6 in T166203, we can now test that disabling stringify_facts is actually a noop across the fleet. [18:43:30] T166203: Upgrade facter to version 2.4.6 - https://phabricator.wikimedia.org/T166203 [18:43:34] but: [18:43:35] root@deployment-cache-upload04:~# facter --version [18:43:35] 2.2.0 [18:43:54] trying facter package update locally there [18:44:00] 10Beta-Cluster-Infrastructure: Request for Importers privilege on beta cluster - https://phabricator.wikimedia.org/T167823#3345689 (10pmiazga) [18:44:17] yeah that did it [18:44:29] so labs probably needs to follow suit on "apt-get install facter" everywhere [18:44:40] deployment-prep anyways [18:44:41] we did, but I guess we missed beta since it has a local salt master [18:44:44] I'll do that now [18:44:47] ok awesome [18:45:04] just 'apt-get install facter' right? 
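(Editorial sketch of the type issue discussed above, not part of the channel log.) With an older facter the `processorcount` fact reaches the template as a String, and in Ruby `String#*` repeats the string instead of multiplying. The fact hashes, the helper names, and the multiplier 250 below are illustrative assumptions and are not taken from the real Puppet template.

```ruby
# Hypothetical stand-ins for the facts hash a template might see.
string_facts  = { 'processorcount' => '2' }  # stringified fact (older facter)
integer_facts = { 'processorcount' => 2 }    # typed fact (newer facter)

# Illustrative helper mirroring "use ncpus in a numeric math expression".
def thread_pool_max(facts)
  ncpus = facts['processorcount']
  ncpus * 250
end

broken = thread_pool_max(string_facts)   # String#* repeats "2" 250 times
fixed  = thread_pool_max(integer_facts)  # Integer#* gives 500

# Defensive variant: Integer() accepts both "2" and 2.
def thread_pool_max_safe(facts)
  Integer(facts['processorcount']) * 250
end

puts broken.length                        # 250 characters, all "2"
puts fixed                                # 500
puts thread_pool_max_safe(string_facts)   # 500
```

Either keeping the conversion in the template or making sure every host runs a facter that emits typed facts avoids the silent string repetition; the log shows the second route being taken.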
[18:45:20] yeah that worked for me on upload04 [18:45:27] 10Beta-Cluster-Infrastructure, 10Release-Engineering-Team (Next): deployment-tin has disk space issues - https://phabricator.wikimedia.org/T166492#3345693 (10greg) [18:45:31] 10Beta-Cluster-Infrastructure, 10Release-Engineering-Team (Watching / External), 10Labs, 10Labs-Infrastructure: Create a new instance flavor for deployment-prep - https://phabricator.wikimedia.org/T167723#3345690 (10greg) 05Resolved>03Invalid a:05hashar>03None [18:45:32] maybe 'apt-get -y install facter' to avoid prompts [18:45:49] also, lol at the other thing it fixed in the varnish config templating: [18:45:52] --p thread_pool_min=250 -p thread_pool_max=2222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222 -p thread_pool_timeout=120 \ [18:45:57] +-p thread_pool_min=250 -p thread_pool_max=500 -p thread_pool_timeout=120 \ [18:46:33] !log using salt to "apt-get -y install facter" on all deployment-prep instances [18:46:36] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [18:47:17] implicit dynamic typing ftw (that that wasn't just a flat out compile error and instead tried to configure more threads than there are protons in the universe) [18:48:01] ok, now I see [18:48:03] -worker_processes 1; [18:48:03] +worker_processes 1; [18:48:08] that seems promising :) [18:48:35] ACKNOWLEDGEMENT - Work requests waiting in Zuul Gearman server https://grafana.wikimedia.org/dashboard/db/zuul-gearman on contint1001 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [140.0] amusso Unusual amount of patches sent to mediawiki/extensions [18:52:20] yeah all the crazy work that went into those patches amounts to a no-op for a single cpu host no matter what :) [18:52:48] really for any virtuals, even if they have 
>1 virtual cpus [18:57:20] RECOVERY - Puppet errors on deployment-cache-upload04 is OK: OK: Less than 1.00% above the threshold [0.0] [18:57:26] 10Beta-Cluster-Infrastructure, 10media-storage, 10Patch-For-Review: deployment-ms-be03.deployment-prep and deployment-ms-be04.deployment-prep have high load / system CPU - https://phabricator.wikimedia.org/T160990#3345715 (10hashar) Seems the load on deployment-ms-be04 is enough to overload labvirt1006 ( T16... [18:57:32] RECOVERY - Puppet errors on deployment-cache-text04 is OK: OK: Less than 1.00% above the threshold [0.0] [18:59:24] 10Gerrit, 10Developer-Relations, 10GitHub-Mirrors, 10Repository-Admins, and 2 others: Add CODE_OF_CONDUCT.md to Wikimedia repositories - https://phabricator.wikimedia.org/T165540#3345719 (10Yaron_Koren) @Nuria - I believe that's incorrect; the code of conduct defines Wikimedia technical spaces as "physical... [19:00:43] RECOVERY - Puppet errors on deployment-etcd-01 is OK: OK: Less than 1.00% above the threshold [0.0] [19:03:18] https://github.com/blog/2378-nested-teams-add-depth-to-your-team-structure [19:21:02] RECOVERY - Puppet errors on deployment-db04 is OK: OK: Less than 1.00% above the threshold [0.0] [19:28:55] 10Beta-Cluster-Infrastructure, 10media-storage, 10Patch-For-Review: deployment-ms-be03.deployment-prep and deployment-ms-be04.deployment-prep have high load / system CPU - https://phabricator.wikimedia.org/T160990#3345809 (10hashar) nscd: ``` Tue 13 Jun 2017 07:14:14 PM UTC - 30055: handle_request: request r... [19:31:56] PROBLEM - Puppet errors on deployment-imagescaler01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [19:37:41] Zuul seems to be backed up. Is that mediawiki/extensions/Babel task stalled? [19:39:06] bearND: too many patches got sent [19:39:30] RainbowSprinkles: cant we push the patches anomie has sent instead of CR+2 ? [19:39:36] hashar: ah, ok [19:40:08] hashar: I'm actually getting really tired of mass changes generally. 
[19:40:15] CI'd or not. [19:40:21] Seems to be an increase lately.... [19:40:23] :'( [19:40:40] for the code of conduct, we just blindly pushed the change since it was trivial [19:40:44] bypassing Gerrit/CI entirely [19:40:53] in this case, I guess we can wait for the patch to be tested [19:40:55] then push it [19:41:21] Should've pushed with +2 attached to it already [19:41:23] * RainbowSprinkles sighs [19:41:50] hashar: If we just directly push them, then we've got ~100 patches to deal with in gerrit. [19:41:53] It won't auto-close them [19:41:58] §?????? [19:42:01] it does in my experience [19:42:09] A direct push to a changeset? [19:42:11] * RainbowSprinkles shrugs [19:42:13] Never tried [19:42:16] send patch to Gerrit [19:42:22] wait for CI to vote V+2 to make sure pass [19:42:25] then locally git push [19:42:28] That's...weird [19:42:33] and Gerrit is smart enough to autoclose the change for you [19:42:41] with a message such as "hashar successfully pushed change" [19:42:43] Yeah, I guess I didn't think Gerrit could be smart [19:42:49] Gerrit is stupid most of the time ;-) [19:42:50] see also: jgit [19:43:22] hashar: Anyway, I don't really care. If you want to do that with Brad, go ahead. [19:47:17] I am gonna skip. It is late [19:50:49] Or could just go to bed and it'll be gone by the morning :p [19:52:57] chasemp: yeah, I'm not sure what happened with that one [19:57:48] PROBLEM - Puppet errors on deployment-ms-be04 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [20:04:07] hashar: Actually, the /test/ queue is going down pretty quick. I think easiest would be to just +2 them a few at a time as traffic is low. 
[20:04:14] I mean, there's no rush that they *have* to land quickly :) [20:04:25] sure thing :] [20:05:08] https://grafana.wikimedia.org/dashboard/db/zuul-gearman?orgId=1 it is slowly going down [20:09:10] 10Continuous-Integration-Config, 10MediaWiki-General-or-Unknown, 10MediaWiki-Unit-tests, 10Tracking: Let ApiDocumentationTest structure test pass on all repos - https://phabricator.wikimedia.org/T154838#3346083 (10Umherirrender) [20:19:34] !log deployment-prep: added Polishdeveloper to the "importer" global group. https://deployment.wikimedia.beta.wmflabs.org/wiki/Special:GlobalUserRights/Polishdeveloper - T167823 [20:19:38] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [20:19:38] T167823: Request for Importers privilege on beta cluster - https://phabricator.wikimedia.org/T167823 [20:20:42] 10Beta-Cluster-Infrastructure, 10Release-Engineering-Team (Kanban): Request for Importers privilege on beta cluster - https://phabricator.wikimedia.org/T167823#3346133 (10hashar) a:03hashar Should be good now based on https://deployment.wikimedia.beta.wmflabs.org/wiki/Special:GlobalUserRights/Polishdeveloper... [20:20:53] 10Gerrit, 10Labs, 10wikitech.wikimedia.org: Request to rename LegoFan4000 to MacFan4000 on WikiTech - https://phabricator.wikimedia.org/T165624#3346137 (10bd808) >>! In T165624#3342899, @bd808 wrote: > * rename `MacFan4000` to `Abandoned-MacFan4000` using [[ https://wikitech.wikimedia.org/wiki/Special:Rename... [20:21:07] 10Beta-Cluster-Infrastructure, 10Release-Engineering-Team (Kanban): Request for Importers privilege on beta cluster - https://phabricator.wikimedia.org/T167823#3345675 (10hashar) p:05Triage>03Normal [20:25:48] hashar: How does one drop just a particular queue? Like if we wanted to drop the postmerge jobs [20:26:37] RainbowSprinkles: you cant. Then the postmerge jobs are at lowest precedence. 
So they will end up running when the queues are empty [20:26:45] and that will catch up over (my) night :] [20:28:01] Ah ok, I thought we had done that once before [20:28:02] nvm [20:31:36] !log Nodepool: deleted a bunch of Trusty instances. It scheduled lot of them that are taking slots in the pool. Better have jessie nodes to be spawned instead since there is high demand for them [20:31:40] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [20:31:46] Deficit: ci-jessie-wikimedia: 139 (start: 140 min-ready: 12 ready: 1 capacity: 0) [20:31:47] :( [20:32:48] RECOVERY - Puppet errors on deployment-ms-be04 is OK: OK: Less than 1.00% above the threshold [0.0] [20:36:24] bah I think nodepool is confused [20:36:40] it keeps spawning trusty instances even if ther eis no demand for them [20:37:28] !log Restarting Nodepool. apparently confused in pool tracking and spawning to many Trusty nodes (7 instead of 4) [20:37:36] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [20:38:07] so apparantly you shouldn't trust nodepool with trusty :) [20:38:24] ;D [20:53:12] 10Beta-Cluster-Infrastructure, 10Release-Engineering-Team (Kanban): Request for Importers privilege on beta cluster - https://phabricator.wikimedia.org/T167823#3346222 (10pmiazga) @hashar it works, thanks. I'm going to import max 20-30 articles. [20:53:19] 10Beta-Cluster-Infrastructure, 10Release-Engineering-Team (Kanban): Request for Importers privilege on beta cluster - https://phabricator.wikimedia.org/T167823#3346223 (10pmiazga) 05Open>03Resolved [20:53:30] I still think that testing php55 on wmf stuff is pointless........ (and yes, I know about T86081) [20:53:30] T86081: Complete the use of HHVM over Zend PHP on the Wikimedia cluster - https://phabricator.wikimedia.org/T86081 [20:55:19] RainbowSprinkles: we want mediawiki to continue working on zend php [20:55:32] I said for wmf stuff [20:56:10] what's your definition of "wmf stuff"? 
[20:57:03] Platonides: deploy branches [20:58:11] that's still mediawiki… [20:59:17] Platonides: yes, but those branches are not use dby outsiders. Everything merged to master will run on zend [20:59:32] ah I found the bug in zuul/gearman/nodepool :] [20:59:50] build:mediawiki-core-phpcs-jessie:ci-trusty-wikimedia 0 0 0 [20:59:50] build:mediawiki-core-phpcs-jessie:ci-jessie-wikimedia 0 0 8 [20:59:50] build:mediawiki-core-phpcs-jessie 6 1 8 [21:00:03] job is registered to work on two different kind of workers (trusty and jessie) [21:00:07] there are 6 in the queue [21:00:32] and that demand of 6 is added to both trusty and jessie. That results in unwanted trusty instances to be spawned [21:01:13] I don't like too much that mentality of "we can commit things here that are not allowed on master" [21:01:16] it's a road of peri [21:01:18] *peril [21:01:48] that's not the point. the point is that php55 tests slow down deploys and swat [21:02:23] That ^ [21:02:29] and typically there is nothing in a deploy branch that isn't already merged in master [21:02:34] That too ^ [21:02:57] nobody is saying that wmf-config should be written in hack (today anyway) [21:03:08] I said that! But it was a dumb idea :p [21:03:41] * bd808 wants to switch to PHP7 and be done with parallel runtimes [21:04:04] but I'm not even a MW dev anymore so ... :) [21:04:42] what about php 7.1? [21:04:56] 10Release-Engineering-Team (Kanban), 10Phabricator: Custom task form for #mediawiki-extension-requests - https://phabricator.wikimedia.org/T160374#3346237 (10mmodell) Done: #cleanup [21:06:20] what about it? [21:08:41] He said php7. So was wondering about if we should switch to php 7.1. [21:09:29] I miss php4 :( [21:10:06] Why php4? [21:11:15] They were simpler times [21:12:51] paladox: I think you are misunderstanding [21:13:06] Oh sorry. 
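(Note: a minimal Python sketch of the demand double-counting hashar describes at 21:00, not Nodepool's or Zuul's actual code. The job name, worker labels, and queue length of 6 come from the status lines in the log; `naive_demand`, `single_label_demand`, and the dictionary layout are hypothetical names used only for illustration.)

```python
from collections import defaultdict

# Queue length per job, taken from the gearman status lines in the log.
queued = {"mediawiki-core-phpcs-jessie": 6}

# Worker labels the job is registered on (from the same status output).
labels = {
    "mediawiki-core-phpcs-jessie": ["ci-trusty-wikimedia", "ci-jessie-wikimedia"],
}


def naive_demand(queued, labels):
    """Hypothetical model of the bug: the whole queue counts toward every label."""
    demand = defaultdict(int)
    for job, count in queued.items():
        for label in labels[job]:
            demand[label] += count
    return dict(demand)


def single_label_demand(queued, preferred):
    """Hypothetical alternative: each job counts toward exactly one label."""
    demand = defaultdict(int)
    for job, count in queued.items():
        demand[preferred[job]] += count
    return dict(demand)


print(naive_demand(queued, labels))
print(single_label_demand(queued, {"mediawiki-core-phpcs-jessie": "ci-jessie-wikimedia"}))
```

In the first model both labels report a demand of 6, which would explain Trusty instances being spawned for a job that only needs Jessie workers. The second model assumes the job is registered against a single label.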
[21:13:11] 10Release-Engineering-Team (Kanban), 10Phabricator: Custom task form for #mediawiki-extension-requests - https://phabricator.wikimedia.org/T160374#3346254 (10mmodell) and the form: https://phabricator.wikimedia.org/transactions/editengine/maniphest.task/view/33/ [21:13:27] !log Gracefully restarting Zuul [21:13:32] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [21:23:35] PROBLEM - Free space - all mounts on deployment-kafka01 is CRITICAL: CRITICAL: deployment-prep.deployment-kafka01.diskspace.root.byte_percentfree (<10.00%) [21:24:54] 10Release-Engineering-Team (Kanban), 10Phabricator: Custom task form for #mediawiki-extension-requests - https://phabricator.wikimedia.org/T160374#3346277 (10mmodell) 05Open>03Resolved [21:25:20] 10Release-Engineering-Team (Kanban), 10Phabricator: Custom task form for #mediawiki-extension-requests - https://phabricator.wikimedia.org/T160374#3096644 (10mmodell) I'm sorry that took me so long, I've been slacking on the custom forms lately. [21:30:41] RECOVERY - Work requests waiting in Zuul Gearman server https://grafana.wikimedia.org/dashboard/db/zuul-gearman on contint1001 is OK: OK: Less than 30.00% above the threshold [90.0] [21:32:34] 10Release-Engineering-Team (Kanban), 10Phabricator: Custom task form for #mediawiki-extension-requests - https://phabricator.wikimedia.org/T160374#3346287 (10SamanthaNguyen) @mmodell No worries! Thanks for creating it. :) [21:34:07] 10Beta-Cluster-Infrastructure, 10Release-Engineering-Team (Kanban): deployment-phab01 spams FAIL170612 [ERROR] mysqld: The table 'daemon_logevent' is full - https://phabricator.wikimedia.org/T167688#3346289 (10mmodell) I'll just kill the instance [21:35:00] hmm [21:35:01]

Queue only mode: preparing to exit, queue length: 71

[21:35:03] hashar ^^ [21:35:12] yeah I am on it [21:35:12] i get that on https://integration.wikimedia.org/zuul/ [21:35:13] 10Beta-Cluster-Infrastructure, 10Release-Engineering-Team (Kanban): deployment-phab01 spams FAIL170612 [ERROR] mysqld: The table 'daemon_logevent' is full - https://phabricator.wikimedia.org/T167688#3346294 (10mmodell) 05Open>03Resolved a:03mmodell [21:35:15] purging the queue [21:35:15] ok [21:35:17] and restarting zuul [21:35:18] thanks [21:35:19] :) [21:36:04] PROBLEM - Host deployment-phab01 is DOWN: CRITICAL - Host Unreachable (10.68.18.216) [21:52:11] PROBLEM - zuul_gearman_service on contint1001 is CRITICAL: connect to address 127.0.0.1 and port 4730: Connection refused [21:52:51] PROBLEM - zuul_service_running on contint1001 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/share/python/zuul/bin/python /usr/bin/zuul-server [21:55:13] RECOVERY - zuul_gearman_service on contint1001 is OK: TCP OK - 0.000 second response time on 127.0.0.1 port 4730 [21:55:51] RECOVERY - zuul_service_running on contint1001 is OK: PROCS OK: 2 processes with regex args ^/usr/share/python/zuul/bin/python /usr/bin/zuul-server [21:58:21] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team (Kanban), 10Zuul: Zuul refused to start from contint1001 - https://phabricator.wikimedia.org/T167833#3346309 (10hashar) [22:00:37] I deleted deployment-phab01, not sure how to make shinken ignore it? [22:00:58] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team (Kanban), 10Zuul: Zuul refused to start from contint1001 - https://phabricator.wikimedia.org/T167833#3346309 (10Paladox) Is there an error generated? 
[22:02:10] twentyafterfour: it regenerates its config from time to time via a cron [22:02:23] so that should clear out eventually [22:03:51] twentyafterfour: I tried earlier to clean up the daemon_log table but I could not even find a way to connect to mysql [22:03:53] so I gave up [22:03:58] Hi, im wondering is anyone aware scap deploy and scap deploy --force are throwing errors. [22:04:25] https://phabricator.wikimedia.org/P5574 [22:05:22] !log Zuul resarted manually from a terminal on contint1001. It does not have any statsd configuration so we will miss metrics for a bit till it is restarted properly. [22:05:25] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [22:06:22] 10Scap: scap failing intermittently at git tag - https://phabricator.wikimedia.org/T167836#3346358 (10Paladox) [22:07:33] hashar i thought zuul dosen't have a systemd script? Looking in puppet i see only it for zuul-merger. I also see not systemd script in the zuul deb package. Could it be the init script? [22:07:46] yup should be the init script [22:07:52] but it is broken somehow apparently :( [22:07:59] will try again tomorrow [22:08:31] ok [22:08:42] 10Scap: scap failing intermittently at git tag - https://phabricator.wikimedia.org/T167836#3346358 (10greg) What version are you using? We only support was is deployed to production by the deb packages. [22:08:53] hashar i guess your task should be triaged as ubn or high? [22:09:04] na service is working [22:09:12] ok [22:09:15] I mean [22:09:19] Zuul itself is working fine [22:09:24] fixing the script can wait tomorrow [22:09:47] ok [22:09:53] stuff is processing fine, we are just going to miss a few hours of metrics [22:10:03] not the end of the world. For now I am heading to bed! [22:10:19] hashar: thanks, that instance probably isn't needed anymore. I was using it to test scap deployment of phabricator but we have it deploying to phab-01 now (though paladox just found a bug with that) [22:10:40] yep. 
git tag intermittently fails [22:11:54] great !!! [22:11:58] 10Scap: scap failing intermittently at git tag - https://phabricator.wikimedia.org/T167836#3346394 (10mmodell) it should be the right thing, but I'll double check [22:13:41] 10Scap: scap failing intermittently at git tag - https://phabricator.wikimedia.org/T167836#3346417 (10mmodell) ``` twentyafterfour@phab-tin:~$ scap version 3.5.8-1 ``` [22:13:48] 10Scap: scap failing intermittently at git tag - https://phabricator.wikimedia.org/T167836#3346418 (10Paladox) >>! In T167836#3346372, @greg wrote: > What version are you using? We only support was is deployed to production by the deb packages Puppet installs it. [22:15:48] 10Release-Engineering-Team (Kanban), 10Scap: scap failing intermittently at git tag - https://phabricator.wikimedia.org/T167836#3346422 (10mmodell) 05Open>03Resolved a:03mmodell Ohh, I found the problem: ``` twentyafterfour@phab-tin:/srv/deployment/phabricator$ ls -la deployment/.git total 84 drwxrwxr-x... [22:16:26] 10Release-Engineering-Team (Kanban), 10Scap: scap failing intermittently at git tag - https://phabricator.wikimedia.org/T167836#3346426 (10Paladox) ah, thanks. [22:33:17] 10Gerrit, 10Developer-Relations, 10GitHub-Mirrors, 10Repository-Admins, and 2 others: Add CODE_OF_CONDUCT.md to Wikimedia repositories - https://phabricator.wikimedia.org/T165540#3346493 (10Mattflaschen-WMF) >>! In T165540#3323678, @MZMcBride wrote: > @Tgr: I'm struggling to see how you filing this task, r... [22:56:13] RainbowSprinkles apparently there's a bug in apache mina that may make Ed25519 support buggy https://bugs.chromium.org/p/gerrit/issues/detail?id=6504 it's fixed but they haven't done a release yet. [22:56:29] though in my testing it worked. So it probably depends on the OS you have. [22:58:02] Won't block upgrading. [22:59:36] Yep.
i know that :), just was saying :) [23:00:37] 10Gerrit, 10Developer-Relations, 10GitHub-Mirrors, 10Repository-Admins, and 2 others: Add CODE_OF_CONDUCT.md to Wikimedia repositories - https://phabricator.wikimedia.org/T165540#3324036 (10Dzahn) >>! In T165540#3336728, @Isarra wrote: > doesn't make sense to put it in every single repository when the CoC... [23:02:11] PROBLEM - zuul_gearman_service on contint1001 is CRITICAL: connect to address 127.0.0.1 and port 4730: Connection refused [23:02:52] PROBLEM - zuul_service_running on contint1001 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/share/python/zuul/bin/python /usr/bin/zuul-server [23:06:43] (03PS1) 10Legoktm: Fix-up SpaceAfterClosureSniff [tools/codesniffer] - 10https://gerrit.wikimedia.org/r/358894 [23:08:25] 10Beta-Cluster-Infrastructure, 10media-storage, 10Patch-For-Review: deployment-ms-be03.deployment-prep and deployment-ms-be04.deployment-prep have high load / system CPU - https://phabricator.wikimedia.org/T160990#3117645 (10faidon) That sounds like an issue with nscd and probably needs to be solved on that... [23:08:52] (03CR) 10Legoktm: [C: 032] "Thanks :D" [tools/codesniffer] - 10https://gerrit.wikimedia.org/r/358567 (https://phabricator.wikimedia.org/T149623) (owner: 10Hashar) [23:09:15] (03Abandoned) 10Legoktm: Fix-up SpaceAfterClosureSniff [tools/codesniffer] - 10https://gerrit.wikimedia.org/r/358894 (owner: 10Legoktm) [23:13:12] RECOVERY - zuul_gearman_service on contint1001 is OK: TCP OK - 0.000 second response time on 127.0.0.1 port 4730 [23:13:51] RECOVERY - zuul_service_running on contint1001 is OK: PROCS OK: 2 processes with regex args ^/usr/share/python/zuul/bin/python /usr/bin/zuul-server [23:15:04] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team (Kanban), 10Zuul: Zuul refused to start from contint1001 - https://phabricator.wikimedia.org/T167833#3346601 (10Paladox) I guess we should create a real systemd script to see if that fixes something. 
maybe systemd-sysvinit is buggy? [23:16:00] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team (Kanban), 10Zuul: Zuul refused to start from contint1001 - https://phabricator.wikimedia.org/T167833#3346309 (10Dzahn) ``` root@contint1001:~# systemctl status zuul ● zuul.service - LSB: Zuul Loaded: loaded (/etc/init.d/zuul) Active: a... [23:16:38] (03CR) 10Legoktm: [C: 032] "No idea this was a thing. Thanks :D" [tools/codesniffer] - 10https://gerrit.wikimedia.org/r/358546 (owner: 10Hashar) [23:18:36] (03CR) 10Legoktm: "I'm OK with this, but I'd like to release phpcs 3.0 first and see how much of an impact the new multiprocessing phpcs makes for performanc" [tools/codesniffer] - 10https://gerrit.wikimedia.org/r/358554 (https://phabricator.wikimedia.org/T158974) (owner: 10Hashar) [23:18:43] (03Merged) 10jenkins-bot: test: ensure consistent report width [tools/codesniffer] - 10https://gerrit.wikimedia.org/r/358546 (owner: 10Hashar) [23:21:16] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team (Kanban), 10Zuul: Zuul refused to start from contint1001 - https://phabricator.wikimedia.org/T167833#3346617 (10hashar) 05Open>03Resolved a:03Dzahn I tried with either: ``` systemctl start zuul /etc/init.d/zuul start ``` Maybe `restar... [23:21:36] 10Gerrit, 10Developer-Relations, 10GitHub-Mirrors, 10Repository-Admins, and 2 others: Add CODE_OF_CONDUCT.md to Wikimedia repositories - https://phabricator.wikimedia.org/T165540#3346620 (10Yaron_Koren) Oh, I didn't realize that @Isarra also made essentially the same point that I did. 
[23:24:47] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team (Kanban), 10Operations, 10Zuul: Migrate zuul-server behind systemd service - https://phabricator.wikimedia.org/T167845#3346626 (10hashar) [23:31:26] 10Gerrit, 10Release-Engineering-Team (Kanban), 10Patch-For-Review: Update gerrit to 2.13.8 - https://phabricator.wikimedia.org/T158946#3346652 (10demon) 05Open>03Resolved [23:31:38] 10Gerrit, 10Release-Engineering-Team (Kanban), 10Patch-For-Review, 10Regression, 10Upstream: Cannot log into Gerrit as of recent upgrade - https://phabricator.wikimedia.org/T152640#3346653 (10demon) 05Open>03Resolved [23:47:11] 10Deployment-Systems, 10Release-Engineering-Team (Next): Automate branch cutting, with period to test on Beta Cluster - https://phabricator.wikimedia.org/T167553#3346668 (10greg) [23:47:19] 10Deployment-Systems, 10Release-Engineering-Team (Next): Automate branch cutting, with period to test on Beta Cluster - https://phabricator.wikimedia.org/T167553#3336607 (10greg) p:05Triage>03Normal [23:48:38] 10Browser-Tests-Infrastructure, 10Release-Engineering-Team (Next), 10JavaScript, 10Patch-For-Review, 10User-zeljkofilipin: WebdriverIO should run Chrome headlessly - https://phabricator.wikimedia.org/T167507#3346674 (10greg) [23:48:56] 10Browser-Tests-Infrastructure, 10Release-Engineering-Team (Next), 10JavaScript, 10Patch-For-Review, 10User-zeljkofilipin: Refactor webdriverio tests for mediawiki core so users and pages are created via the api - https://phabricator.wikimedia.org/T167502#3346675 (10greg) [23:49:20] 10Browser-Tests-Infrastructure, 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team (Next), 10Wikidata, and 2 others: Run Wikibase daily browser tests on Jenkins - https://phabricator.wikimedia.org/T167432#3346676 (10greg) [23:49:55] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team (Next), 10Nodepool: Change time at which Nodepool refresh the images - https://phabricator.wikimedia.org/T166889#3346677 
(10greg) [23:51:05] 10Continuous-Integration-Config, 10MinervaNeue: Setup CI on Minerva repo - https://phabricator.wikimedia.org/T166750#3346681 (10greg) [23:51:39] 10Continuous-Integration-Config, 10Continuous-Integration-Infrastructure (Little Steps Sprint), 10Release-Engineering-Team (Backlog): Get rid of zend tests for wmf branches - https://phabricator.wikimedia.org/T94149#3346683 (10greg) [23:54:51] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team (Backlog), 10Operations: Make it possible to run the mediawiki testsuite against a staging repo of apt.wikimedia.org - https://phabricator.wikimedia.org/T157038#3346689 (10greg) [23:55:52] 10Deployment-Systems, 10Release-Engineering-Team (Backlog), 10Operations, 10Beta-Cluster-reproducible, and 2 others: Switch mwscript from Zend PHP5 to default php alternative (egHHVM) - https://phabricator.wikimedia.org/T146285#3346691 (10greg) [23:56:32] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team, 10MediaWiki-Unit-tests, 10MediaWiki-extensions-WikibaseClient, and 4 others: Job mediawiki-extensions-php55 frequently fails due to "Segmentation fault" - https://phabricator.wikimedia.org/T142158#3346693 (10greg) >>! In T142158#3278995,...