[01:36:18] Beta-Cluster-Infrastructure, TemplateStyles, Wikimedia-Extension-setup: Deploy TemplateStyles to the beta-cluster - https://phabricator.wikimedia.org/T133414#3317111 (tstarling)
[02:10:51] new phpcs version is no-go :( https://github.com/squizlabs/PHP_CodeSniffer/issues/1497
[07:48:50] RECOVERY - Puppet errors on deployment-jobrunner02 is OK: OK: Less than 1.00% above the threshold [0.0]
[07:51:30] !log Fixed puppet on deployment-aqs instances
[07:51:33] elukey: ^^^ :)
[07:51:34] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[07:51:40] some hiera variable was missing
[07:52:07] big refactoring done a while ago, didn't check labs
[07:52:16] I can imagine :-}
[07:52:21] sorry :(
[07:52:23] thanks for the fix!
[07:52:30] anyway puppet is happy now
[07:58:10] hashar: continuing from #ops - whenever you have time I'd really like to have your opinion about https://github.com/facebook/hhvm/issues/7854
[07:58:17] and if there is a quick way to test it
[08:02:14] RECOVERY - Puppet errors on deployment-aqs03 is OK: OK: Less than 1.00% above the threshold [0.0]
[08:02:15] Beta-Cluster-Infrastructure, Release-Engineering-Team: deployment-phab02 is shutoff since April 23rd, can it be deleted? - https://phabricator.wikimedia.org/T167090#3317351 (hashar)
[08:02:30] RECOVERY - Puppet errors on deployment-aqs01 is OK: OK: Less than 1.00% above the threshold [0.0]
[08:05:10] RECOVERY - Puppet errors on deployment-aqs02 is OK: OK: Less than 1.00% above the threshold [0.0]
[08:06:04] elukey: I have no idea what those persistent connections are supposed to be :(
[08:06:30] but it seems PHP has some feature to keep the connection open on completion. Which might make sense when it is used as a fcgi/fpm process maybe
[08:07:36] I checked with tcpdump and didn't manage to reduce the connections wasted after each run of RunJobs.php
[08:08:00] (clearly saw tons of SYNs with related FINs in a short timeframe)
[08:08:23] I also tried to manually comment out all the close() attempts in RedisConnectionPool.php
[08:08:26] nothing changed
[08:10:15] probably because mediawiki closes the connection explicitly?
[08:12:23] RECOVERY - Puppet errors on deployment-restbase02 is OK: OK: Less than 1.00% above the threshold [0.0]
[08:15:33] RECOVERY - Puppet errors on deployment-restbase01 is OK: OK: Less than 1.00% above the threshold [0.0]
[08:16:35] Continuous-Integration-Config, Release-Engineering-Team (Kanban): Remove SemanticSignup extension from CI config - https://phabricator.wikimedia.org/T165660#3317385 (hashar) Open>Resolved a:hashar SemanticSignup has been removed back in December 2016 with the other semantic extensions: 0bf736...
[08:20:44] hashar: I thought it was the RedisConnectionPool's descriptor, not sure where mediawiki could do it elsewhere
[08:20:47] do you have any suggestion?
[08:22:12] Continuous-Integration-Config, Operations, Operations-Software-Development, Patch-For-Review: Flake8 for python files without extension in puppet repo - https://phabricator.wikimedia.org/T144169#2590514 (Joe) >>! In T144169#2836235, @fgiunchedi wrote: > After some discussion in https://gerrit.wik...
[08:24:12] elukey: I would have to jump in that eventually
[08:24:32] though RedisConnectionPool explicitly closes all the connections on destruction
[08:24:39] though that is the /rpc/runJobs.php part
[08:25:23] for the jobrunner service I have no idea
[08:28:02] hashar: RunJobs.php instantiates MediaWiki() and leverages RedisConnectionPool, no?
[08:28:08] Is my understanding correct?
[08:28:31] jobrunner/jobchron use another codebase
[08:28:38] but my focus is RunJobs.php
[08:30:16] elukey: yes :-)
[08:30:41] with RunJobs.php apparently having the connections closed when RedisConnectionPool goes out of scope (via __destruct())
[08:30:53] I tried to comment out the close() in there
[08:30:56] on jobrunner02
[08:31:15] but didn't have too much time to keep going
[09:02:31] Release-Engineering-Team (Kanban), Phabricator, Patch-For-Review: Switch phabricator production to codfw - https://phabricator.wikimedia.org/T164810#3317457 (mmodell)
[09:02:36] Release-Engineering-Team (Kanban), Operations, Phabricator: setup/install phab1001.eqiad.wmnet - https://phabricator.wikimedia.org/T163938#3317456 (mmodell)
[09:03:14] Release-Engineering-Team (Kanban), Operations, Phabricator: setup/install phab1001.eqiad.wmnet - https://phabricator.wikimedia.org/T163938#3215445 (mmodell) @robh: Thanks, this is on my radar. The current plan is to switch production to phab2001.codfw temporarily, then switch back from there to phab1...
[09:08:00] Release-Engineering-Team (Kanban), Operations, Phabricator: setup/install phab1001.eqiad.wmnet - https://phabricator.wikimedia.org/T163938#3317477 (mmodell)
[09:09:15] Release-Engineering-Team (Kanban), Phabricator, Availability, Patch-For-Review, WorkType-NewFunctionality: Deploy phabricator to phab2001.codfw.wmnet - https://phabricator.wikimedia.org/T137928#3317482 (mmodell)
[09:10:02] Release-Engineering-Team (Kanban), Phabricator, Availability, Patch-For-Review, WorkType-NewFunctionality: Deploy phabricator to phab2001.codfw.wmnet - https://phabricator.wikimedia.org/T137928#2558406 (mmodell)
[09:10:05] Release-Engineering-Team (Kanban), Phabricator, Availability, WorkType-NewFunctionality: Configure phabricator clustering for daemons and repositories - https://phabricator.wikimedia.org/T143175#3317484 (mmodell) Open>stalled
[09:20:56] Release-Engineering-Team (Kanban), Phabricator: Custom task form for #Wikimedia-CentralNotice-Administration - https://phabricator.wikimedia.org/T164356#3317502 (mmodell) https://phabricator.wikimedia.org/maniphest/task/edit/form/32/
[09:22:03] Release-Engineering-Team (Kanban), Phabricator: Custom task form for #Wikimedia-CentralNotice-Administration - https://phabricator.wikimedia.org/T164356#3317503 (mmodell) I can't adjust the edit policy on individual forms, unfortunately. So Let me know if there are changes you would like me to make to th...
[09:22:18] Release-Engineering-Team (Kanban), Phabricator: Custom task form for #Wikimedia-CentralNotice-Administration - https://phabricator.wikimedia.org/T164356#3317504 (mmodell) Open>Resolved
[09:22:51] Beta-Cluster-Infrastructure, Release-Engineering-Team: deployment-phab02 is shutoff since April 23rd, can it be deleted? - https://phabricator.wikimedia.org/T167090#3317505 (mmodell) Delete it I guess.
[09:24:46] !log Deleting deployment-phab02 instance. Has been shut off since April 23rd - T167090
[09:24:51] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[09:24:51] T167090: deployment-phab02 is shutoff since April 23rd, can it be deleted? - https://phabricator.wikimedia.org/T167090
[09:25:13] Beta-Cluster-Infrastructure, Release-Engineering-Team (Kanban): deployment-phab02 is shutoff since April 23rd, can it be deleted? - https://phabricator.wikimedia.org/T167090#3317508 (hashar) Open>Resolved a:hashar Thank you :)
[09:29:54] Release-Engineering-Team (Kanban), Phabricator, Project-Admins, Release: Decide whether to continue using deployment blocker tasks or combine them with the release milestones - https://phabricator.wikimedia.org/T164978#3317518 (mmodell) Open>Resolved a:mmodell Seems like this was sett...
[09:46:17] Release-Engineering-Team (Backlog), Phabricator, Patch-For-Review, Technical-Debt: Replace deprecated phabricator conduit api calls in phab_epipe.py file - https://phabricator.wikimedia.org/T159043#3317553 (mmodell) p:Triage>Low
[10:02:46] Deployment-Systems, Release-Engineering-Team (Kanban), Scap (Scap3-Adoption-Phase1), scap2, and 2 others: Deploy jobrunner with scap3 (Trebuchet jobrunner/jobrunner) - https://phabricator.wikimedia.org/T129148#3317643 (hashar)
[10:02:55] Deployment-Systems, Release-Engineering-Team (Kanban), Scap (Scap3-Adoption-Phase1), scap2, MediaWiki-JobRunner: scap should allow restarting multiple services - https://phabricator.wikimedia.org/T167098#3317631 (hashar)
[10:10:49] PROBLEM - Puppet errors on deployment-eventlogging03 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[10:11:24] I am working on it
[10:11:27] will check
[10:12:03] Deployment-Systems, Release-Engineering-Team (Kanban), Scap (Scap3-Adoption-Phase1), scap2, and 2 others: Deploy jobrunner with scap3 (Trebuchet jobrunner/jobrunner) - https://phabricator.wikimedia.org/T129148#3317769 (hashar)
[10:12:06] Deployment-Systems, Release-Engineering-Team (Kanban), Scap (Scap3-Adoption-Phase1), scap2, and 2 others: figure out how to not restart jobrunner/jobchron in the non-active DC - https://phabricator.wikimedia.org/T167104#3317757 (hashar)
[10:16:29] Deployment-Systems, Release-Engineering-Team (Kanban), Scap (Scap3-Adoption-Phase1), scap2, and 2 others: figure out how to not restart jobrunner/jobchron in the non-active DC - https://phabricator.wikimedia.org/T167104#3317805 (akosiaris) I 've had a quick look into the `mask` feature of systemd...
[10:21:14] Continuous-Integration-Config, Operations, Operations-Software-Development, Patch-For-Review: Flake8 for python files without extension in puppet repo - https://phabricator.wikimedia.org/T144169#3317841 (fgiunchedi) >>! In T144169#3317402, @Joe wrote: >> Re: naming, I think an obvious convention...
[10:44:20] !log running eventlogging_cleaner.py (https://gerrit.wikimedia.org/r/#/c/356383/) on eventlogging to test the cleaning of old events
[10:44:24] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[10:45:50] RECOVERY - Puppet errors on deployment-eventlogging03 is OK: OK: Less than 1.00% above the threshold [0.0]
[10:45:51] this --^ is what will happen for the log database on the analytics slaves (and then on the el master as well) in production to implement data retention policies
[10:47:29] the idea is to drop data older than 90 days, applying a custom policy (namely only setting to NULL some attributes and not deleting the row) for some tables
[10:47:47] if anybody has concerns please reach out to me!
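(Editor's note on the retention plan described at 10:45-10:47: the following is a minimal Python sketch of that policy, not the actual eventlogging_cleaner.py. The table name, the column names and the `apply_retention` helper are hypothetical; only the 90-day cutoff and the split between deleting a row and setting some attributes to NULL come from the log.)

```python
from datetime import datetime, timedelta

# 90-day retention window, as stated in the log.
RETENTION = timedelta(days=90)

# Hypothetical per-table policy: columns to blank out instead of deleting the row.
NULLIFY_COLUMNS = {
    "ExampleSchema_123": ["clientIp", "userAgent"],
}


def apply_retention(table, rows, now):
    """Return the rows that survive the retention policy.

    Rows newer than the cutoff are kept untouched. Older rows are dropped,
    unless the table has a custom policy, in which case the row is kept
    with the listed columns set to None (SQL NULL).
    """
    cutoff = now - RETENTION
    kept = []
    for row in rows:
        if row["timestamp"] >= cutoff:
            kept.append(row)
        elif table in NULLIFY_COLUMNS:
            scrubbed = dict(row)  # copy, so the caller's row is not mutated
            for column in NULLIFY_COLUMNS[table]:
                if column in scrubbed:
                    scrubbed[column] = None
            kept.append(scrubbed)
        # Otherwise the row is older than the cutoff and is dropped.
    return kept
```

A real cleaner would presumably issue batched DELETE and UPDATE statements against the database rather than filter rows in memory; the in-memory form is used here only to keep the sketch self-contained.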
[10:50:18] (going to lunch, brb in 1h)
[10:52:00] This is in fatalmonitor: "16 Syntax Error: Couldn't find trailer dictionary"
[10:52:16] Just to make sure it's already known
[11:03:28] PROBLEM - Puppet errors on deployment-aqs01 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[11:06:52] Amir1: it is already a task in phabricator 100% sure
[11:07:04] that is just the stderr/stdout of some command that leaks to HHVM stdout
[11:07:13] and ends up being logged
[11:09:50] okay
[11:10:10] 7 Warning: JsonConfig: Invalid $wgJsonConfigModels['JsonConfig.Dashiki'] array value, 'class' not found [Called from JsonConfig\JCSingleton::getContentClass in /srv/mediawiki/php-1.30.0-wmf.2/extensions/JsonConfig/includes/JCSingleton.php at line 403] in /srv/mediawiki/php-1.30.0-wmf.2/includes/debug/MWDebug.php on line 309
[11:10:15] this one is also scary
[11:12:59] Amir1: https://phabricator.wikimedia.org/T166335 :)
[11:13:01] most are already filed in Phabricator
[11:13:01] :D
[11:13:06] going out for lunch
[11:13:18] hashar: have fun
[11:13:28] thanks for taking care of this
[11:27:58] (PS1) Tobias Gritschacher: Add CI jobs for AdvancedSearch extension [integration/config] - https://gerrit.wikimedia.org/r/357370 (https://phabricator.wikimedia.org/T166661)
[11:29:55] Scap (Scap3-Adoption-Phase1), releng-201516-q4, releng-201718-q1, Trebuchet: [keyresult] Migrate remaining trebuchet deployed services - https://phabricator.wikimedia.org/T129290#3318174 (akosiaris)
[11:29:58] Deployment-Systems, Scap (Scap3-Adoption-Phase1), scap2, Monitoring, and 2 others: Deploy servermon with scap3 - https://phabricator.wikimedia.org/T129152#3318171 (akosiaris) Open>Resolved a:akosiaris Migration completed. Servermon is now deployed using scap3. Resolving.
[11:39:17] hashar: https://gerrit.wikimedia.org/r/357369 for when you're around. Not sure if it's a problem in tox configuration of that repo or a bug in jenkins
[11:39:33] or I'm missing something very obvious
[11:43:27] RECOVERY - Puppet errors on deployment-aqs01 is OK: OK: Less than 1.00% above the threshold [0.0]
[12:00:12] Continuous-Integration-Config, Gerrit, MediaWiki-extensions-Other: Archive AWSSDK MediaWiki extension - https://phabricator.wikimedia.org/T167124#3318282 (hashar)
[12:03:01] Continuous-Integration-Config, MW-1.30-release-notes (WMF-deploy-2017-05-23_(1.30.0-wmf.2)), Patch-For-Review: mass adding of jakub-onderka/php-console-highlighter missed changes to composer.lock files - https://phabricator.wikimedia.org/T164751#3318313 (hashar) Open>Resolved AWSSDK should be...
[12:37:35] !log Removing HHVM from permanent Trusty slaves
[12:37:38] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[13:03:22] PROBLEM - Puppet errors on deployment-restbase02 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[13:06:31] (CR) Hashar: [C: 2] Add CI jobs for AdvancedSearch extension [integration/config] - https://gerrit.wikimedia.org/r/357370 (https://phabricator.wikimedia.org/T166661) (owner: Tobias Gritschacher)
[13:07:39] Browser-Tests-Infrastructure, Release-Engineering-Team, MediaWiki-General-or-Unknown, Epic, and 6 others: Port Selenium tests from Ruby to Node.js - https://phabricator.wikimedia.org/T139740#3318421 (zeljkofilipin)
[13:08:05] (Merged) jenkins-bot: Add CI jobs for AdvancedSearch extension [integration/config] - https://gerrit.wikimedia.org/r/357370 (https://phabricator.wikimedia.org/T166661) (owner: Tobias Gritschacher)
[13:28:39] Scap: Allow to choose different targets on where to scap to - https://phabricator.wikimedia.org/T165486#3318483 (thcipriani) p:Triage>Normal Check out the `--limit` flag. That may be all you need: https://github.com/wikimedia/scap/blob/master/scap/targets.py#L50-L62
[13:30:15] Deployment-Systems, Release-Engineering-Team (Kanban), Scap (Scap3-Adoption-Phase1), MediaWiki-JobRunner, Operations: figure out how to not restart jobrunner/jobchron in the non-active DC - https://phabricator.wikimedia.org/T167104#3318487 (thcipriani)
[13:30:23] (CR) Hashar: [C: 2] "I have updated our JJB fork to include the patch I have proposed upstream https://review.openstack.org/#/c/471030/" [integration/config] - https://gerrit.wikimedia.org/r/357209 (https://phabricator.wikimedia.org/T161895) (owner: Hashar)
[13:31:27] Deployment-Systems, Release-Engineering-Team (Kanban), Scap (Scap3-Adoption-Phase1), MediaWiki-JobRunner: scap should allow restarting multiple services - https://phabricator.wikimedia.org/T167098#3318493 (thcipriani)
[13:33:17] (Merged) jenkins-bot: Run composer test from mediawiki-extensions jobs [integration/config] - https://gerrit.wikimedia.org/r/357209 (https://phabricator.wikimedia.org/T161895) (owner: Hashar)
[13:37:14] Continuous-Integration-Infrastructure (Little Steps Sprint), Release-Engineering-Team (Kanban), Patch-For-Review: For MediaWiki extensions, merge composer test into mwext-textextension / mediawiki-extensions jobs - https://phabricator.wikimedia.org/T161895#3318515 (hashar) Open>Resolved For m...
[13:43:23] RECOVERY - Puppet errors on deployment-restbase02 is OK: OK: Less than 1.00% above the threshold [0.0]
[13:54:49] Continuous-Integration-Infrastructure, Operations: CI for operations/puppet is taking too long - https://phabricator.wikimedia.org/T166888#3318627 (faidon) >>! In T166888#3316057, @greg wrote: > Looking at the data we have it seems that the tests themselves take about [[ https://integration.wikimedia.org...
[14:08:51] Deployment-Systems, Release-Engineering-Team (Kanban), Scap (Scap3-Adoption-Phase1), MediaWiki-JobRunner: scap should allow restarting multiple services - https://phabricator.wikimedia.org/T167098#3318690 (thcipriani)
[14:13:29] (PS1) Aude: Update Wikidata to wmf/1.30.0-wmf.4 [tools/release] - https://gerrit.wikimedia.org/r/357387
[14:13:46] (CR) Aude: [C: 2] Update Wikidata to wmf/1.30.0-wmf.4 [tools/release] - https://gerrit.wikimedia.org/r/357387 (owner: Aude)
[14:24:58] (Abandoned) Hashar: Skip php 5.5 for mediawiki wmf branches [integration/config] - https://gerrit.wikimedia.org/r/344642 (https://phabricator.wikimedia.org/T94149) (owner: Hashar)
[14:26:17] Continuous-Integration-Config, Continuous-Integration-Infrastructure (Little Steps Sprint), Release-Engineering-Team: Get rid of zend tests for wmf branches - https://phabricator.wikimedia.org/T94149#3318798 (hashar) Open>stalled Stalled, pending completion of Zend -> HHVM migration.
[14:27:03] (Abandoned) Hashar: (WIP) mediawiki-releases job [integration/config] - https://gerrit.wikimedia.org/r/333280 (owner: Hashar)
[14:33:18] Project selenium-WikiLove » firefox,beta,Linux,BrowserTests build #415: FAILURE in 1 min 17 sec: https://integration.wikimedia.org/ci/job/selenium-WikiLove/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/415/
[14:34:03] !log deleting buildlog.integration.eqiad.wmflabs, which was meant to receive Jenkins logs in ElasticSearch. We are experimenting with relforge1001.eqiad.wmnet now - T78705
[14:34:08] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[14:34:08] T78705: Send Jenkins build log and results to ElasticSearch - https://phabricator.wikimedia.org/T78705
[14:35:43] PROBLEM - Host buildlog is DOWN: CRITICAL - Host Unreachable (10.68.22.2)
[14:41:40] Deployment-Systems, Release-Engineering-Team (Kanban), Scap (Scap3-Adoption-Phase1), MediaWiki-JobRunner: scap should allow restarting multiple services - https://phabricator.wikimedia.org/T167098#3318875 (thcipriani) D677 should handle multiple service restart/reloads. > And since jobchron does...
[15:01:29] (CR) Aude: [C: 2] Update Wikidata to wmf/1.30.0-wmf.4 [tools/release] - https://gerrit.wikimedia.org/r/357387 (owner: Aude)
[15:03:13] (Merged) jenkins-bot: Update Wikidata to wmf/1.30.0-wmf.4 [tools/release] - https://gerrit.wikimedia.org/r/357387 (owner: Aude)
[15:28:03] Browser-Tests-Infrastructure, Release-Engineering-Team, MediaWiki-General-or-Unknown, Epic, and 6 others: Port Selenium tests from Ruby to Node.js - https://phabricator.wikimedia.org/T139740#3319109 (zeljkofilipin)
[15:28:05] Browser-Tests-Infrastructure, Release-Engineering-Team (Kanban), JavaScript, User-zeljkofilipin: npm run selenium fails for node 8 and/or npm 5 - https://phabricator.wikimedia.org/T167153#3319096 (zeljkofilipin)
[15:29:26] Browser-Tests-Infrastructure, Release-Engineering-Team (Kanban), JavaScript, User-zeljkofilipin: `npm run selenium` fails for node 8 and/or npm 5 - https://phabricator.wikimedia.org/T167153#3319113 (zeljkofilipin)
[15:31:19] Browser-Tests-Infrastructure, Release-Engineering-Team (Kanban), JavaScript, User-zeljkofilipin: `npm run selenium` fails for node 8 and/or npm 5 - https://phabricator.wikimedia.org/T167153#3319096 (zeljkofilipin) Looks related: https://github.com/webdriverio/wdio-mocha-framework/issues/30
[15:38:13] Browser-Tests-Infrastructure, Release-Engineering-Team (Kanban), JavaScript, User-zeljkofilipin: `npm run selenium` fails for node 8 and/or npm 5 - https://phabricator.wikimedia.org/T167153#3319165 (zeljkofilipin) The workaround is to downgrade to node 6. Mac+homebrew: ``` $ brew install node@6...
[15:38:26] Browser-Tests-Infrastructure, Release-Engineering-Team (Kanban), JavaScript, User-zeljkofilipin: `npm run selenium` fails for node 8 and/or npm 5 - https://phabricator.wikimedia.org/T167153#3319166 (zeljkofilipin) p:Normal>Low
[15:49:18] Can someone point me to the CoC task?
[15:55:50] Browser-Tests-Infrastructure, Release-Engineering-Team (Kanban), JavaScript, User-zeljkofilipin: `npm run selenium` fails for node 8 and/or npm 5 - https://phabricator.wikimedia.org/T167153#3319219 (zeljkofilipin) Open>Invalid Deleting `node_modules` folder and running `npm install` fixed...
[15:55:53] Browser-Tests-Infrastructure, Release-Engineering-Team, MediaWiki-General-or-Unknown, Epic, and 6 others: Port Selenium tests from Ruby to Node.js - https://phabricator.wikimedia.org/T139740#3319221 (zeljkofilipin)
[16:01:38] Continuous-Integration-Config, MediaWiki-General-or-Unknown, MediaWiki-Unit-tests, Tracking: Let ApiDocumentationTest structure test pass on all repos - https://phabricator.wikimedia.org/T154838#3319245 (Umherirrender)
[16:04:27] PROBLEM - Puppet errors on deployment-aqs01 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[16:12:14] Project selenium-CentralNotice » chrome,beta,Linux,BrowserTests build #418: FAILURE in 11 min: https://integration.wikimedia.org/ci/job/selenium-CentralNotice/BROWSER=chrome,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/418/
[16:23:25] Project selenium-CentralNotice » firefox,beta,Linux,BrowserTests build #418: FAILURE in 22 min: https://integration.wikimedia.org/ci/job/selenium-CentralNotice/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/418/
[16:36:20] Project selenium-CentralNotice » chrome,beta,OS X 10.9,BrowserTests build #418: FAILURE in 35 min: https://integration.wikimedia.org/ci/job/selenium-CentralNotice/BROWSER=chrome,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=OS%20X%2010.9,label=BrowserTests/418/
[16:37:08] PROBLEM - Puppet errors on integration-slave-docker-1000 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[16:39:06] Continuous-Integration-Infrastructure, Operations: CI for operations/puppet is taking too long - https://phabricator.wikimedia.org/T166888#3319367 (greg) Look, we get it, CI is slower than people would like. When we proposed the nodepool backend we were optimizing for clean environment and maintainabilit...
[16:40:10] Continuous-Integration-Infrastructure, Operations: CI for operations/puppet is taking too long - https://phabricator.wikimedia.org/T166888#3319372 (greg) We're still open to helping get ops/puppet in a better place than it is now with small wins until we can migrate to the new docker based system, if you...
[16:40:50] greg-g: CI suggestion: add instances that are permanent just for operations
[16:41:17] Zppix: you don't know any of the context of how to maintain this, your suggestions aren't helpful
[16:41:36] greg-g: I'm trying to understand... :/
[16:44:27] RECOVERY - Puppet errors on deployment-aqs01 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:48:22] Zppix the reason why releng uses nodepool is for security.
[16:48:35] It's better security.
[16:49:30] Project selenium-MobileFrontend » firefox,beta,Linux,BrowserTests build #446: FAILURE in 1 hr 27 min: https://integration.wikimedia.org/ci/job/selenium-MobileFrontend/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/446/
[16:55:19] Project selenium-MobileFrontend » chrome,beta,Linux,BrowserTests build #446: FAILURE in 1 hr 33 min: https://integration.wikimedia.org/ci/job/selenium-MobileFrontend/BROWSER=chrome,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/446/
[16:59:13] and maintainability, permanent slaves are/were horrible maintenance costs. It took a ton of antoine's time every day
[16:59:32] we replaced that with a burden on WMCS, so *shrug*
[17:12:11] RECOVERY - Puppet errors on integration-slave-docker-1000 is OK: OK: Less than 1.00% above the threshold [0.0]
[17:16:52] Release-Engineering-Team (Kanban), Release Pipeline (Blubber): Fix Blubber variant expansion for boolean/int config properties - https://phabricator.wikimedia.org/T166353#3319526 (thcipriani)
[17:19:04] twentyafterfour, RainbowSprinkles https://phabricator.wikimedia.org/T14974 still continues to error out.
[17:20:18] subbu hi, i have a fix for that
[17:20:39] ok thanks.
[17:20:47] subbu: https://phabricator.wikimedia.org/D679
[17:21:05] subbu: also see task https://phabricator.wikimedia.org/T166958
[17:21:30] ah, ok. ty.
[17:22:21] You're welcome :)
[17:29:30] hey, I wanted to deploy LoginNotify to testwiki at 11 - any objections?
[17:32:23] MaxSem: uh, coordinate with RainbowSprinkles
[17:32:43] Um...assuming scap is finished before then, fine by me
[17:35:27] PROBLEM - Puppet errors on deployment-aqs01 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[17:46:41] thanks
[17:58:42] MediaWiki-Codesniffer, Upstream: PHP_CodeSniffer 3.x breaks when prepend-autoloader: false is set (like it is in MediaWiki core) - https://phabricator.wikimedia.org/T167168#3319724 (Legoktm)
[18:02:24] Release-Engineering-Team (Kanban), Release, Train Deployments: MW-1.30.0-wmf.3 deployment blockers - https://phabricator.wikimedia.org/T166829#3319746 (greg)
[18:10:12] James_F: uh, chad cut a wmf.3 and did some prep work on it today, not skipping and going with wmf.4. How hard for us to rename some tags/tasks? The deploy tracking tasks should be easy enough, but I'm unsure of the releasetaggerbot logic and how renaming those tags will pan out
[18:10:29] greg-g: That's… going to be a real pain.
[18:10:30] RECOVERY - Puppet errors on deployment-aqs01 is OK: OK: Less than 1.00% above the threshold [0.0]
[18:10:42] greg-g: Will be out of meetings in ~4 hours.
[18:11:09] James_F: :( :( after he has a break he can probably do the prep work again (~1hr of work)....
[18:23:58] Release-Engineering-Team (Kanban), Release, Train Deployments: MW-1.30.0-wmf.4 deployment blockers - https://phabricator.wikimedia.org/T166829#3319946 (greg)
[18:34:45] Release-Engineering-Team, Page-Previews, Reading-Web-Backlog, Epic: [EPIC] Generate compiled assets from continuous integration - https://phabricator.wikimedia.org/T158980#3320005 (Jdlrobson)
[18:34:47] Release-Engineering-Team, Page-Previews, Reading-Web-Backlog: Create bot that automatically rebases and rebuilds patches to master - https://phabricator.wikimedia.org/T167181#3319993 (Jdlrobson)
[18:36:04] Release-Engineering-Team, Page-Previews, Reading-Web-Backlog, Epic: [EPIC] Generate compiled assets from continuous integration - https://phabricator.wikimedia.org/T158980#3053519 (Jdlrobson) We discussed https://etherpad.wikimedia.org/p/moderntoolingchain
[18:50:19] Gerrit, Labs, wikitech.wikimedia.org: Request to rename LegoFan4000 to MacFan4000 on WikiTech - https://phabricator.wikimedia.org/T165624#3320121 (demon) >>! In T165624#3274279, @bd808 wrote: > User `MacFan4000` already exists in LDAP. The account was created 2016-08-25T22:53:04Z. Both accounts are r...
[19:18:37] MediaWiki-Codesniffer: Question about sniff about Visibility must be declared on property - https://phabricator.wikimedia.org/T166381#3294334 (Legoktm) I copied it over to https://www.mediawiki.org/wiki/Manual_talk:Coding_conventions/PHP#Declaring_multiple_properties_with_same_public.2Fprivate.2Fprotected_st...
[19:36:26] PROBLEM - Puppet errors on deployment-aqs01 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[19:48:33] Scap, Discovery, Interactive-Sprint, Maps (Kartotherian): Break Kartotherian scap3 deployment into 2 groups - https://phabricator.wikimedia.org/T147337#3320409 (debt) Moving to backlog until such time that we can take this up again.
[20:11:27] RECOVERY - Puppet errors on deployment-aqs01 is OK: OK: Less than 1.00% above the threshold [0.0]
[20:22:31] Project beta-code-update-eqiad build #158782: FAILURE in 41 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/158782/
[20:23:38] PROBLEM - Puppet errors on integration-slave-trusty-1006 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[20:23:40] Yippee, build fixed!
[20:23:41] Project beta-code-update-eqiad build #158783: FIXED in 40 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/158783/
[20:38:37] Browser-Tests-Infrastructure, Release-Engineering-Team, MediaWiki-General-or-Unknown, Epic, and 6 others: Port Selenium tests from Ruby to Node.js - https://phabricator.wikimedia.org/T139740#3320738 (Jdforrester-WMF) Mass-moving all items tagged for MediaWiki 1.30.0-wmf.3, as that was never relea...
[20:38:48] Continuous-Integration-Config, Goal, I18n, MW-1.30-release-notes (WMF-deploy-2017-06-06_(1.30.0-wmf.4)), Patch-For-Review: Configure banana checker for i18n files to run on all MediaWiki extensions and skins - https://phabricator.wikimedia.org/T94547#3320749 (Jdforrester-WMF) Mass-moving all...
[20:42:57] Project selenium-Echo » firefox,beta,Linux,BrowserTests build #417: FAILURE in 1 min 56 sec: https://integration.wikimedia.org/ci/job/selenium-Echo/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/417/
[21:03:08] Scap: Allow to choose different targets on where to scap to - https://phabricator.wikimedia.org/T165486#3320989 (demon) Open>declined No, that's not quite what's being asked for here. What the OP is asking for is being able to deploy stuff //outside of// `/srv/deployment`. Kinda like how a debian pac...
[21:03:38] RECOVERY - Puppet errors on integration-slave-trusty-1006 is OK: OK: Less than 1.00% above the threshold [0.0]
[21:25:51] RainbowSprinkles why not post a letter to everyone to fix the bug when at the post office.
[21:25:53] Release-Engineering-Team (Kanban), Release, Train Deployments: MW-1.30.0-wmf.4 deployment blockers - https://phabricator.wikimedia.org/T166829#3321159 (kaldari)
[21:53:40] Project beta-code-update-eqiad build #158792: FAILURE in 39 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/158792/
[21:54:15] ^ That's me
[21:55:12] are you bisecting it? :P
[22:01:38] Gerrit, Release-Engineering-Team (Kanban), Patch-For-Review: Update gerrit to 2.13.8 - https://phabricator.wikimedia.org/T158946#3321330 (demon) Open>Resolved a:demon
[22:03:40] Yippee, build fixed!
[22:03:41] Project beta-code-update-eqiad build #158793: FIXED in 40 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/158793/
[22:04:14] Gerrit, Release-Engineering-Team (Kanban): Update gerrit to 2.13.8 - https://phabricator.wikimedia.org/T158946#3321374 (Paladox)
[22:04:54] Gerrit, MediaWiki-Vagrant, Patch-For-Review: "index-pack failed" when installing new MediaWiki-Vagrant box - https://phabricator.wikimedia.org/T152801#3321384 (demon)
[22:04:56] Gerrit, BlueSpice, Patch-For-Review, Upstream: Merge/Submit error on Gerrit: "org.eclipse.jgit.errors.MissingObjectException: Missing unknown" for BlueSpiceExtensions' REL1_27 branch - https://phabricator.wikimedia.org/T153079#3321383 (demon)
[22:05:00] Gerrit, Release-Engineering-Team (Kanban), Operations, Beta-Cluster-reproducible, and 2 others: gerrit jgit gc caused mediawiki/core repo problems - https://phabricator.wikimedia.org/T151676#3321381 (demon) Open>Resolved This shouldn't actually be a problem anymore.
[22:11:06] Gerrit: "add comment" feature doesn't allow you to write a comment while viewing the code or viewing the other comments - https://phabricator.wikimedia.org/T48777#3321427 (Paladox) Proposing to decline. PolyGerrit allows you to show a diff whilst you view comments. {F8393471}
[22:14:16] Gerrit, Release-Engineering-Team (Kanban): Changing the commit description creates out-dated patch. - https://phabricator.wikimedia.org/T54292#3321435 (demon) Open>Resolved a:demon Actually, this is fixed. You cannot begin editing from an older version of a change anymore.
[22:15:10] Gerrit, Labs, wikitech.wikimedia.org: Request to rename LegoFan4000 to MacFan4000 on WikiTech - https://phabricator.wikimedia.org/T165624#3321441 (bd808) User `MacFan4000` has no contributions on wikitech, so we could just delete the local MediaWiki account and its associated LDAP record and then tre...
[22:15:57] Gerrit, Release-Engineering-Team (Kanban), Operations, Beta-Cluster-reproducible, Upstream: gerrit jgit gc caused mediawiki/core repo problems - https://phabricator.wikimedia.org/T151676#3321443 (Paladox)
[22:18:58] Gerrit: Inconvenient review workflow for Gerrit - https://phabricator.wikimedia.org/T142256#2528816 (demon) > It would be an improvement if you could submit comments (including ones on older patch sets) and do +/- replies (which of course would apply to the top set) in one action instead of three actions. Y...
[22:21:36] Gerrit, Labs, wikitech.wikimedia.org: Request to rename LegoFan4000 to MacFan4000 on WikiTech - https://phabricator.wikimedia.org/T165624#3321463 (demon) >>! In T165624#3321441, @bd808 wrote: > User `MacFan4000` has no contributions on wikitech, so we could just delete the local MediaWiki account and...
[22:43:16] Gerrit: Gerrit code review view jumps/scrolls up and down when commenting - https://phabricator.wikimedia.org/T159919#3321545 (Paladox) @Aklapper yes it relates to gwtui but it also relates to codemirror. Upstream are only taking bug fixes for gwtui really. They are pushing users to contribute to PolyGerri...
[22:44:12] Continuous-Integration-Config, Release-Engineering-Team (Watching / External), Discovery, Discovery-Analysis (Current work): Add lint/CI to all wikimedia/discovery analytics repositories - https://phabricator.wikimedia.org/T153856#3321547 (mpopov) **Also for future reference**: RStudio (the folks...
[22:58:46] Continuous-Integration-Infrastructure, Operations: CI for operations/puppet is taking too long - https://phabricator.wikimedia.org/T166888#3321592 (faidon) Well, first of all, right before I filed this task, Antoine said on IRC: > containers for CI would be for later. The priority has been set toward sta...
[23:29:03] Release-Engineering-Team (Kanban), Release, Train Deployments: MW-1.30.0-wmf.4 deployment blockers - https://phabricator.wikimedia.org/T166829#3321689 (Krinkle)
[23:34:20] uhh, phab exception: https://phabricator.wikimedia.org/T163283
[23:37:34] I don't seem to be able to log in to gerrit
[23:38:46] I can still log in to other LDAP-linked things, just not gerrit
[23:43:35] tried logging out and back in and that worked
[23:44:27] I tried clearing cookies, but maybe it's localStorage or something
[23:45:43] nope, doesn't seem to work in a different browser either
[23:45:59] looking at error_log of gerrit
[23:46:21] 'Tim Starling' failed to sign in: Cannot assign external ID "gerrit:tim starling" to account 4926; external ID already in use.
[23:46:25] eh..
[23:46:59] RainbowSprinkles: ^
[23:47:03] ever seen that?
[23:47:37] GDI
[23:47:42] That was...fixed....ages ago
[23:47:59] Lemme find the old task
[23:48:02] I gotta do something to his user
[23:48:24] T49385
[23:48:25] T49385: Can't log in into Gerrit — "Cannot assign user name" - https://phabricator.wikimedia.org/T49385
[23:49:15] Actually, T152640
[23:49:15] T152640: Cannot log into Gerrit as of recent upgrade - https://phabricator.wikimedia.org/T152640
[23:49:20] More recent
[23:49:35] the error seems a little different though
[23:50:05] "cannot assign user name" vs. "cannot assign external ID"
[23:50:27] ah
[23:51:59] though we did https://gerrit.wikimedia.org/r/#/c/326150/
[23:52:20] did it become case-sensitive again during upgrade?
[23:52:39] No, that shouldn't be it
[23:52:44] And his DB entries look sane
[23:52:48] Lemme force a reindex on him
[23:53:04] ok
[23:54:12] Bleh, it won't do it for a single account.
[23:56:44] TimStarling: Give it another shot?
[23:57:42] still doesn't work
[23:57:56] Dangit. Hmmm
[23:58:10] Lemme dig a little further, hang tight
[23:58:34] this is on cobalt?
[23:58:42] yes
[23:59:01] /var/lib/gerrit2/review_site/logs/ if you wanted to see that
[23:59:59] so the account ID changes each time, it is actually creating a new user for me each time I try to log in