[05:46:00] PROBLEM - Free space - all mounts on deployment-fluorine02 is CRITICAL: CRITICAL: deployment-prep.deployment-fluorine02.diskspace._srv.byte_percentfree (<10.00%)
[06:38:50] PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [10.0]
[07:06:02] RECOVERY - Free space - all mounts on deployment-fluorine02 is OK: OK: All targets OK
[07:38:50] PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [10.0]
[08:53:50] RECOVERY - Mediawiki Error Rate on graphite-labs is OK: OK: Less than 1.00% above the threshold [1.0]
[08:59:27] PROBLEM - Puppet errors on castor02 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[09:39:28] RECOVERY - Puppet errors on castor02 is OK: OK: Less than 1.00% above the threshold [0.0]
[11:18:35] hey greg-g: it's been some time matej merged into master a new feature for AbuseFilter and it still does not appear in the beta cluster. Can't see anything in the beta cluster fatalmonitor. Just keep waiting? Thanks.
[11:28:01] Beta-Cluster-Infrastructure, MediaWiki-Authentication-and-authorization, MW-1.30-release-notes (WMF-deploy-2017-08-08_(1.30.0-wmf.13)), Patch-For-Review: "Loss of session data" on Beta Cluster - https://phabricator.wikimedia.org/T172560#3515174 (MarcoAurelio) `AV3L4KyohvX_CkfKdDP1 Loading Central...
[12:22:39] Project selenium-GettingStarted » firefox,beta,Linux,BrowserTests build #490: FAILURE in 39 sec: https://integration.wikimedia.org/ci/job/selenium-GettingStarted/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/490/
[12:29:43] tabby
[12:29:52] (I just wanted to type tabby tab)
[12:30:40] ftr: these look green: https://integration.wikimedia.org/ci/view/Beta/job/beta-code-update-eqiad/
[12:33:26] https://en.wikipedia.beta.wmflabs.org/wiki/Special:Version links to https://phabricator.wikimedia.org/rEABF389995916c275e0628975a5316a99001f7fcb539 for abuse filter which looks right to me?
[12:35:11] * greg-g goes to breakfast
[13:28:55] Scap (Scap3-Adoption-Phase1), Analytics, Analytics-EventLogging, Performance-Team: Use scap3 to deploy eventlogging/eventlogging - https://phabricator.wikimedia.org/T118772#3515344 (Ottomata) > However, does this also mean we should deploy updates to eventlogging ourselves? Yup! > might also hel...
[13:32:02] Continuous-Integration-Infrastructure, Patch-For-Review, Technical-Debt: Local Phan errors that don't show up on CI - https://phabricator.wikimedia.org/T172935#3515361 (WMDE-Fisch) @hashar: @WMDE-leszek and me will try to update the version to 0.8.5 because currently it's a pain to have this configur...
[13:36:47] PROBLEM - Parsoid on deployment-parsoid09 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[13:39:11] Browser-Tests-Infrastructure, Release-Engineering-Team (Next), MinervaNeue, Readers-Web-Backlog, and 5 others: [4 hrs] MinervaNeue browser test are flaking (waiting for {:class=>"mw-notification", :tag_name=>"div"} to become present ) - https://phabricator.wikimedia.org/T170890#3446357 (phuedx) T...
[13:41:42] RECOVERY - Parsoid on deployment-parsoid09 is OK: HTTP OK: HTTP/1.1 200 OK - 1051 bytes in 5.854 second response time
[13:47:51] (PS1) WMDE-leszek: Use PHAN 0.8.5 [integration/config] - https://gerrit.wikimedia.org/r/371033
[13:49:07] (CR) WMDE-leszek: Use PHAN 0.8.5 (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/371033 (owner: WMDE-leszek)
[14:23:14] Beta-Cluster-Infrastructure, MediaWiki-Authentication-and-authorization, MediaWiki-extensions-CentralAuth, MW-1.30-release-notes (WMF-deploy-2017-08-08_(1.30.0-wmf.13)), Patch-For-Review: "Loss of session data" on Beta Cluster - https://phabricator.wikimedia.org/T172560#3515646 (greg) @Anomie...
[15:12:52] PROBLEM - Puppet errors on deployment-ms-fe02 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[15:26:33] Continuous-Integration-Config, Wikipedia-Android-App-Backlog, Technical-Debt: Add support to periodic CI tests for exercising arbitrary revisions - https://phabricator.wikimedia.org/T152455#3515921 (Niedzielski) I believe this task would simply need to add an environment variable (string parameter) f...
[15:53:32] RECOVERY - Puppet errors on deployment-imagescaler02 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:12:23] Beta-Cluster-Infrastructure, MediaWiki-Authentication-and-authorization, MediaWiki-extensions-CentralAuth, MW-1.30-release-notes (WMF-deploy-2017-08-08_(1.30.0-wmf.13)), Patch-For-Review: "Loss of session data" on Beta Cluster - https://phabricator.wikimedia.org/T172560#3516153 (greg) p:Tr...
[16:29:12] PROBLEM - Puppet errors on deployment-parsoid09 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[16:29:31] Browser-Tests-Infrastructure, Release-Engineering-Team (Next), MinervaNeue, Readers-Web-Backlog, and 5 others: [4 hrs] MinervaNeue browser test are flaking (waiting for {:class=>"mw-notification", :tag_name=>"div"} to become present ) - https://phabricator.wikimedia.org/T170890#3446357 (pmiazga)...
[16:29:45] Browser-Tests-Infrastructure, Release-Engineering-Team (Next), MinervaNeue, Readers-Web-Backlog, and 5 others: [4 hrs] MinervaNeue browser test are flaking (waiting for {:class=>"mw-notification", :tag_name=>"div"} to become present ) - https://phabricator.wikimedia.org/T170890#3516269 (pmiazga)...
[16:30:34] PROBLEM - Puppet errors on deployment-mediawiki06 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[16:42:10] Continuous-Integration-Infrastructure, Release-Engineering-Team (Kanban): Spike: Evaluate experimental Docker based CI w/ scap builds - https://phabricator.wikimedia.org/T150501#3516318 (thcipriani)
[16:42:12] Continuous-Integration-Infrastructure, Release-Engineering-Team (Kanban), Patch-For-Review: Set up experimental Docker CI slave - https://phabricator.wikimedia.org/T150502#3516316 (thcipriani) Open>Resolved we have 4 different docker CI slaves. The code to create new ones is merged in puppet...
[16:50:35] RECOVERY - Puppet errors on deployment-mediawiki06 is OK: OK: Less than 1.00% above the threshold [0.0]
[17:09:10] RECOVERY - Puppet errors on deployment-parsoid09 is OK: OK: Less than 1.00% above the threshold [0.0]
[17:41:50] (PS1) Legoktm: Add script to get phan version from extra.phan-version in composer.json [integration/jenkins] - https://gerrit.wikimedia.org/r/371095
[17:43:10] (CR) Addshore: [C: 1] Add script to get phan version from extra.phan-version in composer.json [integration/jenkins] - https://gerrit.wikimedia.org/r/371095 (owner: Legoktm)
[17:43:22] (CR) Legoktm: [C: 2] "Reviewed and tested by Addshore" [integration/jenkins] - https://gerrit.wikimedia.org/r/371095 (owner: Legoktm)
[17:49:41] (PS1) Addshore: Get phan version from composer.json [integration/config] - https://gerrit.wikimedia.org/r/371097
[17:49:58] (PS2) Addshore: Get phan version from composer.json [integration/config] - https://gerrit.wikimedia.org/r/371097
[17:51:59] (CR) Addshore: "See https://gerrit.wikimedia.org/r/#/c/371097 :D" [integration/config] - https://gerrit.wikimedia.org/r/371033 (owner: WMDE-leszek)
[17:54:36] (CR) WMDE-Fisch: "wohoo!" [integration/jenkins] - https://gerrit.wikimedia.org/r/371095 (owner: Legoktm)
[18:04:03] (CR) jerkins-bot: [V: -1] Add script to get phan version from extra.phan-version in composer.json [integration/jenkins] - https://gerrit.wikimedia.org/r/371095 (owner: Legoktm)
[18:14:24] (PS2) Legoktm: Add script to get phan version from extra.phan-version in composer.json [integration/jenkins] - https://gerrit.wikimedia.org/r/371095
[18:14:30] (CR) jerkins-bot: [V: -1] Get phan version from composer.json [integration/config] - https://gerrit.wikimedia.org/r/371097 (owner: Addshore)
[18:14:32] (CR) Legoktm: [C: 2] Add script to get phan version from extra.phan-version in composer.json [integration/jenkins] - https://gerrit.wikimedia.org/r/371095 (owner: Legoktm)
[18:21:35] (Merged) jenkins-bot: Add script to get phan version from extra.phan-version in composer.json [integration/jenkins] - https://gerrit.wikimedia.org/r/371095 (owner: Legoktm)
[18:45:56] PROBLEM - Long lived cherry-picks on puppetmaster on deployment-puppetmaster02 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[18:46:02] Release-Engineering-Team (Kanban), Patch-For-Review, Release, Train Deployments: 1.30.0-wmf.13 deployment blockers - https://phabricator.wikimedia.org/T170631#3516820 (mmodell)
[18:47:25] Release-Engineering-Team (Kanban), Release, Train Deployments: 1.30.0-wmf.14 deployment blockers - https://phabricator.wikimedia.org/T170632#3437602 (mmodell)
[18:47:28] Release-Engineering-Team (Kanban), Patch-For-Review, Release, Train Deployments: 1.30.0-wmf.13 deployment blockers - https://phabricator.wikimedia.org/T170631#3437587 (mmodell)
[18:47:31] Release-Engineering-Team (Kanban), Patch-For-Review, Release, Train Deployments: 1.30.0-wmf.12 deployment blockers - https://phabricator.wikimedia.org/T168053#3516828 (mmodell)
[18:51:07] Cannot assign user name "reception123" to account 5206; name already in use.
[18:51:09] Again?
[18:51:27] Every time I get logged out and try to login again this issue comes back
[18:51:31] paladox: RainbowSprinkles ^
[18:51:40] uh
[18:51:47] There's a bug then
[18:52:09] yup, I guess the "gerrit:" prefix is gone from the table again
[18:52:12] Reception123 gerrit 2.13 is very buggy with ldap we are finding.
[18:52:14] and yep
[18:52:28] needs chad to re add it and to flush the cache
[18:53:12] Yup. I tried logging out and logging in again immediately after my account was fixed for the second time, and it worked. It probably only occurs when I'm automatically logged out
[18:55:35] yep
[18:56:27] Gerrit, Release-Engineering-Team (Kanban), Regression, Upstream: Cannot log into Gerrit as of recent upgrade - https://phabricator.wikimedia.org/T152640#3516881 (Reception123) And after I got logged out, and try to log in again, the issue is back :( Probably the gerrit: prefix again.
[18:56:47] I wonder if we need to get a upstream or patch the ldap issues ourselves paladox
[18:56:49] just commenting to keep track of the error
[18:57:00] Zppix this cannot be patched in 2.13
[18:57:03] 2.13 is now eol
[18:57:06] end of life
[18:57:12] paladox oh yes i forgot
[18:57:14] 2.14 looks to seem to have the fix
[18:57:28] or 2.15 will as accounts are being migrated to notedb
[18:57:40] any date for an upgrade to 2.14?
[18:57:45] nope
[18:57:55] Do you think it's going to be any time soon? (like this month)
[18:58:05] i am unsure
[18:58:27] 2.15 stores external id in the repo
[18:58:39] so hopefully this issue will be resolved soon
[18:58:49] well then until then the prefix will always have to be added manually for my user_id :(
[18:59:19] yep
[19:01:26] Well if you'd stop breaking gerrit Reception123 :P
[19:03:23] (Abandoned) WMDE-leszek: Use PHAN 0.8.5 [integration/config] - https://gerrit.wikimedia.org/r/371033 (owner: WMDE-leszek)
[19:07:10] Can we haz more CI capacity?
[19:08:25] meow-thumbs-up
[19:09:02] if it was that simple... :P
[19:09:36] Krinkle for the hackathon?
[19:16:25] Just generally
[19:17:13] Will need to create a task
[19:17:20] make a check payable to Mr. Hashar, Esq.
[19:17:21] and add cloud phabricator project
[19:17:25] lol
[19:17:37] who does checks these days :)
[19:17:44] me
[19:19:52] lol
[19:29:56] Continuous-Integration-Config, MediaWiki-Core-Tests, MediaWiki-General-or-Unknown, Tracking: Let ApiDocumentationTest structure test pass on all repos - https://phabricator.wikimedia.org/T154838#3516998 (Umherirrender)
[19:29:59] Continuous-Integration-Config, MediaWiki-extensions-WikiLexicalData-or-OmegaWiki, Easy, I18n, Patch-For-Review: Extension OmegaWiki failing tests due to missing apihelp messages - https://phabricator.wikimedia.org/T155044#3516997 (Umherirrender) Open>Resolved
[19:30:14] Continuous-Integration-Config, MediaWiki-Core-Tests, MediaWiki-General-or-Unknown, Tracking: Let ApiDocumentationTest structure test pass on all repos - https://phabricator.wikimedia.org/T154838#2925530 (Umherirrender)
[19:30:17] Continuous-Integration-Config, MediaWiki-extensions-Other, Easy, I18n, Patch-For-Review: Extension WikiObjectModel failing tests due to missing apihelp messages - https://phabricator.wikimedia.org/T155034#3517000 (Umherirrender) Open>Resolved
[19:30:44] Continuous-Integration-Config, MediaWiki-Core-Tests, MediaWiki-General-or-Unknown, Tracking: Let ApiDocumentationTest structure test pass on all repos - https://phabricator.wikimedia.org/T154838#2925530 (Umherirrender)
[19:30:47] Continuous-Integration-Config, MediaWiki-extensions-Survey, Easy, I18n, Patch-For-Review: Extension Survey failing tests due to missing apihelp messages - https://phabricator.wikimedia.org/T155031#3517008 (Umherirrender) Open>Resolved
[19:31:04] (PS2) Umherirrender: [Survey] Make unit tests voting [integration/config] - https://gerrit.wikimedia.org/r/369968
[19:37:27] Scap (Scap3-Adoption-Phase1), releng-201516-q2, releng-201516-q3, releng-201516-q4, scap2: [keyresult] Migrate all Service team owned services and MW to scap - https://phabricator.wikimedia.org/T109926#3517033 (mobrovac)
[19:37:30] Scap (Scap3-Adoption-Phase1), releng-201516-q4, releng-201718-q1, Trebuchet: [keyresult] Migrate remaining trebuchet deployed services - https://phabricator.wikimedia.org/T129290#3517032 (mobrovac)
[19:37:34] Scap (Scap3-Adoption-Phase1), Cassandra, Services (done): Deploy logstash logback encoder with scap3 - https://phabricator.wikimedia.org/T116340#3517028 (mobrovac) Open>Resolved a:mobrovac The repository has been switched to Scap3, resolving.
[19:54:06] Continuous-Integration-Infrastructure, Nodepool: Increase Jenkins/Nodepool capacity - https://phabricator.wikimedia.org/T173047#3517091 (Krinkle)
[19:56:07] thcipriani: question: if I set service_name but not service_port, will it still restart the service but not try to check any ports? (hoping for a yes here :P)
[19:56:23] yes :)
[19:56:28] cool
[19:57:49] relevant bit https://github.com/wikimedia/scap/blob/master/scap/deploy.py#L395-L402 tasks.handle_services restart and tasks.check_port checks ports
[19:58:39] cool thnx
[20:16:08] (PS1) Umherirrender: [CreatePageUw][UIFeedback] Add npm job [integration/config] - https://gerrit.wikimedia.org/r/371138
[20:29:44] Continuous-Integration-Infrastructure, Nodepool: Increase Jenkins/Nodepool capacity - https://phabricator.wikimedia.org/T173047#3517286 (Krinkle)
[20:54:02] Scap (Scap3-Adoption-Phase1), releng-201516-q2, releng-201516-q3, releng-201516-q4, scap2: [keyresult] Migrate all Service team owned services and MW to scap - https://phabricator.wikimedia.org/T109926#3517364 (mobrovac)
[20:54:09] Scap (Scap3-Adoption-Phase1), releng-201516-q4, releng-201718-q1, Trebuchet: [keyresult] Migrate remaining trebuchet deployed services - https://phabricator.wikimedia.org/T129290#3517363 (mobrovac)
[20:54:10] Scap (Scap3-Adoption-Phase1), Cassandra, Services (done): Deploy cassandra metrics collector via scap3 - https://phabricator.wikimedia.org/T137371#3517359 (mobrovac) Open>Resolved a:mobrovac metrics-collector uses Scap3 for its deployments now. Resolving.
[20:56:51] Scap (Scap3-Adoption-Phase1), releng-201516-q2, releng-201516-q3, releng-201516-q4, and 2 others: [keyresult] Migrate all Service team owned services and MW to scap - https://phabricator.wikimedia.org/T109926#3517369 (mobrovac) All of the services owned and/or maintained by the #services team ar...
[21:27:15] Continuous-Integration-Config, TestMe: fix or mark as inactive extensions currently failing CI - https://phabricator.wikimedia.org/T134090#3517409 (Umherirrender)
[21:27:19] Continuous-Integration-Config, MediaWiki-Core-Tests, MediaWiki-General-or-Unknown, Tracking: Let ApiDocumentationTest structure test pass on all repos - https://phabricator.wikimedia.org/T154838#3517406 (Umherirrender) Open>Resolved a:Umherirrender All current extensions have all need...
[21:34:05] PROBLEM - Puppet errors on deployment-kafka01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[21:35:47] PROBLEM - Puppet errors on deployment-etcd-01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[21:36:46] PROBLEM - Puppet errors on deployment-pdf01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[22:04:30] (PS1) Legoktm: Default PREFIX to /usr [integration/uprightdiff] - https://gerrit.wikimedia.org/r/371201
[22:04:39] (CR) jerkins-bot: [V: -1] Default PREFIX to /usr [integration/uprightdiff] - https://gerrit.wikimedia.org/r/371201 (owner: Legoktm)
[22:08:53] Deployment-Systems, MediaWiki-extensions-CentralAuth, Performance-Team: $wgLocalVirtualHosts should include login.wikimedia.org, wikidata.org and others? - https://phabricator.wikimedia.org/T172357#3517457 (Legoktm) It should probably just be `www.wikidata.org` so it doesn't include `query.wikidata.o...
[22:14:20] (PS1) Legoktm: Address paravoid's review [integration/uprightdiff] (debian) - https://gerrit.wikimedia.org/r/371202
[22:16:12] (PS2) Legoktm: Enable Squiz.WhiteSpace.OperatorSpacing [tools/codesniffer] - https://gerrit.wikimedia.org/r/370321 (https://phabricator.wikimedia.org/T171393) (owner: Umherirrender)
[22:16:16] (CR) Legoktm: [C: 2] Enable Squiz.WhiteSpace.OperatorSpacing [tools/codesniffer] - https://gerrit.wikimedia.org/r/370321 (https://phabricator.wikimedia.org/T171393) (owner: Umherirrender)
[22:16:57] (Merged) jenkins-bot: Enable Squiz.WhiteSpace.OperatorSpacing [tools/codesniffer] - https://gerrit.wikimedia.org/r/370321 (https://phabricator.wikimedia.org/T171393) (owner: Umherirrender)
[22:22:14] (PS2) Legoktm: Enforce "short" type definitions on @param in comments [tools/codesniffer] - https://gerrit.wikimedia.org/r/370232 (https://phabricator.wikimedia.org/T145162) (owner: Umherirrender)
[22:23:01] (CR) Legoktm: [C: 2] "Nice :)" [tools/codesniffer] - https://gerrit.wikimedia.org/r/370232 (https://phabricator.wikimedia.org/T145162) (owner: Umherirrender)
[22:24:23] MediaWiki-Codesniffer, MediaWiki-General-or-Unknown, MW-1.30-release-notes (WMF-deploy-2017-08-15 (1.30.0-wmf.14)): MediaWiki core was not passing phpcs - https://phabricator.wikimedia.org/T172933#3517493 (Umherirrender) I would say the bypassed failures are fixed, but not the problem that there coul...
[22:26:15] (Merged) jenkins-bot: Enforce "short" type definitions on @param in comments [tools/codesniffer] - https://gerrit.wikimedia.org/r/370232 (https://phabricator.wikimedia.org/T145162) (owner: Umherirrender)
[22:30:18] (PS2) Umherirrender: Added OpeningKeywordBraceSniff [tools/codesniffer] - https://gerrit.wikimedia.org/r/370312
[22:37:57] MediaWiki-Codesniffer, Patch-For-Review: Spacing around + - https://phabricator.wikimedia.org/T171393#3517506 (Umherirrender) Open>Resolved
[22:38:25] MediaWiki-Codesniffer: Provide Codesniffer rules to enforce "short" type definitions (int/bool, not integer/boolean) - https://phabricator.wikimedia.org/T145162#3517509 (Umherirrender)
[22:52:52] (CR) Legoktm: [C: 2] Added OpeningKeywordBraceSniff [tools/codesniffer] - https://gerrit.wikimedia.org/r/370312 (owner: Umherirrender)
[22:54:31] (Merged) jenkins-bot: Added OpeningKeywordBraceSniff [tools/codesniffer] - https://gerrit.wikimedia.org/r/370312 (owner: Umherirrender)
[22:54:41] Release-Engineering-Team (Kanban), Patch-For-Review, Release, Train Deployments: 1.30.0-wmf.13 deployment blockers - https://phabricator.wikimedia.org/T170631#3517532 (mmodell) Open>Resolved