[03:34:43] Release-Engineering-Team (Kanban): 20170721-Train-Wikidata - Post mortem - https://phabricator.wikimedia.org/T173435#3572249 (Jrbranaa) Postmortem meeting etherpad notes available at: https://etherpad.wikimedia.org/p/20170831-Post_Mortem-T164173.
[03:35:06] Release-Engineering-Team (Kanban): 20170721-Train-Wikidata - Post mortem - https://phabricator.wikimedia.org/T173435#3572250 (Jrbranaa) Open>Resolved
[03:37:48] Release-Engineering-Team (Kanban), Technical-Debt: Setup Tech Debt SIG meetings - https://phabricator.wikimedia.org/T173351#3572253 (Jrbranaa) p:Triage>Normal
[03:42:30] Release-Engineering-Team (Kanban): wmf.14 Blocker - Post Mortem - Cannot flush pre-lock snapshot because writes are pending - https://phabricator.wikimedia.org/T173477#3572254 (Jrbranaa) p:Triage>Normal
[03:44:04] Release-Engineering-Team (Kanban): Identify Orphaned components/code - https://phabricator.wikimedia.org/T173349#3572255 (Jrbranaa) p:Triage>Normal
[03:48:31] Release-Engineering-Team (Watching / External), Developer-Wishlist, Composer: Setup a Composer Repository (Packagist) for MediaWiki Extensions - https://phabricator.wikimedia.org/T170897#3572256 (dbarratt) >>! In T170897#3547492, @Legoktm wrote: > But I think the main issue is that conceptually exten...
[03:58:43] Project selenium-MultimediaViewer » firefox,mediawiki,Linux,BrowserTests build #503: FAILURE in 2 min 43 sec: https://integration.wikimedia.org/ci/job/selenium-MultimediaViewer/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=mediawiki,PLATFORM=Linux,label=BrowserTests/503/
[03:59:16] Deployment-Systems, MediaWiki-Interwiki, Wikimedia-Incident: Determine whether broken interwiki cache causes errors and whether those can be caught in deployment - https://phabricator.wikimedia.org/T174758#3572264 (Krinkle)
[04:07:37] Project selenium-MultimediaViewer » safari,beta,OS X 10.9,BrowserTests build #503: FAILURE in 11 min: https://integration.wikimedia.org/ci/job/selenium-MultimediaViewer/BROWSER=safari,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=OS%20X%2010.9,label=BrowserTests/503/
[04:20:45] MediaWiki-Releasing, MW-1.27-release, NewPHP, Patch-For-Review: Make MediaWiki 1.27 (LTS) compatible with PHP 7.1 - https://phabricator.wikimedia.org/T174262#3572285 (Legoktm) [21:09:39] (PS1) Legoktm: Don't pass $this by reference [skins/Vector] - https://gerrit.wikimedia.org/r/37511...
[04:57:16] PROBLEM - Work requests waiting in Zuul Gearman server https://grafana.wikimedia.org/dashboard/db/zuul-gearman on contint1001 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [140.0]
[05:02:56] ^ me
[05:41:51] PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [10.0]
[05:51:51] PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [10.0]
[05:53:57] Gerrit, Repository-Ownership-Requests, VPS-project-libraryupgrader: Grant libraryupgrader +2 rights for some library bumps - https://phabricator.wikimedia.org/T174760#3572337 (Legoktm)
[05:57:36] MediaWiki-Codesniffer: MW-CS erroneously removes $ from parameter comment - https://phabricator.wikimedia.org/T174761#3572349 (Legoktm)
[06:36:49] PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [10.0]
[06:37:39] MediaWiki-Codesniffer, Patch-For-Review: Provide Codesniffer rules to enforce "short" type definitions (int/bool, not integer/boolean) - https://phabricator.wikimedia.org/T145162#3572378 (Legoktm) In https://gerrit.wikimedia.org/r/#/c/375285/1/ShortUrl.utils.php "@return Boolean" was not changed to the s...
[06:57:19] PROBLEM - Puppet errors on deployment-kafka01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[08:56:50] RECOVERY - Mediawiki Error Rate on graphite-labs is OK: OK: Less than 1.00% above the threshold [1.0]
[09:11:06] PROBLEM - Work requests waiting in Zuul Gearman server https://grafana.wikimedia.org/dashboard/db/zuul-gearman on contint1001 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [140.0]
[09:22:53] PROBLEM - Puppet errors on deployment-mira is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[10:02:53] RECOVERY - Puppet errors on deployment-mira is OK: OK: Less than 1.00% above the threshold [0.0]
[10:31:52] morning!
[10:32:21] it seems like the zuul queue is stuck? https://integration.wikimedia.org/zuul/
[10:32:51] i'm trying to get patches verified for a UBN https://phabricator.wikimedia.org/T174724
[11:08:52] seconding that
[11:08:56] we are stuck too
[11:10:21] Yeh, I feel like there should be a low-priority queue for changes like this
[11:10:43] gate-submit doesnt matter too much, but test could have a low prio one
[11:11:40] hmm, I could probably cancel a buck tonne of the jobs
[11:11:51] the test jobs are for changes that have already been merged...
[11:12:56] Not sure if there is a way to do that..
[11:13:51] One down, 98 to go....
[11:26:00] heh
[11:26:25] 69
[11:38:34] 51
[12:03:21] 25
[12:12:47] 19
[12:22:06] 12!
[12:26:41] RECOVERY - Work requests waiting in Zuul Gearman server https://grafana.wikimedia.org/dashboard/db/zuul-gearman on contint1001 is OK: OK: Less than 30.00% above the threshold [90.0]
[12:27:59] YAY
[12:28:09] 11
[13:11:56] addshore: eek
[13:12:16] I am not around today and havent checked the CI status
[13:13:45] Amir1: addshore: was someone spamming a lot of changes to Gerrit?
[13:14:10] yeah, legoktm :D
[13:14:33] they were waiting in the queue for six hours
[13:14:46] Release-Engineering-Team, MediaWiki-ResourceLoader, Page-Previews, Readers-Web-Backlog: Provide a reliable test environment that mimics production for running integration tests - https://phabricator.wikimedia.org/T174786#3573043 (Jdlrobson)
[13:39:29] (PS1) Jdlrobson: Add a Jenkins job for Popups browser tests [integration/config] - https://gerrit.wikimedia.org/r/375377 (https://phabricator.wikimedia.org/T174786)
[13:39:52] Release-Engineering-Team, MediaWiki-ResourceLoader, Page-Previews, Readers-Web-Backlog, and 2 others: Provide a reliable test environment that mimics production for running integration tests - https://phabricator.wikimedia.org/T174786#3573118 (Jdlrobson)
[13:40:25] Release-Engineering-Team, MediaWiki-ResourceLoader, Page-Previews, Readers-Web-Backlog, and 3 others: Provide a reliable test environment that mimics production for running integration tests - https://phabricator.wikimedia.org/T174786#3573043 (Jdlrobson) (pulling in to add preventative measures o...
[13:46:41] Yippee, build fixed!
[13:46:41] Project selenium-VisualEditor » firefox,beta,Linux,BrowserTests build #509: FIXED in 2 min 40 sec: https://integration.wikimedia.org/ci/job/selenium-VisualEditor/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/509/
[13:53:30] (CR) Niedzielski: [C: 1] "This looks ok to me. The layout already has a Popups entry. 
Maybe MobileFrontend should depend on Popups in parameter_functions but that s" [integration/config] - https://gerrit.wikimedia.org/r/375377 (https://phabricator.wikimedia.org/T174786) (owner: Jdlrobson)
[14:30:25] MediaWiki-Releasing, MW-1.27-release, NewPHP, Patch-For-Review: Make MediaWiki 1.27 (LTS) compatible with PHP 7.1 - https://phabricator.wikimedia.org/T174262#3573269 (Reedy) Monobook REL1_2[7-9] cherry picks in https://gerrit.wikimedia.org/r/#/q/I525b06d4dcf2aa7796bd6769edf8703f15e5612a
[14:42:31] MediaWiki-Releasing, MW-1.27-release, NewPHP, Patch-For-Review: Make MediaWiki 1.27 (LTS) compatible with PHP 7.1 - https://phabricator.wikimedia.org/T174262#3573283 (Reedy) Vector REL1_2[7-9] cherry picks in https://gerrit.wikimedia.org/r/#/q/Ife076b323b1f4091fd851acb4b470000d2206cae
[16:18:48] Release-Engineering-Team (Kanban), Release, Train Deployments: 1.30.0-wmf.16 deployment blockers - https://phabricator.wikimedia.org/T170634#3573545 (demon) Open>Resolved
[16:25:11] PROBLEM - Long lived cherry-picks on puppetmaster on deployment-puppetmaster02 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0]
[16:58:40] MediaWiki-Codesniffer, Patch-For-Review: Provide Codesniffer rules to enforce "short" type definitions (int/bool, not integer/boolean) - https://phabricator.wikimedia.org/T145162#3573621 (Umherirrender) >>! In T145162#3572378, @Legoktm wrote: > In https://gerrit.wikimedia.org/r/#/c/375285/1/ShortUrl.util...
[17:14:13] addshore: I think the solution is that we need to make CI faster. Most of the time is spent waiting for nodepool to spin up new VMs
[17:14:16] Gerrit, Repository-Ownership-Requests, VPS-project-libraryupgrader: Grant libraryupgrader +2 rights for some library bumps - https://phabricator.wikimedia.org/T174760#3573697 (demon) Sadly, we cannot restrict (easily, would require some plugin/submit rule work) on the Gerrit side. If we're going to b...
[17:21:23] (and normally I do this on weekends, not weeknights but this time was an exception)
[17:25:07] Gerrit, Repository-Ownership-Requests, VPS-project-libraryupgrader: Grant libraryupgrader +2 rights for some library bumps - https://phabricator.wikimedia.org/T174760#3573713 (Legoktm) >>! In T174760#3573697, @demon wrote: > Sadly, we cannot restrict (easily, would require some plugin/submit rule wor...
[17:33:06] Scap (Scap3-Adoption-Phase1), Trebuchet: Cleanup /srv/deployment - https://phabricator.wikimedia.org/T170881#3446038 (RobH) root removal of /srv/deployment/STALE/ done =]
[17:34:11] Scap (Scap3-Adoption-Phase1), releng-201516-q4, releng-201718-q1, Trebuchet: [keyresult] Migrate remaining trebuchet deployed services - https://phabricator.wikimedia.org/T129290#3573744 (demon)
[17:34:13] Release-Engineering-Team (Kanban), Scap (Scap3-Adoption-Phase1), Trebuchet: Cleanup /srv/deployment - https://phabricator.wikimedia.org/T170881#3573741 (demon) Open>Resolved a:demon Yay thx. I thinkkkkkkkk that's it. Resolving :)
[17:37:34] (PS1) Niedzielski: Marvin: install NPM >= 5 [integration/config] - https://gerrit.wikimedia.org/r/375405
[17:37:39] (CR) Niedzielski: [C: -2] Marvin: install NPM >= 5 [integration/config] - https://gerrit.wikimedia.org/r/375405 (owner: Niedzielski)
[17:39:04] (CR) Niedzielski: [C: -2] "@hashar, this needs your blessing when convenient. I'm sorry for the great change this patch will have (see commit message). 
If you know a" [integration/config] - https://gerrit.wikimedia.org/r/375405 (owner: Niedzielski)
[17:40:23] Deployment-Systems, Scap (Scap3-Adoption-Phase1), scap2, Discovery: Deploy discovery-analytics with scap3 - https://phabricator.wikimedia.org/T129149#3573756 (demon) p:Triage>Normal Let's do it :)
[17:41:35] Deployment-Systems, Release-Engineering-Team (Kanban): Automate the recurring management of wikitech:Deployments and phab:#train_deployments - https://phabricator.wikimedia.org/T114488#3573761 (mmodell)
[17:41:52] Deployment-Systems, Release-Engineering-Team (Kanban): Automate the recurring management of wikitech:Deployments and phab:#train_deployments - https://phabricator.wikimedia.org/T114488#1697362 (mmodell)
[17:44:08] Gerrit, Repository-Ownership-Requests, VPS-project-libraryupgrader: Grant libraryupgrader +2 rights for some library bumps - https://phabricator.wikimedia.org/T174760#3573774 (demon) >>! In T174760#3573713, @Legoktm wrote: >>>! In T174760#3573697, @demon wrote: >> Sadly, we cannot restrict (easily, w...
[17:45:57] Gerrit, Repository-Ownership-Requests, VPS-project-libraryupgrader: Grant libraryupgrader +2 rights for some library bumps - https://phabricator.wikimedia.org/T174760#3573778 (demon) But also....if we're using the API we're having to auth using the actual LDAP password for the user...which is even mo...
[17:48:38] Gerrit, Repository-Ownership-Requests, VPS-project-libraryupgrader: Grant libraryupgrader +2 rights for some library bumps - https://phabricator.wikimedia.org/T174760#3573784 (Legoktm) >>! In T174760#3573774, @demon wrote: > Not really sure. Basically just kind of thinking the whole "don't depend on...
[17:48:56] oh perfect, we both came to the same conclusion
[17:51:43] sounds good
[17:51:50] Release-Engineering-Team (Kanban): 20170721-Train-Wikidata - Post mortem - https://phabricator.wikimedia.org/T173435#3573789 (mmodell)
[17:51:52] Release-Engineering-Team (Kanban), Wikidata, Wikimedia-Incident: Expand on the incident report for wikidata wmf.10 - wmf.14 train deployment - https://phabricator.wikimedia.org/T173433#3573787 (mmodell) Open>Resolved a:mmodell
[17:52:47] Gerrit, Repository-Ownership-Requests, VPS-project-libraryupgrader: Grant libraryupgrader +2 rights for some library bumps - https://phabricator.wikimedia.org/T174760#3573792 (Paladox) @demon we could do want upstream gerrit did for polygerrit. (without a plugin). it's done in the All-Project.
[17:55:05] Gerrit, Repository-Ownership-Requests, VPS-project-libraryupgrader: Grant libraryupgrader +2 rights for some library bumps - https://phabricator.wikimedia.org/T174760#3573803 (demon) Do what? You can't do path-based matching in Gerrit ACLs. Just branch (maybe topic?) based.
[17:57:12] legoktm: Oh, we'll also want to assign you to the Non-Interactive Users thingie
[17:57:15] Release-Engineering-Team, MediaWiki-ResourceLoader, Page-Previews, Readers-Web-Backlog, and 3 others: Provide a reliable test environment that mimics production for running integration tests - https://phabricator.wikimedia.org/T174786#3573808 (Legoktm) > This seems like a big lesson learned for u...
[17:57:18] So your pushes will be given BATCH priority
[17:57:23] (lower, therefore)
[17:57:43] Uses a separate (and smaller) threadpool
[18:01:19] Gerrit, Repository-Ownership-Requests, VPS-project-libraryupgrader: Grant libraryupgrader +2 rights for some library bumps - https://phabricator.wikimedia.org/T174760#3573852 (Paladox) You can do paths. Upstream did it.
[18:04:38] Gerrit, Repository-Ownership-Requests, VPS-project-libraryupgrader: Grant libraryupgrader +2 rights for some library bumps - https://phabricator.wikimedia.org/T174760#3573858 (Paladox) See https://gerrit-review.googlesource.com/#/c/gerrit/+/118072/3/rules.pl
[18:06:19] Gerrit, Repository-Ownership-Requests, VPS-project-libraryupgrader: Grant libraryupgrader +2 rights for some library bumps - https://phabricator.wikimedia.org/T174760#3573863 (demon) Ugh, Prolog. Craziest idea they ever had. No.
[18:07:20] no_justification: https://gerrit.wikimedia.org/r/#/admin/groups/4,members ?
[18:07:32] Yes, that.
[18:07:38] Also, JenkinsBot should maybe not be in Batch
[18:07:43] We want it to be done quickly
[18:07:48] Not be in a smaller threadpool
[18:27:11] Release-Engineering-Team (Kanban), Scap (Scap3-Adoption-Phase1), Trebuchet: Cleanup /srv/deployment - https://phabricator.wikimedia.org/T170881#3573887 (greg) \o/
[19:07:24] Continuous-Integration-Infrastructure, Release-Engineering-Team (Backlog): On contint-operations-puppet apt-get yields about locale - https://phabricator.wikimedia.org/T174584#3574013 (greg)
[19:16:19] MediaWiki-Releasing, MW-1.27-release, NewPHP: Make MediaWiki 1.27 (LTS) compatible with PHP 7.1 - https://phabricator.wikimedia.org/T174262#3574032 (Reedy)
[19:32:22] Can someone refresh my memory on the requirements for ci whitelist for recheck?
[19:32:59] requirement 1: being on the whitelist
[19:33:01] if someone with +2 in ci config trusts them, effectively
[19:33:50] greg-g: ok but isnt there a merge changes amnt requirement?
[19:33:57] no_justification: lol
[19:49:30] (Draft2) Zppix: add samtar to ci whitelist [integration/config] - https://gerrit.wikimedia.org/r/375428
[19:56:04] Continuous-Integration-Infrastructure, Release-Engineering-Team (Backlog): On contint-operations-puppet apt-get yields about locale - https://phabricator.wikimedia.org/T174584#3574112 (hashar) In the dockerfiles/contint-operations-puppet/Dockerfile we roughly do: ENV LANG='en_US.UTF-8' LANGUAGE='en_US:e...
[20:08:26] Release-Engineering-Team, MediaWiki-ResourceLoader, Page-Previews, Readers-Web-Backlog, and 3 others: Provide a reliable test environment that mimics production for running integration tests - https://phabricator.wikimedia.org/T174786#3574122 (Krinkle) Thanks for creating this task. From my persp...
[20:09:39] hashar: when you get a moment of time mind taking a look at https://gerrit.wikimedia.org/r/375428 for me please
[20:10:21] (CR) Krinkle: Add a Jenkins job for Popups browser tests (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/375377 (https://phabricator.wikimedia.org/T174786) (owner: Jdlrobson)
[20:10:47] Release-Engineering-Team, Page-Previews, Performance-Team, Readers-Web-Backlog, and 3 others: Provide a reliable test environment that mimics production for running integration tests - https://phabricator.wikimedia.org/T174786#3574161 (Krinkle)
[20:15:14] (CR) Jdlrobson: Add a Jenkins job for Popups browser tests (1 comment) [integration/config] - https://gerrit.wikimedia.org/r/375377 (https://phabricator.wikimedia.org/T174786) (owner: Jdlrobson)
[20:27:40] PROBLEM - App Server Main HTTP Response on deployment-mediawiki04 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[20:27:54] `5~hrmmm?
[20:27:55] ^
[20:28:08] nevermind the garbage terminal characters
[20:28:30] * greg-g tries loading...
[20:29:48] loads fine
[20:30:09] https://en.wikipedia.beta.wmflabs.org/wiki/Special:Version does that is
[20:31:57] Fine for me aswell
[20:32:33] RECOVERY - App Server Main HTTP Response on deployment-mediawiki04 is OK: HTTP OK: HTTP/1.1 200 OK - 51105 bytes in 1.104 second response time
[20:33:04] greg-g, yeah I can't figure out what went wrong there
[20:33:20] load doesn't look crazy /me shrugs
[20:33:24] everything looks okay to me
[20:42:47] Project selenium-Echo » firefox,beta,Linux,BrowserTests build #504: FAILURE in 1 min 47 sec: https://integration.wikimedia.org/ci/job/selenium-Echo/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/504/
[20:42:49] Project selenium-Echo » chrome,beta,Linux,BrowserTests build #504: FAILURE in 1 min 49 sec: https://integration.wikimedia.org/ci/job/selenium-Echo/BROWSER=chrome,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/504/
[22:21:02] Project selenium-CentralAuth » firefox,beta,Linux,BrowserTests build #506: FAILURE in 1 min 2 sec: https://integration.wikimedia.org/ci/job/selenium-CentralAuth/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=BrowserTests/506/
[22:32:59] Release-Engineering-Team (Watching / External), Phlogiston: Adjust phlogiston configuration for Release Engineering - https://phabricator.wikimedia.org/T170359#3574516 (ksmith)
[23:03:43] Release-Engineering-Team (Kanban), User-greg: 201718Q2 RelEng related progam goals - https://phabricator.wikimedia.org/T174835#3574529 (greg)