[00:02:20] Gerrit, Upstream: New Gerrit UI does not contain copyable Change-Id or parent commit id - https://phabricator.wikimedia.org/T195280 (Paladox)
[00:02:22] Gerrit, Upstream: Polygerrit Outgoing reviews, Incoming reviews and Recently closed headings should be more different from commits - https://phabricator.wikimedia.org/T186406 (Paladox)
[00:02:40] Gerrit: Place holder task for 2.16 upgrade - https://phabricator.wikimedia.org/T200739 (Paladox)
[00:04:44] Gerrit, User-Addshore: "Included groups" do not appear in New gerrit UI - https://phabricator.wikimedia.org/T200310 (Paladox)
[00:04:46] Gerrit: Place holder task for 2.16 upgrade - https://phabricator.wikimedia.org/T200739 (Paladox)
[00:05:13] Gerrit: Place holder task for 2.16 upgrade - https://phabricator.wikimedia.org/T200739 (Paladox) PolyGerrit's change view is getting a minor redesign on top of the one already done in 2.16. See https://bugs.chromium.org/p/gerrit/issues/detail?id=8882&desc=2#c28
[00:06:32] Gerrit: No way to mark patches as not WIP (which blocks merging) - https://phabricator.wikimedia.org/T197238 (Paladox) Open>Resolved Closing as resolved as 2.15.3 should have resolved most of these issues with only the admin being able to unmark wips :)
[00:52:39] (PS2) Dduvall: Establish a new class for patch set related things [integration/pipelinelib] - https://gerrit.wikimedia.org/r/447005
[00:53:42] (PS3) Dduvall: Establish a new class for patch set related things [integration/pipelinelib] - https://gerrit.wikimedia.org/r/447005
[01:07:43] is there a phab template for deploying a new extension?
[01:07:46] I can't seem to find on
[01:07:48] e
[01:09:29] Gerrit, Release-Engineering-Team, User-Addshore: Access to create gerrit groups for WMDE-Leszek - https://phabricator.wikimedia.org/T200311 (Paladox) We could add the user to this https://gerrit.wikimedia.org/r/admin/groups/119,members group which will grant them "create repos" and "create groups" ri...
[01:12:48] Gerrit, Release-Engineering-Team (Someday), Operations, Patch-For-Review: Gerrit shows HTTP 500 error when pasting extended unicode characters - https://phabricator.wikimedia.org/T145885 (Paladox)
[01:12:52] Gerrit, Developer-Wishlist (2017), Patch-For-Review, Upstream: Free-form tagging in gerrit - https://phabricator.wikimedia.org/T37534 (Paladox)
[01:12:54] Gerrit, Release-Engineering-Team: Migrate to NoteDb - https://phabricator.wikimedia.org/T174034 (Paladox) Open>Resolved This has been resolved now :). Groups will be migrated to notedb in T200739
[01:13:05] Gerrit, Release-Engineering-Team: Migrate to NoteDb - https://phabricator.wikimedia.org/T174034 (Paladox)
[01:15:23] Gerrit, ORES, Scoring-platform-team: Research Project Idea: Use AI to suggest improvements to patches uploaded to gerrit - https://phabricator.wikimedia.org/T195235 (Paladox) You can use this api https://gerrit-review.googlesource.com/Documentation/rest-api-changes.html#apply-fix i think (I'm not sur...
[01:19:55] Continuous-Integration-Infrastructure, Wikimedia-General-or-Unknown, Core-Platform-Team (CPT-Q1-Jul-Sep-2018), Patch-For-Review, Performance-Team (Radar): Deprecate/obsolete $wgWikimediaJenkinsCI - https://phabricator.wikimedia.org/T200650 (Legoktm)
[01:42:29] Gerrit, Release-Engineering-Team, User-Addshore: Access to create gerrit groups for WMDE-Leszek - https://phabricator.wikimedia.org/T200311 (Legoktm) Create groups or create repositories? I don't know why anyone would need permissions for the former without the latter.
[02:06:51] Project-Admins: Create project for "Wikimedia Foundation website" - https://phabricator.wikimedia.org/T200756 (Varnent)
[02:07:04] Project-Admins: Create project for "Wikimedia Foundation website" - https://phabricator.wikimedia.org/T200756 (Varnent)
[02:09:28] Release-Engineering-Team, DNS, Operations, Traffic, and 4 others: Move Foundation Wiki to new URL when new Wikimedia Foundation website launches - https://phabricator.wikimedia.org/T188776 (Varnent) Task with info on redirects: [T200754]
[02:11:19] Phabricator: Enable image hotlinking - https://phabricator.wikimedia.org/T186246 (Tbayer) Actually T116515 had been about images in the first place (from Commons) - it somehow morphed into a task about videos. I have (re-)filed {T199407} as a subtask of this one.
[02:20:30] Phabricator: Embedded Commons videos are broken - https://phabricator.wikimedia.org/T200757 (Tbayer)
[02:21:08] Phabricator (2017-06-01), RelEng-Archive-FY201718-Q1: Enable embedding of media from Wikimedia Commons - https://phabricator.wikimedia.org/T116515 (Tbayer) >>! In T116515#4419390, @mmodell wrote: > maybe we need to adjust content security policy? Indeed, it looks like it has to do with that - filed as T...
[02:54:46] Project-Admins: Create project for "Wikimedia Foundation website" - https://phabricator.wikimedia.org/T200756 (Legoktm) Does "wikimediafoundation.org" work as a name? That would fit the naming convention of other one-off sites like #wikitech.wikimedia.org .
[04:00:50] Project-Admins: Create project for "Wikimedia Foundation website" - https://phabricator.wikimedia.org/T200756 (Varnent) >>! In T200756#4463917, @Legoktm wrote: > Does "wikimediafoundation.org" work as a name? That would fit the naming convention of other one-off sites like #wikitech.wikimedia.org . @Legoktm...
[04:22:14] Project-Admins: Create project for "Wikimedia Foundation website" - https://phabricator.wikimedia.org/T200756 (Legoktm) Open>Resolved a:Legoktm Created #wikimediafoundation.org
[04:32:05] Continuous-Integration-Infrastructure, Wikimedia-General-or-Unknown, Core-Platform-Team (CPT-Q1-Jul-Sep-2018), Patch-For-Review, Performance-Team (Radar): Deprecate/obsolete $wgWikimediaJenkinsCI - https://phabricator.wikimedia.org/T200650 (Legoktm)
[05:02:42] Project-Admins: Create project for "Wikimedia Foundation website" - https://phabricator.wikimedia.org/T200756 (Varnent) Thank you @Legoktm!
[05:05:11] Release-Engineering-Team, DNS, Operations, Traffic, and 5 others: Move Foundation Wiki to new URL when new Wikimedia Foundation website launches - https://phabricator.wikimedia.org/T188776 (Varnent)
[05:08:22] Project-Admins, wikimediafoundation.org: Create project for "Wikimedia Foundation website" - https://phabricator.wikimedia.org/T200756 (Varnent)
[05:34:44] (PS1) Legoktm: Add npm6 docker image [integration/config] - https://gerrit.wikimedia.org/r/449399
[05:53:33] (PS2) Legoktm: Add npm6 docker image [integration/config] - https://gerrit.wikimedia.org/r/449399
[05:53:35] (PS1) Legoktm: Add npm6-audit-docker job [integration/config] - https://gerrit.wikimedia.org/r/449401
[05:54:11] (CR) Legoktm: [C: 2] Add npm6 docker image [integration/config] - https://gerrit.wikimedia.org/r/449399 (owner: Legoktm)
[05:55:21] (PS1) Legoktm: Run experimental npm6-audit-docker job for Parsoid [integration/config] - https://gerrit.wikimedia.org/r/449402
[05:55:44] (Merged) jenkins-bot: Add npm6 docker image [integration/config] - https://gerrit.wikimedia.org/r/449399 (owner: Legoktm)
[05:56:26] !log deploying https://gerrit.wikimedia.org/r/449399
[05:56:30] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[05:57:05] (CR) jerkins-bot: [V: -1] Run experimental npm6-audit-docker job for Parsoid [integration/config] - https://gerrit.wikimedia.org/r/449402 (owner: Legoktm)
[05:58:35] (PS2) Legoktm: Add npm6-audit-docker job [integration/config] - https://gerrit.wikimedia.org/r/449401
[05:58:37] (PS2) Legoktm: Run experimental npm6-audit-docker job for Parsoid [integration/config] - https://gerrit.wikimedia.org/r/449402
[06:03:32] (CR) Legoktm: [C: 2] Add npm6-audit-docker job [integration/config] - https://gerrit.wikimedia.org/r/449401 (owner: Legoktm)
[06:03:58] (CR) Legoktm: [C: 2] Run experimental npm6-audit-docker job for Parsoid [integration/config] - https://gerrit.wikimedia.org/r/449402 (owner: Legoktm)
[06:05:24] (Merged) jenkins-bot: Add npm6-audit-docker job [integration/config] - https://gerrit.wikimedia.org/r/449401 (owner: Legoktm)
[06:05:27] (Merged) jenkins-bot: Run experimental npm6-audit-docker job for Parsoid [integration/config] - https://gerrit.wikimedia.org/r/449402 (owner: Legoktm)
[06:05:52] !log deployed https://gerrit.wikimedia.org/r/449402
[06:05:55] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[06:10:25] PROBLEM - Puppet errors on deployment-deploy01 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[06:23:18] (PS4) Hashar: Migrate mw coverage job to Docker [integration/config] - https://gerrit.wikimedia.org/r/449273 (https://phabricator.wikimedia.org/T195918)
[06:24:22] (CR) Hashar: [C: 2] Migrate mw coverage job to Docker [integration/config] - https://gerrit.wikimedia.org/r/449273 (https://phabricator.wikimedia.org/T195918) (owner: Hashar)
[06:25:04] Continuous-Integration-Infrastructure (shipyard), Release-Engineering-Team (Kanban), Quibble, Patch-For-Review: Move MediaWiki extension PHPUnit coverage jobs to docker + quibble - https://phabricator.wikimedia.org/T195918 (hashar)
[06:26:15] (Merged) jenkins-bot: Migrate mw coverage job to Docker [integration/config] - https://gerrit.wikimedia.org/r/449273 (https://phabricator.wikimedia.org/T195918) (owner: Hashar)
[06:26:37] Continuous-Integration-Infrastructure, BlueSpice: Autofixing commits on BlueSpiceEchoConnector - https://phabricator.wikimedia.org/T200519 (Osnard) Thanks for sharing the script. But this does only add the code sniffer manifests, doesn't it? How do you actually do the autofixing? Is that something phpcs...
[06:32:59] (PS1) Legoktm: Fix npm6-audit-docker job [integration/config] - https://gerrit.wikimedia.org/r/449405
[06:34:35] Continuous-Integration-Infrastructure, BlueSpice: Autofixing commits on BlueSpiceEchoConnector - https://phabricator.wikimedia.org/T200519 (Legoktm) >>! In T200519#4464086, @Osnard wrote: > How do you actually do the autofixing? Is that something phpcs can do by itself? Yes, it's called "phpcbf". In mos...
[06:34:50] (CR) Legoktm: [C: 2] Fix npm6-audit-docker job [integration/config] - https://gerrit.wikimedia.org/r/449405 (owner: Legoktm)
[06:36:45] (Merged) jenkins-bot: Fix npm6-audit-docker job [integration/config] - https://gerrit.wikimedia.org/r/449405 (owner: Legoktm)
[06:40:25] RECOVERY - Puppet errors on deployment-deploy01 is OK: OK: Less than 1.00% above the threshold [0.0]
[07:00:20] Any other docs about Quibble than https://doc.wikimedia.org/quibble/usage.html ?
[07:30:00] Continuous-Integration-Infrastructure (shipyard), Release-Engineering-Team (Kanban), Quibble, Patch-For-Review: Move MediaWiki extension PHPUnit coverage jobs to docker + quibble - https://phabricator.wikimedia.org/T195918 (hashar)
[07:30:39] Continuous-Integration-Infrastructure (shipyard), Release-Engineering-Team (Kanban), releng-201718-q3, Epic, Patch-For-Review: [EPIC] Migrate Mediawiki jobs from Nodepool to Docker - https://phabricator.wikimedia.org/T183512 (hashar)
[07:30:52] Continuous-Integration-Infrastructure (shipyard), Release-Engineering-Team (Kanban), Quibble, Patch-For-Review: Move MediaWiki extension PHPUnit coverage jobs to docker + quibble - https://phabricator.wikimedia.org/T195918 (hashar) Open>Resolved
[07:42:19] (PS1) Hashar: Remove old mwskin-testskin jobs [integration/config] - https://gerrit.wikimedia.org/r/449414
[07:44:18] legoktm: Great idea to have a Phabricator template for new extensions! FWIW, this seems to be the most comprehensive page about the process, so obviously a good place to link the template if/when it exists.
[07:44:52] awight: well I'm pretty sure there used to be one, I'm not sure where it went
[07:45:55] (CR) Legoktm: [C: 1] "Awesome :) I'd like Anomie to take a look at this before merging though." [tools/codesniffer] - https://gerrit.wikimedia.org/r/449149 (https://phabricator.wikimedia.org/T171520) (owner: PleaseStand)
[07:46:02] (CR) Hashar: [C: 2] Remove old mwskin-testskin jobs [integration/config] - https://gerrit.wikimedia.org/r/449414 (owner: Hashar)
[07:46:13] legoktm: I’ve seen a template for security review, IIRC
[07:46:21] yep, I used that one
[07:47:18] This is mentioned as an exemplary extension checklist, if we end up writing a new template: https://phabricator.wikimedia.org/T190716
[07:48:00] * awight growls at missing “document type: template” search type
[07:48:30] (Merged) jenkins-bot: Remove old mwskin-testskin jobs [integration/config] - https://gerrit.wikimedia.org/r/449414 (owner: Hashar)
[07:49:17] legoktm: I have migrated all the mediawiki coverage jobs to docker :]]]
[07:49:33] <3
[07:49:38] and for mediawiki using sqlite is twice faster than with mysql
[07:49:39] ah darn, looks like templates are simply tasks, so there’s no search condition available
[07:50:00] awight: I don't think that's a good one, it has no items in the checklist about a mw.o page, the help page, and so on
[07:50:34] ok, should be replaced then, good to know!
[07:50:34] awight: https://phabricator.wikimedia.org/T108557 is the full checklist I remember
[07:51:10] Yeah that’s worded in generic terms, looks like it came from the template as you say
[07:51:46] https://phabricator.wikimedia.org/search/query/1mD5Nk6GmAE0/#R
[07:51:58] Only two hits, neither is a template
[07:55:08] Interesting, this BZ task has some of the same wording: https://phabricator.wikimedia.org/T58181
[07:59:21] Oh, that’s just from mw:Review_queue. LOL: https://phabricator.wikimedia.org/T72499#766473
[08:02:48] I’ve been interested in the new extension process due to T200297, and hopefully will have some clarifications to make about DBA review…
[08:02:49] T200297: Introduce a new namespace for collaborative judgments about wiki entities - https://phabricator.wikimedia.org/T200297
[08:05:52] Beta-Cluster-Infrastructure, Mathoid: Move mathoid to deployment-sca* hosts in Beta Cluster - https://phabricator.wikimedia.org/T142255 (mobrovac) Open>declined This is no longer an issue: Mathoid has been moved to our k8s infrastructure (nominally it still exists on SCB, but it's not used there...
[08:05:55] Beta-Cluster-Infrastructure, Operations, Puppet, Technical-Debt, Tracking: Minimize differences between beta and production (Tracking) - https://phabricator.wikimedia.org/T87220 (mobrovac)
[08:07:02] heh "my name is Greg"
[08:47:24] Continuous-Integration-Config, Release-Engineering-Team (Kanban), Discovery-Search (Current work): Help needed to setup the elastic6 dev branch on CirrusSearch, Elastica and mediawiki-vendor - https://phabricator.wikimedia.org/T200669 (hashar) Following a discussion with David this morning: The jobs...
[08:59:07] Gerrit: Place holder task for Gerrit 2.16 upgrade - https://phabricator.wikimedia.org/T200739 (Aklapper)
[09:02:44] Gerrit: Rename Gerrit repository "LdapGroups" to "LDAPGroups" - https://phabricator.wikimedia.org/T200736 (Aklapper)
[09:04:05] (PS1) Ema: Use backports for varnishkafka [integration/config] - https://gerrit.wikimedia.org/r/449426 (https://phabricator.wikimedia.org/T200445)
[09:05:25] (CR) jerkins-bot: [V: -1] Use backports for varnishkafka [integration/config] - https://gerrit.wikimedia.org/r/449426 (https://phabricator.wikimedia.org/T200445) (owner: Ema)
[09:08:01] (PS2) Ema: Use backports for varnishkafka [integration/config] - https://gerrit.wikimedia.org/r/449426 (https://phabricator.wikimedia.org/T200445)
[09:32:29] Release-Engineering-Team, MediaWiki-extensions-WikimediaIncubator, Epic, I18n: Make creating a new Language project easier - https://phabricator.wikimedia.org/T165585 (Verdy_p) Note that there's absolutely NO need to create "temporary" domains for languages codes and project in Incubator. We can...
[09:34:11] (PS1) Hashar: Only skip wmf-quibble jobs on REL branches [integration/config] - https://gerrit.wikimedia.org/r/449428
[09:35:47] (PS2) Hashar: Only skip wmf-quibble jobs on REL branches [integration/config] - https://gerrit.wikimedia.org/r/449428 (https://phabricator.wikimedia.org/T200669)
[09:36:23] (CR) Hashar: [C: 2] Only skip wmf-quibble jobs on REL branches [integration/config] - https://gerrit.wikimedia.org/r/449428 (https://phabricator.wikimedia.org/T200669) (owner: Hashar)
[09:38:04] (Merged) jenkins-bot: Only skip wmf-quibble jobs on REL branches [integration/config] - https://gerrit.wikimedia.org/r/449428 (https://phabricator.wikimedia.org/T200669) (owner: Hashar)
[09:40:28] Continuous-Integration-Config, Release-Engineering-Team (Kanban), Discovery-Search (Current work), Patch-For-Review: Help needed to setup the elastic6 dev branch on CirrusSearch, Elastica and mediawiki-vendor - https://phabricator.wikimedia.org/T200669 (hashar) After more talking with David the s...
[09:45:04] (CR) Hashar: [C: 2] "That is how we want for the other repository mixing -wikimedia and -backports :]" [integration/config] - https://gerrit.wikimedia.org/r/449426 (https://phabricator.wikimedia.org/T200445) (owner: Ema)
[09:45:26] hashar: how to run extensions tests in Quibble?
[09:46:42] (Merged) jenkins-bot: Use backports for varnishkafka [integration/config] - https://gerrit.wikimedia.org/r/449426 (https://phabricator.wikimedia.org/T200445) (owner: Ema)
[09:47:07] kart_: I wrote a bit about it on https://phabricator.wikimedia.org/phame/post/view/99/introducing_quibble/ and https://lists.wikimedia.org/pipermail/qa/2018-April/002699.html
[09:47:33] kart_: the readme as much more informations : https://doc.wikimedia.org/quibble/
[09:47:54] kart_: but really, it is just an helper to clone repositories / checkout patches then install mediawiki and run test commands
[09:48:17] Yeah.
[09:48:32] Thanks. Looks nice. I'll explore more in coming days.
[09:48:40] hashar: is it possible to add more extensions to the default test set? I am getting frustrated. Right now there are three separate issues blocking merges in CX and Translate that do not seem to be caused by any changes in those extensions and I would like not have breaking changes merged in the first place.
[09:49:58] or maybe just two
[09:50:04] Continuous-Integration-Infrastructure, Quibble: Allow running individual tests on Jenkins - https://phabricator.wikimedia.org/T200684 (Simetrical) The problem is that sometimes running locally produces different results from on the CI infrastructure, or you don't want to set up extensions etc. I'd like...
[09:56:52] (PS1) Ema: varnishkafka: use debian-glue (voting) [integration/config] - https://gerrit.wikimedia.org/r/449433 (https://phabricator.wikimedia.org/T200445)
[10:00:29] (CR) Hashar: [C: 2] varnishkafka: use debian-glue (voting) [integration/config] - https://gerrit.wikimedia.org/r/449433 (https://phabricator.wikimedia.org/T200445) (owner: Ema)
[10:00:35] (CR) Hashar: [C: -2] varnishkafka: use debian-glue (voting) [integration/config] - https://gerrit.wikimedia.org/r/449433 (https://phabricator.wikimedia.org/T200445) (owner: Ema)
[10:01:12] (PS2) Hashar: varnishkafka: use debian-glue (voting) [integration/config] - https://gerrit.wikimedia.org/r/449433 (https://phabricator.wikimedia.org/T200445) (owner: Ema)
[10:01:38] (CR) Hashar: [C: 2] "I have changed the commit message to point to the Gerrit change https://gerrit.wikimedia.org/r/#/c/449425/ instead of the CI console (whi" [integration/config] - https://gerrit.wikimedia.org/r/449433 (https://phabricator.wikimedia.org/T200445) (owner: Ema)
[10:03:00] (Merged) jenkins-bot: varnishkafka: use debian-glue (voting) [integration/config] - https://gerrit.wikimedia.org/r/449433 (https://phabricator.wikimedia.org/T200445) (owner: Ema)
[10:15:40] !log gerrit: deleting branch wmf/es6 on mediawiki/vendor . We use 'es6' branch instead. (made for dcausse )
[10:15:44] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[10:30:03] Beta-Cluster-Infrastructure: deployment-tin:/srv/mediawiki-staging/wmf-config/event-schemas is out of sync - https://phabricator.wikimedia.org/T199270 (dcausse) Pinging @Pchelolo as he may use this repo for testing purposes.
[11:44:12] Continuous-Integration-Config, Gerrit: Allow pushing more than 10 changes at once - https://phabricator.wikimedia.org/T200785 (Simetrical)
[11:47:55] Continuous-Integration-Config, Gerrit: Allow pushing more than 10 changes at once - https://phabricator.wikimedia.org/T200785 (Simetrical) Originally caused by: https://github.com/wikimedia/puppet/commit/9e886b760ac6d3774e7d51ee9dea18678ae1b9be Is the changes/push limit necessary, or could we do with j...
[12:45:25] Continuous-Integration-Infrastructure: Auto-fix errors in pushed changesets where possible - https://phabricator.wikimedia.org/T200790 (Simetrical)
[12:47:22] PROBLEM - Work requests waiting in Zuul Gearman server on contint1001 is CRITICAL: CRITICAL: 57.14% of data above the critical threshold [140.0] https://grafana.wikimedia.org/dashboard/db/zuul-gearman?panelId=10&fullscreen&orgId=1
[12:53:03] Continuous-Integration-Config, Release-Engineering-Team (Kanban), Discovery-Search (Current work), Patch-For-Review: Help needed to setup the elastic6 dev branch on CirrusSearch, Elastica and mediawiki-vendor - https://phabricator.wikimedia.org/T200669 (dcausse) Open>Resolved thanks! this...
[13:38:13] RECOVERY - Work requests waiting in Zuul Gearman server on contint1001 is OK: OK: Less than 30.00% above the threshold [90.0] https://grafana.wikimedia.org/dashboard/db/zuul-gearman?panelId=10&fullscreen&orgId=1
[13:52:16] Gerrit: Rename Gerrit repository "LdapGroups" to "LDAPGroups" - https://phabricator.wikimedia.org/T200736 (Paladox) We should also make the group own its self.
[14:27:14] Beta-Cluster-Infrastructure, Analytics, Services (watching): deployment-tin:/srv/mediawiki-staging/wmf-config/event-schemas is out of sync - https://phabricator.wikimedia.org/T199270 (Pchelolo) In production, the event-schemas are deployed automatically as soon as a patch is merged in gerrit in this...
[14:38:50] Beta-Cluster-Infrastructure, Analytics, Services (watching): deployment-tin:/srv/mediawiki-staging/wmf-config/event-schemas is out of sync - https://phabricator.wikimedia.org/T199270 (Krenair) I'm assuming whoever set up the submodule in mediawiki-config forgot to add a `git submodule update` or some...
[14:49:02] Beta-Cluster-Infrastructure, Analytics, Services (watching): deployment-tin:/srv/mediawiki-staging/wmf-config/event-schemas is out of sync - https://phabricator.wikimedia.org/T199270 (Ottomata) I'm not sure I knew this repo was even in wmf-config. Not sure why it is there...I'm pretty sure the Event...
[14:54:45] Release-Engineering-Team (Kanban), MediaWiki-Core-Tests, Patch-For-Review, User-zeljkofilipin, Wikimedia-Incident: Create selenium-MediaWiki-jessie daily Jenkins job - https://phabricator.wikimedia.org/T185011 (zeljkofilipin) a:zeljkofilipin>None
[15:01:57] Beta-Cluster-Infrastructure, Analytics, Services (watching): deployment-tin:/srv/mediawiki-staging/wmf-config/event-schemas is out of sync - https://phabricator.wikimedia.org/T199270 (dcausse) If this submodule is not referenced by anything else other than mediawiki then there should be no reason to...
[15:19:43] Phabricator, Regression: Creating subtasks sets more restrictive edit permissions than parent task: Others cannot edit such subtasks anymore - https://phabricator.wikimedia.org/T199122 (Mainframe98) I encountered this with T200798, It also appears to prevent me from seeing the Herald transcript for that...
[15:28:00] PROBLEM - Puppet errors on integration-r-lang-01 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[15:32:53] PROBLEM - Free space - all mounts on deployment-maps04 is CRITICAL: CRITICAL: deployment-prep.deployment-maps04.diskspace._srv.byte_percentfree (<55.56%)
[15:39:50] PROBLEM - SSH on integration-slave-docker-1017 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[15:44:41] RECOVERY - SSH on integration-slave-docker-1017 is OK: SSH OK - OpenSSH_6.7p1 Debian-5+deb8u4 (protocol 2.0)
[15:46:09] Phabricator: Embedded Commons videos are broken - https://phabricator.wikimedia.org/T200757 (mmodell) This should be fixed with rPHAB386b86fde376
[15:47:23] Phabricator: Embedded Commons videos are broken - https://phabricator.wikimedia.org/T200757 (mmodell)
[15:47:40] Phabricator: Embedded Commons videos are broken - https://phabricator.wikimedia.org/T200757 (mmodell) p:Triage>Normal
[15:51:06] Phabricator, Regression: Creating subtasks sets more restrictive edit permissions than parent task: Others cannot edit such subtasks anymore - https://phabricator.wikimedia.org/T199122 (mmodell) Open>Resolved a:mmodell This should be resolved now: I changed default edit policy on maniphest ta...
[16:08:02] RECOVERY - Puppet errors on integration-r-lang-01 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:11:06] Release-Engineering-Team (Kanban), MediaWiki-Core-Tests, MW-1.32-release-notes (WMF-deploy-2018-07-24 (1.32.0-wmf.14)), Patch-For-Review, User-zeljkofilipin: Run tests daily targeting beta cluster for all repositories with Selenium tests - https://phabricator.wikimedia.org/T188742 (zeljkofilip...
[16:30:45] PROBLEM - Work requests waiting in Zuul Gearman server on contint1001 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [140.0] https://grafana.wikimedia.org/dashboard/db/zuul-gearman?panelId=10&fullscreen&orgId=1
[16:32:27] (CR) Thcipriani: [V: 2 C: 2] "Good rename" [integration/pipelinelib] - https://gerrit.wikimedia.org/r/447005 (owner: Dduvall)
[16:46:51] PROBLEM - SSH on integration-slave-docker-1017 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[16:47:53] Continuous-Integration-Infrastructure, MediaWiki-Codesniffer: Auto-fix errors in pushed changesets where possible - https://phabricator.wikimedia.org/T200790 (Legoktm) > maybe we could have a button to push that approves the diff and automatically supplies a new changeset I like that idea. Some sniffs a...
[16:48:33] Gerrit: Allow pushing more than 10 changes at once - https://phabricator.wikimedia.org/T200785 (Legoktm)
[16:58:21] Continuous-Integration-Infrastructure, MediaWiki-Codesniffer: Auto-fix errors in pushed changesets where possible - https://phabricator.wikimedia.org/T200790 (Krinkle) Meanwhile, we can also improve the documentation and workflows for doing it locally. In theory, it should be as simple as `composer run f...
[17:04:10] Continuous-Integration-Infrastructure, MediaWiki-Codesniffer: Auto-fix errors in pushed changesets where possible - https://phabricator.wikimedia.org/T200790 (Legoktm) Just `composer fix` :) With MediaWiki core though it takes a decent amount of time (5-6 minutes for me) to run over everything, so you sh...
[17:09:32] (CR) Thcipriani: [C: 2] Update references to deployment-bastion [integration/config] - https://gerrit.wikimedia.org/r/449288 (owner: Alex Monk)
[17:11:40] (Merged) jenkins-bot: Update references to deployment-bastion [integration/config] - https://gerrit.wikimedia.org/r/449288 (owner: Alex Monk)
[17:11:46] RECOVERY - SSH on integration-slave-docker-1017 is OK: SSH OK - OpenSSH_6.7p1 Debian-5+deb8u4 (protocol 2.0)
[17:17:59] Release-Engineering-Team, DNS, Operations, Traffic, and 5 others: Move Foundation Wiki to new URL when new Wikimedia Foundation website launches - https://phabricator.wikimedia.org/T188776 (Varnent) @BBlack and @Reedy - One of the places that did not seem to respect the temporary nature of that U...
[17:21:18] Project-Admins, Developer-Advocacy (Jul-Sep 2018): Sort out scope/confusion between #Possible-Tech-Projects and #Outreach-Programs-Projects tags - https://phabricator.wikimedia.org/T198101 (Aklapper) a:Aklapper>srishakatux Reassigning to @srishakatux as I cannot solve T198101#4430365 by myself...
[17:29:11] RECOVERY - Work requests waiting in Zuul Gearman server on contint1001 is OK: OK: Less than 30.00% above the threshold [90.0] https://grafana.wikimedia.org/dashboard/db/zuul-gearman?panelId=10&fullscreen&orgId=1
[17:50:49] !log Apply role::webperf::profiling_tools to deployment-webperf12; T195312 / T180761
[17:50:54] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[17:50:55] T180761: Move XHGui from tungsten to webperf-002 - https://phabricator.wikimedia.org/T180761
[17:50:56] T195312: Move flame graphs hosting from mwlog1001 to webperf-2 and enable in Beta Cluster - https://phabricator.wikimedia.org/T195312
[18:01:19] PROBLEM - Work requests waiting in Zuul Gearman server on contint1001 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [140.0] https://grafana.wikimedia.org/dashboard/db/zuul-gearman?panelId=10&fullscreen&orgId=1
[18:12:55] PROBLEM - Puppet errors on deployment-webperf12 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[18:17:55] RECOVERY - Free space - all mounts on deployment-maps04 is OK: OK: All targets OK
[18:27:08] Nikerabbit: yes, please file bugs in the #ci-config project to add more extensions
[18:28:13] legoktm: okay! I have to check if there has been previous requests
[18:28:39] Continuous-Integration-Config, MediaWiki-extensions-ParserFunctions: ParserFunction tests fail on gerrit - https://phabricator.wikimedia.org/T200831 (Huji)
[18:28:52] PROBLEM - Free space - all mounts on deployment-maps04 is CRITICAL: CRITICAL: deployment-prep.deployment-maps04.diskspace._srv.byte_percentfree (<11.11%)
[18:29:54] Continuous-Integration-Config, MediaWiki-extensions-ParserFunctions: ParserFunction tests fail on gerrit - https://phabricator.wikimedia.org/T200831 (Legoktm)
[18:33:53] RECOVERY - Free space - all mounts on deployment-maps04 is OK: OK: All targets OK
[18:39:28] RECOVERY - Work requests waiting in Zuul Gearman server on contint1001 is OK: OK: Less than 30.00% above the threshold [90.0] https://grafana.wikimedia.org/dashboard/db/zuul-gearman?panelId=10&fullscreen&orgId=1
[18:48:28] PROBLEM - Work requests waiting in Zuul Gearman server on contint1001 is CRITICAL: CRITICAL: 57.14% of data above the critical threshold [140.0] https://grafana.wikimedia.org/dashboard/db/zuul-gearman?panelId=10&fullscreen&orgId=1
[19:11:57] PROBLEM - Free space - all mounts on deployment-tin is CRITICAL: CRITICAL: deployment-prep.deployment-tin.diskspace._mnt.byte_percentfree (No valid datapoints found)deployment-prep.deployment-tin.diskspace._srv.byte_percentfree (<10.00%)
[19:22:23] Continuous-Integration-Config, MediaWiki-extensions-GlobalPreferences, Community-Tech-Sprint: Re-add GlobalPreferences to extension-gate - https://phabricator.wikimedia.org/T199761 (Niharika)
[19:24:18] Continuous-Integration-Config, Community-Tech, MediaWiki-extensions-GlobalPreferences, Community-Tech-Sprint: Re-add GlobalPreferences to extension-gate - https://phabricator.wikimedia.org/T199761 (Niharika)
[19:25:09] RECOVERY - Work requests waiting in Zuul Gearman server on contint1001 is OK: OK: Less than 30.00% above the threshold [90.0] https://grafana.wikimedia.org/dashboard/db/zuul-gearman?panelId=10&fullscreen&orgId=1
[19:30:32] Beta-Cluster-Infrastructure, Discovery-Search, Elasticsearch: Puppet errors on deployment-elastic* instances - https://phabricator.wikimedia.org/T200842 (Krenair)
[19:43:55] Release-Engineering-Team (Kanban), Release Pipeline: Generic CI job for running Blubber-built test entry points - https://phabricator.wikimedia.org/T200843 (dduvall)
[19:45:46] Phabricator: Cannot unassign myself from Phabricator project - https://phabricator.wikimedia.org/T200844 (Izno)
[19:45:49] (PS1) Dduvall: Provide a generic job for testing via Blubber [integration/config] - https://gerrit.wikimedia.org/r/449527 (https://phabricator.wikimedia.org/T200843)
[19:50:01] !log Configuring Jenkins to include integration/pipelinelib as an available pipeline library
[19:50:04] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[19:53:31] 19:47:30 DEBUG:git.cmd:AutoInterrupt wait stderr: "fatal: git fetch_pack: expected ACK/NAK, got 'ERR want d183bb22ab4aa9e83e23a354d4acc547400577c8 not valid'\nfatal: The remote end hung up unexpectedly"
[19:53:35] that's a new one...
[20:09:52] PROBLEM - Free space - all mounts on deployment-maps04 is CRITICAL: CRITICAL: deployment-prep.deployment-maps04.diskspace._srv.byte_percentfree (<44.44%) [20:15:56] 10Phabricator: Cannot obviously unassign myself from Phabricator project - https://phabricator.wikimedia.org/T200844 (10Izno) [20:17:35] 10Release-Engineering-Team (Kanban), 10Release, 10Train Deployments: 1.32.0-wmf.15 deployment blockers - https://phabricator.wikimedia.org/T191061 (10mmodell) [20:19:51] RECOVERY - Free space - all mounts on deployment-maps04 is OK: OK: All targets OK [20:24:03] 10Phabricator: Cannot obviously unassign myself from Phabricator project - https://phabricator.wikimedia.org/T200844 (10Aklapper) 05Open>03stalled https://phabricator.wikimedia.org/project/query/oYMgUXSn9IZa/#R lists three projects that you are a member of. What exactly blocks you from leaving them? (Maybe t... [20:29:23] 10Beta-Cluster-Infrastructure: Request for shell access and steward rights in beta cluster - https://phabricator.wikimedia.org/T194267 (10Rxy) a:05Rxy>03Krenair >>! In T194267#4462628, @Krenair wrote: > Yep. > > This isn't your account, is it? https://deployment.wikimedia.beta.wmflabs.org/wiki/Special:Centr... [20:30:42] 10Beta-Cluster-Infrastructure: Request for shell access and steward rights in beta cluster - https://phabricator.wikimedia.org/T194267 (10Krenair) 05Open>03Resolved (change visibility) 20:30, 31 July 2018 Krenair (talk | contribs | block) changed global group membership for User:Rxy from (none) to steward (r... [20:43:22] gerrit admin around? [20:43:42] if so, please unmark https://gerrit.wikimedia.org/r/#/c/operations/mediawiki-config/+/449513/ as WIP so it can be SWATed [20:48:48] Hauskatze: done [20:48:57] :) [21:07:58] 10Phabricator: Cannot obviously unassign myself from Phabricator project - https://phabricator.wikimedia.org/T200844 (10Izno) Well, I've now managed to leave the one project (now I wish I had tracked what that was--I assume it was Tech Debt). 
I am still listed as a member on the milestones according to my profi... [21:16:41] 10Phabricator: Cannot obviously unassign myself from Phabricator project - https://phabricator.wikimedia.org/T200844 (10MarcoAurelio) You were a watcher of https://phabricator.wikimedia.org/project/members/2296/ (I was going to copy the domain and ended removing you from that milestone, sorry if that was unwante... [21:18:30] 10Phabricator: Cannot obviously unassign myself from Phabricator project - https://phabricator.wikimedia.org/T200844 (10Izno) 05stalled>03Resolved a:03Izno Huh, that behavior's a bit strange. But that's resolved-enough for me. [21:22:55] thcipriani: could you do the same with https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/384755/ ? [21:23:29] yep [21:23:31] done [21:24:53] chachi :) [21:26:48] 10Continuous-Integration-Infrastructure, 10MediaWiki-Codesniffer: Auto-fix errors in pushed changesets where possible - https://phabricator.wikimedia.org/T200790 (10hashar) [21:29:29] 10Continuous-Integration-Infrastructure, 10MediaWiki-Codesniffer: Auto-fix errors in pushed changesets where possible - https://phabricator.wikimedia.org/T200790 (10hashar) Cant you pass a list of files to `composer fix` / `phpcbf` ? I do that in core for `composer test` to greatly speed up the linting durat... [21:36:29] 10Gerrit: Allow pushing more than 10 changes at once - https://phabricator.wikimedia.org/T200785 (10hashar) 05Open>03declined https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/447821/ introduces the ContentLanguage, its parent looks unrelated ( https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/447788/ u... [21:41:28] hasharAway: / thcipriani zuul seems to have become stuck [21:41:47] Hauskatze: it is fine [21:41:50] just got overloaded [21:41:55] duh [21:42:07] the changes in the [coverage] pipeline are deprioritized and run after all the other jobs are done. 
So that part piles up [21:42:24] for gate-and-submit, I guess too many patches got +2ed [21:42:38] why all of them are marked as dependencies? [21:43:14] they are glued together [21:43:22] aha [21:43:40] well I hope the queue is less busy for evening SWAT :) [21:43:44] and I guess Zuul is currently limiting gate-and-submit to only 2 changes [21:43:53] it will raise the window as changes manage to pass/merge [21:44:03] so other changes are held in the queue [21:45:00] with https://integration.wikimedia.org/ci/label/m4executor/load-statistics showing the instances running mediawiki tests reached the limit of 15 concurrent builds [21:46:27] Hauskatze: so not stuck, but it has some backlog :\ [21:46:59] hasharAway: good to know it's working; God's mills may grind slowly but they grind finely. [21:57:02] (03PS1) 10Hashar: Migrate ProofreadPage to Quibble [integration/config] - 10https://gerrit.wikimedia.org/r/449617 (https://phabricator.wikimedia.org/T198173) [21:57:22] 10Continuous-Integration-Infrastructure (shipyard), 10Release-Engineering-Team (Kanban), 10releng-201718-q3, 10Epic, 10Patch-For-Review: [EPIC] Migrate Mediawiki jobs from Nodepool to Docker - https://phabricator.wikimedia.org/T183512 (10hashar) [21:59:44] (03CR) 10Hashar: [C: 032] Migrate ProofreadPage to Quibble [integration/config] - 10https://gerrit.wikimedia.org/r/449617 (https://phabricator.wikimedia.org/T198173) (owner: 10Hashar) [22:01:11] (03Merged) 10jenkins-bot: Migrate ProofreadPage to Quibble [integration/config] - 10https://gerrit.wikimedia.org/r/449617 (https://phabricator.wikimedia.org/T198173) (owner: 10Hashar) [22:04:44] !log moving active deployment-prep deployment server to deployment-deploy01 [22:04:47] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [22:09:08] thcipriani: nice :) [22:09:47] Krinkle: maybe :) still making sure everything works right :) [22:10:11] thcipriani: I'd like to review the query used by scap and fix it ASAP. Got a minute? 
[22:10:19] For canary regressions [22:10:38] sure [22:10:40] Maybe it's all good already, but I'd like to sanity check :) [22:11:07] So where do I find the current query used, and does prod match latest git? [22:11:15] that'd be good. I feel like I'm generally the only one who looks at it [22:11:33] the current query is in logstash_checker.py in operations/puppet in the services module [22:16:32] Krinkle: https://gerrit.wikimedia.org/r/plugins/gitiles/operations/puppet/+/production/modules/service/files/logstash_checker.py#115 [22:16:34] thcipriani: Aye, yeah, so that's what I feared. It's missing channel:fatal (ouch), and it's excluding level=INFO, which is most of channel:error and type:hhvm [22:17:08] If we'd monitor all of type:mediawiki, then it'd make sense to exclude INFO/DEBUG, but if we're just looking at these channels, I'd be inclined to just consider all levels. [22:18:06] For later consideration, I would actually recommend a wildcard type:mediawiki for higher severity levels, so that we catch errors from all areas of mediawiki and extensions (resourceloader, memcached, jobqueue, etc.) [22:18:15] But for now, these two changes would make a good start. [22:18:51] Looking at recent data via https://logstash.wikimedia.org/goto/b316b2f6951e84f3dbc810319bf4afdb [22:20:42] 10Continuous-Integration-Infrastructure, 10Zuul: Zuul silently ignores patches if a dependency loop is involved somehow - https://phabricator.wikimedia.org/T181574 (10hashar) [22:20:47] one thing to keep in mind is high-variability may lead to false positives. So the recent spike in that graph (which seems like it may be due to input) may have prevented a deployment [22:21:26] I agree that's an issue to look out for, but doesn't relate more to one source of errors than others. 
[22:22:14] 10Continuous-Integration-Infrastructure, 10Quibble, 10Patch-For-Review: Address fixme in mw-fetch-composer-dev.sh - https://phabricator.wikimedia.org/T181940 (10hashar) ``` # FIXME integration/composer used to be outdated and broke the # autoloader. Since composer 1.0.0-alpha11 the following... [22:23:01] Hm.. so level=INFO from type=hhvm can actually be excluded. Not because it's unimportant, but because hhvm logs everything twice. Once under INFO and again under NOTICE or WARNING. [22:23:20] But that does not apply to mediawiki, there channel=error level=INFO matters (and arguably shouldn't use that level) [22:23:21] sure, just something to be mindful of: if a particular channel is noisy it may lead to frustration [22:24:00] Yeah, i'm not adding more channels besides channel:fatal which is important, more so than channel:exception. [22:24:27] gotcha [22:25:07] exception is a fatal error we managed to catch at the top level to print nicely. fatal is.. fatal, caught by our out-of-process hhvm syslog monitor, which is then rewritten from type:hhvm-json to type:mediawiki channel:fatal. [22:27:40] so as a first pass: add channel:fatal, remove level:INFO from must_not [22:28:11] probably get rid of the message exclusions aside from SlowTimer at this point. They were important at one point. [22:29:12] Indeed. The latter two message exclusions have no matches recently. [22:29:26] We may want to keep excluding (type:hhvm AND level:INFO) in some way [22:29:42] afaik it wouldn't make any difference besides amplification [22:30:00] 10Continuous-Integration-Infrastructure: Feature request: Evaluate "require" field from "extension.json" in automated test environment - https://phabricator.wikimedia.org/T185736 (10hashar) [22:30:02] everything from hhvm-type is logged to its proper level (NOTICE, WARNING, ERROR) and to INFO. 
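The first-pass change discussed above (add channel:fatal, stop excluding level:INFO globally, keep excluding type:hhvm AND level:INFO, keep only the SlowTimer message exclusion) could be sketched roughly as below. This is a minimal illustration and not the actual contents of logstash_checker.py: the function name `build_canary_query`, the `host`, `@timestamp` and `message` field names, the 60 second window, and the overall layout of the query body are all assumptions. Only the type, channel and level values come from the conversation.

```python
import json

# Channels named in the discussion; "fatal" is the one being added.
MEDIAWIKI_CHANNELS = ("exception", "error", "fatal")


def build_canary_query(host, since="now-60s"):
    """Return an Elasticsearch bool query body (hypothetical layout)."""
    channels = " OR ".join(MEDIAWIKI_CHANNELS)
    wanted = "(type:mediawiki AND channel:(%s)) OR type:hhvm" % channels
    return {
        "query": {
            "bool": {
                # What counts as an error for the canary check.
                "must": [{"query_string": {"query": wanted}}],
                # Assumed field names: restrict to one host and a recent window.
                "filter": [
                    {"term": {"host": host}},
                    {"range": {"@timestamp": {"gte": since}}},
                ],
                "must_not": [
                    # hhvm logs each entry twice, once at INFO, so drop the copy.
                    {"query_string": {"query": "type:hhvm AND level:INFO"}},
                    # The one message exclusion kept in the discussion.
                    {"match_phrase": {"message": "SlowTimer"}},
                ],
            }
        }
    }


if __name__ == "__main__":
    # "canary.example" is a placeholder host name.
    print(json.dumps(build_canary_query("canary.example"), indent=2))
```

Scoping the INFO exclusion to type:hhvm means channel:error entries from mediawiki that are logged at INFO still count, which is the point raised at 22:23:20.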
[22:30:32] 10Continuous-Integration-Infrastructure: Feature request: Evaluate "require" field from "extension.json" in automated test environment - https://phabricator.wikimedia.org/T185736 (10hashar) [22:30:45] 10Continuous-Integration-Infrastructure, 10Quibble: Feature request: Evaluate "require" field from "extension.json" in automated test environment - https://phabricator.wikimedia.org/T185736 (10hashar) [22:32:07] 10Phabricator, 10Tools: Publicly log account bans made using the phab-ban tool - https://phabricator.wikimedia.org/T200856 (10bd808) [22:32:53] (03Abandoned) 10Paladox: testing quota [All-Projects] (refs/meta/config) - 10https://gerrit.wikimedia.org/r/433909 (owner: 10Paladox) [22:35:10] (03Abandoned) 10Paladox: Update mw-install-postgresql to include the install script [integration/jenkins] - 10https://gerrit.wikimedia.org/r/316232 (https://phabricator.wikimedia.org/T22343) (owner: 10Paladox) [22:36:14] Krinkle: ok, I'll fiddle with modifications to the query and add you as a reviewer. As I mentioned I'd ideally like to move the query into mwconfig. we talked about it a bit at our team meeting Monday, not enough hours in the day :( [22:37:02] Project beta-scap-eqiad build #218179: 15ABORTED in 20 min: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/218179/ [22:37:09] ^ me [22:38:06] thcipriani: Yeah, moving to wmf-config will be good, +1 :) [22:39:19] 10Phabricator, 10Tools: Publicly log account bans made using the phab-ban tool - https://phabricator.wikimedia.org/T200856 (10bd808) Logging could be done in a few ways: * The tool could keep track in a ToolsDB table and provide an interface to display the data * The tool could add a comment to a phabricator t... [22:44:25] 10Phabricator, 10Tools: Publicly log account bans made using the phab-ban tool - https://phabricator.wikimedia.org/T200856 (10matmarex) > The tool could add a comment to a phabricator task designated for tracking these actions Wikipedia managed that way for years. 
;) https://en.wikipedia.org/wiki/Wikipedia:Bl... [22:47:56] RECOVERY - Puppet errors on deployment-webperf12 is OK: OK: Less than 1.00% above the threshold [0.0] [22:51:24] thcipriani, how goes deploy01? [22:52:38] Krenair: I think it's good. I noticed that horizon config overriding scap::deployment_server (which is why I aborted beta-scap-eqiad a second ago) but scap3 deploys work, waiting on l10n to verify scap2 deployments. [22:53:02] cool [22:53:41] 10Beta-Cluster-Infrastructure, 10Discovery-Search, 10Elasticsearch: Puppet errors on deployment-elastic* instances - https://phabricator.wikimedia.org/T200842 (10Krenair) a:03Krenair [22:54:40] 10Beta-Cluster-Infrastructure, 10Discovery-Search, 10Elasticsearch: Puppet errors on deployment-elastic* instances - https://phabricator.wikimedia.org/T200842 (10Krenair) Now got it to: `Error: Could not retrieve catalog from remote server: Error 500 on SERVER: Server Error: Evaluation Error: Error while eva... [22:57:31] 10Beta-Cluster-Infrastructure, 10Discovery-Search, 10Elasticsearch: Puppet errors on deployment-elastic* instances - https://phabricator.wikimedia.org/T200842 (10Krenair) Cherry-pick is for https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/444610/ In particular: ```diff --git a/modules/profile/manifests... [23:01:40] what [23:01:45] gitroot@deployment-puppetmaster03:/var/lib/git/operations/puppet(production u+32)# git pull --rebase origin production [23:01:45] From https://gerrit.wikimedia.org/r/p/operations/puppet [23:01:45] * branch production -> FETCH_HEAD [23:01:45] error: cannot lock ref 'refs/remotes/origin/production': is at 8eb8343c0171b292d954f96695b35fc4f017ab42 but expected badaba00b64a132db01106ee129f10a8ce8abd0c [23:01:48] ! badaba00b6..8eb8343c01 production -> origin/production (unable to update local ref) [23:01:50] I've never seen this before [23:02:11] okay now it works [23:03:21] just lucky i guess? 
Krenair i don't have a moment just this moment, but chances are from the error message an extra hiera key needs to be added as `profile::elasticsearch::cirrus::certificate_name: %{::fqdn}` [23:03:43] yeah did that one [23:03:44] i must have missed the labs side, i wish puppet compiler could do those too.. [23:04:14] It appears to have an old broken cherry-pick of 'Switch elasticsearch to use tlsproxy module' [23:04:22] Should I replace it with the latest? [23:04:37] Krenair: hmm, yes that seems reasonable [23:08:47] hm [23:08:56] well puppet is happy, but [23:09:03] "An error has occurred while searching: We could not complete your search due to a temporary problem. Please try again later." on beta now [23:09:06] 10Continuous-Integration-Infrastructure: castor caching model seems to be broken for non-voting or coverage jobs - https://phabricator.wikimedia.org/T189077 (10hashar) [23:09:08] useful error messages [23:09:26] 10Phabricator, 10Tools: Publicly log account bans made using the phab-ban tool - https://phabricator.wikimedia.org/T200856 (10MarcoAurelio) Option 1 looks good. If too much work, what about a page on Wikitech, the same way the Server Admin Logs work? A never-to-be-closed task is something I don't really like t... [23:10:11] 2018-07-31 23:10:03 [W2DsSwpEEj4AAAxIwWcAAAAL] deployment-mediawiki-07 enwiki 1.32.0-alpha CirrusSearch WARNING: Search backend error during full_text search for 'testing 123' after 3: unknown: Couldn't connect to host, Elasticsearch down? {"queryType":"full_text","tookMs":3,"query":"testing 123","limit":21,"suggestion":"","syntax":["full_text","full_text_simple_match"],"error_message":"unknown: Couldn't connect to host, Elasticsearch down?"} [23:11:44] so what host is it trying to connect to exactly [23:11:46] Krenair: heh, checking [23:12:43] Krenair: i typoed the port in the puppet patch, it has 9423 and should be 9243 .. 
.sec [23:12:48] oh hey I did notice this [23:12:52] - listen [::]:9243 default_server deferred backlog=16384 reuseport ipv6only=on fastopen=150 ssl http2; [23:12:53] - listen 9243 default_server deferred backlog=16384 reuseport fastopen=150 ssl http2; [23:12:53] + listen [::]:9423 default_server deferred backlog=16384 reuseport ipv6only=on fastopen=150 ssl http2; [23:12:56] + listen 9423 default_server deferred backlog=16384 reuseport fastopen=150 ssl http2; [23:13:00] yeah [23:13:00] PROBLEM - Puppet errors on deployment-mira is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [23:14:22] alright that's better [23:14:27] search works again [23:14:39] great! [23:16:01] thanks ebernhardson [23:17:45] RECOVERY - Puppet errors on deployment-elastic09 is OK: OK: Less than 1.00% above the threshold [0.0] [23:18:04] PROBLEM - Puppet errors on deployment-tin is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [23:18:04] RECOVERY - Puppet errors on deployment-elastic06 is OK: OK: Less than 1.00% above the threshold [0.0] [23:18:28] ebernhardson, I've fixed your change in gerrit [23:19:05] 10Beta-Cluster-Infrastructure, 10Discovery-Search, 10Elasticsearch: Puppet errors on deployment-elastic* instances - https://phabricator.wikimedia.org/T200842 (10Krenair) 05Open>03Resolved We replaced it with the latest version of the change, applied it and puppet was happy but search broke. Fixed the pa... 
[23:19:26] RECOVERY - Puppet errors on deployment-elastic07 is OK: OK: Less than 1.00% above the threshold [0.0] [23:19:27] RECOVERY - Puppet errors on deployment-elastic05 is OK: OK: Less than 1.00% above the threshold [0.0] [23:19:43] Krenair: thanks [23:19:49] np [23:21:57] PROBLEM - Free space - all mounts on deployment-tin is CRITICAL: CRITICAL: deployment-prep.deployment-tin.diskspace._mnt.byte_percentfree (No valid datapoints found)deployment-prep.deployment-tin.diskspace._srv.byte_percentfree (<10.00%) [23:31:15] so puppet on -tin and -mira has started failing [23:31:16] Error: Could not retrieve catalog from remote server: Error 500 on SERVER: Server Error: Invalid relationship: Exec[PHP module mail enable] { subscribe => File[/etc/php5/mods-available/mail.ini] }, because File[/etc/php5/mods-available/mail.ini] doesn't seem to be in the catalog [23:34:03] https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/449219/4/modules/mediawiki/manifests/php.pp is looking suspicious [23:35:23] I guess that removed jessie support [23:42:33] `/go oper [23:42:39] bah [23:43:01] 10Beta-Cluster-Infrastructure, 10Release-Engineering-Team, 10Operations, 10Jenkins, 10Patch-For-Review: Upgrade deployment-prep deployment servers to stretch - https://phabricator.wikimedia.org/T192561 (10Krenair) Since https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/449219/ got merged, puppet has...