[01:31:30] (03CR) 10Krinkle: "We usually use the 'archived' state rather than removal, but I've forgotten why.." [integration/config] - 10https://gerrit.wikimedia.org/r/441879 (https://phabricator.wikimedia.org/T188377) (owner: 10Elukey) [01:36:20] (03CR) 10Krinkle: "+1 (Would prefer for .quibble-ci.json or .config/quibble-ci.yaml)" [integration/config] - 10https://gerrit.wikimedia.org/r/441984 (https://phabricator.wikimedia.org/T196960) (owner: 10Hashar) [02:31:34] who is responsible for the phabricator account approval queue? since I'm an admin, I'm getting emails [02:32:28] there's unapproved accounts going back to Saturday, is there a reason for leaving them in the queue instead of disabling them? [02:33:12] I mean disable Brianamagana678 and pawangupta [02:33:36] either approve or disable TestforT197550 and TestforT197550again, whatever works [02:34:25] then there's another 33 users in the queue after that, who apparently haven't been reviewed [02:36:39] maybe andre__ , twentyafterfour have an opinion? [02:37:00] I would clear it myself but it's unclear what the criteria are, you don't get much information [02:51:51] eh, well if everyone is out for the night then I'll just do it [03:02:19] It’s morning here :) [03:03:23] * paladox has no idea why he is still up at 4am but I can see the blue sky outside now [04:45:02] 10Diffusion, 10GitHub-Mirrors, 10Repository-Admins: Mirroring mediawiki/core to GitHub from diffusion does not work - https://phabricator.wikimedia.org/T135494#4314014 (10demon) As I've said before, there's no need to push them? 
[05:17:26] ok, done [06:16:08] Hi everybody [06:16:27] I'd need some help for T197503 [06:16:27] T197503: Archive operations/puppet/varnishkafka repository - https://phabricator.wikimedia.org/T197503 [06:16:39] (and also similar ones that I am working on) [06:17:10] I filed a change for integration/config in https://gerrit.wikimedia.org/r/#/c/integration/config/+/441879/ (but not sure if it is correct) [06:17:22] and I have no idea how to delete a github mirror sync [06:43:09] PROBLEM - Free space - all mounts on deployment-tin is CRITICAL: CRITICAL: deployment-prep.deployment-tin.diskspace._mnt.byte_percentfree (No valid datapoints found)deployment-prep.deployment-tin.diskspace._srv.byte_percentfree (<11.11%) [07:08:12] RECOVERY - Free space - all mounts on deployment-tin is OK: OK: deployment-prep.deployment-tin.diskspace._mnt.byte_percentfree (No valid datapoints found) [07:15:29] elukey: usually you can simply disable a repository in phabricator in order to disable mirroring to github. I went ahead and did that for rOPVK and also archived the repository on github.com [07:15:37] I didn't go so far as to delete it on github [07:16:26] twentyafterfour: thanks! Just saw the update on the task.. would you have time to do the same for jmxtrans|kafkatee puppet repos? (I don't have permits) [07:31:19] (03PS1) 10Hashar: Migrate Wikidata.org extension to Quibble [integration/config] - 10https://gerrit.wikimedia.org/r/442024 (https://phabricator.wikimedia.org/T183512) [07:31:40] (03CR) 10Hashar: [C: 032] Migrate Wikidata.org extension to Quibble [integration/config] - 10https://gerrit.wikimedia.org/r/442024 (https://phabricator.wikimedia.org/T183512) (owner: 10Hashar) [07:32:03] elukey: sure, do you have the task ids? I can also give you perms. 
[07:33:00] (03Merged) 10jenkins-bot: Migrate Wikidata.org extension to Quibble [integration/config] - 10https://gerrit.wikimedia.org/r/442024 (https://phabricator.wikimedia.org/T183512) (owner: 10Hashar) [07:35:27] 10Continuous-Integration-Infrastructure (shipyard), 10Release-Engineering-Team (Kanban), 10releng-201718-q3, 10Epic, 10Patch-For-Review: [EPIC] Migrate Mediawiki jobs from Nodepool to Docker - https://phabricator.wikimedia.org/T183512#4314247 (10hashar) [07:35:31] 10Continuous-Integration-Infrastructure (shipyard), 10MediaWiki-Configuration, 10Wikidata, 10Wikidata.org, 10Patch-For-Review: [Wikidata.org] AutoLoaderStructureTest::testPSR4Completeness fails - https://phabricator.wikimedia.org/T198077#4314245 (10hashar) 05Open>03Resolved a:03Lucas_Werkmeister_WMDE [07:35:34] 10Continuous-Integration-Infrastructure (shipyard), 10Release-Engineering-Team (Kanban), 10releng-201718-q3, 10Epic, 10Patch-For-Review: [EPIC] Migrate Mediawiki jobs from Nodepool to Docker - https://phabricator.wikimedia.org/T183512#4260773 (10hashar) [07:35:45] twentyafterfour: the ids are the subtasks of the parent T188377, perms also work fine! [07:35:45] T188377: Import some Analytics git puppet submodules to operations/puppet - https://phabricator.wikimedia.org/T188377 [07:35:56] I am trying to do the proper clean ups :) [07:36:19] elukey: you are now a member of repository-admins but I went ahead and deactivated those two in phab [07:36:25] thanksssss [07:36:43] to do so in the future you just go to 'manage repository' and choose deactivate from the 'actions' dropdown [07:36:48] ack [07:37:15] https://gerrit.wikimedia.org/r/#/c/integration/config/+/441879/ is fine to merge or completely wrong? (integration/config) [07:37:28] it is basically the last step remaining [07:38:41] elukey: looks good to me but I'm not that familiar with zuul. Should ask hashar to review that change I guess [07:39:29] all right thanks! 
[07:40:45] elukey: looks like there is a conduit method for archiving projects, we might be able to somewhat automate this process [07:40:52] er archiving repos I mean [07:41:06] (there is an api for both, actually) [07:41:29] I wasn't aware of the workflow until Timo showed it to me, I completely agree that automating it would be great [07:42:02] conduit couldn't be much easier to use, it's very straightforward: post some JSON to a URL endpoint [07:42:33] the gerrit part can probably be automated too since it's also got a nice rest api [07:45:14] twentyafterfour: elukey I am there [07:45:41] elukey: those modules are gone, are they? [07:45:59] hashar: o/ [07:46:01] elukey: if they are legacy we can archive them [07:46:14] they have been merged to operations/puppet [07:46:31] cool, I will file an archiving task :D [07:46:41] I have it! [07:46:48] subtasks of T188377 [07:46:48] T188377: Import some Analytics git puppet submodules to operations/puppet - https://phabricator.wikimedia.org/T188377 [07:46:56] I am following the list :) [07:47:06] https://gerrit.wikimedia.org/r/#/c/integration/config/+/441879/ is next (integration/config) [07:52:09] elukey: filed https://phabricator.wikimedia.org/T198170 :] [07:52:25] elukey: could you amend https://gerrit.wikimedia.org/r/#/c/integration/config/+/441879/1/zuul/layout.yaml and replace the templates puppet-module / tox-docker with 'archived'? [07:52:26] eg: [07:52:27] template: [07:52:31] - name: archived [07:52:40] then link to T198170 :] [07:52:41] T198170: Archive the puppet modules jmxtrans kafkatee varnishkafka - https://phabricator.wikimedia.org/T198170 [07:53:39] ack! [07:59:43] (03PS1) 10Elukey: Archive the puppet modules jmxtrans, varnishkafka and kafkatee [integration/config] - 10https://gerrit.wikimedia.org/r/442026 (https://phabricator.wikimedia.org/T198170) [08:00:08] hashar: https://gerrit.wikimedia.org/r/#/c/integration/config/+/442026/1/zuul/layout.yaml ? 
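[Editor's note] The Conduit automation discussed above can be sketched roughly as follows. The method name `diffusion.repository.edit` and the status transaction come from Conduit's API console; the host, token, and callsign below are placeholders, and this sketch has not been run against a live Phabricator install.

```python
# Rough sketch of deactivating ("archiving") a Diffusion repository through
# Phabricator's Conduit API. Conduit is, as noted above, just JSON-over-POST.
import json
from urllib.parse import urlencode
from urllib.request import urlopen

PHAB = "https://phabricator.wikimedia.org"

def build_deactivate_request(repo_identifier, api_token):
    """Build (url, form body) that marks a repository inactive via Conduit."""
    params = {
        "api.token": api_token,
        "objectIdentifier": repo_identifier,  # monogram/callsign/PHID, e.g. "rOPVK"
        # Conduit's generated curl examples use PHP-style array keys
        # in ordinary form encoding for transaction lists.
        "transactions[0][type]": "status",
        "transactions[0][value]": "inactive",
    }
    return PHAB + "/api/diffusion.repository.edit", urlencode(params)

def deactivate_repository(repo_identifier, api_token):
    url, body = build_deactivate_request(repo_identifier, api_token)
    with urlopen(url, data=body.encode()) as resp:  # data= makes this a POST
        return json.load(resp)
```

The gerrit half mentioned above could be driven the same way through Gerrit's REST API, but that part is not sketched here.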
[08:00:24] elukey: yup :] [08:00:40] (03CR) 10Hashar: [C: 032] Archive the puppet modules jmxtrans, varnishkafka and kafkatee [integration/config] - 10https://gerrit.wikimedia.org/r/442026 (https://phabricator.wikimedia.org/T198170) (owner: 10Elukey) [08:00:40] I noticed that there are other modules that have been merged, like the mariadb one [08:00:54] not even sure about wikimetrics, kafka, etc.. [08:01:00] (those sound analytics though :P) [08:01:58] (03Merged) 10jenkins-bot: Archive the puppet modules jmxtrans, varnishkafka and kafkatee [integration/config] - 10https://gerrit.wikimedia.org/r/442026 (https://phabricator.wikimedia.org/T198170) (owner: 10Elukey) [08:03:15] elukey: https://phabricator.wikimedia.org/maniphest/task/edit/form/33/ (if you have access) gives the prefilled form [08:03:36] it lists all the cleanup actions needed [08:03:40] I don't :( [08:03:45] ;D [08:04:22] join the cleanup project to get access to the form [08:05:45] do I want to join the cleanup project? :P [08:05:50] kidding, doing it [08:07:21] (03Abandoned) 10Elukey: Remove puppet submodules merged into operations/puppet [integration/config] - 10https://gerrit.wikimedia.org/r/441879 (https://phabricator.wikimedia.org/T188377) (owner: 10Elukey) [08:10:14] done :) [08:12:16] 10Continuous-Integration-Infrastructure (shipyard), 10Release-Engineering-Team (Kanban), 10releng-201718-q3, 10Epic, 10Patch-For-Review: [EPIC] Migrate Mediawiki jobs from Nodepool to Docker - https://phabricator.wikimedia.org/T183512#4314382 (10hashar) [08:12:28] 10Continuous-Integration-Infrastructure (shipyard), 10Release-Engineering-Team (Kanban), 10releng-201718-q3, 10Epic, 10Patch-For-Review: [EPIC] Migrate Mediawiki jobs from Nodepool to Docker - https://phabricator.wikimedia.org/T183512#4260894 (10hashar) [08:43:19] 10Continuous-Integration-Infrastructure (shipyard), 10Release-Engineering-Team (Kanban), 10releng-201718-q3, 10Epic, 10Patch-For-Review: [EPIC] Migrate Mediawiki jobs from Nodepool to Docker - 
https://phabricator.wikimedia.org/T183512#4314440 (10hashar) [08:43:30] 10Continuous-Integration-Infrastructure (shipyard), 10Release-Engineering-Team (Kanban), 10releng-201718-q3, 10Epic, 10Patch-For-Review: [EPIC] Migrate Mediawiki jobs from Nodepool to Docker - https://phabricator.wikimedia.org/T183512#4260911 (10hashar) [08:59:41] 10Release-Engineering-Team (Watching / External), 10DBA, 10Datasets-General-or-Unknown, 10Patch-For-Review, and 2 others: Automate the check and fix of object, schema and data drifts between mediawiki HEAD, production masters and slaves - https://phabricator.wikimedia.org/T104459#4314461 (10jcrespo) I ment... [09:24:19] PROBLEM - Apertium APY on deployment-apertium02 is CRITICAL: Connection refused [09:34:08] PROBLEM - Apertium APY on deployment-sca02 is CRITICAL: Connection refused [09:42:47] PROBLEM - Apertium APY on deployment-sca01 is CRITICAL: Connection refused [10:09:07] 10Scap (Scap3-MediaWiki-MVP), 10Operations, 10Wikimedia-Incident: Scap sync --restart not working - https://phabricator.wikimedia.org/T198185#4314698 (10mobrovac) p:05Triage>03Unbreak! [10:52:00] 10Release-Engineering-Team (Watching / External), 10DBA, 10Datasets-General-or-Unknown, 10Patch-For-Review, and 2 others: Automate the check and fix of object, schema and data drifts between mediawiki HEAD, production masters and slaves - https://phabricator.wikimedia.org/T104459#4314828 (10Ladsgroup) Rega... [10:58:30] 10MediaWiki-Releasing, 10MW-1.31-release: MediaWiki 1.31 Release notes states that THIS IS NOT A RELEASE YET! 
- https://phabricator.wikimedia.org/T198180#4314856 (10Aklapper) [11:03:03] (03PS1) 10Hashar: Migrate PropertySuggester to Quibble [integration/config] - 10https://gerrit.wikimedia.org/r/442069 (https://phabricator.wikimedia.org/T183512) [11:04:52] (03PS1) 10Hashar: Migrate Popups to Quibble [integration/config] - 10https://gerrit.wikimedia.org/r/442070 (https://phabricator.wikimedia.org/T183512) [11:05:23] (03CR) 10Hashar: [C: 032] Migrate PropertySuggester to Quibble [integration/config] - 10https://gerrit.wikimedia.org/r/442069 (https://phabricator.wikimedia.org/T183512) (owner: 10Hashar) [11:05:28] (03CR) 10Hashar: [C: 032] Migrate Popups to Quibble [integration/config] - 10https://gerrit.wikimedia.org/r/442070 (https://phabricator.wikimedia.org/T183512) (owner: 10Hashar) [11:05:58] 10Continuous-Integration-Infrastructure (shipyard), 10Release-Engineering-Team (Kanban), 10releng-201718-q3, 10Epic, 10Patch-For-Review: [EPIC] Migrate Mediawiki jobs from Nodepool to Docker - https://phabricator.wikimedia.org/T183512#4314876 (10hashar) [11:06:39] (03Merged) 10jenkins-bot: Migrate PropertySuggester to Quibble [integration/config] - 10https://gerrit.wikimedia.org/r/442069 (https://phabricator.wikimedia.org/T183512) (owner: 10Hashar) [11:06:48] (03Merged) 10jenkins-bot: Migrate Popups to Quibble [integration/config] - 10https://gerrit.wikimedia.org/r/442070 (https://phabricator.wikimedia.org/T183512) (owner: 10Hashar) [11:22:47] 10Continuous-Integration-Infrastructure (shipyard), 10Release-Engineering-Team (Kanban), 10releng-201718-q3, 10Epic, 10Patch-For-Review: [EPIC] Migrate Mediawiki jobs from Nodepool to Docker - https://phabricator.wikimedia.org/T183512#4314935 (10hashar) [12:12:54] 10Continuous-Integration-Infrastructure (shipyard), 10Release-Engineering-Team (Kanban), 10releng-201718-q3, 10Epic, 10Patch-For-Review: [EPIC] Migrate Mediawiki jobs from Nodepool to Docker - https://phabricator.wikimedia.org/T183512#4315119 (10hashar) [12:13:09] 
10Continuous-Integration-Infrastructure (shipyard), 10Release-Engineering-Team (Kanban), 10releng-201718-q3, 10Epic, 10Patch-For-Review: [EPIC] Migrate Mediawiki jobs from Nodepool to Docker - https://phabricator.wikimedia.org/T183512#4263152 (10hashar) [12:41:30] hashar: lmk if I can do anything to help, thanks for the CI tweaks! [12:41:59] awight: thanks :] [12:42:19] awight: yeah I found out ORES defaults to pointing to ores.wikimedia.org, and I guess it always requires an ORES service to be available [12:42:32] so I am just crafting a hack to skip the selenium tests when being run in jenkins [12:43:25] hashar: We have ORES services running in labs and beta cluster, if either of those would satisfy the network architecture? [12:43:32] http://ores-beta.wmflabs.org/ [12:43:38] https://ores-staging.wmflabs.org/ [12:44:04] awight: unlikely [12:44:18] that is for selenium tests being triggered when a patch is sent to gerrit [12:44:38] the job spawns a docker instance that has mediawiki installed + ORES and exposed at http://localhost/ [12:44:45] the selenium tests are then run against that local instance [12:45:18] We have a dockerfile now, but I don't think it's configured in CI yet... [12:45:45] yeah [12:45:55] It's fine to skip the tests temporarily. We'll put the effort in to make them work in the long run, though. [12:46:01] I will comment on the patch when it is ready [12:46:05] just testing it out for now [12:46:06] Thanks! 
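[Editor's note] The "skip when run in jenkins" guard hashar describes could look roughly like this. This is a hypothetical sketch in Python for illustration (the real MediaWiki selenium tests are Node.js/WebdriverIO); the environment variable names are the conventional ones set by Jenkins (`JENKINS_URL`) and Zuul-triggered jobs (`ZUUL_PROJECT`), not anything confirmed from the actual patch.

```python
# Hypothetical model of skipping ORES-dependent browser tests under CI,
# where no ORES service is reachable from the job's container.
import os
import unittest

def running_in_ci(environ=os.environ):
    # JENKINS_URL is exported by Jenkins itself; ZUUL_PROJECT by Zuul jobs.
    return bool(environ.get("JENKINS_URL") or environ.get("ZUUL_PROJECT"))

class OresSeleniumTest(unittest.TestCase):
    def setUp(self):
        if running_in_ci():
            self.skipTest("no ORES service available inside the CI container")

    def test_scores_shown(self):
        # ...would drive a browser against a wiki with ORES configured...
        pass
```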
[13:00:41] awight: yeah turns out to be a bit more complicated :] mediawiki/core has its own browser test which end up triggering some ORES hook and thus the tests fail as well [13:00:47] * hashar heading to SWAT [13:03:32] oooh rats [13:12:30] 10Release-Engineering-Team, 10MediaWiki-General-or-Unknown, 10MW-1.29-release: Formalise and Announce REL1_29 EOL - https://phabricator.wikimedia.org/T197669#4299300 (10Kghbln) > There is also the question whether we should (answer is probably) do a 1.29.3 release for maintenance reasons - https://github.com... [13:14:18] PROBLEM - Free space - all mounts on integration-slave-docker-1006 is CRITICAL: CRITICAL: integration.integration-slave-docker-1006.diskspace.root.byte_percentfree (<22.22%) [13:18:14] !log cleaned containers on integration-slave-docker-1006 [13:18:16] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [13:24:18] RECOVERY - Free space - all mounts on integration-slave-docker-1006 is OK: OK: All targets OK [13:26:34] 10MediaWiki-Releasing, 10MW-1.31-release: MediaWiki 1.31 Release notes states that THIS IS NOT A RELEASE YET! - https://phabricator.wikimedia.org/T198180#4314576 (10Kghbln) The issue is that we link to the master version of the "RELEASE-NOTES-1.31" file all over MediaWiki.org so the end user will always see "T... 
[13:27:35] 10Continuous-Integration-Config, 10MediaWiki-extensions-Translate: Translate mw extension tests failing for branch REL1_31 - https://phabricator.wikimedia.org/T198110#4315417 (10Paladox) p:05Triage>03High [13:27:52] 10Continuous-Integration-Config, 10MediaWiki-extensions-Translate: Translate mw extension tests failing for branch REL1_31 - https://phabricator.wikimedia.org/T198110#4311999 (10Paladox) Similar to T197933 [13:27:58] 10Continuous-Integration-Config: CI is broken for TimedMediaHandler REL branches - https://phabricator.wikimedia.org/T197933#4307243 (10Paladox) p:05Triage>03High [13:31:37] hashar oh, i wonder if this https://github.com/wikimedia/integration-config/blob/master/jjb/mediawiki.yaml#L222 is why extensions don't get their deps? [13:37:42] hmm the master branch runs wmf-quibble-vendor-mysql-hhvm-docker [13:37:43] but the rel branches run mediawiki-extensions-* tests [13:44:02] yeah wmf-quibble is the replacement for wmf and master branch [13:44:13] and I have to add a new job release-quibble for master and REL branches [13:44:18] found it [13:44:19] https://github.com/wikimedia/integration-config/blob/master/zuul/parameter_functions.py#L500 [13:44:22] hashar ^^ [13:44:36] yeah that is the function that sets all the dependencies [13:44:42] it is no longer being applied for the release branches [13:45:12] yeh [13:45:12] anyway for REL branches, we should no longer trigger mediawiki-extensions-* jobs [13:45:17] yeh [13:46:31] hashar looks to have been caused by https://github.com/wikimedia/integration-config/commit/bc1dfe4571af83978b76c8a54617a1ecb2cfd97a [13:48:07] hashar i think we should remove that if check just for now [13:48:16] until we are ready to remove mediawiki-extensions [13:48:20] PROBLEM - Free space - all mounts on integration-slave-docker-1001 is CRITICAL: CRITICAL: integration.integration-slave-docker-1001.diskspace.root.byte_percentfree (<55.56%) [13:48:58] (03PS1) 10Paladox: Fix setting extension deps for mediawiki-extensions-* 
tests [integration/config] - 10https://gerrit.wikimedia.org/r/442103 [13:49:25] (03PS2) 10Paladox: Fix setting extension deps for mediawiki-extensions-* tests [integration/config] - 10https://gerrit.wikimedia.org/r/442103 [13:50:23] (03PS3) 10Paladox: Fix setting extension deps for mediawiki-extensions-* tests [integration/config] - 10https://gerrit.wikimedia.org/r/442103 (https://phabricator.wikimedia.org/T197933) [13:50:24] hashar ^^ [13:50:34] (03CR) 10Paladox: "This change is ready for review." [integration/config] - 10https://gerrit.wikimedia.org/r/442103 (https://phabricator.wikimedia.org/T197933) (owner: 10Paladox) [13:51:50] (03CR) 10jerkins-bot: [V: 04-1] Fix setting extension deps for mediawiki-extensions-* tests [integration/config] - 10https://gerrit.wikimedia.org/r/442103 (https://phabricator.wikimedia.org/T197933) (owner: 10Paladox) [13:52:38] 10Continuous-Integration-Infrastructure (shipyard), 10Release-Engineering-Team (Kanban), 10releng-201718-q3, 10Epic, 10Patch-For-Review: [EPIC] Migrate Mediawiki jobs from Nodepool to Docker - https://phabricator.wikimedia.org/T183512#4315553 (10hashar) [13:53:07] (03PS4) 10Paladox: Fix setting extension deps for mediawiki-extensions-* tests [integration/config] - 10https://gerrit.wikimedia.org/r/442103 (https://phabricator.wikimedia.org/T197933) [13:54:32] (03CR) 10jerkins-bot: [V: 04-1] Fix setting extension deps for mediawiki-extensions-* tests [integration/config] - 10https://gerrit.wikimedia.org/r/442103 (https://phabricator.wikimedia.org/T197933) (owner: 10Paladox) [13:55:21] (03PS5) 10Paladox: Fix setting extension deps for mediawiki-extensions-* tests [integration/config] - 10https://gerrit.wikimedia.org/r/442103 (https://phabricator.wikimedia.org/T197933) [13:56:46] (03CR) 10jerkins-bot: [V: 04-1] Fix setting extension deps for mediawiki-extensions-* tests [integration/config] - 10https://gerrit.wikimedia.org/r/442103 (https://phabricator.wikimedia.org/T197933) (owner: 10Paladox) [13:58:36] (03PS6) 
10Paladox: Fix setting extension deps for mediawiki-extensions-* tests [integration/config] - 10https://gerrit.wikimedia.org/r/442103 (https://phabricator.wikimedia.org/T197933) [14:03:20] RECOVERY - Free space - all mounts on integration-slave-docker-1001 is OK: OK: All targets OK [14:40:38] 10Beta-Cluster-Infrastructure, 10MediaWiki-extensions-Translate: [betacluster] "Uncaught TypeError: Cannot read property 'changeSettings' of null" when clicking on any option of row tux-message-selector - https://phabricator.wikimedia.org/T185038#4315702 (10Nikerabbit) [14:43:57] hashar: Should package.json be optional for extension repos? It seems mwgate-node-docker is failing for an extension branch without package.json - https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/PCRGUIInserts/+/442114/ [14:44:16] It should be optional, right? [14:47:20] 10Continuous-Integration-Config, 10Mobile-Content-Service, 10Patch-For-Review, 10Reading-Infrastructure-Team-Backlog (Kanban): Create a CI task for MCS periodic tests - https://phabricator.wikimedia.org/T177896#4315729 (10Mholloway) [14:47:28] 10Continuous-Integration-Config, 10Mobile-Content-Service, 10Reading-Infrastructure-Team-Backlog (Kanban): Create a CI task for MCS periodic tests - https://phabricator.wikimedia.org/T177896#3674205 (10Mholloway) [14:47:53] 10Continuous-Integration-Config, 10Mobile-Content-Service, 10Reading-Infrastructure-Team-Backlog (Kanban): Create a CI task for MCS periodic tests - https://phabricator.wikimedia.org/T177896#3674205 (10Mholloway) a:05Mholloway>03None [14:50:15] 10Continuous-Integration-Config, 10Mobile-Content-Service, 10Reading-Infrastructure-Team-Backlog (Kanban): Create a CI task for MCS periodic tests - https://phabricator.wikimedia.org/T177896#4315739 (10Mholloway) [15:08:33] 10Release-Engineering-Team (Kanban), 10MediaWiki-Core-Tests, 10MediaWiki-User-preferences, 10Wikimedia-log-errors (Shared Build Failure): Selenium "User should be able to change preferences" test flaky - 
https://phabricator.wikimedia.org/T198137#4315773 (10Krinkle) [15:11:03] !log Changing integration-slave-docker-1012 in jenkins m1executor -> m4executor. It is a m1.medium instance and can thus run mediawiki jobs [15:11:05] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [15:11:06] (03PS1) 10Mholloway: Run mobileapps-periodic-test in npm-test docker image [integration/config] - 10https://gerrit.wikimedia.org/r/442126 (https://phabricator.wikimedia.org/T177896) [15:11:24] !log Deleting integration-slave-docker-1013 integration-slave-docker-1014 and integration-slave-docker-1015 . Recreating them as m1.medium instances [15:11:26] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [15:11:39] 10Continuous-Integration-Config, 10Mobile-Content-Service, 10Patch-For-Review, 10Reading-Infrastructure-Team-Backlog (Kanban): Create a CI task for MCS periodic tests - https://phabricator.wikimedia.org/T177896#4315782 (10Mholloway) [15:12:08] PROBLEM - Host integration-slave-docker-1013 is DOWN: CRITICAL - Host Unreachable (10.68.23.152) [15:13:06] PROBLEM - Host integration-slave-docker-1014 is DOWN: CRITICAL - Host Unreachable (10.68.19.202) [15:13:50] PROBLEM - Host integration-slave-docker-1015 is DOWN: CRITICAL - Host Unreachable (10.68.22.252) [15:16:13] ^^I have deleted them [15:23:48] RECOVERY - Host integration-slave-docker-1015 is UP: PING OK - Packet loss = 0%, RTA = 0.78 ms [15:24:20] RECOVERY - Host integration-slave-docker-1013 is UP: PING OK - Packet loss = 0%, RTA = 5.70 ms [15:27:07] RECOVERY - Host integration-slave-docker-1014 is UP: PING OK - Packet loss = 0%, RTA = 3.81 ms [15:27:59] RECOVERY - SSH on integration-slave-docker-1014 is OK: SSH OK - OpenSSH_6.7p1 Debian-5+deb8u4 (protocol 2.0) [15:29:19] (03PS1) 10Hashar: Migrate PCRGUIInserts to Quibble [integration/config] - 10https://gerrit.wikimedia.org/r/442128 (https://phabricator.wikimedia.org/T183512) [15:31:42] (03CR) 10Hashar: [C: 032] 
Migrate PCRGUIInserts to Quibble [integration/config] - 10https://gerrit.wikimedia.org/r/442128 (https://phabricator.wikimedia.org/T183512) (owner: 10Hashar) [15:32:58] !log repooling integration-slave-docker-1013 integration-slave-docker-1014 and integration-slave-docker-1015 (converted to m1.medium instances) [15:33:00] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [15:33:01] (03Merged) 10jenkins-bot: Migrate PCRGUIInserts to Quibble [integration/config] - 10https://gerrit.wikimedia.org/r/442128 (https://phabricator.wikimedia.org/T183512) (owner: 10Hashar) [15:47:30] 10Continuous-Integration-Infrastructure (shipyard), 10Release-Engineering-Team (Kanban), 10releng-201718-q3, 10Epic, 10Patch-For-Review: [EPIC] Migrate Mediawiki jobs from Nodepool to Docker - https://phabricator.wikimedia.org/T183512#4315904 (10hashar) [16:16:34] 10MediaWiki-Releasing, 10Analytics: Create dashboard showing MediaWiki tarball download statistics - https://phabricator.wikimedia.org/T119772#4316069 (10Nuria) a:05Nuria>03None [16:25:25] 10Continuous-Integration-Infrastructure (shipyard), 10Release-Engineering-Team (Kanban), 10releng-201718-q3, 10Epic, 10Patch-For-Review: [EPIC] Migrate Mediawiki jobs from Nodepool to Docker - https://phabricator.wikimedia.org/T183512#4316166 (10hashar) [16:25:37] 10Continuous-Integration-Infrastructure (shipyard), 10Release-Engineering-Team (Kanban), 10releng-201718-q3, 10Epic, 10Patch-For-Review: [EPIC] Migrate Mediawiki jobs from Nodepool to Docker - https://phabricator.wikimedia.org/T183512#4278609 (10hashar) [16:36:44] 10Release-Engineering-Team (Next), 10Maps-Sprint, 10Repository-Admins, 10Maps (Tilerator): Setup diffusion and github sync for kartotherian and tilerator package repositories - https://phabricator.wikimedia.org/T182848#4316235 (10Jhernandez) 05Open>03Resolved a:03Jhernandez [17:08:10] (03CR) 10Jforrester: "This looks sane, but I'm not a Docker expert." 
[integration/config] - 10https://gerrit.wikimedia.org/r/442126 (https://phabricator.wikimedia.org/T177896) (owner: 10Mholloway) [18:17:00] 10Release-Engineering-Team (Kanban), 10Release, 10Train Deployments: 1.32.0-wmf.10 deployment blockers - https://phabricator.wikimedia.org/T191056#4316774 (10dduvall) [18:18:34] 10Release-Engineering-Team (Kanban), 10Release, 10Train Deployments: 1.32.0-wmf.10 deployment blockers - https://phabricator.wikimedia.org/T191056#4092272 (10dduvall) Trace of l10update failure (see related task and SAL message): ``` 17:59:05 Started scap: testwiki to php-1.32.0-wmf.10 and rebuild l10n cach... [19:13:43] 10MediaWiki-Releasing, 10MW-1.31-release: MediaWiki 1.31 Release notes states that THIS IS NOT A RELEASE YET! - https://phabricator.wikimedia.org/T198180#4314576 (10Lziobro) The phrase was removed by 5cfc9accca2c but added back by e7f2209acf7b. [19:17:47] 10MediaWiki-Releasing, 10MW-1.31-release: MediaWiki 1.31 Release notes states that THIS IS NOT A RELEASE YET! - https://phabricator.wikimedia.org/T198180#4317132 (10Jdforrester-WMF) 05Open>03Invalid It is correct that REL1_31 is not a release yet (it will eventually be branched as MW 1.31.1, but right now... [19:46:59] 10Release-Engineering-Team (Kanban), 10Patch-For-Review, 10Release, 10Train Deployments: 1.32.0-wmf.10 deployment blockers - https://phabricator.wikimedia.org/T191056#4317253 (10dduvall) Now cdb-rebuild failed on mwdebug2002 due to lack of disk space. ``` 19:40:01 Failure processing (u'/srv/mediawiki/php-... [19:49:20] Krinkle: James_F: fun finding, karma-qunit does not seem to handle QUnit.todo() properly :] [19:49:51] hasharAway: Hm.. details? [19:50:01] method undefined? [19:50:07] TypeError being thrown [19:50:21] OK. That's a bug, and should be easy to fix. [19:50:26] Where is it happening? mw core master? 
[19:50:33] the test itself is fine but the code is wrong so I thought about using QUnit.skip() [19:50:50] when running tests in the web browser with the qunit ui being shown, the test is a success and it is prefixed with [TODO] [19:51:06] but in karma-qunit, it never handles the fact the test is a todo [19:51:25] name: 've.ui.TableWidget', module: 'ext.graph.visualEditor', skipped: false, todo: true, failed: 1, [19:51:34] so skipped: false && failed: 1 ==> error [19:51:41] but given todo: true, it should not be an error [19:52:02] Krinkle: I am just ranting. I will file an issue with karma whenever I have found their latest source code [19:52:10] hashar: hold on, this might not be a bug. [19:52:27] Is the error that QUnit.todo() doesn't exist, or do you think karma-qunit handles it incorrectly? [19:52:44] Please note that todo() does not skip it, instead, QUnit.todo() asserts that at least one assertion is failing. [19:52:50] the latter. QUnit.todo() does exist and works as expected [19:52:52] It's like asserting the inverse. [19:52:53] the test is run as expected [19:53:06] the result is correct (one failure, it is not skipped, it is a todo) [19:53:10] but karma-qunit mishandles it [19:53:25] Ah, it is seeing the failure and making the build fail, forgetting to look at obj.todo. [19:53:30] yup [19:53:46] And you're sure this is not a case of the unit test passing and qunit informing failure:1 because there are no failures? [19:54:01] Krinkle: https://github.com/karma-runner/karma-qunit/blob/master/src/adapter.js#L101-L116 [19:54:34] since my javascript debugging skill === undefined, I went with a console.log( test ); [19:54:53] I understand, I'm just not 100% sure that handling 'todo' is part of the reporter's responsibility. I would expect QUnit to handle this inversion internally, so that failed: 1, only if there is a real problem. 
[19:55:11] ah yeah that could be done that way as well [19:55:18] the thing is [19:55:33] if it is a todo and there is no failure, the reporter should mark it as being a failure [19:55:45] since QUnit.todo() expects at least one assertion to fail [19:55:53] Rephrased: Either QUnit is exposing the internal failure and making the reporter responsible to distinguish todo from normal, or QUnit is handling it. In the first case, the bug is in karma-qunit. In the second case, the test will fail both with Karma and normally on Special:JSTest, and means the bug is in VE. [19:56:16] bug = using todo() for a test that is passing. [19:56:22] (maybe) [19:56:22] I have confirmed it is all green on Special:JavascriptTest with a nice [todo] label [19:56:28] Okay. [19:56:36] and there is definitely a failure being reported [19:56:41] Perfect. [19:56:59] hey [19:57:05] if you file a bug, I'll make sure to get qunitjs/team on it. We may end up agreeing that this should not be handled by karma-qunit. [19:57:07] I am almost proud to have figured it out eventually! [19:57:12] navigating the qunit / karma code etc [19:57:42] At least, for me, I think it would be nice for the reporter/JSON api to already be processed so failure:1 = real failure, instead of the karma/tap/nodejs/whatever interface having to look for 'todo' and all that stuff. [19:57:53] But yeah, feel free to /cc @krinkle me once you have it. [19:58:57] Krinkle: https://phabricator.wikimedia.org/F22692049 :] [19:59:02] the web interface is all fine [19:59:17] Yes. I'm already thinking about how to solve it in karma/qunit. [19:59:26] Right now, this is a real upstream bug. No doubt. [19:59:44] 10MediaWiki-Releasing, 10MW-1.31-release: MediaWiki 1.31 Release notes states that THIS IS NOT A RELEASE YET! - https://phabricator.wikimedia.org/T198180#4317278 (10Kghbln) > To repeat myself from the last thread about this, you want https://raw.githubusercontent.com/wikimedia/mediawiki/1.31.0/RELEASE-NOTES-1.... 
[19:59:45] I'm just curious whether the fix should be for karma-qunit to support 'todo', or for qunitjs to not output failures:1. [19:59:51] Both are solutions :) [20:01:04] yeah I have no idea :] [20:01:33] 10Release-Engineering-Team (Kanban), 10Patch-For-Review, 10Release, 10Train Deployments: 1.32.0-wmf.10 deployment blockers - https://phabricator.wikimedia.org/T191056#4317283 (10dduvall) Since the sync succeeded other than cdb update on mwdeploy2002, I'm going to move ahead with group0 today. [20:01:42] one sure thing, the web ui of qunit handles it properly. So at least that is a good thing [20:02:16] Krinkle: thank you very much for the confirmation something weird is going on. I am going to fill a bug for karma-runner/karma-qunit :] [20:09:01] !log package upgrades on -sca01 to try to fix apertium stuff [20:09:03] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [20:12:47] RECOVERY - Apertium APY on deployment-sca01 is OK: HTTP OK: HTTP/1.1 200 OK - 5996 bytes in 0.014 second response time [20:13:00] PROBLEM - Puppet errors on deployment-sca01 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0] [20:14:16] RECOVERY - Apertium APY on deployment-apertium02 is OK: HTTP OK: HTTP/1.1 200 OK - 5996 bytes in 0.010 second response time [20:16:22] !log done the same on -sca02 and -apertium02 [20:16:24] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [20:19:01] RECOVERY - Apertium APY on deployment-sca02 is OK: HTTP OK: HTTP/1.1 200 OK - 5996 bytes in 0.007 second response time [20:24:50] 10Beta-Cluster-Infrastructure, 10Performance-Team, 10Patch-For-Review: Set up webperf node in Beta Cluster - https://phabricator.wikimedia.org/T195314#4317358 (10Krinkle) [20:24:55] 10Beta-Cluster-Infrastructure, 10Performance-Team, 10Patch-For-Review: Set up webperf node in Beta Cluster - https://phabricator.wikimedia.org/T195314#4222943 (10Krinkle) [20:26:36] RECOVERY - Puppet errors on 
deployment-sca01 is OK: OK: Less than 1.00% above the threshold [0.0] [20:36:11] 10MediaWiki-Releasing, 10MW-1.31-release: MediaWiki 1.31 Release notes states that THIS IS NOT A RELEASE YET! - https://phabricator.wikimedia.org/T198180#4317388 (10Kghbln) Quite a lot of links on the branch info pages to the release notes were broken due to different reasons so I just created a template [[ ht... [20:36:50] Krinkle: filed as https://github.com/karma-runner/karma-qunit/issues/111 :] I will use QUnit.skip() meanwhile [20:47:30] 10Beta-Cluster-Infrastructure, 10Performance-Team, 10Patch-For-Review: Set up webperf node in Beta Cluster - https://phabricator.wikimedia.org/T195314#4317397 (10Krinkle) I've confirmed that the `webperf::navtiming` service is also working as expected in the Beta Cluster. * Varnish VCL for `/beacon/event`.... [20:53:42] James_F: thank you :] [20:54:07] hashar: Just keep being awesome. :-) [20:54:30] 10Beta-Cluster-Infrastructure, 10Analytics: Disk usage on deployment-kafa-jumbo-* causing alerts - https://phabricator.wikimedia.org/T198262#4317436 (10Krenair) [20:54:40] James_F: I am trying my best! There is still one QUnit test failing for Graph though, I will try to hunt it down, else QUnit.skip() will get rid of it \o/ [20:56:01] * James_F grins. [20:57:08] 10Release-Engineering-Team (Kanban), 10Patch-For-Review, 10Release, 10Train Deployments: 1.32.0-wmf.10 deployment blockers - https://phabricator.wikimedia.org/T191056#4317451 (10dduvall) Manually cleaned up old deployments (wmf.3, wmf.4, wmf.5) to ensure enough disk space. I'm re-syncing now. [20:57:21] 10Beta-Cluster-Infrastructure, 10Analytics: Disk usage on deployment-kafa-jumbo-* causing alerts - https://phabricator.wikimedia.org/T198262#4317452 (10Krenair) /var/log/kafka is 1.2G on -2 ```root@deployment-kafka-jumbo-2:~# du -hsx /var/log/kafka/* | grep M 257M /var/log/kafka/controller.log.1 257M /var/log...
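Of the two fixes floated in the discussion, the karma-qunit side one (the reporter learning about 'todo') could look roughly like the sketch below. This is a hypothetical illustration only, not the actual karma-qunit reporter code; it assumes QUnit's testDone details expose `name`, `todo`, `failed`, and `skipped`.

```javascript
// Hypothetical todo-aware mapping (assumption, NOT real karma-qunit code)
// from a QUnit testDone-style payload to a karma-style result object.
function toKarmaResult(details) {
  const expectedToFail = !!details.todo;
  const hasFailures = details.failed > 0;
  return {
    description: details.name,
    skipped: !!details.skipped,
    // A todo test is "green" exactly while it still fails; a regular
    // test is green only when nothing failed.
    success: expectedToFail ? hasFailures : !hasFailures
  };
}

// With this mapping, a todo test whose assertions all pass would be
// surfaced to karma as a failure:
console.log(toKarmaResult({ name: 't', todo: true, failed: 0 }).success); // false
```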
[20:59:29] deployment-prep shinken now down to 6 items - one is a lingering graphite data point, two are kafka diskspace (opened a task for that), one is cherry-picks, one is puppet on -deploy01 (npm, have a task for that), and one is puppet on -cache-text04 (I need to figure that one out properly still) [21:30:17] uh, what's this doing here: [21:30:18] krenair@deployment-changeprop:~$ ls -lh /usr/local/bin/npm [21:30:18] lrwxrwxrwx 1 root staff 38 Aug 5 2016 /usr/local/bin/npm -> ../lib/node_modules/npm/bin/npm-cli.js [21:30:31] it doesn't come from the npm package [21:31:27] It doesn't? I have that locally (albeit pointed to /usr/local/lib/node_modules/…) [21:31:46] Maybe it's an old npm package that doesn't provide it and someone added it manually? [21:32:25] I'm guessing someone added it manually yeah [21:32:43] npm: [21:32:44] Installed: (none) [21:32:44] Candidate: 1.4.21+ds-2 [21:32:55] I know for a fact that page makes /usr/bin/npm rather than /usr/local/bin/npm [21:33:00] that package* [21:33:16] krenair@deployment-changeprop:~$ dpkg -S /usr/local/bin/npm [21:33:16] dpkg-query: no path found matching pattern /usr/local/bin/npm [21:33:36] Wow, 1.4?! [21:34:02] No wonder someone in desperation manually fixed things. [21:34:04] that's what everything else in beta runs [21:34:22] it comes from debian jessie [21:34:33] * James_F sighs.
[21:34:42] presumably prod too [21:34:51] 1.4.21+ds-2 0 [21:34:51] 500 http://http.debian.net/debian/ jessie/main amd64 Packages [21:35:40] npm is no longer packaged in apt [21:35:48] it is packaged by nodejs repo though [21:35:56] it's packaged for jessie [21:36:10] yeh i meant the latest release [21:36:19] right well deployment-changeprop doesn't run stretch [21:36:32] it has jessie [21:48:22] PROBLEM - Puppet errors on deployment-tin is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [22:14:37] 10Beta-Cluster-Infrastructure, 10Performance-Team, 10Patch-For-Review: Set up webperf node in Beta Cluster - https://phabricator.wikimedia.org/T195314#4317641 (10Krinkle) [22:15:52] 10Beta-Cluster-Infrastructure, 10Release-Engineering-Team, 10Operations, 10Patch-For-Review: Upgrade deployment-prep deployment servers to stretch - https://phabricator.wikimedia.org/T192561#4317643 (10Krenair) ```lang=diff,name=crappy cherry-picked hack to try to get npm installed and puppet happy diff --... [22:17:52] !log arming keyholder on deployment-deploy01 [22:17:54] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [22:19:28] okay why does the dumpsdeploy key have a passphrase and where can I find it [22:19:36] RECOVERY - Puppet errors on deployment-deploy01 is OK: OK: Less than 1.00% above the threshold [0.0] [22:21:32] found it at https://wikitech.wikimedia.org/wiki/Keyholder [22:24:25] 10Beta-Cluster-Infrastructure, 10Release-Engineering-Team, 10Operations, 10Patch-For-Review: Upgrade deployment-prep deployment servers to stretch - https://phabricator.wikimedia.org/T192561#4317653 (10Krenair) I've armed keyholder on the new host. So what next @thcipriani? 
[22:24:38] documented it in the normal place [22:28:24] RECOVERY - Puppet errors on deployment-tin is OK: OK: Less than 1.00% above the threshold [0.0] [22:40:43] not sure keyholder is working properly though [22:40:52] jenkins-deploy@deployment-deploy01:~$ SSH_AUTH_SOCK=/run/keyholder/proxy.sock ssh mwdeploy@deployment-mediawiki-07 [22:40:52] sign_and_send_pubkey: signing failed: agent refused operation [22:40:56] this is fine on deployment-tin [22:41:41] same group membership for jenkins-deploy... [22:42:38] 10Project-Admins: Requests for addition to the #acl*Project-Admins group (in comments) - https://phabricator.wikimedia.org/T706#4317685 (10MBinder_WMF) Request to add @JTannerWMF so that she can manage boards/sprints for #collaboration-team-triage [22:48:48] 10Release-Engineering-Team: jenkins-bot LDAP entry contains pmtpa references - https://phabricator.wikimedia.org/T198271#4317713 (10Krenair) [23:04:04] I think actually a better question is why does it work at all on deployment-tin [23:46:18] 10Scap, 10Operations, 10Wikimedia-Incident: Update Debian Package for Scap3 to 3.8.3-1 - https://phabricator.wikimedia.org/T198277#4317855 (10thcipriani) p:05Triage>03Unbreak! [23:52:19] PROBLEM - Puppet errors on deployment-restbase02 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [23:59:39] PROBLEM - Puppet errors on deployment-zotero01 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [23:59:59] PROBLEM - Puppet errors on deployment-logstash2 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]