[00:14:04] !log deleting instance integration-slave-docker-1029
[00:14:08] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[00:17:22] PROBLEM - Host integration-slave-docker-1029 is DOWN: CRITICAL - Host Unreachable (10.68.16.95)
[00:23:36] Continuous-Integration-Infrastructure, Release-Engineering-Team (Kanban), Patch-For-Review: Move CI docker storage engine to device mapper - https://phabricator.wikimedia.org/T203841 (dduvall) Configuring Docker to use the device mapper storage driver on CI instances is going to be trickier than it s...
[00:25:22] Continuous-Integration-Infrastructure, Release-Engineering-Team (Kanban), Patch-For-Review: Move CI docker storage engine to device mapper - https://phabricator.wikimedia.org/T203841 (dduvall) Additionally, there appear to be some issues with the direct-lvm driver on jessie when using thin pools (whi...
[00:37:55] Continuous-Integration-Infrastructure (shipyard), Release-Engineering-Team (Kanban), Patch-For-Review: Rebuild quibble images for Chrome 69 and Firefox 60 - https://phabricator.wikimedia.org/T203902 (Legoktm) a:hashar
[00:46:12] RECOVERY - Puppet errors on deployment-certcentral-testclient03 is OK: OK: Less than 1.00% above the threshold [0.0]
[00:47:07] Release-Engineering-Team (Kanban), Scap, Operations: mwscript rebuildLocalisationCache.php takes 40 minutes on HHVM (rather than ~5 on PHP 5) - https://phabricator.wikimedia.org/T191921 (Legoktm) It's unclear to me why we're continuing to invest so much time in getting HHVM to work when we're going t...
[00:59:12] Release-Engineering-Team (Kanban), Scap, Operations: mwscript rebuildLocalisationCache.php takes 40 minutes on HHVM (rather than ~5 on PHP 5) - https://phabricator.wikimedia.org/T191921 (Legoktm) ``` legoktm@deploy1001:~$ time PHP=php7.0 mwscript rebuildLocalisationCache.php --wiki=enwiki --outdir=/t...
[01:11:34] Release-Engineering-Team (Kanban), Scap, Operations: mwscript rebuildLocalisationCache.php takes 40 minutes on HHVM (rather than ~5 on PHP 5) - https://phabricator.wikimedia.org/T191921 (mmodell) @legoktm: There was some concern about incompatibilities between the mbstring in php7 vs hhvm and an asse...
[01:48:12] PROBLEM - Puppet errors on integration-slave-docker-1015 is CRITICAL: CRITICAL: 88.89% of data above the critical threshold [0.0]
[01:57:01] Phabricator: upload failure on Phabricator without clear problem description - https://phabricator.wikimedia.org/T204096 (Effeietsanders)
[02:01:16] Phabricator: upload failure on Phabricator without clear problem description - https://phabricator.wikimedia.org/T204096 (Effeietsanders)
[02:33:54] Deployments, MediaWiki-extensions-LocalisationUpdate, I18n, Wikimedia-production-error: l10n-update not updating Vector and extensions - https://phabricator.wikimedia.org/T103879 (Krinkle) Open>declined Seems to work fine now. Please file a new task if/when similar issues are seen.
[02:41:38] Release-Engineering-Team (Kanban), Scap, Operations: mwscript rebuildLocalisationCache.php takes 40 minutes on HHVM (rather than ~5 on PHP 5) - https://phabricator.wikimedia.org/T191921 (Krinkle) I share the same concern as what @mmodell remembers. Having said that, I believe at this point in time th...
[02:43:12] RECOVERY - Puppet errors on integration-slave-docker-1015 is OK: OK: Less than 1.00% above the threshold [0.0]
[04:13:15] Release-Engineering-Team (Kanban), Scap, Operations: mwscript rebuildLocalisationCache.php takes 40 minutes on HHVM (rather than ~5 on PHP 5) - https://phabricator.wikimedia.org/T191921 (Legoktm) I ran a script to diff the CDBs I just generated with PHP 7.0, and there's no functional diff (just some...
[04:26:41] Yippee, build fixed!
[04:26:41] Project mediawiki-core-code-coverage-docker build #3756: FIXED in 1 hr 26 min: https://integration.wikimedia.org/ci/job/mediawiki-core-code-coverage-docker/3756/
[05:46:21] Continuous-Integration-Config, Patch-For-Review: Combine composer-php55 and composer-hhvm jobs - https://phabricator.wikimedia.org/T142457 (Legoktm) Open>declined This would have been a good optimization in the nodepool era, but no longer makes sense in a docker world.
[07:11:14] Project-Admins: New Extension: ChangeUserPasswords - https://phabricator.wikimedia.org/T202275 (Mz83ude) It is up!
[07:24:46] Continuous-Integration-Config, Multi-Content-Revisions, Wikidata, MW-1.32-release-notes (WMF-deploy-2018-09-18 (1.32.0-wmf.22)), and 2 others: Wikibase CI broken (database errors) - https://phabricator.wikimedia.org/T204065 (Addshore) So, the CI is now fixed. As a follow up it could be a good ide...
[07:33:16] hashar: https://integration.wikimedia.org/ci/job/quibble-vendor-mysql-hhvm-docker/16260/ 39 mins :D
[07:33:24] I think its just down to the number of tests being run though
[07:37:10] addshore: ooh my god :(
[07:43:26] addshore: so a good chunk of that is parallel-lint running on every files .. that takes 320seconds doh
[07:45:11] oooh
[07:45:13] :D
[07:45:25] that is https://phabricator.wikimedia.org/T198493
[07:45:36] mediawiki/core has the same issue
[07:46:01] but for mediawiki/core quibble would invoke "composer test" passing as an argument the list of files that have been changed in the patchset
[07:52:27] Continuous-Integration-Infrastructure, Wikidata, Jenkins: php-lint in wikibase times out - https://phabricator.wikimedia.org/T198493 (hashar) That is still going on. For extensions and skins Quibble does: ``` composer validate --no-check-publish composer install --no-progress --prefer-dist --profile...
[07:52:34] addshore: there is surely an issue with hhvm/parallel-lint
[07:52:48] I dont see why it takes 320 seconds when phpcs takes 12 seconds
[07:57:28] is it linting all of vendor etc too?
[07:57:33] i guess not....
[08:08:26] addshore: na vendor node_modules .git are excluded (or should be)
[08:08:35] from a quick debug session
[08:08:44] parallel-lint invokes php with -n
[08:09:00] which is to prevent loading the system php.ini
[08:09:12] thus hhvm does not load /etc/hhvm/php.ini which has the performance tweaks
[08:13:36] it is still slow bah
[08:13:38] :(
[08:24:21] Release-Engineering-Team (Kanban), Scap, Operations, Patch-For-Review: Scap should use Eval.Jit=1 when calling rebuildLocalisationCache.php via HHVM - https://phabricator.wikimedia.org/T203680 (hashar) The number of threads is irrelevant as shown on T191921#4248767: > * 1 thread (32 cores): 1...
[08:34:48] Phabricator: upload failure on Phabricator without clear problem description - https://phabricator.wikimedia.org/T204096 (Aklapper) Open>stalled Please provide steps to reproduce (such as browser, file size, the steps performed, etc): https://mediawiki.org/wiki/How_to_report_a_bug
[08:40:04] hashar: :(
[09:05:21] (CR) Hashar: [C: 2] docker: update npm-browser for Firefox/Chromium [integration/config] - https://gerrit.wikimedia.org/r/459580 (https://phabricator.wikimedia.org/T203902) (owner: Hashar)
[09:06:08] (CR) Hashar: [C: 2] docker: rebuild Quibble stretch for Chromium/Firefox [integration/config] - https://gerrit.wikimedia.org/r/459575 (https://phabricator.wikimedia.org/T203902) (owner: Hashar)
[09:07:03] (Merged) jenkins-bot: docker: update npm-browser for Firefox/Chromium [integration/config] - https://gerrit.wikimedia.org/r/459580 (https://phabricator.wikimedia.org/T203902) (owner: Hashar)
[09:07:38] (Merged) jenkins-bot: docker: rebuild Quibble stretch for Chromium/Firefox [integration/config] - https://gerrit.wikimedia.org/r/459575 (https://phabricator.wikimedia.org/T203902) (owner: Hashar)
[09:22:12] Project beta-update-databases-eqiad build #28261: FAILURE in 5.6 sec: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/28261/
[09:25:03] \nFatal error: syntax error, unexpected $end, expecting ']' in /srv/mediawiki-staging/wmf-config/InitialiseSettings.php on line 13786\n")
[09:30:30] Yippee, build fixed!
[09:30:30] Project beta-update-databases-eqiad build #28262: FIXED in 2 min 49 sec: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/28262/
[09:34:28] beta-update-databases seems to have failed due to some cosmic ray ...
[09:38:20] (CR) Hashar: "oojs-ui-docker-publish oojs-ui-npm-run-jenkins-node-6-docker updated" [integration/config] - https://gerrit.wikimedia.org/r/459582 (https://phabricator.wikimedia.org/T203902) (owner: Hashar)
[09:38:28] !log Updating jobs oojs-ui-docker-publish oojs-ui-npm-run-jenkins-node-6-docker for Chromium 69 and Firefox 60 - T203902
[09:38:33] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[09:38:34] T203902: Rebuild quibble images for Chrome 69 and Firefox 60 - https://phabricator.wikimedia.org/T203902
[09:40:51] (PS2) Hashar: Bump wdio selenium jobs Chromium 69/Firefox 60 [integration/config] - https://gerrit.wikimedia.org/r/459577 (https://phabricator.wikimedia.org/T203902)
[09:48:45] (CR) Hashar: [C: 2] "oojs-ui-docker-publish oojs-ui-npm-run-jenkins-node-6-docker updated and that seems to work fine." [integration/config] - https://gerrit.wikimedia.org/r/459582 (https://phabricator.wikimedia.org/T203902) (owner: Hashar)
[09:49:13] (PS3) Hashar: Bump wdio selenium jobs Chromium 69/Firefox 60 [integration/config] - https://gerrit.wikimedia.org/r/459577 (https://phabricator.wikimedia.org/T203902)
[09:50:54] (Merged) jenkins-bot: Update oojs-ui jobs for Chromium 69/Firefox 60 [integration/config] - https://gerrit.wikimedia.org/r/459582 (https://phabricator.wikimedia.org/T203902) (owner: Hashar)
[09:58:43] Project-Admins: New Extension: ChangeUserPasswords - https://phabricator.wikimedia.org/T202275 (Aklapper) >>! In T202275#4576326, @Mz83ude wrote: > It is up! Where to see it? https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/extensions/ChangeUserPasswords/+/master still only has a `.gitreview` file here
[10:21:57] (PS1) Addshore: Add more Wikibase extensions to gatedextensions [integration/config] - https://gerrit.wikimedia.org/r/459991 (https://phabricator.wikimedia.org/T204065)
[10:23:06] (PS2) Addshore: Add more Wikibase extensions to gatedextensions [integration/config] - https://gerrit.wikimedia.org/r/459991 (https://phabricator.wikimedia.org/T204065)
[10:24:39] (CR) jerkins-bot: [V: -1] Add more Wikibase extensions to gatedextensions [integration/config] - https://gerrit.wikimedia.org/r/459991 (https://phabricator.wikimedia.org/T204065) (owner: Addshore)
[12:34:59] (CR) Hashar: "In zuul-layout.yaml you also need to add the template 'extension-gate' on each of the repositories and that might do it." [integration/config] - https://gerrit.wikimedia.org/r/459991 (https://phabricator.wikimedia.org/T204065) (owner: Addshore)
[12:35:58] (CR) Hashar: "recheck" [selenium] - https://gerrit.wikimedia.org/r/330671 (owner: Hashar)
[12:39:37] (CR) Hashar: "I have updated mediawiki-selenium-integration-docker which pass https://gerrit.wikimedia.org/r/#/c/mediawiki/selenium/+/330671/" [integration/config] - https://gerrit.wikimedia.org/r/459577 (https://phabricator.wikimedia.org/T203902) (owner: Hashar)
[12:39:39] hasharAway: thanks for the direction :D
[12:43:37] (PS3) Addshore: Add more Wikibase extensions to gatedextensions [integration/config] - https://gerrit.wikimedia.org/r/459991 (https://phabricator.wikimedia.org/T204065)
[12:44:30] (CR) Addshore: "I pushed a patch with the fixes." [integration/config] - https://gerrit.wikimedia.org/r/459991 (https://phabricator.wikimedia.org/T204065) (owner: Addshore)
[13:48:04] Release-Engineering-Team (Kanban), Scap, Operations: mwscript rebuildLocalisationCache.php takes 40 minutes on HHVM (rather than ~5 on PHP 5) - https://phabricator.wikimedia.org/T191921 (thcipriani)
[13:48:07] Release-Engineering-Team (Kanban), Scap, Operations, Patch-For-Review: Scap should use Eval.Jit=1 when calling rebuildLocalisationCache.php via HHVM - https://phabricator.wikimedia.org/T203680 (thcipriani) Open>Invalid Will try to move to php7.0 per discussion on T191921
[13:51:13] PROBLEM - SSH on integration-slave-docker-1004 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[14:10:43] Continuous-Integration-Config, User-Addshore: Run less tests when a patch is in WIP mode in Gerrit - https://phabricator.wikimedia.org/T204125 (Addshore)
[14:11:04] RECOVERY - SSH on integration-slave-docker-1004 is OK: SSH OK - OpenSSH_6.7p1 Debian-5+deb8u6 (protocol 2.0)
[14:32:56] reminder: no puppet merges for a while, as we are doing data center switchover for the next hour (?)
[15:07:55] Phabricator: upload failure on Phabricator without clear problem description - https://phabricator.wikimedia.org/T204096 (Effeietsanders) I'm using Chrome on mac. The file is a png image of 4.2 MB.
[15:18:19] Phabricator: "Upload Failure: Exception: No configred storage engine" trying to upload a file to Phabricator - https://phabricator.wikimedia.org/T204096 (Aklapper)
[15:18:30] Phabricator: "Upload Failure: Exception: No configured storage engine" trying to upload a file to Phabricator - https://phabricator.wikimedia.org/T204096 (Aklapper)
[15:19:13] Phabricator: "Upload Failure: Exception: No configured storage engine" trying to upload a file to Phabricator - https://phabricator.wikimedia.org/T204096 (Aklapper) 4MB is the max
[15:19:22] Phabricator: "Upload Failure: Exception: No configured storage engine" trying to upload a file to Phabricator - https://phabricator.wikimedia.org/T204096 (Aklapper)
[15:19:26] Phabricator (Upstream), Upstream: Unclear error message when uploading a larger attachment: "Exception: No configured storage engine can store this file." - https://phabricator.wikimedia.org/T155130 (Aklapper)
[15:24:31] Continuous-Integration-Config, Multi-Content-Revisions, Wikidata, MW-1.32-release-notes (WMF-deploy-2018-09-18 (1.32.0-wmf.22)), and 2 others: WikibaseLexeme CI broken (database errors) - https://phabricator.wikimedia.org/T204065 (Jdforrester-WMF)
[15:39:23] releng folks: okay to do beta cluster deploy this week, right?
[15:40:27] subbu: yep, beta cluster isn't moving anywhere this week
[15:40:34] k
[15:44:32] (CR) Jforrester: "Gate is already quite slow. Ideally we'd have every production extension and skin in here, but…" [integration/config] - https://gerrit.wikimedia.org/r/459991 (https://phabricator.wikimedia.org/T204065) (owner: Addshore)
[15:47:02] !log cherry-pick https://gerrit.wikimedia.org/r/c/operations/puppet/+/459875 on integration-puppetmaster01 for testing
[15:47:05] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[15:49:39] !log provisioning new xlarge integration-slave-docker-1030
[15:49:44] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[15:58:42] PROBLEM - Puppet errors on integration-slave-docker-1030 is CRITICAL: CRITICAL: 85.71% of data above the critical threshold [0.0]
[16:00:42] I should mention that puppet merges are back in style now, switchover is complete
[16:13:41] RECOVERY - Puppet errors on integration-slave-docker-1030 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:20:51] Release-Engineering-Team (Kanban), Operations, User-MModell: Create keyholder gerrit repo - https://phabricator.wikimedia.org/T203108 (mmodell) a:mmodell>faidon
[16:21:02] (PS2) Hashar: Add mw ext ContentTranslation to gated extensions [integration/config] - https://gerrit.wikimedia.org/r/450508 (https://phabricator.wikimedia.org/T86930) (owner: Santhosh)
[16:23:23] PROBLEM - Free space - all mounts on integration-slave-docker-1027 is CRITICAL: CRITICAL: integration.integration-slave-docker-1027.diskspace.root.byte_percentfree (<20.00%)
[16:24:56] Release-Engineering-Team (Kanban), User-zeljkofilipin: Find top 15 target projects that could use Selenium tests to prevent incidents - https://phabricator.wikimedia.org/T199133 (zeljkofilipin) Ideas: - https://www.mediawiki.org/wiki/Developers/Maintainers - https://github.com/zeljkofilipin/gerrit - htt...
[16:25:08] Release-Engineering-Team (Kanban), User-zeljkofilipin: Find top 15 target projects that could use Selenium tests to prevent incidents - https://phabricator.wikimedia.org/T199133 (zeljkofilipin) a:zeljkofilipin
[16:40:19] andrewbogott: hey andrew, question about role parameters: i added one in https://gerrit.wikimedia.org/r/c/operations/puppet/+/459875/5/modules/role/manifests/ci/slave/labs/docker.pp (not merged yet but cherry picked on integration-puppetmaster01) it doesn't show up in horizon. should it or am i doing something wrong?
[16:40:59] it won't show up until the commit gets merged marxarelli
[16:41:07] it needs to be on the main labs puppetmaster
[16:41:10] ah, ok
[16:41:18] that makes more sense then
[16:41:24] can probably just stick it in hiera beforehand
[16:41:28] i can maybe just add it in hiera now
[16:41:40] yep yep! ty!
[16:43:25] RECOVERY - Free space - all mounts on integration-slave-docker-1027 is OK: OK: All targets OK
[16:51:45] thcipriani: https://gerrit.wikimedia.org/r/c/operations/puppet/+/459875#message-5170d82e2f91d88da7ac2e6662cddba78b5e0399 \o/
[16:52:04] moar space for docker
[17:05:18] https://gerrit-review.googlesource.com/c/gerrit/+/195730 that will help new users.
[17:05:34] * paladox goes back to the apple event.
[17:06:41] goes back to watching paladox' live stream
[17:06:49] lol
[17:27:05] Phabricator, Security-Team: Add 'risk' field to tasks created via advanced template - https://phabricator.wikimedia.org/T204138 (chasemp) p:Triage>Normal
[17:27:47] Phabricator, Security-Team: Add 'risk' field to tasks created via advanced template - https://phabricator.wikimedia.org/T204138 (chasemp)
[17:28:37] (PS1) Thcipriani: keyholder: add CI to project [integration/config] - https://gerrit.wikimedia.org/r/460067
[17:29:39] (CR) 20after4: [C: 2] "looks good" [integration/config] - https://gerrit.wikimedia.org/r/460067 (owner: Thcipriani)
[17:31:10] (Merged) jenkins-bot: keyholder: add CI to project [integration/config] - https://gerrit.wikimedia.org/r/460067 (owner: Thcipriani)
[17:36:57] !log Reloading Zuul to deploy https://gerrit.wikimedia.org/r/#/c/integration/config/+/460067/
[17:37:01] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[17:40:38] !log adding Jenkins node integration-slave-docker-1030
[17:40:41] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[17:40:43] Release-Engineering-Team (Kanban), Operations, User-MModell: Create keyholder gerrit repo - https://phabricator.wikimedia.org/T203108 (mmodell) It's now mirroring to https://github.com/wikimedia/operations-software-keyholder/
[17:42:34] Release-Engineering-Team, Operations: Keyholder phab repo duplicate work - https://phabricator.wikimedia.org/T203003 (thcipriani)
[17:42:37] Gerrit, Phabricator, Release-Engineering-Team (Someday): Stop using Differential for code review - https://phabricator.wikimedia.org/T191182 (thcipriani)
[17:42:39] Release-Engineering-Team (Kanban), Operations, User-MModell: Create keyholder gerrit repo - https://phabricator.wikimedia.org/T203108 (thcipriani) Open>Resolved >>! In T203108#4560449, @faidon wrote: > I'd resolve this task, but I'm not sure what else needs to be done with regards to GitHub m...
[17:50:48] Phabricator, Security-Team: Add 'risk' field to tasks created via advanced template - https://phabricator.wikimedia.org/T204138 (chasemp)
[17:52:55] !log removing integration-slave-docker-1001 jenkins node for replacement with a xlarge instance
[17:52:59] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[17:58:46] !log deleting now idle node integration-slave-docker-1001
[17:58:50] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[17:59:18] !log deleting integration-slave-docker-1001 instance
[17:59:25] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[18:00:28] !log launching new xlarge instance integration-slave-docker-1031
[18:00:31] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[18:02:53] PROBLEM - Host integration-slave-docker-1001 is DOWN: CRITICAL - Host Unreachable (10.68.23.127)
[18:04:26] PROBLEM - Free space - all mounts on integration-slave-docker-1027 is CRITICAL: CRITICAL: integration.integration-slave-docker-1027.diskspace.root.byte_percentfree (<10.00%)
[18:05:13] maintenance-disconnect-full-disks build 2309 integration-slave-docker-1027 (/: 98%): OFFLINE due to disk space
[18:08:23] ^ thcipriani: yikes. didn't even have time to replace it before it croaked again
[18:08:32] 1031 is coming up shortly
[18:10:15] maintenance-disconnect-full-disks build 2310 integration-slave-docker-1027: OFFLINE due to disk space
[18:15:12] maintenance-disconnect-full-disks build 2311 integration-slave-docker-1027: OFFLINE due to disk space
[18:16:16] * thcipriani fixes alerts in the interim
[18:17:20] updated offline reason for integration-slave-docker-1027 to stop the alert but leave offline
[18:18:18] !log added new jenkins node integration-slave-docker-1031 with 4 executors
[18:18:22] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[18:20:19] !log deleting jenkins node integration-slave-docker-1027 due to insufficient /var/lib/docker space (replaced with 1031 which has dedicated /var/lib/docker volume)
[18:20:23] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[18:21:39] !log deleting integration-slave-docker-1027 instance
[18:21:42] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[18:22:23] thcipriani: oh, you're still on 1027. did you need that for testing the fix to alerts?
[18:22:42] marxarelli: nope, you can remove it
[18:22:51] rad! done
[18:24:18] PROBLEM - Host integration-slave-docker-1027 is DOWN: CRITICAL - Host Unreachable (10.68.17.222)
[18:25:02] !log launching new bigram instance integration-slave-docker-1032
[18:25:06] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[18:30:10] Release-Engineering-Team (Kanban), User-zeljkofilipin: Find top 15 target projects that could use Selenium tests to prevent incidents - https://phabricator.wikimedia.org/T199133 (zeljkofilipin) I'm not sure if there is a more specific task for this, I'll check later, this is what I need to do: - Review...
[18:44:27] !log adding jenkins node integration-slave-docker-1032 with 4 executors
[18:44:31] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[18:59:35] Continuous-Integration-Config, Multi-Content-Revisions, Wikidata, MW-1.32-release-notes (WMF-deploy-2018-09-18 (1.32.0-wmf.22)), and 2 others: WikibaseLexeme CI broken (database errors) - https://phabricator.wikimedia.org/T204065 (daniel) Open>Resolved
[19:13:09] Phabricator, Security-Team, Patch-For-Review: Add 'risk' field to tasks created via advanced template - https://phabricator.wikimedia.org/T204138 (chasemp) Talked to @20after4 for a bit about this and we added it to the advanced form. Let's see how this works out.
[19:13:31] Phabricator, Security-Team, Patch-For-Review: Add 'risk' field to tasks created via advanced template - https://phabricator.wikimedia.org/T204138 (chasemp) Open>Resolved
[19:23:29] (PS1) Hashar: QA report for gated extensions and their dependencies [integration/config] - https://gerrit.wikimedia.org/r/460080
[19:29:19] Continuous-Integration-Config, Wikidata, User-Addshore: Add more Wikibase extensions to gatedextensions - https://phabricator.wikimedia.org/T204153 (hashar) p:Triage>High
[19:29:35] Continuous-Integration-Config, Multi-Content-Revisions, Wikidata, MW-1.32-release-notes (WMF-deploy-2018-09-18 (1.32.0-wmf.22)), and 2 others: WikibaseLexeme CI broken (database errors) - https://phabricator.wikimedia.org/T204065 (hashar) >>! In T204065#4576329, @Addshore wrote: > So, the CI is n...
[19:30:00] (PS4) Hashar: Add more Wikibase extensions to gatedextensions [integration/config] - https://gerrit.wikimedia.org/r/459991 (https://phabricator.wikimedia.org/T204153) (owner: Addshore)
[19:30:35] (CR) Hashar: "Now pointing to the new task T204153" [integration/config] - https://gerrit.wikimedia.org/r/459991 (https://phabricator.wikimedia.org/T204153) (owner: Addshore)
[19:32:24] Continuous-Integration-Config, Wikidata, Patch-For-Review, User-Addshore: Add more Wikibase extensions to gatedextensions - https://phabricator.wikimedia.org/T204153 (hashar) T86930 also ask for ContentTranslation to be added and T200976 is for Scribunto.
[19:32:49] (CR) jerkins-bot: [V: -1] QA report for gated extensions and their dependencies [integration/config] - https://gerrit.wikimedia.org/r/460080 (owner: Hashar)
[19:58:55] !log replacing integration-slave-docker-1032 offline 85/15% split for docker/workspace left too little space for workspace. puppet change has been updated to use 70/30% volume space ratio
[19:58:59] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[20:01:37] !log launching new integration-slave-docker-1033 bigram instance
[20:01:40] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[20:05:29] PROBLEM - Host integration-slave-docker-1032 is DOWN: CRITICAL - Host Unreachable (10.68.21.248)
[20:06:05] Phabricator, Security-Team, Patch-For-Review: Add 'risk' field to tasks created via advanced template - https://phabricator.wikimedia.org/T204138 (Legoktm) @chasemp is it expected that this field will show up on non-security tasks like T204154?
[20:09:52] Phabricator-Production-Instance: Decide whether we need to add a severity (impact) field to match Bugzilla's - https://phabricator.wikimedia.org/T102 (Aklapper) For the records, T204138 added a `Risk` field.
[20:14:04] Phabricator, Security-Team, Patch-For-Review: Add 'risk' field to tasks created via advanced template - https://phabricator.wikimedia.org/T204138 (mmodell) @legoktm: to avoid that we'd need to create a separate form which I think might be a better idea, I'm afraid people will be annoyed by the extra...
[20:16:21] !log adding newly provisioned integration-slave-docker-1033 jenkins node with 4 executors
[20:16:25] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[20:17:54] Continuous-Integration-Config, Wikidata, Patch-For-Review, User-Addshore: Add more Wikibase extensions to gatedextensions - https://phabricator.wikimedia.org/T204153 (Jdforrester-WMF) Copying my comment in gerrit: > Gate is already quite slow. Ideally we'd have every production extension and ski...
[20:24:17] Phabricator, Security-Team, Patch-For-Review: Add 'risk' field to tasks created via advanced template - https://phabricator.wikimedia.org/T204138 (greg) >>! In T204138#4578229, @mmodell wrote: > @legoktm: to avoid that we'd need to create a separate form which I think might be a better idea, I'm afra...
[20:25:00] twentyafterfour: heh, you removed the field already (thanks!) but it confused me why I couldn't see it in the advanced form for about a minute ;)
[20:25:24] greg-g: yeah I'm on top of it ;)
[20:25:45] made a separate form: https://phabricator.wikimedia.org/maniphest/task/edit/form/48/
[20:25:50] rock, thanks man
[20:27:20] Phabricator, Security-Team, Patch-For-Review: Add 'risk' field to tasks created via advanced template - https://phabricator.wikimedia.org/T204138 (mmodell) I went ahead and created https://phabricator.wikimedia.org/maniphest/task/edit/form/48/ which is an exact copy of form ♯3. ... it's a little anno...
[20:35:41] Phabricator, Release-Engineering-Team (Kanban), Security-Team, User-MModell: Should security tasks be a custom type maniphest? - https://phabricator.wikimedia.org/T204160 (mmodell) p:Triage>Normal
[20:37:15] Phabricator, Release-Engineering-Team (Kanban), Security-Team, User-MModell: Should security tasks be a custom type maniphest? - https://phabricator.wikimedia.org/T204160 (mmodell)
[20:38:16] Phabricator, Release-Engineering-Team (Kanban), Security-Team, User-MModell: Should security tasks be a custom type in maniphest? - https://phabricator.wikimedia.org/T204160 (mmodell)
[20:50:00] !log taking integration-slave-docker-1002 offline for replacement
[20:50:04] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[20:51:53] !log deleting integration-slave-docker-1002 instance
[20:51:56] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[20:57:21] !log taking integration-slave-docker-1003/-1004 offline for replacement
[20:57:25] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[20:59:35] !log deleting integration-slave-docker-1003/-1004 instances
[20:59:38] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[21:01:43] !log launching integration-slave-docker-1034 bigram instance
[21:01:47] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[21:02:08] PROBLEM - Host integration-slave-docker-1004 is DOWN: CRITICAL - Host Unreachable (10.68.16.233)
[21:02:36] PROBLEM - Host integration-slave-docker-1003 is DOWN: CRITICAL - Host Unreachable (10.68.23.87)
[21:17:38] !log adding new jenkins node integration-slave-docker-1034 with 4 executors
[21:17:42] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[21:18:22] Phabricator, Security-Team, Patch-For-Review: Add 'risk' field to tasks created via advanced template - https://phabricator.wikimedia.org/T204138 (chasemp) Resolved>Open re-opening so we can figure something out, totally down for risk rating not showing on regular advanced tasks. @20after4 a...
[21:20:32] Phabricator, Release-Engineering-Team (Kanban), Security-Team, User-MModell: Should security tasks be a custom type in maniphest? - https://phabricator.wikimedia.org/T204160 (chasemp) ```. @20after4 and I are talking about doing a few things: Creating a task type 'security' Have due date and ris...
[22:29:08] MediaWiki-Releasing, MediaWiki-Release-Tools: MediaWiki release patch files should be based off of the previous tarball - https://phabricator.wikimedia.org/T181116 (Legoktm) a:Legoktm
[22:29:43] Gerrit, Release-Engineering-Team (Next), DBA, Operations, Patch-For-Review: Gerrit is failing to connect to db on gerrit2001 thus preventing systemd from working - https://phabricator.wikimedia.org/T176532 (Paladox) with the migration to notedb accounts and changes have been removed from the...
[22:31:25] MediaWiki-Releasing, Release-Engineering-Team (Someday), MediaWiki-Release-Tools: Make make-release not need to clone MW for every branch - https://phabricator.wikimedia.org/T180522 (Legoktm) a:Legoktm
[22:32:06] MediaWiki-Releasing, Release-Engineering-Team (Someday), MediaWiki-Release-Tools: make-release vendor diffing broken for REL1_27/REL1_28 - https://phabricator.wikimedia.org/T180532 (Legoktm) a:Legoktm
[22:35:37] MediaWiki-Releasing: Test files appear in MW tarball diff patches, generate ignored hunks - https://phabricator.wikimedia.org/T94664 (Legoktm)
[22:35:39] MediaWiki-Releasing, MediaWiki-Release-Tools: MediaWiki release patch files should be based off of the previous tarball - https://phabricator.wikimedia.org/T181116 (Legoktm)
[22:36:23] MediaWiki-Releasing, MediaWiki-Release-Tools: MediaWiki release patch files should be based off of the previous tarball - https://phabricator.wikimedia.org/T181116 (Legoktm)
[22:36:26] MediaWiki-Releasing, Release-Engineering-Team, MW-1.31-release: Upgrade patches for tarball releases don't apply cleanly to tarball installation - https://phabricator.wikimedia.org/T73379 (Legoktm)
[22:36:50] (PS3) Legoktm: Move to `git archive` like model for MediaWiki releases [tools/release] - https://gerrit.wikimedia.org/r/454609 (https://phabricator.wikimedia.org/T199467)
[22:37:46] (CR) jerkins-bot: [V: -1] Move to `git archive` like model for MediaWiki releases [tools/release] - https://gerrit.wikimedia.org/r/454609 (https://phabricator.wikimedia.org/T199467) (owner: Legoktm)
[22:41:28] (PS4) Legoktm: Move to `git archive` like model for MediaWiki releases [tools/release] - https://gerrit.wikimedia.org/r/454609 (https://phabricator.wikimedia.org/T199467)
[22:50:20] Phabricator, Security-Team, Patch-For-Review: Add 'risk' field to tasks created via advanced template - https://phabricator.wikimedia.org/T204138 (Krinkle) Yeah, also noticed it at T135798 which is a pre-existing task where I made an edit on, which then silently recorded that I set "Risk Rating: N/A"...
[22:51:48] (PS5) Legoktm: Move to `git archive` like model for MediaWiki releases [tools/release] - https://gerrit.wikimedia.org/r/454609 (https://phabricator.wikimedia.org/T199467)
[22:55:58] (CR) Legoktm: "This should be ready for review now. It's not the prettiest Python, but it works pretty well in my testing. The initial patchsets had a --" [tools/release] - https://gerrit.wikimedia.org/r/454609 (https://phabricator.wikimedia.org/T199467) (owner: Legoktm)
[22:57:00] Gerrit, Release-Engineering-Team (Next), DBA, Operations, Patch-For-Review: Gerrit is failing to connect to db on gerrit2001 thus preventing systemd from working - https://phabricator.wikimedia.org/T176532 (Dzahn) You are saying we won't need any mysql/mariadb for Gerrit anymore?
[22:58:12] Gerrit, Release-Engineering-Team (Next), DBA, Operations, Patch-For-Review: Gerrit is failing to connect to db on gerrit2001 thus preventing systemd from working - https://phabricator.wikimedia.org/T176532 (Paladox) Yep, but currently 2.x will still require a db just 2.15 does not read change...
[23:04:32] (PS6) Legoktm: Move to `git archive` like model for MediaWiki releases [tools/release] - https://gerrit.wikimedia.org/r/454609 (https://phabricator.wikimedia.org/T199467)
[23:12:30] PROBLEM - Puppet errors on deployment-webperf11 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[23:37:21] maintenance-disconnect-full-disks build 2401 thcipriani test: OFFLINE due to disk space
[23:40:18] maintenance-disconnect-full-disks build 2402 thcipriani test: OFFLINE due to disk space
[23:43:43] (PS1) Thcipriani: Refactor maintenance to timeout after 5 minutes [integration/config] - https://gerrit.wikimedia.org/r/460174 (https://phabricator.wikimedia.org/T204077)
[23:47:31] RECOVERY - Puppet errors on deployment-webperf11 is OK: OK: Less than 1.00% above the threshold [0.0]
[23:50:19] (PS2) Thcipriani: Refactor maintenance to timeout after 5 minutes [integration/config] - https://gerrit.wikimedia.org/r/460174 (https://phabricator.wikimedia.org/T204077)
[23:53:23] (PS1) Thcipriani: Add /var/lib/docker partition to maintenance check [integration/config] - https://gerrit.wikimedia.org/r/460176