[00:09:45] PROBLEM - Puppet staleness on integration-slave-jessie-1001 is CRITICAL 100.00% of data above the critical threshold [43200.0]
[01:57:03] Project browsertests-Gather-en.m.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #122: ABORTED in 3 hr 0 min: https://integration.wikimedia.org/ci/job/browsertests-Gather-en.m.wikipedia.beta.wmflabs.org-linux-chrome-sauce/122/
[02:30:42] !log Various beta-mediawiki-config-update-eqiad jobs have been stuck for over 13 hours.
[02:30:47] Logged the message, Master
[02:36:13] !log Jenkins is unable to launch slave agent on deployment-bastion.eqiad. Using "Jenkins Script Console" throws HTTP 503.
[02:36:18] Logged the message, Master
[02:40:18] Beta-Cluster: beta-scap-eqiad fails to sync due to "Permission denied (public key)" - https://phabricator.wikimedia.org/T99644#1295655 (Krinkle) NEW
[02:40:19] !log deployment-bastion.eqiad magically back online and catching up jobs, though failing due to T99644
[02:40:23] Logged the message, Master
[02:41:43] Krinkle: I know what causes that error but not how to fix it for sure
[02:41:52] I'll see if I can poke it back to life
[02:41:55] k
[02:42:17] when the server is rebooted a command needs to be run manually to prime the ssh-agent used for scap
[02:54:07] !log Primed keyholder agent via `sudo -u keyholder env SSH_AUTH_SOCK=/run/keyholder/agent.sock ssh-add /etc/keyholder.d/mwdeploy_rsa`
[02:54:10] Logged the message, Master
[02:56:54] Beta-Cluster: beta-scap-eqiad fails to sync due to "Permission denied (public key)" - https://phabricator.wikimedia.org/T99644#1295666 (bd808) After deployment-bastion reboots, the keyholder ssh-agent that is used by scap needs to be primed manually (it's a security thing for prod): ``` $ ssh deployment-bas...
[02:59:04] Krinkle: scap is running (and syncing) now
[02:59:05] Yippee, build fixed!
[02:59:05] Project beta-scap-eqiad build #53390: FIXED in 5 min 18 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/53390/
[02:59:41] bd808: Hm.. only after full reboot of the actual instance?
[02:59:44] Or also in other scenarios?
[02:59:47] Thx :)
[03:00:00] well, any time the agent loses the key
[03:00:13] but that should only happen on reboot
[03:01:29] Beta-Cluster: beta-scap-eqiad fails to sync due to "Permission denied (public key)" - https://phabricator.wikimedia.org/T99644#1295668 (bd808) Open>Resolved a:bd808 https://integration.wikimedia.org/ci/view/Beta/job/beta-scap-eqiad/53390/console ``` 02:54:54 02:54:54 Started sync-apaches 02:54:54 s...
[04:29:48] Project browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_7-internet_explorer-9-sauce build #443: FAILURE in 37 min: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_7-internet_explorer-9-sauce/443/
[04:54:02] Project browsertests-CentralNotice-en.m.wikipedia.beta.wmflabs.org-os_x_10.10-iphone-sauce build #74: ABORTED in 3 hr 0 min: https://integration.wikimedia.org/ci/job/browsertests-CentralNotice-en.m.wikipedia.beta.wmflabs.org-os_x_10.10-iphone-sauce/74/
[05:22:04] Project browsertests-CentralNotice-en.m.wikipedia.beta.wmflabs.org-linux-android-sauce build #105: ABORTED in 3 hr 0 min: https://integration.wikimedia.org/ci/job/browsertests-CentralNotice-en.m.wikipedia.beta.wmflabs.org-linux-android-sauce/105/
[05:34:40] Project browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_7-internet_explorer-11-sauce build #423: FAILURE in 32 min: https://integration.wikimedia.org/ci/job/browsertests-MultimediaViewer-en.wikipedia.beta.wmflabs.org-windows_7-internet_explorer-11-sauce/423/
[05:42:04] Project browsertests-MobileFrontend-en.m.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #685: ABORTED in 3 hr 0 min: https://integration.wikimedia.org/ci/job/browsertests-MobileFrontend-en.m.wikipedia.beta.wmflabs.org-linux-firefox-sauce/685/
[05:47:08] PROBLEM - Puppet staleness on deployment-restbase02 is CRITICAL 44.44% of data above the critical threshold [43200.0]
[05:52:21] PROBLEM - Puppet staleness on deployment-restbase01 is CRITICAL 11.11% of data above the critical threshold [43200.0]
[06:09:19] Deployment-Systems, Services: Evaluate Ansible as a deployment tool - https://phabricator.wikimedia.org/T93433#1295779 (Joe) @Gwicke, I regularly use fabric to automate tasks on my side, so no one argues if you want to use ansible on your computer to automate things in production. If we want to use someth...
[08:00:54] Release-Engineering: Update repositories that use mediawiki_selenium Ruby gem 1.x - https://phabricator.wikimedia.org/T94083#1295814 (zeljkofilipin) a:zeljkofilipin
[08:10:19] Browser-Tests, Release-Engineering: "gem build" should fail if there are _any_ warnings - https://phabricator.wikimedia.org/T1333#1295827 (dduvall) Open>declined a:dduvall I don't think it's desirable to fail on _any_ warning. We should leave it up to the discretion of the developer building the...
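A rough sketch of the keyholder priming step logged at 02:54:07, for anyone who hits T99644 after a reboot. It checks whether the agent behind a socket still holds a key and, if not, prints the priming command from the log instead of running it. The socket and key paths come from the log; the function name `agent_status` is made up for illustration, and the sketch assumes the calling user is allowed to talk to the agent socket, which may not hold on a real host.

```shell
#!/bin/sh
# Hypothetical helper: classify the ssh-agent reachable via the given socket.
# ssh-add -l exits 0 when the agent lists at least one key, 1 when it holds
# none, and 2 when the agent cannot be contacted.
agent_status() {
    sock="$1"
    SSH_AUTH_SOCK="$sock" ssh-add -l >/dev/null 2>&1
    case $? in
        0) echo "primed" ;;
        1) echo "empty" ;;
        *) echo "unreachable" ;;
    esac
}

# Socket path as logged at 02:54:07; overridable for other setups (assumption).
SOCK="${KEYHOLDER_SOCK:-/run/keyholder/agent.sock}"

case "$(agent_status "$SOCK")" in
    primed)
        echo "keyholder agent at $SOCK already holds a key"
        ;;
    empty)
        echo "keyholder agent at $SOCK has no key; prime it with:"
        echo "  sudo -u keyholder env SSH_AUTH_SOCK=$SOCK ssh-add /etc/keyholder.d/mwdeploy_rsa"
        ;;
    *)
        echo "cannot reach an agent at $SOCK (not running, or no permission)"
        ;;
esac
```

The script only reports and never runs `ssh-add` itself, since priming needs sudo and the key's passphrase and is presumably something an operator should do by hand.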
[08:23:16] Release-Engineering: Update repositories that use mediawiki_selenium Ruby gem 1.x - https://phabricator.wikimedia.org/T94083#1295833 (zeljkofilipin)
[08:24:13] Release-Engineering: Update repositories that use mediawiki_selenium Ruby gem 1.x - https://phabricator.wikimedia.org/T94083#1154455 (zeljkofilipin)
[08:26:21] Browser-Tests: Upgrade CentralNotice browser tests to use mediawiki_selenium 1.x - https://phabricator.wikimedia.org/T99652#1295839 (dduvall) NEW
[08:29:28] Browser-Tests: Upgrade CirrusSearch browser tests to use mediawiki_selenium 1.x - https://phabricator.wikimedia.org/T99653#1295850 (dduvall) NEW
[08:30:01] Browser-Tests: Upgrade Gather browser tests to use mediawiki_selenium 1.x - https://phabricator.wikimedia.org/T99654#1295856 (dduvall) NEW
[08:31:04] Browser-Tests: Upgrade GettingStarted browser tests to use mediawiki_selenium 1.x - https://phabricator.wikimedia.org/T99655#1295863 (dduvall) NEW
[08:31:40] Browser-Tests: Upgrade Math browser tests to use mediawiki_selenium 1.x - https://phabricator.wikimedia.org/T99656#1295869 (dduvall) NEW
[08:31:43] Browser-Tests: Update ZeroBanner repository to mediawiki_selenium Ruby gem 1.1 - https://phabricator.wikimedia.org/T99657#1295875 (zeljkofilipin) NEW a:zeljkofilipin
[08:32:23] Browser-Tests: Upgrade MobileFrontend browser tests to use mediawiki_selenium 1.x - https://phabricator.wikimedia.org/T99658#1295882 (dduvall) NEW
[08:32:53] Browser-Tests: Upgrade MultimediaViewer browser tests to use mediawiki_selenium 1.x - https://phabricator.wikimedia.org/T99659#1295888 (dduvall) NEW
[08:33:10] Release-Engineering: Update WikiLove repository to mediawiki_selenium Ruby gem 1.1 - https://phabricator.wikimedia.org/T99660#1295895 (zeljkofilipin) NEW a:zeljkofilipin
[08:35:02] Release-Engineering: Update VisualEditor repository to mediawiki_selenium Ruby gem 1.1 - https://phabricator.wikimedia.org/T99661#1295902 (zeljkofilipin) NEW a:zeljkofilipin
[08:37:17] Release-Engineering: Update VisualEditor repository to mediawiki_selenium Ruby gem 1.1 - https://phabricator.wikimedia.org/T99661#1295912 (zeljkofilipin) a:zeljkofilipin>None
[08:37:22] Release-Engineering: Update WikiLove repository to mediawiki_selenium Ruby gem 1.1 - https://phabricator.wikimedia.org/T99660#1295914 (zeljkofilipin) a:zeljkofilipin>None
[08:37:28] Browser-Tests: Update ZeroBanner repository to mediawiki_selenium Ruby gem 1.1 - https://phabricator.wikimedia.org/T99657#1295916 (zeljkofilipin) a:zeljkofilipin>None
[08:39:06] Browser-Tests: Upgrade CirrusSearch browser tests to use mediawiki_selenium 1.x - https://phabricator.wikimedia.org/T99653#1295928 (zeljkofilipin) a:dduvall
[08:39:14] Browser-Tests: Upgrade CentralNotice browser tests to use mediawiki_selenium 1.x - https://phabricator.wikimedia.org/T99652#1295930 (zeljkofilipin) a:zeljkofilipin
[09:07:00] Release-Engineering, Project-Creators: SWAT Project (Tag) - https://phabricator.wikimedia.org/T99411#1295948 (greg) p:Triage>Normal
[09:07:32] Release-Engineering, Project-Creators: SWAT Project (Tag) - https://phabricator.wikimedia.org/T99411#1291507 (greg) (updated the description with the proposal as I understand it from the thread, feel free to comment on that proposal with suggestions).
[09:10:24] Release-Engineering, Project-Creators: SWAT Project (Tag) - https://phabricator.wikimedia.org/T99411#1295961 (greg)
[09:13:09] Yippee, build fixed!
[09:13:09] Project browsertests-CentralAuth-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #113: FIXED in 2 min 32 sec: https://integration.wikimedia.org/ci/job/browsertests-CentralAuth-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/113/
[09:19:41] Browser-Tests, CirrusSearch: Support alternative API endpoints - https://phabricator.wikimedia.org/T99663#1295966 (dduvall) NEW
[09:19:56] Browser-Tests, CirrusSearch: Support alternative API endpoints - https://phabricator.wikimedia.org/T99663#1295973 (dduvall)
[09:19:57] Browser-Tests: Upgrade CirrusSearch browser tests to use mediawiki_selenium 1.x - https://phabricator.wikimedia.org/T99653#1295974 (dduvall)
[09:34:28] Continuous-Integration-Infrastructure, Patch-For-Review: Migrate all jobs to labs slaves - https://phabricator.wikimedia.org/T86659#1295986 (hashar)
[09:52:04] Browser-Tests: Update CentralAuth repository to mediawiki_selenium Ruby gem 1.x - https://phabricator.wikimedia.org/T99665#1296012 (zeljkofilipin) NEW a:zeljkofilipin
[14:26:17] Continuous-Integration-Infrastructure: Fetch dependencies using composer instead of cloning mediawiki/vendor for non-wmf branches - https://phabricator.wikimedia.org/T90303#1296255 (Legoktm) Krinkle and I discussed this a few days ago in #wikimedia-releng. We still need to run composer for wmf branches to fe...
[14:29:22] Beta-Cluster, Release-Engineering, Continuous-Integration-Config, Parsoid: Parsoid patches don't update Beta Cluster automatically -- only deploy repo patches seem to update that code - https://phabricator.wikimedia.org/T92871#1296268 (hashar) @ssastry any update?
[14:33:03] Beta-Cluster, Release-Engineering, Continuous-Integration-Config, Parsoid: Parsoid patches don't update Beta Cluster automatically -- only deploy repo patches seem to update that code - https://phabricator.wikimedia.org/T92871#1296272 (ssastry) Sorry .. I dropped the ball on this. Will discuss tod...
[14:35:14] Release-Engineering, Project-Creators: SWAT Project (Tag) - https://phabricator.wikimedia.org/T99411#1296275 (Anomie) Two things I notice that didn't make it here from the email discussion, so I'll mention them again for more discussion: # In the email it sounded like an additional component to step 3 was...
[14:40:21] Release-Engineering, Project-Creators: SWAT Project (Tag) - https://phabricator.wikimedia.org/T99411#1296279 (greg) >>! In T99411#1296275, @Anomie wrote: > Two things I notice that didn't make it here from the email discussion, so I'll mention them again for more discussion: > # In the email it sounded lik...
[14:40:43] Continuous-Integration-Infrastructure: PHP Warning: Module 'apc' already loaded in Unknown on line 0 on zend slaves - https://phabricator.wikimedia.org/T99413#1296281 (hashar) p:Triage>Low Surely annoying but not causing any specific issue. Seems APC is configured twice in /etc/php/**
[14:42:30] Release-Engineering, Project-Creators: SWAT Project (Tag) - https://phabricator.wikimedia.org/T99411#1296290 (greg)
[14:45:52] Project browsertests-Wikidata-SmokeTests-linux-firefox-sauce build #257: FAILURE in 28 min: https://integration.wikimedia.org/ci/job/browsertests-Wikidata-SmokeTests-linux-firefox-sauce/257/
[14:48:26] Continuous-Integration-Infrastructure, Deployment-Systems, Patch-For-Review: Failed to create a temp file in beta-code-update-eqiad (Full deployment-bastion:/tmp) - https://phabricator.wikimedia.org/T97257#1296310 (thcipriani) Open>Resolved
[14:50:55] Continuous-Integration-Infrastructure, Patch-For-Review: Have extensions with dependencies use the generic mwext-testextension-* job - https://phabricator.wikimedia.org/T96690#1296315 (Legoktm)
[15:28:52] Continuous-Integration-Infrastructure: Fetch dependencies using composer instead of cloning mediawiki/vendor for non-wmf branches - https://phabricator.wikimedia.org/T90303#1296411 (bd808) >>! In T90303#1296255, @Legoktm wrote: > * wmf branches (and mediawiki/vendor patches) will clone mediawiki/vendor, we w...
[15:32:05] Project browsertests-MobileFrontend-en.m.wikipedia.beta.wmflabs.org-linux-chrome-sauce build #651: ABORTED in 3 hr 0 min: https://integration.wikimedia.org/ci/job/browsertests-MobileFrontend-en.m.wikipedia.beta.wmflabs.org-linux-chrome-sauce/651/
[15:32:35] bd808: hmm, are you thinking of just running composer over the mediawiki/vendor repo? I think the composer-merge-plugin is already in that repo
[15:32:45] is it?
[15:33:03] * bd808 looks
[15:33:07] https://github.com/wikimedia/mediawiki-vendor/tree/master/wikimedia/composer-merge-plugin yes
[15:33:25] I think we added it because of the update.php checks
[15:33:35] *nod*
[15:34:01] so yeah I was thinking about having mw/vendor cloned and then running composer again over the top of it
[15:34:28] it would dirty the clone obviously but save some downloads
[15:34:33] and the second composer run
[15:35:04] it should work either way though. and if we set up the composer cache right the downloads will be negligible
[15:59:22] Release-Engineering, Reading-Infrastructure-Team, Patch-For-Review, Puppet, User-Bd808-Test: Create basic puppet role for Sentry - https://phabricator.wikimedia.org/T84956#1296507 (bd808)
[16:37:28] Release-Engineering, Project-Creators: SWAT Project (Tag) - https://phabricator.wikimedia.org/T99411#1296636 (mmodell) >>! In T99411#1296275, @Anomie wrote: > 2. I'm concerned that relying on the gerrit bot to dump links into the comments section of the task will lead to SWATters having to dig through the...
[16:39:42] Release-Engineering, Project-Creators: SWAT Project (Tag) - https://phabricator.wikimedia.org/T99411#1296645 (mmodell) Gerritbot would be mostly unnecessary if we would just follow phab conventions. Though, if I follow that logic to its conclusion, gerrit would be unnecessary.
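The idea floated around 15:34 (clone mediawiki/vendor first, then run composer over the top with a shared cache) could look roughly like the sketch below. This is not what CI actually does. The repository URL is the GitHub mirror linked in the log; the function names, the directory layout, and the cache location are assumptions for illustration. With `DRY_RUN` set it only prints the commands.

```shell
#!/bin/sh
# Sketch only: helper names, paths and cache location are hypothetical.

# Print a command instead of running it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

fetch_deps() {
    mw_dir="$1"                               # assumed MediaWiki core checkout
    cache_dir="${2:-$HOME/.cache/composer}"   # assumed shared composer cache

    # 1. Start from the pre-built vendor tree so most packages are already there.
    run git clone --depth 1 \
        https://github.com/wikimedia/mediawiki-vendor.git "$mw_dir/vendor"

    # 2. Run composer over the top. This dirties the clone, as noted in the
    #    discussion, but composer should only need to fetch what is missing.
    run env COMPOSER_CACHE_DIR="$cache_dir" \
        composer --working-dir="$mw_dir" install --no-interaction --prefer-dist
}
```

Running `DRY_RUN=1 fetch_deps ./mediawiki` prints the two commands without touching the network, which is a cheap way to check the paths before pointing it at a real checkout.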
[16:49:44] Release-Engineering, Project-Creators: SWAT Project (Tag) - https://phabricator.wikimedia.org/T99411#1296678 (Legoktm) >>! In T99411#1296636, @mmodell wrote: > See T96942 for an example where I added the commits to the task - this doesn't normally happen because we use the "Bug: Tnnn" convention in our com...
[16:54:46] Release-Engineering, Project-Creators: SWAT Project (Tag) - https://phabricator.wikimedia.org/T99411#1296710 (mmodell) @legoktm: how would it break that? # You could still say "Bug: refs T96942" and I believe that would achieve both gerrit search indexing and phab task linking. # If you link the tasks in...
[16:56:46] (PS1) Anomie: Add ApiFeatureUsage extension to default.conf [tools/release] - https://gerrit.wikimedia.org/r/212015 (https://phabricator.wikimedia.org/T1272)
[17:21:00] Release-Engineering, Project-Creators: SWAT Project (Tag) - https://phabricator.wikimedia.org/T99411#1296770 (Anomie) >>! In T99411#1296636, @mmodell wrote: > # go to the commit in diffusion and "edit maniphest tasks" Do commits show up in Diffusion before they get merged in Gerrit? I've yet to see one.
[17:24:04] Release-Engineering, Gather, Jenkins: Gather can't merge code due to issue with karma (jenkins mwext-qunit job) - https://phabricator.wikimedia.org/T99686#1296783 (Jdlrobson) NEW
[17:25:45] PROBLEM - App Server Main HTTP Response on deployment-mediawiki03 is CRITICAL - Socket timeout after 10 seconds
[17:37:05] Release-Engineering, Project-Creators: SWAT Project (Tag) - https://phabricator.wikimedia.org/T99411#1296820 (mmodell) @anomie: no they don't show up before being merged.
[17:38:28] Release-Engineering, Project-Creators: SWAT Project (Tag) - https://phabricator.wikimedia.org/T99411#1296826 (Anomie) Which means the Diffusion linking isn't really useful for this task until we start using Diffusion instead of Gerrit for code review.
[17:42:56] Release-Engineering, Project-Creators: SWAT Project (Tag) - https://phabricator.wikimedia.org/T99411#1296844 (mmodell) well, that brings me to my next point... why use gerrit for code review at this point? I see no blockers for adopting differential ;)
[17:53:27] Release-Engineering, Gather, Gather Sprint Help!, Jenkins: Gather can't merge code due to issue with karma (jenkins mwext-qunit job) - https://phabricator.wikimedia.org/T99686#1296869 (Jdlrobson)
[17:57:48] Release-Engineering, Project-Creators: SWAT Project (Tag) - https://phabricator.wikimedia.org/T99411#1296894 (Krenair) Wouldn't that cause CI issues?
[17:57:58] Release-Engineering, Project-Creators: SWAT Project (Tag) - https://phabricator.wikimedia.org/T99411#1296895 (mmodell) IMO: A code review tool is not the appropriate venue for branch porting requests and deployments. Let's solve this one better than just an awkward kludge of manual workflows - the tools sh...
[18:06:26] Release-Engineering, Project-Creators: SWAT Project (Tag) - https://phabricator.wikimedia.org/T99411#1296921 (mmodell) @krenair: I think there are several solutions to CI issues - one is to make CI talk to phabricator, the alternative is to keep gerrit up stream from phab and make the final commit get push...
[18:14:15] Release-Engineering, Gather, Gather Sprint Help!, Jenkins: Gather can't merge code due to issue with karma (jenkins mwext-qunit job) - https://phabricator.wikimedia.org/T99686#1296951 (Jdlrobson) Example: https://integration.wikimedia.org/ci/job/mwext-qunit/1011/console
[18:17:30] Release-Engineering, Project-Creators: SWAT Project (Tag) - https://phabricator.wikimedia.org/T99411#1296967 (Anomie) >>! In T99411#1296844, @mmodell wrote: > well, that brings me to my next point... why use gerrit for code review at this point? I see no blockers for adopting differential ;) See {T18}. Lo...