[00:02:01] reviewed [00:03:38] paladox: on https://www.dereckson.be/blog/2014/10/26/how-to-change-phabricator-logo/ you have a screenshot of what sprites looked like 18 months ago [00:05:36] Dereckson ok thanks, but it should be easy now, we can do it through a config for everyone to view :) [00:08:27] As I noted on D327, yes, yes. Not worth editing the sprite for some weeks. [00:09:06] Ok [00:10:00] 23:40:50 < paladox> since re-using https://phabricator.wikimedia.org/diffusion/PHAB/browse/wmf%252Fstable/webroot/rsrc/image/sprite-menu.png but it doesn't work [00:10:03] hey [00:10:18] that's exactly one of the two files to edit in the past (the other is the x2) [00:10:26] but then bin/celerity map was needed [00:10:32] Yep, not needed any more [00:10:49] plus a sprite I don't think will work either since it requires 80x80 [00:11:45] Sprite method worked very well. But it was a shame when you know the real reason. [00:11:52] Take a 2012 version [00:11:59] Option to change logo existed. [00:12:13] Then, they tried to force branding. [00:12:34] With a good cop, bad cop strategy. [00:12:57] epriestley said in tasks he wasn't really in favour of that, the designer said he was in favour [00:13:11] Yep, but let's hope they keep logo customisation support this time [00:13:40] I work with several Phab instances, I'm happy to see different colors/logos to distinguish them. [00:13:49] oh [00:14:11] https://secure.phabricator.com/D3101 [00:14:20] July 2012 the removal [00:14:35] Oh, I can't view that diff [00:15:24] https://secure.phabricator.com/T4214 "(Product) Although I don't feel very strongly about this, it is nice to have branding in the UI. I think @btrahan feels a bit more strongly than I do. Installs are obviously free to change this locally, but conceptually this feels a little mushy to me." [00:15:39] oh [00:15:48] that's the statement I interpret as a good cop / bad cop [00:15:54] oh [00:16:15] ah indeed, D3101 is for "all users", not public [00:16:43] twentyafterfour the new phabricator includes UI improvements, if you look at the diffusion page [00:16:46] yep [00:16:48] Matching commit is https://github.com/phacility/phabricator/commit/0cc3cb75597e3624c4ab5113131443e8097173f6 [00:17:00] thanks [00:18:38] Dereckson or we can upload the image as a file in phabricator [00:18:42] and use its phid [00:19:00] Yes, and the UI directly provides a button for such an upload. [00:19:33] yes but we can also do it through puppet [00:20:29] https://s3.amazonaws.com/upload.screenshot.co/1326b4439c [00:20:40] yes, it's a JSON property [00:20:52] thanks [00:20:53] yep [00:20:54] (we *SHOULD* do it through Puppet) [00:21:08] Yep, we can upload to the file application [00:21:13] and for example it will [00:21:16] look like this [00:21:31] PHID-FILE-ve7ggcxjtiisrm4nxytr [00:21:56] you have on https://s3.amazonaws.com/upload.screenshot.co/1326b4439c the full JSON object to use for ui.logo [00:22:25] ah, not copy/pastable as an image [00:24:49] Here you are: https://phabricator.wikimedia.org/P3946 [00:25:49] PHID-FILE-rs3pf2brupiulr6zcnrg is https://phabricator.wikimedia.org/F4414835 [00:25:53] thanks [00:26:24] Dereckson do you know how to set https://phabricator.wikimedia.org/P3946 as yaml [00:26:26] please? [00:27:53] ah we have the Uber Jenkins plugin now [00:28:02] oh [00:32:57] look how maniphest.priorities is declared, that's the same way [00:33:17] (with fewer child values) [00:33:24] ok [00:33:26] thanks [00:51:25] Dereckson could you change the edit policy on https://phabricator.wikimedia.org/F4414835 to admin please?
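For reference, the ui.logo value discussed above is a small JSON/dictionary keyed on the uploaded file's PHID; it replaces the old workflow of editing sprite-menu.png (and its x2 variant) and running bin/celerity map. Declared in YAML the same way as maniphest.priorities, it could look like this sketch. The logoImagePHID/wordmarkText key names follow upstream Phabricator's ui.logo shape, and the wordmark value is a placeholder assumption, not something from the log:

    'ui.logo':
      # PHID of the logo file uploaded above (F4414835)
      logoImagePHID: 'PHID-FILE-rs3pf2brupiulr6zcnrg'
      # wordmark text shown next to the logo; this value is a placeholder
      wordmarkText: 'Wikimedia'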
[00:55:23] twentyafterfour https://gerrit.wikimedia.org/r/307462 :) [00:55:31] Dereckson ^^ :) [03:35:18] 06Release-Engineering-Team: Feedback for European SWAT window - https://phabricator.wikimedia.org/T143894#2593448 (10Mattflaschen-WMF) > What went well? It went smoothly for the most part originally, and in the end, everything was resolved. > What went bad? No incorrect deployment steps, but a small unforesee... [03:38:44] 10Beta-Cluster-Infrastructure, 10ContentTranslation-Deployments, 10MediaWiki-extensions-ContentTranslation: Beta: cxserver is not updated - https://phabricator.wikimedia.org/T144149#2593451 (10KartikMistry) >>! In T144149#2592051, @mobrovac wrote: > @KartikMistry, when we were switching CXServer to scap3 we... [04:18:44] Project selenium-MultimediaViewer » firefox,beta,Linux,contintLabsSlave && UbuntuTrusty build #126: 04FAILURE in 22 min: https://integration.wikimedia.org/ci/job/selenium-MultimediaViewer/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/126/ [05:18:20] 10scap: Tab completion doesn't work well for directories - https://phabricator.wikimedia.org/T144244#2593475 (10thcipriani) [06:12:22] 10Beta-Cluster-Infrastructure, 07Puppet: puppet agent -tv fails to run on deployment-sca01 - https://phabricator.wikimedia.org/T144256#2593546 (10KartikMistry) [06:46:09] PROBLEM - Puppet run on deployment-mediawiki01 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [06:53:58] PROBLEM - Puppet run on deployment-eventlogging03 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [07:16:03] 06Release-Engineering-Team (Deployment-Blockers), 05Release: MW-1.28.0-wmf.17 deployment blockers - https://phabricator.wikimedia.org/T142117#2593661 (10Arrbee) [07:21:08] RECOVERY - Puppet run on deployment-mediawiki01 is OK: OK: Less than 1.00% above the threshold [0.0] [07:33:59] RECOVERY - Puppet run on deployment-eventlogging03 is OK: OK: Less than 1.00% above the threshold [0.0] [07:37:04] good morning [08:05:10] PROBLEM - English Wikipedia Main page on beta-cluster is CRITICAL: HTTP CRITICAL: HTTP/1.1 301 TLS Redirect - string 'Wikipedia' not found on 'http://en.wikipedia.beta.wmflabs.org:80/wiki/Main_Page?debug=true' - 587 bytes in 0.002 second response time [08:06:16] PROBLEM - English Wikipedia Mobile Main page on beta-cluster is CRITICAL: HTTP CRITICAL: HTTP/1.1 301 TLS Redirect - string 'Wikipedia' not found on 'http://en.m.wikipedia.beta.wmflabs.org:80/wiki/Main_Page?debug=true' - 589 bytes in 0.002 second response time [08:13:23] 10Beta-Cluster-Infrastructure, 07Puppet: puppet agent -tv fails to run on deployment-sca01 - https://phabricator.wikimedia.org/T144256#2593802 (10Krenair) [08:13:25] 10Beta-Cluster-Infrastructure, 07Puppet: deployment-sca0[12] puppet failure due to issues involving /srv/deployment directory - https://phabricator.wikimedia.org/T143065#2593805 (10Krenair) [08:23:18] PROBLEM - Puppet run on deployment-salt02 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [09:19:14] 06Release-Engineering-Team, 10MediaWiki-Vagrant, 06Operations, 07Epic: [EPIC] Migrate base image to Debian Jessie - https://phabricator.wikimedia.org/T136429#2593946 (10MoritzMuehlenhoff) [09:20:19] 06Release-Engineering-Team, 10MediaWiki-Vagrant, 06Operations, 07Epic: [EPIC] Migrate base image to Debian Jessie - https://phabricator.wikimedia.org/T136429#2334744 (10MoritzMuehlenhoff) This got mentioned as needing ops involvement in SoS, but in yesterday's Ops meeting 
we weren't sure what kind of help... [09:36:01] 10Browser-Tests-Infrastructure, 10VisualEditor, 10VisualEditor-MediaWiki, 13Patch-For-Review, and 2 others: Fix font support on SauceLabs VE screenshots - https://phabricator.wikimedia.org/T141369#2593984 (10zeljkofilipin) @Elitre, @Esanders, @Amire80 Are the fonts now fixed? Images from the last build (o... [09:37:32] 07Browser-Tests, 10Wikidata, 13Patch-For-Review: [Task] Find a sensible subset of browsertests that could be run on each commit - https://phabricator.wikimedia.org/T130019#2593988 (10Tobi_WMDE_SW) https://gerrit.wikimedia.org/r/#/c/301764/ [09:37:38] 07Browser-Tests, 10Wikidata, 13Patch-For-Review: [Task] Find a sensible subset of browsertests that could be run on each commit - https://phabricator.wikimedia.org/T130019#2593990 (10Tobi_WMDE_SW) 05Open>03Resolved a:03Tobi_WMDE_SW [10:12:39] 10Continuous-Integration-Config, 06Operations, 06Operations-Software-Development: Flake8 for python files without extension in puppet repo - https://phabricator.wikimedia.org/T144169#2594020 (10Volans) [10:18:40] hashar hi, I'm wondering, could you review https://gerrit.wikimedia.org/r/#/c/307439/ and https://gerrit.wikimedia.org/r/#/c/307441/ and https://gerrit.wikimedia.org/r/#/c/307308/ please? [10:18:54] two of the patches are to do with adding two projects to wikibugs2 [10:19:15] and the other is merging the upstream branch into the master branch for zuul [10:19:37] Also would you be able to merge the precise branch into jessie so I can test, please? [10:23:23] 10Continuous-Integration-Infrastructure (phase-out-gallium), 06Operations, 10hardware-requests: Allocate contint1001 to releng and allocate to a vlan - https://phabricator.wikimedia.org/T140257#2594023 (10mark) >>! In T140257#2553490, @thcipriani wrote: >>>! In T140257#2491705, @faidon wrote: >> I've deliber... [10:24:33] paladox: those patches are only a few hours old and you ping people that are already listed as reviewers? [10:24:54] andre__ mutante has asked for a +1 from the releng team [10:25:12] on all three? [10:25:15] No [10:25:19] just two of them [10:25:22] wikibugs2 [10:26:00] paladox, hmm, how is that related to what I just ask? [10:26:05] *asked [10:27:05] Because you said the patches are a few hours old, but I said mutante wants a +1 from releng before merging the patches, you said all three, I said no since he only wants the +1 on the wikibugs2 project patches I have [10:27:21] hashar: zeljkof: did something change recently (yesterday) in how browser tests on the integration server work? Getting https://integration.wikimedia.org/ci/job/mwext-mw-selenium-composer/4618/artifact/log/mw-dberror.log when trying to create wikibase properties via the api during testing [10:28:04] hashar: I remember us talking about some database changes... [10:28:22] "us" meaning Dan :) [10:28:41] paladox: which does not answer my question, so let me rephrase: Why are these patches so urgent that you ping after a few hours? [10:28:56] (Especially as people did receive notifications already via Gerrit?) [10:28:57] They're not urgent. [10:29:02] paladox, then why do you ping on IRC? [10:29:19] Because I wanted to see if he could review them [10:29:59] paladox: He already gets notifications from Gerrit. [10:30:05] Ok [10:30:15] paladox: There's no need for IRC pings after a few hours, sending emails, or calling him via the phone, or anything else to quadruplicate the amount of messages about the very same thing.
[10:30:27] ok [10:30:53] paladox: I have asked you twice already to not ping about patches *after a few hours* when it's not urgent. This is the third time. [10:31:04] paladox: Please tell me what to do to not tell you a fourth time? [10:31:21] ok, I'm sorry. [10:31:23] paladox: Please tell me what to do to not tell you a fourth time? [10:31:33] I don't know [10:32:09] I'd really like an answer for that question. This is a repeating pattern and it seems like you forget a few minutes after I've asked you to please create fewer pings for folks in different places about *the very same topic*. [10:32:46] ok [10:33:35] paladox: if you don't know then please do think of an answer and tell me at some point. I really do appreciate your work but pinging is distracting especially when there is no *good* reason to ping. Please be more patient. [10:33:50] Ok sorry [10:34:13] paladox: I know you want to get stuff done. And I understand that. But duplicating notifications all over the place for no good reason is distracting. [10:34:34] paladox: Please don't make me point this out a fourth time. Thanks a lot (and thanks for your patches!) :) [10:34:38] Yep sorry, I didn't mean to duplicate notifications and cause more problems. [10:35:34] paladox: It can totally make sense to ping on IRC when things are really urgent. Or if you got no reply in Gerrit after a few days or a week. [10:35:43] but not on every single proposed patch please :) Thanks! [10:35:47] ok [10:39:04] thanks [10:54:08] 06Release-Engineering-Team (Deployment-Blockers), 05Release: MW-1.28.0-wmf.17 deployment blockers - https://phabricator.wikimedia.org/T142117#2594058 (10hashar) [10:54:32] 06Release-Engineering-Team (Deployment-Blockers), 05Release: MW-1.28.0-wmf.17 deployment blockers - https://phabricator.wikimedia.org/T142117#2523354 (10hashar) T143974 got caught on beta and reverted. Mobile is going to polish it up and propose it again later. [11:17:02] 10Continuous-Integration-Config, 06Operations, 06Operations-Software-Development: Flake8 for python files without extension in puppet repo - https://phabricator.wikimedia.org/T144169#2594078 (10Volans) [11:29:21] zeljkof: could those changes cause the failures? [11:29:43] I cannot reproduce it locally [11:30:14] Tobi_WMDE_SW: not sure :( I was not involved in that, just a thought, hashar do you have any idea? [11:30:22] Tobi_WMDE_SW: did you create a task? [11:41:54] Tobi_WMDE_SW: zeljkof there is already a bug for that [11:42:05] search for "SUPER,REPLICATION CLIENT" in phabricator? [11:42:26] super old bug https://phabricator.wikimedia.org/T29975 [11:42:39] selenium related one https://phabricator.wikimedia.org/T144247 <--- Tobi_WMDE_SW [11:43:40] 10Continuous-Integration-Infrastructure: Frivolous Jenkins failures for Selenium due to DB error - https://phabricator.wikimedia.org/T144247#2594081 (10hashar) See also the old bug {T29975} Maybe it is related to the migration of the beta cluster database hosts to Jessie T138778 [11:45:42] 10Continuous-Integration-Infrastructure: Frivolous Jenkins failures for Selenium due to DB error - https://phabricator.wikimedia.org/T144247#2594086 (10Tobi_WMDE_SW) This is also causing #Wikidata integration tests to fail: https://integration.wikimedia.org/ci/job/mwext-mw-selenium-composer/4618/artifact/log/... [11:47:43] hashar: thx, that's it. I've commented on it [11:48:18] since it's failing for wikibase on every new change I think the short-term solution is to remove the affected tests [11:48:59] Tobi_WMDE_SW: it might also be an issue in mediawiki!
[12:05:41] Nikerabbit: still around ? [12:05:56] Nikerabbit: I am looking for details about https://phabricator.wikimedia.org/T143889 "Replies impossible: The content format json is not supported by the content model" [12:23:14] hashar: what about it? [12:23:50] hashar: the first of those editpage patches breaks lqt, and the other three depend on the first [12:25:51] Nikerabbit: I have reproduced it ! :) [12:27:49] simply by heading to https://test.wikimedia.beta.wmflabs.org/wiki/User_talk:Hashar and trying to reply to a message [12:28:49] 06Release-Engineering-Team (Deployment-Blockers), 05Release: MW-1.28.0-wmf.17 deployment blockers - https://phabricator.wikimedia.org/T142117#2594178 (10hashar) T143889 breaks some part of LiquidThreads and is reproducible on beta cluster. Due to a series of patches in MediaWiki. [12:32:39] Nikerabbit: I will probably revert the whole series [12:38:20] hashar: okay [12:38:33] hashar: unrelated, any idea why uls started suddenly failing: https://integration.wikimedia.org/ci/job/jshint/6208/console [12:38:48] I cannot reproduce it locally; futurehostile should have been there since 2015 [12:47:51] hashar: language screenshots job needs 3+ hours to run, probably the same to upload all images (since the bot is rate limited to one upload every 5 seconds) :) https://integration.wikimedia.org/ci/job/language-screenshots-VisualEditor/BROWSER=chrome,PLATFORM=Windows%2010,label=ci-jessie-wikimedia/48/consoleFull [12:47:55] but seems to work fine so far [12:50:29] Nikerabbit: we should stop using jshint and instead use the version from npm and an npm test job [12:50:41] Nikerabbit: it is terribly outdated [12:51:38] !log disabling https://integration.wikimedia.org/ci/view/Beta/job/beta-code-update-eqiad/ to cherry pick a revert patch [12:51:42] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [12:53:42] !log Cherry picking https://gerrit.wikimedia.org/r/#/c/307501/ on beta cluster for T143889 [12:53:46] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [12:55:19] !log Running scap on beta cluster via https://integration.wikimedia.org/ci/view/Beta/job/beta-scap-eqiad/117786/console T143889 [12:55:23] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [12:55:28] hopefully that will sync the cherry pick [12:59:06] !log beta: revert master branch to origin. Ran scap and re-enabled the beta-code-update-eqiad job. [12:59:10] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [13:24:32] (03PS3) 10Hashar: Revert "Move mediawiki-core-phpcs off of nodepool" [integration/config] - 10https://gerrit.wikimedia.org/r/306726 [13:48:55] zeljkof: so in short [13:48:59] I got wmf.17 on tin [13:49:05] and your scap command is now syncing it all [13:49:05] :( [13:49:09] to the other master [13:58:17] RECOVERY - Puppet run on deployment-salt02 is OK: OK: Less than 1.00% above the threshold [0.0] [14:11:30] zeljkof: you can get dcausse's patch on mw1099 [14:11:36] should be fine even if jaime is still testing [14:12:03] (03CR) 10Hashar: [C: 032] Revert "Move mediawiki-core-phpcs off of nodepool" [integration/config] - 10https://gerrit.wikimedia.org/r/306726 (owner: 10Hashar) [14:13:01] (03Merged) 10jenkins-bot: Revert "Move mediawiki-core-phpcs off of nodepool" [integration/config] - 10https://gerrit.wikimedia.org/r/306726 (owner: 10Hashar) [14:19:11] (03CR) 10Mforns: [C: 031] "LGTM!"
[integration/config] - 10https://gerrit.wikimedia.org/r/307113 (https://phabricator.wikimedia.org/T144119) (owner: 10Hashar) [14:21:14] dcausse: it is a bit slow, we are multitasking on a bunch of things :] [14:21:28] dcausse: but yours is about to be pushed. Gotta test it on mw1099, don't we? [14:21:34] err wrong channel [14:21:37] hashar: sure no problem :) [14:26:07] (03PS2) 10Hashar: [analytics/reportupdater] add tox [integration/config] - 10https://gerrit.wikimedia.org/r/307113 (https://phabricator.wikimedia.org/T144119) [14:26:12] (03CR) 10Hashar: [C: 032] [analytics/reportupdater] add tox [integration/config] - 10https://gerrit.wikimedia.org/r/307113 (https://phabricator.wikimedia.org/T144119) (owner: 10Hashar) [14:26:23] 10Continuous-Integration-Infrastructure: Frivolous Jenkins failures for Selenium due to DB error - https://phabricator.wikimedia.org/T144247#2594548 (10Ladsgroup) p:05Triage>03Unbreak! [14:26:39] 10Continuous-Integration-Infrastructure: Frivolous Jenkins failures for Selenium due to DB error - https://phabricator.wikimedia.org/T144247#2594554 (10Ladsgroup) All of the Jenkins jobs in Wikidata fail, we can't merge anything there. [14:27:21] (03Merged) 10jenkins-bot: [analytics/reportupdater] add tox [integration/config] - 10https://gerrit.wikimedia.org/r/307113 (https://phabricator.wikimedia.org/T144119) (owner: 10Hashar) [14:31:05] 10Continuous-Integration-Config, 10Analytics, 13Patch-For-Review: Add test runner and CI configuration to analytics/reportupdater - https://phabricator.wikimedia.org/T144119#2594575 (10hashar) All good. Thank you @mforns [14:32:51] 10Continuous-Integration-Config, 10Analytics, 13Patch-For-Review: Add test runner and CI configuration to analytics/reportupdater - https://phabricator.wikimedia.org/T144119#2594581 (10hashar) 05Open>03Resolved [14:47:05] 06Release-Engineering-Team (Deployment-Blockers), 05Release: MW-1.28.0-wmf.17 deployment blockers - https://phabricator.wikimedia.org/T142117#2594615 (10hashar) T143889 is imho no longer a blocker. The root cause in MediaWiki core has been reverted. T109140 got rebased and is applied. All security patches appl... [14:47:16] 10Beta-Cluster-Infrastructure, 03Scap3, 10Citoid, 06Services, and 2 others: Can't deploy Citoid in Beta - https://phabricator.wikimedia.org/T132666#2594617 (10mobrovac) [14:56:22] so the wmf.17 train should be fine [14:56:29] at least all the preparation steps are done afaik [14:57:31] bbl [15:21:46] 05Gerrit-Migration, 10Differential: Set up arcyd to create differential revisions with `git push` (code review without arcanist) - https://phabricator.wikimedia.org/T132863#2594727 (10Paladox) @mmodell can we raise the priority of this task, since asking users to use arcanist will only make it difficult in us... [15:24:38] 10Continuous-Integration-Infrastructure: Frivolous Jenkins failures for Selenium due to DB error - https://phabricator.wikimedia.org/T144247#2593143 (10hoo) Note: a94fe6c634780cd203ea79287b61966bacfbfdae should be reverted once this is fixed. [15:39:15] Project selenium-MobileFrontend » chrome,beta,Linux,contintLabsSlave && UbuntuTrusty build #138: 04FAILURE in 17 min: https://integration.wikimedia.org/ci/job/selenium-MobileFrontend/BROWSER=chrome,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/138/ [15:40:02] twentyafterfour hi, I'm thinking that for doing arcanist in jenkins nodepool we could use puppet for this?
[15:40:16] Since we do it for gerrit, installing the token [15:40:21] that the gerritbot uses [15:41:48] openstack does it [15:41:53] https://github.com/openstack-infra/puppet-phabricator/blob/master/spec/acceptance/nodesets/nodepool-trusty.yml [15:46:57] Project selenium-MobileFrontend » firefox,beta,Linux,contintLabsSlave && UbuntuTrusty build #138: 04FAILURE in 24 min: https://integration.wikimedia.org/ci/job/selenium-MobileFrontend/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/138/ [15:48:45] 10Continuous-Integration-Infrastructure, 07Nodepool: Bring back jobs to Nodepool - https://phabricator.wikimedia.org/T143938#2594772 (10chasemp) task cycle instructions https://graphite.wikimedia.org/render/?width=1064&height=549&_salt=1472565876.337&target=cactiStyle(nodepool.task.wmflabs-eqiad.*.count)&area... [15:50:38] 06Release-Engineering-Team, 10MediaWiki-Vagrant, 06Operations, 07Epic: [EPIC] Migrate base image to Debian Jessie - https://phabricator.wikimedia.org/T136429#2594773 (10bd808) >>! In T136429#2593946, @MoritzMuehlenhoff wrote: > This got mentioned as needing ops involvement in SoS, but in yesterday's Ops me... [16:13:53] 10Continuous-Integration-Infrastructure, 07Nodepool, 13Patch-For-Review: Bring back jobs to Nodepool - https://phabricator.wikimedia.org/T143938#2594882 (10chasemp) notes on min-ready https://github.com/openstack-infra/nodepool/blob/master/nodepool/nodepool.py#L1389 [16:21:46] (03PS9) 10Awight: Use composer in DonationInterface hhvm tests [integration/config] - 10https://gerrit.wikimedia.org/r/301025 (https://phabricator.wikimedia.org/T141309) [16:21:56] (03PS2) 10Awight: Tests for DonationInterface/vendor submodule [integration/config] - 10https://gerrit.wikimedia.org/r/304875 (https://phabricator.wikimedia.org/T143025) [16:23:47] (03CR) 10Paladox: Use composer in DonationInterface hhvm tests (031 comment) [integration/config] - 10https://gerrit.wikimedia.org/r/301025 (https://phabricator.wikimedia.org/T141309) (owner: 10Awight) [16:24:38] ostriches I was wrong yesterday about it correctly reporting the author of the patch for the gerritbot in phabricator [16:24:44] That is a bug [16:25:12] and is fixed in gerrit 2.12.4 which will now tell you the person who created the patchset and linked it to the task in phabricator [16:27:29] hashar I'm wondering if you could take a look at https://gerrit.wikimedia.org/r/#/c/307033/ please? I'm not sure if that is the correct way but it looks like it. Will it ignore the files I want it to ignore? [16:27:36] patch is 3 days old [16:28:05] (03CR) 10Awight: Use composer in DonationInterface hhvm tests (031 comment) [integration/config] - 10https://gerrit.wikimedia.org/r/301025 (https://phabricator.wikimedia.org/T141309) (owner: 10Awight) [16:29:55] (03CR) 10Paladox: Use composer in DonationInterface hhvm tests (031 comment) [integration/config] - 10https://gerrit.wikimedia.org/r/301025 (https://phabricator.wikimedia.org/T141309) (owner: 10Awight) [16:30:11] (03CR) 10Awight: "This is great, thanks for helping us continue to lint this repo.
I like what the patch does, but will let CI maintainers weigh in on the " [integration/config] - 10https://gerrit.wikimedia.org/r/307033 (https://phabricator.wikimedia.org/T143598) (owner: 10Paladox) [16:31:44] (03PS10) 10Awight: Use composer in DonationInterface hhvm tests [integration/config] - 10https://gerrit.wikimedia.org/r/301025 (https://phabricator.wikimedia.org/T141309) [16:32:57] (03CR) 10jenkins-bot: [V: 04-1] Use composer in DonationInterface hhvm tests [integration/config] - 10https://gerrit.wikimedia.org/r/301025 (https://phabricator.wikimedia.org/T141309) (owner: 10Awight) [16:34:06] Yippee, build fixed! [16:34:07] Project selenium-MobileFrontend » chrome,beta,Linux,contintLabsSlave && UbuntuTrusty build #139: 09FIXED in 17 min: https://integration.wikimedia.org/ci/job/selenium-MobileFrontend/BROWSER=chrome,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/139/ [16:37:08] (03PS11) 10Awight: Use composer in DonationInterface hhvm tests [integration/config] - 10https://gerrit.wikimedia.org/r/301025 (https://phabricator.wikimedia.org/T141309) [16:37:23] (03CR) 10Paladox: Use composer in DonationInterface hhvm tests (031 comment) [integration/config] - 10https://gerrit.wikimedia.org/r/301025 (https://phabricator.wikimedia.org/T141309) (owner: 10Awight) [16:38:15] (03CR) 10Paladox: Use composer in DonationInterface hhvm tests (031 comment) [integration/config] - 10https://gerrit.wikimedia.org/r/301025 (https://phabricator.wikimedia.org/T141309) (owner: 10Awight) [16:39:05] 06Release-Engineering-Team (Deployment-Blockers), 13Patch-For-Review, 05Release: MW-1.28.0-wmf.17 deployment blockers - https://phabricator.wikimedia.org/T142117#2594955 (10greg) [16:39:23] (03CR) 10Awight: Use composer in DonationInterface hhvm tests (031 comment) [integration/config] - 10https://gerrit.wikimedia.org/r/301025 (https://phabricator.wikimedia.org/T141309) (owner: 10Awight) [16:40:49] (03CR) 10Awight: Use composer in DonationInterface hhvm tests (031 comment) [integration/config] - 10https://gerrit.wikimedia.org/r/301025 (https://phabricator.wikimedia.org/T141309) (owner: 10Awight) [16:42:38] 05Continuous-Integration-Scaling, 13Patch-For-Review, 07Puppet: Hiera is not properly configured on Nodepool instances - https://phabricator.wikimedia.org/T129092#2594971 (10hashar) Last patch landed in puppet.git so that is definitely fixed. [17:10:05] hasharAway, https://integration.wikimedia.org/ci/job/npm-node-4/2867/console [17:10:37] bummer, canvas is not building in jenkins :( [17:18:37] (03PS1) 10Paladox: [DonationInterface] Switch jenkins tests to extension-unittests-composer-non-voting [integration/config] - 10https://gerrit.wikimedia.org/r/307543 [17:19:51] (03PS2) 10Paladox: [DonationInterface] Switch jenkins tests to extension-unittests-composer-non-voting [integration/config] - 10https://gerrit.wikimedia.org/r/307543 [17:25:19] (03CR) 10Paladox: "@Awight would you be able to take a look at this and +1 or -1 please?" [integration/config] - 10https://gerrit.wikimedia.org/r/307543 (owner: 10Paladox) [17:34:02] 10Continuous-Integration-Infrastructure, 06Operations: Upgrade jenkins-debian-glue on Jessie slaves from 0.13.0 to latest (0.17.0) - https://phabricator.wikimedia.org/T141114#2595202 (10hashar) Too many patches for today's puppet swat. @fgiunchedi and I will sync tomorrow. [17:34:10] yurik: fill a task please!
:] [17:41:04] 10Continuous-Integration-Infrastructure (phase-out-gallium), 06Operations, 10hardware-requests: Allocate contint1001 to releng and allocate to a vlan - https://phabricator.wikimedia.org/T140257#2595280 (10thcipriani) >>! In T140257#2594023, @mark wrote: > we can't open arbitrary firewall holes between labs i... [17:52:49] twentyafterfour we could use puppet [17:52:58] to apply the token for arcanist [17:53:11] but use the secret puppet repo for that. [17:54:38] paladox: afaik nodepool instances don't run puppet before they run tests [17:55:17] they run puppet once on the snapshot. and I tried to apply the token that way but I was told it won't work [17:56:07] cause puppet is run without the private repo [17:56:14] since the content of nodepool instances is de facto public [17:56:35] thanks hasharAway! [17:56:39] paladox: ^ [17:56:43] so either [17:56:55] we find a way to generate a token per run which is discarded once the build is complete [17:57:07] or get arc unit to work without token / anonymously (unlikely) [17:57:11] or whatever else [17:57:24] or limit the token's usefulness [17:57:32] so that it can't be used for spamming [17:58:18] 290 examples, 127 failures [17:58:18] ^^ progress :] [17:58:22] I am disappearing again [18:04:07] 10Deployment-Systems, 06Release-Engineering-Team: List all SWAT deployers for each window (no longer segment) - https://phabricator.wikimedia.org/T144297#2595338 (10greg) [18:05:11] 10Deployment-Systems, 06Release-Engineering-Team, 15User-greg: List all SWAT deployers for each window (no longer segment) - https://phabricator.wikimedia.org/T144297#2595350 (10greg) [18:12:58] twentyafterfour what about doing what hashar says, making arc patch, unit and lint work anonymously [18:13:00] ? [18:16:21] twentyafterfour but then that brings us back to plain git / github forking; they estimate 80 hours of work [18:16:44] https://secure.phabricator.com/T6706 [18:16:59] if we could limit the api token by ip range, that might be good enough [18:16:59] Oh, or an ssh key [18:17:03] thanks [18:17:05] and yeh [18:17:12] PROBLEM - Puppet staleness on deployment-kafka03 is CRITICAL: CRITICAL: 16.67% of data above the critical threshold [43200.0] [18:17:14] anyway I'm not working on that right now, back to scap [18:17:14] maybe that requires a customisation update for us [18:17:47] twentyafterfour or use the jessie slaves legoktm set up; for now don't use nodepool until the issue is fixed. [18:18:24] PROBLEM - Puppet run on deployment-kafka03 is CRITICAL: CRITICAL: 14.29% of data above the critical threshold [0.0] [18:18:44] I didn't know we had jessie slaves besides nodepool [18:18:46] ;) [18:19:57] twentyafterfour yep, it was set up when nodepool started to fail a few weeks ago. [18:22:13] RECOVERY - Puppet staleness on deployment-kafka03 is OK: OK: Less than 1.00% above the threshold [3600.0] [18:23:25] RECOVERY - Puppet run on deployment-kafka03 is OK: OK: Less than 1.00% above the threshold [0.0] [18:34:09] 07Browser-Tests, 10MobileFrontend, 06Reading-Web-Backlog: Page diff test failing - https://phabricator.wikimedia.org/T144300#2595417 (10jhobs) [18:35:31] 07Browser-Tests, 10MobileFrontend, 06Reading-Web-Backlog: Page diff test failing - https://phabricator.wikimedia.org/T144300#2595433 (10jhobs) I'm a bit confused by this failure as it failed in both Chrome and Firefox in 138 but only fails in Firefox in 139...
[18:40:05] !log Puppet busted on deployment-aqs01 -- Could not find data item analytics_hadoop_hosts in any Hiera data file and no default supplied at /etc/puppet/manifests/role/aqs.pp:46 [18:40:09] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [18:40:31] anybody know who takes care of aqs in beta cluster? [18:42:49] bd808: there's a big task about it: https://phabricator.wikimedia.org/T116206 [18:43:13] no resolution there though :( [18:43:51] so... yeah [18:54:02] 10Beta-Cluster-Infrastructure, 03Scap3 (Scap3-Adoption-Phase1), 10scap, 10Analytics, and 3 others: Set up AQS in Beta - https://phabricator.wikimedia.org/T116206#1743135 (10bd808) >>! In T116206#2582429, @elukey wrote: > Thanks for reporting, this is my bad since analytics_hadoop_hosts is not in hiera labs... [18:54:20] thcipriani: ^ I fixed puppet at least [18:54:42] twentyafterfour would this work by reusing (!PhabricatorEnv::isClusterRemoteAddress [18:54:46] woops wrong thing [18:54:47] this [18:54:51] https://github.com/phacility/phabricator/blob/814fa135b03606f44f8bc9036f5eaae1b355d083/src/applications/conduit/controller/PhabricatorConduitAPIController.php#L230 [18:55:16] and adding in there a check for whichever user we are using with the token [18:55:30] bd808: ack, thank you [18:57:51] !log Duplicate declaration: File[/srv/deployment] is already declared in file /etc/puppet/modules/contint/manifests/deployment_dir.pp:14; cannot redeclare at /etc/puppet/modules/service/manifests/deploy/common.pp:12 on node deployment-sca01.deployment-prep.eqiad.wmflabs [18:57:54] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [19:00:55] hiyaaa, is today's train starting soon? [19:01:17] ottomata: yeah, in the operations channel [19:01:48] RECOVERY - Puppet run on deployment-aqs01 is OK: OK: Less than 1.00% above the threshold [0.0] [19:06:25] 06Release-Engineering-Team (Deployment-Blockers), 05Release: Undefined index: mVersion in /srv/mediawiki/php-1.28.0-wmf.16/extensions/CentralAuth/includes/CentralAuthUser.php on line 494 - https://phabricator.wikimedia.org/T144307#2595577 (10hashar) [19:06:49] 06Release-Engineering-Team (Deployment-Blockers), 13Patch-For-Review, 05Release: MW-1.28.0-wmf.17 deployment blockers - https://phabricator.wikimedia.org/T142117#2523354 (10hashar) As soon as I deployed, it caused a large spam of {T144307} [19:08:22] Project selenium-MobileFrontend » chrome,beta,Linux,contintLabsSlave && UbuntuTrusty build #140: 04FAILURE in 17 min: https://integration.wikimedia.org/ci/job/selenium-MobileFrontend/BROWSER=chrome,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/140/ [19:09:29] 06Release-Engineering-Team (Deployment-Blockers), 13Patch-For-Review, 05Release: Undefined index: mVersion in /srv/mediawiki/php-1.28.0-wmf.16/extensions/CentralAuth/includes/CentralAuthUser.php on line 494 - https://phabricator.wikimedia.org/T144307#2595606 (10Paladox) Caused by https://phabricator.wikimedi... [19:09:36] I did my first rollback!
[19:09:39] 06Release-Engineering-Team (Deployment-Blockers), 13Patch-For-Review, 05Release: Undefined index: mVersion in /srv/mediawiki/php-1.28.0-wmf.16/extensions/CentralAuth/includes/CentralAuthUser.php on line 494 - https://phabricator.wikimedia.org/T144307#2595610 (10Dereckson) [19:09:39] ever [19:09:46] LOL [19:10:54] 06Release-Engineering-Team (Deployment-Blockers), 13Patch-For-Review, 05Release: Undefined index: mVersion in /srv/mediawiki/php-1.28.0-wmf.16/extensions/CentralAuth/includes/CentralAuthUser.php on line 494 - https://phabricator.wikimedia.org/T144307#2595620 (10hashar) [19:11:45] 10Beta-Cluster-Infrastructure, 06Release-Engineering-Team, 10DBA, 13Patch-For-Review, 07WorkType-Maintenance: Upgrade mariadb in deployment-prep from Precise/MariaDB 5.5 to Jessie/MariaDB 5.10 - https://phabricator.wikimedia.org/T138778#2595623 (10dduvall) @jcrespo, great! I've set up a 2 hour calendar e... [19:13:22] !log Fixed puppet runs on deployment-sca0[12] with cherry-pick of https://gerrit.wikimedia.org/r/#/c/307561 [19:13:26] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [19:15:28] 10Beta-Cluster-Infrastructure, 03Scap3 (Scap3-Adoption-Phase1), 10scap, 10Analytics, and 3 others: Set up AQS in Beta - https://phabricator.wikimedia.org/T116206#2595633 (10bd808) [19:15:32] 10Beta-Cluster-Infrastructure, 07Puppet, 07Tracking: Deployment-prep hosts with puppet errors (tracking) - https://phabricator.wikimedia.org/T132259#2595632 (10bd808) [19:19:12] RECOVERY - Puppet run on deployment-sca01 is OK: OK: Less than 1.00% above the threshold [0.0] [19:23:18] 06Release-Engineering-Team (Deployment-Blockers), 13Patch-For-Review, 05Release, 07Wikimedia-log-errors: Undefined index: mVersion in /srv/mediawiki/php-1.28.0-wmf.16/extensions/CentralAuth/includes/CentralAuthUser.php on line 494 - https://phabricator.wikimedia.org/T144307#2595668 (10hashar) [19:23:28] RECOVERY - Puppet run on deployment-sca02 is OK: OK: Less than 1.00% above the threshold [0.0] [19:34:03] 06Release-Engineering-Team (Deployment-Blockers), 13Patch-For-Review, 05Release: MW-1.28.0-wmf.17 deployment blockers - https://phabricator.wikimedia.org/T142117#2595701 (10hashar) scap sync-wikiversions 'group0 to 1.28.0-wmf.17 (bis) T144307' [19:35:06] 06Release-Engineering-Team (Deployment-Blockers), 13Patch-For-Review, 05Release, 07Wikimedia-log-errors: Undefined index: mVersion in /srv/mediawiki/php-1.28.0-wmf.16/extensions/CentralAuth/includes/CentralAuthUser.php on line 494 - https://phabricator.wikimedia.org/T144307#2595703 (10hashar) a:05hashar>... [19:36:44] 06Release-Engineering-Team (Deployment-Blockers), 13Patch-For-Review, 05Release: MW-1.28.0-wmf.17 deployment blockers - https://phabricator.wikimedia.org/T142117#2595709 (10hashar) [19:36:46] 06Release-Engineering-Team (Deployment-Blockers), 13Patch-For-Review, 05Release, 07Wikimedia-log-errors: Undefined index: mVersion in /srv/mediawiki/php-1.28.0-wmf.16/extensions/CentralAuth/includes/CentralAuthUser.php on line 494 - https://phabricator.wikimedia.org/T144307#2595708 (10hashar) 05Open>03R... 
[19:36:54] (03PS2) 10Legoktm: Parsoid's tool and roundtrip tests should be on node v4 [integration/config] - 10https://gerrit.wikimedia.org/r/306710 (owner: 10Arlolra) [19:43:48] (03PS3) 10Legoktm: Parsoid's tool and roundtrip tests should be on node v4 [integration/config] - 10https://gerrit.wikimedia.org/r/306710 (owner: 10Arlolra) [19:47:04] (03CR) 10Legoktm: [C: 032] "Deploying 'parsoidsvc-deploy-parse-tool-check-jessie parsoidsvc-deploy-roundtrip-test-check-jessie parsoidsvc-source-parse-tool-check-jess" [integration/config] - 10https://gerrit.wikimedia.org/r/306710 (owner: 10Arlolra) [19:48:05] (03Merged) 10jenkins-bot: Parsoid's tool and roundtrip tests should be on node v4 [integration/config] - 10https://gerrit.wikimedia.org/r/306710 (owner: 10Arlolra) [19:48:15] !log deploying https://gerrit.wikimedia.org/r/306710 [19:48:18] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [19:51:12] hashar: ^ just fyi, that moved 4 parsoid jobs from nodepool trusty to nodepool jessie [19:51:27] chasemp: ^^ [19:52:07] legoktm: we moved the mediawiki/core phpcs job back to nodepool as well [19:52:18] awesome :) [19:52:21] legoktm: could you !log that in operations w/ nodepool in there somewhere [19:52:28] we are trying to be very deliberate about it to watch load [19:52:28] okay [19:52:34] legoktm: and https://grafana.wikimedia.org/dashboard/db/zuul-job can give you an idea of the builds per unit of time for a given job [19:53:57] ooh nice [19:54:48] hashar legoktm I wonder if you could take a look at https://gerrit.wikimedia.org/r/#/c/227223/ please? [19:54:57] I tested it and it works [19:56:33] thanks legoktm [19:56:43] np [19:57:05] legoktm: there is a high suspicion that the rate of requests nodepool makes to openstack causes it to explode [19:57:15] or at least to uncover / hit a few corner case bugs [19:57:38] * legoktm nods [20:20:50] 10Continuous-Integration-Config, 10Fundraising-Backlog, 13Patch-For-Review: symfony-polyfill54 is breaking CI - https://phabricator.wikimedia.org/T143598#2595857 (10DStrine) p:05High>03Normal [20:23:52] (03CR) 10Paladox: "@awight I'm wondering if this is ok?"
[integration/config] - 10https://gerrit.wikimedia.org/r/307543 (owner: 10Paladox) [20:27:57] PROBLEM - Puppet run on deployment-pdf01 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [20:28:01] PROBLEM - Puppet run on deployment-puppetmaster is CRITICAL: CRITICAL: 10.00% of data above the critical threshold [0.0] [20:28:19] PROBLEM - Puppet run on deployment-restbase02 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [20:28:21] PROBLEM - Puppet run on deployment-stream is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [20:28:51] PROBLEM - Puppet run on deployment-imagescaler01 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [20:29:07] PROBLEM - Puppet run on deployment-db2 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [20:29:07] I'm doing funky things with deployment-prep puppet [20:29:11] PROBLEM - Puppet run on deployment-pdf02 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [20:30:03] PROBLEM - Puppet run on deployment-sentry01 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [20:30:13] PROBLEM - Puppet run on deployment-ircd is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [20:30:13] PROBLEM - Puppet run on deployment-sca01 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [20:31:19] PROBLEM - Puppet run on deployment-memc04 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0] [20:32:55] PROBLEM - Puppet run on deployment-db03 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [20:33:02] PROBLEM - Puppet run on deployment-kafka01 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [20:33:06] PROBLEM - Puppet run on deployment-db1 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [20:33:30] PROBLEM - Puppet run on deployment-tmh01 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [20:34:04] PROBLEM - Puppet run on deployment-ms-be01 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [20:34:15] I've no irc foo on shutting shinken-wm up [20:34:24] PROBLEM - Puppet run on deployment-kafka03 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [20:35:03] killed it [20:35:04] I think [20:42:03] yuvipanda: ?
sorry, been planning on doing https://phabricator.wikimedia.org/T120159 on deployment-prep today with bd808 and krenair in different channels, forgot this channel has no context [20:43:06] killed the shinken-wm bot [20:43:36] you just have to follow all of the channels we are in to keep up ;) [20:43:39] 20:42:34 Gem::RemoteFetcher::FetchError: Errno::ETIMEDOUT: Connection timed out - connect(2) for "rubygems.global.ssl.fastly.net" port 443 (https://rubygems.org/gems/childprocess-0.5.9.gem) [20:46:26] legoktm mutante had that problem too [20:46:31] in -operations [20:46:33] I always forget about -labs-admin [20:46:52] he just retried and it worked [20:52:07] !log cherry-picked appropriate patch on deployment-puppetmaster for T120159, did https://wikitech.wikimedia.org/w/index.php?title=Hiera:Deployment-prep/host/deployment-puppetmaster&oldid=818847 to make sure the puppetmaster allows connections from elsewhere [20:52:11] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [20:52:29] 10Continuous-Integration-Infrastructure (phase-out-gallium), 06Operations, 10hardware-requests: Allocate contint1001 to releng and allocate to a vlan - https://phabricator.wikimedia.org/T140257#2595926 (10hashar) Regarding the use of a public IP: gallium had one for historical reasons and all uses have been... [20:56:46] bd808 greg-g I'll keep logging here :) [20:56:58] good :) [20:57:12] there are db01, 02 entries that are probably just stale LDAP, but I won't get sucked into that rabbit hole today [21:01:50] every time I run a command on more than 1 host at a time I feel like I'm gonna fuck up big time [21:02:42] (03CR) 10Awight: [C: 031] "Nice fix, thanks!" [integration/config] - 10https://gerrit.wikimedia.org/r/307543 (owner: 10Paladox) [21:06:22] !log moved deployment-db[12], deployment-stream to not use role::puppet::self, attempting to semi-automate the rest [21:06:26] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [21:15:57] !log deployment-pdf02 has proper ssl certs mysteriously without me doing anything [21:16:02] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [21:16:03] !log deployment-pdf01 fixed manually [21:16:06] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [21:33:21] (03CR) 10Paladox: "You're welcome." [integration/config] - 10https://gerrit.wikimedia.org/r/307543 (owner: 10Paladox) [21:36:38] !log managed to get vim into a state where I cannot quit it, probably recording a macro. I hate computers [21:36:42] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [21:37:19] !log sudo takes like 15s each time, is there no god? [21:37:23] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [21:39:50] yuvipanda: there must be something funky in your dotfiles? [21:40:06] bd808 across all different projects, with and without NFS, no dotfiles [21:40:20] I'm pretty sure it's because I'm in a lot of projects and hence a lot of groups [21:40:20] hmmm [21:42:52] `time sudo -i ls` in deployment-prep says 0.184s for me. I'm "only" in 58 groups though [21:43:08] vs 157 for you [21:43:10] yeah we have a cherry pick on beta to slow down yuvi's dsh commands [21:43:46] maybe you are forwarding your agent and the millions of keys you have ? bastion-restricted is funky ?
too many lookups in ssh due to use of root account [21:43:59] !log use clush to fix puppet.conf of all clients, realize I also accidentally set a client's puppet.conf for the server, recover server's old conf file from a cat in shell history, restore, breathe sigh of relief [21:44:04] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [21:44:15] poor puppet.conf :( [21:45:50] fun one-liner to find your group count -- id|cut -d= -f4|tr , "\n"|wc - [21:45:59] errr -l [21:46:18] I should probably just remove myself from all these groups [21:47:32] I'm in 88 projects [21:47:36] I should remove most of them [21:52:36] hi would someone from releng review [21:52:36] https://gerrit.wikimedia.org/r/#/c/306413/ please? [22:06:52] sleep time for me *wave* [22:07:07] bd808 I'm down to a more manageable 93 groups now! [22:07:10] and sudo is faster! [22:07:16] I basically just kicked myself out of most projects [22:07:22] +1 [22:08:00] I need to bounce from a bunch too. being added as a side effect of creating the project is kind of lame [22:08:40] ya [22:14:08] bd808 I think everything except elasticsearch instances should be good now [22:14:11] and etcd [22:14:24] I'm looking through the problems on http://shinken.wmflabs.org/problems?global_search=deployment- [22:14:34] the fact that it doesn't show the full name is so annoying [22:14:37] * yuvipanda hates computers [22:15:21] what's the problem with the es cluster? [22:15:42] bd808 it's using funky puppet ssldir stuff [22:15:48] I've a patch coming up for that [22:16:21] oh. for the ssl cert that makes their nginx proxy work? [22:16:32] I've not looked at what they did for that [22:17:37] yeah, it relies on the puppet_ssldir function which depends on looking up the puppetmaster ldap variable [22:29:03] !log in lieu of blood sacrifice, restart puppetmaster on deployment-puppetmaster [22:29:07] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [23:02:57] something seems to be broken with Ruby on CI: https://integration.wikimedia.org/ci/job/rake/3085/console [23:03:37] SMalyshev: s/on CI// [23:03:38] ;-) [23:03:46] SMalyshev a recheck should probably work. [23:03:52] it is happening to other users [23:04:01] well, I'm not going that far... :) [23:04:12] paladox: unfortunately it didn't - that's after recheck [23:04:16] oh [23:04:26] sometimes wait a few minutes then do another rech [23:04:28] rechec [23:04:30] recheck [23:04:49] will try it... [23:05:38] Thanks [23:18:39] (03PS1) 10Paladox: Add rm -fR "$WORKSPACE/modules/*/bin" to jenkins job operations-puppet-doc [integration/config] - 10https://gerrit.wikimedia.org/r/307654 (https://phabricator.wikimedia.org/T143233) [23:19:00] (03PS2) 10Paladox: Add rm -fR "$WORKSPACE/modules/*/bin" to jenkins job operations-puppet-doc [integration/config] - 10https://gerrit.wikimedia.org/r/307654 (https://phabricator.wikimedia.org/T143233) [23:20:10] still broken :( https://integration.wikimedia.org/ci/job/rake/3097/console [23:20:28] Guessing ruby is having problems. [23:22:40] http://help.rubygems.org/discussions/problems/22609 [23:22:41] found it [23:22:44] will comment there [23:23:01] oh thanks [23:23:10] legoktm thanks for looking into that :) [23:24:10] hmm... looks like a rerun of that npm trouble some time ago?..
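One low-risk mitigation for transient fetch timeouts like the fastly one above is letting bundler retry failed network calls. A sketch of a .bundle/config (bundler's config file is YAML; BUNDLE_RETRY is the persisted form of `bundle install --retry=3`), assuming the failures are flakiness rather than hard rate limiting:

    ---
    # .bundle/config sketch: retry transient gem fetches a few times before
    # failing the job; this helps with flaky mirrors, not with hard blocks
    BUNDLE_RETRY: "3"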
[23:26:49] 10Continuous-Integration-Infrastructure: timeouts with rubygems.global.ssl.fastly.net causing jobs to fail - https://phabricator.wikimedia.org/T144325#2596308 (10Legoktm) [23:27:24] not sure [23:27:30] I wonder if this is rate limiting? [23:27:44] oh that looks like it [23:28:03] I really hope they revert that [23:28:06] apparently gitlab has a cache? [23:28:09] could be. or de-facto rate limiting aka DoS (intentional or not) [23:28:11] er, proxy cache thing [23:31:45] !log cherry-picking https://gerrit.wikimedia.org/r/#/c/307656/ fixed puppet on the elasticsearch machines! [23:31:49] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL, Master [23:34:03] legoktm http://stackoverflow.com/questions/21364283/gemremotefetcherunknownhosterror-while-installing-rails-version-3-2-15 ? [23:38:37] legoktm it seems pinging rubygems.org is not working [23:38:44] taking a long time to complete [23:39:02] C:\WINDOWS\system32>ping rubygems.org [23:39:02] Pinging rubygems.org [54.186.104.15] with 32 bytes of data: [23:39:02] Request timed out. [23:39:02] Request timed out. [23:39:02] Request timed out. [23:39:03] Request timed out. [23:39:05] Ping statistics for 54.186.104.15: [23:39:09] Packets: Sent = 4, Received = 0, Lost = 4 (100% loss), [23:40:02] bd808 krenair \o/ I think I've sorted out all the deployment-prep instances! [23:40:05] maybe they're blocking ping? :S [23:40:43] oh [23:43:47] alright, verified [23:43:57] I declare my messing with puppet on deployment-prep over! [23:44:15] I have one cherry pick, which I'll merge after finishing up doing https://phabricator.wikimedia.org/T120159 on integration, etcd and wikidata-query [23:44:17] tomorrow! [23:47:58] 'rake' is running and breaking unrelated mediawiki commits [23:48:02] https://gerrit.wikimedia.org/r/#/c/305933/ [23:49:34] Krinkle see https://phabricator.wikimedia.org/T144325 please [23:50:49] Is it fixable soon? Can we disable that job for now? [23:51:01] looks like rake is back on track... at least for my patch :) [23:51:21] which means probably some rubygems host was/is having load issues [23:51:28] Krinkle looks like we have to disable it, since it seems to be the rubygems side that is causing the problem, which we have no control over [23:51:31] It should probably run conditionally, e.g. when a ruby file is changed. [23:51:41] legoktm I wonder, can we disable ruby on mw please? [23:51:46] That way at least the rest can keep getting reviewed and merged [23:51:47] per ^^ [23:52:00] We have similar filters for .js and .php in various places [23:53:50] Oh, but ruby can do all types of things I think, but not sure since I never used ruby [23:56:35] yeah, we should set up a filter. [23:56:55] yep [23:57:46] legoktm greg-g is contintcloud / nodepool doing ok? [23:58:03] yuvipanda: I think so? haven't heard any complaints... [23:58:12] ok!
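The filter idea floated above, running 'rake' only when ruby files change, maps to Zuul's per-job file matching: Zuul v2 layout.yaml job entries accept a "files" list of regexes, the same mechanism behind the existing .js/.php filters mentioned in the log. A sketch; the exact patterns are illustrative assumptions, not the filter that was eventually merged:

    jobs:
      - name: rake
        # only trigger the job when ruby-related files are touched;
        # these regexes are assumptions for illustration
        files:
          - '^.*\.rb$'
          - '^Gemfile.*$'
          - '^Rakefile$'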