[00:01:01] (and you'll see that through the process, they are active, then offline, then gone, as they are generated, tests run on them, and deleted) [00:01:20] That is the most safest way. [00:01:53] Does this happen on a regular basis :P [00:02:09] not all the time but it can happen [00:02:30] RECOVERY - nodepoold running on labnodepool1001 is OK: PROCS OK: 1 process with UID = 113 (nodepool), regex args ^/usr/bin/python /usr/bin/nodepoold -d [00:02:35] does what happen? [00:02:35] It's that damn Chaos Monkey [00:02:52] greg-g: nodepool acting up [00:03:08] well, it depends [00:03:18] Well if theres anything i can do ring me [00:03:19] this is probably an issue with labs right now [00:03:33] LOl, and not all the time. Zuul could be the problem, jenkins. [00:04:46] Wait greg-g i'm noticing jessie's instances going offline while they are working on an job [00:05:01] Yep [00:05:02] don't worry [00:05:06] It does that [00:05:15] It deletes and recreates them [00:05:30] thats the weirdest process i've every heard of but ok :P [00:05:45] Actually that is why it is the safest. [00:06:56] it appears jenkins finnaly catching up a bit its postmerging 2 jobs atm [00:07:49] yes, it's working again [00:08:19] it will prob take a few hrs to clear the queue up tho [00:08:32] yay works [00:08:34] and faster [00:09:23] Noticing mediawiki-core-jsduck-publish is taking a bit longer then the other jobs [00:10:41] legoktm it is working again. Would we be able to bring mw core and composer back to nodepool please? [00:11:23] I suggest maybe waiting a bit longer [00:11:23] let a few more jobs clear the queue [00:11:47] yeh [00:12:21] * paladox goes back to watching tv 01:12am bst here [00:12:22] :) [00:12:34] paladox: lets wait for the backlog to clear first [00:12:46] Yeh [00:12:53] * Zppix tottaly didnt read legoktm's mind [00:12:59] legoktm i created the revert commits [00:13:05] please ignore my -1 [00:13:28] and do it when ever you feel the best to do it. 
IE after all the back log is cleared and you have time [00:13:30] :) [00:13:51] (03PS1) 10Paladox: Revert "Move mediawiki-core-phpcs off of nodepool" [integration/config] - 10https://gerrit.wikimedia.org/r/304150 [00:14:25] (03PS2) 10Paladox: Revert "Temporarily move composer-hhvm/php5 jobs off of nodepool" [integration/config] - 10https://gerrit.wikimedia.org/r/304145 [00:14:30] (03CR) 10Paladox: Revert "Temporarily move composer-hhvm/php5 jobs off of nodepool" [integration/config] - 10https://gerrit.wikimedia.org/r/304145 (owner: 10Paladox) [00:14:35] paladox: legoktm: for now, let's leave things the way they are and not move jobs back to nodepool unless there are unforseen test failure issues [00:14:54] greg-g: i was thinking that too [00:15:03] ok sorry [00:15:29] Right now nothing is exploding or on fire so i think we can afford a bit of a wait [00:16:09] * Platonides sets ban on *!*@95.141.36.119 [00:16:09] * Platonides has kicked _jem_ from #mediawiki (_jem_) [00:16:20] wrong place [00:16:21] sorry [00:16:46] I was confused there a bit paladox [00:16:51] oh [00:16:59] that is for -operations [00:18:35] is trusty down? alot of jobs are queued for it and have been for a few hrs... is this a nodepool issue as well? [00:19:02] there are only two trusty nodepool [00:19:07] oh [00:19:17] slaves, so they may be the slowess to catch up [00:19:34] but most trusty slaves should be back with instances [00:19:38] 07Browser-Tests, 03Reading-Web-Sprint-78-Terminal-Velocity: Various browser tests failing due to login error - https://phabricator.wikimedia.org/T142600#2542578 (10Tgr) >>! In T142600#2541959, @greg wrote: > I presume it is User:Selenium_user on beta cluster, yes? No, I tested that in T142141#2538427. The ge... [00:20:04] Can i contribute some slaves? [00:20:26] I doint think that will work. 
But i guess that will be up to greg-g and releng [00:20:46] oh like i said still learning the CI ways [00:20:51] Zppix: no, not really the issue right now [00:21:00] kk [00:21:08] but thanks [00:21:12] alright, I'm headed out [00:21:13] np [00:21:15] until tomorrow [00:21:54] * paladox deffitly goes back to watching tv. [00:28:58] 10MediaWiki-Codesniffer: Update squizlabs/PHP_CodeSniffer to 3.x - https://phabricator.wikimedia.org/T142474#2542588 (10Legoktm) Once the GSoC stuff finishes. [00:38:37] 47 test jobs in queue atm doesnt seem to be clearing up as much as we though [00:38:39] thought* [00:41:29] gate queue is prioritized [01:24:17] mediawiki-core-doxygen-publish seems to be whats really slowing postmerge process down [01:25:22] yeh because in postmerge the jobs are limited to run once so it is slow, but allows us to direct the resources to the other tests [01:25:24] :) [01:25:42] * paladox goes back to watch tv [01:25:53] 02:25am here lol [01:26:49] paladox: still if it wasnt so slow i'm sure it would speed up the queue a bit seeing there wouldnt be 2 jobs in the postmerge process on average [01:28:14] https://gerrit.wikimedia.org/r/275169 has been sitting in test queue with jobs queued for 3+ hrs now [01:33:31] Love what im seeing i think jenkins is starting to kick instead of being dormat [02:17:37] Yippee, build fixed! [02:17:38] Project selenium-QuickSurveys » chrome,beta,Linux,contintLabsSlave && UbuntuTrusty build #115: 09FIXED in 4 min 36 sec: https://integration.wikimedia.org/ci/job/selenium-QuickSurveys/BROWSER=chrome,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/115/ [02:19:51] cool wmf-insecte [02:26:44] RECOVERY - Host deployment-parsoid05 is UP: PING OK - Packet loss = 0%, RTA = 0.99 ms [02:31:09] PROBLEM - Host deployment-parsoid05 is DOWN: CRITICAL - Host Unreachable (10.68.16.120) [03:17:20] Hey the queue died down only about 4 jobs remaining would it be okay to boot the nodepool back up? 
[04:06:41] Yippee, build fixed! [04:06:41] Project selenium-MultimediaViewer » safari,beta,OS X 10.9,contintLabsSlave && UbuntuTrusty build #105: 09FIXED in 10 min: https://integration.wikimedia.org/ci/job/selenium-MultimediaViewer/BROWSER=safari,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=OS%20X%2010.9,label=contintLabsSlave%20&&%20UbuntuTrusty/105/ [04:14:55] (03PS7) 10Lethexie: Add usage to forbid superglobals like $_GET,$_POST [tools/codesniffer] - 10https://gerrit.wikimedia.org/r/296395 [04:21:15] (03CR) 10Chad: "Good idea for $_GET, $_POST. Probably should add $_REQUEST, $_COOKIE and $_SESSION while we're here." [tools/codesniffer] - 10https://gerrit.wikimedia.org/r/296395 (owner: 10Lethexie) [04:49:33] PROBLEM - SSH on deployment-cache-upload04 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [05:28:01] 10Continuous-Integration-Infrastructure, 06Release-Engineering-Team, 15User-greg: Create incident report for CI outage on Aug 7th - https://phabricator.wikimedia.org/T142677#2542821 (10greg) [05:37:40] 10Deployment-Systems, 06Operations: error on tin:/srv/mediawiki-staging: insufficient permission for adding an object to repository database .git/objects - https://phabricator.wikimedia.org/T127093#2542840 (10Dzahn) this should mean that the "root owned files on staging"-issue is over with now... 
:) [05:40:05] 10Deployment-Systems, 06Operations: error on tin:/srv/mediawiki-staging: insufficient permission for adding an object to repository database .git/objects - https://phabricator.wikimedia.org/T127093#2542845 (10greg) p:05Triage>03Normal [05:40:17] 10Deployment-Systems, 06Operations: error on tin:/srv/mediawiki-staging: insufficient permission for adding an object to repository database .git/objects - https://phabricator.wikimedia.org/T127093#2032127 (10greg) a:03thcipriani [05:42:28] 05Gitblit-Deprecate, 10MediaWiki-extensions-WikibaseRepository, 10Wikidata: Wikiba.se git links broken - https://phabricator.wikimedia.org/T142678#2542847 (10jayvdb) [05:44:13] 10Deployment-Systems, 06Operations: error on tin:/srv/mediawiki-staging: insufficient permission for adding an object to repository database .git/objects - https://phabricator.wikimedia.org/T127093#2542861 (10Dzahn) ``` [tin:/srv/mediawiki-staging] $ find . -uid 0 ./.git/refs/remotes/readonly/master ./.git/obj... [06:07:57] 10Deployment-Systems, 06Operations: error on tin:/srv/mediawiki-staging: insufficient permission for adding an object to repository database .git/objects - https://phabricator.wikimedia.org/T127093#2542920 (10greg) Just to spite you, I think: ``` 06:01 < icinga-wm> PROBLEM - Improperly owned -0:0- files in /... 
[07:52:48] 05Gitblit-Deprecate, 10MediaWiki-extensions-WikibaseRepository, 10Wikidata: Wikiba.se git links broken - https://phabricator.wikimedia.org/T142678#2543054 (10Lydia_Pintscher) a:03JeroenDeDauw [08:28:40] 10Continuous-Integration-Config, 07Jenkins, 07Puppet: There is no sane way to get arcanist's conduit tokens onto nodepool CI slaves - https://phabricator.wikimedia.org/T140417#2543103 (10mmodell) [08:31:29] 06Release-Engineering-Team (Long-Lived-Branches), 03Scap3: make scap3 look in PWD to find local CLI extensions - https://phabricator.wikimedia.org/T142590#2543110 (10mmodell) [08:32:36] 06Release-Engineering-Team (Long-Lived-Branches), 03Scap3 (Scap3-MediaWiki-MVP): make scap3 look in PWD to find local CLI extensions - https://phabricator.wikimedia.org/T142590#2540325 (10mmodell) [08:32:38] 06Release-Engineering-Team (Long-Lived-Branches), 03Scap3 (Scap3-MediaWiki-MVP): Deploy mediawiki release tools repo (rMREL) with scap3 - https://phabricator.wikimedia.org/T142588#2543113 (10mmodell) [09:13:59] 03Scap3 (Scap3-MediaWiki-MVP), 10releng-201516-q3, 03releng-201617-q2, 10scap, 07WorkType-NewFunctionality: [keyresult] Migrate the MW weekly train deploy to scap3 - https://phabricator.wikimedia.org/T114313#2543223 (10mmodell) [09:14:01] 10Deployment-Systems, 03Scap3: More atomic directory operations - https://phabricator.wikimedia.org/T141913#2543225 (10mmodell) [09:20:33] 03Scap3 (Scap3-MediaWiki-MVP), 10scap: Scap should touch symlinks when originals are touched - https://phabricator.wikimedia.org/T126306#2010701 (10mmodell) The more I think about it, the more I believe that scap should really be implemented like a build system - or it should call some kind of build system dur... 
[10:12:17] 10Beta-Cluster-Infrastructure: Quiz Extension in ca.wikipedia.beta.wmflabs.org - https://phabricator.wikimedia.org/T142692#2543311 (10Toniher) [11:01:05] 10Continuous-Integration-Config, 10MediaWiki-extensions-Other, 13Patch-For-Review: [LightweightRDFa] The module 'ext.wikiEditor' required by 'ext.LightweightRDFa.button' must exist - https://phabricator.wikimedia.org/T93727#2543476 (10Paladox) [11:01:55] 10Continuous-Integration-Config, 10MediaWiki-extensions-Other: Foxway: Module 'ext.Foxway.DebugLoops' must not depend on 'jquery' - https://phabricator.wikimedia.org/T93724#2543478 (10Paladox) [11:27:22] RECOVERY - puppet last run on gallium is OK: OK: Puppet is currently enabled, last run 37 seconds ago with 0 failures [11:47:41] 05Gitblit-Deprecate, 10MediaWiki-extensions-WikibaseRepository, 10Wikidata: Wikiba.se git links broken - https://phabricator.wikimedia.org/T142678#2542847 (10Danny_B) {T139027} related [11:51:32] 05Gitblit-Deprecate, 10MediaWiki-extensions-WikibaseRepository, 10Wikidata: Wikiba.se git links broken - https://phabricator.wikimedia.org/T142678#2543609 (10jayvdb) >>! In T142678#2543592, @Danny_B wrote: > {T139027} related not really. this is not a blob. For this one, /summary/ needs to be removed from... [11:51:59] 05Gitblit-Deprecate, 10Diffusion: Redirect git.wikimedia.org HEAD URLs to Diffusion - https://phabricator.wikimedia.org/T141965#2543612 (10Danny_B) @Dzahn (I guess you took care of the deployment of previous ruleset) - Can this be deployed or anything else is necessary? Thank you. [11:52:42] Anyone around who could check a php segfault in a testextension composer job for me? 
[11:52:42] https://integration.wikimedia.org/ci/job/mwext-testextension-php55-composer/4137/console [11:53:26] hoo see https://phabricator.wikimedia.org/T142158 please [11:54:17] ah, thanks [11:54:53] Your welcome [11:56:38] 05Gitblit-Deprecate, 06Operations: Gitblit links not redirecting to the correct moved resource unless .git is part of repo name in url - https://phabricator.wikimedia.org/T139027#2543632 (10Danny_B) [12:06:26] 10Continuous-Integration-Infrastructure, 10MediaWiki-Unit-tests, 07Regression: Job mediawiki-extensions-php55 frequently fails due to "Segmentation fault" - https://phabricator.wikimedia.org/T142158#2525383 (10hoo) This is really annoying as it happens far more often than not. I've seen similar segfaults on... [12:11:45] 10Continuous-Integration-Infrastructure, 10MediaWiki-Unit-tests, 07Regression: Job mediawiki-extensions-php55 frequently fails due to "Segmentation fault" - https://phabricator.wikimedia.org/T142158#2543706 (10Paladox) Maybe it is mediawiki coursing the segfault? [12:16:20] 10MediaWiki-Codesniffer, 07Upstream: Generic.Functions.CallTimePassByReference.NotAllowed false positive when using php5.4 array syntax [] - https://phabricator.wikimedia.org/T127163#2543717 (10Aklapper) 05Open>03Resolved a:03Aklapper Closed in upstream; merged in downstream... I'd say so. Thanks! :) [12:17:13] 10MediaWiki-Codesniffer, 07Upstream: Generic.Functions.CallTimePassByReference.NotAllowed false positive when using php5.4 array syntax [] - https://phabricator.wikimedia.org/T127163#2543723 (10Paladox) Thanks :). [12:21:29] PROBLEM - Puppet run on deployment-cache-text04 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [12:22:58] Yippee, build fixed! 
[12:22:58] Project selenium-GettingStarted » firefox,beta,Linux,contintLabsSlave && UbuntuTrusty build #110: 09FIXED in 57 sec: https://integration.wikimedia.org/ci/job/selenium-GettingStarted/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/110/ [13:41:40] jenkins seems not to add reviewers automatically since yesterday failure [13:42:03] valhallasw`cloud ^^ [13:42:51] dcausse: yes, it's hanging on a non-ascii username. Fun! :-) [13:42:53] * valhallasw`cloud takes a look [13:43:03] valhallasw`cloud: thanks! :) [13:44:00] Thanks [13:51:06] thcipriani|afk: about? [13:53:47] dcausse: backlog processed! :-) [13:54:36] chasemp: eh, I can be for a few—what's up? [13:54:44] valhallasw`cloud: thanks :) [13:55:05] thcipriani|afk: not an emergency just an fyi I'm restarting nodepool to try to get it to give me some stats, didn't notice the afk in the name at first :) [13:55:29] heh, kk, np—nodepool restart should be fine :) [13:55:50] fyi it actually does purge build state VMs from restart according to the source [13:55:57] so it has some small invasive consequences I think [13:56:19] nsure if that stalls out those tests or what atm [13:56:53] hmm, didn't see that yesterday, but things were all kinds of stuck so I may have missed it. [13:57:10] I didn't have to clear/restart any jenkins FWIW [13:57:26] *jenkins jobs [13:58:57] it must requeue the job I think, I only noticed as I was reading teh spin up code and it explicitly says basically "if you restart me I cannot resume setup on a building host so I move to delete" [13:59:17] i.e. they don't keep state mid-build so it's all or nothing [13:59:41] neither here nor there just curious [14:00:33] 06Release-Engineering-Team, 10ArchCom-RfC, 06Developer-Relations, 06WMF-Legal, 07RfC: Create formal process for CREDITS files - https://phabricator.wikimedia.org/T139300#2543984 (10RobLa-WMF) [14:01:18] ah, interesting. 
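The restart behaviour chasemp describes above (a node caught mid-build cannot be resumed, so it is moved to delete) can be sketched roughly as follows. This is a minimal illustration, not nodepool's actual code: the `State` values, the `Node` class and `recover_after_restart` are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    # Hypothetical lifecycle states, loosely following "ready/build/delete/etc".
    BUILDING = "building"
    READY = "ready"
    USED = "used"
    DELETE = "delete"


@dataclass
class Node:
    name: str
    state: State


def recover_after_restart(nodes):
    """Mark half-built nodes for deletion; leave every other node untouched.

    Assumption: no setup state is kept mid-build, so a node that was
    BUILDING when the daemon stopped cannot be resumed and is discarded.
    """
    for node in nodes:
        if node.state is State.BUILDING:
            node.state = State.DELETE
    return nodes
```

Under this sketch a restart only costs the instances that were still being set up; nodes already in the ready state survive, which would fit the observation that no Jenkins jobs had to be cleared by hand.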
[14:01:44] 06Release-Engineering-Team, 10ArchCom-RfC, 06Developer-Relations, 06WMF-Legal, and 2 others: Create formal process for CREDITS files - https://phabricator.wikimedia.org/T139300#2426179 (10RobLa-WMF) We briefly discussed this in the last couple of planning meetings (E250, E258). I'm taking on #ArchCom-RFC... [14:02:45] I suppose that makes sense if the priority is making sure that tests never fail due to setup failures. [14:03:00] * thcipriani|afk goes back to oatmeal :) [14:03:53] thcipriani|afk: https://graphite.wikimedia.org/render/?width=960&height=487&_salt=1470924134.358&target=nodepool.launch.ready.count&target=nodepool.provider.wmflabs-eqiad.max_servers&from=-1h [14:03:56] for after oatmeal :) [14:51:11] chasemp: hrm. nodepool list shows 6 instances in the ready state. Unclear why it doesn't just use what the limit is though. [14:52:10] there is some amount of churn time possibly the other thing I find odd is afaict nodepool does keep state on it's own allocated instances for it's own thigns (ready/build/delete/etc) but [14:52:23] it still continually asks for more nodes from teh provider like [14:52:34] I know I have 10 avail, and I know I"m using 10, I better go ask if I can have more [14:52:46] and that seems to be expected behavior but then it's treated as an exception [14:52:48] so teh logs are a mess [14:52:56] I haven't fully groked the reasoning here [14:53:13] it's a sort of twist on ask forgiveness not permission but seems really...stupid maybe [14:53:29] anyway, our use case I think is a bit novel [14:53:31] yeah, definitely not an exceptional condition (I think) [14:53:52] Nodepool was re-enabled for somethings ? [14:54:11] I don't remember seeing that in the log before yesterday, although I've only really looked at logs in times where there were problems. [14:55:09] Zppix: not afaik, the plan (last night anyway) was to leave jobs on the integration project vms unless there is unexpected breakage there. 
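For contrast with the "ask forgiveness, not permission" behaviour discussed above, a quota-aware launcher could check its own bookkeeping before asking the provider for more instances. The sketch below is only an illustration of that alternative, not what nodepool does; `nodes_to_launch` and its parameters are hypothetical, with `max_servers` borrowed from the metric name in the graphite link.

```python
def nodes_to_launch(demand, allocated, max_servers):
    """Return how many new nodes to request without exceeding the quota.

    demand      -- number of nodes the pending jobs would like (hypothetical input)
    allocated   -- nodes currently counted against the quota, in any state
    max_servers -- provider-side limit on concurrent instances
    """
    headroom = max(0, max_servers - allocated)
    return min(demand, headroom)
```

With 10 allocated out of a limit of 10 this returns 0, so no request is sent and there is no provider error to log as an exception.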
[14:55:47] (last night PST) [14:55:47] Ok... sorry I just woke I remeber that now i was on here when all the problems arose [14:55:57] Im in CST [14:56:25] yup, np, wasn't sure if something happened over my night of which I wasn't aware :) [14:56:33] (in MST myself) [14:56:45] thcipriani: same here, I'm biased towards only looking in times of strife [14:57:10] I'm new to CI stuff so im learning and inputting my opinions as mucuh as possible [14:57:14] much* [14:58:20] chasemp: yeah, nodepool had sort of been a hashar joint until it started having problems. I know the basics of troubleshooting it, but I'm far from an expert/understanding where its sometimes unintuitive logic comes from. [14:58:29] PROBLEM - Puppet run on deployment-sca02 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [14:58:43] Im bst, 8-6 hours ahead of you. [14:58:56] Well hello agani paladox [14:59:04] Lol, hi [14:59:12] I have been on here all day. [14:59:15] PROBLEM - Puppet run on deployment-sca01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [14:59:22] wow agani? I need to sleep some more :P [14:59:45] LOL, it is 15:59pm bst here. [15:00:06] that puppet issue cant be good [15:01:50] It is almost dinner time in the uk [15:04:01] :) [15:05:48] The errors would have been in mine early morning 12am - 1am. [15:07:23] Jenkins is back to somewhat normal with nodepool being somewhat disabled... nothing appears to be broken from what i'm gathering [15:07:52] yep [15:08:15] I'm noticing a bit of a slowdown on some jobs but thats probably due to nodepool [15:08:46] oh [15:09:09] nodepool isn't disabled, we turned it back on last night... [15:09:21] (and there's no slowdown in jobs, they're just none running right now :)) [15:09:47] s/they're/there's/ [15:10:28] ostriches: i know what i meant was last night my time (cst) jessie was being really slow unlike usual [15:11:01] And i thought nodepool was still disabled? 
[15:11:04] Zppix it would affect nodepool, so jessie and trusty [15:11:15] oh [15:11:17] since they would create instances on the fly and delete them [15:11:41] whereas normal instances are already created so they woulden have a problem [15:11:53] i think u already told me this paladox... I'm tired atm so i dont remeber [15:12:01] Ok [15:12:02] (i probably sound like a broken record) [15:12:05] LOL [15:12:21] * paladox was up until 2am [15:12:23] bst [15:12:33] I'm just trying to understand CI so if i get annoying just tell me to shut up :P [15:12:38] http://www.timeanddate.com/time/zones/bst [15:13:03] Zppix your free to ask questions. [15:13:19] paladox: ik i just hate getting annoying and preventing ppl from working on important stuff [15:13:32] Oh [15:13:41] its a pet peeve [15:13:51] Oh [15:14:36] I've not even touched .php files in MW core i've touched on documentation and l18n files atm [15:14:43] oh [15:14:53] actually i take that back [15:14:57] ok [15:15:00] i added some localisation to a .php file [15:15:01] thats it [15:15:08] oh [15:15:47] According to timeanddate the uk is a island LOL [15:16:27] it technically is [15:16:46] LOL, but it is probaly the biggest island [15:16:50] yes [15:16:58] paladox: ever heard of Australia? ;-) [15:17:02] Yeh [15:17:09] It is part of the union [15:17:18] Our queen is there head of state [15:17:41] union as in commenwealth [15:18:52] valhallasw`cloud: Where's Australia? Never heard of it.... [15:18:56] LOL [15:19:00] Is it big? [15:19:03] It is next to new zealand [15:19:20] https://en.wikipedia.org/wiki/Australia [15:20:52] paladox: u beat me to the link :/ [15:20:59] LOL [15:20:59] :) [15:22:07] https://67.media.tumblr.com/de53c7504bd663c6ce0812c8ddc4a23d/tumblr_noc9rsSXP91uvbiero1_400.png [15:22:13] ostriches you must of herd of australia, have you even herd of the uk? 
[15:22:21] ^ Basically how Americans view the world ;-) [15:22:26] LOL [15:22:30] Im american [15:24:28] Im american and british, born in america but also have a british passport :) [15:25:13] paladox: my mom got a friend whom lives in UK [15:25:15] ostriches is https://67.media.tumblr.com/de53c7504bd663c6ce0812c8ddc4a23d/tumblr_noc9rsSXP91uvbiero1_400.png that how you view the world [15:25:17] and oh [15:25:31] target deliver to the uk [15:25:47] But £25 pounds for delivery lol [15:26:35] I'm sticking around here so if anyone needs anything feel free to ring me [15:26:56] But the funny part is [15:27:16] i live near to where george washingtons familly came from [15:27:30] in northamptonshire :) [15:28:42] 06Release-Engineering-Team (Deployment-Blockers), 05Release: MW-1.28.0-wmf.10 deployment blockers - https://phabricator.wikimedia.org/T139211#2544246 (10bd808) [15:28:44] 06Release-Engineering-Team (Deployment-Blockers), 05Release: MW-1.28.0-wmf.9 deployment blockers - https://phabricator.wikimedia.org/T138555#2544247 (10bd808) [15:37:22] :) [15:45:12] 10Continuous-Integration-Config, 06Release-Engineering-Team, 10DBA, 10Datasets-General-or-Unknown, and 3 others: Automatize the check and fix of object, schema and data drifts between production masters and slaves - https://phabricator.wikimedia.org/T104459#2544318 (10ArielGlenn) https://github.com/apergos... [15:45:47] Zppix ^^ [15:45:56] ? [15:46:58] Zppix Nothing, just that i live near to where george washington came from, and i am american. 
[15:47:14] oh cool [15:47:23] i thought u were wanting me to open that phab task or something :P [15:47:48] No [15:53:10] (03PS1) 10BryanDavis: Add support for Depends-On statements [integration/commit-message-validator] - 10https://gerrit.wikimedia.org/r/304247 (https://phabricator.wikimedia.org/T142672) [15:53:12] (03PS1) 10BryanDavis: Add python artifacts to .gitignore [integration/commit-message-validator] - 10https://gerrit.wikimedia.org/r/304248 [15:53:14] (03PS1) 10BryanDavis: Change check_message_ok test text [integration/commit-message-validator] - 10https://gerrit.wikimedia.org/r/304249 [15:57:56] PROBLEM - Puppet run on integration-slave-trusty-1013 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [16:07:09] 05Gitblit-Deprecate, 06Release-Engineering-Team, 06Operations: Clones from git.wikimedia.org are not redirected - https://phabricator.wikimedia.org/T139206#2422729 (10jayvdb) I just ran into this, via a sysadmin who was previously using git branches to keep his wiki extensions stable. Has there been a #sysad... [16:08:08] 05Gitblit-Deprecate, 06Release-Engineering-Team: Clones from git.wikimedia.org are not redirected - https://phabricator.wikimedia.org/T139206#2544390 (10greg) [16:21:39] 10Deployment-Systems, 06Operations, 13Patch-For-Review: error on tin:/srv/mediawiki-staging: insufficient permission for adding an object to repository database .git/objects - https://phabricator.wikimedia.org/T127093#2544408 (10Dzahn) 09:26 < icinga-wm> RECOVERY - Improperly owned -0:0- files in /srv/mediaw... [16:26:09] 05Gitblit-Deprecate, 06Release-Engineering-Team: Clones from git.wikimedia.org are not redirected - https://phabricator.wikimedia.org/T139206#2544413 (10Danny_B) @jayvdb There was a [[ https://lists.wikimedia.org/pipermail/wikitech-l/2016-June/085935.html | gitblit deprecation email on wikitech-l ]]. 
[16:32:30] 05Gitblit-Deprecate, 06Release-Engineering-Team: Clones from git.wikimedia.org are not redirected - https://phabricator.wikimedia.org/T139206#2544417 (10Danny_B) If someone from #release-engineering-team or #operations or whoever else relevant would wrap up where it should be redirected, including some example... [16:43:10] 10Deployment-Systems, 06Operations, 13Patch-For-Review: error on tin:/srv/mediawiki-staging: insufficient permission for adding an object to repository database .git/objects - https://phabricator.wikimedia.org/T127093#2544477 (10Dzahn) and also fixed on tin: 09:44 < icinga-wm> RECOVERY - Improperly owned -0... [16:43:17] 10Deployment-Systems, 06Operations, 13Patch-For-Review: error on tin:/srv/mediawiki-staging: insufficient permission for adding an object to repository database .git/objects - https://phabricator.wikimedia.org/T127093#2544478 (10Dzahn) 05Open>03Resolved [16:43:32] 10Deployment-Systems, 06Operations: error on tin:/srv/mediawiki-staging: insufficient permission for adding an object to repository database .git/objects - https://phabricator.wikimedia.org/T127093#2032127 (10Dzahn) [16:55:09] 07Browser-Tests, 06Reading-Web-Backlog, 03Reading-Web-Sprint-78-Terminal-Velocity, 07Regression, 07Unplanned-Sprint-Work: [Regression] Fix browser tests for language switching on the beta cluster - https://phabricator.wikimedia.org/T141647#2544532 (10jhobs) 05stalled>03Resolved [16:55:38] 10Beta-Cluster-Infrastructure, 10MediaWiki-extensions-Quiz: Quiz Extension in ca.wikipedia.beta.wmflabs.org - https://phabricator.wikimedia.org/T142692#2543311 (10greg) Deploying to Beta Cluster is but one step along a process to get a new extension deployed in production. The full process (from the view of th... 
[16:55:44] 07Browser-Tests, 06Reading-Web-Backlog, 03Reading-Web-Sprint-78-Terminal-Velocity, 07Regression, 07Unplanned-Sprint-Work: [Regression] Fix browser tests for language switching on the beta cluster - https://phabricator.wikimedia.org/T141647#2505954 (10jhobs) a:05jhobs>03None [16:55:50] 10Beta-Cluster-Infrastructure, 10MediaWiki-extensions-Quiz: Quiz Extension in ca.wikipedia.beta.wmflabs.org - https://phabricator.wikimedia.org/T142692#2544538 (10greg) [16:56:05] 10Beta-Cluster-Infrastructure, 10MediaWiki-extensions-Quiz: Quiz Extension in ca.wikipedia.beta.wmflabs.org - https://phabricator.wikimedia.org/T142692#2543311 (10greg) p:05Triage>03Normal [17:24:02] It seems ci only has 4 nodepool instances [17:24:05] again [17:24:26] ci will become very slow like it did yesturday at high peak times [17:34:46] its slow again [17:36:06] there's 5 right now [17:36:18] well, were, down to 3 as 2 jobs just finished [17:36:39] it does appear slow to spawn/start jobs, however [17:37:33] Yep [17:37:48] But it will be like yesturday when ci gets to its busy time [17:37:58] Will take 3 hours to process tests [17:43:55] let's hope not [17:44:03] Yep [17:44:29] greg-g i guess we could re purpose gallium as a nodepool host when we migrate of it. [17:44:45] so that we have a dedicated host, but re image it with jessie. 
[17:44:57] dashboard I'm watching today: https://grafana-admin.wikimedia.org/dashboard/db/releng-zuul?from=1470851074400&to=1470937174400 [17:45:09] Oh, i carnt view that [17:45:12] paladox: not possible, it is being decommisioned [17:45:12] PROBLEM - Puppet run on deployment-db01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [17:45:15] doint have permission [17:45:16] and oh [17:45:18] oh, remove the -admin part [17:45:21] Ok [17:45:33] thanks [17:55:55] 03Scap3: TypeError: unsupported operand type(s) for %: 'dict' and 'tuple' - https://phabricator.wikimedia.org/T142364#2544789 (10thcipriani) 05Open>03Resolved [17:56:53] 03Scap3: TypeError: unsupported operand type(s) for %: 'dict' and 'tuple' - https://phabricator.wikimedia.org/T142364#2544797 (10thcipriani) 05Resolved>03Open Autoclosed when I landed the patch. Leaving this open until the updated package is deployed. [18:10:59] 10Beta-Cluster-Infrastructure, 10Shinken, 13Patch-For-Review, 07Wikimedia-Incident: Shinken alert for beta error rate - https://phabricator.wikimedia.org/T141785#2544875 (10greg) [18:13:23] 07Browser-Tests, 10MediaWiki-extensions-MultimediaViewer, 06Reading-Web-Backlog, 03Reading-Web-Sprint-78-Terminal-Velocity, 07Unplanned-Sprint-Work: MultimediaViewer tests fail with waiting for {:class=>"mw-mmv-final-image"} (Firefox only) - https://phabricator.wikimedia.org/T142423#2544888 (10Jdlrobson... [18:13:29] 07Browser-Tests, 10MediaWiki-extensions-MultimediaViewer, 06Reading-Web-Backlog, 03Reading-Web-Sprint-78-Terminal-Velocity, 07Unplanned-Sprint-Work: MultimediaViewer tests fail with waiting for {:class=>"mw-mmv-final-image"} (Firefox only) - https://phabricator.wikimedia.org/T142423#2544890 (10Jdlrobson... [18:38:33] ho hum [18:40:01] hey ariel [18:40:13] hey Krenair [18:40:29] what's up? 
[18:40:42] jus twatching the grass grow^W^W the zuul queues move [18:40:45] they are moving, it's just slow [18:41:03] I wasn't going to complain, since there is movement, but figured if there was chatter I'd see it at least [18:41:08] Oh [18:41:09] wait [18:41:12] it is slow again [18:41:36] LOL, i was saying that an hour ago, seems to be getting slower. [18:55:00] I'm gonna disappear again, the parsoid people had a long wait but everyeone else seems to be going through now [18:55:04] you know where to find me :-P [19:08:32] 10Beta-Cluster-Infrastructure, 10MediaWiki-extensions-Quiz: Quiz Extension in ca.wikipedia.beta.wmflabs.org - https://phabricator.wikimedia.org/T142692#2543311 (10Legoktm) Errr, the Quiz extension is already deployed on Wikimedia sites, so this would just be a #wikimedia-site-requests right? [19:14:36] 06Release-Engineering-Team, 06Performance-Team, 05MW-1.27-release, 07Regression, 07User-notice: First paint time regression on Wikimedia page views with wmf.23 roll-out - https://phabricator.wikimedia.org/T134553#2545101 (10Krinkle) [19:19:48] 10Browser-Tests-Infrastructure, 10MediaWiki-extensions-MultimediaViewer, 06Reading-Web-Backlog, 03Reading-Web-Sprint-78-Terminal-Velocity, 05WMF-deploy-2016-08-09_(1.28.0-wmf.14): A JSON text must at least contain two octets! (JSON::ParserError) in Multim... 
- https://phabricator.wikimedia.org/T129483#2545122 [21:31:47] PROBLEM - Puppet run on deployment-aqs01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [21:34:38] PROBLEM - Puppet run on integration-slave-precise-1012 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [21:35:43] PROBLEM - Puppet run on integration-slave-precise-1011 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [21:44:54] (PS1) Gergő Tisza: 0.7.1: Increase log verbosity [ruby/api] - https://gerrit.wikimedia.org/r/304331 (https://phabricator.wikimedia.org/T142600) [21:45:06] (PS1) Gergő Tisza: 1.7.3: Add API log level to environment settings [selenium] - https://gerrit.wikimedia.org/r/304332 (https://phabricator.wikimedia.org/T142600) [21:53:01] Browser-Tests, Patch-For-Review, Reading-Web-Sprint-78-Terminal-Velocity: Various browser tests failing due to login error - https://phabricator.wikimedia.org/T142600#2545786 (Tgr) @greg I'm not sure how to proceed. The above two patches add detailed logging to the test output which is good for manua... [22:09:12] Beta-Cluster-Infrastructure: Add Quiz Extension in ca.wikipedia.beta.wmflabs.org - https://phabricator.wikimedia.org/T142692#2545805 (greg) >>! In T142692#2545072, @Legoktm wrote: > Errr, the Quiz extension is already deployed on Wikimedia sites, so this would just be a #wikimedia-site-requests right? Well,... [22:09:34] Beta-Cluster-Infrastructure: Enable Quiz Extension on ca.wikipedia.beta.wmflabs.org for testing - https://phabricator.wikimedia.org/T142692#2545807 (greg) [22:11:10] legoktm: sorry for the "Well, actually..." 
;) [22:21:55] Project selenium-CentralAuth » firefox,beta,Linux,contintLabsSlave && UbuntuTrusty build #112: FAILURE in 1 min 54 sec: https://integration.wikimedia.org/ci/job/selenium-CentralAuth/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/112/ [23:06:13] (PS2) Gergő Tisza: 1.7.3: Add API log level to environment settings [selenium] - https://gerrit.wikimedia.org/r/304332 (https://phabricator.wikimedia.org/T142600) [23:07:37] (CR) jenkins-bot: [V: -1] 1.7.3: Add API log level to environment settings [selenium] - https://gerrit.wikimedia.org/r/304332 (https://phabricator.wikimedia.org/T142600) (owner: Gergő Tisza) [23:18:08] (PS3) Gergő Tisza: 1.7.3: Add API log level to environment settings [selenium] - https://gerrit.wikimedia.org/r/304332 (https://phabricator.wikimedia.org/T142600) [23:30:25] Continuous-Integration-Infrastructure, Release-Engineering-Team, User-greg: Create incident report for CI outage on Aug 7th - https://phabricator.wikimedia.org/T142677#2546233 (greg) Open>Resolved a:greg https://wikitech.wikimedia.org/wiki/Incident_documentation/20160807-Nodepool [23:45:21] Project beta-code-update-eqiad build #116617: FAILURE in 1 min 37 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/116617/ [23:50:28] Deployment-Systems, Wikimedia-Logstash, Wikimedia-Incident: Show same set of errors/warnings/fatals in logstash's fatalmonitor as there is in `fatalmonitor` on fluorine - https://phabricator.wikimedia.org/T142784#2546271 (greg) [23:54:37] Yippee, build fixed! 
[23:54:37] Project beta-code-update-eqiad build #116618: FIXED in 1 min 36 sec: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/116618/ [23:54:53] Deployment-Systems, Wikimedia-Logstash, Wikimedia-Incident: Check same set of errors/warnings/fatals in scap logstash_checker.py as there is in `fatalmonitor` on fluorine - https://phabricator.wikimedia.org/T142784#2546288 (greg)