[09:53:41] Wikimedia / Quality Assurance: Run PhantomJS across set of Wikimedia wiki pages to ensure sane JavaScript - https://bugzilla.wikimedia.org/69519#c6 (James Forrester) There's also the VE production browser test that ensures that all wikis with VisualEditor installed actually let you get to VE successful...
[12:23:32] (PS4) Addshore: make mw-core-phpcs-lenient-HEAD voting on master! [integration/zuul-config] - https://gerrit.wikimedia.org/r/153575 (https://bugzilla.wikimedia.org/46500)
[12:23:47] (PS5) Addshore: make mw-core-phpcs-lenient-HEAD voting on master! [integration/zuul-config] - https://gerrit.wikimedia.org/r/153575 (https://bugzilla.wikimedia.org/46500)
[12:34:08] (PS6) Tobias Gritschacher: WIP Moved the first Wikidata job from WMDE Jenkins [integration/jenkins-job-builder-config] (cloudbees) - https://gerrit.wikimedia.org/r/147093 (owner: Zfilipin)
[12:34:57] (PS1) Addshore: Add phpunit job for codesniffer [integration/jenkins-job-builder-config] - https://gerrit.wikimedia.org/r/154257
[12:36:34] (PS1) Addshore: Add triggers for codesniffer phpunit [integration/zuul-config] - https://gerrit.wikimedia.org/r/154258
[16:16:58] (PS2) Addshore: Add triggers for codesniffer phpunit [integration/zuul-config] - https://gerrit.wikimedia.org/r/154258
[16:17:04] (PS3) Addshore: Add triggers for codesniffer phpunit [integration/zuul-config] - https://gerrit.wikimedia.org/r/154258
[17:06:57] Wikimedia / Quality Assurance: No data received by Saucelabs from beta labs - https://bugzilla.wikimedia.org/68083#c1 (Greg Grossmeier) Has the state of this issue (and the other one, bug 68084 - Beta labs API not accessible from Saucelabs) changed recently? Still seeing the intermittent errors like th...
[17:07:18] (CR) Jdlrobson: [C: 1] "Internet explorer would also be useful but I'll take anything right now :)" [integration/jenkins-job-builder-config] (cloudbees) - https://gerrit.wikimedia.org/r/154070 (owner: Cmcmahon)
[17:08:11] Wikimedia / Quality Assurance: Run Selenium tests in parallel - https://bugzilla.wikimedia.org/55867 (Greg Grossmeier) ASSI>NEW
[17:08:56] Wikimedia / Quality Assurance: No data received by Saucelabs from beta labs - https://bugzilla.wikimedia.org/68083#c2 (Chris McMahon) bits on beta labs seems to have been down overnight 15 August this is a very generic bug, I'm not sure it is really actionable as such
[17:11:30] chrismcmahon: what about this one, which juliusz reported at the same time: https://bugzilla.wikimedia.org/show_bug.cgi?id=68084
[17:12:21] chrismcmahon: also, just taking the title of the one you commented on, the API is (now?) accessible, no?
[17:12:24] greg-g: yeah, that's just a generic timeout. I think I'd like to close those, they're very general
[17:12:28] thus, if that's the only issue, it should probably be closed?
[17:12:32] * greg-g nods
[17:12:46] do you have an idea of root cause? "just" latency?
[17:13:29] Wikimedia / Quality Assurance: Beta labs API not accessible from Saucelabs - https://bugzilla.wikimedia.org/68084#c1 (Chris McMahon) NEW>RESO/FIX We'll address specific issues as they occur
[17:13:57] Wikimedia / Quality Assurance: No data received by Saucelabs from beta labs - https://bugzilla.wikimedia.org/68083#c3 (Chris McMahon) NEW>RESO/FIX We'll address specific issues as they occure
[17:15:27] greg-g: Jenkins might have been slow around then, dunno.
[17:41:11] chrismcmahon: what's your bet on the hiring meeting happening?
[17:41:11] Wikimedia / Quality Assurance: mediawiki ruby api doesn't encode userrights token properly - https://bugzilla.wikimedia.org/69305#c9 (Jon) NEW>RESO/WOR This seems to be working now.
[17:55:26] greg-g: I never go to that any more, dunno
[17:55:39] * greg-g nods
[17:55:48] greg-g: I should take that off my calendar, I haven't attended in maybe a year
[18:01:03] chrismcmahon: :)
[18:03:48] (PS1) Addshore: Refactor CodeSniffer Standard checks [tools/codesniffer] - https://gerrit.wikimedia.org/r/154296
[19:26:46] (CR) Hashar: [C: -1] "Need a hack to load phpcs on Wikimedia Jenkins slave. It is available under /srv/deployment/integration/phpcs and that is not in the inclu" (1 comment) [tools/codesniffer] - https://gerrit.wikimedia.org/r/154276 (owner: Addshore)
[19:28:06] (CR) Hashar: [C: 2] Add phpunit job for codesniffer [integration/jenkins-job-builder-config] - https://gerrit.wikimedia.org/r/154257 (owner: Addshore)
[19:28:36] (Merged) jenkins-bot: Add phpunit job for codesniffer [integration/jenkins-job-builder-config] - https://gerrit.wikimedia.org/r/154257 (owner: Addshore)
[19:29:57] (CR) Hashar: "INFO:jenkins_jobs.builder:Creating jenkins job mw-tools-codesniffer-phpunit" [integration/jenkins-job-builder-config] - https://gerrit.wikimedia.org/r/154257 (owner: Addshore)
[19:30:32] hashar: any other ideas on that rsync issue from erikb?
[19:31:49] (CR) Hashar: Add triggers for codesniffer phpunit (1 comment) [integration/zuul-config] - https://gerrit.wikimedia.org/r/154258 (owner: Addshore)
[19:32:10] greg-g: nop :-/
[19:32:17] greg-g: the jenkins job is happy
[19:32:21] will have to debug scap itself
[19:32:23] yeah
[19:32:25] :(
[19:32:27] the paths confuse me a ton
[19:32:52] I have never felt at ease with the mess of /a/common /usr/local/apache-local /usr/local/apache-common /srv/deployment/... and so on
[19:33:44] twentyafterfour_: if you're bored of phabricator things and want something to look at for your afternoon: https://bugzilla.wikimedia.org/show_bug.cgi?id=69590 :) It's kind of an off the deep end thing, but, it might be doable.
[19:33:54] (PS4) Hashar: Add triggers for codesniffer phpunit [integration/zuul-config] - https://gerrit.wikimedia.org/r/154258 (owner: Addshore)
[19:34:10] + I am out of whisky
[19:34:15] which sucks on a holiday day
[19:34:25] (CR) Hashar: [C: 2] Add triggers for codesniffer phpunit [integration/zuul-config] - https://gerrit.wikimedia.org/r/154258 (owner: Addshore)
[19:34:33] (Merged) jenkins-bot: Add triggers for codesniffer phpunit [integration/zuul-config] - https://gerrit.wikimedia.org/r/154258 (owner: Addshore)
[19:35:33] (CR) Hashar: "Addshore already made the Jenkins Job Builder and Zuul changes which I have deployed \O/" [tools/codesniffer] - https://gerrit.wikimedia.org/r/154276 (owner: Addshore)
[19:35:42] (CR) Hashar: "recheck" [tools/codesniffer] - https://gerrit.wikimedia.org/r/154276 (owner: Addshore)
[19:35:44] (CR) jenkins-bot: [V: -1] Add tests for existing custom sniffs [tools/codesniffer] - https://gerrit.wikimedia.org/r/154276 (owner: Addshore)
[19:36:23] greg-g: digging in scap
[19:36:25] which me luck
[19:36:38] if you dont see me coming back please call my wife and life insurance policy :-]
[19:36:41] hashar: be careful, it's dangerous in there :)
[19:39:28] seems appropriate
[19:53:31] :)
[19:54:53] pfff
[19:55:03] scap.py is over engineered
[19:56:50] greg-g: beta sync is apparently broken since July 30th
[19:56:57] ?
[19:56:58] at least on one of the php app server
[19:57:03] which?
[19:57:19] apache01
[19:58:03] though it sync something apparently :]
[19:58:14] ahh
[19:58:19] cat dsh/group/mediawiki-installation
[19:58:21] why isn't that listed in eg https://bugzilla.wikimedia.org/show_bug.cgi?id=69590#c6
[19:58:22] does not have the apache
[19:58:23] hehe
[19:58:24] yeah
[19:59:01] its obsolete
[19:59:02] bah
[20:02:24] so broken since ~ August 13th 21:15 UTC
[20:02:55] (PS2) Cmcmahon: create browser test builds for Echo extension [integration/jenkins-job-builder-config] (cloudbees) - https://gerrit.wikimedia.org/r/154070
[20:07:36] ahh
[20:07:38] progress
[20:12:39] FUCK LAB
[20:12:40] really
[20:12:49] and lack of monitoring
[20:13:51] :(
[20:14:24] hashSpeleology: all those graphite graphs and never the health check you need?
[20:24:27] honestly
[20:24:32] I dont know why I spend time on that issue
[20:24:37] the root cause is puppet
[20:24:38] :-D
[20:24:40] https://www.mediawiki.org/wiki/Continuous_integration says "In addition, an external Jenkins service at CloudBees regularly runs WMF browser tests across multiple browsers at Sauce Labs", but https://wmf.ci.cloudbees.com/ is a 404. Is wmf.ci.cloudbees.com dead
[20:24:47] greg-g: puppet is broken :-D
[20:24:48] hashSpeleology: thanks for your struggle
[20:24:54] greg-g: some path got changed recently
[20:25:06] and puppet does not reflect the change on the beta instance because ... it is broken :]
[20:25:10] spagewmf: yeah, cloudbees is dead, long live wmf jenkins
[20:25:24] I'll update
[20:25:26] ty
[20:25:34] hashSpeleology: broken or disabled?
[20:25:39] (or both) :)
[20:25:39] broken
[20:26:05] hashSpeleology: any hints of where/when?
[20:26:05] at least
[20:26:07] we are not a bank :-D
[20:26:30] is the aug 13th date/time the same for this?
[20:27:27] greg-g: when you asked me wednesday what is needed for beta : MONITORING
[20:28:51] yep
[20:29:04] greg-g: I will find out a fix for sure :]
[20:29:34] hashSpeleology: or complain to the ops list with a pointer in the right direction
[20:29:45] (or do that anyways)
[20:29:50] seriously.
[20:30:37] I am pretty sure it is some mediawiki / ori change applied recently
[20:30:38] will see
[20:30:46] I will find!
[20:31:03] a nice (not jerkish) note to ops post diagnoses/fix would be appreciated
[20:33:37] so I found the root cause
[20:34:16] https://gerrit.wikimedia.org/r/#/c/153807/ || mediawiki: create common-local directory merged Aug 13 22:28
[20:34:36] :(
[20:34:48] can you comment on the patch that it broke the beta cluster
[20:38:04] of course
[20:38:17] I love spamming information
[20:38:45] funnily the commit messages has: "" and add a table-flip emoji, because that's how I feel when I reflect on the app server symlink clusterfuck. Rage. ""
[20:38:51] totally agree
[20:39:22] yep
[20:39:24] :(
[20:45:54] reverting
[20:45:57] since we are good at reverting
[20:46:18] I give some talk to folks about how wmf infra works
[20:46:23] they are always surprised by two facts:
[20:46:32] 1) how small our eng department is
[20:46:35] 2) how fast we revert
[20:47:10] :)
[20:48:10] I gave a quick pres of our caching system to a bunch of france high traffic folks
[20:48:33] a couple came to me and told me: well we are disappointed, we are running the exact same thing we were expecting something else
[20:48:50] heh
[20:48:55] to which I replied: the only difference is that we have set that up a few years before you. But yeah that is standard practices
[20:48:59] they were quiet happy
[20:49:02] quite
[20:49:19] :)
[20:49:35] thanks a ton for your work here on a friday night, hashSpeleology
[20:51:07] (CR) Spage: "I'm confused, why is this in a cloudbees branch if we're not using CloudBees these days?" [integration/jenkins-job-builder-config] (cloudbees) - https://gerrit.wikimedia.org/r/154070 (owner: Cmcmahon)
[20:51:51] greg-g: oh it is not like it is holiday and I am out of whisky :-]
[20:51:55] I love friday evening
[20:52:28] :)
[20:53:17] root@deployment-mediawiki02:~# puppet agent -tv
[20:53:17] Notice: Skipping run of Puppet configuration client; administratively disabled (Reason: 'reason not specified');
[20:53:27] ALL YOUR BASE ARE BELONG TO US. RESISTANCE IS FUTILE
[20:54:23] !log deployment-prep puppet is proceeding on mediawiki01
[20:55:54] I'm starting to think next year I should say "no deploys the week after wikimania"
[20:58:13] chrismcmahon, can you add my account on saucelabs as a subaccount of your account (same thing you did to marxarelli's account)?
[20:58:30] greg-g: it is ok
[20:58:43] greg-g: I am sure I will have fall in the same pit ori fall into
[20:59:07] jgonera: what is it?
[20:59:08] greg-g: beta still has some slight difference with production which are not easy to spot unless you spend a couple hours debugging an issue.
[20:59:56] chrismcmahon, I'd like to be able to run browser tests through saucelabs to debug them if they fail there
[21:00:10] * hashSpeleology flexes
[21:01:22] hashSpeleology: thanks a ton
[21:02:06] jgonera: OK. You can do that without being a sub-account of the WMF account. The main reason to be a sub-account is for access to the Sauce records of the Jenkins builds. If you're not working with WMF Jenkins, you don't need it.
[21:02:25] chrismcmahon, I see, OK, thanks
[21:05:51] greg-g: it is all good
[21:06:19] * greg-g hands hashSpeleology a tumbler of whiskey
[21:07:02] * hashSpeleology get drun
[21:07:03] k
[21:07:06] pff
[21:07:22] greg-g: is it worth a postmortem or can I just go with a quick mail to qa list ?
[21:07:50] quick mail to qa and ops
[21:29:25] (CR) Jeroen De Dauw: "> This would force us to submit per hand" [integration/zuul-config] - https://gerrit.wikimedia.org/r/154058 (owner: Jeroen De Dauw)
[21:31:04] greg-g: mails sent :-]
[21:31:13] greg-g: sorry for all the spam.
[21:31:48] hashSpeleology: I love spam like that :)
[21:32:33] greg-g: seeing a database error on beta labs during one of the mobile browser tests
[21:32:49] who do i bug about that?
[21:33:22] marxarelli: sean pringle I suppose
[21:35:38] greg-g: rad. i'll bug him
[21:42:48] marxarelli: beta cluster is broken
[21:42:53] err the database is
[21:42:57] fix is https://gerrit.wikimedia.org/r/#/c/154231/
[21:43:11] hopefulyl
[21:43:24] that is due to CentralNotice having a wrong updater entry (updater that we dont use on production)
[21:44:04] hashSpeleology: got it! i'll wait it out then. thanks
[21:52:24] marxarelli: I should make the Jenkins jobs that update beta whine here and by email on qa-alerts
[22:04:26] 17:52 <+hashSpele> marxarelli: I should make the Jenkins jobs that update beta whine here and by email on qa-alerts
[22:04:29] yes please
[22:16:56] (PS3) Spage: create browser test builds for Echo extension [integration/jenkins-job-builder-config] (cloudbees) - https://gerrit.wikimedia.org/r/154070 (https://bugzilla.wikimedia.org/69130) (owner: Cmcmahon)
[22:18:28] Wikimedia / Quality Assurance: Create Jenkins job for Echo browser tests - https://bugzilla.wikimedia.org/69130#c3 (spage) NEW>PATC I think Chris wrote gerrit 154070 for this.
[22:25:43] marxarelli: the beta cluster databases are back in business :]
[22:26:01] marxarelli: beta cluster is updated using Jenkins jobs. There is a view defined in Jenkins that list them all https://integration.wikimedia.org/ci/view/Beta/
[22:26:18] or you can try https://integration.wikimedia.org/dashboard/
[22:26:28] but it is manually maintained and I dont think anyone look at it
[22:32:20] hashSpeleology: fantastic! i appreciate it
[22:33:32] greg-g: re having Jenkins whine is now bug https://bugzilla.wikimedia.org/show_bug.cgi?id=69628
[22:33:35] too late to give it a try
[22:33:49] marxarelli: yeah that is a bit messy sorry :-(
[22:34:39] marxarelli: https://wikitech.wikimedia.org/wiki/Nova_Resource:Deployment-prep/How_code_is_updated might be worth your time :]
[22:35:21] hashSpeleology: yes, any chance to learn more about our infrastructure is worth the time. thanks
[22:35:46] marxarelli: you might want to poke S on 6th floor
[22:36:00] I believe he uses beta on a daily basis
[22:36:03] if not hourly
[22:37:04] hashSpeleology: bd808|BUFFER gave twentyafterfour_ and i a good rundown a while back. i'll have to re-review the notes when i get a chance
[22:38:09] marxarelli: what's the issue? I can help poke at it if that'll help
[22:40:07] twentyafterfour_: check the backlog, not because i want to be a dick but because i won't be able to explain it well enough :)
[22:40:24] marxarelli: reading
[22:40:28] *scroll log*
[22:40:42] * marxarelli needs to get his irc terminology straight
[22:41:40] * twentyafterfour_ also thought it was called a backlog... :-D
[22:43:22] marxarelli: yeah bd808 is awesome at explaining stuff :]
[22:43:31] + he is in the same timezone as you guys
[22:43:55] you can still ask on the qa list. There is fair number of people there that knows about prod / beta
[22:45:14] gotta sleep, good week-end!
[22:48:59] * bd808|MOBILE scanned back scroll
[22:49:49] It should be possible to add a local hack patch on deployment-salt to work around the prod path change
[22:50:21] Or just change the expected dirs on beta if needed.
[22:51:17] The problem is likely that a manifest in the beta package is duplicating a path that has finally been added to prod
[22:51:49] I did some ugly things to set up scap in beta
[22:53:43] I have added several fire fighting patches for beta as _joe_ and ori have been cleaning up the prod app server classes.
[23:00:28] * bd808|MOBILE sinks back into the haze of vacation and GMT+1 timezone
[23:16:21] (PS1) Dduvall: Edit method and exception upon edit failures [ruby/api] - https://gerrit.wikimedia.org/r/154360