[03:28:08] Hi.
[03:28:27] Anyone from the office around?
[04:13:22] Steven_Zhang: the physical office or?
[04:16:41] Steven_Zhang: for engineering, press 1. for legal, press 2. for community engagement...
[14:01:48] #startmeeting CI Weekly meeting triage
[14:01:52] Meeting started Tue Jun 2 14:01:48 2015 UTC and is due to finish in 60 minutes. The chair is hashar. Information about MeetBot at http://wiki.debian.org/MeetBot.
[14:01:52] Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
[14:01:52] The meeting name has been set to 'ci_weekly_meeting_triage'
[14:02:02] addshore: jzerebecki zeljkof Krinkle yeah I forgot to send the announcement :-(
[14:02:07] o/
[14:02:11] :p
[14:02:15] \o
[14:02:15] #link https://www.mediawiki.org/wiki/Continuous_integration/Meetings/2015-06-02 Agenda
[14:02:31] #link https://www.mediawiki.org/wiki/Continuous_integration/Meetings/2015-05-19/Minutes Past meeting minutes
[14:02:37] which I have pasted a minute ago :-(((
[14:02:48] #topic Actions restrospective
[14:02:58] there was a single action: Antoine to poke ops so that jzerebecki has +2 on zuul machine https://gerrit.wikimedia.org/r/#/c/210692/
[14:03:04] which is solved :-}
[14:03:21] thx :)
[14:03:24] jzerebecki: I guess you get access to gallium to deploy zuul changes now right?
[14:03:37] there is a fabric file at the root of integration/config which should do the job
[14:03:44] poke anyone if in trouble!
[14:03:57] yes, didn't yet do a deploy, but checked that sudo works
[14:04:01] great!
[14:04:32] * hashar forgets about the task
[14:04:45] #topic Release engineering attended last meeting
[14:04:57] so just before the hackathon we had a team offsite
[14:05:05] almost all of #releng was present
[14:05:12] and we did the meeting together, I was commenting on the pace
[14:05:23] s/pace/what was happening/
[14:05:24] in short
[14:05:36] our team is going to have more triage meeting. We did one for the beta cluster yesterday
[14:05:45] all I had to say :}
[14:05:54] zeljkof: do you have a weekly triage planned for browser-tests ?
[14:06:06] browsertests triage starts after this meeting
[14:06:12] awesome
[14:06:25] just Dan and me for the first time
[14:06:37] but we will announce it and open it next week
[14:06:37] that is a good start
[14:06:54] lets move to composer-merge-plugin madness
[14:07:14] #topic composer-merge-plugin autoload merging
[14:07:28] addshore: jzerebecki that follow up a composer-merge-plugin issue https://github.com/wikimedia/composer-merge-plugin/issues/18
[14:07:47] that is now merged and working
[14:07:50] #link https://github.com/wikimedia/composer-merge-plugin/issues/18 github issue: Add merge `autoload` and resolve paths
[14:07:59] #link https://github.com/wikimedia/composer-merge-plugin/pull/29 github pull request (merged)
[14:08:24] i tested it with wikibase and another extension in CI
[14:08:32] ohhh
[14:08:55] I can't remember what it was blocking
[14:09:04] but I guess you guys can make progress again now right?
[14:09:36] yup now i need to adapt wikibase to not load the extension when the autoloader is loaded
[14:10:16] do you need anything done on CI front?
[14:10:57] nope. the next big thing will be to automate the vendor build, so that we can still deploy to beta when running composer in CI on master
[14:11:56] maybe we should just run composer on beta ?
[14:12:17] we had the discussion during the hackathon for the oid services on beta cluster https://phabricator.wikimedia.org/T100099
[14:12:36] that would work but means having production deployment different from beta deployment
[14:12:52] or maybe that is another task
[14:13:42] yeah that is the part that puzzles me
[14:14:18] if one update a dependency in the source master branch
[14:14:28] we would need another patch to update /vendor/ repo as
[14:14:29] well
[14:14:48] that ticket implies doing beta and production differently was agreed on, though it doesn't say that explicitly
[14:15:07] yup
[14:15:21] and subbu from the Parsoid team told me they want to use the deploy git repo on beta
[14:15:44] I guess i need to sort out the notes I took during the meetings
[14:15:46] yea when you'd build vendor automatically you could wait on the build being pushed to the vendor repo and then continue deploying
[14:16:11] and summarizing the discussion we had together on the last day of the hackathon (on the paper board )
[14:16:31] lets fill that as an action
[14:16:56] #action Antoine to write down the composer / vendor strategy for CI / Beta / production. With jzerebecki as a reviewer.
[14:17:12] part of our discussion is relected in my comment there https://phabricator.wikimedia.org/T88211#1317971
[14:17:20] s/relected/reflected/
[14:17:49] #link https://phabricator.wikimedia.org/T88211#1317971 Jan commenting about circular dependencies between source and deploy.
[14:17:52] ah great
[14:18:05] there is a lot of different tasks :/
[14:18:34] I will bring up the subject during the releng team meeting in a couple hours
[14:19:25] nothing else to add
[14:20:15] #topic CI infra triage
[14:20:32] lets go through the 15 infra bugs untrained
[14:20:39] s/untrained/untriagged/
[14:20:42] #link https://phabricator.wikimedia.org/project/board/401/
[14:20:57] #link https://phabricator.wikimedia.org/T100903 Run pywikibot test suite regularly on beta cluster as part of MediaWiki/Wikimedia CI
[14:21:20] the idea is to have a job running pywikibot test suite against beta
[14:21:33] need some work on their side
[14:23:04] #link https://phabricator.wikimedia.org/T100518 Reenable ssh MAC/KEX hardening on beta cluster and integration labs project
[14:23:13] that is merely a remember item
[14:23:15] moving to backlog
[14:23:32] probably need to ask Jenkins to be upgraded upstream
[14:23:37] o/
[14:24:39] Krinkle: hello! Doing the untriaged column of https://phabricator.wikimedia.org/project/board/401/
[14:25:03] #link https://phabricator.wikimedia.org/T100517 Jenkins jar should ship with a more recent jsch java lib version to support hardened algorithm
[14:25:09] flagged it as upstream
[14:25:17] and moved it to externally blocked
[14:25:42] #action Antoine fill a bug to upstream Jenkins to get jsch bundled lib updated ( https://phabricator.wikimedia.org/T100517 )
[14:25:55] #link https://phabricator.wikimedia.org/T99982 Upgrade PHPUnit to 4.0+
[14:26:09] jzerebecki: addshore legoktm someone asking to switch to PHPUnit 4.0 :-D
[14:26:24] yes!
[14:26:26] #info we use a stalled version of PHPUnit: 3.7.x . Provided via a git repo
[14:26:42] I have refused to upgrade it because the job ran against REL1_19
[14:26:48] which did not support PHPUnit 4.x
[14:26:52] maybe we can bump it now
[14:26:59] but I would rather migrate everything to composer
[14:27:07] |Yes!
[14:27:28] I guess we will need the composer-merge-plugin to merge in mediawiki/core and extension composer.json files
[14:27:42] and end up with some version of PHPUnit folks want
[14:28:03] I have no idea what is the progress toward moving mw/core to composer though
[14:28:14] nor can I find a task
[14:28:15] o/
[14:28:20] good morning legoktm !
[14:28:29] * hashar send coffee and donuts over the wire
[14:28:33] you are just in time!
[14:28:52] so, for phpunit the plan is to never upgrade it and instead migrate to composer, so we can update it in composer.json instead
[14:28:59] +1
[14:29:01] do we have any task / person working on switching mediawiki phpunit jobs to use composer?
[14:29:07] me :)
[14:29:23] https://phabricator.wikimedia.org/T90303?workflow=create is the task
[14:29:24] we first need to solve the how deploy to beta task
[14:29:38] #agreed Keep the git phpunit repo to 3.7 and never upgrade. Switch to composer instead https://phabricator.wikimedia.org/T90303
[14:30:18] I commenting on the task requesting to switch to PHPUnit 4.0+ https://phabricator.wikimedia.org/T99982
[14:30:31] jzerebecki: is there a bug filed for that? I remember talking about it
[14:30:57] i still need to file a bug about automatically building vendor
[14:31:37] which is a possible solution for deploying to beta in a CI with composer on master world
[14:31:40] what do you think about using the QA mailing list to exchange about it ?
[14:31:50] or maybe there is a reference task? (sorry I am confused)
[14:32:00] maybe one of the task can be flagged as epic
[14:32:19] i need to file a task for it and get the dependencies right
[14:33:03] #action jzerebecki to fill a task about automatically building vendor
[14:33:04] :-}
[14:33:16] :)
[14:33:59] I miss bugzilla treeview of blocking bugs
[14:34:54] legoktm: jzerebecki should we sync up on the qa list ?
[14:34:58] or maybe bring it to wikitech-l
[14:35:48] moved https://phabricator.wikimedia.org/T88211 "Unable to update libraries in MediaWiki core and vendor due to version mismatch and circular dependencies" to 'next'
[14:36:14] anything else to add ?
[14:37:00] no
[14:37:03] #link https://phabricator.wikimedia.org/T99552 Let Jenkins-mwext-sync clean up own open unmergable patch sets
[14:37:12] lowest prio, moving to backlog
[14:37:28] #link https://phabricator.wikimedia.org/T99413 Fix "PHP Warning: Module 'apc' already loaded" on zend slaves
[14:37:31] Krinkle: ^^^ :D
[14:37:41] and I thought apc was disabled
[14:37:51] because it caused some havoc / madness with the qunit jobs
[14:37:54] hashar: Can you please look at VE submodule failures? I think git plugin upgrade caused problems.
[14:38:17] It's hurting a lot and frequently.
[14:38:19] Krinkle: after the meeting yeah and fill a task!
[14:38:46] None of the jobs are updating hte submodule. They checkout the commit but don't sync the submodule. So the commit of the last working job is stuck in the submodule.
[14:40:24] apc.so is also enabled in a disable-html_errors.ini PHP conf file.
[14:40:27] updated task
[14:40:45] moved it to next since it is annoying
[14:40:49] should not be too hard to fix up
[14:41:08] #link https://phabricator.wikimedia.org/T98976 Jenkins does not shutdown due to a deadlock in the IRC plugin
[14:41:20] hashar: you were faster again
[14:41:29] faster?
[14:41:44] I nearly said the same thing in the apc bug
[14:41:53] #action Antoine to fill a bug upstream for https://phabricator.wikimedia.org/T98976
[14:42:52] oh
[14:44:45] #link https://phabricator.wikimedia.org/T90177 Create end-to-end automated test for Wikipedia native app(s)
[14:45:10] devs want some more framework to test their apps
[14:45:32] I am going to move it to #releng team
[14:45:44] we will need some work on the infra, but overall seems the bug is much larger
[14:46:35] #link https://phabricator.wikimedia.org/T98885 Unattended upgrade seems to only run daily instead of hourly
[14:46:44] on my radar
[14:47:27] hashar: https://phabricator.wikimedia.org/T101105
[14:47:47] Krinkle: thx. will look at it after this triage
[14:48:06] #link https://phabricator.wikimedia.org/T98294 Bump python-gear package to 0.5.6
[14:48:09] that is for zuul
[14:48:18] specially the zuul server on gallium
[14:48:28] need to repackage / update apt.wm.o
[14:48:50] moved to backlog
[14:49:50] Krinkle: hey I refreshed the board and your task appear :-]
[14:50:14] moved it to work in progress
[14:50:25] #link https://phabricator.wikimedia.org/T96919 Run QUnit tests via SauceLabs
[14:50:49] Krinkle: should that qunit/saucelabs move to backlog ?
[14:51:05] not sure how SauceLabs will be able to hit the wiki setup on our slaves though :/
[14:51:05] Yeah
[14:51:12] hashar: that's trivial.
[14:51:16] It has built-in tunnel support in Grunt.
[14:51:23] We're already using that.
[14:51:26] \O/
[14:51:34] should it be assigned to you ?
[14:51:38] No.
[14:51:43] ok
[14:51:54] #link https://phabricator.wikimedia.org/T96432 Run qunit tests in IE8 (and possibly other Grade A browsers)
[14:52:00] which is blocked by sauce lab task
[14:52:06] I might end up working on it based on needs from individual teams and resources available in RelEng, but I don't plan to.
[14:52:28] so should we make it normal priority and stalled ?
[14:53:00] it's not stalled. It's just not a priority I guess? It's like all other tasks in the backlog
[14:53:07] ok
[14:53:20] made it normal
[14:53:35] #link https://phabricator.wikimedia.org/T96390 Allow ref-updated listener to filter out tag deletions
[14:53:42] * Krinkle is curious to hear about progress of vm isolation, and (before isolation) progress on using git caches and smaller instanes.
[14:54:10] Krinkle: will post about it this week
[14:54:25] will bring it up with the releng team in an hour or so
[14:54:33] in short: not much progress :-D
[14:54:45] What is being worked on instead?
[14:55:51] hold on
[14:55:59] I would like to finish triaging the two other tasks
[14:56:06] #link https://phabricator.wikimedia.org/T96034 mwext-Wikidata-testextension-zend not queuing properly
[14:57:44] The mwext-*-testextension jobs are throttled by a plugin so there is only one instance of them per node. That is to prevent the slaves disk space to be filled with multiple copy of mediawiki/core
[14:57:44] There might be a bug in that plugin :(
[14:58:14] Since when?
[14:58:21] (throttled textextension)
[14:58:27] the throttling ? been around for ages
[14:58:39] Where is that config?
[14:58:51] we added that when addshore created jobs for all extensions
[14:58:57] so it is somewhere in the jjb config files
[14:59:05] *reads up*
[14:59:20] - throttle-one-per-node
[14:59:21] I see it
[14:59:27] https://github.com/wikimedia/integration-config/blob/831e6302e4f02b3dbf7355aaa2c762756282439f/jjb/mediawiki-extensions.yaml#L176-L183
[14:59:27] OK
[14:59:31] jjb/mediawiki-extensions.yaml:182: - throttle-one-per-node
[14:59:50] So what does that mean. One instance of extensions Foo per node.
[15:00:03] yup
[15:00:04] Or one instance globally of this template.
[15:00:08] on instance of that specific job per node
[15:00:17] *one
[15:00:21] hashar: Is this about runtime or workspace?
[15:00:26] so we don't end up with dupe workspaces
[15:00:33] because if it's about runtime, that says _nothing_ about having multiple copies or not
[15:00:43] so if you have 4 builds triggered, they will be run on 4 different nodes
[15:00:48] if you have only 3 nodes available
[15:00:49] 3 run
[15:00:49] That will still end up running Foo on slave1 and tomororw on slave2
[15:01:02] 1 is kept in Jenkins build queue waiting for an available executor
[15:01:20] Right
[15:01:28] So no multiple copies of one extension on the same slave.
[15:01:34] no workspace@2 etc.
[15:01:41] but still all extensions on all slaves.
[15:01:42] OK. cool
[15:01:44] yeah that is to prevent the workspace@2
[15:01:47] but that's not related to the bug I think.
[15:01:50] which fill disk because of mw/core
[15:02:00] hashar: well, only becuse we don't clear workspaces..
[15:02:07] :)
[15:02:09] yeah
[15:02:15] so in the end
[15:02:22] * zeljkof-meeting has to go to another meeting
[15:02:25] that plugin might explain why the job is not being triggered
[15:02:34] zeljkof-meeting: sure! will join late though :-D
[15:02:39] zeljkof-meeting: thanks to have attended!
[15:02:49] I see this bug from time to time, and for sure there were other slaves available that were not executing this job.
[15:03:00] maybe the throttler does not work properly
[15:03:01] It's probably just another bug in Jenkins/Zuul scheduling illogical thing.
[15:03:06] or the configuration we crafted is wrong
[15:03:12] or a bug yeah
[15:03:19] eventually, we will get rid of that plugin
[15:03:28] and the last bug is
[15:03:33] #link https://phabricator.wikimedia.org/T94212 Accommodate flaky tests flapping
[15:03:53] which I am sending straight to backlog
[15:03:56] and lowering priority for
[15:04:26] #info triaged the whole "Untriaged" column of CI infra!
[15:04:36] jzerebecki: addshore: Krinkle: anything else to add ?
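Editorial note on the throttling exchange above (15:00:43 to 15:01:02): a minimal sketch of the behaviour as described in the chat, where four builds of one job meet three available nodes, so three run and one waits in the queue. This is an illustrative simulation only, not the Jenkins throttle plugin's actual implementation; the function `schedule`, the parameter `max_per_node`, and the build and node names are all hypothetical.

```python
from collections import deque


def schedule(builds, nodes, max_per_node=1):
    """Assign builds of ONE job to nodes, at most max_per_node per node.

    Hypothetical model of the "one instance of that specific job per node"
    rule discussed in the log; builds that find no free node stay queued.
    """
    running = {node: [] for node in nodes}
    queue = deque(builds)
    for node in nodes:
        # Fill this node up to its per-node limit, in arrival order.
        while queue and len(running[node]) < max_per_node:
            running[node].append(queue.popleft())
    return running, list(queue)


# Four builds of the same job triggered, only three nodes available.
running, queued = schedule(
    ["build-1", "build-2", "build-3", "build-4"],
    ["slave1", "slave2", "slave3"],
)
print(running)
print(queued)
```

Under this model no node ever holds two concurrent builds of the same job, which is what avoids the extra workspace@2 copy mentioned in the chat; it says nothing about which node a later build lands on.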
[15:04:39] #topic the end
[15:04:47] no thx
[15:05:37] gotta investigate that submodule trouble for VE
[15:05:40] hashar: did you end the meeting? can we start ours here?
[15:05:45] #endmeeting
[15:05:45] Meeting ended Tue Jun 2 15:05:45 2015 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)
[15:05:45] Minutes: https://tools.wmflabs.org/meetbot/wikimedia-office/2015/wikimedia-office.2015-06-02-14.01.html
[15:05:45] Minutes (text): https://tools.wmflabs.org/meetbot/wikimedia-office/2015/wikimedia-office.2015-06-02-14.01.txt
[15:05:45] Minutes (wiki): https://tools.wmflabs.org/meetbot/wikimedia-office/2015/wikimedia-office.2015-06-02-14.01.wiki
[15:05:46] Log: https://tools.wmflabs.org/meetbot/wikimedia-office/2015/wikimedia-office.2015-06-02-14.01.log.html
[15:05:48] zeljkof-meeting: yeah go ahead
[15:08:58] hashar: thanks!
[15:09:00] #startmeeting Browser test meeting triage
[15:09:01] Meeting started Tue Jun 2 15:09:00 2015 UTC and is due to finish in 60 minutes. The chair is marxarelli. Information about MeetBot at http://wiki.debian.org/MeetBot.
[15:09:01] Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
[15:09:01] The meeting name has been set to 'browser_test_meeting_triage'
[15:09:41] #topic Sorting triage column of https://phabricator.wikimedia.org/project/view/1078/
[15:09:52] o/
[15:11:29] #topic Assessing Doing column
[15:11:57] tasks outside of a week timebox will be moved back to TODO
[15:14:14] #info moved T96283 back to TODO
[15:15:58] #info T99653 and T99652 should stay
[15:16:09] #topic moving on to Waiting for
[15:21:18] #info moved T89353 to TODO
[15:28:06] #info moved T94162 to TODO (will try to schedule pairing session with ryasmeen)
[15:33:26] #info moved epic T94150 to release engineering board to task long term progress
[15:37:31] #info pinging T92154, leaving in Waiting for
[15:41:28] sorry can not attend after all. Gotta fix some magic jjb/git issue :(
[16:00:24] #endmeeting
[16:00:25] Meeting ended Tue Jun 2 16:00:24 2015 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)
[16:00:25] Minutes: https://tools.wmflabs.org/meetbot/wikimedia-office/2015/wikimedia-office.2015-06-02-15.09.html
[16:00:25] Minutes (text): https://tools.wmflabs.org/meetbot/wikimedia-office/2015/wikimedia-office.2015-06-02-15.09.txt
[16:00:25] Minutes (wiki): https://tools.wmflabs.org/meetbot/wikimedia-office/2015/wikimedia-office.2015-06-02-15.09.wiki
[16:00:25] Log: https://tools.wmflabs.org/meetbot/wikimedia-office/2015/wikimedia-office.2015-06-02-15.09.log.html