[00:21:32] Phabricator, Wikimedia-Git-or-Gerrit: Task/Job 7092771 (PhabricatorRepositoryCommitHeraldWorker) has over 1600 failures (Importing revision) - https://phabricator.wikimedia.org/T87549#993974 (Se4598) NEW
[00:26:07] Phabricator, Wikimedia-Git-or-Gerrit: Task/Job 7092771 (PhabricatorRepositoryCommitHeraldWorker) has over 1600 failures (Importing revision) - https://phabricator.wikimedia.org/T87549#993981 (Se4598)
[03:10:47] Phabricator, Phabricator.org: Phabricator login button uses a log out icon - https://phabricator.wikimedia.org/T87552#994037 (Isarra) NEW
[03:50:55] Phabricator, Wikimedia-Git-or-Gerrit: Task/Job 7092771 (PhabricatorRepositoryCommitHeraldWorker) has over 1600 failures (Importing revision) - https://phabricator.wikimedia.org/T87549#994063 (Aklapper) Wondering if that's similar to {T87282}. Probably not.
[06:11:15] Phabricator: Do not show "MediaWiki Userpage" on Phabricator profile if value is "Unknown" - https://phabricator.wikimedia.org/T903#994124 (mmodell) I don't think this would be very difficult though I'm not sure it's really beneficial either.
[06:38:39] PROBLEM - Puppet failure on deployment-rsync01 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[07:03:40] RECOVERY - Puppet failure on deployment-rsync01 is OK: OK: Less than 1.00% above the threshold [0.0]
[07:10:36] !log Un-cherry-pick I0b8408 and Iff8598 that were merged, so rebase can succeed, so https://gerrit.wikimedia.org/r/#/c/173336/13 can be cherry-picked
[07:27:35] !log Cherry-pick https://gerrit.wikimedia.org/r/#/c/173336/14
[11:45:17] Phabricator: Some imported bugzilla tasks have no component - https://phabricator.wikimedia.org/T87560#994262 (Nemo_bis) NEW
[12:21:01] Phabricator: Some imported bugzilla tasks have no component - https://phabricator.wikimedia.org/T87560#994291 (valhallasw) Open>Invalid a:valhallasw The component Wikimedia-SSL-related was renamed to HTTPS: https://phabricator.wikimedia.org/tag/https/
[12:37:35] PROBLEM - SSH on deployment-lucid-salt is CRITICAL: Connection refused
[15:06:42] pywikibot-core, Continuous-Integration: Whitelist people with +2 rights - https://phabricator.wikimedia.org/T87413#994488 (XZise) @Mpaa if you want to make it more automatic, you can query the members via SSH: `ssh -p 29418 gerrit.wikimedia.org gerrit ls-members pywikibot`.
[15:46:51] Phabricator, Wikimedia-Git-or-Gerrit: Task/Job 7092771 (PhabricatorRepositoryCommitHeraldWorker) has over 1600 failures (Importing revision) - https://phabricator.wikimedia.org/T87549#994535 (Chad) >>! In T87549#994063, @Aklapper wrote: > Wondering if that's similar to {T87282}. Probably not. It is.
[17:15:11] Release-Engineering, MediaWiki-Developer-Summit-2015, Continuous-Integration: 2015 MediaWiki Developer Summit - State of continuous integration (CI), what we did in 2014 - https://phabricator.wikimedia.org/T86750#994590 (hashar) Etherpad: https://etherpad.wikimedia.org/p/MWDS2015-CI
[17:15:18] Release-Engineering, MediaWiki-Developer-Summit-2015, Continuous-Integration: 2015 MediaWiki Developer Summit - State of continuous integration (CI), what we want to do in 2015 - https://phabricator.wikimedia.org/T86752#994591 (hashar) Etherpad https://etherpad.wikimedia.org/p/MWDS2015-CI
[18:02:25] Beta-Cluster, Release-Engineering, operations: Intermittent DNS failures in beta labs regularly trigger a bunch of puppet failures - https://phabricator.wikimedia.org/T87480#994642 (Reedy)
[18:10:43] Beta-Cluster, Release-Engineering, operations: Intermittent DNS failures in beta labs regularly trigger a bunch of puppet failures - https://phabricator.wikimedia.org/T87480#994664 (yuvipanda)
[18:57:08] Continuous-Integration: mediawiki-phpunit-hhvm fails sometimes with IOError - https://phabricator.wikimedia.org/T87591#994738 (Umherirrender) NEW
[18:57:48] Release-Engineering, MediaWiki-Developer-Summit-2015, Continuous-Integration: 2015 MediaWiki Developer Summit - State of continuous integration (CI), what we did in 2014 - https://phabricator.wikimedia.org/T86750#994753 (hashar)
[18:58:37] Release-Engineering, MediaWiki-Developer-Summit-2015, Continuous-Integration: 2015 MediaWiki Developer Summit - State of continuous integration (CI), what we want to do in 2015 - https://phabricator.wikimedia.org/T86752#994754 (hashar)
[19:01:30] Project-Creators, Release-Engineering: Add in Phabricator quarterly milestones for RelEng - https://phabricator.wikimedia.org/T75729#994768 (Aklapper) a:Aklapper>None So if #releng-2015-q2 and #releng-2015-q3 is wanted, please go ahead and create them (deadline icon, green color), or close this task as...
[19:03:00] Hey, beta-code-update-eqiad has been failing for a couple of days based on MathSearch repo fetch it looks like? Anyone following this up?
[19:03:40] I don't think any of us have had much time for regular work in the past few days...
[19:03:53] I can look into it
[19:04:18] Can you link me to the failure log?
[19:04:34] Sure.
[19:04:42] https://integration.wikimedia.org/ci/view/Beta/job/beta-code-update-eqiad/41959/console
[19:04:59] 18:53:35 Submodule 'modules/min' (https://gerrit.wikimedia.org/r/p/mediawiki/extensions/MathSearch.git) registered for path 'modules/min'
[19:04:59] 18:53:35 No submodule mapping found in .gitmodules for path 'third_party/MathJax'
[19:04:59] 18:53:35 Failed to recurse into submodule path 'modules/min'
[19:07:01] hmm
[19:07:58] I'll bug Hashar about it ;)
[19:08:08] Thanks. :-)
[19:34:23] Engineering-Community, Code-Review: How to prioritize code review of patches submitted by volunteers - https://phabricator.wikimedia.org/T78768#994852 (Aklapper) Afraid of sidetracking but: "If you don't feel responsible for project XY, why would you review its code contributions?" Especially thinking in ter...
[19:34:58] Engineering-Community, Code-Review: How to prioritize code review of patches submitted by volunteers - https://phabricator.wikimedia.org/T78768#994853 (Aklapper) CCing @damons as he brought up "giving feedback faster" earlier today at MW Dev Summit and might have some thoughts to share. (Feel free to unsubscr...
[20:05:44] Beta-Cluster: deployment-mx is its own puppetmaster - https://phabricator.wikimedia.org/T86575#994945 (Jgreen) We need to test mx functionality, as part of bounce handler testing, so we need role::mail::mx which conflicts with labs standard instance config . The local change in question is: class standard {...
[20:32:04] Beta-Cluster: deployment-mx is its own puppetmaster - https://phabricator.wikimedia.org/T86575#994961 (Jgreen) background: https://lists.wikimedia.org/pipermail/labs-l/2014-October/002977.html
[21:06:05] !log I just merged a scap change that probably will break the beta-recomile-math-textvc-eqiad job -- https://gerrit.wikimedia.org/r/#/c/186808/
[21:06:10] Logged the message, Master
[21:06:15] The MathSearch submodule is broken
[21:06:36] bad hash or ?
[21:06:57] bad submodule
[21:07:00] twentyafterfour@deployment-bastion:/srv/mediawiki-staging/php-master/extensions/MathSearch/modules/min$ git submodule init
[21:07:02] No submodule mapping found in .gitmodules for path 'third_party/MathJax'
[21:07:26] I can't find where the submodule is even defined, so not sure why it thinks there IS a submodule at third_party/MathJax
[21:07:38] since it's not in the .git/config or .gitmodules for that repo
[21:08:11] I'm amazed that we have at least 3 levels of submodules
[21:08:15] Continuous-Integration, ContentTranslation-Deployments: Enable Debian CI tests on all Apertium packages - https://phabricator.wikimedia.org/T87607#995020 (KartikMistry) NEW
[21:08:17] <^d> Check paren't .git/ diretory
[21:08:19] <^d> *directory
[21:08:22] <^d> submodules fucking suck
[21:08:30] no shit...
[21:08:31] <^d> Probably in .git/modules/*
[21:08:46] * twentyafterfour hates submodules with newfound passion
[21:09:32] the error is from here -- https://github.com/physikerwelt/min/tree/master/third_party
[21:09:49] which is the submodule we import, it has a recursive dep
[21:09:55] that is not set up correctly
[21:10:40] and it is broken in the upstream -- https://github.com/DPRL/min/tree/master/third_party
[21:10:49] Phabricator: Delete specific user account in Phabricator - https://phabricator.wikimedia.org/T87608#995028 (Aklapper) NEW
[21:11:49] * bd808 wonders why we have an extension pulling a submodule from github
[21:12:11] * twentyafterfour was wondering the same
[21:12:27] who should I talk to about that? any ideas?
[21:12:37] Math guy
[21:12:50] physikerwelt
[21:13:12] Beta-Cluster: deployment-mx is its own puppetmaster - https://phabricator.wikimedia.org/T86575#995035 (scfc) I think the underlying problem (not include role::mail::sender) is fixable by hiera; this would solve a similar situation with `tools-mail` where we need a different exim package for LDAP lookups as well.
[21:13:12] https://phabricator.wikimedia.org/p/Physikerwelt/
[21:13:13] math guy lol
[21:13:34] he's a prof in Germany; nice guy
[21:14:09] https://phabricator.wikimedia.org/tag/mediawiki-extensions-mathsearch/
[21:14:17] Should we temporarily revert?
[21:14:30] Get back to green status and then work out what went wrong?
[21:15:20] James_F: probably, I don't see physikerwelt on IRC
[21:15:43] should I report an issue against his github repo?
[21:16:04] the DPRL one? yeah
[21:16:28] https://gerrit.wikimedia.org/r/#/c/185974/ is the one that brought it into MathSearch AFAICT.
[21:16:52] Phabricator: Delete specific user account in Phabricator - https://phabricator.wikimedia.org/T87608#995048 (chasemp) Open>Resolved a:chasemp ``` IMPORTANT: OBJECTS WILL BE PERMANENTLY DESTROYED! There is no way to undo this operation or ever retrieve this data.
These 1 object(s) will be completely...
[21:17:08] Phabricator: Create "Lyon hackathon 2015 ideas" project on phabricator - https://phabricator.wikimedia.org/T87610#995051 (Jdlrobson) NEW
[21:17:16] https://gerrit.wikimedia.org/r/#/c/186829/ is the revert.
[21:17:16] I really wish we could get him more help, all those self-reviews :/
[21:17:21] greg-g: Yeah. :-(
[21:18:19] btw, if you want to chat CI past/present/future, come to conf room #2
[21:18:52] greg-g: Would be nice if it didn't clash with other presentations. Maybe I should have pushed for it to be plenary.
[21:19:29] yeah, it's hard to pack it all in
[21:19:37] stupid "un"conferences :)
[21:20:02] * James_F would readily give up his talk to go to the CI future talk, but… :-)
[21:20:24] https://github.com/DPRL/min/issues/1
[21:20:56] Thanks twentyafterfour.
[21:21:17] thanks to bd808, he tracked it down better than I could
[21:22:37] merge failed on the revert
[21:24:13] Triagers, Phabricator, operations, Project-Creators: Broaden the group of users that can create projects in Phabricator - https://phabricator.wikimedia.org/T706#995080 (Jdlrobson) Please can I be added? I'm still trying to work out a good mental model that is compatible with how the mobile team works that wil...
[21:34:01] Phabricator, operations: Add @emailbot to #wmf-nda - https://phabricator.wikimedia.org/T87611#995099 (chasemp)
[21:34:54] Phabricator, operations: Add @emailbot to #wmf-nda - https://phabricator.wikimedia.org/T87611#995103 (chasemp) @csteipp, do you have any objection? I can't think of any other way to solve this problem that isn't worse.
[21:41:27] Beta-Cluster, operations: Renumber apache user/group to uid=48 - https://phabricator.wikimedia.org/T78076#995130 (yuvipanda) I'll let @faidon elaborate, but I think we're going to re-number in prod and also try to explicitly set uid/gid for all system users declared in puppet.
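The "No submodule mapping found in .gitmodules for path 'third_party/MathJax'" error in the exchange above typically means the repository's index records a gitlink (mode 160000) at that path while .gitmodules has no section declaring it. A minimal Python sketch of that check follows, assuming you feed it the text of .gitmodules and the output of `git ls-files --stage`; the function names and the sample data are hypothetical, not taken from the MathSearch or min repositories.

```python
import re


def declared_submodule_paths(gitmodules_text):
    """Collect every `path = ...` value found in .gitmodules text."""
    paths = set()
    for line in gitmodules_text.splitlines():
        match = re.match(r"\s*path\s*=\s*(.+?)\s*$", line)
        if match:
            paths.add(match.group(1))
    return paths


def gitlink_paths(ls_files_stage_output):
    """Collect paths with mode 160000 from `git ls-files --stage` output."""
    paths = set()
    for line in ls_files_stage_output.splitlines():
        # Each line has the form "<mode> <object> <stage>\t<path>".
        meta, _, path = line.partition("\t")
        if meta.split()[:1] == ["160000"]:
            paths.add(path)
    return paths


def unmapped_submodules(gitmodules_text, ls_files_stage_output):
    """Gitlinks in the index that .gitmodules does not declare."""
    declared = declared_submodule_paths(gitmodules_text)
    return sorted(gitlink_paths(ls_files_stage_output) - declared)


# Hypothetical sample data for illustration only.
example_gitmodules = (
    '[submodule "third_party/other"]\n'
    "\tpath = third_party/other\n"
    "\turl = https://example.org/other.git\n"
)
example_index = (
    "100644 aaaa 0\tREADME\n"
    "160000 bbbb 0\tthird_party/MathJax\n"
    "160000 cccc 0\tthird_party/other\n"
)
print(unmapped_submodules(example_gitmodules, example_index))
```

Running a check like this at each nesting level would show which repository carries the undeclared gitlink, which matters here because the chain is at least three submodules deep.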
[22:01:57] twentyafterfour: the tracking task for YuviPanda's work: https://phabricator.wikimedia.org/T87220
[22:03:48] greg-g: twentyafterfour yeah, tell me if I’ve missed anything?
[22:04:37] Beta-Cluster, operations: Set up an alert for unmerged changes in deployment-prep - https://phabricator.wikimedia.org/T87616#995192 (yuvipanda) NEW a:yuvipanda
[22:04:38] thanks!
[22:04:48] subscribed and flagged
[22:05:09] pink flag for fighting cancer
[22:05:19] greg-g: twentyafterfour maybe we should set up a small meeting sometime to agree on some things, divide up tasks?
[22:05:34] and most importantly set up a ‘contract’ of sorts between releng and ops?
[22:06:00] oh man, you're trying to make it all official...
[22:06:04] alright fine :)
[22:06:15] YuviPanda: when do you leave town?
[22:06:23] greg-g: friday morning sadly
[22:06:34] I really wish y'all (ops) hadn't holed up on Tues/Fri
[22:06:45] So what's it going to take to get people to take beta seriously? it doesn't sound like we're taking it seriously as long as it's running as a labs project
[22:07:00] phabricator isn't a labs project
[22:07:07] greg-g: tuesday was offsite, and friday I was running around interrupting everyone! Have like 7 more tasks from different teams now.
[22:07:36] greg-g: but yes, could have been better.
[22:07:37] oh well
[22:07:47] I'm leaving wednesday :(
[22:07:58] so, re seriousness/labs: It's a good point. This is actually what I was trying to get to when I suggested Beta being migrated to the new DC, but that is premature I think
[22:08:20] I'd like to stop calling it beta as well
[22:08:26] YuviPanda: realistically, how hard, not counting hardware, would it be to set up another openstack cluster/"labs" just for beta?
[22:08:35] we need a staging environment, and we need to call it staging
[22:08:41] for a beta-like setup.*
[22:08:45] :)
[22:09:03] <^d> I don't think labs is the problem.
It's not as though we /need/ dedicated hardware for it
[22:09:14] <^d> I think it's just the current hodge-podge of a setup.
[22:09:23] greg-g: so dallas labs isn’t being fired up now. so we could technically fire it up, restrict it to just deployment-prep, in about a month of andrew-bogot’s time
[22:09:28] it can be almost entirely the same as the current beta setup but simply moving it out of labs and calling it staging will make a big difference in people's perceptions
[22:09:28] greg-g: but not sure what that solves
[22:09:38] I kinda agree with twentyafterfour
[22:09:49] and ^d
[22:09:50] <^d> I think the naming is important.
[22:09:52] it solves people's perceptions, if you agree with him :)
[22:09:54] <^d> And reducing the delta to prod.
[22:10:05] ^d: my ‘quarterly’ goal is fixing that (delta to prod) https://phabricator.wikimedia.org/T87220
[22:10:07] yes
[22:10:22] <^d> SUBSCRIBED
[22:10:25] :D
[22:10:29] <^d> Seeing as this is sorta my job now ;-)
[22:10:37] * ^d comes in and licks ALL the cookies
[22:10:41] ok, I'm going to act like a manager and say: twentyafterfour YuviPanda ^d you should all find each other at 2:30 (break-time) and talk about this, including what you want from yuvi/ops for the rest of the quarter :)
[22:10:53] ^d: I think the blocking tasks of that are complete-ish, from an ops perspective. let me know if it’s missing things
[22:11:18] sounds good
[22:11:40] Continuous-Integration: Jenkins build timeout - https://phabricator.wikimedia.org/T87617#995218 (Deskana) NEW
[22:12:02] <^d> We'll have to meet outside for reasons I hope are obvious.
[22:12:25] * ^d googles for the nearest convenience store too, pack's almost empty
[22:12:46] this thing that Hashar is describing right now, needs to die in a fire
[22:13:21] Continuous-Integration: Jenkins build timeout in Android Apps tests too short - https://phabricator.wikimedia.org/T87617#995225 (greg)
[22:13:29] twentyafterfour: what’s he describing?
I’m not fully sure
[22:13:38] multiversion
[22:13:41] oh, that.
[22:13:48] as a staging environment
[22:13:59] <^d> multiversion <3
[22:14:00] Continuous-Integration: Jenkins build timeout in Android Apps tests too short - https://phabricator.wikimedia.org/T87617#995218 (greg) (I think I changed the title accordingly: is that what you mean Dan?)
[22:14:03] lol
[22:14:19] * ^d remembers the awful days when every wiki was stuck using the same code, except test.wp which ran straight from NFS
[22:14:22] <^d> THOSE WERE THE DAYS
[22:15:27] Continuous-Integration: Jenkins build timeout in Android Apps tests too short - https://phabricator.wikimedia.org/T87617#995230 (Deskana) >>! In T87617#995225, @greg wrote: > (I think I changed the title accordingly: is that what you mean Dan?) I don't know exactly what I mean, except that this test probably...
[22:18:56] ^d that sounds like a good thing, minus NSF
[22:18:59] NFS
[22:21:18] <^d> Basically it's not, because people write bad code sometimes (more often than not :)). So we like being able to deploy to some subset of the full traffic load :)
[22:21:50] <^d> enwiki is like ~50% of the traffic, so it makes for nice buckets.
[22:26:12] twentyafterfour: ^d YuviPanda I have to run and take a call, you still should talk :)
[22:26:44] ^d: I understand that it's nice to deploy to a subset of traffic load, I just don't know if this is the best way to do that
[22:26:58] <^d> Hysterical raisons :p
[22:27:17] what if one server ran the newest code and a load balancer sent a % of traffic there
[22:27:29] we already do that as well, actually :D
[22:27:50] well, as of two weeks ago at least
[22:27:59] but it doesn’t run newer code or anything, so ignore me
[22:28:05] lol
[22:29:28] that’s the X-Wikimedia-Debug or something
[22:31:11] YuviPanda: https://github.com/wikimedia/FirefoxWikimediaDebug
[22:31:25] twentyafterfour: ^
[22:32:27] That setup is still "on the train" but pins the requests to a specific cluster host
[22:32:58] It would be pretty snazzy to springboard off of that into "pre-deploy" versions
[23:17:58] Continuous-Integration: Jenkins: Jobs should not be affected by .git/index.lock of previous run - https://phabricator.wikimedia.org/T49638#995375 (Krinkle) Happened again just now: https://integration.wikimedia.org/ci/job/mediawiki-phpunit-hhvm/1672/console ``` 23:09:07 Traceback (most recent call last): 23:...
[23:18:13] Continuous-Integration: Jenkins: Jobs should not be affected by .git/index.lock of previous run - https://phabricator.wikimedia.org/T49638#995376 (Krinkle) Resolved>Open
[23:19:17] Continuous-Integration: Jenkins: Jobs should not be affected by .git/index.lock of previous run - https://phabricator.wikimedia.org/T49638#524026 (Krinkle) Open>Resolved
[23:19:24] * bd808 grumbles at zuul-cloner
[23:19:30] Continuous-Integration: Jenkins: Jobs should not be affected by .git/index.lock of previous run - https://phabricator.wikimedia.org/T49638#524026 (Krinkle) Sorry, wrong task. See T86734.
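The idea floated above at 22:27:17, one server running the newest code with a load balancer sending a percentage of traffic to it, can be illustrated with deterministic hash bucketing. The Python sketch below is only an illustration of that general technique, not a description of how Wikimedia's load balancers or X-Wikimedia-Debug work; the function name `sends_to_new_code` and the choice of request key are assumptions.

```python
import hashlib


def sends_to_new_code(request_key: str, percent: int) -> bool:
    """Return True if this key falls in the first `percent` of 100 buckets.

    `request_key` is a hypothetical stable identifier (for example a
    session or client id), so the same client keeps hitting the same side.
    """
    digest = hashlib.sha256(request_key.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    return bucket < percent


# Count how many of 1000 made-up keys would be routed at a 10% setting.
routed = sum(sends_to_new_code(f"client-{i}", 10) for i in range(1000))
print(routed, "of 1000 sample keys routed to the new code")
```

Hashing a stable key instead of drawing a random number per request keeps routing sticky, and raising the percentage only adds clients to the new side without moving existing ones back.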
[23:19:32] IOError: Lock at '/mnt/jenkins-workspace/workspace/mediawiki-phpunit-hhvm/src/.git/HEAD.lock' could not be obtained
[23:19:51] Continuous-Integration: Zuul-cloner failing to acquire lock sometimes ("IOError: Lock for file .git/config did already exist, lock is illegal") - https://phabricator.wikimedia.org/T86734#975131 (Krinkle) A slightly different lock error happened just now: https://integration.wikimedia.org/ci/job/mediawiki-phpu...
[23:20:33] Continuous-Integration: Zuul-cloner failing to acquire .git/config lock sometimes - https://phabricator.wikimedia.org/T86730#995393 (Krinkle)
[23:20:48] Continuous-Integration: Zuul-cloner failing to acquire .git/config lock sometimes - https://phabricator.wikimedia.org/T86730#975087 (Krinkle) Open>Resolved a:Krinkle A slightly different lock error happened just now: https://integration.wikimedia.org/ci/job/mediawiki-phpunit-hhvm/1672/console ``` 23...
[23:21:27] Continuous-Integration: Jenkins build timeout in Android Apps tests too short - https://phabricator.wikimedia.org/T87617#995397 (Deskana) Sometimes, if you keep asking Jenkins to recheck, it just works: https://gerrit.wikimedia.org/r/#/c/186862/
[23:22:33] !log rm integration-slave1006:/mnt/jenkins-workspace/workspace/mediawiki-phpunit-hhvm/src/.git/HEAD.lock (file was timestamped Jan 22 23:55)
[23:22:35] Logged the message, Master
[23:23:58] Phabricator: request for deletion: 'shell' project - https://phabricator.wikimedia.org/T87623#995406 (Dzahn) p:Triage>Low
[23:25:37] Phabricator, operations: Add @emailbot to #wmf-nda - https://phabricator.wikimedia.org/T87611#995412 (RobH) a:csteipp I've assigned this to Chris for his commentary. Chris: Please provide feedback and then feel free to unassign yourself as owner (or assign to me since I'll be working on this as it gets re...
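The HEAD.lock removed at 23:22:33 was a leftover from an earlier run, as its old timestamp showed. A minimal Python sketch of how such leftovers could be spotted by age follows; the function name `find_stale_locks` and the age threshold are assumptions for illustration, and the sketch only reports candidates because a recent lock may belong to a git process that is still running.

```python
import os
import time


def find_stale_locks(git_dir, max_age_seconds, now=None):
    """List *.lock files under git_dir whose mtime is older than the limit."""
    now = time.time() if now is None else now
    stale = []
    for root, _dirs, files in os.walk(git_dir):
        for name in files:
            if name.endswith(".lock"):
                path = os.path.join(root, name)
                if now - os.path.getmtime(path) > max_age_seconds:
                    stale.append(path)
    return sorted(stale)
```

A call such as `find_stale_locks("/path/to/workspace/src/.git", 3600)` (placeholder path) would list lock files untouched for more than an hour, which an operator could then inspect before removing by hand.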
[23:26:18] Beta-Cluster: BounceHandler extension surprisingly missing from beta wiki - https://phabricator.wikimedia.org/T87624#995414 (01tonythomas) NEW
[23:27:34] Continuous-Integration: Zuul-cloner failing to acquire .git/config lock sometimes - https://phabricator.wikimedia.org/T86730#995421 (Krinkle) Resolved>Open
[23:27:46] bd808: fixing..
[23:28:11] bd808: ah, thx
[23:28:19] *nod*
[23:28:45] I don't know why slave1006 is the one that does that most of the time
[23:28:55] maybe just the job hashing
[23:43:14] Release-Engineering, MediaWiki-Developer-Summit-2015, Quality-Assurance: Advanced Topics in Browser Test Automation - https://phabricator.wikimedia.org/T86070#995469 (BGerstle-WMF)
[23:46:30] PROBLEM - Free space - all mounts on deployment-cache-upload02 is CRITICAL: CRITICAL: deployment-prep.deployment-cache-upload02.diskspace._srv_vdb.byte_percentfree.value (<100.00%)