[00:04:08] I'm looking at that task now
[00:04:32] is it just a train blocker? not emitting errors in production right now?
[00:05:34] TimStarling: yeah, we rolled back. not currently in production.
[00:11:03] it's partly a PHP7/HHVM conflict I think
[00:13:40] Oh good.
[00:14:27] my reason for saying that is just that $empty = ''; $obj->$empty = ''; doesn't work on HHVM, it raises "Hit fatal : Cannot access empty property"
[00:14:39] when I test on deploy1001
[00:15:02] it does work on PHP7
[00:15:14] if you unserialize such an object on HHVM, that works
[00:22:42] https://phabricator.wikimedia.org/P8832 <- fatals on php7.0 but not on 7.2, somewhat to my surprise. (if anything i'd have expected that to go the other direction.)
[00:24:30] there were some bugs filed about this in the context of JSON unserialization, I think the conclusion was that we just weren't going to use 7.0
[00:40:18] brennen: dyk about 3v4l.org? https://3v4l.org/ad4g0
[00:40:54] RoanKattouw: i hadn't! nice.
[00:43:58] (huh. it appears i bookmarked that in 2016 and immediately forgot about it.)
[00:53:30] seems pretty unlikely T229366 is getting any patches yet this (USA-relative) evening, but just for the record i'm signing off now and will pick the train back up in the (colorado-relative) morning.
[00:53:30] T229366: serialize(): "" returned as member variable from __sleep() but does not exist - https://phabricator.wikimedia.org/T229366
[01:06:11] I'm still on the case, I can give updates if anyone else is looking at it
[01:06:57] but I assume nothing will happen today
[01:07:54] I'm around, naturally.
[01:44:32] fully isolated and commented on the task
[02:20:28] fix submitted, please review
[03:26:25] it took 19 minutes for jenkins to merge that change
[03:29:41] wmf-quibble-core-vendor-mysql-hhvm-docker https://integration.wikimedia.org/ci/job/wmf-quibble-core-vendor-mysql-hhvm-docker/20515/console : SUCCESS in 18m 10s
[03:29:41] wmf-quibble-core-vendor-mysql-php72-docker https://integration.wikimedia.org/ci/job/wmf-quibble-core-vendor-mysql-php72-docker/3800/console : SUCCESS in 19m 36s
[03:30:36] yeah, I'm looking at the log of the 19 minute one
[03:32:18] there's a lot of different things together in that job, would it make sense to break it up more?
[03:32:23] the majority of the time is spent in browser tests and the second phpunit run with databases
[03:32:40] potentially, yes
[03:33:42] IIRC it takes about 2-3 minutes for setup
[03:35:13] setup time could be optimised
[03:37:39] I've been playing with btrfs lately, at RationalWiki
[03:37:47] I see just about all of the docker setup time is from rsync
[03:37:55] and I also see that docker has a btrfs storage backend module
[03:38:47] that's castor, it basically rsyncs the ~/.cache/ directory from a central host so most dependencies aren't redownloaded again from whichever package manager
[03:39:34] well, maybe castor can use btrfs
[03:39:47] btrfs send can replace a lot of uses of rsync
[03:40:44] we could potentially use it as a backend for scap
[03:44:21] castor is our own thing?
[03:50:15] I'm on integration-castor03 looking at its config
[03:50:44] when docker starts, is the cache directory empty?
[03:52:22] yes, it's a custom thing that hashar invented
[03:52:53] castor populates $WORKSPACE/cache which is then mounted into various docker images that run
[03:54:56] but that slow rsync copies everything, it is not incremental?
[03:57:00] at the end of gate-and-submit jobs, it rsyncs the cache/ back to castor to populate new stuff, but all other jobs just pull from castor
[03:57:05] so yeah, it copies everything
[04:02:57] the size is 72GB
[04:08:25] that's the whole of /srv/cache, I see for this job it only copies mediawiki-core/master/wmf-quibble-core-vendor-mysql-php72-docker
[04:09:14] which is 368MB, not so ridiculous
[04:14:10] yeah, it splits it by cache name
[04:14:23] er, job name
[04:15:12] I did an rsync on integration-slave-docker-1040 of what I think is approximately what that job did, and it only took 12 seconds
[04:15:45] time rsync --stats -a rsync://integration-castor03.integration.eqiad.wmflabs:/caches/mediawiki-core/master/wmf-quibble-core-vendor-mysql-php72-docker timtest
[04:17:04] how many other rsyncs were running at that time? when a MW core patch runs, there are 3-4 jobs that will immediately try and rsync from castor to their respective instances
[04:17:16] fun
[04:17:38] see https://phabricator.wikimedia.org/T188375
[06:03:49] commented there
[06:17:49] thanks
[06:17:57] removing --compress is a pretty easy change I think
[06:20:33] another thing that I think would help by a factor of two or so is to occasionally wipe and regenerate the cache
[06:20:51] the _cacache directories in particular seem to grow forever
[06:20:59] 140M mediawiki-quibble-vendor-mysql-php72-docker/npm/_cacache
[06:20:59] 31M mediawiki-quibble-vendor-mysql-php73-docker/npm/_cacache
[06:21:30] if 7.3 needs 31MB and 7.2 needs 140MB, that means there is at least 110MB of stale files, right?
[06:22:42] in mediawiki-core/master, the various _cacache directories consume 1.9GB of 4.3GB
[06:23:01] hmm
[06:23:17] probably because npm packages get upgraded much faster/often than composer, etc.
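(Editor's sketch, not part of the log.) The stale-cache estimate above is simple arithmetic, spelled out here in Python. The function names are made up for illustration, and the reasoning rests on the same assumption voiced in the channel: that the smaller php73 `_cacache` holds roughly what a job actually needs, so anything beyond that in the php72 cache is stale. The figures are the ones quoted in the conversation, not fresh measurements.

```python
def stale_lower_bound_mb(observed_mb: float, needed_mb: float) -> float:
    """Lower bound on stale data: whatever exceeds the assumed working set."""
    return max(observed_mb - needed_mb, 0)


def cache_share(part_gb: float, total_gb: float) -> float:
    """Fraction of the total cache taken up by one kind of directory."""
    return part_gb / total_gb


if __name__ == "__main__":
    # Sizes quoted in the log: php72 _cacache is 140M, php73 _cacache is 31M.
    stale = stale_lower_bound_mb(140, 31)
    print(f"stale npm cache in the php72 job: at least {stale} MB")

    # _cacache directories use 1.9GB of the 4.3GB under mediawiki-core/master.
    share = cache_share(1.9, 4.3)
    print(f"_cacache share of mediawiki-core/master: {share:.0%}")
```

The difference works out to 109 MB, which matches the "at least 110MB" rounded in the channel. It is only a lower bound under the stated assumption, since the php73 cache may itself already contain stale entries.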
[06:33:04] I just deployed the --compress change to all quibble jobs
[06:34:32] maybe the npm local cache can be replaced by a remote cache
[06:35:36] googling to figure out what the state of the art there is
[06:36:33] maybe https://github.com/local-npm/local-npm ?
[06:36:38] 23:35:16 Syncing...
[06:36:38] 23:35:17 rsync: failed to set times on "/cache/.": Operation not permitted (1)
[06:36:38] 23:36:21 rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1668) [generator=3.1.2]
[06:36:56] that is quite slow
[06:36:59] (for the wmf-quibble-core-vendor-mysql-php72-docker job I just triggered)
[06:37:26] there were probably 11 concurrent rsyncs at that time
[06:38:07] yeah, difficult to get reproducible benchmark figures under those conditions
[06:38:31] at least it is not slower than before
[06:39:00] it was 86 seconds, now 64
[06:41:36] local-npm looks pretty cool, have you seen it before?
[06:42:34] nope, I'm reading through it now
[06:43:52] https://github.com/local-npm/local-npm/issues/181 :|
[06:44:50] that's what you get for using open source
[06:48:37] trying to set it up locally now...
[06:52:59] the idea would be to have a centralized npm caching proxy, and start jobs with an empty local cache, so packages would be downloaded individually on demand instead of speculatively
[06:59:44] I think it's working
[07:00:23] yep, it is
[07:00:35] sounds good, dinner time here though
[07:00:45] midnight here :)
[10:12:40] <_joe_> sigh the techcom meeting was today and not tomorrow?
[10:13:05] <_joe_> I completely missed the change in date sorry :(
[10:16:36] <_joe_> duesen_: why has the meeting been moved to wednesday morning?
[10:16:46] I realised at 35 past the hour
[10:17:20] <_joe_> this makes me feel slightly less guilty
[10:17:21] <_joe_> :P
[10:17:43] Daniel, Dan and Kate were the only ones there on time
[10:17:49] <_joe_> I have to remember the evening before :/
[10:18:04] <_joe_> yeah and I had to do the board grooming, which I usually do in the afternoon before the meeting
[10:18:11] <_joe_> so that I get all the relevant updates
[10:18:21] <_joe_> anyways, sorry :/
[10:19:41] _joe_: the reason is that we want to have the meeting before the irc discussion, which is on wednesday afternoon
[10:19:50] (not today, because we didn't have anything scheduled)
[10:19:52] <_joe_> heh yeah it makes sense
[10:20:03] we decided that like a month ago or so
[10:20:15] <_joe_> but last time I remember being at the meeting on thursday morning, and yeah it checks out
[10:20:19] ...when we established the europe-friendly discussion slot
[10:20:21] <_joe_> I was out for 3 weeks basically
[10:20:37] <_joe_> and I was back last week
[10:20:43] <_joe_> so 1 month ago checks out :/
[10:20:58] <_joe_> duesen_: sorry again
[10:21:23] _joe_: no big deal. beware that we may change the time again - this slot hasn't really worked out
[10:21:34] we'll talk about that next week
[10:21:44] <_joe_> ok, but then I'll be around and remember :D
[10:21:58] <_joe_> I admit I didn't go back and read the old meeting minutes
[10:22:17] just check your calendar ;)
[10:22:45] but i'll admit, the 7am slot is usually scrolled out of view for me
[11:42:24] <_joe_> duesen_: yeah that's what tricked me, all my other meetings are after 5 pm :P
[13:30:31] _joe_: yeah, I hope we can find a single slot/day to have our meeting. I don't really mind anymore what time, but it's hard to keep a sane schedule when one week I'm supposed to be working at 22:00-23:00 (never mind getting to sleep), and another week be up working at 06:00-07:00. Those two aren't compatible.
[14:28:38] * Krinkle looks with concerned eyes at https://grafana.wikimedia.org/d/000000402/resourceloader-alerts?panelId=16&fullscreen&orgId=1&from=now-6h&to=now
[15:20:19] clarakosi: this looks promising https://github.com/visionmedia/superagent/blob/master/test/node/agency.js
[15:20:51] (start from line 76)
[15:22:12] duesen_: That might've just done it for me :D
[15:25:11] :D
[15:58:56] duesen_: Hi :), is there anything you think should be done in this patch: https://gerrit.wikimedia.org/r/c/mediawiki/extensions/CategoryTree/+/526234 apart from the TODO as written in the commit msg?
[18:19:04] hknust: On reflection, it's simpler to just bump the generic tox-docker job to 20 minutes.
[18:19:46] James_F: Wasn't sure that was an option
[18:20:19] hknust: Yeah, too much complexity in CI. :-) Give me 10.
[18:20:41] James_F: great. thx
[18:39:05] woohoo. that looks good. thx again
[18:44:34] Happy to help.