[00:02:41] lol
[05:44:37] halfak: legoktm hey
[05:44:54] I don't think we need to switch to debs before it
[05:45:04] the security review should be for all our own python code
[05:45:07] and the extension
[05:45:18] and the dependencis
[05:45:20] *ies
[05:45:31] doesn't matter if it's debs or from pip
[05:45:53] halfak: legoktm so I'm pretty sure that we want him to review the python code
[05:45:56] for security issues
[05:46:03] I'm also kindof sure it'll be straightforward
[05:46:07] we're a readonly gig
[13:06:01] halfak: hey
[14:04:41] o/ yuvipanda
[14:04:48] hello
[14:04:56] (am in offsite, so partial attention)
[14:05:01] is the security review today?
[14:05:05] No worries. Figured. Yes.
[14:05:13] Well. they want an overview today
[14:05:16] ah. yeah, I left you messages here yesterday
[14:05:17] Not sure about the actual review.
[14:05:21] ah right
[14:05:23] yeah
[14:05:32] did you see my messages?
[14:05:37] Yeah. Just now.
[14:05:39] Makes sense.
[14:06:23] one thing I'd still want us to implement is 'backpressure' - have the web server fast-fail requests if celery queue is too big. not sure how exactly to do that though
[14:06:39] (not security related, but definitely will make life easier)
[14:07:15] yuvipanda, I've been thinking about that. It seems like we should have a max size on our celery queue.
[14:08:03] yeah
[14:08:06] definitely
[14:12:16] Ok. I'll see if I can get a plan together before I talk to those guys.
[14:13:05] halfak: yeah. but worst case that'll just bring us down, not the wiki
[14:13:45] Indeed. Graceful failure or something like that.
[14:15:21] yeah
[14:15:56] halfak: unrelated but we'll also have to decide on paging. I guess opsen will get paged and so will you
[14:17:05] Makes sense.
[14:17:17] I'm a fan of making "product owners" in charge of their services.
[14:17:30] We suck at resources right now, but that problem should be solved sooner than later.
[14:17:55] +1
[14:18:04] halfak: I guess it'll be you and the new FTE
[14:18:49] At this rate, I'm feeling pretty good about pager duty. ORES hasn't been down or had any issues in a long time :)
[14:19:03] Thanks to YuviAdvice
[14:19:23] "Build it like this and you will sleep"
[14:20:19] halfak: yup!
[14:20:23] yay!
[18:05:35] hello halfak
[18:05:43] o/ ToAruShiroiNeko_
[18:07:12] can you take a look at the tr labelling campaign?
[18:07:18] It should be done for real this time
[18:07:27] I do have one thing I cannot label as its a deleted revision
[18:07:41] oh I forgot to file that
[18:24:32] * halfak has a look.
[21:31:35] halfak: awight deployment update: I talked to Alex and we think we know where ores will live in cluster (the SCB subcluster), and also figured out a varnish / endpoint plan
[21:31:46] bblack wants it to be en.wikipedia.org/
[21:31:52] and we've to infer the wiki name from domain
[21:31:56] but that is fairly easily doable
[21:32:05] so shouldn't be a problem
[21:32:33] rad
[21:33:37] awight: halfak
[21:38:14] err
[21:38:23] legoktm: should I just merge awight's mw patch?
[21:38:37] yuvipanda: I think he already did yesterday? Which one?
[21:38:39] in my same vein of 'I shall +2 whatever you ask me to because I trust you guys and this is not in production'
[21:38:41] ooooh
[21:38:43] ok :D
[21:38:45] yay
[21:39:23] it's not quite ready to go--there are two missing pieces, * maintenance script to purge the cache, and * hourly job to fetch the latest model version.
[21:40:11] yuvipanda: I merged "Initial commit" and -1'd another one
[21:40:19] ah
[21:40:21] nice
[21:40:23] ok
[21:40:26] I'll leave the mediawiki to the mediawiki experts
[21:40:34] I'm talking to alex about how to manage deps and what not today
[21:40:43] he'll probably help with the final deployment
[22:11:04] o/ Sorry meetings and then dog needs.
[22:11:08] Reading scrollback
[22:12:36] All great news.
[22:13:44] So.
The path is going to be annoying, but nothing we can't handle with some config voodoo. Hard part is thinking about the right place to put the voodoo so we don't hate ourselves later.
[22:14:37] I might talk about revscoring at the next metrics meeting.
[22:14:45] we could possibly do it in varnish level
[22:14:51] We've got about ~10 uses in the wild.
[22:14:53] but no we can't
[22:15:01] since we have to translate wiki domain to dbname
[22:16:21] I've thought of about 10 ways to do this that I don't like so far.
[22:16:31] heh
[22:16:39] well
[22:16:46] I suppose the first things to hit the prod ores
[22:16:49] would be mediawiki
[22:16:51] hmm
[22:16:57] yeah, we need to solve that there too anyway
[22:17:02] I don't have a clear answer
[22:18:10] I think that the best way is to have something re-write URL on the web node before it gets to Flask.
[22:18:21] Either that or we drop the context from the URL pattern.
[22:21:44] halfak: yeah
[22:21:49] halfak: am doing resource estimation with alex now
[22:21:52] I don't want to maintain a mapping of domain and url prefixes in the web server config.
[22:22:06] * halfak thinks about some technical detail too much.
[22:22:22] We'll work it out some way.
[22:22:24] yeah
[22:22:35] so we have 16 'virtual' cores for workers now
[22:22:39] and they're at about 30% util on avg
[22:22:43] and I guess precached is running
[22:22:45] Yes.
[22:22:49] Yes it is.
[22:22:59] so maybe 4-8 physical cores (which are a *lot* faster) might be enough
[22:23:14] ores..precache_request
[22:23:37] ^ web host
[22:24:08] http://graphite.wmflabs.org/render/?width=586&height=308&_salt=1444861438.96&target=ores.ores-web-01.precache_request.count
[22:24:09] nice
[22:24:27] oooo
[22:24:49] Weird. What happened at 6PM?
[22:25:13] I think we got a bunch of long running requests
[22:25:19] see as they completed there was a big spike
[22:25:24] weird
[22:29:46] YuviPanda, I went looking for a way to limit celery queue size.
I couldn't find an obvious config parameter.
[22:30:09] halfak: hmm, that sucks. we should both file a bug in our phab and maybe a bug in celery?
[22:30:10] Do you have a limited size queue for Quarry?
[22:30:22] I don't but quarry has a much smaller queue size in general
[22:30:23] Just want to be sure I'm not missing anything.
[22:30:26] it hardly goes over 10
[22:30:31] Gotcha.
[22:30:42] Might go way over 10 during the workshop.
[22:30:46] How's 50ish?
[22:31:17] oh yeah it can handle those no problem
[22:31:21] just saying on average it's not very high
[22:31:59] Gotcha.
[22:33:32] YuviPanda, could we do it in nginx or uwsgi?
[22:33:42] the limiting? unfortunately no
[22:33:51] Hmm... Now that I think about it, it would be better in celery anyway.
[22:33:53] we can just check the length of the redis key
[22:34:04] Yeah... that'd be kinda hacky though, right?
[22:34:15] well we should do it in the web layer since it's celery that'll be overloaded and we want more things to not reach it
[22:34:17] halfak: yeah
[22:34:23] halfak: I guess we can 1. do it like that and 2. file a bug
[22:34:25] in celery
[22:35:27] How is it that this hasn't come up before?
[22:35:38] How is there a queue that you can't set a max size on?
[22:36:02] good question. I don't know
[22:36:24] halfak: I'd suggest filing a phab task, and I'll ask around in the ops team too
[22:36:35] halfak: btw, we're planning on beginning prod deploy end of month ;)
[22:36:37] :D
[22:36:38] Yeah. I've got the phab task in.
[22:36:45] Cool!
[22:36:54] Just gotta get backpressure and central logging in.
[22:37:02] logging config, I guess
[22:37:05] You guys need anything from me re. debs?
[22:37:13] Yeah. Logging config.
[22:37:13] no, we're good mostly.
[22:37:16] I just haven't had the time
[22:37:18] next week
[22:37:21] I gotta go soon
[22:37:25] lobby and food and drink
[22:37:31] Cool!
Have fun :)
[22:37:39] legoktm: can you open / is there a bug for ores deploy with the extension deployment checklist?
[22:37:42] Thanks for the updates!
[22:37:54] yw!
[22:38:47] YuviPanda: no bug afaik, I can do that in a bit
[22:38:55] kkk thanks :D
[22:39:01] halfak: 'tis all coming together, etc, I guess :)
[22:39:19] Indeed :)
[22:39:32] halfak: crazy idea I've been forming today is https://etherpad.wikimedia.org/p/quarry-for-pwb-scripts
[22:39:55] oooo!
[22:47:58] * halfak adds notes
[22:48:06] YuviPanda, I think that's a great idea.
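
Note on the backpressure idea from the log ("check the length of the redis key" in the web layer, fast-fail when the celery queue is too big): a minimal sketch of how that could look, not what ORES actually shipped. Assumptions: the broker is Redis and pending tasks sit in a list under the key "celery" (Celery's default queue name); `BackpressureMiddleware`, `queue_is_full`, and `max_size=100` are made-up names and numbers for illustration. `client` is anything with an `llen(key)` method, such as a redis-py client.

```python
import json


def queue_is_full(client, queue_key="celery", max_size=100):
    """Return True when the broker list of pending tasks has reached max_size.

    `client` only needs an llen(key) method (redis-py clients have one).
    The key name and the limit are assumptions, not values from the log.
    """
    return client.llen(queue_key) >= max_size


class BackpressureMiddleware:
    """WSGI wrapper that answers 503 instead of letting more work reach celery."""

    def __init__(self, app, client, queue_key="celery", max_size=100):
        self.app = app
        self.client = client
        self.queue_key = queue_key
        self.max_size = max_size

    def __call__(self, environ, start_response):
        if queue_is_full(self.client, self.queue_key, self.max_size):
            # Fast-fail: do not call the wrapped app, so nothing gets enqueued.
            body = json.dumps({"error": "server overloaded, retry later"}).encode("utf-8")
            start_response("503 Service Unavailable", [
                ("Content-Type", "application/json"),
                ("Content-Length", str(len(body))),
                ("Retry-After", "5"),
            ])
            return [body]
        # Queue has room: hand the request to the real application.
        return self.app(environ, start_response)


# Hypothetical wiring for a Flask app object named `app`:
#   import redis
#   app.wsgi_app = BackpressureMiddleware(app.wsgi_app, redis.StrictRedis(), max_size=100)
```

The check and the later enqueue are separate steps, so the limit is approximate under concurrency: the queue can overshoot `max_size` by roughly the number of in-flight requests. That matches the "kinda hacky" caveat in the log, and a real max-size setting in the queue itself would still be the cleaner fix.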