[15:22:10] DELETE ALL THE THINGS!
[15:25:40] hell yes, githubspam
[15:39:10] :D
[15:40:38] harej, looks like http://wikiconferenceusa.org/ is down
[15:40:44] yup
[15:40:50] Do you know who should be pinged?
[15:41:51] i pinged the appropriate person
[15:41:57] Cool.
[15:42:02] Can you remind me of the venue location?
[15:42:05] wikimedia dc has been having issues with ukrainian DoSers
[15:42:17] lame DoSers
[15:42:28] nothing compared to mighty russian dos
[15:42:30] Probably didn't get their proposal accepted.
[15:42:39] the conference is at the national archives, 700 pennsylvania avenue nw
[15:42:41] archives metro station
[15:42:52] Thanks!
[15:42:54] uhhh between 7th and 9th streets NW on constitution avenue
[15:43:10] * halfak is giving in and getting an AirBNB
[15:43:58] air bnb is very good
[15:52:23] {{done}}
[15:52:44] I think I'm all set for DC then. harej, how's biking in the city?
[15:52:50] downtown is completely flat
[15:52:58] DC is a bike friendly city, at least by American standards
[15:53:17] major streets tend to have bike lanes and we have a bikesharing system
[15:53:29] Cool. Looking forward to it then. :)
[18:54:55] o/ yuvipanda
[18:55:08] What do we need to do in order to set up centralized logging for ORES?
[18:55:27] See https://phabricator.wikimedia.org/T108421
[19:18:00] halfak I responded
[19:18:10] just making sure that ores can take config params for it sounds good enough
[19:18:27] No problemo
[19:18:31] Will do that asap.
[19:18:40] Do we have a store for the logs or do you want to do log files for now?
[19:20:05] halfak let's do stdout now
[19:20:11] Boo
[19:20:14] that gets collected by systems
[19:20:17] systemd
[19:20:19] kk
[19:20:21] into journalctl
[19:20:28] and it handles rotation for us
[19:20:35] if we do it to files manually
[19:20:40] we have to handle rotation
[19:20:47] so lets avoid that
[19:21:00] logging has some stuff for that
[19:21:00] for prod we might want graylog / gelf
[19:21:02] But yeah.
[19:21:08] I thought you wanted logstash or something like that
[19:21:13] yeah
[19:21:21] logstash is just the forwarder
[19:21:31] It seems like I should be looking at statsd too.
[19:21:32] it can take a lot of protocols
[19:21:37] gelf / graylog seem popular
[19:21:39] yes
[19:21:46] I would really like to be able to talk about our cache hit/miss rate and the number of requests served.
[19:21:47] I'll say statsd is more important
[19:21:54] OK. I'll pick that up first.
[19:21:57] ok
[19:22:00] thank you
[19:22:04] What do I need to know about statsd in order to use it?
[19:22:15] Is there a URL or something that logs get sent to?
[19:22:16] metric types basically
[19:22:20] and when to use what
[19:22:22] oh yes
[19:22:24] send it to
[19:22:33] labmon1001.eqiad.wmnet
[19:22:39] on standard statsd port
[19:22:45] and it will show up
[19:22:52] on graphite.wmflabs.org
[19:22:58] Cool.
[19:23:01] Prefix "ores"
[19:23:04] or something like that?
[19:23:21] This thing looks reasonable: https://pypi.python.org/pypi/statsd
[19:23:21] ?
[19:23:26] yup
[19:23:33] that's the one on Debian as well
[19:23:37] so lets use that
[19:35:09] OK.
[21:49:44] yuvipanda, check it out: https://www.mediawiki.org/wiki/User:Halfak_%28WMF%29/mediawiki-utilities/Index
[21:50:31] niiiice halfak
[21:50:34] * yuvipanda jelly
[21:52:13] Lots of stuff to flesh out still, but you can get the gist from glancing at that page.
[21:52:21] Jelly? Come join me in the revolution.
[21:52:22] yup definitely
[21:52:33] But seriously, I work on each of these utilities as I need 'em.
[21:52:34] I seem to have too many things on my plate unfortunately
[21:52:48] So, mwapi got a bunch of work for ORES.
[21:53:07] mwxml, mwreverts, mwdiffs and mwpersistence got a lot of work for a content quality analysis I am working on.
[21:53:14] nice
[21:53:35] Oh same deal with mwsessions. It got some attention due to a recent analysis.
[21:53:39] :)
[21:59:34] * halfak gets back to statsd
[22:11:14] yuvipanda, I want to record response timings, but I'd like to bucket them by the number of rev_ids requested.
[22:11:24] Does this make sense with statsd
[22:11:25] ?
[22:13:42] halfak: yeah I think that's ok
[22:13:48] you can just put that in the key
[22:13:52] number of rev_ids
[22:13:57] Gotcha.
[22:14:01] I suspect they'll bucket neatly along 1, 10, 50 maybe
[22:14:21] Yeah. We currently support requests of more than 50 revisions at a time, but I think we should stop that.
[22:14:24] 50 revs tops
[22:14:25] too bad
[22:15:23] indeed
[22:15:31] halfak: if they need way more than that they can install the models and stuff locally too
[22:15:37] yeah
[22:15:39] exactly.
[22:16:05] Or do their own sequencing of the request.
[22:16:13] halfak: so I guess the 'max_revids' can be a config param?
[22:16:19] that we can raise and lower as needed?
[22:16:21] Yeah. I think that makes sense.
[22:17:06] https://github.com/wiki-ai/ores/issues/95
[22:19:37] What would you call statsd/graphite? Usage logger? Performance logger?
[22:19:41] yuvipanda, ^
[22:24:29] halfak: metrics collector / reporter is the usual term
[22:34:46] Thanks
[23:00:56] yuvipanda, can you have both an increment and a timing at the same key?
[23:01:08] halfak: nope
[23:01:16] one key can be of only one type
[23:01:30] OK. So, maybe keys should be suffixed with ".count" or ".timing"?
[23:01:38] they are by default
[23:01:42] Oh...
[23:01:43] let me give you an example
[23:01:44] Hmm..
[23:02:15] according to the python statsd docs, client.incr("bar") will increment "bar"
[23:02:25] what if I client.timing("bar", 500)?
[23:02:46] I don't know
[23:02:50] I think statsd might get confused?
[23:02:55] you can try it and see what happens :D
[23:02:59] http://graphite.wmflabs.org/
[23:03:00] so for counters
[23:03:15] err or gauges?
you get 99th percentile, sum, upper, lower, etc
[23:03:16] Is there harm in just sending some crap to graphite for test purposes?
[23:03:19] nope
[23:03:22] halfak: just mark them as such
[23:03:24] something like
[23:03:27] ores.testing.
[23:03:32] Gotcha.
[23:03:33] and we can remove them later
[23:03:40] halfak: so we usually prefix metrics with
[23:03:46] $projectname.$hostname
[23:03:59] so that there are no conflicts from metrics coming in from multiple machines
[23:04:00] so: ores.wmflabs.testing.bar
[23:04:03] yeah
[23:04:05] err
[23:04:07] no
[23:04:10] ores.ores-web-01.testing.bar
[23:04:16] Oh... hmm..
[23:04:21] That's going to be weird
[23:04:36] I need to sneakily access hostname in a function
[23:04:36] so the usual way that's done is you pass a 'prefix' to the statsd constructor
[23:04:44] and you get the prefix from somewhere, maybe a commandline param
[23:04:49] or just config
[23:05:07] and that's set appropriately by something else, usually puppet. although since we don't do puppet for our config I'm not sure
[23:05:11] Yeah. but we deploy the same config on every machine
[23:05:27] but you can just access it in wherever you're instantiating the statsd object I guess
[23:05:27] I could have a magic word in the config
[23:05:34] yeah not sure
[23:05:42] seems more complicated than just calling socket.gethostname
[23:05:46] in one place
[23:06:26] Seems like config is the right place since others might have different norms
[23:06:48] But I don't want to give this special treatment
[23:06:56] hmmm
[23:07:14] you'll need the 'ores' part to be configurable as well
[23:07:17] since in prod
[23:07:19] it's servers.hostname
[23:07:19] Yeah.
[23:07:22] That's no problem
[23:07:34] I just don't like having a magic word in the config.
[23:07:39] indeed
[23:08:26] simplest way to do it is maybe just say that {{hostname}} will be replaced with the hostname and just do that with string.format(hostname=socket.gethostname())
[23:08:35] Yeah
[23:08:37] and if others don't want hostname they don't specify {{hostname}} in the prefix fonfig
[23:08:39] *config
[23:08:40] That's what I was thinking too.
[23:08:51] and if they want other magic words
[23:08:53] it's easy to add
[23:08:57] Fair.
[23:08:58] and this doesn't make all fields take magic words
[23:09:01] and only the one field
[23:09:03] which is good
[23:09:03] OK. Ima do that
[23:09:06] ok
[23:09:08] thanks
[23:09:16] Thanks for helping me work this out :)
[23:13:30] So, one of the keys I am thinking about making is "ores.ores-web-01.scores_requested.enwiki.reverted.50"
[23:13:33] Any red flags?
[23:27:20] Looks like timing() contains incr()
[23:27:28] Which is cool because only need to call timing() then.
[23:27:36] yuvipanda, ^
[23:27:41] See also my comment about the key
[23:35:47] Gotta run. Will queue up yuviquestions for the morning
[23:35:48] o/
[23:36:32] halfak: sorry been pulled into like 3 different conversations but if it's scores_requested.$wiki.$model.$count that's probably ok
[23:36:46] Cool. Thanks
[23:36:56] Next thing to think about is how to flag precached requests.
[23:37:21] Really, I've got the flagging them figured out. I guess it's the recording that I'm not so sure about.
[23:39:00] halfak: you probably also want a .any or .all so we can easily get a 'total count of all requests' or something.
[23:39:11] Gotcha.
[23:39:14] halfak: http://graphite.readthedocs.org/en/latest/functions.html is all the functions available for aggregations on the data
[23:39:25] Thanks
[23:39:34] * halfak actually runs away now
[23:39:39] reading material for the AM
[23:39:40] o/
[23:39:44] How do you take the elements of a struct and use them for a map?
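
Editor's note: the prefix and key scheme agreed on in the log (a configurable prefix with a hostname placeholder, then scores_requested.$wiki.$model.$count) could be sketched roughly as below. This is only an illustration, not the ORES implementation. The function names expand_prefix and scores_requested_key are hypothetical, and the sketch assumes a single-brace {hostname} placeholder, because Python's str.format treats doubled braces as an escape and would leave {{hostname}} unexpanded. Replacing dots in the hostname is also an assumption, made so that a fully qualified name stays one level in the Graphite path.

```python
import socket


def expand_prefix(template, hostname=None):
    """Fill the {hostname} placeholder in a metrics prefix template.

    Hypothetical helper: a template without the placeholder is returned
    unchanged, so deployments that do not want a hostname simply omit it.
    """
    if hostname is None:
        hostname = socket.gethostname()
    # Assumption: dots separate levels in a Graphite path, so flatten them
    # to keep the hostname as a single path component.
    return template.format(hostname=hostname.replace(".", "_"))


def scores_requested_key(wiki, model, rev_id_count):
    """Build the scores_requested.$wiki.$model.$count key from the log."""
    return "scores_requested.{0}.{1}.{2}".format(wiki, model, rev_id_count)


# Example values taken from the conversation above.
prefix = expand_prefix("ores.{hostname}", hostname="ores-web-01")
key = scores_requested_key("enwiki", "reverted", 50)
print(prefix + "." + key)

# With the PyPI "statsd" package mentioned in the log, the expanded prefix
# would be passed to the client constructor, along these lines (not run here):
#   client = statsd.StatsClient("labmon1001.eqiad.wmnet", 8125, prefix=prefix)
#   client.timing(key, 500)
```

Keeping the placeholder handling in one small function means only the prefix field accepts a magic word, which matches the preference stated in the log.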