[00:01:08] mforns, are you looking at disk space on eventlog1001? it's almost full.
[00:01:24] jgage, wow, thanks for the heads up
[00:01:34] icinga paged me :)
[00:01:55] jgage, solved
[00:01:59] thanks amigo
[00:02:02] sorry for that
[00:02:24] no problem, glad we caught it
[00:18:44] madhuvishy: so adding tests might be the way to go to see where things go wrong
[00:20:06] nuria: yeah. i doubt i'll be able to today, but Dan and I will probably pair on it when he's here thursday. (I'm away until wednesday). But i'm deep into this bug now - hope to figure it out in a bit
[00:20:21] madhuvishy: sounds good
[00:31:30] nuria: https://phabricator.wikimedia.org/diffusion/ANWM/browse/master/wikimetrics/models/validate_cohort.py;69b26d41d7b06899e83cda76c75a71f650b37d84$89 here i see that usernames are being inserted. but in the wiki_user table there are both user_names, and associated ids. do you know where/how this association happens?
[00:32:38] madhuvishy: user_ids are pulled from the wiki project in question "enwiki, dewiki", in vagrant "wiki"
[00:32:50] nuria: where in the code?
[00:33:28] madhuvishy: ah, let me look
[00:39:47] madhuvishy: seems to me that when records are 1st inserted they go in only with "name"
[00:40:20] nuria: which kinda seems "wrong" when the user supplies ids
[00:40:27] after the validation process enters the id for the valid ones (and probably non valid too for which it can find ids)
[00:40:50] madhuvishy: ya, a "little" wrong if that is the only code path
[00:41:11] nuria: something is going wrong there
[00:41:31] if the id is not valid
[00:41:58] the behavior is incorrect
[00:43:55] madhuvishy: ya, i see several things do not look right
[00:44:32] the "validate_as_user_ids" field on form
[00:44:53] is passed along and changes all behaviour: ./wikimetrics/models/validate_cohort.py:
[00:45:01] nuria: right
[00:45:08] nuria: and that's intact
[00:45:27] nuria: it's in the database, and when you validate again, it looks right
[00:45:52] https://www.irccloud.com/pastebin/Fjywp3wZ
[00:46:00] this is where it's awry
[00:46:05] 7 and 8 are user ids
[00:46:51] but validation doesn't put them in mediawiki_userids - which makes sense - they don't really exist
[00:46:59] but this is wrong too
[00:49:19] madhuvishy: ya, they are being treated like user_names
[00:50:04] nuria: this is part of the problem - and then somehow it marks all as invalid user ids - maybe it's checking the usernames?
[00:50:08] not sure
[00:50:50] madhuvishy: i think so, i think user ids are being validated as user names -looks like- and thus they are invalid, sometimes user_names and user_ids are the same so in some instances it might work
[00:51:25] nuria: Validate again works totally fine if you upload a list of user names.
[00:51:40] nuria: but yes, for user ids, this is plausible
[00:52:12] madhuvishy: well, there you go, you might need to refactor the id workflow, but it might be wrong only when it comes from ui, you can write tests and see
[00:52:22] if that makes sense
[00:52:26] Ironholds: yt?
[00:52:38] nuria: yeah.
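[note] The suspicion voiced between 00:49 and 00:52 (user ids being validated as user names) can be illustrated with a small sketch. This is not wikimetrics code: `WIKI_USERS`, `validate`, and the sample data are invented for illustration, and only the `validate_as_user_ids` flag name comes from the discussion. It shows why ids checked as names would mostly come back invalid, yet occasionally pass when a user name happens to equal an id.

```python
# Hypothetical stand-in for a wiki's user table: user_id -> user_name.
# Invented data; user 3's name happens to be the string "3".
WIKI_USERS = {1: "Alice", 2: "Bob", 3: "3"}


def validate(entries, validate_as_user_ids):
    """Split raw cohort entries into (valid, invalid) lists.

    Valid entries are returned as (user_id, user_name) pairs;
    invalid entries are returned as the raw strings.
    """
    names_to_ids = {name: uid for uid, name in WIKI_USERS.items()}
    valid, invalid = [], []
    for raw in entries:
        raw = raw.strip()
        if validate_as_user_ids:
            # Look the entry up by numeric id.
            uid = int(raw) if raw.isdigit() else None
            if uid in WIKI_USERS:
                valid.append((uid, WIKI_USERS[uid]))
            else:
                invalid.append(raw)
        else:
            # Look the entry up by name.
            if raw in names_to_ids:
                valid.append((names_to_ids[raw], raw))
            else:
                invalid.append(raw)
    return valid, invalid


uploaded_ids = ["1", "2", "3", "8"]

# Intended path: ids are checked as ids, so only the unknown id 8 fails.
print(validate(uploaded_ids, validate_as_user_ids=True))

# Suspected bug: the same ids are checked as names, so almost all fail.
print(validate(uploaded_ids, validate_as_user_ids=False))
```

A test along these lines, run once through the form-handling path and once through the model directly, would show whether the flag is lost only when the upload comes from the UI.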
[00:53:06] Ironholds: for the apps session job (which BTW now runs in 6 mins for a whole day)
[00:53:33] Ironholds: you calculated percentiles?
[00:53:49] or rather gave lowbound, highbound for quantiles?
[00:54:00] this one seems a lot harder to interpret for a pm
[00:55:08] Ironholds: the phab ticket says "quantiles"
[01:03:08] Ironholds: doing percentiles, let me know otherwise
[01:03:58] Ironholds: maybe we just got mixed up with "quaRtiles" and "quaNtiles"
[04:59:17] i never know what to do when i get these alerts from Icinga: PROBLEM alert - graphite1001/Difference between raw and validated EventLogging overall message rates is CRITICAL
[04:59:41] CRITICAL: 20.00% of data above the critical threshold [1800.0]
[05:04:46] jgage: I guess it means that there are way too many invalid events coming in...
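[note] One plausible reading of the alert text is that the check takes a window of samples, subtracts the validated rate from the raw rate, and reports the share of samples whose difference exceeds the threshold. The sketch below assumes exactly that; the function name and sample numbers are invented, and the real check definition is not shown in this log.

```python
def percent_over_threshold(raw_rates, validated_rates, threshold):
    """Return the percentage of samples whose raw-minus-validated
    difference is strictly above the threshold."""
    if len(raw_rates) != len(validated_rates):
        raise ValueError("series must have the same length")
    if not raw_rates:
        return 0.0
    diffs = [r - v for r, v in zip(raw_rates, validated_rates)]
    over = sum(1 for d in diffs if d > threshold)
    return 100.0 * over / len(diffs)


# Invented sample window: five samples, one with a large gap
# between raw and validated message rates.
raw = [3000.0, 3100.0, 5200.0, 2900.0, 3050.0]
validated = [2900.0, 3000.0, 3000.0, 2850.0, 3000.0]

pct = percent_over_threshold(raw, validated, threshold=1800.0)
print(f"{pct:.2f}% of data above the critical threshold [1800.0]")
```

Under this reading, a large gap means many raw events failed validation during part of the window, which matches the guess at 05:04:46.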