[05:36:17] Analytics-Cluster, operations, Interdatacenter-IPsec: Secure inter-datacenter web request log (Kafka) traffic - https://phabricator.wikimedia.org/T92602#1115779 (Gage) NEW [08:57:56] Analytics-Tech-community-metrics, ECT-March-2015: KPI pages in korma need horizontal margins - https://phabricator.wikimedia.org/T88670#1115975 (Qgil) Ok, this new layout with margins is more usable and looks a lot better. Thank you! There is still a chance that some tooltips are cut, and not even using... [13:27:54] Analytics-Tech-community-metrics, ECT-March-2015, Patch-For-Review: "Volume of open changesets" graph should show reviews pending every month - https://phabricator.wikimedia.org/T72278#1116526 (Qgil) p:Low>Normal [13:27:56] Analytics-Tech-community-metrics, ECT-March-2015, Patch-For-Review: "Age of unreviewed changesets by affiliation" shows negative number of changesets - https://phabricator.wikimedia.org/T72600#1116528 (Qgil) p:Low>Normal [13:32:25] Analytics-Tech-community-metrics, ECT-March-2015, Patch-For-Review: "Volume of open changesets" graph should show reviews pending every month - https://phabricator.wikimedia.org/T72278#1116543 (Qgil) a:Qgil>Dicortazar >>! In T72278#745488, @Acs wrote: > Quim, we plan to add the Verified -1 and -2... [13:40:41] Analytics-Tech-community-metrics, ECT-March-2015, Patch-For-Review: "Age of unreviewed changesets by affiliation" shows negative number of changesets - https://phabricator.wikimedia.org/T72600#1116558 (Qgil) a:Qgil>Dicortazar Ok, here the suspicion was that the values of previous months changed a... [13:41:12] Analytics-Tech-community-metrics, ECT-March-2015: Key performance indicator: Gerrit review queue - https://phabricator.wikimedia.org/T39463#1116562 (Qgil) p:Low>Normal [14:02:22] ottomata: standupppp [14:02:45] (CR) Milimetric: "Just some cleanup comments, looks good to merge if you fix the flake8 stuff. 
I confess I wasn't able to test it but I trust you that it w" (5 comments) [analytics/limn-mobile-data] - https://gerrit.wikimedia.org/r/192319 (https://phabricator.wikimedia.org/T89251) (owner: Mforns) [14:03:12] ohmygoodness standup [14:08:14] Analytics-EventLogging, Analytics-Kanban, operations: Eventlogging JS client should warn users when serialized event is more than "N" chars long and not sent the event [8 pts] - https://phabricator.wikimedia.org/T91918#1116583 (kevinator) Implementation notes from our tasking meeting: Client - adding m... [14:13:29] ottomata: I can't get that "we keep 1 PB of data in kafka buffers" thing from the article [14:15:56] they have a lot of kafka brokers [14:16:00] wait, linkedin? [14:16:08] the jay kreps lambda article? [14:42:07] qchris: can I ask you for some git/gerrit advice? [14:42:14] Sure. [14:42:18] it is very simple,and something you are good at. [14:42:20] so [14:42:25] i have a patch in for review [14:42:27] but i want to keep working on it [14:42:34] probaly will submit a new patch on top of it [14:42:53] but, i am not sure of the best way to still be able to edit the first patch if I get review comments [14:43:01] if i have a commit on top of it also in for review [14:43:29] how do you do it? [14:43:44] So you have a commit A1 that is parent of B1. [14:43:52] A1 is in gerrit and getitng reviewed, [14:44:05] while you privately hack on B1? [14:44:05] yes [14:44:16] for now, but i would like to put B1 up too [14:44:19] while A1 is still in review [14:45:03] Whenever you update A1 to A2, A3, A4, ... I'd also rebase B1 on the current A$N. [14:45:14] So you'd end up with A1 being parent of B2; [14:45:19] Grr. wrong. [14:45:20] Sorry. [14:45:23] So you'd end up with A1 being parent of B1; [14:45:31] A2 being parent of B2. [14:45:36] A3 being parent of B3. [14:45:38] And so on. [14:45:53] It's not super-nice, but I do not know of a simpler thing. [14:46:33] Do the rebasing/cherry-picking in your local checkout. 
Git is typically better at that than gerrit is. [14:46:33] locally how do you hack on A1? [14:46:36] check it out in its own branch, hack, and then submit [14:46:46] and then after that rebase B1? [14:47:38] I typically create a C1 (which is a child of B1). [14:47:45] Then I go "git rebase -i master" [14:48:05] and reorder A1 <- B1 <- C1 to A1 <- C1 <- B1 [14:48:20] (and squash C1 into A1). [14:48:41] That creates A2 (from A1 and C1), and B2 (just rebased B1). [14:49:01] This workflow assumes that A1 and B1 are not too intertwined. [14:49:21] Are A1 and B1 really intertwined for what you are trying to achieve!? [14:49:24] s/!// [14:49:26] not really [14:49:31] oh man sounds complicated though [14:49:38] reorder? [14:49:49] It only sounds complicated. [14:50:10] Reorder ... yes. Did you notice the "-i" switch in the rebase command? [14:50:15] That is for "interactive" rebase. [14:50:31] Git opens up an editor, with a list of commits that you are about to rebase. [14:50:38] ah no, ok, going to try it. [14:50:51] Swapping the lines of B1 and C1, then saving, swaps the order of the two commits. [14:50:59] so, in your example, C1 is what you want to turn into A2, yes? [14:51:09] A2 := A1 + C1. Yes. [14:51:11] ok [14:51:41] "git rebase -i master" will show you a file in your editor that looks like [14:52:00] pick $HASH_OF_A1 $COMMIT_MSG_OF_A1 [14:52:03] pick $HASH_OF_B1 $COMMIT_MSG_OF_B1 [14:52:09] pick $HASH_OF_C1 $COMMIT_MSG_OF_C1 [14:52:13] turn that into [14:52:21] pick $HASH_OF_A1 $COMMIT_MSG_OF_A1 [14:52:28] f $HASH_OF_C1 $COMMIT_MSG_OF_C1 [14:52:35] pick $HASH_OF_B1 $COMMIT_MSG_OF_B1 [14:52:38] and save. [14:53:06] in your case, are you using a working topic branch? [14:53:08] when you rebase? [14:53:10] (So order of B1 and C1 got swapped. and "pick" of B1 got turned into "f" which means "fix-up") [14:53:26] yes, but that is not so important. [14:53:30] so that rebase -i master chooses an earlier commit?
[14:53:35] i just did a test in a new git repo [14:53:39] with three commits [14:53:42] If you're not on a topic branch, you just end up with a detached HEAD: [14:53:45] rebase -i master [14:53:46] says [14:53:47] noop [14:53:51] noop [14:53:51] # Rebase 623654a..623654a onto 623654a [14:54:00] Is master at the parent of A1? [14:54:07] (It should be) [14:54:08] master is at heaed [14:54:10] head [14:54:12] Ah! [14:54:12] so, C2 [14:54:14] sorry [14:54:15] C1 [14:54:26] so that's why you'd need another branch [14:54:28] then try "git rebase -i HEAD~3" [14:54:30] k [14:55:10] (But typically I'd recommend to not work on "master" branch. Either work detached, or use a topic branch.) [14:55:20] (Topic branches are less error-prone) [14:55:36] aye [14:55:49] ok, that worked. [14:55:50] pffft. magnet and a steady hand. [14:55:55] how do I squash A1 and C2? [14:56:16] The change from "pick" to "f" in the rebase does the trick. [14:56:28] oh [14:56:33] "fix-up" is a "squash" without asking you about the commit message. [14:56:55] It's a short-cut that's really useful in those situations. [14:57:16] So one cannot pick the wrong Change-Id when squashing them. [14:57:19] hm Cannot 'fixup' without a previous commit [14:57:24] maybe i did something weird [14:57:27] trying again [14:57:28] from scratch [14:57:45] Yup. See scrollback. [14:57:49] A1 is "pick" [14:57:53] C1 is "f" [14:58:27] Let me boot the google machine... [14:59:10] yes i think my problem is in my trial i didn't have an A1, it was just the initial commit [14:59:15] trying again with an A1 [15:00:05] that worked qchris [15:00:21] ha, this is pretty annoying [15:00:25] hm, ok, at least I know how now [15:00:27] thank you! [15:05:20] yw :-) [15:17:03] hey nuria, EL q for you if you got a sec [15:17:45] or for ori too, but maybe he is not awake yet? [15:21:20] ottomata: what's up? [15:21:41] oh, it's kinda an EL architecture q, about the codebase.
[15:21:54] kinda a: do you think I should do this, or this, question [15:24:19] sorry, just caught up with your git/gerrit question above - I know how to do that reordering too, there's also another way that I do it [15:24:51] you can ask, I know the EL code fairly well [15:26:35] oh ok [15:26:36] ok [15:26:41] ok [15:27:24] so, both forwarder and processor have a similar problem, that is, even though they take input and output URIs [15:27:33] they are specific to zeromq [15:28:00] i'm wondering if they could take advantage of the handlers and just use get_reader and get_writer [15:28:07] and then potentially be able to work with any input or output [15:28:21] example [15:28:25] processor does [15:28:26] pub = pub_socket(args.output) [15:28:29] whereas, it could do [15:28:36] output_stream = get_reader(args.output) [15:28:41] sorry [15:28:44] get_writer(...) [15:29:36] hm, my opinion is it'd be better to keep these things single-purpose [15:29:39] i'm not sure about the unicode stuff though [15:29:50] so instead of that, just make a kafka_writer [15:29:57] whatever_writer [15:30:03] you mean an eventlogging-kafka-processor? [15:30:14] i already have kafka_reader and kafka_writer [15:30:17] that works fine [15:30:22] that's why the kafka URIs work [15:31:21] so you want to change the zeromq_writer to be more generic?
[15:31:27] no [15:31:34] i want the bin/ scripts to work with any reader or writer [15:31:44] oh, sorry, bin/scripts, duh [15:32:26] eventlogging has this fanciness to infer reader iterator streams and writer streams from URIs [15:32:30] it's really awesome [15:32:32] and works for kafka [15:32:42] but, the bin scripts are special cased and don't use the get_reader or get_writer methods (mostly) [15:32:50] -consumer does [15:32:52] and it works great with kafka [15:33:20] I think that idea makes sense, yea [15:33:53] so, i'm looking at processor now, to see how hard it would be to process events directly from kafka, rather than using 0mq [15:34:03] the reader is actually generic [15:34:07] oh HM [15:34:08] actually [15:34:10] lemme try that... [15:34:44] ottomata: sorry i missed your ping. did you get your EL question answered? [15:35:15] ah, no it isn't [15:35:18] sorry, that was my own code [15:35:21] ha, no [15:35:21] so [15:35:27] i'm looking at eventlogging-processor [15:35:40] and wondering if I can make it use generic get_reader and get_writer from input and output args [15:35:44] rather than doing the specific [15:35:51] sub_socket(args.input, identity=args.sid) [15:35:52] and [15:35:57] https://github.com/wikimedia/mediawiki-extensions-EventLogging/blob/master/server/bin/eventlogging-processor#L53 [15:35:58] pub = pub_socket(args.output) [15:36:06] yeah, exactly [15:36:10] for that [15:36:11] why not do [15:36:19] output_stream = get_writer(args.output) [15:36:20] get_writer(args.output) [15:37:11] ha, the --sid flag can be encoded in the url if you are using 0mq get_reader [15:37:12] where's get_writer defined again? [15:37:23] tcp://*:8600?socked_id=abc [15:37:25] in utils i think [15:37:48] no, factory.py [15:37:48] https://github.com/wikimedia/mediawiki-extensions-EventLogging/blob/master/server/eventlogging/factory.py#L66 [15:38:07] ottomata: the problem is that the processor assumes zeromq streams and you want to make it more generic, correct?
[15:38:22] yes [15:38:26] ottomata: did I understand that well? [15:38:29] same for forwarder too [15:38:31] sorta [15:38:47] the main issue is the iter_unicode stuff [15:38:55] i think those methods assume a file handle? [15:39:02] not just a generic iterator [15:40:15] file or zmq stream? [15:40:16] not sure what [15:40:21] if hasattr(stream, 'recv_unicode'): [15:40:21] return iter(stream.recv_unicode, None) [15:40:21] is for [15:40:27] i assume recv_unicode is a zmq method? [15:40:38] ottomata: lemme look [15:40:58] ja it is [15:41:04] ok, so yeah, the various writers have that decorator that registers them in factory.py, which then dispatches the appropriate writer depending on the URI scheme. I think get_writer should work fine [15:41:31] milimetric: it does work well, because consumer uses it [15:42:00] and it makes sense to make processor more generic, too [15:42:03] hmm, yeah, actually i think forwarder might be very easy to make generic [15:42:14] processor looks harder, because of the unicode stuff, [15:42:17] hm, i guess [15:42:31] those methods in stream.py already take conditional action based on the type of stream [15:42:32] I don't see where the consumer does it, or maybe it's not merged? [15:42:32] https://github.com/wikimedia/mediawiki-extensions-EventLogging/blob/master/server/bin/eventlogging-consumer [15:42:36] if hasattr(stream, 'recv_unicode' [15:42:37] etc. [15:42:42] i could add another conditional for kafka [15:42:51] consumer calls .drive() [15:43:18] https://github.com/wikimedia/mediawiki-extensions-EventLogging/blob/master/server/eventlogging/factory.py#L82 [15:44:15] seems that the current stream abstraction doesn't work with kafka [15:44:19] oooh [15:44:21] :) [15:44:42] ay ay [15:45:11] it could! [15:45:15] Adding conditionals there doesn't make the forwarder/consumer more generic though, right?
[15:45:27] it does, because then you could call iter_unicode(kafka_stream) [15:45:27] you just move one level below the "if" [15:45:39] i'd still have to make it use get_reader and get_writer [15:45:46] but i could pass those to iter_unicode [15:46:07] for raw_event in iter_unicode(get_reader(args.input)): [15:47:20] hm, yeah [15:47:23] actually, that is what I should do. [15:47:28] make the stream abstraction work with kafka [15:47:32] then everything else will fall into place [15:47:43] if this were OO i'd say [15:47:59] streams should be a factory that returns a stream handler based on input args [15:48:41] stream handler will be of type zeromq or kafka depending on args passed to forwarder (if those two are substantially different) [15:49:35] now... this all being python modules i imagine you can add the needed logic to manage kafka streams to streams (seems that it belongs there) [15:49:42] yeah, as is it's functional without pattern matching. But with get_writer and get_reader, your problem only starts here: https://github.com/wikimedia/mediawiki-extensions-EventLogging/blob/master/server/bin/eventlogging-processor#L66 [15:50:45] (and btw, apply_safe in factory is kinda like pattern matching) [15:51:10] Probably what we want to change is the fact that the forwarder/processor assumes there is only one possible stream [15:51:57] hmm, milimetric, yeah [15:52:01] that is something I don't understand [15:52:06] why all the special unicode handling? [15:52:50] so "make stream abstraction work with kafka" sounds good and also "make forwarder work with stream abstraction rather than call out methods directly" [15:54:09] I think it's just being careful to keep everything unicode because python 2 sucks at handling that kind of thing unless you're explicit [15:54:46] do you mean just calling pub.send_unicode or other stuff?
[15:55:02] other stuff too [15:55:10] the stream abstraction does iter_unicode pretty much everywhere [15:55:20] which means that it either calls zmq's recv_unicode [15:55:21] or [15:55:26] ottomata: because sys.setdefaultencoding('utf-8') [15:55:31] wraps a file object with a utf conversion [15:55:35] ottomata: which is not the default in python [15:56:05] milimetric: https://github.com/wikimedia/mediawiki-extensions-EventLogging/blob/master/server/eventlogging/streams.py#L77 [15:56:17] i wonder, if i change iter_file to be a more generic utf8 wrapper [15:56:22] to work with files, or iterators of any kind [15:56:41] like, instead of implicitly using io.open, just wrap another generator, hm. [15:56:42] Hmmm [15:57:53] nuria: to do that, would I just make sure each message does unicode(message, "utf-8") [15:58:01] or [15:58:02] decode('utf-8', 'ignore') [15:58:03] ? [15:58:07] (i'm looking at SO :) ) [15:59:20] ok, but note that setting sys.setdefaultencoding('utf-8') on forwarder is affecting all code that gets executed from that moment on in that process [15:59:28] it seems to me like this line means iter_unicode only handles files and things with recv_unicode (zeromq): https://github.com/wikimedia/mediawiki-extensions-EventLogging/blob/master/server/eventlogging/streams.py#L90 [15:59:37] yeah, milimetric, i think this might work: [15:59:49] https://gist.github.com/ottomata/4b62022150d52300a7fa [15:59:58] maybe [15:59:59] ? [16:00:23] instead of ignore, why not 'replace' like he uses? [16:00:54] but that sounds ok... [16:01:20] oh is that a thing? [16:01:33] aye yeah [16:05:17] ooo that works! cool, i can now use stream() on a generic iterator! [16:06:41] ottomata: yes, I was referring to the Jay Kreps article before. 1 Petabyte!! We should totally think that through when we start work this next quarter.
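The "generic utf8 wrapper" idea above — wrap any iterator rather than implicitly opening a file, and use 'replace' so undecodable bytes become U+FFFD instead of raising — might look like this. This is a sketch of the approach, not the gist's or the merged code:

```python
def iter_unicode(stream):
    """Yield unicode from any iterable of bytes/str: a file object,
    a socket wrapper, a kafka consumer, or a plain list."""
    for message in stream:
        if isinstance(message, bytes):
            # 'replace' swaps undecodable bytes for U+FFFD instead of raising
            yield message.decode('utf-8', 'replace')
        else:
            yield message


# works on any generator or list, no io.open() required:
decoded = list(iter_unicode([b'caf\xc3\xa9', 'already unicode', b'\xffoops']))
```

The `'replace'` handler is what makes this safe against the occasional corrupt byte in a raw log stream, at the cost of silently mangling it.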
[16:07:14] also - here's my worse but maybe simpler way to do what you asked about gerrit: [16:08:16] git checkout -b A [16:08:16] work work [16:08:16] git commit [16:08:16] git review [16:08:16] git checkout -b B [16:08:16] work work [16:08:17] git commit [16:08:17] git review [16:08:17] -- this is when you want to address changes to A, right? [16:08:27] yes [16:08:45] git checkout A [16:08:45] work work [16:08:45] git commit --amend [16:08:46] git checkout B [16:08:46] git rebase A [16:08:46] git review [16:09:05] that will submit A2 and B2 both [16:09:18] because you brought A2 into your B topic branch [16:09:22] by rebasing [16:09:24] ah so just multiple branches, aye, but. hmm, gerrit doesn't mind if your topic branch B depends on an unmerged commit in topic branch A [16:09:24] ? [16:09:41] no, it doesn't care [16:09:58] I suck at synchronizing my local topic branch names with the gerrit ones, but I don't think anyone cares about any of that [16:10:16] well, I take that back, I don't think people like me care, I'm sure Christian cares :) [16:11:10] anyway, I think Christian's way is more correct, but this might be easier to reason about / remember. At some point I'll start doing it Christian's way. Also, I'd say use squash instead of fix-up [16:11:29] until you're more comfy with the whole thing anyway [16:12:20] hm, k [16:12:21] thanks [16:12:38] ok, milimetric, here's an annoying thing about using uris all the time [16:12:50] see my comment here? [16:12:50] https://gerrit.wikimedia.org/r/#/c/196073/3/server/bin/eventlogging-kafka-forwarder [16:13:03] basically, in places where raw input is required [16:13:12] i need to make sure that the URI has raw=True on it [16:13:54] whereas, because these scripts don't use get_reader() for example, but call things like sub_socket directly, there is no json conversion. [16:14:18] i guess, hm, get_reader and get_writer are assuming that you are working with json only.
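The raw=True-on-the-URI behaviour being described could be sketched like so; the parameter parsing and function names are assumptions for illustration, not the actual streams.py code:

```python
# Sketch: a reader yields raw strings when the URI carries raw=True,
# otherwise each message is json.loads()-ed (the default being discussed).
import json
from urllib.parse import urlparse, parse_qs


def reader_for(uri, stream):
    """Return an iterator over `stream`, raw or JSON-decoded per the URI."""
    params = parse_qs(urlparse(uri).query)
    raw = params.get('raw', ['False'])[0] == 'True'
    if raw:
        return iter(stream)            # hand back the strings untouched
    return (json.loads(m) for m in stream)


lines = ['{"event": 1}', '{"event": 2}']
raw_events = list(reader_for('tcp://localhost:8600?raw=True', lines))
events = list(reader_for('tcp://localhost:8600', lines))
```

This also shows why the processor needs raw=True: its input lines are not pure JSON, so the default `json.loads` path would blow up on them.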
[16:14:23] whereas raw input might not be json [16:14:33] e.g. from varnishncsa it is not, there are extra fields [16:14:42] that's why processor exists, to convert it to json. [16:14:43] hm [16:16:26] hm, i could make the methods in factory.py take and pass through a raw argument, instead of relying on it being in the uri? [16:16:48] ottomata: that is why there are two processors though, watch out [16:17:32] yes, but still, in both cases those processors' input is raw [16:17:45] ottomata: one for client side one for server side, the process to convert both to valid json is different [16:17:48] ottomata: ah yes [16:18:46] so i mean, it isn't that big of a deal, in the bin scripts themselves [16:18:50] to make sure the input url has raw=True on it [16:18:54] but, it is a little messy [16:18:58] thoughts? [16:19:09] sorry, what's "raw" mean? [16:19:14] raw means [16:19:15] ok [16:19:18] if raw=False [16:19:20] which is the default [16:19:30] the stream returned will automatically convert each message to a json object [16:19:35] rather than just give you the string [16:20:02] processor needs the raw string, because, it assumes that the input will not be a json object [16:20:27] the whole point of processor is to take a raw stream of strings and parse out the json field, and then convert that to a json object stream [16:20:50] so, in order for processor to work, it has to iterate over a raw stream, [16:20:54] if i don't set raw=True [16:21:09] the iterators in streams.py will call json.loads() on the string [16:21:33] https://github.com/wikimedia/mediawiki-extensions-EventLogging/blob/master/server/eventlogging/streams.py#L103 [16:23:24] ottomata: that seems a good thing to me, distinguishing between raw events and json-ified events [16:24:54] yes, it is good [16:24:56] ottomata: yeah, it seems like a fine thing to do in the forwarder, because it wouldn't ever be different for that forwarder [16:25:07] right, i would also do it in processor [16:25:28] maybe just add a short
explanation above [16:25:32] if i wanted multiplexer to work with kafka, i'd have to do it there too [17:08:00] Analytics-EventLogging, operations: Delete vanadium:/srv/eventlogging - https://phabricator.wikimedia.org/T75084#1116914 (RobH) a:Ottomata @ottomata: is this task something you would handle? I'd think so, since the parent is also assigned to you. If not, please feel free to remove yourself. [17:34:18] Analytics-EventLogging, operations: Delete vanadium:/srv/eventlogging - https://phabricator.wikimedia.org/T75084#1117037 (Ottomata) Ja can do. [17:35:39] Analytics-EventLogging, operations: Delete vanadium:/srv/eventlogging - https://phabricator.wikimedia.org/T75084#1117041 (Ottomata) Open>Resolved [17:55:03] mforns: did you test the EL code with the multimedia events? [17:55:16] nuria, yes [17:55:51] mforns: sounds great [17:55:52] you mean in beta? [17:56:08] I tested in vagrant [17:56:13] mforns: no you cannot test it in beta cause there we only control changes to server, not client [17:56:14] not in beta? [17:56:19] ok [18:35:45] milimetric, I tried to find how to mock module (not instance) methods with MagicMock and I didn't find anything, do you have an idea if that is possible? [18:36:25] mocking the module method seems to have no effect [18:36:38] mforns: I might be naive, but I thought it was just module.method = Mock(...) [18:36:39] and the code that uses utcnow is not in a class [18:36:50] this has no effect [18:37:28] huh, weird, lemme mess with it for a second.
It doesn't matter anyway, I'm not concerned about that test not passing :) [18:37:29] milimetric, it also happens if you try to mock a module method by hand, it seems python does not let you assign to modules [18:37:42] if I can't figure something out in the next five - ten minutes, then you can just ignore my comment [18:37:48] I could change reportupdater.py to be a class [18:37:58] nah, that's fine, no need [18:38:10] but it would be lots of lines of code (tests) [18:38:23] ok [18:39:31] mforns: batcave? [18:39:34] sure [18:57:34] milimetric, https://phabricator.wikimedia.org/T92596 is about VE only, right? [18:57:46] it doesn't include any of my wikieditor instrumentation [19:03:46] Krenair: no, it doesn't , the issue is present on client sie data [19:03:50] *client side [19:05:24] nuria, do you know why that bug says rev_id rather than rev_user? [19:05:37] Krenair: lemme see [19:06:48] I wonder what "James seems to think this is because mw.user (which is used to get .isAnon()) was changed to be async at some point. " meant [19:07:37] you can get logged in/logged out in the middle of edits, due to other tabs, etc. I wonder if that's related [19:07:59] Krenair: it means that the user (user.id, user.class) in most VE sessions (client side data) [19:08:15] Krenair: is listed as being anonymous when it is really not [19:08:53] Krenair: James_F thought it might have something to do with the fact that mw.user it loaded async and -in some instances, according to james- [19:08:59] not present when VE is loaded [19:09:28] Krenair: "you can get logged in/logged out in the middle of edits, due to other tabs, etc"-> for the majority of our data? no, i do not think so [19:10:06] Krenair: I think there is some instrumentation failure there, otherwise we will have a small portion of data be wrong, not the majority of it [19:10:23] you think that case is a minority and a much larger part of the data is just wrong? 
[19:10:27] Krenair: wrong, when it comes to identifying anonymous users, that is. not wrong overall [19:10:45] Krenair: no, let me explain. [19:11:20] Krenair: The "majority" of the data incorrectly identifies logged in users as anonymous. That is the only incorrect part. [19:11:49] ok [19:11:50] Krenair: That is separate from whether "userX" went through steps 1 -> 2 -> 3 of the funnel [19:11:55] that's what I thought you were saying before :) [19:12:33] Krenair: What we know is that we cannot do any analysis of user types with the data we have. [19:13:30] Krenair: ah, ok, "wrong data" is kind of too broad. "dataset does not correctly label anonymous users" is more precise [19:13:54] right [19:13:59] wrong user class data is what I meant [19:14:43] sorry I missed your ping, Krenair [19:15:08] one clarification [19:15:16] it wasn't that long ago, a lot of people take much longer to respond [19:15:31] :) [19:15:46] k, so rev_id may be correct, I can't say with 100% certainty, but it looks ok [19:16:01] the user.id is definitely not matching the revision table's rev_user column though [19:16:15] (if merging by event_page.revid = revision.rev_id) [19:16:41] so my point was that most likely it's just event_user.id that's wrong, but it could very well also be event_page.revid, not sure [19:17:00] and yes, this only applies to VE, but James told me to ping you about it first, even though Roan was the one who wrote it [19:17:53] Krenair: let me know if I can clear up anything else [19:27:16] mforns, milimetric, I think I'm trying to do the same thing (mock utcnow, not used in a class) [19:27:24] did you figure out something that works? [19:27:38] fhocutt: yea [19:27:45] Marcel's about to push another patch [19:28:00] and that will have the example, will that work? [19:28:04] I think he's pushing soon-ish [19:28:21] I'll take a look, can one of you link me when it's out? [19:28:27] thanks!
[19:28:28] yes fhocutt, in 10 minutes [19:28:48] I'm doing a last testing round [19:29:09] sure, I'll paste the link [19:40:10] milimetric, so mediawiki.user is loaded at the same time as ve.init.mw.trackSubscriber.js [19:41:50] (PS13) Mforns: Add reportupdater: a more reliable scheduler [analytics/limn-mobile-data] - https://gerrit.wikimedia.org/r/192319 (https://phabricator.wikimedia.org/T89251) [19:41:50] Krenair: at the same time meaning the same request? [19:42:00] (CR) jenkins-bot: [V: -1] Add reportupdater: a more reliable scheduler [analytics/limn-mobile-data] - https://gerrit.wikimedia.org/r/192319 (https://phabricator.wikimedia.org/T89251) (owner: Mforns) [19:42:26] nuria, mediawiki.user is a dependency of the module which contains the instrumentation code [19:42:32] but I don't think that's the issue [19:42:44] milimetric, I pushed the change for fhocutt to see, but I still will fix flake8, so let me ping you [19:42:52] mw.user.isAnon() effectively returns mw.config.get( 'wgUserName' ) === null [19:43:14] thanks, mforns. Which test was the one with the mock? [19:43:20] Krenair: ok, that sounds good, sounds like deps are right. [19:43:33] that comes from MW core: [19:43:34] 'wgUserName' => $user->isAnon() ? null : $user->getName(), [19:44:29] fhocutt, https://gerrit.wikimedia.org/r/#/c/192319/13/reportupdater/test/reportupdater_test.py line 42 [19:44:50] great! thanks very much. [19:45:39] the catch is... the utcnow wrapper function has to be implemented in the same module that will call it [19:46:09] Krenair: how is VE executing the PHP code though, if you use VE only you are hitting only cached pages.. or not? [19:46:10] I implemented it inside https://gerrit.wikimedia.org/r/#/c/192319/13/reportupdater/reportupdater.py line 168 [19:46:16] fhocutt, ^ [19:46:49] oh, interesting [19:46:49] which is the module that calls it [19:46:51] ok.
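The wrapper trick mforns describes — datetime.datetime is a C type whose attributes cannot be patched directly, so the module under test defines its own utcnow() wrapper and tests patch that module attribute instead — might look like the sketch below. The function names are illustrative; reportupdater's actual code differs.

```python
# Sketch of mocking a module-level function (no class involved), per the
# discussion above. The wrapper lives in the same module that calls it.
import datetime
import sys
from unittest import mock


def utcnow():
    """Thin wrapper so tests can patch this module attribute instead of
    datetime.datetime.utcnow (which lives on an unpatchable C type)."""
    return datetime.datetime.utcnow()


def is_late(deadline):
    # utcnow is resolved through module globals at call time,
    # so a patched attribute is picked up
    return utcnow() > deadline


this_module = sys.modules[__name__]
with mock.patch.object(this_module, 'utcnow',
                       return_value=datetime.datetime(2015, 3, 1)):
    late = is_late(datetime.datetime(2015, 6, 1))
```

The key point, matching "[19:45:39] the catch is...": if `is_late` called `datetime.datetime.utcnow()` directly, or imported `utcnow` from another module with `from x import utcnow`, patching would not take effect.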
[19:46:57] nuria, VE doesn't execute that code [19:47:04] it comes from when you initially load the page from the server [19:47:19] I'll make that change to the code I'm testing, then [19:47:25] Krenair: right, i know, but if you are anon using VE, are you [19:47:28] thanks again. [19:47:32] Krenair: still loading from server? [19:47:39] fhocutt, if you need help, ping me :] [19:47:49] you are loading VE JS modules from the server, and some data about the page from the server [19:48:03] mforns, will do. I'm trying to figure out mocks in general, I haven't worked with them before [19:48:17] ok [19:49:19] Krenair: So the VE JS Module is not cached at all? [19:49:56] nuria, what? [19:50:01] it is cached [19:50:13] it's not relevant at all to the determination of whether the user is logged in or not [19:50:32] Krenair: ah, ok, i misread, i think, "some" data comes from server means that if you use VE you are not hitting varnish, is that right? [19:50:34] as I said, that information comes from the initial page load, from MediaWiki core itself [19:52:56] Krenair: then those values will be correct if js is loaded after all info from the main document is available, correct? [19:53:47] nuria, should be [19:54:22] It would be interesting to log the contents of wgUserName and compare it to the value that comes out in rev_user_text [19:54:28] So the javascript "needs to wait" for the main document to be loaded to have all vars available, mmmmm [19:54:53] Krenair: so, if those values are read only once (upon ve js loading) there is your problem right? [19:55:20] Krenair: if js is read and executed before all values on main document are available those values will be null for the rest of the session [19:55:40] nuria, you know how VE actually loads in-page, right?
you have to load the page (which sets wgUserName), then open VE, which checks wgUserName while logging events [19:56:01] You're not making much sense [19:56:37] Krenair: sorry, i do not know how VE is loaded, but if it's loaded always after main document then those values should be OK [19:56:59] Krenair: given that they are not maybe something is not working as it should, correct? [19:58:12] We should be thinking about this as: What does wgUserName say they're logged in as while logging events, vs. who is their edit actually attributed to in the end? [19:59:06] Krenair: so wgUserName is coming on as null [20:00:56] while their edit is getting saved as a logged in user, yes [20:03:10] right, Krenair, I think the fact that ve code depends on the mw.user module rules out the problem that James thought was happening here [20:03:53] if ResourceLoader is to be trusted anyway, which I feel is an easy assumption to make [20:04:30] to me, the fact that the percentage of anons stays relatively constant with time also rules out that possibility. [20:05:51] Krenair: maybe talk to Roan about it. I'm going to go work on my pretty dashboards and when I'm done sometime next week I'll come help again. But until this problem is resolved, we can't analyze users by type or pages by size. [20:10:34] milimetric, nuria: I don't think either of you understand what is happening here. [20:10:57] Krenair: How is js loaded? cause mmm... that is not that hard [20:11:13] :) ok, let me explain real quick, just to make sure [20:11:17] Krenair: or what exactly? 
[20:11:20] mediawiki.user is definitely loaded [20:11:27] nuria, Krenair: hold on please [20:11:34] I'll run through what I think is happening [20:11:37] if it was not, you would never be able to get user.class=IP set [20:11:38] and you can tell me where I'm wrong [20:11:52] mediawiki.user does not send data to the client about whether it is logged in or not [20:11:57] Krenair: one second please [20:12:09] its isAnon function effectively returns mw.config.get( 'wgUserName' ) === null [20:12:13] you said I didn't understand, allow me to explain what I understand, please [20:12:56] That variable (wgUserName) is set up when you initially load the page for viewing, and is set by MediaWiki core as a JS global [20:13:00] page loads (wgUserName is a global variable set by the server, set to the actual user name if the user is logged in) [20:13:13] Krenair: I asked nicely, please shut up now [20:13:15] k, thx [20:14:00] after page load, the VE code is initialized [20:14:20] as part of this initialization, it requests access to the mw.user module, which defines mw.user.isAnon() [20:14:44] the initialization doesn't happen until that module is loaded, and this is ensured by the way ResourceLoader works [20:15:22] therefore, when mw.user is used from VE code, it's certain to be loaded correctly, and return the correct value as far as wgUserName is concerned [20:15:35] Krenair: now I am done, thanks for listening, now you can explain what I'm getting wrong [20:15:54] that part sounds fine [20:16:20] nuria: this is getting good! [20:16:21] https://gerrit.wikimedia.org/r/#/c/196073/ [20:16:21] https://gerrit.wikimedia.org/r/#/c/196637/ [20:16:24] ok, that's all I understand, and as far as I know that's all that influences what the event is emitting [20:16:39] is that not true? [20:16:50] That sounds true to me [20:17:32] ottomata: all right! let's make sure to test this thoroughly before deploying.
[20:17:56] Krenair: ok, then we're on the same page, the bad data is still a mystery :) [20:18:05] ottomata: will CR in detail after i finish a CR I have for Hovercards [20:18:33] indeed. [20:18:59] since I rolled this change into the existing forwarder, the existing config files will have to change [20:19:04] they now need to specify URIs instead of just ports [20:19:16] ottomata: I would love for changes to be backwards compatible though [20:19:27] ottomata: i think that is a requirement for any change to a service [20:19:30] they are, it's just the one format [20:19:34] what do you mean? [20:19:40] we can change the config file pretty easily [20:19:52] everything else should work exactly the same [20:20:24] milimetric, https://phabricator.wikimedia.org/diffusion/EVED/browse/master/modules/ve-mw/init/ve.init.mw.Target.js;9dd64e3f8462e9d3c764263a9021ed0b295accc0$609-629 seems relevant [20:21:05] ottomata: but we shouldn't have to change puppet + service at the same time for things to work [20:21:10] ottomata: config is on puppet [20:21:24] though I wonder what happens if the user's login status changes before they attempt to save [20:21:45] yeah... hm [20:21:54] you are saying i should make it figure out what to do if it has just a port... [20:21:56] ok, not a bad idea [20:21:57] i can do that [20:22:50] hmm [20:23:15] ottomata: because runs of puppet are decoupled from service deploys (not a bad thing), config is in a different system than service code [20:23:22] nuria: i dunno, it is actually pretty inelegant to do that with argparse [20:23:24] yes, that is good. [20:23:27] Krenair: indeed that could cause different users throughout a session, good find [20:23:36] but, we can change config without having eventlogging-forwarders restarted, no? [20:23:43] I suppose then, the only change in users should come after a saveFailure event?
[20:23:46] or, we can do it manually first, deploy, restart forwarders [20:23:49] then make sure puppet is cool [20:24:14] ottomata: mmm, in theory yes, in reality processes die and upstart restarts them [20:24:18] milimetric, a user's login status can change in the background and we can't do much about that [20:24:28] why would they die? [20:24:31] until they save, at which point their old token is now invalid so we have to re-request [20:25:02] ottomata: because a runtime exception will kill the python process [20:25:06] 1. disable puppet [20:25:06] 2. deploy new code [20:25:06] 3. edit config files [20:25:06] 4. restart forwarders [20:25:06] 5. enable puppet with new changes [20:25:18] why would there be a runtime exception? [20:25:25] (PS14) Mforns: Add reportupdater: a more reliable scheduler [analytics/limn-mobile-data] - https://gerrit.wikimedia.org/r/192319 (https://phabricator.wikimedia.org/T89251) [20:25:47] ottomata: ya, i get it, what i was trying to avoid when i said "backwards" compatible is that we should not have to "stop" puppet [20:26:13] ha, naw, stopping puppet is how you do migrations like this, i do it all zee time [20:26:30] although, sure. [20:26:48] ottomata: well, there are runtime exceptions happening not all the time, but frequently enough that if you look at the processes of EL on vanadium [20:27:05] ottomata: they have different uptimes, when they were all started at the same time [20:27:13] haha [20:27:17] hookaayyy [20:27:25] ottomata: which is fine, for a tier-2 service that is [20:27:26] nuria: does EL run in beta? [20:27:36] ottomata: yes, see: [20:28:21] ottomata: https://wikitech.wikimedia.org/wiki/EventLogging/Testing/BetaLabs [20:28:39] ottomata: we fixed it last quarter so it runs clean on beta [20:28:52] ottomata: you need sudo on that box to see logs/db etc [20:29:12] oh cool, i could deploy a varnishkafka instance in betalabs to test the kafka processor then?
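The backward-compatibility idea discussed above — make the forwarder figure out what to do if the config has just a port instead of a full URI — could be sketched roughly like this. The `normalize_endpoint` helper and the `tcp://localhost` default are assumptions for illustration, not the actual eventlogging forwarder code:

```python
from urllib.parse import urlparse

def normalize_endpoint(value, default_scheme='tcp'):
    """Accept either an old-style bare port ('8600') or a new-style URI
    ('tcp://host:8600') and always return a URI string.
    Hypothetical helper sketching the backward-compatible config
    handling discussed above; names and defaults are assumptions."""
    if value.isdigit():
        # old-style config: just a port number, assume localhost
        return '%s://localhost:%s' % (default_scheme, value)
    # new-style config: already a URI; check that it actually parses
    parsed = urlparse(value)
    if not parsed.scheme or not parsed.port:
        raise ValueError('expected a port or scheme://host:port URI: %r' % value)
    return value

print(normalize_endpoint('8600'))                # -> tcp://localhost:8600
print(normalize_endpoint('tcp://10.0.0.1:8600')) # -> tcp://10.0.0.1:8600
```

Doing this normalization after parsing (rather than inside argparse itself) sidesteps the inelegance nuria mentions: argparse just sees a string, and the forwarder decides how to interpret it.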
[20:29:41] (PS15) Mforns: Add reportupdater: a more reliable scheduler [analytics/limn-mobile-data] - https://gerrit.wikimedia.org/r/192319 (https://phabricator.wikimedia.org/T89251) [20:29:41] ottomata: right, right, yes, that is what i mean when i said "test on beta" [20:30:01] (CR) Mforns: Add reportupdater: a more reliable scheduler (5 comments) [analytics/limn-mobile-data] - https://gerrit.wikimedia.org/r/192319 (https://phabricator.wikimedia.org/T89251) (owner: Mforns) [20:30:28] coool [20:30:39] milimetric, now it is ready: https://gerrit.wikimedia.org/r/#/c/192319/ [20:30:59] mforns: ok, awesome, I will merge at the end of the day today [20:31:11] milimetric, no rush from my side [20:31:19] thanks! [20:31:31] ori:yt? [20:31:36] ori: can you merge? https://gerrit.wikimedia.org/r/#/c/191231/ [20:32:10] Krenair: yeah, totally, the user can change, and that's fine. The edge cases don't bother me here, it's more the fact that we're seeing 78% of events from anonymous users, and everything we know about our site tells us that this is very likely to be wrong [20:32:32] one idea was that maybe most people start out logged out and they only log in to save [20:32:48] but then we would have less than 58% anonymous on the saveSuccess event, which we don't [20:32:57] nuria: what is the beta labs project name? [20:33:22] deployment-prep? [20:33:23] milimetric, what about just saveSuccess events? [20:33:23] that it? [20:33:25] i think i'm in that [20:33:30] according to the enwiki.revision.rev_user=0 definition of "anonymous" we have much fewer than that many anonymous editors [20:33:32] yes! [20:33:39] ottomata: https://gerrit.wikimedia.org/r/#/c/191231/ [20:33:41] how closely do those match up to their resulting rev_user? 
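nuria's point above — an uncaught runtime exception kills the Python process, upstart respawns it, and that is why the EL processes show different uptimes despite being started together — can be sketched as follows. Both function names are hypothetical; this illustrates the failure mode and one defensive alternative, not the actual EventLogging code:

```python
import logging
import time

def forward_events(source):
    """Hypothetical forwarding loop: any uncaught exception here kills
    the whole process; upstart's respawn then brings it back, resetting
    its uptime (the effect observed on vanadium)."""
    for event in source:
        if event is None:
            raise RuntimeError('malformed event')  # simulated runtime exception
        # ... forward the event downstream ...

def run_forever(source_factory, retry_delay=1.0):
    """Defensive alternative: catch, log, and retry in-process instead
    of relying on the supervisor to restart a fresh process."""
    while True:
        try:
            forward_events(source_factory())
            break  # source exhausted cleanly
        except Exception:
            logging.exception('forwarder crashed; retrying')
            time.sleep(retry_delay)
```

Whether the in-process retry is worth it is a judgment call; as noted in the discussion, supervisor respawns are arguably fine for a tier-2 service.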
[20:33:42] ottomata: SORRY [20:33:45] Krenair: yeah, that's what I was saying here: 16:32:47 but then we would have less than 58% anonymous on the saveSuccess event, which we don't [20:33:50] ottomata: http://en.m.wikipedia.beta.wmflabs.org/wiki/Main_Page [20:33:51] ah right [20:35:32] Krenair: I didn't have time to look at that matchup in depth, I'm really sorry about that, I have to focus on finishing up the Q3 deliverables [20:36:02] but I will return to the analysis once that's done (maybe end of next week?) Until then, hopefully you guys can talk about this and look it over a bit [20:36:11] maybe there's some simple explanation [20:36:20] yeah I'm going to talk to Roan about it [20:50:32] thanks, mforns, that works for me [20:50:49] ok fhocutt good [21:15:08] (CR) Milimetric: [C: 2] Add reportupdater: a more reliable scheduler [analytics/limn-mobile-data] - https://gerrit.wikimedia.org/r/192319 (https://phabricator.wikimedia.org/T89251) (owner: Mforns) [21:59:06] nuria, as we spoke in tasking, I'm looking at https://phabricator.wikimedia.org/T90742 [21:59:30] I see there is already a changeset in gerrit for that task [22:00:10] nuria, I'd like to know what is the current status? [22:08:04] y'all have a nice weekend, bye [23:12:07] have a nice weekend y'all [23:45:30] Analytics-Dashiki, Analytics-Kanban, Patch-For-Review: Pageviews not loading in Vital Signs - https://phabricator.wikimedia.org/T90742#1118389 (Nuria) Change should not be needed according to @ottomata, but I think it might, that is what it needs to be checked out, whether -via puppet- we need to set u...