[01:26:53] (PS1) Bmansurov: Update ui-daily datasource and graph. Follow up on 9a8dd13d219c1d7ff5de69a6e479a5ff85c725e3 [analytics/limn-mobile-data] - https://gerrit.wikimedia.org/r/174615
[01:30:07] milimetric: thanks for the documentation email, very helpful
[08:05:46] yuvipanda, Hey
[08:06:08] hey!
[08:06:12] yuvipanda, I was not able to wake up early in the morning :\
[08:06:30] yuvipanda, I will implement the pagination after reaching home tonight :)
[08:06:48] that's ok :)
[15:04:06] ottomata: Sneaky meeting! :-D
[15:04:30] AHHH
[15:11:37] Analytics / Wikistats: Cross-link stats.wikimedia.org and ee-dashboard.wmflabs.org - https://bugzilla.wikimedia.org/65994#c2 (Andre Klapper) Nemo: Are you still working on this, as this ticket is assigned to you?
[15:26:39] brrrrrb gotta move car
[15:32:23] Analytics-Features: Testing - https://phabricator.wikimedia.org/T1371 (kevinator) NEW p:Triage
[15:33:23] Analytics-Features: Testing phab with a - https://phabricator.wikimedia.org/T1371#24032 (kevinator) p:Triage>Normal
[15:39:13] heh
[15:39:15] phawkes is here
[15:52:05] Evil phawkes killing wmbugs.
[15:52:11] * qchris_away shokes fist at phawkes!
[15:52:17] s/shokes/shakes/
[15:55:51] qchris, around?
[15:55:57] yup.
[15:56:00] Wassup?
[15:56:13] qchris, i'm still trying to figure out what causes a massive discrepancy in 515-03
[15:56:24] and not exactly sure how to even debug that
[15:56:41] i can easily produce a list of all pagehits that we think are "page hits"
[15:56:45] from hadoop
[15:57:09] since it should match perfectly to the log files, can we figure out what causes the diff?
[15:57:42] Mhmm... I guess so.
[15:57:56] One would have to pass the urls through the pig udf.
[15:58:19] thing is, dan doesn't want to fully switch to my analytics until he is certain that the limn graphs are reasonably comparable
[15:58:58] well ... analytics graphs are known to not be good. Just as the webstatscollector pageview definition is.
[15:59:22] "analytics graphs are known to not be good" should be "analytics zero graphs are known to not be good"
[16:00:55] I could have a look, but I won't find time this week. EventLogging is suffering pretty badly, and I need to do some plugin mord for gerrit to make the bugzilla->phab switch smoother.
[16:01:10] s/mord/work/
[16:02:46] yurikR: Is next week good enough for you?
[16:03:38] sure
[16:03:52] sec, on another line
[16:04:00] k
[16:13:13] qchris, i just need to figure out why just one, single carrier (515-05) is having a 10x difference, compared with everything else (where my calculations are about 10% lower than yours, which seems to be ok)
[16:14:47] Before you wrote 515-03, now 515-05. Which one would you be interested in?
[16:16:09] define discrepancy, here?
[16:16:18] (sorry to butt in)
[16:18:24] qchris, 515-05 sunph
[16:18:29] k
[16:19:28] Ironholds, pretty definite :) my numbers - around 35K daily as of last week, limn - 4.5K
[16:19:40] yurikR: Your counts are higher than the analytics dashboards?
[16:19:45] (or lower)
[16:19:49] qchris, ^^
[16:20:02] Sorry. Right.
[16:20:14] shapes are similar
[16:20:15] I think I have a hunch. Let me double-check.
[16:20:19] oh oh
[16:20:21] :)
[16:20:38] qchris, do you have an account on zero wiki?
[16:20:42] yes.
[16:23:27] https://git.wikimedia.org/blob/analytics%2Fkraken.git/658a43dd27595e5b6a5dffe14fb4e5c3720d9026/kraken-generic%2Fsrc%2Fmain%2Fjava%2Forg%2Fwikimedia%2Fanalytics%2Fkraken%2Fpageview%2FCidrFilter.java#L39
[16:23:38] yurik: I guess ^ is the culprit.
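A minimal sketch of the check qchris is pointing at (hypothetical Python; the actual code is the Java CidrFilter linked above): if the client IP recovered from the X-Forwarded-For chain lands in a private range like 10.0.0.0/8, the request is discarded and never counted as a pageview.

    # Hypothetical sketch, not Kraken's actual logic: a naive XFF walk plus a
    # private-range filter. If a carrier puts internal 10.x hops on the client
    # side of the chain, the "client" IP resolves to a private address and the
    # pageview is dropped, which would explain an undercount for 515-05.
    import ipaddress

    PRIVATE_NETS = [ipaddress.ip_network(n)
                    for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]

    def is_private(ip):
        addr = ipaddress.ip_address(ip)
        return any(addr in net for net in PRIVATE_NETS)

    def resolve_client_ip(xff_header):
        # Naive: trust the left-most XFF entry as the real client.
        return xff_header.split(",")[0].strip()

    # Example request with an internal hop recorded before the carrier IP
    # (both addresses are made up for illustration):
    client = resolve_client_ip("10.64.0.17, 196.201.213.5")
    print(client, "->", "discard" if is_private(client) else "count")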
[16:23:51] kraken does weird (and wrong) X-Forwarded-For things.
[16:24:03] wha? lovely )
[16:24:17] but how come it only affects one carrier?
[16:24:26] And for the 515-05 lines, I guess it is ending up believing that the real client IP is 10.0.0.0/8.
[16:24:28] And discards it.
[16:25:00] oh, it could be due to the XFF containing 10.0... ips on the OTHER side
[16:25:15] I did not test this theory in full, but it would match the first few lines of the zero logs.
[16:25:20] right.
[16:25:33] lol, thx!
[16:25:42] this is good enough i think to report to dan
[16:25:52] Of the first 1000 lines of traffic, this issue explains 770. So that is not too bad.
[16:26:01] 1 sec
[16:28:17] Ok. Then I'll not dig further.
[17:54:54] ottomata, so starting earlier this week, I can't connect to jdbc:hive2://analytics1027.eqiad.wmnet:10000/
[17:56:02] (PS1) QChris: Reformat datasources/ui-daily.json to allow for nicer diffs [analytics/limn-mobile-data] - https://gerrit.wikimedia.org/r/174740
[17:56:04] (PS1) QChris: Update ui-daily datasource columns for recent SQL change [analytics/limn-mobile-data] - https://gerrit.wikimedia.org/r/174741
[17:56:11] .SQLException: Could not establish connection to jdbc:hive2://analytics1027.eqiad.wmnet:10000/wmf_raw: Internal error processing OpenSession
[17:58:17] (CR) QChris: "This change is completely untested." (1 comment) [analytics/limn-mobile-data] - https://gerrit.wikimedia.org/r/174741 (owner: QChris)
[18:09:53] (CR) QChris: "It seems this change did not update the corresponding" [analytics/limn-mobile-data] - https://gerrit.wikimedia.org/r/174055 (owner: Bmansurov)
[18:10:50] Ironholds: will be with you shortly...
[18:10:57] ta!
[18:11:34] (CR) Bmansurov: "Here is the follow up patch I made: https://gerrit.wikimedia.org/r/#/c/174615/" [analytics/limn-mobile-data] - https://gerrit.wikimedia.org/r/174741 (owner: QChris)
[18:16:32] (CR) QChris: "Hey :-D Great. I'll abandon then." [analytics/limn-mobile-data] - https://gerrit.wikimedia.org/r/174741 (owner: QChris)
[18:17:17] (Abandoned) QChris: Update ui-daily datasource columns for recent SQL change [analytics/limn-mobile-data] - https://gerrit.wikimedia.org/r/174741 (owner: QChris)
[18:17:38] qchris: hey, if my patch looks good, can you approve it?
[18:18:49] (Abandoned) QChris: Reformat datasources/ui-daily.json to allow for nicer diffs [analytics/limn-mobile-data] - https://gerrit.wikimedia.org/r/174740 (owner: QChris)
[18:21:08] bmansurov: I'd for sure CR-2 a single 700 character json line, if I saw one. So I guess you don't want me to code review ;-)
[18:21:17] Apparently, others care less, so I better not look.
[18:21:25] * yuvipanda pats qchris
[18:21:32] Hahahaha :-)
[18:21:53] qchris: yes that was weird, I'm not sure why it has to be a single line json
[18:22:18] * yuvipanda bets it is limn related
[18:22:20] To my knowledge, it need not be a single line.
[18:22:36] yuvipanda: yes limn
[18:22:44] In geowiki (also limn), we don't use such long json lines IIRC.
[18:22:52] qchris: oh, then I can surely make it multiline
[18:23:10] I had two changes to do that:
[18:23:31] https://gerrit.wikimedia.org/r/#/c/174740
[18:23:36] https://gerrit.wikimedia.org/r/#/c/174741
[18:23:51] If they are useful for you, just restore them and grab them.
[18:23:57] hookay
[18:23:59] ironholds
[18:24:06] gimme some info again on how to reproduce this?
[18:24:09] you had a gist...
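The gist in question is the R/RJDBC hive_query function linked just below. For reference, an equivalent minimal repro sketch in Python, assuming the PyHive client is available; hostname and database come from the error message above:

    # Minimal repro sketch (assumes PyHive; the actual repro used R's RJDBC).
    # When HiveServer2 has exhausted its heap, even opening the session fails,
    # which matches the "Internal error processing OpenSession" above.
    from pyhive import hive

    try:
        conn = hive.connect(host="analytics1027.eqiad.wmnet", port=10000,
                            database="wmf_raw")
        conn.cursor().execute("SHOW TABLES")
        print("connected fine")
    except Exception as e:
        print("connection failed:", e)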
[18:24:25] qchris: as you point out in your comment, they will override it again I think
[18:25:00] bmansurov / qchris / yuvipanda: I can shed some light
[18:25:16] the scripts that generate the datasources and graph json files indeed don't bother to format them properly
[18:25:28] this can be easily fixed with indent=4 or something like that passed to the json output
[18:25:38] totally, 'angon
[18:26:08] limn does not care about whitespace of course, and you can even write these in YAML if you want. The files won't be overwritten unless someone runs the automatic generator
[18:26:34] milimetric: would it be a good idea to update the generator to always use indent=4?
[18:26:44] bmansurov: yeah, def.
[18:27:04] milimetric: ok i'll submit a patch
[18:27:05] bmansurov: https://github.com/wikimedia/analytics-limn-mobile-data/blob/master/generate-graph.py#L40
[18:27:13] and also: https://github.com/wikimedia/analytics-limn-mobile-data/blob/master/generate-graph.py#L183
[18:27:20] thanks
[18:27:21] ottomata, https://github.com/Ironholds/WMUtils/blob/master/R/hive_query.R
[18:28:48] ehhhh
[18:28:55] ?
[18:29:00] how do I import RJDBC?
[18:29:09] Error: unexpected symbol in "import RJDBC"
[18:29:13] i just want to test the connection
[18:29:34] library(RJDBC)
[18:29:42] or install.packages("RJDBC"); library(RJDBC)
[18:30:00] heh, installing packages not in debs? be quiet :)
[18:32:17] (PS1) Bmansurov: Otput pretty datasource and graph JSON files [analytics/limn-mobile-data] - https://gerrit.wikimedia.org/r/174757
[18:33:52] i'm getting closer
[18:34:01] ironholds, help me with a really easy connection problem reproduction
[18:34:03] (CR) Milimetric: Otput pretty datasource and graph JSON files (1 comment) [analytics/limn-mobile-data] - https://gerrit.wikimedia.org/r/174757 (owner: Bmansurov)
[18:34:42] Ironholds: http://www.codeshare.io/67aYo
[18:35:02] (PS2) Bmansurov: Output pretty datasource and graph JSON files [analytics/limn-mobile-data] - https://gerrit.wikimedia.org/r/174757
[18:35:32] (CR) Milimetric: [C: 2] Output pretty datasource and graph JSON files [analytics/limn-mobile-data] - https://gerrit.wikimedia.org/r/174757 (owner: Bmansurov)
[18:35:39] (Merged) jenkins-bot: Output pretty datasource and graph JSON files [analytics/limn-mobile-data] - https://gerrit.wikimedia.org/r/174757 (owner: Bmansurov)
[18:36:31] why is the third line bright orange, ottomata? I can't read it ;p
[18:36:41] i had it selected
[18:36:45] stupid codeshare
[18:36:47] :p
[18:37:21] that's looking better
[18:37:25] ok cool, i got what you got now
[18:37:32] cool!
[18:37:39] okay, that runs for me and produces the error I've pasted
[18:38:06] oh ho!
[18:38:06] 2014-11-20 18:37:56,663 ERROR thrift.ProcessFunction (ProcessFunction.java:process(41)) - Internal error processing OpenSession
[18:38:07] java.lang.OutOfMemoryError: GC overhead limit exceeded
[18:38:09] on hive server!
[18:38:16] well well well
[18:38:21] we may have to move hive server to a beefier machine!
[18:38:36] hmmn!
[18:38:41] (PS2) Bmansurov: Update ui-daily datasource and graph. [analytics/limn-mobile-data] - https://gerrit.wikimedia.org/r/174615
[18:38:47] that's the hive, not the hadoop?
[18:38:54] because I noticed it happens whatever I do to the hadoop heapsize
[18:39:08] that is the hive server
[18:39:11] not your client
[18:39:18] alternatively, does hive2 have any weirdness around the heapsize there? We don't use it, afaik, so that may have defaults or something.
[18:39:20] ottomata: 1:1 now?
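The indent=4 fix merged above is a one-argument change to the generator's json.dump calls; a sketch of the difference (the data and filename here are illustrative, not the actual generate-graph.py code):

    import json

    datasource = {"id": "ui-daily", "format": "csv", "columns": ["date", "edits"]}

    # Before: the default dump puts everything on one line, so every edit
    # shows up in review as a single unreadable 700-character diff.
    with open("ui-daily.json", "w") as f:
        json.dump(datasource, f)

    # After: indent=4 pretty-prints the file. limn ignores the whitespace,
    # so only the diffs get nicer.
    with open("ui-daily.json", "w") as f:
        json.dump(datasource, f, indent=4)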
[19:39:20] that error is in the hive server logs on analytics1027 itself
[19:39:24] oooooh
[19:39:26] yay!
[19:39:28] someone else's problem!
[18:40:37] tnegrin: 5 minutes?
[18:40:43] kk
[18:40:57] I will be here
[18:41:48] hello milimetric, would you like some monitoring for the wikimetrics project? puppet failures and out of disk space errors for now, more can be added later.
[18:41:57] * yuvipanda is adding monitoring to prod-like things on labs
[18:42:02] hm, Ironholds, i will restart hive server for you, and work on this
[18:42:11] but ellery is running a hive query atm
[18:42:21] ottomata, yay! Ta
[18:42:27] and he's always running a hive query, but let's wait, yeah.
[18:43:07] yuvipanda: sure, that'd be swell. We already have some custom monitoring around there, but more info helps
[18:43:14] ok, tnegrin
[18:43:20] milimetric: right. can you tell me who all should be notified and their email addresses?
[18:43:58] yuvipanda: just me. If you could send me the change that enables that, others can submit changes for themselves as they see fit
[18:44:04] i'll communicate once I have your change
[18:44:06] joining hangout...i think
[18:44:11] me: dandreescu@wikimedia.org
[18:44:14] milimetric: sure! let me add it
[18:44:36] milimetric: what's the name of the labs project? analytics?
[18:45:11] analytics yes
[18:45:16] cool
[18:45:19] but that contains a bunch of stuff
[18:45:24] is this adding it to the project level?
[18:45:39] yeah
[18:46:00] I'll be adding more fine grained controls later.
[18:46:31] ok, Ironholds, try now
[18:48:10] * Ironholds drumrolls
[18:48:43] IT WORKS
[18:48:45] yay!
[18:54:27] milimetric: added!
[18:54:40] milimetric: note that shinken is still beta quality, etc, etc.
[18:54:54] but we've found important issues via it on deployment-prep all of last month
[18:55:16] yep, thanks much
[18:55:22] beta > nothing
[18:55:35] :D
[19:05:11] milimetric: starting to show up now. email notifications are busted atm (am fixing). http://shinken.wmflabs.org/problems (username/pw: guest/guest)
[19:06:53] ottomata, okay, it works! MOSTLY
[19:07:29] https://gist.github.com/Ironholds/c437cb58236f9cfbfd90
[19:07:33] that's what happens if I ask for 100k rows
[19:07:38] <=10k, it works fine
[19:08:06] do the server logs say anything interesting?
[19:26:43] (PS1) Mforns: Add pages edited metric [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/174773 (https://bugzilla.wikimedia.org/73072)
[19:26:53] (CR) jenkins-bot: [V: -1] Add pages edited metric [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/174773 (https://bugzilla.wikimedia.org/73072) (owner: Mforns)
[19:27:20] (CR) Mforns: [C: -1] "Tests missing (in progress)." [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/174773 (https://bugzilla.wikimedia.org/73072) (owner: Mforns)
[19:27:21] yeahhh, Ironholds
[19:27:22] again
[19:27:23] 2014-11-20 19:19:00,094 WARN thrift.ThriftCLIService (ThriftCLIService.java:FetchResults(510)) - Error getting result set metadata:
[19:27:23] java.lang.RuntimeException: java.lang.OutOfMemoryError: GC overhead limit exceeded
[19:28:01] looks like hive server has to keep your result in memory in order to give it to you via jdbc :/
[19:30:45] gah!
[19:30:57] and it's not on a dedicated machine, or is but it's a weedy one?
[19:31:09] because shit, 100k rows is nothing :/
[19:33:30] it is not on a dedicated machine... hmmmmm
[19:33:36] i can probably increase heap size for hive server
[19:33:37] lemme see
[19:37:10] ottomata, danke!
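Since HiveServer2 apparently buffers the whole result set in its own heap to serve it over JDBC, one client-side mitigation is to page through results rather than pulling everything in one call; a sketch (PyHive again as a stand-in for the R client; query and chunk size are illustrative):

    from pyhive import hive

    conn = hive.connect(host="analytics1027.eqiad.wmnet", port=10000)
    cur = conn.cursor()
    cur.execute("SELECT uri_path FROM wmf_raw.webrequest LIMIT 100000")

    # fetchmany() keeps each round trip small. It doesn't remove the
    # server-side buffering entirely, so a bigger HiveServer2 heap (the
    # env-var rabbit hole ottomata runs into later) is still the real fix.
    total = 0
    while True:
        chunk = cur.fetchmany(10000)
        if not chunk:
            break
        total += len(chunk)
    print(total, "rows fetched")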
[19:37:21] if we can get it to tolerate 8m rows, we'll never need more than that.
[19:37:31] or at least, if we ask it for more than that, R will probably be sad.
[19:38:59] (CR) Mforns: Add pages edited metric (6 comments) [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/174773 (https://bugzilla.wikimedia.org/73072) (owner: Mforns)
[19:39:54] milimetric, I pushed the new metric for review (although still in progress), so if you want to see it, I'll be happy
[19:50:10] ottomata, hrm. And it looks like it's not clearing rows, I think.
[19:51:10] Ironholds: oh?
[19:51:19] i am trying to increase heapsize of server, but it is not working like it is supposed to!
[19:51:30] i have to run pretty soon but i am going to scramble to make this better asap
[19:51:35] i think/hope i can do it before i gotta run
[19:52:46] ottomata, so, 20k works. Then 100k doesn't work. then 20k doesn't work again
[19:52:53] I think it's just my code failing to flush the connection on failure
[19:52:57] so I've patched that and will retry
[19:53:34] yep, looks like it
[19:53:45] * Ironholds tests
[19:53:51] (intentionally running broken code, woot)
[19:56:36] (CR) Nuria: "Running the metric through the UI I get:" [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/174773 (https://bugzilla.wikimedia.org/73072) (owner: Mforns)
[19:57:54] nuria__, you know, at some point I got that, too
[19:58:08] and then it stopped happening and I forgot
[19:58:20] i was going to investigate where it comes from... makes no sense
[19:58:26] now I'm thinking maybe a restart is needed
[19:58:32] why?
[19:58:50] besides i had just restarted
[19:59:03] maybe mysqlalchemy caches the data model?
[19:59:14] I don't know
[19:59:21] *sqlalchemy
[19:59:37] oh!
[19:59:58] it caches the meta info yes, but the error is on the python object ....
[20:00:55] mmm
[20:01:08] Ironholds: so heap is ok as is?
[20:01:16] mforns: but i guess you are right
[20:01:25] have you tried restarting wikimetrics-queue?
[20:01:36] ottomata: well, it's flushing old rows, now.
[20:01:44] it's still not good with big dataset sizes :(
[20:01:50] well, I haven't tested that actually. Wait one.
[20:01:55] mforns: i restarted - must have not done it before - and it worked
[20:02:18] mforns: no.. wait
[20:02:24] aha
[20:03:14] mforns: no, double checked that it DID work
[20:03:19] ok
[20:04:17] nuria__, we could add automatic restart on wikimetrics-queue when files get edited, because today only wikimetrics-web is restarted
[20:04:36] edited or changed
[20:04:58] mforns: ya, right
[20:05:19] mforns: does celery have a setting for that, lemme see
[20:06:10] ottomata, okay, it's clearing rows properly now, but still problems with large datasets
[20:08:01] (CR) Mforns: "Wikimetrics-queue needs to be restarted, sorry for not remembering that." [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/174773 (https://bugzilla.wikimedia.org/73072) (owner: Mforns)
[20:09:27] mforns, Ironholds quick question. Is it possible to ingest staging.countries and turn it into a dimension of the latest cube or is that beyond what Pentaho can do?
[20:09:33] ok
[20:09:35] mforns: there is an option, just like flask's one, only for development: http://celery.readthedocs.org/en/latest/userguide/workers.html#autoreloading
[20:09:55] DarTar, I guess you could just create a new table that INNER JOINs the two and create that on top of the existing cube?
[20:10:21] DarTar, I think it can be done
[20:10:44] yes, I'm with Ironholds
[20:10:47] (CR) Jdlrobson: [C: 1] "Unverified. I can't generate these graphs myself manually :/" [analytics/limn-mobile-data] - https://gerrit.wikimedia.org/r/174615 (owner: Bmansurov)
[20:10:54] this is just metadata on existing data so hopefully a full re-ingestion is needed, but yes, this can be done too
[20:11:00] not needed, that is
[20:11:14] mforns: no, i will do it, do not worry,
[20:11:27] mforns: i need to wait for EL CR so it's fine
[20:11:32] nuria__, I suppose there isn't any reason to not do that right? maybe not cancelling running reports?
[20:11:38] Ironholds: what's the status of the latest data, are we working on it because of the HTTPS issue or is it staying like that?
[20:11:42] oh, ok!
[20:11:43] mforns: it doesn't
[20:11:54] mforns: kills workers 1 by 1
[20:12:00] mforns: and it is only for dev
[20:12:03] not prod
[20:12:11] yeah ok
[20:13:40] Analytics / Wikimetrics: Wikimetrics queue should be restarted in development when files change - https://bugzilla.wikimedia.org/73671 (nuria) NEW p:Unprio s:normal a:None Wikimetrics queue should be restarted in development when files change. This is the same behaviour that we currently s...
[20:13:47] mforns: I have assigned it to myself: https://bugzilla.wikimedia.org/show_bug.cgi?id=73671
[20:13:53] Analytics / Wikimetrics: Wikimetrics queue should be restarted in development when files change - https://bugzilla.wikimedia.org/73671 (nuria) a:nuria
[20:13:55] DarTar, it's probably gonna stay like that, although I am thinking of doing some experimentation around IP ranges
[20:13:58] nuria__, fine
[20:14:15] Ironholds: got it
[20:16:50] (CR) Milimetric: [C: 2] Update ui-daily datasource and graph. [analytics/limn-mobile-data] - https://gerrit.wikimedia.org/r/174615 (owner: Bmansurov)
[20:16:56] (Merged) jenkins-bot: Update ui-daily datasource and graph. [analytics/limn-mobile-data] - https://gerrit.wikimedia.org/r/174615 (owner: Bmansurov)
[20:21:07] (CR) Nuria: "Rightt.. let me fix that today, should be a small fix if things are documented in celery as they should be." [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/174773 (https://bugzilla.wikimedia.org/73072) (owner: Mforns)
[20:21:21] nuria__: I am trying to speed up merging of o-ri's multi-insert-fix-up, as it should make the EventLogging issue disappear for now. I see you tested it and everything.
[20:21:22] Do you think you can merge it today or tomorrow, or would it take longer (then I'd dial back the volume of some schemas)?
[20:21:50] qchris: If ori CR-s it
[20:22:15] I thought it is his code ... so you should CR it?
[20:22:17] qchris: we should be able to merge it today, besides the unit tests i have tested a bunch on beta-labs
[20:22:20] * qchris rechecks.
[20:22:30] qchris: making up exceptions and seeing re-starts as we should
[20:23:05] qchris: this is the change: https://gerrit.wikimedia.org/r/#/c/174485/
[20:23:06] Glad to hear that we might be able to merge it today \o/
[20:23:20] So you'd prefer o-ri to code-review?
[20:23:38] qchris: if you want to CR
[20:23:51] You read the code ... so you should CR :-)
[20:23:52] qchris: that will be awesome and I am sure ori is ok with it
[20:24:09] qchris: but i wrote a bunch of it so it'll be a self merge
[20:25:04] Gerrit thinks you only wrote the tests and added a comment. But gerrit screws up diffs sometimes, so it might again do the same thing here :-)
[20:25:30] Ok. I'll have a look then.
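The celery option nuria links above (and later files as bug 73671) is the development-only worker autoreloader; a sketch of how it would be switched on for wikimetrics-queue, per the Celery 3.x docs linked in the conversation (the setting and flag names are from those docs; the app name is illustrative):

    # celeryconfig.py: development-only. The master process watches the
    # imported modules and restarts workers one by one when a file changes
    # (hence "kills workers 1 by 1" above), mirroring what flask's debug
    # reloader already does for wikimetrics-web.
    CELERYD_AUTORELOAD = True

    # Equivalent one-off form from the shell (Celery 3.x):
    #   celery worker --app=wikimetrics --autoreload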
[20:26:58] qchris: no, that's right, ori wrote the fix for the grouping, i just corrected a small issue and wrote the tests for it
[20:27:20] qchris: so gerrit is telling you the truth
[20:28:04] qchris: if it looks good to you given the testing i feel pretty good, batching is done every 2 secs
[20:28:25] qchris: we can increment that interval if it is too small
[20:37:57] Ironholds: i am close but have failed
[20:38:01] my heapsize setting is not taking
[20:38:10] and there is a rabbit hole of shell scripts and env vars that I have failed to solve
[20:38:48] hmmm
[20:53:11] nuria__: Do you have sufficient permission to deploy EventLogging (I lack access to tin)?
[20:53:29] qchris: no i do not think so
[20:53:40] k.
[20:53:41] qchris: I just have access to vanadium
[20:54:11] qchris: i need to leave for a dr appointment and will be back in 2 hours,
[20:54:20] Sure.
[20:54:32] qchris: i can try to sync up with ori when i am back and we can try to deploy if you want
[20:54:38] I'll try to get it deployed with help from others in the meantime.
[20:54:50] qchris: ok
[20:55:10] ori: I figure EventLogging deployments need to go through git-deploy on tin. I still lack access there.
[20:55:21] Could you please help us out once again?
[20:55:40] qchris: sure, but how come you don't have access?
[20:55:52] Never needed to deploy anything :-)
[20:56:00] * qchris is just an analytics guy.
[20:56:05] qchris: jajaja
[20:56:16] qchris: in our hearts you shall always be the gerrit guy? :)
[20:56:17] I guess with EventLogging, I should apply for access if you don't oppose.
[20:56:26] * qchris hugs yuvipanda
[20:56:45] * yuvipanda hugs qchris
[20:56:45] qchris: of course I don't oppose
[20:56:50] qchris, ori: I will be out of reach for a couple hours but i am leaving irc on, plis plis let me know how things go ok
[20:57:03] nuria__: ok. I'll keep you updated.
[20:57:04] nuria__: no way, we'll keep you in suspense
[20:57:10] ;)
[20:57:10] qchris, ori: as, if i missed testing anything in beta labs, i'd like to know
[20:57:13] yeah, total information blackout on nuria__ today
[20:57:15] ori is sooooo evil :-D
[20:57:19] ciaoooo
[20:57:20] Ouch yuvipanda tooo!
[20:57:41] so many things we won't keep you updated about...
[20:57:44] LIKE WHAT I AM EATING RIGHT NOW
[20:57:45] byeeee
[20:59:11] ori: Thanks!
[20:59:26] qchris: looks good
[21:00:36] Awesome! Thanks again.
[21:00:48] tnegrin, DarTar: can one of you file an access request for qchris for tin so he can do EL deploys?
[21:00:49] I'll babysit it for a few hours.
[21:01:12] I'll file an RT ticket right now.
[21:01:40] qchris: you have root on vanadium, right?
[21:01:55] Yup. I could deploy in a rude way.
[21:02:03] right, cool
[21:02:34] But I'd prefer to go through the usual tin git-deploy thing, unless something needs an immediate fix.
[21:03:03] It does all the tagging and everything. That makes debugging so much easier.
[21:05:51] milimetric: did you just get spammed with a bunch of nonsensical emails?
[21:05:53] no, right?
[21:06:15] Yay! More monitoring spam :-P
[21:06:33] qchris: :) just trying to figure out *why* it doesn't actually spam
[21:06:46] i had falsely assumed that if you're on call for a host, then all notifications wrt that go to you
[21:06:47] apparently not
[21:06:54] :-)
[21:10:02] yuvipanda: nope
[21:10:26] milimetric: cool. hopefully you will in a few days
[21:16:11] ori, qchris: I was AFK, is it done already?
[21:16:27] DarTar: Yup.
[21:16:30] Thanks.
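For context on the change being deployed (gerrit 174485): it groups incoming events and inserts each group in one statement on an interval. A minimal sketch of that shape, assuming hypothetical insert_many/insert_one helpers; only the 2-second interval comes from the discussion above, not from EventLogging's actual code:

    import time

    FLUSH_INTERVAL = 2.0  # "batching is done every 2 secs"

    class BatchingWriter(object):
        """Group events per schema; flush each group as one multi-row insert."""

        def __init__(self, db):
            self.db = db                  # hypothetical storage handle
            self.pending = {}             # schema name -> list of events
            self.last_flush = time.time()

        def add(self, event):
            self.pending.setdefault(event["schema"], []).append(event)
            if time.time() - self.last_flush >= FLUSH_INTERVAL:
                self.flush()

        def flush(self):
            for schema, events in self.pending.items():
                try:
                    self.db.insert_many(schema, events)
                except Exception:
                    # Fall back to row-by-row so one bad event can't sink the
                    # whole batch (the failure mode nuria's tests poke at by
                    # "making up exceptions").
                    for e in events:
                        self.db.insert_one(schema, e)
            self.pending = {}
            self.last_flush = time.time()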
[21:17:00] cool
[21:39:18] Ironholds, mforns: I created staging.pentahoviews_countries, LEFT JOINing pentahoviews and including all relevant fields (country name, continent, global north)
[21:39:33] ready to be ingested whenever you want
[21:39:38] ok
[21:39:53] do we want to keep country name or should I drop it?
[21:40:22] knowing how embarrassingly bad we are at mapping country codes I'd tend to leave it
[21:40:27] Ironholds: thoughts?
[21:41:18] This supposedly will increase the size of the DB by countries * 7 * 2 ...
[21:41:28] no, wait
[21:41:41] they are dependent, so only by #countries
[21:41:50] how are we bad at mapping country codes?
[21:41:53] can this be a problem?
[21:43:12] Ironholds: as measured by our combined response time on that list of 20 countries
[21:43:47] I'm fine dropping it if you guys don't see a good use for that field
[21:43:59] I dunno about you, but I was exhausted by Monday morning
[21:44:07] I probably wouldn't take that as a standardised example
[21:44:16] ha ha, fair
[21:44:19] building it in doesn't hurt the dataset but it will require changes to the format pentaho digests
[21:44:22] so, bear that in mind
[21:44:36] I think it will be ok
[21:44:43] well, I imagine we'll need to include continent, GS/GN data regardless
[21:44:46] don't see any problem with tah
[21:44:51] *that
[21:44:57] whether it comes from the raw data or a post-hoc join in SQL
[21:45:07] for future updates, that is
[21:46:06] ok, let's scrap country name for now and just leave the two other fields
[21:46:30] Ironholds, do you want me to import the data or you want to do it?
[21:46:36] {{done}}
[21:46:39] hah
[21:46:44] DarTar, I guess
[21:46:46] I don't have it ;p
[21:47:03] staging.pageviews_countries
[21:47:36] but it's really not something I should be working on. I have WikiGrok and app instrumentation waiting for me
[21:47:39] bah. no way I can run a cron job every 32 days, is there.
[21:48:32] Ironholds: can you help with this?
[21:48:49] my brain is dead and I have to build the UUID thing for apps
[21:48:59] I'm not seeing where the work is, outside throwing it into pentaho
[21:49:09] where I don't have a labs account with a valid SSH key at the mo ;p
[21:49:32] ok, np, I'll do it
[21:49:34] ok, mforns that sounds like quite a lot of work on our end
[21:49:38] thx man
[21:49:48] np
[21:49:59] I'll ping you when ready
[21:50:07] is kevinator aware that we're stealing some of your time?
[21:50:28] DarTar, meeting ;p
[21:50:33] (well, in 10m)
[21:50:40] yes, I'm syncing in the daily meeting
[21:50:42] mforns: thanks, I created a card on Trello to track changes to Pentaho https://trello.com/c/bGBdTGSC/557-pentaho-data-changes
[21:50:57] (feel free to comment there, if you have access to Trello)
[21:51:19] I can see the card, but not comment on it
[21:51:26] Ironholds: do we need to check in today? If not I could use some time to get this WikiGrok stuff done
[21:51:38] I don't know
[21:51:48] mforns: are you logged in/do you have an account on Trello?
[21:51:56] you can use your Google App creds
[21:52:08] ok
[21:52:09] and you'll be automagically created as a WMF member
[21:52:20] I did not see that I was logged in with my personal account
[21:52:45] oh the table is public, so you can see it even if you're logged out
[21:52:56] as an org member you can comment on it
[21:53:09] as a board member you can actually modify cards (happy to invite you)
[21:53:15] >member
[21:54:10] Ironholds, mforns if we ingest a new snapshot of data from pentahoviews_countries should we also remove the isHTTPS field, given that it's buggy?
[21:54:26] I don't know why, I login with my wikimedia account and it redirects me to my personal account..
[21:54:32] ok, I'll fix that later
[21:54:36] well, if nobody's using it, sure
[21:54:44] just remember that every change you make, I have to hardcode later.
[21:55:02] Ironholds: I thought we concluded that it's not accurately representing what it does, i.e. it tracks zero requests
[21:55:04] DarTar: should Leila join you at the mobile schema thing?
[21:55:29] DarTar, I'm not objecting to that
[21:55:33] tnegrin: she can if she needs some distraction :)
[21:55:36] it looks like she's going to be working with EL/schemas one way or another
[21:55:45] I'm just pointing out that if I try and throw data into a table with different fieldnames, and a different field count, it explodes.
[21:55:49] tnegrin: alright I'll loop her in
[21:55:51] so, every change you make...
[21:56:06] thanks
[21:56:08] oh Ironholds I was going to modify pentahoviews_countries only
[21:56:14] not the original table
[21:56:17] so your code is safe
[21:56:31] the process could stay the same for now, i.e.
[21:56:43] - dump the data into pentahoviews
[21:56:45] then feel free
[21:56:50] because, as said: nobody's using it ;p
[21:56:51] - post-process it
[21:56:54] - ingest it
[21:56:55] (the SSL field)
[21:57:01] cool
[21:57:29] jesus I'm looking forward to friday.
[21:58:07] hey Ironholds
[21:58:25] hey boss. What's up?
[21:58:40] uh, shoot — private
[22:00:05] DarTar, do you want me to drop the is_http column?
[22:00:35] milimetric: you might get spam from shinken now. that's ok, I hope. just testing email to make sure it works
[22:00:51] oh it works yuvipanda :)
[22:01:22] i'm just ignoring it and will delete but it may not be that useful to me if it's for the whole analytics project
[22:01:23] milimetric: yay
[22:03:15] mforns: nope, we can't drop it, my bad, we would need to recompute the data not to break it down along that dimension
[22:03:21] let's leave it then
[22:03:30] and just bump the version number in Pentaho
[22:03:41] ufa, I just dropped it...
[22:03:45] I messed up
[22:03:51] no worries :)
[22:03:54] I can regenerate it
[22:03:57] hang on
[22:03:57] sorry...
[22:04:01] np
[22:04:59] mforns: done
[22:05:12] oh.. ok, I was starting to sweat
[22:05:15] oh wait
[22:05:25] you removed it from the original dataset?
[22:05:41] aka pentahoviews?
[22:05:45] no, from pentahoviews_country
[22:06:07] ok, then we're good
[22:06:14] ok
[22:06:24] for whatever reason it's not there in my refreshed table
[22:06:25] hang on
[22:06:57] there are 2 tables now
[22:07:12] pentahoviews_countries (without is_https)
[22:07:22] and pentaho_county (with is_https)
[22:07:34] *pentahoviews_country
[22:08:04] yes, sorry just use _countries
[22:08:07] I think you are looking at the old table
[22:08:10] _country was a typo
[22:08:29] I dropped _country
[22:08:33] ok, fine
[22:08:36] :)
[22:08:49] so I'll import that and tell you when done
[22:09:02] oh one last thing
[22:09:05] sorry for the mess
[22:09:08] yep
[22:09:16] goddammit
[22:09:25] we need a centralised tasking system that runs on days.
[22:09:35] apparently "Invalid" gets truncated to "Inv" when ingested
[22:09:58] as a value for the country field
[22:10:22] milimetric: can you do me a favor and tell me exactly how many emails you got?
[22:10:43] 14
[22:10:51] funny, my lucky number
[22:10:59] Ironholds: expand?
[22:11:26] hmm
[22:11:27] https://wikitech.wikimedia.org/w/api.php?action=query&list=novainstances&niproject=analytics&niregion=eqiad&format=json
[22:11:30] lists 18 hosts
[22:11:39] so theoretically you should've gotten 18 emails
[22:11:41] but you got only 14
[22:12:46] DarTar, ok I wait
[22:13:13] mforns: does that require a schema change?
[22:13:31] it's stored as a BLOB in SQL
[22:13:36] DarTar, the new fields?
[22:13:41] I initially thought it was truncated there
[22:13:45] oh, the new fields should be there
[22:14:18] appended continent and global_north to _countries
[22:14:35] yes yes
[22:14:43] I did not get your question then
[22:14:57] it's about the Invalid truncation?
[22:15:12] mforns: ah, I was saying that when the data is imported into Pentaho "Invalid" as a country value is truncated as "Inv"
[22:15:19] oh
[22:15:28] it's a minor thing
[22:15:49] if it's complicated to fix, it's nbd
[22:16:46] ok, I'll see what I can do
[22:22:04] mforns: thanks
[22:22:14] no problem!
[22:31:16] YuviPanda|zzzz: I got one more since then
[22:45:09] nuria__: Just a quick heads up that the EventLogging thing is looking good so far. Numbers jumped up.
[22:55:23] DarTar, the data is in Pentaho!
[22:55:44] mforns: awesome, checking
[23:24:13] Analytics / EventLogging: if logEvent fails due to not matching Schema requirements, it should return false (or an error) instead of true - https://bugzilla.wikimedia.org/73678 (Ryan Kaldari) NEW p:Unprio s:normal a:None Right now, if you pass data to logEvent that doesn't match the schema,...
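For the record, the staging table DarTar built earlier is essentially one denormalized join; a sketch of the shape (the join key and column names are assumptions; the conversation only fixes the table names and the continent/global_north fields):

    import MySQLdb  # assumes the MySQLdb client; connection details omitted

    # Rebuild staging.pentahoviews_countries as described above: country
    # name dropped, continent and global_north kept. "country_code" as the
    # join key is an assumption.
    CREATE_SQL = """
        CREATE TABLE staging.pentahoviews_countries AS
        SELECT pv.*, c.continent, c.global_north
        FROM staging.pentahoviews AS pv
        LEFT JOIN staging.countries AS c
          ON pv.country_code = c.country_code
    """

    conn = MySQLdb.connect(read_default_file="~/.my.cnf", db="staging")
    cur = conn.cursor()
    cur.execute(CREATE_SQL)
    conn.commit()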