[03:43:02] (PS1) Helder.wiki: Remove extra comma [analytics/quarry/web] - https://gerrit.wikimedia.org/r/154008 [08:29:14] Analytics / Wikimetrics: Separate concurrent report queue and regular 'default' queue - https://bugzilla.wikimedia.org/69523 (nuria) NEW p:Unprio s:normal a:None Separate concurrent report queue and regular 'default' queue. It is possible that a bunch of scheduled recurrent runs could bloc... [08:37:56] Analytics / Wikimetrics: Story: Community has documentation on chosen dashboard architecture and alternatives - https://bugzilla.wikimedia.org/67125#c1 (nuria) This bug is taken care of by this document, is it not? https://www.mediawiki.org/wiki/Analytics/Editor_Engagement_Vital_Signs/Dashboard [09:51:57] hola springle [09:52:39] nuria: hi [09:52:41] (PS4) Nuria: Fix slow Rolling Active Editor metric [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/149482 (https://bugzilla.wikimedia.org/68596) (owner: Milimetric) [09:53:05] one question about what we talked about yesterday regarding [09:53:13] the table to consult replication lag [09:53:42] The replication lag will be reported per host, correct? or per project? [09:53:55] per shard [09:54:05] s1 s2, .. s7 etc [09:54:16] ok, and another question [09:54:27] the table will be in ops db ? [09:54:53] no, nobody gets SELECT access to ops :) probably in information_schema_p [09:55:37] ok, and information_schema is a db that exists in every shard correct? [09:56:03] no, that schema exists only on labsdbs hosts, one per host [09:56:19] because the new mariadb labsdb instances combine shards [09:56:48] let me log into one of the hosts to make sure i understand [09:57:14] nuria: i haven't sat down and built this yet. subject to change :) [09:57:46] springle: haha, yeah, i know, it's just that i would like to understand how it works. it does not help [09:58:00] that i have like .. 
ahem HUGE gaps in how labs is put together [09:58:02] np [09:59:45] ok, but i get it: information_schema_p is a db with open access on every labs host, correct? [09:59:58] correct [10:00:15] very well, thank you! [10:00:35] I will add this info to the bug i filed. [10:02:11] Analytics / Wikimetrics: replication lag may affect recurrent reports - https://bugzilla.wikimedia.org/68507#c2 (nuria) Per conversation with springle this table will be in db information_schema_p which is present on every host and has open access. Information on lag will be reported per shard (s1, s... [10:03:11] Analytics / Tech community metrics: Remove severity related graphs from bugzilla_response_time.html - https://bugzilla.wikimedia.org/69179#c2 (Santiago Dueñas) Pull request merged. [10:25:31] qchris, regarding the zero bug [10:26:21] nuria: yes, what's up? [10:26:23] i cannot get a 200 request via command line, i get a 301 [10:26:36] that translates to a cache miss [10:26:41] like: [10:26:55] curl http://fr.m.wikipedia.org/ -H 'Request URL: http://fr.m.wikipedia.org/favicon.ico' [10:27:25] That's a strange curl line. [10:27:34] Use telnet. It's way simpler there. [10:27:51] You can talk the protocol directly there, and [10:28:04] have direct full control over how the request is made. [10:28:30] If you want to, I can demo in a hangout how to do it. [10:28:50] this is the result of the curl though http://pastebin.com/FT2iWbgt [10:29:15] it's not a 404 [10:32:26] Well ... no one said it has to be a 404. ... and actually ... a 404 for http://fr.m.wikipedia.org/ would be bad :-) [10:32:46] http://fr.m.wikipedia.org/ is a 301 to http://fr.m.wikipedia.org/wiki/Wikip%C3%A9dia:Accueil_principal [10:32:48] for me. [10:32:52] That looks perfectly fine. 
[10:39:32] qchris: that request is GET http://fr.m.wikipedia.org/http://fr.m.wikipedia.org/favicon.ico but a 301 is what i would expect sure [10:40:01] i do not understand how this with telnet is any different, isn't it a different means to build the http requests? [10:40:07] *request [10:41:57] nuria: the url after the GET has a slash between org and http. so it is "org/http". Not "orghttp" [10:42:30] Also, the curl command you gave above results in a "GET / HTTP/1.1" for me. [10:42:41] "Request URL: ..." gets sent as header. [10:42:58] Maybe our versions of curl differ. [10:43:14] But regardless ... the slash between org and http is in the way. [10:43:29] this: curl http://fr.m.wikipedia.org -H 'Request URL: http://fr.m.wikipedia.org/favicon.ico' -D headers -v [10:44:03] returns the same (301) [10:44:19] "results in a "GET / HTTP/1.1" for me." -> right [10:44:25] is that not expected? [10:44:42] nuria you claimed above "qchris: that request is GET http://fr.m.wikipedia.org/http://fr.m.wikipedia.org/favicon.ico [...]" [10:45:17] That claim does not hold true for me. And by your dpaste [10:45:31] and also the comment you just made, it does not hold true for you either. [10:45:34] ah sorry, you are correct [10:45:41] Around the "GET /" ... [10:45:52] I cannot see the relation to the zero bug. [11:15:27] sorry qchris , getting back to this, please take a look at this and let me know if this is what you are thinking: http://pastebin.com/R9mvZyda [11:16:55] nuria: yes. That looks good. [11:17:03] See, now the Content-Type matches [11:17:10] And you get a 200 http status code. [11:20:20] ok, will add this info to the bug and that concludes our work on it right? [11:21:31] I hope not. [11:21:36] :-) [11:21:43] what else remains to be done? 
[11:21:48] With the telnet command, you now have [11:22:04] a way to produce the correct Content-Type and HTTP status code, but [11:22:24] as we discussed yesterday, it remains to be shown that they really result in [11:22:32] the reported logged URLs, and also [11:22:46] one needs to argue whether or not those requests are sane or not. [11:22:53] And whether or not they are an issue. [11:24:04] What additional work is needed to show that the request is logged as reported by zero team? it follows from varnish documentation [11:24:28] Analytics / General/Unknown: zero.log contains duplicate host in logs - https://bugzilla.wikimedia.org/69371#c15 (nuria) You can reproduce one of these requests by using telnet as follows. Please note "host" and "url below [prompt]$ telnet 91.198.174.204 80 Trying 91.198.174.204... Connected to mobile-... [11:26:49] Also, whether they are an issue depends on who consumes the logs I guess. Who does? [11:26:50] Well ... you before argued other things about our logging which all have been refuted :-) [11:27:07] So if I were you, I'd want to make sure. [11:27:24] But if you thing, checking is again not necessary ... it's your bug, so do as you wish. [11:27:30] s/thing/think/ [11:28:15] At least researchers, devs, and wikipedia zero team consume them. [11:28:31] But we do not have an exhaustive list of who consumes which data. [11:28:32] Sane or not is really not an issue, logs are full of spammy requests like "wikipedia/sexy.vdo", which makes sense as nothing filters requests before they get to varnish, so we get those 'spammy' requests and these ones, which seemed malformed to me but I guess they are not. [11:29:24] For the benefit of researchers I will point to this bug from the page on the logging format on wikitech, does that sound good? [11:29:36] *researchers and devs [11:30:20] You decide. You asked my opinion. I gave my opinion. [11:32:03] Oh ... I did not respond to that last question. 
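The telnet reproduction recipe above amounts to typing an HTTP/1.1 request by hand, which gives full control over the request line and headers (unlike the strange curl invocation earlier). A minimal sketch of building such a raw request in Python follows; the favicon path and mobile host are illustrative stand-ins, since the exact values from the zero bug are truncated in the log.

```python
# Sketch of the telnet-style reproduction discussed above: speaking HTTP/1.1
# directly lets you control the request line and headers exactly.
# The host and path below are example values, not the ones from the bug.

def build_raw_request(host, path):
    """Build the bytes you would type into `telnet <ip> 80` by hand."""
    lines = [
        "GET {} HTTP/1.1".format(path),
        "Host: {}".format(host),
        "Connection: close",
        "",  # blank line terminates the header block
        "",
    ]
    return "\r\n".join(lines).encode("ascii")

raw = build_raw_request("fr.m.wikipedia.org", "/favicon.ico")
# These bytes can be sent over a plain socket to the frontend IP on port 80,
# exactly as one would over a telnet session.
```

Sending `raw` over `socket.create_connection((ip, 80))` reproduces the request without any client-side URL rewriting getting in the way.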
Pointing to the bug does not make sense to me, because there are quite a few false claims and blames there. [11:32:11] But it's a wiki. Be bold. [11:41:11] ok, documented there, i did point to the bug as I think anyone working with the files might be interested in what went into analyzing the issue, falsehoods and all: https://wikitech.wikimedia.org/wiki/Cache_log_format#Issues [11:43:37] I really do not see anything else we need to do on the bug, the varnish docs are self-explanatory on how the request is parsed and those are already linked from the bug [11:47:27] Analytics / Tech community metrics: Remove severity related graphs from bugzilla_response_time.html - https://bugzilla.wikimedia.org/69179#c3 (Quim Gil) PATC>RESO/FIX Thank you! The changes are visible at http://korma.wmflabs.org/browser/bugzilla_response_time.html (you might need to refresh the... [12:14:03] (PS3) Yuvipanda: Let users star and unstar queries [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153963 [12:14:08] (CR) jenkins-bot: [V: -1] Let users star and unstar queries [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153963 (owner: Yuvipanda) [12:14:50] (PS4) Yuvipanda: Let users star and unstar queries [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153963 [12:15:03] (CR) Yuvipanda: [C: 2] Let users star and unstar queries [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153963 (owner: Yuvipanda) [12:15:08] (Merged) jenkins-bot: Let users star and unstar queries [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153963 (owner: Yuvipanda) [12:18:11] (PS1) Yuvipanda: Use gender neutral pronouns [analytics/quarry/web] - https://gerrit.wikimedia.org/r/154040 [12:18:21] (CR) Yuvipanda: [C: 2] Use gender neutral pronouns [analytics/quarry/web] - https://gerrit.wikimedia.org/r/154040 (owner: Yuvipanda) [12:18:26] (Merged) jenkins-bot: Use gender neutral pronouns [analytics/quarry/web] - https://gerrit.wikimedia.org/r/154040 (owner: Yuvipanda) [12:32:16] (CR) Nuria: "I 
did test this with a couple of cohorts with 400 and 60 users each and gains were significant." [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/149482 (https://bugzilla.wikimedia.org/68596) (owner: Milimetric) [12:36:28] (CR) Nuria: [C: 2] Fix slow Rolling Active Editor metric [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/149482 (https://bugzilla.wikimedia.org/68596) (owner: Milimetric) [12:36:35] (Merged) jenkins-bot: Fix slow Rolling Active Editor metric [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/149482 (https://bugzilla.wikimedia.org/68596) (owner: Milimetric) [12:36:49] * YuviPanda waves at nuria [12:37:04] holaaaa YuviPanda [12:37:08] heya nuria :) [12:37:26] Analytics / Wikimetrics: Rolling Active Editor is slow - https://bugzilla.wikimedia.org/68596 (nuria) PATC>RESO/FIX [12:37:44] YuviPanda: Your tool is on fire eh? [12:38:04] nuria: not many users since Wikimania, as seen in http://quarry.wmflabs.org/query/runs/all [12:38:22] but I've been hacking away at features anyway, like http://quarry.wmflabs.org/Yuvipanda :) [12:43:38] (PS1) Yuvipanda: Fix 502 on login for new users [analytics/quarry/web] - https://gerrit.wikimedia.org/r/154043 [12:45:15] (CR) Yuvipanda: [C: 2] Fix 502 on login for new users [analytics/quarry/web] - https://gerrit.wikimedia.org/r/154043 (owner: Yuvipanda) [12:45:20] (Merged) jenkins-bot: Fix 502 on login for new users [analytics/quarry/web] - https://gerrit.wikimedia.org/r/154043 (owner: Yuvipanda) [12:50:49] (PS1) Yuvipanda: Fix div width issue in well underneath code area [analytics/quarry/web] - https://gerrit.wikimedia.org/r/154046 [12:50:51] (PS1) Yuvipanda: Reduce default height of SQL query text area [analytics/quarry/web] - https://gerrit.wikimedia.org/r/154047 [12:51:08] (CR) Yuvipanda: [C: 2] Fix div width issue in well underneath code area [analytics/quarry/web] - https://gerrit.wikimedia.org/r/154046 (owner: Yuvipanda) [12:51:14] (Merged) jenkins-bot: Fix div width issue in well underneath 
code area [analytics/quarry/web] - https://gerrit.wikimedia.org/r/154046 (owner: Yuvipanda) [12:51:15] (CR) Yuvipanda: [C: 2] Reduce default height of SQL query text area [analytics/quarry/web] - https://gerrit.wikimedia.org/r/154047 (owner: Yuvipanda) [12:51:20] (Merged) jenkins-bot: Reduce default height of SQL query text area [analytics/quarry/web] - https://gerrit.wikimedia.org/r/154047 (owner: Yuvipanda) [12:51:56] (CR) QChris: Fix slow Rolling Active Editor metric (2 comments) [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/149482 (https://bugzilla.wikimedia.org/68596) (owner: Milimetric) [12:52:47] (CR) QChris: "Oh well ... seems it was merged while I was writing this :-D" [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/149482 (https://bugzilla.wikimedia.org/68596) (owner: Milimetric) [14:23:14] Analytics / Quarry: Allow comments on queries - https://bugzilla.wikimedia.org/69543 (Helder) NEW p:Unprio s:normal a:None I wanted to mention on http://quarry.wmflabs.org/query/219 that it is related to bug 19288 and bug 58196, but there was no way to add a comment. [14:23:56] Analytics / Wikimetrics: Wikimetrics can't run a lot of recurrent reports at the same time - https://bugzilla.wikimedia.org/68840 (nuria) PATC>RESO/FIX [14:24:43] Analytics / Quarry: Show the author of a query in its page - https://bugzilla.wikimedia.org/69544 (Helder) NEW p:Unprio s:normal a:None When accessing a page like this http://quarry.wmflabs.org/query/219 I would like to see the username of the author (possibly with a link to Meta-wiki) [14:32:14] [travis-ci] wikimedia/mediawiki-extensions-EventLogging#234 (wmf/1.24wmf17 - 387049c : Reedy): The build passed. [14:32:14] [travis-ci] Change view : https://github.com/wikimedia/mediawiki-extensions-EventLogging/commit/387049c6bf0c [14:32:14] [travis-ci] Build details : http://travis-ci.org/wikimedia/mediawiki-extensions-EventLogging/builds/32544461 [14:49:19] hey ottomata -- can I access the hadoop jobtracker? 
[14:49:22] UI that is [14:53:05] yes! [14:53:23] that one is easier than most, since you def have shell access on the master node [14:54:19] heh -- I want the UI to see what the cluster is doing [14:54:54] this is the same conversation we have every few months! [14:55:07] here [14:55:07] https://wikitech.wikimedia.org/wiki/Analytics/Cluster/Access#ssh_tunnel.28s.29 [14:55:08] just updated that [14:55:16] yeah, tnegrin, there is still no good way [14:55:20] the sshuttle thing on that page works [14:55:24] kk [14:55:34] I'll see if I can get it to work -- thanks! [14:56:13] refresh, just updated again [14:56:34] ssh -N bast1001.wikimedia.org -L 8088:analytics1010.eqiad.wmnet:8088 [14:56:35] then [14:56:39] http://localhost:8088/cluster [14:56:52] ottomata, hi, have some time today? [14:57:47] sure yup [14:57:49] now is good [14:57:54] oh i see your python packages thing [14:57:55] i think that's fine [14:58:59] ottomata, there's more than that :) [14:59:20] k ja [14:59:21] so! [14:59:22] let's talk [14:59:30] gimme da skinny! [14:59:38] hey ottomata -- qq [14:59:43] before you dive in with yurikR [14:59:44] issues: 1) what's the best place to store sms logs (i already uploaded them to 1003 /a/zerosms) [14:59:55] tnegrin@stat1002:~$ /usr/lib/hadoop/bin/hadoop job -kill job_1406229821917_23633 [14:59:55] Error: JAVA_HOME is not set and could not be found [15:00:25] hm, not sure about that, but that's a weird way to kill a job! [15:00:27] i think you want [15:00:38] yarn application -kill application_IDxxxxxxx [15:01:35] ok, yurikR, the main diff between stat1002 and stat1003 is that stat1002 is for private data [15:01:44] if you think this data is private and sensitive, you probably want to do this on stat1002 [15:01:50] it is [15:02:02] but stat1002 doesn't have external access [15:02:12] meaning it can't pull data from amazon cloud [15:02:18] ah [15:02:19] hm [15:02:41] does stat1003 not have a public-datasets folder? 
[15:03:00] because: private and sensitive, sure, but given that you need an SSH key and bastion access to get into either of those [15:03:13] I think if some evildoers get into stat3 we're sort of screwed anyway. [15:03:22] yurikR: https://wikitech.wikimedia.org/wiki/Http_proxy [15:03:38] Ironholds: true, access is just more restricted to stat1002 [15:03:43] and stat1002 does not have a public IP [15:03:55] yeah, which are both strengths. [15:04:07] but nobody really uses stat2 because they don't know it has a MySQL client now. [15:04:15] and that means ErikZ and I get to use it without caring about other people [15:04:22] I LIKE writing code that doesn't care about other people :D [15:04:54] (more seriously: proxying is a PITA. Stat1003 would be a lot easier. I don't really get that there's a security problem there but I'm not that technical, so.) [15:06:09] ottomata, boto has a proxy configuration -- http://boto.readthedocs.org/en/latest/boto_config_tut.html#boto [15:06:52] what values would i use for port, user, password? [15:06:52] ja, yurikR, from stat1002: [15:06:52] curl --proxy http://webproxy.eqiad.wmnet:8080 http://www.google.com [15:06:59] no user or pass [15:07:01] 8080 for port [15:07:09] going to add that to the wikitech page [15:07:12] took me a sec to remember that [15:07:16] ok, so that can be solved, good. Thx! 
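The proxy recipe above (curl through `webproxy.eqiad.wmnet:8080`, no user or password) maps directly onto Python's stdlib, which is relevant here since the zero-sms scripts are Python/boto. A minimal sketch, assuming only the host and port stated in the log; the commented-out fetch is the equivalent of the curl line:

```python
# Route HTTP(S) requests through the cluster's outbound proxy, per the log:
# host webproxy.eqiad.wmnet, port 8080, no credentials.
import urllib.request

proxies = {
    "http": "http://webproxy.eqiad.wmnet:8080",
    "https": "http://webproxy.eqiad.wmnet:8080",
}
handler = urllib.request.ProxyHandler(proxies)
opener = urllib.request.build_opener(handler)
# opener.open("http://www.google.com")  # equivalent of:
# curl --proxy http://webproxy.eqiad.wmnet:8080 http://www.google.com
```

For boto specifically, the linked config docs support the same idea via `proxy = webproxy.eqiad.wmnet` and `proxy_port = 8080` in the `[Boto]` section of `~/.boto`, with the user/password keys simply left unset.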
[15:08:15] what about location - which dir structure to use -- my scripts are in http://git.wikimedia.org/summary/?r=analytics/zero-sms.git [15:08:51] i would like the script to be cloned somewhere and auto-pulled via cron, plus cron should auto-run the script on the data [15:09:00] edited: [15:09:00] https://wikitech.wikimedia.org/wiki/Http_proxy [15:09:05] what's the best way to organize it [15:09:16] Ironholds: have you checked this out: http://localhost:8088/proxy/application_1406229821917_23653/ [15:09:21] argh -- sorry [15:09:23] that won't work [15:09:23] yurikR, if you want to just manage this yourself, you can clone it into your homedir [15:09:27] and you can save the data somewhere in /a [15:09:34] (/a isn't that organized...:/) [15:10:04] what about the scripts? [15:10:06] Ironholds: you can see what your hadoop jobs are doing [15:10:13] the scripts could be in your home directory [15:10:16] /a/zero-sms would be fine for data i suppose [15:10:24] or, if you want ops to manage it, we can puppetize this [15:10:41] puppetize the script cloning? i would rather have that i think [15:10:48] i think you would too :) [15:10:51] but, that and the cron job, etc. [15:10:57] i think its fine to manage this yourself (especially for now) [15:11:13] if you ever want it quote 'productionized', we'll have to puppetize, etc. [15:11:19] but for now i think you can keep it under your wing [15:11:24] so you can add a cron under your user too [15:11:38] ok, so i basically clone the scripts into my own home dir, and set up two cron jobs - to clone and to run the scripts [15:11:48] do you need a cron to clone/pull? [15:11:56] would like to [15:11:58] you can just do that whenever you need to 'deploy' an update, right? [15:12:01] ok, then sure [15:12:02] that's fine [15:12:10] can i easily move files from stat1003 to 1002? 
[15:12:14] (one time) [15:12:18] yes [15:12:24] /a has an rsync module on all stat servers [15:12:32] on stat1003: [15:12:48] rsync path/to/bla stat1002.eqiad.wmnet::a/path/to/bla [15:12:53] (note the double :: there) [15:14:58] rsync /a/zerosms stat1002.eqiad.wmnet::a/a/zero-sms -- access denied [15:15:15] does it use my own creds? [15:16:28] Analytics / Refinery: Story: AnalyticsEng has vetted PageView Definition - https://bugzilla.wikimedia.org/69546 (Kevin Leduc) NEW p:Unprio s:enhanc a:None vet the pageview definition [15:16:51] ottomata, ^^ [15:17:24] hey ottomata -- can we talk about the hadoop queues when you have a sec [15:17:27] Analytics / Refinery: Story: AnalyticsEng has vetted PageView Definition - https://bugzilla.wikimedia.org/69546 (Kevin Leduc) p:Unprio>Normal [15:17:27] Analytics / Refinery: Story: AnalyticsEng has hourly PageView counts - https://bugzilla.wikimedia.org/67804 (Kevin Leduc) [15:18:12] yurikR: no need for the /a/ part [15:18:15] ::a tarets /as [15:18:20] ::a targets /a [15:18:32] so, ::a/zero-sms should be fine [15:18:33] or actually [15:18:34] even just [15:18:36] ::a/ [15:18:49] tnegrin: sure [15:18:51] i got time now [15:18:56] (kinda waiting on a review...) [15:19:18] ok -- it looks like dario's query is running in a higher priority q than the camus stuff? [15:19:27] ottomata, still getting access denied [15:19:42] yurik@stat1003:~$ rsync /a/zerosms stat1002.eqiad.wmnet::a [15:19:42] @ERROR: access denied to a from stat1003.wikimedia.org (208.80.154.82) [15:19:58] DarTar’s queries on a fast track [15:20:07] well of course [15:20:30] someone's cutting the line ;) [15:20:51] non capisco [15:21:21] simmer down gents -- we don't want the loading jobs to starve [15:21:57] hmmm [15:22:30] HM, weird, yurikR, me too, investigating... [15:22:38] (tnegrin, with you shortly...) 
[15:22:47] np -- no hurry [15:23:32] OH, weird, yurikR, /a module is missing the stat1003 rule, um, hm, /srv is an alias [15:23:47] ottomata, i chmod zero-sms to 700, is that ok? [15:23:48] yurikR: try ::srv/ [15:23:52] ja that's fine [15:24:01] try ::srv/ instead of ::a/ [15:24:41] yurik@stat1003:~$ rsync /a/zerosms stat1002.eqiad.wmnet::srv/zero-sms [15:24:41] skipping directory zerosms [15:25:20] -r [15:25:39] oof, this whole /srv /a confusion needs cleaning up. [15:25:48] yay, i think it's doing ... something ... [15:25:53] i'm worried now :) [15:25:55] cool, if you add -v it will tell you what [15:26:07] but, it looks like it's working [15:26:09] too late, it's done its dirty deed [15:26:10] i see /a/zerosms [15:26:13] on stat1002 [15:26:47] cool, learnt something new. Now my script, when it generates some dashboards, how can it copy them to gp (the labs analytics) [15:27:25] that i am not sure... [15:27:40] i don't want it to go via git or any other open repo because they are semi-protected: host//graph [15:27:42] i suppose you could set up an rsync module on gp as well [15:28:09] that is write only [15:28:21] and only allows stat1002, but i'm not sure [15:28:43] ok, will explore that [15:28:54] tnegrin: i don't see any of dartars jobs... [15:29:12] ah, this one? [15:29:13] http://localhost:8088/cluster/app/application_1406229821917_23651 [15:29:26] ottomata, does stat1002/a get backed up anywhere? [15:29:36] application_1406229821917_23653 [15:29:38] ah, i see [15:29:45] yurikR: no, not unless specifically requested [15:29:51] there aren't any other jobs now [15:30:01] do we back up logs? 
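The rsync exchange above converges on a working command: the double colon targets an rsync daemon module (`srv`, which maps to /srv, aliased as /a, hence the earlier "access denied" on the `a` module), and `-r` is needed so directories are not skipped. A sketch of wrapping that into a cron-able helper; the `--dry-run` default is my addition for safe previews, not something from the log:

```python
# Compose the rsync invocation that finally worked above:
#   rsync -r /a/zerosms stat1002.eqiad.wmnet::srv/zero-sms
# The :: selects a daemon module on the target, not an ssh path.
import subprocess

def rsync_to_stat1002(src, module_path, dry_run=True):
    """Return the argv for pushing src to stat1002's rsync module."""
    cmd = ["rsync", "-rv", src, "stat1002.eqiad.wmnet::" + module_path]
    if dry_run:
        cmd.insert(1, "--dry-run")  # preview what would transfer
    return cmd  # pass to subprocess.run(cmd, check=True) to execute

cmd = rsync_to_stat1002("/a/zerosms", "srv/zero-sms")
```

Running with `dry_run=False` and `subprocess.run(cmd, check=True)` performs the actual one-time move discussed in the log.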
[15:30:27] I would have expected DarTar's job in the adhoc queue [15:30:36] hmmmm [15:31:11] https://github.com/wikimedia/operations-puppet/blob/production/templates/hadoop/fair-scheduler.xml.erb [15:31:28] so default and adhoc are equivalent, except default accepts more running apps [15:31:32] looks like [15:31:44] (i haven't looked much at this before) [15:33:55] hm, honestly i'm not really sure why we have adhoc and default (dsc configured these originally) [15:34:28] tnegrin, maybe we only need two queues? default and 'standard'? [15:34:34] maybe standard should be renamed 'admin' [15:34:34] ? [15:34:44] and just be given a higher weight? [15:34:59] probably [15:35:15] the camus stuff needs to run first and not get starved [15:35:50] but three queues is probably ok [15:36:29] yeah, and camus is running in default queue too.... [15:36:43] ok, i'm adding this to my todo list, i will discuss with q chris and get his thoughts [15:37:44] super -- thanks [15:47:12] Analytics / Refinery: Story: AnalyticsEng has vetted PageView Definition - https://bugzilla.wikimedia.org/69546#c1 (Kevin Leduc) NEW>RESO/INV this story is no longer needed. [15:56:49] milimetric: both hangouts are empty ... where do we continue? [15:57:07] i'm in the google one [15:57:28] ok. [15:57:28] qchris_meeting: ^ [15:57:31] sry :) [15:57:32] joining. [15:57:53] thanks. [15:59:33] ottomata, https://gerrit.wikimedia.org/r/#/c/153862/ - need to try graphing in labs [15:59:45] could you +2 pls :) [16:00:27] cool, yurikR, uhm, i'm not sure i'm the one with authority to +2 on that repo [16:00:32] i mean, i probably have permissions [16:00:44] who normally approves stuff on that repo? [16:00:56] no idea - this is the repo for beta labs according to iwki [16:00:59] wiki [16:01:11] for production, they use core [16:01:13] with branches [16:02:12] hashar, ? ^ [16:02:51] yurikR: what is the question? 
[16:03:00] hashar, https://gerrit.wikimedia.org/r/#/c/153862/ [16:03:10] please repeat / ask [16:03:17] I am not going to parse a bunch of things, it is late :-D [16:03:28] could you +2 - this is for labs only, right? seems like i could have self-merged it too [16:03:46] i'm looking at that repo [16:04:11] although i'm not sure i should have added the dir [16:05:42] yurikR: just use sync-with-gerrit.py [16:05:46] at the root of the repo [16:05:49] it takes care of everything [16:05:54] or should [16:07:16] hashar, that script assumes i'm on a mac [16:07:54] could you add the graph ext pls? [16:16:16] hashar, still around? [17:21:04] oh! qchris_meeting, I forgot to mention [17:21:06] the LEAD implementation of that HIVE query is awesome [17:21:13] I had no idea HIVE had lead and partition and all that great stuff [17:21:20] I could swear when I looked at it last year, it didn't... or the version we were using was old [17:32:55] milimetric: maybe it didn't have LEAD back then. Don't know. [17:33:01] Upgrades for the win :-D [17:33:35] that's awesome though, good riddance to that monstrosity I wrote (I'm now having deja-vu like you showed me this before) [17:36:19] hey milimetric [17:36:42] milimetric: have a question if you have a few mins :) [17:37:11] shoot YuviPanda [17:37:22] milimetric: how are you storing results of queries in wikimetrics? [17:37:45] as json in REDIS for 30 days [17:37:57] then for reports that people make public, forever as that same JSON dumped to files [17:38:16] ah [17:38:21] so what happens if the JSON is too big? 
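The LEAD/PARTITION BY pattern praised above replaces the self-join "monstrosity" with a single pass: for each row, the window function fetches the same partition's next value directly. The actual Hive query is not shown in the log, so here is a toy stand-in using sqlite3 (window functions need SQLite >= 3.25) with an invented edits schema:

```python
# Toy illustration of the LEAD(...) OVER (PARTITION BY ... ORDER BY ...)
# pattern from the Hive query discussed above, using an invented schema.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edits (user TEXT, ts INTEGER)")
conn.executemany("INSERT INTO edits VALUES (?, ?)",
                 [("a", 1), ("a", 5), ("a", 9), ("b", 2), ("b", 3)])

# For each edit, pull the same user's next edit timestamp in one pass,
# instead of self-joining the table on "smallest later timestamp".
rows = conn.execute("""
    SELECT user, ts,
           LEAD(ts) OVER (PARTITION BY user ORDER BY ts) AS next_ts
    FROM edits
    ORDER BY user, ts
""").fetchall()
```

The last row of each partition gets NULL for `next_ts`, which is exactly the edge case the old self-join had to handle by hand.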
[17:38:46] well, wikimetrics is somewhat controlled, and we were talking about adding some limits [17:38:58] ah, right [17:39:01] but no restrictions exist now [17:39:06] hmm, for Quarry I'm running into the problem already [17:39:20] since everything is public, query results can be huge and I keep them forever [17:39:39] well, just make a function of size -> retention period [17:39:47] and warn the user [17:40:18] milimetric: no, my problem is more like I'm crashing people's browsers ;) [17:40:34] heh, by trying to render the json output? [17:40:38] yeah [17:40:51] me and phuedx came up with a solution where each result is stored as an sqlite database :) [17:40:52] are you storing it in a file on the backend? [17:40:58] yeah [17:41:00] JSON file [17:41:14] right - well, this is where CSV would help a lot [17:41:21] well, TSV 'cause commas are evil, but yea [17:41:22] right [17:41:29] you can stream those [17:41:31] then you could head / tail from the web interface [17:41:35] yeah [17:41:52] but you can do the same with JSON if you want to keep the format that way [17:42:09] streaming JSON is a pain [17:42:13] indeed [17:42:33] well... hm... YAML? [17:42:42] milimetric: was considering msgpack as well [17:42:45] then you kind of get the best of both words [17:42:48] but sqlite also lets me stream [17:42:49] *worlds [17:42:57] and also paginate [17:43:08] yeah, but sqlite is gonna pad it quite a bit with extra space [17:43:16] true, but I'm putting them on NFS :D [17:43:20] so have all the space in the world [17:43:24] in that case you definitely have to implement a lower retention policy for big stuff [17:43:40] well, NFS has like 40T of free space I think [17:43:50] but you don't need a separate database for each output, just use the same db file, no? [17:43:54] make a separate table for each output [17:44:03] why not have a different db for each output? 
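The "function of size -> retention period" idea floated above can be sketched in a few lines: bigger public result files get shorter lifetimes, and the user is warned up front. The thresholds below are invented for illustration; per the log, neither Quarry nor wikimetrics implements any such policy yet.

```python
# Sketch of a size-based retention policy: map a result file's size to how
# long it is kept. Thresholds are made-up example values, not project policy.

def retention_days(size_bytes):
    """Return how many days a result of this size should be retained."""
    mib = size_bytes / (1024 * 1024)
    if mib <= 1:
        return 365      # small results are cheap to keep around
    if mib <= 100:
        return 90
    return 30           # huge dumps expire quickly; warn the user up front

policy = retention_days(50 * 1024 * 1024)  # a 50 MiB result
```

A nightly cron could then delete any result file whose mtime is older than `retention_days(os.path.getsize(path))` days.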
[17:44:18] table management is easier than file management [17:44:29] or maybe not [17:44:41] yeah, I'd rather manage hierarchical files than flat-list-of-tables [17:44:44] 'cause with the file granularity you get mtime and stuff like that for free [17:44:51] ^ that [17:44:57] makes sense [17:47:35] YuviPanda: the only potential downside would be if you wanted to compare results for some reason [17:54:48] Analytics / Wikimetrics: Wikimetrics backup has no monitoring - https://bugzilla.wikimedia.org/69397 (Sam Reed (reedy)) s:normal>enhanc [17:55:03] Analytics / General/Unknown: Create a table in labs with replication lag data - https://bugzilla.wikimedia.org/69463 (Sam Reed (reedy)) s:normal>enhanc [17:55:04] Analytics / Quarry: Show the execution time in the table of queries - https://bugzilla.wikimedia.org/69264 (Sam Reed (reedy)) s:normal>enhanc [17:55:09] qchris: the send_nsca script I assume is being ready by the yarn user, right? [17:55:14] if you already know for sure, let me know [17:55:16] :) [17:55:19] read* [17:55:26] Analytics / Quarry: Show the author of a query in its page - https://bugzilla.wikimedia.org/69544 (Sam Reed (reedy)) s:normal>enhanc [17:55:36] Analytics / Quarry: Allow comments on queries - https://bugzilla.wikimedia.org/69543 (Sam Reed (reedy)) s:normal>enhanc [17:55:36] Analytics / Quarry: Make the table sortable - https://bugzilla.wikimedia.org/69265 (Sam Reed (reedy)) s:normal>enhanc [17:55:40] I have no clue. But that's easy to find out. [17:55:45] Let me do that [17:55:54] Analytics / Quarry: Add a list/table of popular queries - https://bugzilla.wikimedia.org/69266 (Sam Reed (reedy)) s:normal>enhanc [17:58:35] argh. the test cluster is in safe mode ... :-/ [17:59:41] probably your namenode is in standby? [17:59:42] naw [17:59:43] hm [17:59:45] no you don't have HA [17:59:46] dunno [17:59:49] restart namenode? 
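The per-result design agreed on above (one SQLite file per query run, so cleanup can use plain filesystem metadata like mtime instead of bookkeeping tables) can be sketched as follows. The directory layout, `resultset` table name, and TEXT-only columns are my assumptions for illustration, not Quarry's actual schema:

```python
# One SQLite file per query result, per the design discussed above.
# File-level granularity gives mtime "for free" for retention/cleanup.
import os
import sqlite3
import tempfile

def store_result(result_dir, run_id, columns, rows):
    """Write one query run's rows into its own SQLite file; return the path."""
    path = os.path.join(result_dir, "{}.sqlite".format(run_id))
    conn = sqlite3.connect(path)
    col_defs = ", ".join('"{}" TEXT'.format(c) for c in columns)
    conn.execute("CREATE TABLE resultset ({})".format(col_defs))
    placeholders = ", ".join("?" for _ in columns)
    conn.executemany(
        "INSERT INTO resultset VALUES ({})".format(placeholders), rows)
    conn.commit()
    conn.close()
    return path

tmp = tempfile.mkdtemp()
path = store_result(tmp, 219, ["page", "views"], [("Main", "42")])
```

The web tier can then stream or paginate with plain `SELECT ... LIMIT ? OFFSET ?` against the per-run file, which is what makes this friendlier to browsers than one giant JSON blob.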
[17:59:54] *wasn't me!* [18:00:22] The namenode runs out of disk space once in a while, because [18:00:31] oozie logs get pretty big, pretty quickly. [18:00:52] So I just have to clean them up and force safe mode off. [18:02:02] aye [18:03:17] ottomata: whoami in the shell script reports "yarn" [18:03:33] So you were right. [18:03:37] perfect [18:43:30] qchris: per our conversation you and milimetric agreed this change can be merged, correct? https://gerrit.wikimedia.org/r/#/c/153388/ [18:45:49] (PS1) Ottomata: Add option to set icinga reported hostname to send_ok_to_icinga.sh script [analytics/refinery] - https://gerrit.wikimedia.org/r/154109 [18:46:45] (PS2) Ottomata: Add option to set icinga reported hostname to send_ok_to_icinga.sh script [analytics/refinery] - https://gerrit.wikimedia.org/r/154109 [18:47:18] (CR) Ottomata: [C: 2 V: 2] Add option to set icinga reported hostname to send_ok_to_icinga.sh script [analytics/refinery] - https://gerrit.wikimedia.org/r/154109 (owner: Ottomata) [18:47:26] nuria: Yes. [18:48:02] ok, ottomata: would you be so kind as to merge this change, it is reday to go: https://gerrit.wikimedia.org/r/#/c/153388/ [18:48:06] *ready [18:48:32] somebody want to +1 it first ? [18:53:59] ah , sorry i will [18:54:39] ottomata: done [19:38:51] milimetric: if someone has a one off data request, should I ask them to email the analytics public mailing list about it? 
[19:38:57] see the bottom of this bug [19:38:57] https://bugzilla.wikimedia.org/show_bug.cgi?id=69277 [19:43:47] (Abandoned) Ottomata: Add shell actions for triggering an icinga passive check via send_nsca [analytics/refinery] - https://gerrit.wikimedia.org/r/151957 (owner: Ottomata) [19:53:39] (PS1) Yuvipanda: Use 'published' instead of stars for self queries [analytics/quarry/web] - https://gerrit.wikimedia.org/r/154122 [19:53:47] (CR) jenkins-bot: [V: -1] Use 'published' instead of stars for self queries [analytics/quarry/web] - https://gerrit.wikimedia.org/r/154122 (owner: Yuvipanda) [19:58:33] (PS2) Yuvipanda: Use 'published' instead of stars for self queries [analytics/quarry/web] - https://gerrit.wikimedia.org/r/154122 [19:59:39] (CR) Yuvipanda: [C: 2] Use 'published' instead of stars for self queries [analytics/quarry/web] - https://gerrit.wikimedia.org/r/154122 (owner: Yuvipanda) [19:59:44] (Merged) jenkins-bot: Use 'published' instead of stars for self queries [analytics/quarry/web] - https://gerrit.wikimedia.org/r/154122 (owner: Yuvipanda) [20:26:51] ottomata: I've run into another package issue with R, any chance you can get to https://rt.wikimedia.org/Ticket/Display.html?id=8057 this week? [20:28:38] leila, not likely this week, but next! [20:29:08] ooki. thanks for letting me know. me goes digging some other way out of the current block. 
:D [20:33:29] Analytics / Wikimetrics: Story:d WikimetricsUser runs 'Recurring old active editors' report - https://bugzilla.wikimedia.org/69569 (Kevin Leduc) NEW p:Unprio s:enhanc a:None https://meta.wikimedia.org/wiki/Research:Rolling_monthly_active_editor#Recurring_old_active_editors [20:37:26] Analytics / Wikimetrics: Story:d WikimetricsUser runs 'Recurring old active editors' report - https://bugzilla.wikimedia.org/69569#c1 (Kevin Leduc) p:Unprio>High Collaborative tasking of feature will be in etherpad: http://etherpad.wikimedia.org/p/analytics-69569 [20:38:07] qchris: WOOO [20:38:07] https://icinga.wikimedia.org/cgi-bin/icinga/status.cgi?search_string=hive_partition [20:38:16] I like the url \o/ [20:38:21] Let me look at it. [20:38:55] ottomata: Yay \o/ [20:39:02] 1 green already! [21:05:31] laters all! [21:11:17] Ouch ... ottomata you missed the first partition go green :-/ [21:43:54] Ironholds: where do you run Hive stuff these days? [21:44:12] analytics1026 won't let me in and 1011 has some weird logstash error [21:44:57] or leila might know this too ^ [21:45:16] stat1002 milimetric [21:45:40] hive is running there? :) [21:45:45] yup [21:45:50] just type hive and it will show up [21:46:15] woa... [21:46:16] :) [21:47:39] milimetric, stat2 [21:47:43] yep [21:47:48] I begged people into installing a client there [21:47:53] and also installing a mysql client [21:47:58] and now I have my own personal sandbox :D [21:48:04] lovely, this is great [21:48:05] YuviPanda|brb, https://gerrit.wikimedia.org/r/#/c/154200/