[00:00:21] Yes. New feature request: Collapsible query threads for sequential runs under the same title.
[00:00:42] Hmm... This is *not* essential for the hackathon though.
[00:01:34] This list looks like a history page. I think it will work well for people.
[00:03:33] halfak: yeah, except if you click on any link it'll just take you to the latest one
[00:03:43] halfak: I could make it take them to the exact revision...
[00:04:03] I think that collapsing them with the option of seeing revisions is a good idea.
[00:04:07] right
[00:04:16] e.g. the "Page" is the most recent "Revision".
[00:04:19] right
[00:04:32] and also paginate. it currently taps out at 25 and then does nothing
[00:04:49] * halfak didn't even notice.
[00:04:50] lol
[00:04:52] halfak: I should also put a 'history' thing on each page. backend is set up for that, just need to expose it
[00:04:59] halfak: hehe :)
[00:06:15] Gotta run. I'll be on tomorrow if you want to stress test.
[00:06:16] o/
[00:06:24] halfak: will do. cya!
[12:37:41] Hey YuviPanda
[12:37:46] hi halfak
[12:37:49] Interested in stress testing?
[12:38:33] halfak: yeah, in about 5min? I'm testing one final thing
[12:39:49] kk
[12:44:23] halfak: alright, good to go now
[12:44:50] OK. So I'm thinking that I'd like to start ~ 10 queries. I'll need different user accounts, right?
[12:44:59] halfak: no, you can use the same account
[12:45:04] halfak: there's no per-user limit
[12:45:05] atm
[12:45:05] OK
[12:47:32] halfak: tell me when you've started?
[12:47:36] OK. Here we go.
[12:47:38] I'm seeing almost zero load now :)
[12:47:39] whee
[12:48:28] "queued" == running?
[12:48:45] halfak: no, just... queued. not running
[12:48:55] Oh. Well none of them are running.
[12:49:01] hmm
[12:49:06] All queued
[12:49:25] halfak: ah, right, moment.
[12:50:00] halfak: should've started running now
[12:50:04] puppet oversight on my part, fixing
[12:50:19] halfak: 'Column 'rev_id' in on clause is ambiguous'
[12:51:25] "Lost connection to MySQL server during query"
[12:52:13] halfak: is that happening to every query?
[12:52:40] halfak: looks like it
[12:52:41] hmm
[12:52:49] Yup
[12:53:13] These queries are designed to be a pain in the ass to run.
[12:53:32] They'd finish on the analytics-store, but it might take a couple of days and a huge amount of temp table space.
[12:53:32] halfak: are 'normal' queries working out alright?
[12:54:13] yeah
[12:54:17] http://quarry.wmflabs.org/query/29
[12:54:56] * halfak wishes the URL was quarry.wmflabs.org/halfak/Hyphenated-query-name-like-github
[12:55:30] halfak: heh, we could make it so, but then we'd need a mandatory title
[12:55:55] Would it be too much trouble to use the placeholder title in the meantime?
[12:56:45] Still have a couple of the queries running.
[12:56:54] Can you confirm that the queries that "failed" actually died?
[12:57:15] halfak: not really, but I'd want to use the stackoverflow method of /user//title, so it'll still work if title changes
[12:57:48] halfak: I just tweaked the code, restarted. try now?
[12:58:14] Restart the queries?
[12:59:01] halfak: should have
[13:00:24] halfak: hmm, I see lost connections.
[13:01:17] halfak: it's possible the server kills the query if it is too intense. springle said he's got something like that running
[13:01:24] But would it kill the connection?
[13:01:57] halfak: I think so. it killed the connection when you touched information_schema without a where
[13:02:09] halfak: link to query? I can run it from tools to see if it's killed
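The check being discussed here, re-running a suspect query from tools and seeing how it dies, looks roughly like this. A minimal sketch assuming pymysql and placeholder connection details, not Quarry's code: a server-side kill surfaces as MySQL error 1317 ('Query execution was interrupted'), while a dropped connection surfaces as client error 2013 ('Lost connection to MySQL server during query').

```python
# Sketch: re-run a suspect query and classify how it fails.
# Host and credentials file are placeholders for the usual labs replica setup.
import pymysql

def classify_run(sql):
    conn = pymysql.connect(
        host='enwiki.labsdb',                  # placeholder replica host
        db='enwiki_p',
        read_default_file='~/replica.my.cnf',  # placeholder credentials file
    )
    try:
        with conn.cursor() as cur:
            cur.execute(sql)
            return 'completed', cur.fetchmany(5)
    except pymysql.err.OperationalError as e:
        # args[0] is the MySQL error number:
        #   1317 = killed server-side ('Query execution was interrupted')
        #   2013 = 'Lost connection to MySQL server during query'
        return {1317: 'killed', 2013: 'connection lost'}.get(e.args[0], 'error'), e
    finally:
        conn.close()
```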
[13:03:16] Here's one that lost connection: http://quarry.wmflabs.org/query/26
[13:03:35] This is the other variant of big query: http://quarry.wmflabs.org/query/23
[13:03:49] halfak: yup
[13:03:50] halfak: err
[13:03:51] no
[13:04:09] halfak: hmm, it's not being killed on toollabs
[13:20:05] halfak: hmm, this is weird :|
[13:20:13] halfak: since it runs with the same user account when run from toollabs
[13:20:15] runs forever, but runs
[13:21:31] So only the connection is getting killed?
[13:22:01] halfak: it is, but not by quarry
[13:23:08] halfak: can you try a query that should take a long time to complete, but doesn't send back as much data?
[13:25:02] Sure!
[13:28:17] http://quarry.wmflabs.org/query/25
[13:28:24] It should return 28 rows.
[13:28:33] Whoops. Need to fix a thing.
[13:28:45] Done
[13:30:32] halfak: not killed yet, I think...
[13:32:47] halfak: how long do you expect this to run, btw?
[13:33:31] 72 hours?
[13:33:37] maybe 24
[13:33:53] halfak: ah, so will get killed after 10m
[13:34:02] in theory...
[13:34:20] halfak: yeah
[13:34:20] 4 more min
[13:34:24] halfak: ok
[13:38:53] halfak: got killed by the killer
[13:39:22] Still says "running" in the Query Runs list
[13:39:40] halfak: it has a 5s delay there. still saying 'running'?
[13:42:36] Still says "running"
[13:42:36] halfak: It says 'killed' on the query page, but the runs page is inconsistent
[13:42:49] Yup. I see that too.
[13:43:43] halfak: I'll fix that tomorrow. also, I'm going to temporarily decrease the timeout to 1m
[13:44:33] That query hit caused a lot of load?
[13:45:32] halfak: not that I could see, no.
[13:45:59] Why reduce the timeout?
[13:46:26] halfak: to debug
[13:46:40] halfak: the 'lost connection' thing, at least.
[13:46:54] halfak: http://ganglia.wikimedia.org/latest/?r=hour&cs=&ce=&s=by+name&c=MySQL%2520eqiad&tab=m&vn=&hide-hf=false has labsdb db loads
[13:47:46] gotcha
[13:48:12] halfak: am running http://quarry.wmflabs.org/query/30 now, to see if it has lost connections
[13:48:23] halfak: the lost connections might also have been a transient network failure, but hopefully not
[13:48:43] halfak: btw, no appreciable load increase anywhere so far, though.
[13:49:59] halfak: hmm, I don't see the connection lost issue anymore
[13:49:59] cool
[13:50:03] Weird.
[13:50:32] halfak: I'm increasing the timeout back to 10m now
[13:50:53] Do you think it was the lack of "where" causing it before?
[13:51:13] halfak: might be, springle can confirm later when he's up
[13:51:24] * halfak is trying one without where
[13:52:04] halfak: I'm restarting the daemon, might be a minute before it starts running
[13:52:31] halfak: back up now. hit run again?
[13:52:45] done
[13:54:46] halfak: hmm, so far no connections lost
[13:57:47] halfak: found an unrelated bug that causes things to crash if mysql returns a decimal. fixing, but the large query seems to be still running
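For context on the decimal crash: Python's json module cannot serialize decimal.Decimal, which MySQL drivers return for DECIMAL columns and some aggregates. A minimal sketch of the usual fix, assuming result rows are dumped with json.dumps; illustrative only, not Quarry's actual code.

```python
import json
from decimal import Decimal

def to_jsonable(value):
    # MySQL drivers hand back decimal.Decimal for DECIMAL columns and
    # some aggregates; plain json.dumps raises TypeError on those.
    if isinstance(value, Decimal):
        return float(value)  # or str(value), if precision must survive
    raise TypeError('Not JSON serializable: %r' % (value,))

rows = [(1, Decimal('3.14'))]                 # e.g. one fetched result row
print(json.dumps(rows, default=to_jsonable))  # -> [[1, 3.14]]
```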
[14:02:53] halfak: it got killed after 10 minutes
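For reference, the usual client-side shape of a timeout killer like the one above: a watchdog thread opens a second connection and issues KILL QUERY against the first connection's thread id. A rough sketch assuming pymysql; this shows the general technique, not Quarry's implementation.

```python
import threading
import pymysql

def run_with_timeout(sql, timeout=600, **conn_kwargs):
    # Main connection; remember its server-side thread id so it can be killed.
    conn = pymysql.connect(**conn_kwargs)
    thread_id = conn.thread_id()

    def kill():
        # KILL has to arrive over a *second* connection, because the
        # first one is blocked inside the long-running query.
        killer = pymysql.connect(**conn_kwargs)
        try:
            with killer.cursor() as cur:
                cur.execute('KILL QUERY %d' % thread_id)
        finally:
            killer.close()

    watchdog = threading.Timer(timeout, kill)
    watchdog.start()
    try:
        with conn.cursor() as cur:
            cur.execute(sql)  # raises OperationalError 1317 if killed
            return cur.fetchall()
    finally:
        watchdog.cancel()
        conn.close()
```

If the watchdog fires first, the blocked execute() fails with error 1317, the same 'execution was interrupted' error that turns up later in this log.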
[14:03:54] halfak: run 5 of those? :)
[14:04:12] What's the "0.0020s" after "Query Killed!"?
[14:04:28] halfak: it's my time module going bonkers, should report 600s
[14:04:37] I need to fix that as well, unsure why it's reporting a ridiculous time
[14:05:38] halfak: I saw a lost connection again :(
[14:05:40] Lost connection http://quarry.wmflabs.org/query/23
[14:05:43] two
[14:05:52] I wonder if it is from running the same query multiple times.
[14:06:07] that shouldn't be a problem
[14:06:50] halfak: no lost connection at http://quarry.wmflabs.org/query/38
[14:06:55] halfak: where I just added a limit
[14:07:16] halfak: it's possible this is springle's work killing these queries when they look 'obviously' bad
[14:08:26] halfak: http://quarry.wmflabs.org/query/39 was killed instantly with the lost connection, for example
[14:08:49] hmm, I am submitting it again, and now *not* instantly killed
[14:08:49] wat
[14:08:57] Was it killed or was the connection dropped and the query allowed to run?
[14:10:13] halfak: it was properly killed, but I see a lot of 'sleep's in there now
[14:11:19] kk
[14:11:58] * halfak should not use the word "ridiculous" when reporting bugs.
[14:12:01] sorry about that
[14:12:07] halfak: it is ridiculous
[14:12:28] halfak: hey, I'm going to step away for 30m to get a shave, I'll be back. sorry to leave in the middle, but do hit it with queries?
[14:12:34] still better than "janky"
[14:12:40] No worries.
[14:12:52] * halfak goes to play with ferrets
[14:12:56] :>
[15:46:54] halfak: I might've fixed the 'server has gone away' thing
[15:46:59] * YuviPanda crosses fingers
[15:47:11] woot!
[15:48:41] halfak: can you start a bunch more of 'em?
[15:48:55] Sure
[15:50:40] halfak: so far nothing!
[15:52:31] A few will return the whole revision table and a couple will run forever but only return a couple of rows.
[15:53:38] halfak: bah, I was wrong :( it's still losing connection, but at least this one is just losing it *during* a query, and not before (which was also happening before)
[15:56:30] halfak: ok, so only one lost connection (the other 'execution was interrupted' was just me)
[15:57:12] * halfak notices Southparkfan
[15:59:05] halfak: heh, yeah, testing the sanitization :)
[16:03:03] halfak: ok, I'm going to copy-paste a 'select * from page' about 10 times and run them, can you do something like that as well?
[16:04:28] * YuviPanda waits
[16:05:24] hmm, 2 lost connections
[16:05:28] 3
[16:09:36] halfak: yeah, mostly because I've not implemented it yet
[16:09:43] halfak: I'll probably get rid of that button until I implement it
[16:10:02] Which button?
[16:11:28] halfak: 'download CSV'
[16:11:33] halfak: oh, Southparkfan reported it
[16:11:34] halfak: not you
[16:11:35] niiiice
[16:19:40] halfak: link me to your query that runs for a long time but doesn't return a lot of data?
[16:22:48] * YuviPanda goes to eat
[16:35:26] YuviPanda, http://quarry.wmflabs.org/query/59
[17:07:49] halfak: hmm, bah. sigh. this is just very... intermittent and what not
[17:08:33] I wonder if we can get springle to help us debug at the Hackathon
[17:08:51] halfak: yeah, definitely
[17:08:56] halfak: 'regular' queries seem fine
[17:09:54] halfak: I ran a 'select * from page limit 10' and all 10 succeeded
[17:10:01] halfak: do you have a query that'll be 'useful' but finishes in a few minutes?
[17:10:36] I can work one out quick.
[17:11:17] halfak: \o/
[17:11:44] SELECT command denied to user 'u2029'@'10.68.17.213' for table 'logging'
[17:11:54] n/m
[17:12:00] I was referencing "enwiki" as the DB name
[17:12:05] Should be "enwiki_p"
[17:12:28] What's the row limit you set?
[17:13:01] http://quarry.wmflabs.org/query/59
[17:13:13] halfak: I didn't set one
[17:13:16] kk
[17:13:26] This one should return about 4 or 5k
[17:13:45] It's one we use in wikimetrics
[17:16:57] halfak: http://quarry.wmflabs.org/query/50 nice
[17:17:02] halfak: locks up the browser a bit, tho :)
[17:17:20] halfak: I think the server's fine on this, and the client crashes because browsers are too meh.
[17:18:18] Probably need to 'head' the result-set
[17:21:58] halfak: yeah
[17:23:19] halfak: ok, so the queries *all* were fine
[17:23:31] halfak: none of them dropped out, so yay
[17:27:33] :)
[17:28:53] halfak: I need to get graphite set up for better monitoring, but that's the *other* opsy project I took time away from to do this. oh well :)
[17:31:07] I just saw my result set for "Newly registered users". I can officially say that this works the way I'd expect.
[17:31:15] (short of "Download CSV")
[17:31:20] halfak: right. wheee
[17:31:29] Also, that should be "Download TSV"
[17:31:36] halfak: ah, I got that from Ironholds
[17:31:37] why
[17:32:01] If there's anything that the TAB character is for, it's acting as a delimiter for table data.
[17:32:17] hmm
[17:32:24] TSV files are the same size as CSV files, but they are much easier for humans to read.
[17:32:29] hmm
[17:32:31] that's true
[17:32:32] MySQL natively reads and writes TSVs
[17:32:38] That's about it.
[17:32:51] the native format for quarry is JSON, and I'll have to write a converter.
[17:33:00] JSON would be OK too.
[17:33:14] JSON is the standard format for young devs these days.
[17:33:27] Will Excel understand JSON rows?
[17:33:30] halfak: I'm unsure how exactly to implement HEAD, though. I'll have to do server-side pagination, and that's going to be a bit painful
[17:33:58] One option (which would reduce browser issues) is to HEAD on the client side
[17:34:05] halfak: true
[17:34:11] So the client still downloads it all, but it will stop displaying after a few rows.
[17:34:26] halfak: I could also generate CSV client-side to start with
[17:34:26] You could do an infinite scroll, but that sounds like a lot of code to manage at this point.
[17:34:32] +1
[17:34:34] halfak: ah, I have a WIP patch that does datatables
[17:34:48] halfak: which actually was better performance-wise, since it can do deferred DOM rendering
[17:35:15] halfak: www.datatables.net/examples/index for datatables
[17:38:12] oooh
[17:38:14] shiney
[17:38:28] halfak: yeaaah
[17:38:48] halfak: had some weird issue where the data won't show up until I open the Chrome inspector, which is a weird-as-fuck bug
[17:46:01] halfak: so, the query killer is fairly good too. even if I kill the worker by doing a 'celeryd stop', it kills running queries with KILL QUERY
[17:46:02] so yay
[17:46:55] Nice!
[17:47:11] I think that zombies are our greatest threat.
[17:47:23] ^ You can quote me on that.
[17:47:29] Because it sounds awesome.
[17:48:08] halfak: :D
[17:48:30] Just finished 0.1.0 of mwevents.
[17:48:31] https://github.com/halfak/MediaWiki-events
[17:48:35] * halfak stretches.
[17:49:12] It can listen to the API right now. Generates the full list of events described here: https://meta.wikimedia.org/wiki/Research:Ideas/MediaWiki_events:_a_generalized_public_event_datasource
[17:49:27] * YuviPanda clicks
[17:50:00] halfak: nice! does it poll the API?
[17:50:40] It does. There's a nice timeout on it. It will rccontinue as fast as it can until it starts getting fewer than the expected number of results -- then it backs off.
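A rough sketch of the polling strategy described above: page through list=recentchanges with API continuation as fast as full batches keep coming, and back off once a batch comes back short. Assumes the requests library and a hypothetical target wiki; mwevents' actual code will differ.

```python
import time
import requests

API = 'https://en.wikipedia.org/w/api.php'  # hypothetical target wiki

def recent_changes(expected=500, min_delay=1.0, max_delay=60.0):
    """Yield recent changes forever, backing off once we catch up."""
    params = {'action': 'query', 'list': 'recentchanges',
              'rclimit': expected, 'format': 'json', 'continue': ''}
    delay = min_delay
    while True:
        data = requests.get(API, params=params).json()
        batch = data['query']['recentchanges']
        for change in batch:
            yield change
        params.update(data.get('continue', {}))
        # A short batch means we have caught up with the wiki: back off.
        delay = min_delay if len(batch) >= expected else min(delay * 2, max_delay)
        time.sleep(delay)
```

A real poller would also track rcstart/rcdir timestamps so a fully caught-up loop doesn't replay the same continuation window; this only shows the backoff shape.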
[17:50:51] I want to write a "source" for RCStream too.
[17:50:55] Also the database.
[17:51:06] Still needs query-ers for historical events.
[17:51:42] halfak: yeah, historical is going to be problematic/fun
[17:51:49] That's going to be an interesting bit of work. Will build in all of the stuff at https://meta.wikimedia.org/wiki/Research:Wiki_archaeology
[17:52:08] Hopefully, through this process, I can render that page irrelevant.
[17:53:01] Ok. Time to go have a Sunday. See you in like 48 hours, dude.
[17:53:03] o/
[17:53:05] YuviPanda, ^
[17:53:31] halfak: \o/ yeah
[17:53:41] halfak: I'll go around fixing things till then
[18:05:55] wm-bot4: do you do any tricks?
[18:06:15] Emufarmers: are you a new dog?
[18:06:32] I was hoping wm-bot4 would fill that role
[18:06:48] Emufarmers: oh, I thought you wanted it to teach you