[07:24:59] https://phabricator.wikimedia.org/T102949#1386542
[07:37:46] I would like to do a battery learning cycle on db1047, does anyone know the researchers doing queries there?
[08:10:07] profiling db1018
[12:45:26] regular maintenance on es1001
[19:04:14] regular maintenance on es1002
[20:35:24] I will have fun with T103417 tomorrow, as I know the background, but feel free to step in if you feel like it
[20:36:09] there have been issues with pc1003, probably related to a mass expiration of items 1 month after maintenance
[20:36:29] it may happen to pc1001 and pc1002 too, but only once
[20:36:48] probably something to report to mediawiki
[20:38:14] es1002 is depooled. nothing wrong with it, but too late to commit on tin and check for potential mistakes
[20:38:33] will do tomorrow, the throughput is low
[20:39:36] one last thing- P_S was very helpful to debug the pc1003 issue. I will leave it at that.
[20:49:02] I have also temporarily disabled the lag alert on 1047- see the corresponding ticket
[20:50:58] I will leave the slow log 1/50 rate on db1008 running in a screen session; the idea is to capture 2G of one day's queries
[23:18:44] you really mean db1008? the frack box...
[23:30:19] Hi!
[23:30:22] I'm trying to get the enwiki revision table into HDFS. Sqoop seems to be too slow for this, so I came up with the following hacky way of first mysqldumping the table, then doing some simple processing on the plain text, and finally copying it to HDFS:
[23:30:54] mysqldump -u research -p -h analytics-store.eqiad.wmnet \
[23:31:00] --no-create-db --no-create-info \
[23:31:08] --single-transaction --quick \
[23:31:15] --max_allowed_packet=512000000 \
[23:31:21] enwiki revision \
[23:31:27] | sed 's/),(/\n/g'
[23:31:34] This has worked before, but now that I wanted to redo it with the freshest revision table (we're about to run real user experiments with Leila and Ellery, and we'd like up-to-date data for this), I got a timeout error:
[23:31:42] west1@stat1002:~/wikimedia/trunk/missing_articles/src/main/bash$ bash dump_revision_db.sh mysqldump: Error 2013: Lost connection to MySQL server during query when dumping table `revision` at row: 105679747
[23:31:58] It seems that increasing the net_write_timeout and net_read_timeout parameters on the mysql server is the way to go, but I (obviously) don't have the privileges to do so.
[23:32:12] Does anyone have an idea how i could manage to dump the table into a text file efficiently?
[23:39:10] bobwest: is the disconnection issue reproducible?
[23:39:56] the box is under some load at present with other research queries; it should not drop connections, regardless, but stranger things have happened...
[23:41:09] it happened several times in a row, and i tried over the course of about half a day
[23:44:25] bobwest: trying that ^ locally on the server...
[23:45:09] thanks @springle!
[23:45:30] just to confirm: are you saying you're trying it yourself, or should i also try something?
[23:45:48] sorry. i am trying it :)
[23:45:59] got it, thx
[23:46:25] is your max_allowed_packet setting important for something else in the chain to HDFS?
[23:46:38] or can it be default (16M)?
[23:46:55] that was just one thing i tried to fix it, but the error happens with and w/o it
[23:46:59] ok
[23:47:02] btw, it might take a while (around 30 min to an hour) till it fails
[23:55:06] someone has opened 80+ concurrent connections to analytics-store reading chunks of enwiki revision
[23:55:14] no wonder the box is struggling
[23:56:10] from analytics10xx.eqiad.wmnet clients
[23:56:52] all hadoop workers
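A few annotations on the log above. The "slow log 1/50 rate" from the 20:50:58 entry presumably means sampled slow-query logging; on MariaDB and Percona Server builds this is the log_slow_rate_limit extension (stock MySQL has no sampling knob). A rough sketch of equivalent settings, with values assumed rather than taken from db1008:

    # hypothetical sampled slow log: treat every query as "slow",
    # but only log 1 in 50 of them to keep the file manageable
    mysql -h db1008 -e "
      SET GLOBAL slow_query_log      = 1;
      SET GLOBAL long_query_time     = 0;
      SET GLOBAL log_slow_rate_limit = 50;"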
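The "Lost connection to MySQL server during query" (error 2013) that kills the dump midway is typically the server-side net_write_timeout expiring while the client is slow to drain the result stream, which is why fixing it needs server privileges, as bobwest notes. A minimal sketch of the DBA-side change, with illustrative values rather than whatever was actually set on analytics-store:

    # run with a privileged account; the defaults are 60s (write) and 30s (read)
    mysql -h analytics-store.eqiad.wmnet -e "
      SET GLOBAL net_write_timeout = 600;
      SET GLOBAL net_read_timeout  = 600;"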
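Without those privileges, one client-side workaround is to chunk the dump so that no single SELECT streams long enough to hit the timeout. A hypothetical variant of bobwest's command using mysqldump's --where option to split on rev_id; CHUNK and MAX_REV are invented values, and credentials are assumed to live in ~/.my.cnf so the loop runs unattended:

    # dump enwiki.revision in rev_id ranges; each chunk is a short-lived query
    CHUNK=10000000
    MAX_REV=670000000   # placeholder; use SELECT MAX(rev_id) for the real bound
    for ((lo = 0; lo < MAX_REV; lo += CHUNK)); do
      mysqldump -h analytics-store.eqiad.wmnet \
        --no-create-db --no-create-info \
        --single-transaction --quick \
        --where="rev_id >= $lo AND rev_id < $((lo + CHUNK))" \
        enwiki revision \
        | sed 's/),(/\n/g' >> revision_dump.txt
    done

The trade-off is consistency: each iteration opens its own transaction, so the chunks do not come from a single snapshot of the table.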
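Finally, the 80+ connections from the analytics10xx Hadoop workers are the signature of a Sqoop import running with a large mapper count, since each mapper opens its own MySQL connection to read one slice of the table. If Sqoop is retried despite the earlier speed complaint, bounding the parallelism keeps the source box responsive; a sketch assuming Sqoop 1.x, with the mapper count and target directory as illustrative guesses:

    # --num-mappers directly bounds concurrent connections to the source DB
    sqoop import \
      --connect jdbc:mysql://analytics-store.eqiad.wmnet/enwiki \
      --username research -P \
      --table revision \
      --split-by rev_id \
      --num-mappers 8 \
      --target-dir /user/west1/enwiki_revision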