[00:02:40] gwicke: fyi https://bugzilla.wikimedia.org/show_bug.cgi?id=58300
[00:03:36] dan-nl, k
[00:03:50] just slip it in if you end up tweaking the config ;)
[00:21:53] (CR) Legoktm: "Is there a reason the same grrrit-wm account can't be used?" [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/87663 (owner: Nemo bis)
[00:22:32] (CR) Legoktm: "Oh ignore me, should have read the bug first." [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/87663 (owner: Nemo bis)
[03:56:21] Is it deliberate that some of the Beta Labs non-Wikipedia wikis redirect to Wikipedia?
[03:56:33] E.g. http://de.wikivoyage.beta.wmflabs.org redirects to http://de.wikipedia.beta.wmflabs.org/wiki/Wikipedia:Hauptseite
[04:04:14] superm401: i don't think so, shouldn't be
[04:04:22] Okay, will file.
[04:04:32] it should do what the apache-config repo does in prod.. hmm
[04:05:04] but maybe that's the problem to emulate with .beta. .. all those redirects
[04:05:05] I actually think it might not be 100% of the time.
[04:05:09] yea
[04:05:17] I think I saw it not happen in the browser.
[04:05:20] But it's consistent in curl.
[04:05:42] superm401: it did to me what you said
[04:05:46] and that doesn't seem right
[04:05:57] Yeah, I don't see how it's useful.
[04:06:07] I can understand the fact that not all the prod versions have Beta versions.
[04:07:23] I can see a config error at http://en.wikipedia.beta.wmflabs.org/w/api.php?action=sitematrix&format=jsonfm
[04:07:35] It says sitename: Wikipedia which seems related.
[04:10:45] https://bugzilla.wikimedia.org/show_bug.cgi?id=58308
[04:10:56] superm401: indeed, seems you're close
[04:11:01] that sitename
[04:13:32] mutante, do you know where I can run mwscript on Beta?
[04:14:20] superm401: sorry no, not really sure which instance it is
[04:16:42] manifests/misc/deployment.pp: source => "puppet:///files/misc/scripts/mwscript";
[04:17:27] superm401: on whatever has the class misc::deployment::common_scripts applied to it
[04:23:10] superm401: I think that deployment-bastion is roughly the equivalent of tin
[04:23:41] bd808, thanks, I'll try that in a second.
[04:30:16] Thanks, bd808, that works.
[04:30:40] yw
[05:41:16] Not very helpful:
[05:41:17] View 'enwiki_p.namespaces' references invalid table(s) or column(s) or function(s) or definer/invoker of view lack rights to use them
[05:41:58] that's not the table you expect
[05:42:03] it's horribly out of date and wrong
[05:42:10] if you want namespaces, use the API
[05:43:42] Why is the namespaces table (/view) even visible then? And what is this view if it's not an enumeration of the namespace names?
[05:44:21] ancient mediawiki relic that is somewhat showing up on labs
[05:44:29] and why doesn't it turn up on my schema diagram?
[05:44:36] because it's so old?
[05:44:38] yes
[05:44:41] it's no longer used
[05:45:09] I will permit you to remove it then, and have it trouble me no more.
[05:45:27] And I'll just 'know' that the namespace I'm after is 5.
[05:46:02] :)
[05:52:38] Seriously? :
[05:52:40] ERROR 1345 (HY000): EXPLAIN/SHOW can not be issued; lacking privileges for underlying table
[05:53:07] there's a bug for that
[05:53:14] https://bugzilla.wikimedia.org/show_bug.cgi?id=48875
[05:54:30] !explain is https://bugzilla.wikimedia.org/show_bug.cgi?id=48875
[05:54:30] Key was added
[05:54:57] !namespaces The namespaces table is not up to date and is a historical table that is no longer used. If you want namespace info, use the API
[05:55:04] !namespaces is The namespaces table is not up to date and is a historical table that is no longer used. If you want namespace info, use the API
[05:55:05] Key was added
[05:58:54] okay, so I can only get through the views
[05:58:56] I recall reading somewhere that there were different views for hitting different indexes - did I recall wrong?
[06:02:52] no, that's correct
[06:02:54] lemme find the docs
[06:03:28] so it's right - isn't mysql smart enough to figure out what indexes to use?
[06:03:34] Josh_Parris: https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Help#Tables_for_revision_or_logging_queries_involving_user_names_and_IDs
[06:03:39] the indexes are on different tables
[06:04:00] er, views I guess
[06:04:01] dunno
[06:04:41] okay, I've got enough to keep me out of your hair for at least five minutes, let's see what I can set on fire
[10:44:55] !log deployment-prep deleted deployment-search01 and deployment-searchidx01, Beta cluster has been migrated to ElasticSearch over the summer.
[10:45:00] Logged the message, Master
[11:30:44] hey hashar thanks for picking up that parsoid bug. it looks like https://gerrit.wikimedia.org/r/#/c/100769/ had an issue deploying the fix to beta though. were you able to resolve it? https://integration.wikimedia.org/ci/job/beta-mediawiki-config-update/1648/console
[11:31:52] hashar: i'm not seeing that parsoid timeout atm, so that's a good sign
[11:34:58] one thing i am curious about is how to know if the /tmp dir is filling up or not. bd808|BUFFER was able to rm unnecessary files and i want to see if gwtoolset is creating them or not. i can get to deployment-bastion, but i believe that is different than what bd808|BUFFER was looking at https://bugzilla.wikimedia.org/show_bug.cgi?id=58299
[12:22:10] dan-nl-afk: re sorry
[12:22:27] error: unable to unlink old 'wikiversions.dat' (Permission denied)
[12:22:28] grrrrr
[12:27:29] filed https://bugzilla.wikimedia.org/show_bug.cgi?id=58325
[12:29:22] !log deployment-prep test
[12:29:23] Logged the message, Master
[12:30:12] !log deployment-prep Migrated old SAL entries to subpage [[Nova_Resource:Deployment-prep/SAL/Archive1]]
[12:36:14] !log deployment-prep test
[12:36:20] labs-morebots: ping
[12:36:21] I am a logbot running on tools-exec-02.
[12:36:21] Messages are logged to wikitech.wikimedia.org/wiki/Server_Admin_Log.
[12:36:21] To log a message, type !log <msg>.
[12:36:28] !log deployment-prep please
[12:36:51] !log deployment-prep Migrated old SAL entries to subpage [[Nova_Resource:Deployment-prep/SAL/Archive1]]
[12:36:52] Logged the message, Master
[12:54:23] hashar, how can i tell if /tmp is filling up on deployment-jobrunner08.pmtpa.wmflabs? and then how can i remove unnecessary files from it?
[12:54:33] is this what i need to look at for available space? http://ganglia.wmflabs.org/latest/graph_all_periods.php?c=deployment-prep&h=deployment-jobrunner08&v=40.407&m=disk_free&r=hour&z=default&jr=&js=&st=1386766411&vl=GB&ti=Disk%20Space%20Available&z=large
[12:54:44] so the job got fixed and mediawiki code got updated
[12:55:06] dan-nl: I have no clue which partition is monitored by ganglia, most probably /dev/vdb
[12:55:26] k, neither do i … is there a way for me to monitor it from ssh?
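An aside on dan-nl's question here: watching a directory over ssh needs nothing beyond standard coreutils. A minimal sketch, assuming /tmp sits on whichever partition df reports for it on the instance (the URL* pattern matches the leaked file names that come up just below):

    # free space on the filesystem that holds /tmp
    df -h /tmp
    # total size of everything currently in /tmp
    du -sh /tmp
    # re-list suspicious files every 30 seconds while a batch runs
    watch -n 30 'ls -lh /tmp/URL* 2>/dev/null'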
[12:55:40] ah https://bugzilla.wikimedia.org/show_bug.cgi?id=58299
[12:55:49] we should put the /tmp somewhere else
[12:56:02] that's the bug i'm referring to
[12:56:06] and also find out what is leaking tmp files
[12:56:16] want to narrow down if gwtoolset is filling up the tmp folder or something else
[12:56:39] but i don't know how to monitor that dir from sshing into deployment-bastion
[12:57:01] well bd808 pasted a list of files that were in /tmp
[12:57:10] most of them appeared yesterday and start with URL0
[12:57:16] then 5 alpha characters
[12:57:34] er URL<6 alpha numerics>
[12:57:53] do you know how i can get to that dir from ssh? right now i'm at dan-nl@deployment-bastion:/tmp$ but i'm not sure if that's the correct dir or not
[12:58:31] ah no
[12:58:35] /tmp is local to each instance
[12:58:47] the instance running most jobs is deployment-jobrunner08.pmtpa.wmflabs
[12:59:09] k, is there a way for me to see what's in that instance's /tmp dir?
[13:00:03] yup ssh to it and ls /tmp ? :-D
[13:00:29] k, will try that now :)
[13:00:46] get your ssh client to use ProxyCommand https://wikitech.wikimedia.org/wiki/Help:Access#Using_ProxyCommand_ssh_option
[13:01:13] that makes ssh treat connections from your host to *.pmtpa.wmflabs as proxied via bastion.wmflabs.org
[13:01:15] hugeeee time saver
[13:01:17] k, so now i'm at dan-nl@deployment-jobrunner08:/tmp$
[13:01:36] there are four files right now
[13:01:39] that should be all then, yes
[13:01:52] made them readable
[13:02:06] ahh 12/15M doh
[13:02:27] they are TIFF image data
[13:02:33] so might have been imported via gwtoolset
[13:02:44] no tiffs as far as i know only jpegs
[13:02:59] root@deployment-jobrunner08:/tmp# file URL*
[13:02:59] URL4du6bo: TIFF image data, little-endian
[13:03:00] URL7FUmuR: TIFF image data, little-endian
[13:03:05] and they were only recently created … haven't been testing yet
[13:03:10] might be something else like ProofreadPage
[13:03:21] yeah all the other ones got deleted by bd808 yesterday
[13:03:41] cool, at least i can now monitor the dir while i test and see if gwtoolset is creating unnecessary tmp files
[13:03:51] thanks!
[13:04:18] yeah you can also use mediawiki debug log groups
[13:04:57] ja, will take a look at that a bit later … first need to make sure our tests work. we haven't had any issues locally or on our labs instance
[13:06:03] we were able to finally figure out the issue based on the job log, but it wasn't clear to me that the tmp dir was getting filled up … didn't know how to monitor it
[13:06:09] but now i do so that's good
[13:07:08] also, no longer seeing that parsoid timeout in the run jobs log. thanks for that patch
[13:11:00] cli.log-20131211.gz:commonswiki-2984321a: 12.0852 36.0M MimeMagic::doGuessMimeType: analyzing head and tail of /tmp/URL624CgF for magic numbers.
[13:11:01] cli.log-20131211.gz:commonswiki-2984321a: 12.0881 36.0M MimeMagic::doGuessMimeType: getimagesize detected /tmp/URL624CgF as image/jpeg
[13:11:01] ahh
[13:12:13] does that shed some light on something?
[13:12:27] yeah that is the files that are being leaked on jobrunner08
[13:13:00] something is not deleting them properly
[13:13:01] i don't see that file in /tmp atm
[13:13:09] yeah got deleted
[13:13:24] k, so i know that gwtoolset is using that method
[13:13:32] commonswiki-55a91c6c: 36.9431 45.0M FSFile::getProps: /tmp/URL4du6bo loaded, 15014788 bytes, image/tiff.
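The ProxyCommand setup hashar points at is a stanza in ~/.ssh/config; the exact recipe on the linked wikitech page may differ from this minimal sketch, which assumes OpenSSH 5.4 or newer for the -W flag and uses YOURUSER as a placeholder:

    # ~/.ssh/config
    Host *.pmtpa.wmflabs
        User YOURUSER
        # tunnel through the labs bastion to reach internal instances
        ProxyCommand ssh -a -W %h:%p YOURUSER@bastion.wmflabs.org

With that in place, "ssh deployment-jobrunner08.pmtpa.wmflabs" connects directly from the local machine instead of requiring a manual hop through the bastion.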
[13:13:32] :D
[13:13:50] the beginning "commonswiki-55a91c6c:" is some random ID
[13:14:23] so i can grep for it in the cli.log file: zgrep commonswiki-55a91c6c /data/project/logs/archive/cli.log-20131211.gz
[13:14:55] investigating
[13:15:04] hmm … is doGuessMimeType creating these tmp files?
[13:15:27] I think it is the Tiff paging system that is leaking them
[13:15:40] when a tiff is a multi-page document, we have to extract the page requested
[13:15:43] and serve it to the user
[13:15:47] apparently that is left on disk
[13:15:59] will look at mw code ;-]
[13:17:01] cool
[13:28:32] as an fyi, i just ran a small batch job and it didn't leave any files behind in /tmp. the batch also ran successfully
[13:36:24] ah http://commons.wikimedia.beta.wmflabs.org/wiki/File:Rhododendron_lutescens_Franch.-E_-_Royal_Botanic_Garden_Living_Collection_-_19913373.jpeg
[13:36:37] that has the category Category:GWToolset_Batch_Upload
[13:36:59] the question now is what is creating the /tmp/URL*** files :-D
[13:39:01] k, so i kicked off another batch run of ~3000 items
[13:39:20] i see the URL tmp files getting generated
[13:39:30] but they're also being cleared
[13:40:47] i know that uploadbyurl uses the tmp dir
[13:40:56] but gwtoolset also uses mimemagic
[13:40:56] includes/upload/UploadBase.php: $file = $stash->stashFile( $this->mTempPath, $this->getSourceType() );
[13:40:57] ahhh
[13:41:10] where getSourceType() yields 'URL' whenever it is called by UploadFromUrl
[13:41:40] cool, so that's what is generating those tmp files most likely
[13:41:48] atm it seems to be cleaning up properly
[13:42:26] so then the question is what would make it not clean up the tmp file?
[13:42:47] i imagine if the process gets stopped in the middle for some reason it would leave the tmp file behind
[13:43:06] does that mean that a cleanup script should run regularly on the tmp folder
[13:43:24] yeah
[13:43:32] but I want to make sure the code is actually cleaning the tmp files
[13:43:34] it probably is
[13:43:48] class UploadStashException extends MWException {};
[13:43:48] class UploadStashNotAvailableException extends UploadStashException {};
[13:43:50] class UploadStashFileNotFoundException extends UploadStashException {};
[13:43:51] class UploadStashBadPathException extends UploadStashException {};
[13:43:52] class UploadStashFileException extends UploadStashException {};
[13:43:53] class UploadStashZeroLengthFileException extends UploadStashException {};
[13:43:54] ....
[13:46:15] I give up
[13:46:30] of the three Upload* classes I have seen, each of them processes temp files differently
[13:47:44] so yeah maybe we should use a cronjob against /tmp
[13:54:28] hmm … i can narrow down uploadbyurl to this method
[13:54:42] reallyFetchFile
[13:55:05] but i don't know if that's helpful or not
[13:56:12] i figure something may go wrong and the upload classes don't get a chance to clean up the /tmp dir
[13:56:58] and having a cleanup script would take care of making sure that dir is clean … maybe run with a timestamp of -24 hours
[13:57:31] does such a script already exist?
[14:00:34] i could create a /maintenance/cleanupTmp.php script that might work ...
[14:00:59] would that make sense? don't see anything else in maintenance/ that addresses this yet
[14:02:17] hashar, it looks like that 3000+ batch finished and the /tmp dir is clean again, except for those tmp files we saw earlier
[14:02:32] yeah no clue what might cause the files to leak :(
[14:02:40] sorry haven't read above
[14:02:45] np
[14:03:17] we could get a cronjob that runs once a day on our box and clears /tmp
[14:03:21] might already be the case
[14:05:44] k, i'll see if i can get it written today or tomorrow
[14:08:59] or would a simple shell script suffice for that cronjob?
[14:19:19] not sure
[14:19:27] better to ask folks in #wikimedia-operations
[14:19:35] in prod we might clean /tmp already
[14:22:52] dan-nl: apparently in prod we clear tifs from /tmp
[14:25:48] k, will mention it to greg-g later today … or maybe he'll pick it up from here. possibly add a cronjob shell script that clears those URLxxx files from /tmp after a certain amount of time
[14:29:51] dan-nl: some files are cleaned via a cron which is in the puppet class applicationserver::cron
[14:30:01] example change I made on it a minute ago https://gerrit.wikimedia.org/r/100784
[14:31:36] hmm … so adding a cleanupurl would make sense ...
[14:33:39] probably
[14:35:06] hashar: would you be okay +2'ing this https://gerrit.wikimedia.org/r/#/c/100787/ ?
[14:35:20] looking
[14:35:29] adding aditional
[14:35:35] hehe that is redundant isn't it? :-]
[14:36:11] yes :)
[14:37:40] atm it looks like the delay is not being applied … hoping that getting rid of the + on the timestamp will help and also looking for more info on the exception output
[14:38:21] throw new MWException( print_r( $userSubmittedContent ) );
[14:38:25] that sounds nasty
[14:39:12] ha!
[14:39:27] those params were sanitized earlier
[14:39:53] but if you're not okay with it i can wait for bd808|BUFFER
[14:40:26] dan-nl: posted review on https://gerrit.wikimedia.org/r/#/c/100787/
[14:40:50] I can't remember whether we show the output of MWException
[14:41:18] we most probably show the message though but I have no idea whether the message is properly escaped when displayed
[14:54:30] hey hashar, hopeful that will take care of your concern about the param sanitisation ...
[15:03:58] hashar: does this take care of your concerns https://gerrit.wikimedia.org/r/#/c/100787/
[20:04:39] Hm.. are there any known issues with Apache leaking memory? http://ganglia.wmflabs.org/latest/?r=week&cs=&ce=&c=cvn&h=cvn-apache2
[20:05:04] I moved all non-apache processes to a separate instance for cvn bots. This one is now pure apache and it is still steadily building up memory
[21:24:27] !log wikimania-support Updated scholarship-alpha to 3f38602
[21:24:28] Logged the message, Master
[21:36:31] is it possible to get a backup of my crontab? I just blanked it by mistake
[21:37:31] Coren|Travel: ^
[21:44:59] Betacommand: if there's no other way left try: grep local- /var/log/syslog . It's something :)
[21:46:52] hedonil: we don't have a backup of them somewhere?
[21:47:04] Betacommand: no clue
[21:48:47] Betacommand, https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Help#Backups says: ‘there is a lot of redundancy, but no backups of labs projects beyond the filesystem's time travel feature for short-term disaster recovery.’ and ‘Time travel is not currently enabled’ so I guess there is no crontab backup
[21:50:12] hashar, when you have a moment. the job queue on the beta cluster … it's using redis to run the jobs, correct?
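The cleanup cron discussed above can be a one-liner. A minimal sketch of the "clear URLxxx files older than 24 hours" idea (whether the rule that actually landed in applicationserver::cron looks like this is not confirmed in the log):

    # remove leaked upload temp files older than 24 hours (1440 minutes)
    find /tmp -maxdepth 1 -type f -name 'URL*' -mmin +1440 -delete

Scheduled from a crontab entry, e.g. hourly:

    0 * * * * find /tmp -maxdepth 1 -type f -name 'URL*' -mmin +1440 -delete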
[21:50:41] dan-nl: I think so
[21:50:57] dan-nl: you can confirm by looking at the conf in operations/mediawiki-config.git
[21:51:10] k, will look now ...
[21:51:18] or ask some other folks :-D
[21:51:27] jobqueue-labs.php: 'class' => 'JobQueueRedis',
[21:51:27] jobqueue-labs.php: 'redisServer' => '10.4.0.83', # deployment-redisdb
[21:51:30] dan-nl: ^^^^
[21:51:45] probably has been set up by YuviPanda / AaronSchulz
[21:51:47] ja, that's what i saw earlier ...
[21:51:57] I have exactly ZERO clue how to access the redis instance
[21:52:21] I use the maintenance script showJobs.php, it has an option to dump every single job iirc
[21:52:26] k, for some reason the code i've got in one of the jobs is not returning true to the test delayedJobsEnabled()
[21:52:48] maybe i need to re-write it ...
[21:53:01] do you happen to know anything about that method?
[21:53:22] maybe i'll just set up redis locally and see if i can sort it out ...
[21:54:35] dan-nl: I have no clue sorry :(
[21:54:47] dan-nl: but Aaron or some other people in #wikimedia-dev should be able to help you
[21:54:47] k, no problem … otherwise the jobs are running well … put in a 10169 batch job
[21:54:58] k i'll double-check there
[21:54:58] the job queue has been used by a ton of people from the wmf features teams
[21:55:23] ahh good to know you managed to get files being uploaded in beta :-]
[21:55:36] anything you spot there will be something you don't have to suffer from on production
[21:56:10] dan-nl: have you scheduled the production deployment already ?
[21:56:54] not that i know of …. greg-g mentioned that he was thinking of thursday
[21:57:00] tomorrow
[21:57:03] doh
[21:57:22] but i don't know what the status of that is … greg-g ?
[21:57:26] but then, the museum will only start using it in February, so that leaves some time to polish up the config in prod
[21:57:35] yes i think so
[21:57:44] have you got a chance to use wfDebugLog yet?
[21:57:48] not sure what david has planned
[21:57:57] if not we can poke at it tomorrow if you want
[21:58:13] no, not yet … i've been focusing on fine-tuning the jobs
[21:58:35] i want to make sure they will delay if they reach a certain threshold
[21:59:39] i'd appreciate the help with wfDebugLog tomorrow
[22:00:34] dan-nl: we might have a way to delay them in the job queue itself
[22:00:35] not sure
[22:00:51] if not, it might be a useful feature for other jobs
[22:01:01] like making sure we don't send more than 1000 mails per minute or something
[22:01:06] to avoid killing our mail servers
[22:01:15] something worth talking about on wikitech-l I guess
[22:01:25] there is, with a jobReleaseTimestamp param … which is the one i've been trying to add … but for some reason that param is not yet being added to the job as i expected
[22:01:36] i think it may have to do with the test i put in ...
[22:02:10] http://git.wikimedia.org/blob/mediawiki%2Fextensions%2FGWToolset.git/172ba5fb6308b989674e0c0a88edec06fbbcf3cc/includes%2FJobs%2FUploadMetadataJob.php#L101
[22:02:39] nasty :-]
[22:02:44] aaron and i had discussed this approach, but it doesn't seem to be working
[22:02:47] I have no idea how it works honestly :(
[22:02:56] no worries, i'll ask in -dev
[22:03:11] maybe delayedJobsEnabled never got used anywhere yet :(
[22:04:01] andrewbogott: could you give user "gage" full access?
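showJobs.php, which hashar mentions above, lives in MediaWiki's maintenance/ directory and runs on beta through the mwscript wrapper discussed earlier in the day. A sketch of the likely invocations (option names are from memory of MediaWiki of this era; --list in particular may not exist in older checkouts):

    # per-type summary of queued jobs
    mwscript showJobs.php --wiki=commonswiki --group
    # dump every single queued job, the option hashar describes
    mwscript showJobs.php --wiki=commonswiki --list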
[22:04:04] that method sets checkDelay, which apparently is only used for the Redis job queue
[22:04:08] andrewbogott: it's been created now
[22:04:18] dan-nl: so you would have to set up a Redis server if testing locally :/
[22:04:38] yes, exactly … would need to set it up, etc
[22:05:24] i have another idea about how to test for whether or not delayedJobsEnabled is true, but it's less desirable
[22:07:29] hashar, do you want me to review https://gerrit.wikimedia.org/r/#/c/100573/1?
[22:08:06] It affects a script I'm testing on Beta, though it might not be a blocker.
[22:08:22] superm401: I was talking about it with Reedy in some channel but can't find it anymore grr
[22:08:34] but yeah feel free to review / merge
[22:08:38] poke reedy about it as well
[22:08:44] I think I have quit #wikimedia-dev by mistake
[22:09:13] follow up there
[22:09:23] dan-nl: have a good night, poke me tomorrow about debug log group :-]
[22:09:39] thanks hashar you as well
[22:13:45] Change on mediawiki a page Wikimedia Labs/Tool Labs/List of Toolserver Tools was modified, changed by Tfinc link https://www.mediawiki.org/w/index.php?diff=840661 edit summary: making table sortable /* Active Tools on the Toolserver */
[22:34:39] mutante: Gage has shell access in labs now. I'm preparing a patch for admins.pp
[22:34:58] andrewbogott: i added him to "ops" on formey .. LDAP group already
[22:35:13] andrewbogott: ooh, i was about to do that, but sure :) nice
[22:35:23] I have a script I want to try :)
[22:35:27] cool
[22:35:36] that checks the latest UID i bet
[22:38:49] It pulls their current UID from labs/ldap to stay in sync
[22:42:21] nice!
[22:43:07] andrewbogott: what shell user name are you giving him
[22:43:27] mutante: theoretically he chose one when he created his labs account...
[22:44:16] andrewbogott: ok
[23:13:43] flow team … fyi, beta cluster just threw this error
[23:13:44] 2013-12-11 23:08:59 deployment-apache33 enwiki: [33c32b9a] /w/index.php?title=Talk:Flow_QA Exception from line 188 of /data/project/apache/common-local/php-master/includes/db/LBFactory_Multi.php: LBFactory_Multi::newExternalLB: Unknown cluster "extension1"
[23:55:55] All my API requests on beta labs are returning 503 errors
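Circling back to the earlier "exactly ZERO clue how to access the redis instance" point: redis-cli from any instance that can reach deployment-redisdb would do for exploration. A rough sketch (the IP comes from jobqueue-labs.php above; JobQueueRedis key names are an internal, version-dependent detail, so the KEYS pattern is only a guess to explore with, not a documented interface):

    # liveness and basic stats
    redis-cli -h 10.4.0.83 info
    # look for job-queue-related keys; the layout is an implementation detail
    redis-cli -h 10.4.0.83 keys '*jobqueue*'

For testing delayedJobsEnabled() locally, the analogous setup would be a local redis-server with $wgJobTypeConf pointed at it the same way jobqueue-labs.php does.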