[00:03:26] Betacommand: Kinda.
[00:04:16] Coren: can you poke someone to fix https://bugzilla.wikimedia.org/show_bug.cgi?id=41345 ? the best thing would be to add a column to the db
[00:08:11] I can poke around, but I don't get to set dev priorities. :-)
[00:09:00] Coren: it's a simple fix, they just don't want to make the change to the interwiki links table
[00:09:39] that is holding back quite a bit
[00:10:02] Define "they"? If there's a tested patch to merge in that implements things, it makes things easier. Adding a column is not a trivial thing, but it's done at regular intervals when there is a good use case. I had one added to the ipblock table not that long ago.
[00:10:51] But yeah, adding a column is something that is done gingerly.
[00:10:58] Coren: what did you have added?
[00:13:48] Coren: what are your thoughts on getting the bugzilla database replicated to labs?
[00:15:41] Betacommand: ipblocks.ipb_parent_block_id
[00:15:57] for autoblocks?
[00:16:23] Betacommand: That should be doable, except for security. You really want to talk to Andre for that though.
[00:16:48] Betacommand: Yeah, so that unblocking removes the autoblocks too. That got in 1.20 IIRC
[02:49:16] hey, wikitech seems down
[02:49:52] web server wikitech.wikimedia.org not responding
[02:50:52] The connection was reset
[02:50:58] http://labsconsole.wikimedia.org/
[02:51:45] hi mutante. It's pingable, but the web server is unresponsive
[02:52:21] can we rule out that somebody is working on it?
[02:53:26] what host is it on, can I ssh to it from a bastion to investigate? Let me look for instructions on wikitech... D'oh :)
[02:53:46] virt0
[02:54:23] apache runs
[02:54:31] https://wikitech-static.wikimedia.org/
[02:54:38] that explains the "<+icinga-wm> PROBLEM - HTTP on virt0 is CRITICAL: CRITICAL..."
[02:54:43] You won't be able to log in to virt0 though ;)
[02:55:59] https://wikitech-static.wikimedia.org/wiki/Wikitech-static works fine, is it the same or ... INCEPTION
[02:56:41] anyway, back up now.
[02:58:31] fixed
[02:59:00] !log bastion had to restart apache on virt0, wikitech/labs puppetmaster was down due to apache
[13:46:03] Coren: The Text::Diff perl module seems to be not available on the labs web servers. Is that possible?
[14:24:17] Coren, ping
[14:43:40] krd: I'm not seeing it in the list of installed packages so yes.
[14:43:46] Cyberpower678: What be up?
[14:44:51] Coren, so I use the edit counter on labs with my computer just fine, but, when my bot tries to access its API, it's getting nothing. Is labs blocking bot communication?
[14:45:05] Coren: Can you change that please? On the login server it seems to be present, and I'd like to use it.
[14:45:33] krd: Please open a bugzilla for it, and I'll have it installed in today's round of package installs.
[14:46:46] Cyberpower678: No, but names from inside are different from names from outside. You have to use 'tools-webproxy' rather than 'tools.wmflabs.org' (Openstack limitation)
[14:47:31] Ok
[14:47:42] The script is also running from my computer but the UA identifies itself as a bot.
[14:49:02] Oh. You might be getting caught by some of our desperate attempts at preventing crawlers. What is the exact UA you are using?
[14:49:57] Coren, Peachy MediaWiki Bot API Version 2.0 (alpha 6)
[14:51:04] Cyberpower678: Through http or https?
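Combining Coren's two tips above (target 'tools-webproxy' from inside Labs, and send a real User-Agent), a minimal PHP sketch; the endpoint, parameters, and UA string are the ones quoted in this exchange:

```php
<?php
// Sketch only: fetch the pcount API from inside Labs with an explicit UA.
// 'tools-webproxy' is the internal name Coren recommends above; from outside
// Labs you would use tools.wmflabs.org instead.
$ch = curl_init('http://tools-webproxy/xtools/pcount/api.php'
    . '?name=Cyberpower678&lang=en&wiki=wikipedia&format=php');
curl_setopt($ch, CURLOPT_USERAGENT, 'Peachy MediaWiki Bot API Version 2.0 (alpha 6)');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body rather than printing it
$body = curl_exec($ch);
curl_close($ch);
```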
[14:51:09] https
[14:51:50] Coren, can you make a general exemption for any access with the UA containing "Peachy MediaWiki Bot API"
[14:52:06] Because the version identifier will change as Peachy gets updated
[14:52:10] I would, except that it shouldn't match in the first place.
[14:53:39] * Coren looks in the logs.
[14:54:05] Cyberpower678: I don't see that UA hitting the webservers at all.
[14:54:18] Cyberpower678: PM me your IP?
[14:54:27] Coren, it should be attempting to target https://tools.wmflabs.org/xtools/pcount/api.php
[14:58:38] Cyberpower678: Please give me your IP so I can dig in the logs for it.
[15:02:20] Coren, PMed you.
[15:05:29] * Coren digsdigsdigs
[15:05:55] Coren, it's returning Access denied
[15:06:13] Coren, I'm sorry. I'm using http
[15:06:38] * Coren checks in the *other* log then. :-)
[15:06:44] :p
[15:07:49] Cyberpower678: Actually, I see only 200s, and never that UA. I see a number of requests with /no/ UA though (but they get 200s)
[15:08:03] Cyberpower678: Can you make an attempt now?
[15:08:09] So I can see what happens to it?
[15:08:38] Access denied
[15:09:45] Yeah okay, so two things: first, you don't actually send a UA at all with the bot. Second, the tool is returning a 200; so whatever control is in place that prevents it from working is in the tool itself, not the webserver.
[15:10:10] - - [20/Dec/2013:15:08:31 +0000] "GET /xtools/pcount/api.php?name=Cyberpower678&lang=en&wiki=wikipedia&format=php HTTP/1.0" 200 239 "-" "-"
[15:10:52] Coren, check it again.
[15:11:05] There should be a new entry
[15:11:12] There is:
[15:11:28] - - [20/Dec/2013:15:09:51 +0000] "GET /xtools/pcount/api.php?name=10.4.1.89&lang=en&wiki=wikipedia&format=php HTTP/1.1" 200 778 "-" "Peachy MediaWiki Bot API Version 2.0 (alpha 6)"
[15:11:45] So you got a 200, with 778 bytes of contents.
[15:12:30] 200 means your tool got the request, and returned ok with data.
[15:12:45] I'm getting data now.
[15:13:49] Coren, there's a bug in the API.
[15:14:08] Now that I'm getting data, I should be able to take it from here.
[15:55:19] hashar, would you be okay with +2'ing a test for the gwtoolset jobs? i'm wondering if the issue is that the 1st job has a delay on it https://gerrit.wikimedia.org/r/#/c/102955/
[16:04:16] Coren: http://tools.wmflabs.org/krdbot/cgi-bin/Denkmalgrep.pl -> Internal Server Error
[16:04:31] dan-nl: lets try out :)
[16:05:13] cool, thanks! takes about 3 mins to get on beta … right?
[16:05:23] a bit more
[16:05:28] the job updater runs every six minutes
[16:07:05] dan-nl: I have updated it manually
[16:07:20] sweet … will kick off a new job then
[16:18:09] hashar thanks again for that. i think i've narrowed down the issue. it appears that the job runner isn't picking up delayed jobs. i'll add my findings to that bug and hopefully that will help Aaron sort it out …
[16:20:00] dan-nl: ah glad it helped
[16:20:15] I guess nextJobDB doesn't list delayed jobs so
[16:20:24] would have to ask Aaron :(
[16:20:52] yes, but at least this helps narrow down the issue a little …
[16:24:56] Coren, ping again
[16:25:51] hashar: is that puppet script jobs-loop.sh.erb on a cronjob run every so many minutes or some other automated run?
[16:26:07] Cyberpower678: Yes?
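Coren's log excerpts above show the request getting a 200 with a body, so the natural next step is checking whether that body survives unserialize(). A quick sketch (URL taken from the log; assumes allow_url_fopen is enabled):

```php
<?php
// Fetch the response and confirm the payload round-trips through unserialize().
$url = 'http://tools.wmflabs.org/xtools/pcount/api.php'
     . '?name=10.4.1.89&lang=en&wiki=wikipedia&format=php';
$raw = file_get_contents($url);
$data = unserialize($raw);
var_dump($data); // bool(false) here means the body is not a clean serialized value
```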
[16:26:18] What causes "a:1:{s:5:"query";a:1:{s:5:"count";a:9:{s:8:"username";s:9:"10.4.1.89";s:5:"is_ip";s:4:"true";s:11:"user_exists";s:4:"true";s:7:"user_id";i:0;s:8:"opted_in";s:5:"false";s:6:"counts";a:3:{s:7:"deleted";i:0;s:4:"live";i:0;s:5:"total";i:0;}s:6:"groups";a:0:{}s:9:"firstedit";N;s:15:"averagepagedits";s:4:"0.00";}}}"
[16:26:27] First 3 characters?
[16:26:32] i'm wondering now if it's only run when one of the jobs it takes care of is kicked off …
[16:27:01] dan-nl: it is a while loop iirc
[16:27:14] Coren: any progress on the new packages? ( I would need >apt-get install php5-intl< ) ;)
[16:27:21] yes, but is it continuously running? what kicks it off?
[16:27:31] dan-nl: in operations/puppet there is an init script 'mw-job-runner' for it modules/mediawiki/files/jobrunner/mw-job-runner.init
[16:27:43] hedonil: Ima do a round of installs in a few hours; I'll include yours.
[16:27:51] dan-nl: which ensures the shell script is always there
[16:27:58] Coren: fine. thx.
[16:28:49] dan-nl: then the shell script basically while(true) and runs nextJobDB.php to find a wiki needing jobs run, then runs the jobs for it
[16:29:49] Cyberpower678: ... I don't understand your question.
[16:30:06] The API is supposed to return a serialized string.
[16:30:09] hashar, and this script, modules/mediawiki/files/jobrunner/mw-job-runner.init, is run by a cronjob every x minutes?
[16:30:30] dan-nl: na it is a service maintained by upstart
[16:30:43] dan-nl: upstart is the Ubuntu system that replaces init scripts
[16:30:46] Something is spitting 3 obscure characters before that.
[16:31:03] dan-nl: basically it makes sure the service is always running. In this case the service is the shell with the while loop
[16:31:20] Cyberpower678: That's not a serialized string, it's a serialized array.
[16:31:26] dan-nl: if the shell is ever killed / aborted, Ubuntu upstart will notice the service is no longer running and restart it (aka relaunch the shell loop)
[16:31:41] ah, i see, okay, thanks. hopefully what i add to that bug will help aaron track it down further
[16:31:45] Coren, same difference.
[16:32:07] I see nothing wrong with it. What three characters are you talking about?
[16:32:23] Coren, 
[16:32:25] dan-nl: on beta if you tail -f /data/project/logs/cli.log , you would be watching the $wgDebugLogFile output of command line scripts
[16:32:42] dan-nl: that is spamming the logs with runs of nextJobDB.php; that is the shell script running on the jobrunner08 instance.
[16:32:54] cool
[16:33:05] dan-nl: and from time to time, you will see a runJobs.php output
[16:33:19] dan-nl: ideally our maintenance scripts should wfDebug() an entry with the parameters passed to them
[16:33:24] Cyberpower678: it's a normal serialized php array. you can unserialize that with php unserialize()
[16:34:01] hedonil, do you not see the "" preceding it?
[16:34:13] thanks
[16:34:19] * hedonil looking
[16:34:20] It's causing unserialization to fail.
[16:34:56] Cyberpower678: if you test your request with https://en.wikipedia.org/wiki/Special:ApiSandbox#action=query&prop=info&format=php&inprop=watchers|url&pageids=5248757|1923942|5042951
[16:35:18] Cyberpower678: it's the same
[16:35:29] hedonil, huh?
[16:35:34] I'm not using Wikipedia
[16:36:05] Cyberpower678: but you can check possible resultsets with that
[16:36:57] hedonil, I don't follow. I want to try and fix the edit counter API so it doesn't spit these weird characters out when returning a serialized array.
[16:37:24] But I don't know what in the API is causing this bug.
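When unserialize() fails like this, dumping the first raw bytes usually identifies the stray prefix. A sketch; the "efbbbf" named in the comment is the UTF-8 BOM that is identified later in this log:

```php
<?php
// Inspect what actually precedes the serialized payload.
$url = 'http://tools.wmflabs.org/xtools/pcount/api.php'
     . '?name=10.4.1.89&lang=en&wiki=wikipedia&format=php';
$raw = file_get_contents($url);
// A clean serialized array starts with "a:" (hex 613a); a leading UTF-8 BOM
// shows up as "efbbbf".
echo bin2hex(substr($raw, 0, 16)), "\n";
```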
[16:37:39] Cyberpower678: I don't understand. You're saying your script is outputting some things before it outputs the serialized array?
[16:37:47] yes.
[16:37:52] Do you not see it?
[16:38:17] Can you manually type the first 10 chars you see of the string I pasted to you?
[16:38:36] Cyberpower678: ... okay. And how am I/we supposed to guess what those are? It's obviously a bug in the script.
[16:39:07] That isn't the output of a serialize(), so your script is outputting something before that. Find where it does, and don't do it. :-)
[16:39:19] Coren, I know. I'm asking what could cause that.
[16:39:25] Cause I can't find it.
[16:39:28] Cyberpower678: Your script. :-)
[16:39:36] :|
[16:39:46] Oh, ffs. Lemme guess.
[16:39:52] I think I know.
[16:40:09] Not because you gave me enough information, but from your history. What is the script file exactly?
[16:40:55] Cyberpower678: do you mean these 'obscure' first characters? "a:1:
[16:41:10] hedonil: Hang on. I think I've guessed right.
[16:41:17] Cyberpower678: What is the filename?
[16:41:30] hedonil, you can't see the characters I guess.
[16:41:40] Cyberpower678: ahh
[16:41:53] Coren, I didn't originally write the API.
[16:42:02] But here are the files that run it.
[16:42:42] https://github.com/cyberpower678/xtools/blob/master/public_html/pcount/counter.php
[16:42:52] https://github.com/cyberpower678/xtools/blob/master/public_html/pcount/api.php
[16:43:09] https://github.com/cyberpower678/xtools/blob/master/API.php
[16:43:28] The second one is the actual API to access on the web.
[16:44:20] hashar: you can now see the delayed job with mwscript showJobs.php --wiki=commonswiki --group
[16:44:29] hedonil, try unserializing the contents returned by tools.wmflabs.org/xtools/pcount/api.php?name=10.4.1.89&lang=en&wiki=wikipedia&format=php
[16:44:39] You'll see PHP fail.
[16:44:42] bd808: added a gerrit patch last night that now shows that in the output
[16:44:50] * hedonil tries
[16:44:53] You might also see the characters.
[16:45:10] Cyberpower678: I want the *file* not URIs
[16:45:28] * bd808 just gave +2 to the change; Aaron was smart enough to think to write it
[16:45:45] Coren, how am I supposed to do that? Just download them from GitHub.
[16:45:45] Cyberpower678: Anyways; I know I guessed right. You used your crappy editor again, and it inserted a BOM before the <?php
Coren, no I didn't
[16:46:24] Cyberpower678: Okay, what exact script on *tool labs* is giving you that output?
[16:46:28] This was written before my crappy editor. I only edited the file paths when I migrated this.
[16:46:29] hashar,bd808 yes, that was smart, the question now is why doesn't the runner pick up the delayed job
[16:46:31] Cyberpower678: for my api I've added header('Content-type: application/json'); works fine
[16:47:15] What does BOM even mean
[16:47:37] And why can't my IDE see it.
[16:47:40] Cyberpower678: I've explained it to you several times. Unicode Byte Order Mark. U+FEFF
[16:48:02] I'm using my PHP IDE.
[16:48:05] https://en.wikipedia.org/wiki/Byte_order_mark
[16:48:06] Cyberpower678: Again: Okay, what exact script on *tool labs* is giving you that output?
[16:48:48] /data/project/xtools/public_html/pcount/api.php
[16:48:59] http://lmgtfy.com/?q=bom
[16:49:43] dan-nl, ah so it's the Bureau of Meteorology. :p
[16:49:46] Cyberpower678: Thank you. And what is the URI you are using to test?
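hedonil's header tip above, as a standalone sketch. The real outputText() is a private method in the linked API.php, so this body is illustrative only:

```php
<?php
// Declare what the body actually is instead of relying on the text/html default.
function outputText($text) {
    header('Content-type: application/json'); // or text/plain for serialized PHP
    echo $text;
}
```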
[16:49:51] :)
[16:50:04] http://tools.wmflabs.org/xtools/pcount/api.php?name=10.4.1.89&lang=en&wiki=wikipedia&format=php
[16:51:24] dan-nl: hence the bug report :-]
[16:51:30] dan-nl: at least you have a workaround now!
[16:51:55] Cyberpower678: in master/API.php private function outputText($text) you have a line echo $this->formatHTML($text);
[16:52:26] Coren, The script file is encoded in Windows-1582
[16:52:40] hashar: yes, but i want to use the delayed jobs so that gwtoolset doesn't flood the queue or overload a glam's server, so i want to sort this out before we really start using the tool
[16:52:48] hedonil, okay
[16:55:24] od -c: 0000000 357 273 277 a : 1 : { s : 5 : " q u e
[16:55:24] That's in octal, and corresponds to EF BB BF. Unsurprisingly, that corresponds to the UTF-8 for U+FEFF -> BOM
[16:56:20] It's not from me this time. So which file has it and is responsible for it.
[16:56:35] * hashar today I learned Coren is still using octal.
[16:57:06] hashar: Heh. od -c does.
[16:57:34] ahhh
[16:57:37] Coren, so what file is the culprit. Remember I didn't write any of this.
[16:57:49] I only changed the file paths.
[16:58:11] grep for it ?
[16:58:26] `perl -CD -pe 'tr/\x{feff}//d' file.bom > file.nobom`
[16:58:32] or just replace them all : find . -type f -regex '.*html$' -exec sed -i 's/\xEF\xBB\xBF//' '{}' \; # evil
[16:59:05] hashar, what's that do?
[16:59:12] potentially, haven't tested though
[16:59:23] that would look for html files and replace BOM bytes in them
[16:59:32] potentially you could grep for it if grep supports hexadecimal
[16:59:43] Coren, ^
[17:01:20] hashar do you happen to know how to clear the jobs listed with mwscript showJobs.php --wiki=commonswiki --group
[17:01:45] dan-nl: Not offhand.
[17:01:57] Cyberpower678: this is my api output routine http://pastebin.com/Gb6M9nrf
[17:02:03] Cyberpower678: Hang on, I'm doing a find in /data/project/xtools
[17:02:39] Cyberpower678: try to add this header to your private function outputText($text) right before echo $text;
[17:03:11] dan-nl: ahhhh aaron pasted a command for that . Something like JobQueue::singleton()->purge();
[17:03:12] hedonil, I fail to see what that will accomplish
[17:03:38] Cyberpower678: your current response header is Content-type:text/html
[17:03:46] k, how can i run a php command from the cli on deployment-jobrunner08
[17:04:11] JobQueueAggregator::singleton()->purge()
[17:04:40] ah
[17:04:45] so to execute that php code
[17:04:49] get on deployment-bastion
[17:04:52] i just don't know how to run it from the cli
[17:04:59] yes
[17:05:02] run the eval.php evil script (that gives you a PHP interactive session in mediawiki context)
[17:05:04] done using:
[17:05:08] omg
[17:05:10] mwscript eval.php --wiki=commonswiki
[17:05:13] k, i'll try that
[17:05:19] that is the leet hacking utility
[17:05:23] no code completion
[17:05:27] no command return
[17:05:34] ffs, Cyberpower678, how big is your public_html?
[17:05:40] k, thanks!
[17:05:42] once you get the prompt you can: var_dump( JobQueueAggregator::singleton()->purge() );
[17:05:52] Umm, pretty huge
[17:05:58] dan-nl: try print $wgDBname;
[17:07:13] Coren, you're not scanning the whole project are you?
[17:07:17] dan-nl: got something ?
[17:07:23] Coren, there are about 1,064
[17:07:24] got a phone call ...
[17:07:29] files in there
[17:07:30] dan-nl: about to head out
[17:07:37] Cyberpower678: Well, I'm not going to play hide and seek in that mess. The idea is, your script outputs a BOM before the serialized data. It shouldn't.
It is 100% certain that it does because you are including a file you edited and in which your crap editor added a BOM at the very beginning. Find it, fix it with a good editor, and that will solve your issue.
[17:07:37] for vacations hehe
[17:07:41] np, have a good holiday and thanks for your help!
[17:08:10] Cyberpower678: this is what is being returned by your api : http://pastebin.com/3mLxgxVd
[17:08:15] dan-nl: you can follow up with US people via #wikimedia-dev . Thank you for the glam utility :-D see you next year!
[17:08:36] Cyberpower678: it contains html stuff which should not be there
[17:08:48] Coren, can you give me something to help me find it?
[17:09:05] hedonil, that is not what my script is getting. It's just getting the BOM character.
[17:09:29] Cyberpower678: fgrep $(perl -e 'print "\xEF\xBB\xBF";')
[17:09:35] That will match UTF-8 BOM.
[17:10:02] Coren, how can I direct it to a specific directory?
[17:10:45] Also, ***CHANGE YOUR EFFING EDITOR***. Any editor that inserts an UTF-8 BOM at the beginning of text documents is broken, by definition (BOM is only meaningful in UCS-2). It's at least the fifth time that you broke one of your own scripts that way and I have to pull you out of it.
[17:10:53] Cyberpower678: man grep
[17:11:39] !log wikimania-support Updated scholarships-alpha to 7b5709e
[17:11:41] Logged the message, Master
[17:11:53] Coren, what file encoding should it use?
[17:12:51] Cyberpower678: Plain good, UTF-8. No BOM.
[17:13:21] Coren, these files were originally written in Windows-1586
[17:13:42] I mean 1252
[17:14:33] Regardless; it's all ASCII so the mapping is 1:1. The important thing is that your editor must. not. insert. BOMs.
[17:14:41] If it does, it's broken.
[17:14:54] In fact, that it even *allows* doing so is broken.
[17:18:29] Cyberpower678: If you are on Windows (random assumption by me) you might try http://notepad-plus-plus.org/
[17:19:03] That's the editor Coren is calling crappy.
[17:19:20] oh. well that's not helpful then
[17:19:34] Yea
[17:20:02] http://stackoverflow.com/questions/8432584/how-to-make-notepad-to-save-text-in-utf-8-without-bom
[17:20:04] From within Notepad++, choose the "Encoding" menu, then "Encode in UTF-8 without BOM".
[17:20:22] * bd808 and Coren search alike
[17:20:55] Great minds are alike in trying not to think when there are relevant SO threads
[17:21:11] And yes, Cyberpower678, any editor that even /allows/ inserting a BOM in UTF-8 is crappy; or at least it's got a coder that is unable to read specifications. BOM is nonsensical in UTF-8 and should never, never be used. Ever. There is no possible good reason for it.
[17:21:20] Ah mine was set to UCS-2 for some reason.
[17:21:29] ...!
[17:21:37] Do you even know what UCS-2 *is*?
[17:21:41] Coren, no
[17:21:44] 16-bit encoding.
[17:21:48] I never changed it.
[17:22:09] Now, interestingly enough, BOMs /do/ make sense at the beginning of a file encoded with UCS-2.
[17:22:40] Except that UCS-2 itself is arcane and not in general use *anywhere*.
[17:23:03] It was the state of the art in 1995!
[17:23:09] (As files; it's often used internally. Think wchar_t strings in windows)
[17:24:31] And I don't believe "I never changed it.". It's not possible that a text editor would default to UCS-2.
[17:25:07] Coren, then I accidentally changed it without realizing. :p
[17:25:47] So, in closing: 1. find the files you've edited and broken by inserting BOMs in them 2. fix them 3. profit!!
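Coren's closing advice ("find the files...") can be scripted instead of eyeballed. A sketch that walks the tool's directory (the path named in this discussion) and flags every file starting with the EF BB BF bytes:

```php
<?php
// List files that begin with a UTF-8 BOM; any of them, once included,
// leaks the BOM into the tool's output.
$it = new RecursiveIteratorIterator(
    new RecursiveDirectoryIterator('/data/project/xtools'));
foreach ($it as $file) {
    if (!$file->isFile()) {
        continue;
    }
    // Read only the first three bytes of each file.
    $head = file_get_contents($file->getPathname(), false, null, 0, 3);
    if ($head === "\xEF\xBB\xBF") {
        echo $file->getPathname(), "\n";
    }
}
```

Unlike checking each editor tab by hand, this also catches included files, which matters here: the pasted api.php can be clean while a require_once'd file carries the BOM.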
[17:31:25] Coren, according to Notepad++ all of the essential files are encoded in UTF-8 without a BOM
[17:31:52] But when it runs, Notepad++ says the output is encoded in UTF-8 with BOM.
[17:31:55] Cyberpower678: Clearly then it comes from a non-essential file. That BOM isn't appearing out of nowhere.
[17:31:56] I don't get it.
[17:32:23] I looked at all the require_once calls and checked every file.
[17:38:38] Cyberpower678: Dude, seriously, you don't have enough experience to take over maintenance of scripts written by others and which you do not understand. You'll need more basic skill at diagnostics and debugging. Start smaller; write your own tools with more limited scope.
[17:39:13] I do understand the scripts.
[17:39:20] Cyberpower678: Get used to debugging your own code before you tackle the (much more complicated) task of debugging someone else's.
[17:39:57] I've been essentially debugging X!'s tools since TParis took them over.
[17:40:10] As well as SoxBot, SnotBot, and NoomBot
[17:40:44] I am very capable of understanding the code. The issue I'm having is this damn encoding of the files with this stupid BOM.
[17:41:07] Cyberpower678: I'm sorry if this is going to sound harsh, but no you haven't. You've had everyone else on this channel do so, and sometimes being dismissive even to the people who /do/ spend time and effort helping you.
[17:42:18] Coren, I suppose porting SnotBot to PHP wasn't done by me.
[17:42:33] I suppose fixing errors on the edit counter wasn't done by me.
[17:42:46] I suppose Cyberbot II wasn't done by me.
[17:42:54] * Coren gives up.
[17:43:48] I'll admit when I'm over my head. I can read and maintain code. I can't seem to get the BOM issue straightened out.
[17:44:05] And I'm not stopping until this damn API works, right.
[17:44:39] I'm telling you you've clearly bitten off more than you can chew. You can dismiss what I'm telling you if you wish to do so; it's your time and effort that you are spending. But I have a busy schedule, and there is only so much that I can do in your stead.
[17:45:21] About 5% of what needed fixing in the tool was done here, and it wasn't directly code related, but rather getting used to labs.
[17:45:31] File permissions, file encoding.
[17:45:49] Learning to use jsub
[17:45:50] Learn to use tools like curl and hexdump to inspect output of scripts in detail; learn basic unix file commands to manipulate your code. Those skills you need.
[17:46:08] * Coren goes back to working on the migration to eqiad.
[17:47:10] Coren, I learn as I go.
[17:47:47] Cyberpower678: a last guess: try to encode before echo $text = utf8_encode ( $text );
[17:48:07] Coren: Do you have some time to look into the "internal server error" occurring e.g. at http://tools.wmflabs.org/krdbot/cgi-bin/Denkmalgrep.pl
[17:48:26] Coren: I have a query on tools-db that's been stuck in query end state for 3 hours
[17:48:54] MrZ-man: I have the same.
[17:49:05] Query killed but still stuck.
[17:49:31] MrZ-man: It's probably waiting for some other query to release some locks. What do you want to bet it's catscan2?
[17:49:44] Oh wait, 'query end state' on tools-db?
[17:49:51] Damn. Space issues. Me go fix.
[17:51:08] Coren, I have been incredibly stressed for the past few months. RL was interfering with my ability to maintain code. Now that I'm on a 3 week break, I can catch up with things. I'm planning an overhaul of X!'s tools soon.
[17:51:33] hedonil, I'll try that.
[17:51:46] bd808, wtf?
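If every file the editor shows really is BOM-free but the output still is not, a temporary band-aid (not a fix) is to buffer all output and strip a leading BOM before it reaches the client. A sketch; the entry-point name is illustrative:

```php
<?php
// Band-aid sketch: scrub a leaked UTF-8 BOM from the buffered output.
// The real fix is repairing the offending include, as Coren says.
ob_start(function ($out) {
    if (substr($out, 0, 3) === "\xEF\xBB\xBF") {
        $out = substr($out, 3); // drop the three BOM bytes
    }
    return $out;
});
require_once 'api.php'; // hypothetical entry point that pulls in the bad include
```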
[17:51:59] Cyberpower678: I will test then ;)
[17:52:06] bd808, any reason for doing that?
[17:52:18] Eesh. krd, MrZ-man, it'll take me a little bit of time; all those pending writes are hard on the db and the box is thrashing like mad.
[17:52:20] Cyberpower678: Doing what?
[17:52:31] "Received a CTCP CLIENTINFO from bd808"
[17:52:44] You looked up my client info
[17:53:04] Gah. Sorry. My irc client does weird shit sometimes when I click a username
[17:53:18] Coren: You can kill all of my queries if that helps.
[17:53:22] Coren, oh btw, I emptied a bunch of logs from Cyberbot
[17:53:45] krd: They'll unclog all at once once I manage to clean up some binlogs.
[18:02:01] hedonil, no good.
[18:02:44] Cyberpower678: another last guess : $text = substr($text, 3);
[18:03:03] Cyberpower678: will remove the 3 evil bytes
[18:03:28] hedonil: That's not a solution to the API outputting the BOM. :-)
[18:03:59] Also, the three bytes are /one/ character (U+FEFF) and probably need substr($text, 1)
[18:04:01] :-)
[18:04:10] Coren: why not? it's outputting text via echo
[18:04:31] Coren: Cyberpower678: http://stackoverflow.com/questions/4057742/how-to-remove-efbbbf-in-php-string
[18:05:45] hedonil, that ended up chopping the a off. There are no other echos
[18:05:50] that I can see.
[18:06:12] Coren: PHP's substr is dumb and works on bytes, not characters
[18:06:17] So logically it would have to be present before the first instance of the <?php, which would be in api.php
[18:06:42] But according to N++ that's formatted in UTF-8 no BOM
[18:07:01] Cyberpower678: afais the output happens in /blob/master/API.php private function outputText($text)
[18:07:27] hedonil, that's where it's all printed.
[18:08:00] But with PHP you can output to the browser if stuff exists outside the <?php ?> tags
[18:08:41] The BOM must exist somewhere outside of the <?php ?> tags, or there's an echo somewhere I'm not seeing.
[18:09:50] Cyberpower678: /data/project/xtools/database.inc
[18:12:03] The one file I didn't check.
[18:12:05] WOW
[18:12:10] OMG
[18:12:26] * Cyberpower678 facedesks through the desk.
[18:12:45] * Cyberpower678 has a headache. :(
[18:13:33] krd: Looks like tools-db isn't going to be able to recover without a more brutal intervention. :-(
[18:14:16] As long as you don't tell me that my db creation was the cause everything is good for me. ;)
[18:14:22] anomie, I would kiss you if I could.
[18:14:27] No thanks.
[18:14:34] Not really though
[18:14:53] anomie, thankyouthankyouthankyou. That was the bad file.
[18:15:47] Coren, and I'll be sure to study up on file encoding more.
[18:16:06] Cyberpower678: Your homework for today is to figure out why you couldn't find that yourself an hour and a half ago. ;)
[18:16:40] Because I was blind and completely overlooked that it was being depended on.
[18:16:48] krd: No, it's just that tools-db is a dinky little VM that was never intended to be used that long in the first place.
[18:17:04] krd: eqiad does have actual hardware and disk space.
[18:18:09] Yay it works. And it only cost Coren his hair. :p
[18:19:14] hedonil: http://tools.wmflabs.org/tsreports/?wiki=enwiki&report=wantedpages
[18:19:40] hedonil: not sure about the 'generated in 13 weeks', that might have something to do with the import I attempted (of existing cached reports on the toolserver)
[18:20:06] Coren: Can you give us a rough idea of when the DB will be back?
[18:20:32] krd: It's attempting to shut down and restart now. No more than 10-15 minutes I should expect.
[18:21:10] And it can be treated as a production machine, i.e.
DBs depended on by tools can be put on it?
[18:21:10] * anomie requests a ping when tools-db is back up
[18:21:26] Or will there be any migration that should be waited for?
[18:23:14] Aaand it broke again.
[18:23:25] because tools-db is down.
[18:23:38] But at least there's no BOM
[18:25:06] krd: Should be back up now.
[18:25:24] Also MrZ-man ^^
[18:25:40] Coren, thanks for your help.
[18:26:25] krd: Hm. The *reason* it broke is almost 50G of binlogs in less than 24h. How big /is/ that table you're creating? :-)
[18:26:56] Well, really big, sadly.
[18:27:16] Coren, am I on your ignore list?
[18:27:33] Cyberpower678: No, but I *am* busy.
[18:27:46] Coren, I just wanted to say thanks.
[18:27:56] I'll leave you alone now.
[18:28:22] Alright. Do watch that editor of yours.
[18:30:12] valhallasw: \o/ results!
[18:31:11] valhallasw: but your links are broken http//en some ::::::: are missing ;)
[18:31:58] Cyberpower678: still around?
[18:32:17] Going to take a shower in a moment, but yes.
[18:32:50] hedonil, ^
[18:32:51] Cyberpower678: if you could save this as api-test.php http://pastebin.com/hQfGm0UG
[18:33:08] Cyberpower678: I could do some tests
[18:33:59] hedonil: that's... interesting
[18:34:22] Cyberpower678: and a second one api-test2.php which you edited with your editor -- just add some comments
[18:34:41] hedonil: oh, I know why. They are supposed to be //en.wikipedia.org, but they are //http://en.wikipedia.org
[18:35:00] hedonil: the wiki table on TL does include http://, the TS one does not
[18:35:08] valhallasw: :) easy to fix
[18:35:23] Coren, why does the database include http, actually?
[18:36:00] what am I doing with test2?
[18:36:10] Coren: https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Help#Metadata_database -- there is no field 'aa.wikipedia.org', just 'http://aa.wikipedia.org'
[18:36:29] Cyberpower678: just save in your public_html and tell me the url
[18:36:45] same as api but with api-test.php
[18:36:54] * hedonil chacks
[18:36:58] *checks
[18:39:33] hedonil, is it working?
[18:39:43] on it
[18:40:42] I'm taking a shower now.
[18:42:18] Cyberpower678: no. do some further check.
[18:43:43] Cyberpower678: yes, now it works.
[18:44:26] Cyberpower678: had format set to JSON .... but now it works fine with output set to php
[18:46:05] valhallasw: What?
[18:51:28] Coren: Can you please check if the binlog is increasing heavily at the moment?
[18:52:06] krd: Not at a visibly ridiculous rate; but I'm going to keep an eye on it.
[18:52:41] I changed my table from InnoDB to MyISAM yesterday, but I'm not sure if that is sufficient.
[18:53:14] I'm not very experienced in DB programming, maybe there's something still missing to save the system.
[18:54:29] Coren: meta_p.wiki.url includes the http://, and there is no meta_p.wiki.hostname field
[18:55:02] Ah, hm. It doesn't on toolserver?
[18:55:43] It's also called differently on the TS
[18:56:10] let me check...
[18:57:14] *grumble* Then that column is misnamed; it's not an URL if it doesn't include a scheme.
[18:57:56] Coren: the column on the toolserver is called 'domain'
[18:58:04] Ah.
[18:58:10] Also poorly named.
[19:03:03] valhallasw: But yeah, you can safely remove the scheme from the url column or (more robustly) extract the hostname from the authority part.
[19:04:17] The correct regex for this is [^:]+://[^@]*@?([^/]+)
[19:04:29] This will work regardless of the scheme.
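Two ways to do that extraction in PHP, as a sketch. One caveat I should flag: the regex as quoted backtracks on URLs without a user@ part and ends up capturing only the last character, so the userinfo is safer written as a single optional group:

```php
<?php
$url = 'http://aa.wikipedia.org'; // example value from meta_p.wiki.url above

// Built-in parser:
echo parse_url($url, PHP_URL_HOST), "\n"; // aa.wikipedia.org

// Tightened variant of the regex above, with the userinfo grouped
// (and [^/:]+ so a possible port is excluded, per the follow-up tip):
if (preg_match('~^[^:]+://(?:[^/@]*@)?([^/:]+)~', $url, $m)) {
    echo $m[1], "\n"; // aa.wikipedia.org, regardless of scheme
}
```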
[19:05:29] (You might also want [^:]+://[^@]*@?([^/:]+) if you don't care about a possible port)
[19:40:03] Coren: urllib.urlparser should do the job :-)
[19:40:06] -r
[19:40:26] valhallasw: Even better. :-)
[20:19:38] hi, is there a timeout for long-running queries on labs?
[20:19:44] mysql
[20:26:05] nope
[20:26:47] odd, I have a query failing after a few seconds, with no error message
[20:27:08] how did you run the query?
[20:27:10] using jsub to run a small script that executes one mysql query
[20:27:36] mysql --defaults-file=$HOME/replica.my.cnf -h commonswiki.labsdb commonswiki_p -e 'select /* SLOW_OK */ cl_from, page_id, cl_type from categorylinks,page where cl_type!="page" and page_namespace=14 and page_title=cl_to;' --batch --silent > $HOME/commons_categories.txt
[20:27:56] if I add a limit 100 it runs
[20:28:12] in the TS it fails after 24mins
[20:28:27] on labs it fails after 2 seconds
[20:28:51] quite an improvement speedwise ;-)
[20:28:54] dschwen: it works when I run it manually (or rather: it does not fail after 2 seconds)
[20:31:31] dschwen: anything in the error file in ~?
[20:31:34] ~/.err
[20:31:51] nope
[20:31:53] 0 bytes
[20:32:03] hmmm
[20:32:26] it just disappears from qstat
[20:33:29] ok, I'm trying it manually now and indeed it keeps running
[20:33:43] but this is not a satisfactory solution
[20:33:48] strange
[20:33:55] I can also run it manually on tools-exec-01 and -04
[20:34:25] have you tried submitting it with jsub?
[20:34:43] yeah, that doesn't work
[20:34:52] I think it's a memory issue
[20:34:58] 1927626 testquery valhallasw Task / Running 2013-12-20 20:34:18 CPU: 0.84s VMEM: 0M/250M (peak 258M)
[20:35:04] oh
[20:35:09] 256M is default
[20:35:35] I didn't think that my process would consume the memory, I thought it would be the mysql server
[20:35:37] ok
[20:35:39] me neither
[20:35:48] but that would explain the auto-kill
[20:35:56] well, an error message would still be nice ;-)
[20:35:57] not sure why no mail is sent when that happens, though...
[20:36:06] oh, hm
[20:36:16] still doesn't work with 500M
[20:36:17] :|
[20:37:20] oh, my manual run gets terminated as well
[20:37:23] yeah
[20:38:21] ha
[20:38:26] NOW I got the emails
[20:38:43] If you have problems due to insufficient memory for large result sets, use the --quick option. This forces mysql to retrieve results from the server a row
[20:38:44] about 10 per attempt to run it
[20:38:46] at a time rather than retrieving the entire result set and buffering it in memory before displaying it. This is done by returning the result set using the
[20:38:49] mysql_use_result() C API function in the client/server library rather than mysql_store_result().
[20:39:02] ok, I'll try that
[20:39:50] I have no .htaccess and https://tools.wmflabs.org/attribution/index.htm is redirecting to https://tools.wmflabs.org/attribution/cgi.py
[20:39:52] My CGI environment seems to be missing HTTP_HOST
[20:39:53] Could doing a pip install --user have broken my CGI environment variables?
[20:40:21] HTTP_HOST is scrubbed
[20:40:36] oh, wait
[20:40:44] no, that should be there
[20:41:38] Josh_Parris: uh, yeah, because index.htm contains
[20:41:49] * Josh_Parris smacks head
[20:43:32] I was remembering index.htm had a "hello world" - so it's not .htaccess related. Good.
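The `--quick` behaviour pasted above has a direct equivalent when querying from PHP: MYSQLI_USE_RESULT streams rows via mysql_use_result() instead of buffering the whole result set client-side. A sketch using dschwen's query and host; the credential placeholders stand in for ~/replica.my.cnf:

```php
<?php
$user = 'uXXXX';  // placeholder: real credentials live in ~/replica.my.cnf
$pass = 'secret'; // placeholder
$db = mysqli_connect('commonswiki.labsdb', $user, $pass, 'commonswiki_p');

// MYSQLI_USE_RESULT = unbuffered: rows are fetched one at a time, so the
// client process stays small even for huge result sets.
$res = mysqli_query($db,
    'SELECT cl_from, page_id, cl_type FROM categorylinks, page
     WHERE cl_type != "page" AND page_namespace = 14 AND page_title = cl_to',
    MYSQLI_USE_RESULT);
while ($row = mysqli_fetch_assoc($res)) {
    echo implode("\t", $row), "\n"; // process each row without buffering them all
}
mysqli_free_result($res);
mysqli_close($db);
```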
[20:44:58] Josh_Parris: HTTP_HOST is there as far as I can see
[20:45:12] for both apache and lighttpd web servers
[20:45:20] however, they are both useless
[20:45:32] (tools-webserver-01 / tools-webgrid-01:12345)
[20:45:38] you probably want HTTP_X_FORWARDED_HOST
[20:45:53] I need to poke around and find out why the Flask framework is blowing up then, 'coz it's claiming to be missing HTTP_HOST
[21:00:41] What's wrong with /data/project/attribution/public_html/cgi.py such that it produces a 500 at http://tools.wmflabs.org/attribution/cgi.py ?
[21:01:22] Josh_Parris: #!/usr/bin/env python
[21:01:36] Josh_Parris: and it's not chmod +x
[21:04:31] Okay, fixed, thanks valhallasw
[21:37:27] Cyberpower678: hey, I saw your api is working right now \o/ congrats !
[21:37:51] It is.
[21:38:15] Cyberpower678: what did the trick?
[21:38:28] database.inc file had the BOM
[21:38:47] Cyberpower678: lol
[21:38:57] And it only took Coren's nerves getting frayed to find and fix it. :p
[21:42:10] Cyberpower678: he is a good community guy - he will stand that ;)
[21:43:28] Cyberpower678 probably just made Coren smoke an extra half a pack
[21:43:51] Coren, you smoke?
[21:44:11] I do.
[21:47:10] I don't think 'he'll tolerate it, so let's keep doing it' is necessarily the social standard wanted ;p
[21:47:38] What could cause my python app to run fine in the shell, but from the webserver it's just 500ing?
[21:47:39] It's when I python-import a --user imported library
[21:47:41] How come that fails when webserved?
[21:51:59] Josh_Parris, file ownership and permissions likely.
[21:52:39] Josh_Parris: Use the new scheme, you'll get a good log that'll help a lot:
[21:52:42] !newweb
[21:52:42] https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Help/NewWeb
[22:03:08] hi. how is it possible that a job on the grid doesn't run anymore when it was started with the continuous option? too little memory assigned using the mem option?
[22:04:08] i think if it fails in a certain way it doesn't continue? e.g. if it fails too fast?
[22:04:21] i.e. fail, start again, fail start again, too fast
[22:04:38] hmm
[22:04:50] you should check a log...
[22:04:57] and qstat and friends
[22:04:58] it worked for over 12 hours
[22:05:21] qacct only writes "failed 100 : assumedly after job; exit_status 137 "
[22:05:41] maxvmem is less than I assigned using the mem option
[22:06:24] but "failing too fast" is a bit weird... when there are database problems for example, most scripts will fail fast... but I assume they will run normally if the database is back
[22:06:47] really?
[22:07:14] i would think most continuous things would not fail fast, would just keep retrying themselves
[22:08:51] at least the .err files for the failed jobs are empty
[22:10:08] but how am I able to find out what the reason was for the job to terminate?
[22:11:10] was it the memory? was it too fast failing?
[22:17:05] apper: wrap it in something that gives you more info?
[22:17:27] e.g. strace or a bash script
[22:19:29] (I think) I've configured my lighttpd server
[22:19:30] I've started the webservice, but Apache seems to be in control still; https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Help/NewWeb didn't mention any method of warding off Apache
[22:19:32] Does apache fall back to lighttpd if it 404s? Why is Apache still serving?
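A tiny probe for the proxy question above, written in PHP for consistency with the other sketches here (Josh_Parris's own tool is Python/Flask). valhallasw's point is that behind the Tools front proxy the original host arrives in X-Forwarded-Host rather than in Host:

```php
<?php
// Drop this in public_html as a probe; it prints what the webserver passes on.
header('Content-Type: text/plain');
echo 'HTTP_HOST:             ',
     isset($_SERVER['HTTP_HOST']) ? $_SERVER['HTTP_HOST'] : '(unset)', "\n";
echo 'HTTP_X_FORWARDED_HOST: ',
     isset($_SERVER['HTTP_X_FORWARDED_HOST']) ? $_SERVER['HTTP_X_FORWARDED_HOST'] : '(unset)', "\n";
```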
[22:20:23] Josh_Parris: if lighttpd is not started, apache will handle requests
[22:20:32] it takes about a minute for lighttpd to start, though
[22:20:42] so the grid engine stops running a continuous job but does not provide information on why it is doing so? :(
[22:21:38] apper: exit status 137 == SIGKILL, so something killed it for some reason, probably too much memory
[22:22:48] MrZ-man: but if maxvmem is 20% below the level I provided using the mem option?
[22:24:01] and three jobs failed at the same time today at 18:18 UTC
[22:24:38] I thought I should not add continuous jobs to cron, but this looks like I should do that
[22:24:46] does it use tools-db for anything?
[22:24:52] MrZ-man: yes
[22:25:15] were there problems with tools-db?
[22:25:33] tools-db ran out of binlog space around that time
[22:25:43] or at least that's when it got fixed
[22:25:54] ah, okay
[22:26:04] but I don't have any errors in the err file
[22:28:08] then it's really possible it was started too often
[22:28:47] I didn't know there were such constraints
[22:33:09] did anyone try to set up a twisted and redis-py pubsub app on labs? It's giving me a headache
[22:38:35] jeremyb: where did you get the information about "too fast failing"? where could I find documentation of possible killing reasons for continuous jobs? I could only take them into account if I know them...
[22:42:02] apper: Continuous jobs only stop if (a) it breaks the memory limit (SIGKILL); or (b) it exists with a return value of 0.
[22:42:41] exits*
[22:42:57] ahhh
[22:42:59] okay
[22:43:01] thanks
[22:43:25] Josh_Parris: Apache handles everything by default, but the proxy will send it off to a lighttpd if one is currently running.
[22:44:55] Corent: thanks for the infos, that helped a lot... I just found out that I killed my program using exit status 0 when there are database problems...
[22:44:59] -t
[22:56:31] !log wikimania-support Updated scholarships-alpha to 8fa91e8
[22:56:34] Logged the message, Master
[23:31:20] I'm trying to create a new instance in the logstash project and it seems to be having trouble mounting NFS.
[23:31:30] Creating directory '/home/bd808'.
[23:31:30] Unable to create and initialize directory '/home/bd808'.
[23:32:05] The misbehaving instance is logstash-puppet.pmtpa.wmflabs
[23:33:11] There is another instance in the project that is already using nfs for /home and /data/project so I'm fairly certain that the project level settings are correct
[23:36:48] I can see in my terminal scrollback that I did remember to force a puppet run before I rebooted and that run applied Role::Labsnfs::Client.
[23:39:32] bd808: before you rebooted what?
[23:39:49] jeremyb: logstash-puppet.pmtpa.wmflabs
[23:39:59] Coren: so if it exits in 50ms it just keeps starting again endlessly?
[23:40:29] jeremyb: Which I can now apparently log into.
[23:40:33] bd808: but that's the new or the old?
[23:41:08] no apper huh
[23:41:41] jeremyb: I switched that instance to nfs client. It was not allowing me to log in and showing an error creating my homedir. But after whining (and waiting as a side effect) it works
[23:42:17] So I guess I just needed to wait for the NFS server to see the new client or something similar
[23:42:37] sounds wrong
[23:42:46] check the logs and see if some timing matches up
[23:50:47] jeremyb: Looking at logs reminds me that I rebooted a second time via the [[Special:NovaInstance]] interface between the failure and success.
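apper's discovery above is the key rule for continuous jobs: exiting 0 tells the grid the job is done for good, while a non-zero exit (or a kill) gets it restarted. A sketch of a wrapper that treats a lost database as a failure rather than a clean shutdown; host, credentials, and database name are placeholders:

```php
<?php
// Continuous-job sketch: never exit 0 on a transient error such as a
// database outage, or the grid engine will not restart the job.
$user = 'uXXXX';  // placeholder: see ~/replica.my.cnf
$pass = 'secret'; // placeholder
$db = @mysqli_connect('tools-db', $user, $pass, 'mydb_p'); // db name illustrative
if ($db === false) {
    fwrite(STDERR, "tools-db unreachable; exiting non-zero so the grid restarts us\n");
    exit(1); // abnormal exit: the continuous job gets relaunched
}
// ... normal bot loop here ...
exit(0); // clean exit: only when the job should genuinely stop forever
```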
[23:51:51] So puppet -> sudo reboot -> ssh fail (can't create homedir) -> reboot -> ssh works (NFS mounted /home)
[23:52:09] * bd808 shrugs
[23:52:20] maybe it just needs enough puppet runs
[23:52:31] which probably means there's a dependency missing
[23:53:59] That could be. At a previous employer the puppet run process was "run it in a loop until it stops changing things" :(
[23:54:27] hah
[23:54:38] with a max number of iterations?
[23:54:59] I think so but something stupid like 50
[23:55:07] sleep between iterations?
[23:55:21] Nope.
[23:55:55] a cluster of puppetmasters or very few puppets?
[23:55:59] Our system was so jacked up that each run took at least 5 minutes too. External facts
[23:56:16] No it was just crappy manifests
[23:56:23] hrmmmm
[23:56:35] was there a command for e.g. shoot yourself in foot?
[23:57:13] There were recursive dependencies in modules and nasty custom parser hacks.
[23:57:22] It was not optimal :)
[23:57:39] recursive works??
[23:58:57] As I recall there were roles that created new facts that triggered other roles. So not recursive I guess but processing order dependent
[23:59:57] Hence the "keep running it" as there is no guarantee of ordering by default