[00:20:16] !log wikimania-support Updated scholarship-alpha to be03ac2 [00:20:18] Logged the message, Master [01:43:16] bd808|BUFFER: Tool labs has one. [07:38:23] Coren, are you around? [11:09:51] !log deployment-prep upgrading varnish on deployment-cache-text1 [11:09:56] Logged the message, Master [11:12:06] !log deployment-prep upgrading varnish on deployment-cache-bits03 [11:12:10] Logged the message, Master [11:12:50] !log deployment-prep upgrading varnish on deployment-cache-mobile01 [11:12:54] Logged the message, Master [11:13:52] !log deployment-prep shutting down deployment-search01 and deployment-searchidx01. They were for Lucene search. We use Elasticsearch now. [11:13:56] Logged the message, Master [11:18:23] !log deployment-prep made sure puppet agent is enabled on varnish caches and reran it manually [11:18:27] Logged the message, Master [11:24:50] !log deployment-prep upgrading varnish on deployment-parsoidcache [11:24:55] Logged the message, Master [13:59:05] Earwig|away: I am now. (Sorry, timezone fun) [14:01:07] Coren: are there updates for the dewiki database issues? [14:02:16] I was unaware of remaining issues, and mysql reports that the slave is up to date and has no errors. Lemme poke Sean. [14:03:30] yay [14:11:57] * Coren compares slaves to master. [14:19:09] giftpflanze: I'm a little confused; I just compared the number of rows in revision between prod and the replicas and there is no difference. [14:22:16] What is the issue? [14:26:30] giftpflanze: I don't know what the issue is I should be looking for? I thought it was missing rows? [14:32:39] it is [14:33:47] giftpflanze: Well, I'm not seeing any rows missing. Do you have example(s)? [14:34:12] i'm particularly interested in externallinks, the last dump had 13872654 rows, labs has 2827590 [14:35:41] Ah, I was looking at revision. Lemme do the same comparison with externallinks. [14:36:27] toolserver also has 13895824 and changing [14:38:50] Aha! There's definitely a difference there.
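The check Coren describes is a plain row count run on both sides; a minimal sketch, assuming the labs replica naming convention (`dewiki_p` as the public view of the dewiki database):

```sql
-- Run the same count against production and against the labs replica,
-- then compare the two numbers; a missing-rows problem shows up directly.
SELECT COUNT(*) FROM dewiki_p.externallinks;
```

On the replicas this can be issued through any MySQL client using the replica credentials; the exact host names vary, so they are omitted here.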
[14:39:00] yay [14:39:04] progress [14:39:36] (Initial reports, afaik, mostly talked about revision so that's what we were checking) [14:39:47] How odd. [14:40:01] i added a comment to the bug. seems no one noticed [14:40:19] Well, I certainly didn't. [14:40:24] Hm. [14:57:21] giftpflanze: Bah, that table doesn't have a primary key. That makes resync "fun". [14:58:07] haha, yeah, that's why i created one fro my distinct el_to copy :) [14:58:16] *for [15:19:24] !Cyberpower [15:19:24] This user is made out of awesome plasma. [15:19:40] :) [15:41:04] giftpflanze: I've dumped and restored externallinks; things should be okay now. [15:41:21] omg, that's fantastic! [15:42:09] Thankfully, it's a small table so the brute force method works. [15:42:36] oh, how many rows do big tables have? [15:43:01] mh, maybe forget rows [15:43:09] columns are also important [15:56:47] Yeah, but 13 million small rows is nothing compared to things like revision or archive on enwiki. :-) [16:08:12] Coren: I solved my smtp relay issue with a local install of postfix, which is actually better because I was able to configure it to deliver all mail to a single local spool instead of the actual To: address. Less chance of someone spamming real folks while testing. [16:08:41] bd808: Indeed. For special cases, a local install is often best. [16:15:57] Coren, https://en.wikipedia.org/wiki/User_talk:TParis#Edit_summary_tool isn't loading anymore. Why is that? [16:16:11] Coren, err wrong link. [16:16:22] Coren, http://tools.wmflabs.org/xtools/pcount/index.php?name=Betacommand&lang=en&wiki=wikipedia [16:17:17] Cyberpower678: I don't know; it clearly gets to your tool. Do you have some debugging you can turn on? [16:17:33] Not that I'm aware of right now. [16:17:40] Somewhat busy with other stuff.
[16:18:02] Nothing on the tool has changed, but apparently, according to another user, it is returning a 502 [16:18:12] Coren, ^ [16:19:47] Coren, Proxy Error [16:19:48] The proxy server received an invalid response from an upstream server. [16:19:48] The proxy server could not handle the request GET /xtools/pcount/index.php. [16:19:48] Reason: Error reading from remote server [16:19:58] That's certainly because your tool never answers the proxy; I see it's currently pretty much at capacity. [16:20:30] There are only so many outstanding connections your webservice will accept, so if it's stuck on some then likely others will time out. [16:20:43] You'll need some logging if you want to figure out what is going on. [16:21:32] Coren, I'm confused. It's been running fine all this time. Why does it randomly stop now? [16:22:04] It's not stopped, and it's not random. Webservices are allowed five outstanding requests; the others are queued. [16:22:25] If your requests take minutes to answer, then obviously you'll get timeouts. [16:23:23] (A good rule of thumb is that if your service can't answer within ~30s or so, any burst of activity will end you at capacity and start delaying) [16:23:26] How do I see that? How do I fix it? I'm kind of clueless here admittedly since I never had this issue before. [16:23:56] It's the exact same code that ran on toolserver and never had issues. [16:24:10] Cyberpower678: You'll need to add some logging; see what requests come in, and how long they take to respond. With that data, you'll be able to profile where the time is spent and consider optimizations or alternatives. [16:24:34] What SQL requests you are making and so on. [16:25:03] The performance profile on toolserver isn't the same as tool labs. A query that was fast on one might be slow on the other. [16:25:07] And vice-versa.
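The timeouts discussed above have to be chosen by the tool itself; a sketch of the usual PHP knob (the 30-second value is illustrative, not a recommendation):

```ini
; php.ini (or a per-directory override, where the webserver supports one):
; abort scripts that use more than 30s of actual execution time.
; Note: this counts CPU-side execution only, not time spent waiting
; on SQL queries or other I/O.
max_execution_time = 30
```

The same limit can also be set per script with PHP's `set_time_limit(30)` at the top of the file.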
[16:25:23] Especially if you do selects on 'revision' and not 'revision_userindex' [16:25:55] Coren, https://tools.wmflabs.org/xtools/ec/ doesn't load either. All that should do is display a form. [16:26:25] Cyberpower678: You elected to stuff all your tools in the same service; they share the same webserver and therefore the same request queue. [16:26:32] Coren, everything is pretty much optimized. [16:26:47] Cyberpower678: Clearly not, since you have queries that take several minutes to answer. [16:27:23] Coren, that shouldn't be happening. PHP should timeout. [16:27:31] Or you're trying to do too much interactively. Also, I'm pretty sure you do no sort of caching whatsoever, which means that someone can stuff your tool full of long requests just by leaning on F5 [16:27:46] Did you set a timeout? [16:27:51] And return a timeout message. [16:28:09] Coren, no. The webserver should already have one. [16:28:21] No, the webserver most certainly "should" not have one. [16:28:45] What would be a good value for you might not be good for someone else. If you want a timeout, pick one. [16:29:09] Cyberpower678: http://php.net/manual/en/function.set-time-limit.php [16:29:55] I'll add timeouts to the tools then. [16:30:11] I'm pretty sure it has a max_execution_time [16:30:32] Your webserver has that defined in the php.ini file, doesn't it? [16:30:39] max_execution_time only counts actual execution, not time it's waiting for SQL results et. all. [16:30:49] et. al.* [16:31:13] Also, you might want to look into putting timeouts on your most expensive SQL queries. [16:31:23] Robustness is *hard*. :-) [16:32:18] Can you tell me what's hung up on xtools? [16:33:35] Coren, ^ [16:34:49] Cyberpower678: No. But you might find some useful information by turning on debug.log-request-handling = "enable" in your .lighttpd.conf -- I think it even outputs times by default. [16:35:12] So you'll see what requests you get exactly.
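Coren's suggestion amounts to a one-line addition to the tool's lighttpd override file; a sketch, assuming the file lives in the tool account's home directory as discussed:

```
# ~/.lighttpd.conf -- make lighttpd log how each request is handled
# (per the discussion, this may include timing by default), so slow
# requests can be identified in the webserver logs.
debug.log-request-handling = "enable"
```

The file must be plain ASCII/UTF-8 with no byte-order mark and must end with a newline, or lighttpd may fail to parse it.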
[16:35:34] You might also want to add debug output to your scripts so that it tells you what takes long and what doesn't. [16:36:14] Coren, where's the file located? [16:36:43] It's in your tool's home; you probably don't have one yet if you've never changed your lighttpd settings [16:36:47] !newweb [16:36:47] https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Help/NewWeb [16:36:50] ^^ [16:40:16] Coren, did I do it right? [16:43:32] I can't see; you've apparently set your tool's home to exclude others (which is probably not a good idea, btw) [16:44:03] Coren, can you check if I set the .lighttpd.conf file up correctly, and reset xtools to run fresh again? [16:44:09] I did? [16:45:14] Coren, check now. [16:45:44] * Coren groans. [16:46:02] Did you use that unicode editor again? That file has a BOM at the beginning and no trailing newline. [16:46:18] Pro tip: Don't do that. [16:46:19] OOPS [16:46:37] :-) [16:52:43] CP678: Other than that, it should be okay. [16:55:10] Silke_WMDE, is there already a date for the office hour regarding TS / Labs? [16:55:25] Coren|Lunch, can you reset xtools and kill the existing connections to the tool? [16:55:55] ireas: No, sorry, not yet. Any wishes? [16:56:02] Cyberpower678, if you want to restart the webserver: you can do this by yourself (`webservice restart`) [16:56:06] Cyberpower678: You haven't fixed your .lighttpd.conf yet. Also you can do this yourself with just "webservice stop / webservice start" [16:56:18] Silke_WMDE, no problem, just wanted to be sure I didn’t miss something :) [16:56:20] * Coren|Lunch goes to lunch for realz. [16:56:52] Coren|Lunch, what do I need to fix? Remember that I know nothing about this. [16:56:55] ireas: I'll post it on toolserver-l and labs-l. I'll have to coordinate with a colleague I didn't see. [16:57:03] Silke_WMDE, okay, thanks! [16:57:37] My battery is about to die. [16:58:26] LOL [16:58:29] Restarting webservice [17:01:21] Cyberpower678: Fix needed was "Did you use that unicode editor again?
That file has a BOM at the beginning and no trailing newline." [17:01:59] I would guess the BOM would make nginx choke when parsing the file [17:03:39] s/nginx/lighttpd/ [17:38:46] Chris is currently running a fuzz test against wikimania-scholarship.wmflabs.org. I thought I'd mention it here just in case anyone notices problems with the instance proxy as a result. Poke me and I'll poke him to kill it if it causes problems. [17:39:52] He's only sending about one request per second so it shouldn't really bother anything. [20:08:28] sshing into instances and sudoing is very slow - is LDAP overloaded? [21:15:27] !log deployment-prep Debugging apache on deployment-apache33, may look hung [21:15:32] Logged the message, Master [21:24:45] is dns resolution super slow in labs? [21:25:11] i hope it's not my fault [21:35:10] giftpflanze: why your fault? are you conducting an internal DNS DDoS Amplification Attack :P [21:35:23] giftpflanze: ? [21:36:04] i have 1000 threads that access the net [21:36:08] manybubbles: There's an issue with the openstack DNS subsystem atm; it's acting a bit flaky. [21:36:40] manybubbles: Not sure why yet. [21:36:47] thanks [21:36:54] * Coren glares suspiciously at giftpflanze. [21:37:01] giftpflanze: 1000 threads sounds reasonable for a *real* good bot :) [21:37:22] it didn't cause problems before i guess? [21:38:29] giftpflanze: It looks like the issue started about two days ago. [21:38:39] then it can't be me [21:38:53] i started 3 hours ago [22:03:23] heh, i'm trying to be good and add docs to my puppet classes that puppet docs finds, and also have a README.md. now if i open a README.md in vim and the string "_Resource" is in it, it starts to do some weird highlighting that i don't want.
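The BOM-and-missing-newline problem called out earlier can be fixed from the shell; a sketch assuming GNU sed (the file name and its contents here are just for illustration):

```shell
# Demo: create a config file that (wrongly) starts with a UTF-8 BOM
# (bytes EF BB BF) and lacks a trailing newline -- the exact problem
# lighttpd choked on.
printf '\357\273\277debug.log-request-handling = "enable"' > .lighttpd.conf

# Fix 1: strip the BOM from the start of line 1 (GNU sed \xHH escapes).
sed -i '1s/^\xEF\xBB\xBF//' .lighttpd.conf

# Fix 2: append a newline if the last byte is not already one.
# ($(tail -c1 ...) is empty exactly when the file ends with a newline.)
if [ -n "$(tail -c1 .lighttpd.conf)" ]; then echo >> .lighttpd.conf; fi

# The file now starts with plain ASCII and ends with a newline.
head -c5 .lighttpd.conf    # prints: debug
```

The `$(tail -c1 …)` idiom works because command substitution strips trailing newlines, so a file that already ends in one yields an empty string.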
i just have that string because our labs URLs to projects are all "Nova_Resource:Foo" :p hehe [22:03:41] i bet _Resource is something reserved [22:04:29] * mutante wants to link from the puppet module README.md to the Labs project this has been and can be tested in [22:06:06] mutante, _ is an emphasis character (_italic_, __bold__), see http://daringfireball.net/projects/markdown/syntax#em [22:06:13] maybe this causes the strange formatting [22:06:47] hashar: (or other beta commons sysops) can I get added to the gwtoolset group there to help dan-nl test [22:07:11] My beta username is "BryanDavis" [22:07:36] aren't you on beta already? [22:07:49] ireas: thanks [22:07:55] In the project yes, but I have no rights on the wikis [22:08:01] my account is dan-nl and it would be great to add DivadH, the project manager, on http://commons.wikimedia.beta.wmflabs.org to the gwtoolset group [22:08:26] * bd808 wonders if he should know a trick that he doesn't know [22:08:37] ahh [22:08:49] you mean a mediawiki permission right, not the labs project on wikitech [22:08:51] gmmh [22:09:10] Error: 503, Service Unavailable at Fri, 06 Dec 2013 22:09:03 GMT [22:09:10] ahah [22:09:14] while logging in :( [22:09:20] oh boy [22:09:40] so hmm [22:09:43] I am going to bed [22:10:11] You're always saying that [22:10:30] given my wife is staring at me …. :D [22:10:39] will reboot the text cache [22:11:13] !log deployment-prep upgrading packages on text cache / running puppet and rebooting it. [22:11:23] Logged the message, Master [22:13:08] no worries hashar we can pick this up on monday. if anything it might be good just to prove that the extension can be present without interfering with anything.
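As ireas notes, `_` starts emphasis in Markdown, which is also what confuses the vim highlighter; escaping the underscore, or putting the string in inline code, avoids it. A sketch using the string from the discussion:

```markdown
<!-- Bare underscores in prose can be parsed as emphasis markers: -->
Nova_Resource:Foo

<!-- Escape them ... -->
Nova\_Resource:Foo

<!-- ... or use inline code, inside which no emphasis parsing happens: -->
`Nova_Resource:Foo`
```

Inside an explicit link target the underscore is generally safe; it is only in running prose that escaping matters.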
enjoy your weekend with your wife :) [22:13:24] * bd808 looks at Special:ListUsers to find someone else to poke [22:13:28] she's staring at me because I can't help her with some C++ bug [22:13:57] ah I am in [22:13:57] oh my … [22:14:03] Tell her that python is better :) [22:14:10] :D [22:14:51] * hashar looks how to grant bd808 bureaucrat powers on beta cluster :D [22:15:40] http://deployment.wikimedia.beta.wmflabs.org/wiki/User:Bd808 ? [22:15:49] hashar: BryanDavis [22:16:11] But maybe also Bd808? I make too many accounts [22:16:34] I know the password for BryanDavis :) [22:16:53] User account "Bd808" is not registered. [22:17:02] per wikitech User: [22:17:38] mutante: http://commons.wikimedia.beta.wmflabs.org/ [22:17:45] Apparently I have both [22:17:49] that Bd808 account should have super powers now [22:18:23] was just speaking for labs(console) users [22:18:29] can you believe we have something like 20 special pages in the 'User and rights' section? [22:19:01] yes [22:19:39] so i need to find upstream for the markdown syntax highlighting in vim :p [22:19:55] annoyed ##vim :) [22:20:35] mutante: Tim Pope [22:20:55] hashar: ty:) [22:21:10] mutante: the author/maintainer is usually listed at the beginning of the syntax file [22:21:26] hashar: Thanks! [22:21:30] mutante: which is in something like /usr/share/vim/vim7.4/syntax/markdown.syn [22:22:08] bd808: does it work? [22:22:09] hashar: cool, got it! [22:22:21] !log deployment-prep rebooting deployment-cache-text1 (aka text varnish) [22:22:31] Logged the message, Master [22:22:36] Not sure yet. Trying to change password on bd808 account and boom! [22:22:41] :D [22:22:52] ah hmm that reboot was unfortunate heeh [22:23:43] bd808: the evil thing is that since you have shell access to the beta cluster, you can grant yourself rights from the console :] [22:24:01] the hardcore way: mwscript eval.php --wiki=labswiki [22:24:16] <^d> hashar: I think I killed beta.
[22:24:17] <^d> :( [22:24:31] ^d: I restarted the varnish cache a few minutes ago [22:24:37] <^d> Hmm, maybe that's it? [22:24:52] <^d> I was running a ton of job runners from deployment-prep to reindex commonswiki. [22:25:01] <^d> And then I suddenly couldn't connect in the browser anymore. [22:25:06] <^d> (Was afraid it was me) [22:25:28] I knew I should have gone to bed [22:27:21] ^d: beta has its own job runner hosts [22:27:33] bd808: varnish text is back [22:27:54] deployment-jobrunner08.pmtpa.wmflabs for most jobs [22:27:59] there is one for video scaling [22:28:16] <^d> Ah, site is back now for me. [22:28:21] the jobrunner instance spends most of its time polling for new jobs [22:29:39] ^d: if you still know something about filebackend, dan-nl has a change pending in operations/mediawiki-config.git to add a new filebackend [22:29:42] for gwtoolset [22:30:15] Coren, what’s the current status of the Tomcat lighttpd integration? Is there something I can do? [22:30:44] ireas: No, but I should have it ready for testing Monday-ish. [22:30:54] Coren, okay, thanks! :-) [22:31:10] <^d> hashar: I don't see it. [22:31:31] thanks hashar. ^d via irc and email aaron said he thinks this config will work. https://gerrit.wikimedia.org/r/#/c/98684/6/wmf-config/filebackend-labs.php [22:31:32] And btw it's tomcat /instead/ of lighttpd. :-) [22:31:44] Just done with the same general setup. [22:31:55] ah, okay [22:32:05] ^d: nm, aaron handled it :-] [22:32:07] but lighttpd works as a proxy? [22:32:13] it's already on the beta cluster. just waiting to be added to the gwtoolset group on that server so that i can test it out [22:32:14] dan-nl: so that should be deployed on beta already [22:32:15] <^d> Yeah I see it now.
[22:32:22] yes it is there [22:33:09] once i get added to the gwtoolset group i'll be able to access the extension at http://commons.wikimedia.beta.wmflabs.org/wiki/Special:GWToolset [22:33:29] then i can try uploading a small metadata set and see if it works [22:34:20] dan-nl: what is your username on http://commons.wikimedia.beta.wmflabs.org ? [22:34:26] dan-nl [22:34:51] i hate our user permission system [22:34:54] <^d> manybubbles|away: Done reindexing, commonswiki now has 3 indexes :) [22:35:24] dan-nl: you should be in [22:35:37] would be great if you could also add DivadH [22:35:53] he's the project manager and will also test on monday or this weekend if he has access [22:36:09] done [22:36:16] can you confirm it is working? [22:36:23] looking now [22:36:26] yep, i'm in [22:36:32] will test with a small data set [22:36:56] sorry it took so long :/ [22:37:13] I will be out for the whole week-end [22:37:53] but feel free to ping me on monday if something is wrong [22:41:15] np [22:41:27] it looks like it's working but i got this message now [22:41:27] Copy uploads are not available from this domain. [22:42:26] otherwise things seem to be working ... [22:44:02] might need some more configuration in the wmf-config files : / [22:44:07] so i think the filebackend is working fine, but the beta server isn't able to download external images? [22:45:13] k, i can look … the [22:45:42] you can try uploading a file with the upload wizard http://commons.wikimedia.beta.wmflabs.org/wiki/Special:UploadWizard [22:47:10] http://commons.wikimedia.beta.wmflabs.org/wiki/File:Image.jpg works for me [22:47:59] k, looks like i'll have to troubleshoot it a bit further, not sure what that message means yet … will look through the code base [22:48:07] heading to bed [22:48:21] see you on monday :-] [22:49:32] have a good weekend and thanks for your help! [22:57:47] Anyone able to help me check an inquiry for labs before I run a request?
[22:58:39] mediawiki 1.22 is now stable [22:58:58] released a few minutes ago.. in case you guys wanna upgrade your labs instances [22:59:35] ChrisGualtieri: what kind of inquiry do you mean [23:00:02] I want to run a scan on Catscan2 against the stub category [23:00:12] And that means all 2.2 million [23:01:13] I can't figure out how to do a proper regex in the database scanner from AWB and it is giving me really bad results, and I use up all the memory and crash out at 300,000+ [23:01:34] hmmm...i see [23:01:49] that would be something to be escalated to "2nd level" :) [23:01:56] like Coren :) or the list [23:02:03] Would a lower run be acceptable? [23:02:18] i suppose so, yea, i just have no idea about how much lower [23:02:25] I have a few different targets, but I can't get the page lists I need because I do not have the admin bit and I don't want to break anything [23:02:51] One would be the WP:USA importance and class run... it breaks when I query the category at 25,000 [23:03:05] Need the HighLimits plugin to even get the list [23:03:31] the best thing would probably be if you just summarize what you said and send it to the mailing list [23:03:41] that sounds like it could use input from several people [23:03:54] and it works better when they don't have to be online in the same timezone [23:04:16] by list i mean maybe even both, a labs and wikitech-l crosspost [23:04:29] Well, I've already done a much more difficult task by hand. But I lack the skills to execute it - if anyone knows how to check for stub class in the database scanner I'd be set [23:04:46] labs-l and wikitech-l are replete with expertise that can help. [23:05:13] I'm just out of my element, a stranger in a strange land as they say [23:05:28] sorry, i dunno about the database scanner [23:05:38] but there will be others who do [23:06:21] Where would I need to go to submit this?
[23:07:32] ChrisGualtieri: https://lists.wikimedia.org/mailman/listinfo/labs-l [23:07:42] https://lists.wikimedia.org/mailman/listinfo/wikitech-l [23:09:49] I think I just crashed my mozilla firefox, but looking [23:12:02] I so do not understand this list thing [23:26:59] ChrisGualtieri: if it's about a general list thing - this may help http://tools.wmflabs.org/paste/view/32492014 [23:31:13] i subscribed, since I'll probably be needing to in the future [23:31:53] ChrisGualtieri: yea, good idea. welcome ! [23:59:52] Coren: Hmm. I don't want to be ungrateful - but: it takes *ages* to load some resources [23:59:58] Coren: https://tools.wmflabs.org/wikiviewstats/loading_page.png
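For the earlier question about finding stub pages without AWB's database scanner: on the replicas this is a plain `categorylinks` query. A hypothetical sketch (the `LIKE` pattern and limit are illustrative; `\_` escapes the underscore, which is itself a single-character wildcard in SQL `LIKE`):

```sql
-- Pages in any category whose name ends in "_stubs", on the enwiki replica.
-- cl_from holds the page_id of the member page; join to page for titles.
SELECT p.page_title, cl.cl_to
FROM enwiki_p.categorylinks cl
JOIN enwiki_p.page p ON p.page_id = cl.cl_from
WHERE cl.cl_to LIKE '%\_stubs'
LIMIT 100;
```

Running a bounded query like this server-side avoids pulling 2.2 million titles into a desktop tool at all.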