[02:00:41] You don't have permission to access /~addbot/status on this server. [02:00:42] D: [02:04:51] addshore: Is bug, or expected? [02:07:14] Coren: bug [02:07:30] not sure what though and I am in no place to dig deeper right here :P [02:07:32] Plz to give context? [02:07:46] Wait, this on labs? [02:07:50] yes [02:08:02] Then /addbot/status Note the conspicuous lack of ~ :-) [02:08:33] ~ is there ;p [02:08:46] That's my point. It shouldn't be. :_) [02:08:49] http://bots.wmflabs.org/~addbot/status 403 Forbid :O [02:08:55] Ah! Bots! [02:08:57] Coren: no, it should be :P [02:09:09] or should it :/ [02:09:18] On tools it shouldn't. [02:09:25] on bots it should ;p [02:09:36] this page has been refreshing ever min on my tablet and its just vanished :P [02:09:54] I would say it was gluster having issues with /data/project on apache on bots but then /~addshore works xD [02:10:33] Gimme a sec, I go see. [02:12:57] * Coren tries to figure out how petan did this. [02:15:33] addshore: id: addbot: No such user [02:15:43] So ~addbot not working is expected. [02:15:52] * addshore made the folder addbot [02:16:31] That's not enough, ~user syntax requires an actual account with a $HOME [02:16:42] but but, this has been working for days [02:16:45] o_O [02:16:47] :P [02:17:11] see /data/project/public_html/addbot/ [02:17:35] The configuration of bots-apache01 doesn't even /look/ there. [02:17:38] I agree thinking about it now it probably shouldnt work (not that I have looked at the apache config) [02:18:06] Did someone perhaps add it manually and it then got trampled by puppet? [02:18:20] mhhhm, I dont think so [02:19:29] Because from what I see, it was configured to use UserDir which... requires a user. [02:21:18] * addshore has just been looking at the apache logs [02:21:38] seems it isnt in the access log but rather the other_vhosts_access.log :P [02:22:20] The only not-user-specific override I can see is for cluebot [02:22:31] ... are you sure this ever worked? 
:-) [02:22:33] bots.wmflabs.org:80 xx.xx.xx.xx - - [07/Apr/2013:06:50:53 +0000] "GET /~addbot/status HTTP/1.1" 304 276 "http://bots.wmflabs.org/~addbot/status [02:22:55] yes :P I can't have gone that mad [02:23:56] This makes no sense to me unless someone recently changed the config, because as it is right now there's no way this could have ever worked. [02:24:16] * addshore has no idea [02:24:35] unless this is being served up by a secret apache server ;p [02:24:43] .. except it's in the apache01 logs ;p [02:24:57] And besides, that's where the public IP lives. [02:24:58] ahh well, I will have a look tomorrow when I wake up! :P Night! [02:25:02] * Coren waves. [08:10:47] hi [08:36:13] !log wikidata-dev wikidata-dev-9 changed LocalSettings.php "$wgResourceLoaderDebug = true;" to "$wgResourceLoaderDebug = false;" [08:36:16] Logged the message, Master [12:38:43] Good morning, Labs. [12:40:27] Morning [13:16:42] Coren I think that binary logs should live on a different partition on -db [13:16:53] putting them in /var/log is a really evil idea [13:18:08] petan: Hm. Why so? For performance, it's generally better to have 'em elsewhere than the actual DB, although that's probably not as important on a VM. [13:18:31] because it's unnecessarily fragmenting and slowing down the system partition [13:18:35] mhm [13:18:46] well, I don't really care but / is small and important [13:18:52] if it gets filled up by binary logs... [13:19:11] on bots sql is having far higher load, and binary logs are getting huge sometimes [13:19:19] on tools mysql is not used very much atm [13:19:43] btw I believe that vda and vdb live on the same physical storage [13:19:59] so putting them on vda or vdb will be no change [13:20:00] Well, that box runs exactly /one/ daemon, so I don't think it's relevant. Size counts, but I was planning on keeping binlogs at 4G; but you have to remember that the local VM is a transient measure; I was planning on moving those to the real kickass DB anyways. 
:-) [13:20:24] DBs in VM suck anyways. :-) [13:20:30] probably [13:20:52] The DB hardware asher is stting up, on the other hand, kicks butt. [13:50:01] !log tools local-afcbot: unable to run mono applications: The assembly mscorlib.dll was not found or could not be loaded. [13:50:02] Logged the message, Master [13:50:08] Coren ^ [13:50:18] what is proper way to install a package to all exec nodes? [13:50:49] petan: 'ongrid aptitude install foo' [13:51:02] does it track the change? [13:51:10] so that new exec nodes will have this package as well? [13:51:14] petan: And note the addition to http://www.mediawiki.org/wiki/Wikimedia_Labs/Tool_Labs/Notepad [13:51:31] No, I have to sit down an write the puppet classes this week; it's #2 on my TODO [13:51:37] ok [13:51:58] #1 is the NFS server. :-) [13:53:10] btw - why not install mono-complete on exec nodes? one day you will have an application on tools which will need some package from it... it could be any package [13:53:27] even if my bot may not need all of these packages, some other application might [13:53:51] runtime is very limited... [13:54:31] Well, there is no overriding reason not to, I suppose, though doesn't mono-complete include lots of dev-only stuff that's not useful at runtime? [13:54:49] I don't think so... [13:55:01] well, true [13:55:10] it contains mono-devel [13:55:27] on other hand maybe if someone needed to compile their application, execution nodes could be better than loading tools-login with that :P [13:55:33] compiling is cpu expensive [13:55:48] Hm. I admit I thought about enabling qmake anyways. 
[13:56:57] I have no idea which package contains mscorlib.dll [13:57:08] petan: lemme check [13:57:33] http://stackoverflow.com/questions/10490155/unable-to-run-net-app-with-mono-mscorlib-dll-not-found-version-mismatch [13:57:43] I got it to work by installing mono-complete: - sudo apt-get install mono-complete [13:57:44] :D [13:58:02] libmono-corlib2.0-cil [13:59:04] Lots of suggest and depends; it's probably better to just install mono-complete after all. [13:59:17] that's what I think :)) [13:59:20] Not like we are that short of diskspace. [13:59:37] so I install? [13:59:45] or not [14:00:41] btw I found it really very hard to get answer for this: http://stackoverflow.com/questions/15860481/how-can-i-limit-the-memory-size-for-net-mono-process [14:00:55] which I asked just to find out how to run c# applications on SGE with low virt memory [14:01:01] because all I tried so far didn't work [14:01:11] all mono applications allocate several GB of virtual memory [14:01:17] even if they eat less than 20mb of resident [14:01:35] and if you limit virtual memory to lower value than 1gb they just crash with coredump [14:02:22] I don't really know what is so wrong on that question that it got downvoted maybe they are just lazy on stack overflow :D [14:05:25] Coren I am installing mono-complete then? or should I just get the package my bot needs? [14:05:49] Nah, just go ahead and install mono-complete with ongrid. [14:06:20] (which is just a for loop of ssh sudo) [14:07:43] !log tools petrb: ongrid apt-get install mono-complete [14:07:45] Logged the message, Master [14:08:08] 200mb [14:08:11] :o [14:08:39] Yeah, it's not small, which is why I didn't install it by default. 
:-) [14:08:49] Change on 12mediawiki a page Wikimedia Labs/Tool Labs/Notepad was modified, changed by Petrb link https://www.mediawiki.org/w/index.php?diff=670296 edit summary: [+14] [14:19:28] @labs-resolve exec [14:19:28] I don't know this instance - aren't you are looking for: I-00000604 (tools-exec-01), I-0000064c (tools-exec-02), [14:19:38] @labs-resolve tools- [14:19:38] I don't know this instance - aren't you are looking for: I-00000515 (webtools-odie), I-000005c9 (webtools-login), I-000005ca (webtools-apache-1), I-000005cb (webtools-rr), I-000005f9 (tools-login), I-000005fb (tools-puppet-test), I-00000600 (tools-webproxy), I-00000604 (tools-exec-01), [14:19:51] damn these webtools :OP [14:43:38] Coren what does ongrid do [14:43:52] I got a feeling like it's 4th server I am installing to... [14:43:59] It's a dumb for loop that does ssh. :-) [14:44:08] where huh [14:44:15] I thought there are 2 nodes [14:44:32] the install is running like for 4th time [14:44:35] no idea where [14:44:41] 2 nodes, the puppet master, the shadow, the webserver and the login host. [14:44:45] ah [14:44:49] webserver :o [14:44:58] puppet master? o.o [14:45:17] puppet master is needed for permission validation; it needs to be able to see the executables. [14:45:19] we need a similar script for exec [14:45:21] only [14:45:24] ok [14:45:28] Same with shadow. [14:46:17] Webserver is convenience; I want to make sure that people can debug their code on login knowing it will work as a web service, or that tools that have both a web component and job(s) won't get confused by different libraries/executables. [14:46:26] Especially since you can start jobs from a web tool. [14:46:34] k [14:47:33] But ongrid is just a hack; there will be a role class in puppet for "run environment" to deal with that set. 
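Coren describes `ongrid` as nothing more than a for loop of `ssh sudo` over the hosts that share the run environment, and petan asks for a variant that only touches the exec nodes. A minimal Python sketch of that idea follows. The real script is not shown in this log, so the host list is an assumption: only tools-exec-01, tools-exec-02 and tools-login appear in the log, and the other names are placeholders. The functions only build the commands; they do not run them.

```python
import shlex

# Assumed host list. The log names the roles (2 exec nodes, master, shadow,
# webserver, login host) but not every hostname; the marked ones are placeholders.
GRID_HOSTS = [
    "tools-exec-01",
    "tools-exec-02",
    "tools-master",     # placeholder name
    "tools-shadow",     # placeholder name
    "tools-webserver",  # placeholder name
    "tools-login",
]


def ongrid_commands(command, hosts=GRID_HOSTS):
    """Return one ssh argv per host that runs `command` under sudo.

    The remote part is joined into a single shell-quoted string because
    ssh hands everything after the hostname to the remote shell.
    """
    remote = shlex.join(["sudo", *command])
    return [["ssh", host, remote] for host in hosts]


def exec_only(hosts=GRID_HOSTS):
    """Keep only the execution nodes (hypothetical naming convention)."""
    return [h for h in hosts if h.startswith("tools-exec-")]


if __name__ == "__main__":
    # Dry run: print what would be executed on the exec nodes only.
    for argv in ongrid_commands(["apt-get", "install", "mono-complete"], exec_only()):
        print(shlex.join(argv))
```

Passing each argv to `subprocess.run` would turn the dry run into the real loop. As the log notes, this remains a stopgap until a puppet role class covers the same set of hosts.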
[14:53:19] Change on 12mediawiki a page Wikimedia Labs/Tool Labs/Needed Toolserver features was modified, changed by 93.193.25.210 link https://www.mediawiki.org/w/index.php?diff=670313 edit summary: [+55] /* Labs wide (not only bots / tools), but available for all projects */ started cleaning up bottom [14:54:23] !toolsdocs [14:54:30] @search tools [14:54:30] Results (Found 2): morebots, labs-morebots, [14:54:35] @search docs [14:54:35] Results (Found 3): docs, demon, puppet, [14:54:45] @search wikitech [14:54:45] Results (Found 5): pxe, wikitech, mobile-cache, botsdocs, putty, [14:54:56] Coren where is documentation for tools? [14:55:06] @search console [14:55:06] Results (Found 42): labs, instancelist, instance-json, amend, sal, sudo, access, stucked, group, pathconflict, terminology, manage-projects, rights, docs, ssh, documentation, start, link, socks-proxy, console, resource, security, project-discuss, git, port-forwarding, instance, bot, pl, projects, accountreq, puppetmaster::self, addresses, initial-login, deployment-beta-docs-1, sudo-policies, forwarding, labsconsole, sudo-policy, puppetmasterself, labswiki, requests, single-node-mediawiki, [14:55:20] @regsearch console.*tools [14:55:20] No results were found, remember, the bot is searching through content of keys and their names [14:55:23] http://www.mediawiki.org/wiki/Wikimedia_Labs/Tool_Labs/Help [14:55:38] @search Tool_Labs [14:55:38] Results (Found 2): tl, tooldocs, [14:55:46] tooldocs. 
:-) [14:55:49] !alias toolsdocs tooldocs [14:56:00] !toolsdocs alias tooldocs [14:56:00] Created new alias for this key [14:56:10] !toolsdocs [14:56:10] http://www.mediawiki.org/wiki/Wikimedia_Labs/Tool_Labs/Help [14:56:15] here we go [14:58:01] Change on 12mediawiki a page Wikimedia Labs/Tool Labs/Needed Toolserver features was modified, changed by 93.193.25.210 link https://www.mediawiki.org/w/index.php?diff=670316 edit summary: [-45] /* Labs wide (not only bots / tools), but available for all projects */ [14:58:03] Coren when I use jsub where is output from app? [14:59:11] btw Coren it seems that every mono application will need to request about 1.2gb of virtual memory to run, even "hello world" [14:59:16] petan: By default, ~/jobname.out [14:59:20] aha [14:59:31] petan: Yeah, mono is teh suxx0rs unless you limit its memory. [14:59:51] I really don't know why and when I asked on stack overflow I just got downvoted that I am asking stupid questions [15:00:10] petan: http://www.mono-project.com/Release_Notes_Mono_2.8#Configuration [15:00:13] because virtual memory doesn't matter to anyone [15:00:24] petan: That's because coders nowadays are morons. [15:01:03] People write like Windows. "Who cares, I'm not sharing my ram with anyone and I can always buy more, and if I crash it's just too bad for me." [15:01:34] Coren I need an example of that, it's too vague documentation [15:01:40] it say I can use some options somewhere... [15:01:43] Change on 12mediawiki a page Wikimedia Labs/Tool Labs/Needed Toolserver features was modified, changed by 93.193.25.210 link https://www.mediawiki.org/w/index.php?diff=670322 edit summary: [-251] /* Labs wide (not only bots / tools), but available for all projects */ introduced more subsections for better overview in bottom part [15:02:11] I was playing with MONO_GC_PARAMS in past and it looked to me like mono ignored them [15:02:40] major-heap-size is the operative one, I think. 
[15:02:58] oh, thanks bot for making me notice-> I'm editing as an IP again... m( [15:03:34] petan: I've no love for Java, but at least /it/ does actual memory management. [15:03:57] mhm [15:04:15] I should have use c++ [15:04:29] petan: If I read the doc right, by default mono allocates half a gig of ram to its heap, which is insane. [15:05:03] I can rewrite it to c++ but what about others [15:05:09] I used to code for 4K of precious, precious ram. :-) [15:05:10] you can't force people to rewrite their apps [15:06:16] petan: No, we'll have to throw more VM at them; that's the positive side of virtualization, it's relatively easy to overcommit virtual memory without actually overcommitting /in/ the guest. [15:07:09] btw is it possible to see how much virtual is used on certain node [15:07:17] so that you can see that node is full or not [15:07:57] because qtop display only resident memory in node usage [15:08:20] so now when u open qtop you have no idea how much virtual memory is available for tasks on each node [15:08:48] you could have 10 tasks each using 2mb or ram but requesting 1gb of virtual ram [15:08:53] which would make the node full [15:08:56] Change on 12mediawiki a page Wikimedia Labs/Tool Labs/Needed Toolserver features was modified, changed by Silke WMDE link https://www.mediawiki.org/w/index.php?diff=670325 edit summary: [+16] /* Labs wide (not only bots / tools), but available for all projects */ [15:09:53] petan: qstat -F h_vmem [15:10:20] that is free? [15:10:21] or used [15:10:23] free [15:10:26] ok [15:10:43] it's limited per queue or per node? [15:11:56] Per node, but it shows twice since both nodes are available for both queues atm [15:14:45] no chance running the bot unless I give it 1gb of memory [15:14:58] even if it eats some 60mb or resident [15:15:09] o.o [15:15:33] petan: Right, but the thing is it /might/ actually use all that allocated memory and there is no way to prevent it. 
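petan's worry about qtop can be put into numbers: a node fills up according to the virtual memory its jobs request (`h_vmem`), not according to what they keep resident. The sketch below is only an illustration of that bookkeeping. The 10 GB node capacity is a made-up figure, and the `Job` type and function names are hypothetical; on the real grid the free amount is what `qstat -F h_vmem` reports.

```python
from dataclasses import dataclass

MB = 1024 * 1024
GB = 1024 * MB


@dataclass
class Job:
    name: str
    h_vmem_request: int  # bytes of virtual memory reserved for the job
    resident: int        # bytes the job actually keeps in RAM


def free_h_vmem(node_capacity, jobs):
    """Virtual memory still available for new jobs on a node."""
    return node_capacity - sum(job.h_vmem_request for job in jobs)


def can_schedule(node_capacity, jobs, new_request):
    """True if a job asking for `new_request` bytes still fits on the node."""
    return new_request <= free_h_vmem(node_capacity, jobs)


if __name__ == "__main__":
    # Ten small bots: 2 MB resident each, but 1 GB of h_vmem requested each.
    jobs = [Job(f"bot{i}", 1 * GB, 2 * MB) for i in range(10)]
    capacity = 10 * GB  # assumed node capacity, not a figure from the log
    print("resident total (MB):", sum(j.resident for j in jobs) // MB)
    print("free h_vmem (GB):", free_h_vmem(capacity, jobs) // GB)
    print("room for one more 1 GB job:", can_schedule(capacity, jobs, 1 * GB))
```

With these numbers the node holds about 20 MB of resident data yet has no `h_vmem` left, which is exactly the situation a resident-only view like qtop cannot show.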
[15:15:57] but why [15:16:03] it needs so much virtual memory [15:16:13] that bot is so simple it will never use it [15:16:48] even funcking gnome3 which is in java is running my poor laptop out of ram [15:17:08] all these VM based languages are wasting memory... [15:24:00] petan: I know. :-( [15:24:20] petan: Coding since 1990 has been ignoring frugality. [15:24:37] "Only allocate the resources you will use" is not in fad. [15:31:47] Coren: [15:31:47] local-legobot@tools-login:/data/project/legobot$ chmod +x process_baseball.py [15:31:48] chmod: changing permissions of `process_baseball.py': Operation not permitted [15:31:58] pretty sure i own the file... [15:32:12] -rw-r--r-- 1 legoktm local-legobot 4208 Apr 8 15:07 process_baseball.py [15:32:23] ... and you don't. :-) [15:32:40] legoktm != local-legobot [15:32:57] gah [15:34:02] how do i change ownership then? [15:34:49] Quick workaround: ~marc/bin/take process_baseball.py [15:35:00] Not an official tool for now. [15:35:43] awesome thanks :) [15:37:09] Change on 12mediawiki a page Wikimedia Labs/Tool Labs/Needed Toolserver features was modified, changed by MPelletier (WMF) link https://www.mediawiki.org/w/index.php?diff=670372 edit summary: [+559] /* Bots project */ re [15:41:11] Change on 12mediawiki a page Wikimedia Labs/Tool Labs/Needed Toolserver features was modified, changed by Silke WMDE link https://www.mediawiki.org/w/index.php?diff=670379 edit summary: [+16] /* Bots project */ renamed section [15:45:40] Change on 12mediawiki a page Wikimedia Labs/Tool Labs/Needed Toolserver features was modified, changed by Silke WMDE link https://www.mediawiki.org/w/index.php?diff=670383 edit summary: [-254] deleted distinctions made in headline to merge them [15:46:14] Change on 12mediawiki a page Wikimedia Labs/Tool Labs/Needed Toolserver features was modified, changed by Silke WMDE link https://www.mediawiki.org/w/index.php?diff=670384 edit summary: [-125] /* Storage */ deleted redundancy [15:46:37] Change on 
12mediawiki a page Wikimedia Labs/Tool Labs/Needed Toolserver features was modified, changed by Silke WMDE link https://www.mediawiki.org/w/index.php?diff=670386 edit summary: [+17] /* Filesystem */ renamed section [15:47:53] Change on 12mediawiki a page Wikimedia Labs/Tool Labs/Needed Toolserver features was modified, changed by Silke WMDE link https://www.mediawiki.org/w/index.php?diff=670389 edit summary: [-873] /* Database */ deleted redundancy [15:49:20] Coren: Do you already changed my proposal in [[bug:46460]] or should I try to improve the code? [15:49:48] Change on 12mediawiki a page Wikimedia Labs/Tool Labs/Needed Toolserver features was modified, changed by Silke WMDE link https://www.mediawiki.org/w/index.php?diff=670391 edit summary: [+813] /* Database */ added comments/requests from below [15:51:29] Jan_Luca: It's about #4 on my todo, but if you want a crack at it I'd welcome it. The biggest problem is the lack of input validation; there's nothing preventing one from creating a table named "mytool_gotcha; drop database poor_victim" say. [15:51:45] Change on 12mediawiki a page Wikimedia Labs/Tool Labs/Needed Toolserver features was modified, changed by Silke WMDE link https://www.mediawiki.org/w/index.php?diff=670392 edit summary: [-250] /* Projects */ deleted redundancy [15:52:33] Good ol' bobby drop tables. :-) [15:52:33] Coren: What's about using the "using"-part of "execute" [15:52:36] Change on 12mediawiki a page Wikimedia Labs/Tool Labs/Needed Toolserver features was modified, changed by Silke WMDE link https://www.mediawiki.org/w/index.php?diff=670394 edit summary: [+32] /* Job-system */ marked as done [15:53:19] This should quote the statement, shouldn't it? [15:54:05] Jan_Luca: Possiblty, I'm more adept with postgres than mysql; but sanitizing user input is the safe way to do this. 
:-) [15:54:05] Change on 12mediawiki a page Wikimedia Labs/Tool Labs/Needed Toolserver features was modified, changed by Silke WMDE link https://www.mediawiki.org/w/index.php?diff=670397 edit summary: [-232] /* Misc services */ deleted redundancy [15:54:36] Coren: http://dev.mysql.com/doc/refman/5.1/en/sql-syntax-prepared-statements.html [15:56:19] Jan_Luca: Yeah, that'd be /safe/. [15:56:32] Jan_Luca: I'd still prefer if you could limit the db names to alphanumeric though. [15:57:06] Change on 12mediawiki a page Wikimedia Labs/Tool Labs/Needed Toolserver features was modified, changed by Silke WMDE link https://www.mediawiki.org/w/index.php?diff=670402 edit summary: [+73] [15:57:36] Change on 12mediawiki a page Wikimedia Labs/Tool Labs/Needed Toolserver features was modified, changed by Silke WMDE link https://www.mediawiki.org/w/index.php?diff=670404 edit summary: [+9] /* Licenses: Info */ [15:57:48] Coren: OK, I try to do this :-) [15:59:15] Change on 12mediawiki a page Wikimedia Labs/Tool Labs/Needed Toolserver features was modified, changed by MPelletier (WMF) link https://www.mediawiki.org/w/index.php?diff=670406 edit summary: [+340] /* Projects */ Not applicable, really [15:59:49] Change on 12mediawiki a page Wikimedia Labs/Tool Labs/Needed Toolserver features was modified, changed by Silke WMDE link https://www.mediawiki.org/w/index.php?diff=670407 edit summary: [-24] /* Misc services */ obsolete section deleted [16:02:23] Change on 12mediawiki a page Wikimedia Labs/Tool Labs/Needed Toolserver features was modified, changed by Silke WMDE link https://www.mediawiki.org/w/index.php?diff=670411 edit summary: [+98] /* End-user-support */ [16:02:47] Change on 12mediawiki a page Wikimedia Labs/Tool Labs/Needed Toolserver features was modified, changed by Silke WMDE link https://www.mediawiki.org/w/index.php?diff=670412 edit summary: [-152] /* Permanent blockers for migration of some projects */ moved an item up [16:15:18] Change on 12mediawiki a page Wikimedia 
Labs/Tool Labs/Needed Toolserver features was modified, changed by MPelletier (WMF) link https://www.mediawiki.org/w/index.php?diff=670414 edit summary: [+333] /* Projects */ monitoring is up [16:18:14] Change on 12mediawiki a page Wikimedia Labs/Tool Labs/Needed Toolserver features was modified, changed by MPelletier (WMF) link https://www.mediawiki.org/w/index.php?diff=670415 edit summary: [+558] /* Projects */ Need more info for ine; {{doing}} for another [16:19:57] Change on 12mediawiki a page Wikimedia Labs/Tool Labs/Needed Toolserver features was modified, changed by MPelletier (WMF) link https://www.mediawiki.org/w/index.php?diff=670416 edit summary: [+366] /* Projects */ done, in part, available for the rest [16:35:09] Change on 12mediawiki a page Wikimedia Labs/Tool Labs/Needed Toolserver features was modified, changed by Silke WMDE link https://www.mediawiki.org/w/index.php?diff=670418 edit summary: [+25] rearranged projects, OSM, Render sections for better overview [16:38:03] Change on 12mediawiki a page Wikimedia Labs/Tool Labs/Needed Toolserver features was modified, changed by Silke WMDE link https://www.mediawiki.org/w/index.php?diff=670420 edit summary: [-27] [16:39:00] Change on 12mediawiki a page Wikimedia Labs/Tool Labs/Needed Toolserver features was modified, changed by Silke WMDE link https://www.mediawiki.org/w/index.php?diff=670421 edit summary: [+2] [17:13:31] Change on 12mediawiki a page Wikimedia Labs/Tool Labs/Needed Toolserver features was modified, changed by MPelletier (WMF) link https://www.mediawiki.org/w/index.php?diff=670426 edit summary: [+354] /* Bots and webservices project */ Not done, [17:25:47] andrewbogott_afk: I see you created and deleted a bunch of instances. is instance creation not working properly? [17:26:04] Ryan_Lane: Hey, a Ryan! :-) [17:26:07] yep [17:26:16] I was about to send out an email [17:26:20] I have the new image in place [17:27:26] Yeay! All problems fixed, then? [17:29:26] yes. 
there's one last issue [17:29:36] if the initial puppet run doesn't finish, the instance will never run puppet [17:29:43] but nagios should tell us that [17:29:50] and salt will be running [17:30:03] either way, I'm adding in the puppet cron to ensure it will [17:30:08] * Coren nods. [17:30:24] I'm configuring the new NFS servers with the shelf pairs. [17:31:23] awesome [17:31:51] I've already checked and the split works fine, both servers see both arrays without trouble. [17:32:19] And software raid magic makes it see the actual config right. [17:32:36] so, we don't really need quagga/BGP with this setup [17:32:46] we can just switch the IP on the interface [17:32:57] Right, since they have to be in the same row. [17:33:00] but we'll need to clear the arp table when doing so, if we want it to be relatively quick [17:33:07] * Coren nods. [17:34:02] I'll write shell scripts we can run on either to take over. [17:34:24] So we can test this fast enough. [17:34:47] cool [17:35:08] If we want to be really smart about it we can first try to mount the other server and abort if it works. [17:35:46] I think we want to be smart about it. [17:36:03] Having both run would be a Bad Idea(tm) [17:36:39] if we need to switch, we should ssh into the other system and do a poweroff [17:37:09] we should definitely not have the mount in fstab, so that when systems reboot they don't try to mount [17:37:32] the script should also have a warning prompt: "Did you make sure you powered off the other system?" [17:38:16] "Did you make sure you powered off the other system? If you didn't, this command will lead to data loss." [17:38:50] we won't be able to move to the new storage server [17:38:55] we don't want cross datacenter nfs [17:39:08] you need to wait till labstore3/4 are ready ;) [17:39:59] that's a lot closer, but still not there yet [17:51:07] hi [17:53:14] Ryan_Lane: I was hoping you'd be ready by Thursday, but I still need the outage for the switchover. It's just sad. 
:-) [17:55:47] [bz] (ASSIGNED - created by: Jan Luca, priority: Normal - enhancement) [Bug 46460] Allow tools to create databases - https://bugzilla.wikimedia.org/show_bug.cgi?id=46460 [17:56:02] Coren: I have uploaded a new version [17:56:28] Yeah, I just got pinged for it. I'll take a peek at it later today. Working on NFS (yeay!) at the moment. [17:56:30] Now it tests if the dbname has the syntax _ [17:57:57] Coren: The only problem is that I cannot use the "using"-syntax because this does not work for "create database". And no problem, I only want to help you, you do not have to read it now ;-) [17:58:49] Jan_Luca: That's okay since the user input is sanitized now. [18:01:02] Coren: When you have another coding-problem (PHP, MySQL, ...), I could maybe help you. You should use your time for the more important things :-) [18:06:26] [bz] (NEW - created by: Chris McMahon, priority: High - normal) [Bug 46459] Search should work in beta UI - https://bugzilla.wikimedia.org/show_bug.cgi?id=46459 [18:08:57] Jan_Luca: I'm going to write a todo list of "convenience tools for maintainers" soon, most of which will be webbased. Sounds right up your alley. :-) [18:09:26] [bz] (NEW - created by: Chris McMahon, priority: Unprioritized - major) [Bug 47015] beta cluster: Special:SpecailPages internal error - https://bugzilla.wikimedia.org/show_bug.cgi?id=47015 [19:03:20] students: Wikimedia has been accepted into Google Summer of Code 2013 :) [19:17:25] sumanah: Like there was much doubt. :-) [19:17:35] (Good news anyways, for sure) [19:25:05] Change on 12mediawiki a page Wikimedia Labs/Tool Labs/Needed Toolserver features was modified, changed by MPelletier (WMF) link https://www.mediawiki.org/w/index.php?diff=670614 edit summary: [+506] /* Bug tracker */ res [19:26:22] hm. 
I wonder how many more volumes I have left to shrink [19:26:31] publicdata took all weekend [19:26:36] (it's 8TB) [19:26:51] Change on 12mediawiki a page Wikimedia Labs/Tool Labs/Needed Toolserver features was modified, changed by MPelletier (WMF) link https://www.mediawiki.org/w/index.php?diff=670615 edit summary: [+365] /* Logs/Stats */ re [19:29:02] Change on 12mediawiki a page Wikimedia Labs/Tool Labs/Needed Toolserver features was modified, changed by MPelletier (WMF) link https://www.mediawiki.org/w/index.php?diff=670616 edit summary: [+548] /* Backup */ re [19:30:35] Change on 12mediawiki a page Wikimedia Labs/Tool Labs/Needed Toolserver features was modified, changed by MPelletier (WMF) link https://www.mediawiki.org/w/index.php?diff=670617 edit summary: [+325] /* access to instances: http/sftp */ Done, actually [19:33:38] Change on 12mediawiki a page Wikimedia Labs/Tool Labs/Needed Toolserver features was modified, changed by MPelletier (WMF) link https://www.mediawiki.org/w/index.php?diff=670620 edit summary: [+508] /* Projects */ details [19:41:52] Change on 12mediawiki a page Wikimedia Labs/Tool Labs/Needed Toolserver features was modified, changed by Tim.landscheidt link https://www.mediawiki.org/w/index.php?diff=670623 edit summary: [+596] /* Bug tracker */ [19:57:00] !log deployment-prep deployment-searchidx01 updating local puppet repository 81f5a93..7d036cb [19:57:03] Logged the message, Master [19:57:11] xyzram: regarding your ukraine wiki change ( https://gerrit.wikimedia.org/r/#/c/58133/1 ) [19:57:22] xyzram: beta does not have that wiki :D [19:58:34] xyzram: so you have to try it out in prod I guess :/ [20:00:20] !log deployment-prep deployment-search01 updating local puppet repo c345581..7d036cb [20:00:22] Logged the message, Master [20:27:14] hashar: Ok, thanks. 
PY has just merged it and is deploying to production, so we'll find out shortly if it worked :-) [20:32:08] Krenair: so, I have the echo maintenance script running for instance builds now [20:32:16] but it's not actually triggering a notification [20:33:51] Ryan_Lane, no errors in console? [20:34:05] nope [20:34:07] runs fine [20:34:57] I'm getting Permission denied (publickey). from nova-precise2 [20:36:21] xyzram: while you are around, the searching box does not have enough memory for lucene-search-2. So the Java process ends up being killed off [20:36:28] hm [20:36:31] xyzram: would tweaking the java -Xmx parameter be enough to solve that? [20:36:35] something is up with autofs on the system [20:36:36] fixing [20:36:51] ah [20:36:54] / is full again [20:36:55] damn it [20:37:14] That might explain why it can't insert but can successfully select/delete :) [20:37:22] Assuming that's indeed what the problem is [20:37:30] that's the test wiki [20:37:38] I'm talking about in production ;) [20:37:42] oh right. [20:38:11] [bz] (NEW - created by: Chris McMahon, priority: Highest - blocker) [Bug 47015] beta cluster: Special:SpecialPages internal error - https://bugzilla.wikimedia.org/show_bug.cgi?id=47015 [20:38:16] xyzram: I mean the /etc/init.d/lucene-search-2 script starts java with -Xmx20000m (20G) which is way too much for the instance. Maybe I should drop -Xmx when in beta :-] [20:38:18] I need to get puppet running on this instance [20:39:02] Are there any rows in echo_event where event_type='osm-instance-build-completed' ? [20:39:04] Krenair: ok. nova-precise2 will work again [20:39:10] lemme see [20:40:59] Krenair: yep [20:41:11] okay good, so the event is going in [20:43:20] oh [20:43:21] wait [20:43:23] there's only one [20:43:38] How many times did the script run? [20:43:42] once [20:44:00] for what project? 
[20:44:02] a:3:{s:12:"instanceName";s:12:"testing-echo";s:11:"projectName";s:7:"testing";s:11:"notifyAgent";b:1;} [20:44:13] that's not an instance I created ;) [20:44:26] 20130309000221 [20:44:30] that's the event timestamp [20:44:37] Someone ran the script in March? [20:44:39] hashar: Do you recall the error message? [20:44:50] Krenair: yes. I tried testing it then [20:45:06] I ran it again today [20:45:08] xyzram: too much memory used, that triggers the Linux out of memory killer that just does kill -9 on java :-] [20:45:10] And are there any entries in echo_notification where notification_event is the same as that event_id? [20:45:17] hashar: Yes, definitely reducing that to a more reasonable value should fix that. [20:45:22] well, I truncated that table [20:45:28] how much memory does the instance have? [20:45:37] ... you truncated echo_notification...? [20:45:41] oh [20:45:42] sorry [20:45:46] no [20:45:49] xyzram: 4GB. [20:45:56] I truncated openstack_notification_event [20:46:01] one sec [20:46:05] xyzram: would running java without the -Xmx setting let it handle the mem usage for us [20:46:21] (I had read what you wrote incorrectly ;) ) [20:46:47] xyzram: or I can set up an instance with more memory + do some changes in puppet to let us adjust the memory usage easily. [20:47:08] Krenair: for that old notification, yes [20:47:09] xyzram: I will probably need to use a /etc/default/lucene-search-2 file that would get a new env variable :-D [20:47:13] xyzram: easy stuff [20:47:29] Well there should be entries in echo_event for the each time the script has run [20:47:44] hashar: Let's just try it with 2GB and see what happens? [20:47:50] sure [20:47:50] should be an entry in echo_event for each time the script has run* [20:47:52] I'm curious too. [20:48:07] xyzram: I will hack the script and let the process run overnight (day for you hehe) [20:48:08] Java should run GC when it cannot allocate more memory. 
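hashar's plan is to stop hard-coding `-Xmx20000m` and let a defaults file supply the heap size through an environment variable. The real init script is shell and its eventual variable name is not given in the log, so the following Python sketch is only an illustration of the logic: `LUCENE_HEAP_MB` and the 50% default are hypothetical choices.

```python
import os


def heap_flag(instance_mem_mb, env=None, default_fraction=0.5):
    """Build a java -Xmx flag that can never exceed the instance's memory.

    `LUCENE_HEAP_MB` is a hypothetical override, standing in for whatever
    variable an /etc/default/lucene-search-2 file would export.
    """
    env = os.environ if env is None else env
    override = env.get("LUCENE_HEAP_MB")
    if override is not None:
        heap_mb = int(override)
    else:
        heap_mb = int(instance_mem_mb * default_fraction)
    # Clamp: asking for 20000 MB on a 4 GB instance is what got java killed.
    heap_mb = min(heap_mb, instance_mem_mb)
    return f"-Xmx{heap_mb}m"


if __name__ == "__main__":
    print(heap_flag(4096, env={}))                           # default fraction
    print(heap_flag(4096, env={"LUCENE_HEAP_MB": "2000"}))   # explicit 2000 MB
    print(heap_flag(4096, env={"LUCENE_HEAP_MB": "20000"}))  # clamped to RAM
```

A clamp at the instance's total memory is still generous, since the JVM and the rest of the system need room too; the 2G value tried in the log is well below it.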
[20:49:37] Ryan_Lane, I'm going to try this all again on the test instance [20:50:09] Krenair: cool [20:50:24] !log deployment-prep Changing lucene-search-2 memory usage from 20G to 2G by manually editing /etc/init.d/lucene-search-2 (see {{bug|46459}} ) [20:50:26] Logged the message, Master [20:50:40] sigh, someone has left it broken again [20:51:12] !log deployment-prep deployment-search01 : /usr/bin/java -Xmx2000m :-] [20:51:14] Logged the message, Master [20:51:39] PHP Parse error: syntax error, unexpected '=', expecting ')' in /srv/org/wikimedia/controller/wikis/w/resources/Resources.php on line 165 [20:52:08] [bz] (NEW - created by: Chris McMahon, priority: High - normal) [Bug 46459] lucene-search-2 uses too much memory on labs - https://bugzilla.wikimedia.org/show_bug.cgi?id=46459 [20:52:40] xyzram: I have assigned the bug to me. Will take care of it over the week and report back to you :-] Thank you!! [20:52:51] ok, trying this on w3 then [20:53:02] (/srv/org/wikimedia/controller/wikis/w3 instead of /srv/org/wikimedia/controller/wikis/w) [20:53:25] Krenair: w is broken? [20:53:27] hashar: Ok, no problem. Meanwhile I'll review the search code to see what it wants to keep in memory. [20:53:35] Ryan_Lane, the new instance build is so much better! I even get emailed now. [20:53:51] I'm going to strip out the email part ;) [20:53:55] krenair@nova-precise2:/srv/org/wikimedia/controller/wikis/w3/extensions/OpenStackManager (andrew-is-testing-feel-free-to-reset-this)$ [20:53:56] that's likely to break still [20:54:12] the echo notification will replace that [20:54:13] I guess I'll just get rid of those uncommited changes then, is that okay andrewbogott? [20:54:36] Krenair, sure [20:54:47] glad the new instance build process is liked, though :) [20:54:58] Ryan_Lane: 460 PB/s network, seriously? 
http://ganglia.wikimedia.org/latest/?c=Virtualization%20cluster%20pmtpa&m=cpu_report&r=hour&s=by%20name&hc=4&mc=2 [20:55:02] ;) [20:55:19] there's some data bug that occurs when instances are deleted [20:55:31] and ganglia doesn't discard obviously junk data [20:58:55] Ryan_Lane, I'm stuck trying to create a test instance on https://wikitech-test.wmflabs.org/w3/index.php?title=Special:NovaInstance&action=create&project=testing&region=pmtpa [20:59:21] does something not happen? [20:59:40] "Waiting for wikitech-test.wmflabs.org..." [20:59:41] oh. looks like a timeout [21:00:08] now there's no instance types or image types [21:00:24] I restarted nova-api [21:00:37] I wonder what's timing out [21:00:57] api is timing out to something [21:01:06] ah [21:01:07] rabbit [21:03:44] Krenair: it's working now [21:10:53] Ryan_Lane, do you see a notification on those wikis about deletion of instance fgh? [21:12:40] I see one on the test wiki [21:12:55] I believe I've been seeing them in production as well, but they also may have stopped working [21:13:11] since I see in the recent changes andrew creating/deleting instances [21:13:49] in fact, I haven't seen a notification in 23 days [21:14:16] though my settings only show web notifications for mentions and reverts [21:14:31] and none of the OSM custom notifications show up in the preferences list [21:15:10] ah: Alex Monk built instance 'fgh' in project ... [21:15:16] so, it's working in the test instance [21:15:28] for others, at least. [21:15:30] Ugh... [21:15:36] I bet this is something caused by an Echo change [21:15:40] What version is running on wikitech?
[21:16:02] https://gerrit.wikimedia.org/r/gitweb?p=mediawiki/extensions/Echo.git;h=945c1cb5a5bd66816467dd98af5eeb9c2dbb1fed [21:16:14] on mediawiki: https://www.mediawiki.org/wiki/MediaWiki_1.21 https://gerrit.wikimedia.org/r/gitweb?p=mediawiki/core.git;h=511d42958669eed33c92e292d7bb50a89626028f [21:16:27] 1.21wmf12 [21:17:04] I try to stay up to date with production [21:18:04] last time I tested this was with a an old version of echo with the old prefs system [21:18:39] ah [21:18:54] /w is up to date with the new system [21:19:21] I should upgrade /w2 and /w3 [21:19:52] Looks like the latest is incompatible with the version of mediawiki core in w3... and w3 isn't a git repo. that's probably my fault [21:20:07] heh [21:20:39] Krenair: It's ALL your fault! [21:20:59] Who's uncommitted changes are these in w? [21:21:10] probably mine [21:21:13] you can wipe them out [21:21:38] Krenair: well, it's technically my fault for not updating the echo code in OSM before I updated MW ;) [21:22:37] okay, core, echo, and osm all up to date on w [21:25:18] Coren: http://bots.wmflabs.org/~addbot/status ;p (the permissions on the folder had changed) works again now :) [21:27:05] addshore: Is the pretties. :-) [21:37:15] Coren: for initial labsdb access, let's limit it to the tools project [21:37:20] that'll make it easier to handle [21:37:26] we can add other project access in later [21:38:26] Ryan_Lane: Sounds like a plan; though I know there are people in analytics that are going to be disapointed. :-) [21:38:41] we can add that next [21:39:04] I'm more worried about getting tool access up than analytics [21:39:14] since we have a timeframe to meet for that [21:39:45] Also a good point. [21:40:05] and I'd not like to rush a solution [21:41:58] ok. food. [21:56:37] Ryan_Lane: https://gerrit.wikimedia.org/r/58226 [21:57:36] I had to modify existing i18n message keys. Did it for en but I think it needs to be done for all other occurrences as well? or will translatewiki handle that? 
[21:59:10] I believe TWN picks that up [21:59:15] Also turns out git-review wasn't working because the 'gerrit' branch had been set up for ssh://reedy@... So it must've been trying his account instead of mine lol [21:59:42] I just removed the reedy@ since labs usernames are the same as in gerrit [21:59:46] Yup: "If the extension is supported by translatewiki, please only change the English source message and/or key. If needed, translatewiki.net staff will take care of updating the translations, marking them as outdated, cleaning up the file or renaming keys where possible. This also applies when you're only changing things like HTML tags that you could change in other languages without... [21:59:47] ...speaking those languages. Most of these actions will take place in translatewiki.net and will reach Git with about one day of delay." [22:00:07] Also I had to use sudo to change my personal git config [22:00:10] error: opening /home/krenair/.gitconfig: Permission denied [22:00:15] okay, thanks RoanKattouw [22:09:43] Krenair: permission denied? that gluster volume must need help [22:09:59] oh [22:10:04] it's owned by root [22:10:18] fixed [22:10:21] I probably created it with sudo [22:10:36] * Ryan_Lane nods [22:11:05] Krenair: thanks so much for the help! [22:11:25] you're welcome. I probably should've kept up with the changes going on within echo [22:22:52] Krenair: :D [22:23:02] I got 4 emails and 2 notifications in production :) [22:23:12] hm. [22:23:21] about one event? [22:23:26] yep [22:23:37] ... what [22:27:12] lemme try that again [22:32:06] yep. same thing [22:32:09] that's weird [22:32:54] Well it didn't happen in testing... [22:33:19] yeah [22:36:05] I do see two events per instance creation in echo_event [22:38:43] ah [22:38:43] I see [22:38:49] two of the emails are to novaadmin [22:39:03] oh. 
that's gonna be nasty :D [22:39:13] I need to set that user's email to something that isn't me [22:39:27] but that still doesn't explain why two notifications are being sent [22:40:17] I'm also getting those e-mails [22:40:26] RoanKattouw: yes. that's expected [22:40:27] Probably because I'm a member of the project that you're creating instances in [22:40:36] yes. because you are an admin [22:40:39] Aha OK [22:40:45] I was about to say, what happens when someone touches bastion [22:40:47] and someone created an instance in your project [22:40:50] But if it's only admins, that's fine [22:41:00] it'll also send you one if someone deletes an instance [22:41:26] you can disable those notifications in the preferences, btw [22:42:24] the only thing I can think is that for some reason two entries are being added to the database on instance creation