[02:12:44] RECOVERY Free ram is now: OK on deployment-sql deployment-sql output: OK: 20% free memory [02:16:45] hmmm [02:19:44] ? [02:33:06] Ryan_Lane: Hi [02:33:14] howdy [02:33:31] Ryan_Lane: I understood from a fellow cvn-staffer that there has been some talks about moving CVN services to Labs. [02:33:37] yeah [02:33:51] they'd like a shared account, but I'm against shared accounts [02:34:05] I've been sent a convo [02:34:06] another solution is now going to be a web console for the bots [02:34:19] which I prefer over shared accounts [02:34:30] There's some missing intel in that convo, misunderstandings and some wrong assumptions [02:34:35] oh? [02:35:28] So we've got 20 bots. Written in C#/Visual studio (not our choice, was originally the case and was maintained through the years). Compiled with xbuild on Linux and running with 'mono'. [02:35:39] brb [02:35:44] PROBLEM Free ram is now: WARNING on deployment-sql deployment-sql output: Warning: 19% free memory [02:35:51] 3 developers that would need shell access (including me), and about a dozen staffers that only need access to the web control panel [02:36:07] so this could be a project-group like testswarm, not a shared account. [02:36:29] we have a php control panel already which we could use (requires apache to be running on the same server as the bots) [02:36:55] home-brew script, currently hosted on toolserver as well. [02:37:58] * Firebolt lurk [02:38:04] Ryan is brb [02:40:44] RECOVERY Free ram is now: OK on deployment-sql deployment-sql output: OK: 24% free memory [02:41:02] on phone [02:41:11] ok, i'll type a little more :) [02:41:21] although, not sure whether it should be a project-group like 'testswarm' is. Since this is probably something for Tool Labs, not Devwiki Labs, right ? I understood ClueNet is already on Labs, I didn't' know Tool Labs was already on ? [02:55:21] I can't login at beta.wmflabs, I requested a password reminder but I'm not getting anything. 
[02:55:26] (checked spambox) [02:55:51] back [02:55:56] lemme read backscroll [02:56:49] Krinkle: mail isn't working in labs yet [02:56:56] we need to set up a new relay for it [02:57:06] in case someone decides to spam through labs [02:57:21] we wouldn't want labs getting all of our sites blacklisted [02:57:25] ok [02:57:37] my views on how labs will work are shifting over time [02:58:35] I think if people working on this are technically proficient enough to forward ssh agents, then let's set this up as a project, and run it like all the others [02:59:03] it could run in the bots project, too [02:59:11] majority (if not all) toolserver accounts know how to use ssh and sftp. [02:59:24] with agent forwarding? [02:59:28] no, not that. [02:59:34] right [02:59:43] for this, people would still connect through bastion [02:59:44] WHy is that required ? I don't have that setup either [03:00:04] how are you connecting to instances behind bastion, then? [03:00:29] @search proxy [03:00:29] No results found! :| [03:00:34] @search proxy- [03:00:34] No results found! :| [03:00:38] @search -proxy [03:00:39] No results found! :| [03:00:40] heh [03:00:44] until a few weeks ago I could used ssh -something [03:00:50] is that no longer supported (from bastion) [03:00:59] using a password? [03:01:04] I guess [03:01:11] yeah, that was a problem, passwords were never supposed to work [03:01:18] !access [03:01:18] https://labsconsole.wikimedia.org/wiki/Access#Accessing_public_and_private_instances [03:01:30] https://labsconsole.wikimedia.org/wiki/Access#Using_agent_forwarding [03:01:34] or... [03:01:43] https://labsconsole.wikimedia.org/wiki/Access#Using_ProxyCommand_ssh_option [03:01:59] the latter works without forwarding, but requires people to edit their config files [03:02:26] we don't have enough public IP addresses right now, so people can't directly connect to all instances [03:03:22] ok. cvn also has a php API though, so we do need a public subdomain [03:03:34] used in gadgets. 
to get data from bots [03:03:46] yeah, that's fine [03:03:55] okay :) [03:04:15] if we put this in the bots project, it can use the apache server there too [03:04:19] and it has a public IP [03:04:24] bots.wmflabs.org [03:04:32] hm.. shared access ? [03:04:41] not totally [03:04:56] each set of bots can get its own instance [03:05:14] ideally in the future we'll have this project work with no root access [03:05:33] and have another one, for development of the bots that still allows root [03:07:15] ok [03:07:38] Ryan_Lane: I see Ive been granted access on deployment-web, trying to run changePassword for my account. [03:07:48] krinkle@deployment-web:/var/www/commons_wiki/w/maintenance$ php changePassword.php --user [03:07:52] DB connection error: Unknown database 'commons_wiki' (deployment-sql) [03:07:56] heh [03:08:06] I have no idea how they set stuff up there [03:08:11] k [03:10:21] I'd really like some kind of API driven service for bots [03:10:52] where we can control where bots run, and which version of the bot software they run [03:11:28] so, we could do something like "Manage bots", and have an interface where you can add a new bot, remove a bot, move a bot to another instance, change the version that is running, restart the bot, etc [03:11:52] then we could have a service that sits on all of the instances, that can do these actions [03:12:46] manage it via a queue. bot-manager service on instanceA can stop a bot, add a message to the queue that it finished, and instanceB can start it. <- move operation [03:13:16] it could be managed per-project, and via a botadmin role [03:13:34] then it could just be another interface on labsconsole [03:14:01] I've got a `$ ps -f -u ` based hack on the Toolserver as control panel [03:14:31] but it's with a 1 minute delay because toolserver has separate webserver/botserver. 
so I write command to SQL and cronjob pop()'s the rows [03:14:50] yeah, using mysql as a queue works too [03:15:09] I'm thinking of adding a centralized queue for all projects and instances [03:15:25] so that everyone has access to a queuing servie [03:15:28] *service [03:15:30] http://i.imgur.com/NSwF4.png [03:16:05] it's a one-weekend hack from march-2010 [03:16:09] neat [03:16:17] yeah, that's a really good start [03:16:28] we have a *lot* more control in labs [03:16:33] so, we can make a pretty fancy service [03:16:35] log is the std_out file [03:17:00] is this all php? [03:17:03] yep [03:17:06] * Ryan_Lane nods [03:17:08] php and an .ini file [03:17:25] I do basically everything in python. heh [03:17:40] it avoids needing to install a web server [03:17:55] python can run a wsgi server via eventlet [03:17:56] for example: http://i.imgur.com/7fIBf.png [03:18:15] or, just launch as a daemon, and wait on a queue [03:18:31] yeah, watching a file [03:18:38] does that write messages to irc? [03:18:47] the control panel ? no [03:18:55] the irc bots started from there do [03:19:08] what's the last image? [03:19:12] ini file? [03:19:17] yes, a fragment of it [03:19:19] cool [03:19:27] yeah, this is exactly what I wanted [03:19:29] well, close anyway [03:19:38] but more distributed [03:19:40] statusdetectpath is used to get the PID and uptime from `ps` [03:19:48] I'd like to have a bunch of "bots" instances [03:20:00] a 1-minute cronjob on the botserver dumps output from `ps` in a txt file [03:20:02] where bots can run on any of the instances, and can be moved between them [03:20:04] so that php can read it :( [03:20:06] heh [03:20:18] that's rough [03:20:21] Ah, wrong. 
No it's not `ps` [03:20:22] it [03:20:29] it's `stat` [03:20:32] `qstat` [03:20:45] a daemon with a queue would handle this quicker [03:20:47] Bots are started through cronsub/SGE so that they're started when they go off for whatever reason [03:21:01] a daemon could also do health checks [03:21:05] so they are started on one of the toolserver servers that have resources available at time of start [03:21:14] ah. I see [03:21:35] toolserver's SGE/cronsub/qsub distributes them automatically at "random" [03:21:40] ah [03:21:53] it's something the toolserver admins wrote to do so? [03:22:34] All I know is that when I want to do a very slow query (e.g. something like SpecialWantedFiles's query on en-wiki), I write it in a .sql file and queue with qsub [03:22:34] I'm wondering how it determines which one is best to run stuff on [03:22:43] then it's executed within 10-20 seconds on one of the servers [03:22:47] * Ryan_Lane nods [03:22:49] interesting [03:23:46] and from crontab I have for example: [03:23:52] hm. it would really be nice to have projects for each bot group, where there's a centralized project for central services [03:23:53] 0,5,10,15,20,25,30,35,40,45,50,55 * * * * cronsub -l -s clogger $HOME/bots/clogger-start.sh [03:24:22] cronsub will then start it, but not if a prices by the name 'clogger' is running already [03:24:27] proces* [03:24:40] (yeah the ugly 0,5,10 is thanks to Solaris) [03:24:45] * Ryan_Lane nods [03:25:00] that's an interesting, yet kind of clunky way to launch services [03:25:44] well, we drive with the tools we've been given - add the stress of vandal fighting and limited time/resources devoted to it from both ts-root and ts-users like me, and you get this. [03:25:51] yep [03:25:59] hey, it works. 
I'm not knocking it [03:26:04] :) [03:26:06] we don't have anything like this at all yet [03:26:53] oh, and not to forget the bi-monthly updates when everything breaks and we have to go search through maintenance logs to find out what was changed and have to fix everything :P [03:27:00] :D [03:27:06] well, that's less of a problem here [03:27:20] like this week mono was updated to 2.x unannounced. all cvnbots broke [03:27:24] heh [03:27:28] I heard about that [03:27:35] so that's why you spotted mono.old in the .ini file screenshot [03:27:39] we use LTS ubuntu releases [03:27:52] so, you get what you get for about 2-4 years [03:28:20] and you can upgrade on your own time [03:28:35] this is also a reason I'd like to have multiple bot instances [03:28:37] Ryan_Lane: So before I close up at 3am here. I was wondering to what extent WMF is in contact with WMDE. I mean it would seem like a waste if Toolserver would be, say, doing hardware upgrades, right ? [03:29:03] well, we have a ways to go to replace toolserver [03:29:12] I'm not even actively trying to do so [03:29:23] I'm just trying to set up cooler things, so that people naturally migrate [03:29:23] right [03:29:45] part of this is the community setting up what they want, as well [03:30:23] biggest things missing right now are probably: live wmf-db replicas, user databases (hm.. already exist), and automatic backups of user data. [03:30:27] I'd really recommend the toolserver people, at some point, starting to move things to labs themselves [03:30:33] meaning the admins [03:30:54] yeah, user databases and backups are incoming [03:30:59] kind of [03:31:11] the database replication is planned, but is likely further out [03:31:26] yeah, although I'm not sure if admins can do that. Right now ts stuff is good but not organized well. Everybody has their own account and can do stuff. It's hard to migrate. 
[03:31:39] yeah [03:31:54] labs isn't really far enough along yet, anyway [03:32:01] still too many SPOF in the architecture, IMO [03:32:07] btw, you wanted to know resource usage of cvn-bots ? How can I measure that ? I'm in shell now [03:32:19] I dunno [03:32:29] I wanted to know how much storage space it would need [03:32:40] the rest I don't care as much about [03:32:47] storage is harder to expand than memory/CPU [03:32:53] ok. [03:33:09] hm.. looking for a recursive folder size measure command in unix.. [03:33:19] we have a *lot* of compute nodes slated for this [03:33:24] 16 of the cisco servers [03:34:15] I do remember that 2-3 years ago CVN was unofficially not allowed on TS due to bots being too resource intensive. So they were moved to Alex Zariv (WMF)'s private server. Bots have improved though and most are on Toolserver again. [03:34:33] * Ryan_Lane nods [03:34:45] we should probably have a team that tries to improve bot performance [03:35:49] thankfully, if we architect things correctly, we can isolate bots that perform terribly [03:36:11] by giving them their own instance [03:36:32] Toolserver also offers an svn repo per group-project. CVN is using that too. Any recommendations for migrating that ? Into gerrit ? Or a local git/svn server on the instance ? [03:36:40] gerrit, yeah [03:36:59] we can probably do a repo per openstack project [03:37:34] gerrit has support for LDAP groups, so we can define gerrit permissions on the repo based on the openstack project (which is an LDAP group) [03:38:08] I need to add gerrit control to OpenStackManager [03:38:18] so I can automatically create a repo on project creation [03:38:25] One good argument for labs vs toolserver is their stubborn love of Solaris. [03:38:34] fuck Solaris [03:38:38] :) [03:38:47] we use what the production cluster uses [03:38:53] which, right now, is Ubuntu Lucid [03:38:54] Well, Solaris has cool things. But Oracle is Teh Evulz. 
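Editor's note on the "recursive folder size" question above: on most Unix systems the usual shell answer is `du -sh <directory>`. The same measurement can be sketched in Python as below. This is only an illustration, not something used in the conversation; the function names `tree_size_bytes` and `human` are made up for this note, and the result counts file bytes rather than disk blocks, so it can differ slightly from what `du` reports.

```python
import os


def tree_size_bytes(root):
    """Return the total size in bytes of all regular files below root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks so a link into a large tree is not counted twice.
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def human(n):
    """Format a byte count with a binary unit suffix, e.g. 1024 -> '1K'."""
    for unit in ("B", "K", "M", "G", "T"):
        if n < 1024 or unit == "T":
            return f"{n:.0f}{unit}"
        n /= 1024
```

Calling `human(tree_size_bytes(path))` on a project directory gives a single figure that is easy to paste into a channel when someone asks how much storage a set of bots needs.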
[03:39:12] we'll be adding Precise in when it's released [03:39:19] likely shortly before it's released [03:39:40] looks like we've got 650M on /home/projects/cvn at Toolserver [03:39:49] oh. that's small [03:39:50] My production servers are still on lucid. Rock solid. [03:39:53] not including mysql [03:39:58] ah [03:40:03] but that's fairly small [03:40:10] yeah. that's not bad at all [03:40:14] we can handle that right now [03:40:37] we really need hardware SQL servers [03:40:53] maybe next week I'll concentrate on adding virt1 as a compute node, and adding the SQL servers [03:41:32] hyperon is working on a database as a service application [03:42:05] resources used: http://pastie.org/3192848 [03:42:08] until then, we'll have to come up with some simpler ways of managing the databases [03:42:17] (that's half actually, since we run bots on two servers) [03:42:23] * Ryan_Lane nods [03:42:35] that's not bad at all [03:42:43] so that times 2 [03:42:54] yeah [03:43:10] not sure if this means much to you as it doesn't say the server specs (or does it) [03:43:19] So, Ryan, any particular reason why the labs aren't v6? [03:43:23] about 400MB memory, total [03:43:34] Coren: because we haven't enabled it yet [03:43:44] Tsk. :-) [03:43:48] also, because I didn't start off with IPv6, now I need to figure out how to enable it [03:44:09] apparently, they expect you to create ipv4 network and ipv6 network together [03:44:10] If you need help there, I'm all yours. v6 is my battle horse. 
:-) [03:44:33] I already added the v4 network, so now I need to modify the database directly, likely [03:44:58] I really need to reach out to my openstack connections for more direct help :) [03:45:24] Krinkle: I'm not terribly worried about memory or CPU usage [03:45:31] ok [03:45:33] Krinkle: since your instances won't directly affect others [03:45:43] you are limited to whatever instance type is used [03:45:51] yay for virtualization [03:45:54] My cloak unhelpfully hides the fact that even my IRC is v6. :-) [03:45:56] so what's the limit ? [03:45:58] :P [03:46:01] Coren: heh [03:46:11] Krinkle: well, some instance types can be 32GB of ram [03:46:19] not that I ever see that being used [03:46:28] how much does the physical server have ? [03:46:32] 48 [03:46:37] the ciscos have more, I think [03:46:42] maybe 64 [03:46:53] *cloink* [03:46:58] that's.. incredible [03:47:04] did I mention we are going to have 16 of those? :) [03:47:08] hehe [03:47:15] plus a few more for SQL [03:47:35] cloink? [03:48:33] the throat sound, not sure how to write that in English :P [03:48:49] like swallowing in cartoons when something amazing is pointed out [03:49:02] am I making any sense ? [03:49:06] yes [03:49:10] yeah, we're going to have a lot of resources at our disposal. heh [03:49:40] now we just need to write the tools to manage them properly ;) [03:50:03] * Firebolt wishes he could use a server like that for minecraft [03:54:20] k, thanks Ryan_Lane , gonna go now. we'll talk later upcoming week. [03:54:31] ok. 
sounds good [03:54:32] * Ryan_Lane waves [04:17:40] * jeremyb pokes the wind [05:05:44] PROBLEM Free ram is now: WARNING on deployment-sql deployment-sql output: Warning: 19% free memory [05:38:54] PROBLEM Free ram is now: WARNING on bots-2 bots-2 output: Warning: 19% free memory [05:45:44] RECOVERY Free ram is now: OK on deployment-sql deployment-sql output: OK: 20% free memory [06:38:44] PROBLEM Free ram is now: WARNING on deployment-sql deployment-sql output: Warning: 19% free memory [08:33:06] Anyone here able (and knows how) to run a maintenance script on beta-labs ? [08:33:14] umm [08:33:18] don't think so [08:33:19] email hasn't been set up and I apparently can't login [08:33:23] need a pass reset on my account [08:33:41] yeah. email is not yet working in labs [08:33:47] we need a relay specific for labs [08:33:55] so we don't get everything blocked if someone spams from labs [08:34:06] you told me yesterday ;-) [08:34:12] (yesterday for me anyway) [08:34:43] changePassword.php should work though [08:34:49] can't get it to work [08:35:47] orly ? Just got in. [08:35:51] nvm :) [08:39:33] I think it's wrapped with het deploy stuff [08:39:40] there's likely a wrapper script you need to run [08:40:48] yep, but couldn't find anything like that [08:41:04] the drive structure isn't documented afaik [08:41:08] i.e. what is stored where [08:41:54] web root looks non-standard, I was looking for /srv/ but that didn't exist. Did find separate mediawiki checkouts in a folder somewhere, seemed to match the active wikis [08:42:21] that's either old and unused or labs isn't using SiteConf / shared source yet [08:51:48] hm [08:51:55] well, it shouldn't be in /srv anyway [08:51:58] should be under /usr [08:52:34] live and prototype use /srv/ ? [08:54:56] production does not [08:55:01] prototype did [08:55:09] production uses a directory under /usr [08:55:11] ok [08:55:22] so where would mw checkout and maintenance be etc. ? 
[08:56:13] under that directory [08:56:41] k, will look at it tonight, no ssh key where I am now [08:56:45] thx [08:56:47] it's here: /usr/local/apache/common [08:57:38] Ah, I've seen that on wmf logs etc. yeah, I remember :) [09:37:41] Ryan_Lane: hi [09:37:47] howdy [09:38:10] is it possible to run a non-console application in a lab [09:38:58] what do you mean? [09:39:10] you mean a graphical one? [09:39:14] yep [09:39:19] like an IDE [09:39:28] why would you run an IDE on labs? [09:39:31] or a Java GUI config [09:39:35] rather than on your local system? [09:39:40] to debug [09:40:37] I was thinking of writing a config util for search - but it would be gui based [09:40:52] and there is little point in doing that if we can't use it [09:41:38] we wouldn't use a GUI one anyway [09:41:45] unless it was completely web based [09:41:55] in which case you don't need a GUI [09:41:58] I thought as much [09:42:32] it's possible to have a GUI if you can forward X11 somehow [09:42:47] that's probably hard through a bastion host, though it may work with proxycommand [09:42:48] I saw some docs on that [09:43:01] but it's going to be incredibly laggy [09:43:14] GUI apps across the WAN usually suck [09:43:41] ok [09:44:12] sell so far I don't need it [09:44:19] so so far I don't need it [09:44:44] * Ryan_Lane nods [09:45:18] it should be possible to remotely connect to a debugger using a local IDE [09:45:57] there are about 4 different ways of working with Eclipse on a remote system (and they all suck) [09:46:14] 4 that I know of [09:46:30] for connecting to a debugger? 
[09:48:04] more than just a debugger [09:48:21] also access to the file system and start stop stuff [09:53:26] there is a remote file system explorer [10:58:52] RECOVERY Free ram is now: OK on bots-2 bots-2 output: OK: 20% free memory [11:06:57] ACKNOWLEDGEMENT Disk Space is now: CRITICAL on puppet-lucid puppet-lucid output: DISK CRITICAL - free space: / 34 MB (2% inode=35%): [11:11:52] PROBLEM Free ram is now: WARNING on bots-2 bots-2 output: Warning: 19% free memory [11:46:52] RECOVERY Free ram is now: OK on bots-2 bots-2 output: OK: 20% free memory [12:15:57] ACKNOWLEDGEMENT Free ram is now: WARNING on bots-2 bots-2 output: Warning: 19% free memory [12:17:55] Oh, this is going to be funny and annoying [12:18:20] Bots-2 is hovering at ~20% free memory [12:51:46] !log bots installed libhtml-entities-numbered-perl [12:51:49] Logged the message, Master [12:51:54] !log bots installed liburi-perl [12:51:55] Logged the message, Master [12:52:34] !log bots installed libxml-perl [12:52:35] Logged the message, Master [12:53:50] !log bots installed libcarp-always-perl [12:53:51] Logged the message, Master [13:04:14] Johnduhart, Can you please help open the sa-wikisource beta page for the community? [13:10:19] I am getting a headache on bots-3 [13:11:53] Coren, how heavy is your bot on bots-3? [13:22:09] Coren, forget it, swapping bots did not solve my problem [13:23:32] Raama: you need to open a wiki? [13:23:42] beta wiki lab [13:23:45] ok [13:23:55] for sa.wikisource.org [13:24:52] how long will it take? [13:33:02] done [13:33:30] Raama: ^ [13:34:05] Beetstra, Coren I created a way for users of lab to configure nagios [13:34:19] unfortunatelly ldap doesn't work yet [13:34:28] so if you wanted I can create an account for you [13:39:06] ya [13:39:13] sure petan... 
[13:39:59] Anyways thanks for the help :) [13:42:14] :o [13:43:03] Can someone explain: [13:43:17] If I log in on en.wikipedia, and then switch to meta, then I am not logged in on meta [13:43:30] if I log in on meta.wikimedia, and then switch to en.wikipedia .. I am logged in .. [13:43:43] (I don't see the 'log in on all projects check-box anymore) [13:45:28] it should have log you in on all projects [13:45:43] Beetstra: are you talking about prod or beta? [13:45:54] the real thing [13:46:09] I think it should work for all sites [13:46:16] Yes, that is also what I expected [13:46:37] But I logged my bot in on en.wikipedia, and it did not save on meta .. and I did not have a single clue why [13:46:53] cookies? [13:47:13] no, both fully enabled [13:47:22] And I tried it from my laptop .. and I have the same effect [13:47:23] no idea [13:47:48] log in as bot on en.wikipedia, switch to meta - logged out. Log in on meta as Beetstra, switch to en, logged in as Beetstra, not as COIBot [13:47:49] it doesn't work really well [13:47:52] no, indeed [13:56:56] Oh now it gets even funnier .. [13:57:02] 1) log in on en.wikipedia.org as COIBot [13:57:08] 2) switch to meta - you are logged out [13:57:12] 3) log in as Beetstra [13:57:19] 4) switch to en.wikipedia, you are Beetstra ... [13:57:26] 5) switch back to meta, log out [13:57:35] 6) switch to en.wikipedia, and you are COIBot .. [13:58:40] * Beetstra calls an exorcist [14:31:27] Beetstra: Gentle on ram, reasonable on bandwidth, and will occasionally bring the CPU to its knees crying for momma. [14:33:43] OK .. 
one of my bots is a beast what that regards, but it could run next to yours [14:34:50] But the problem was not the perl install, it is something with my bots that I don't see [14:40:12] Beetstra: should I enable memory check on bots-2 [14:40:18] I disabled to avoid spam [14:40:34] I think it is hovering close to 80% all the time [14:40:40] I know [14:40:43] that's why I did it [14:40:45] !nagios [14:40:45] http://nagios.wmflabs.org/nagios3 [14:40:53] there is current usage [14:41:01] it doesn't report to irc now [14:41:25] You could for bots-2 put the limit at 95 .. if it gets close to that I /really/ need to do something about it [14:41:34] 95 is critical [14:41:45] :D [14:41:47] now [14:41:58] and I can't change it only for one server [14:42:01] or maybe I can [14:42:04] dunno [14:42:12] yes, I can [14:42:53] I think that the memory footprint for LiWa3 is going to get lower [14:43:03] I tried less parser modules yesterday, and it start lagging [14:53:48] Now figure out what I am doing wrong, or what is going wrong in this funny bot [14:54:10] I don't believe it is my mistake [14:56:38] Beetstra: Global warming? Also your fault. :-) [14:56:50] heh [14:57:09] Well, actually that is what I am going to work on to prevent it .. in a way [14:57:13] But that is a different story [14:57:44] I have a mainbot, with a handful of modules running under it .. I log one module in, the cookie is set, and that should be taken over by the others .. [14:57:58] But one module says over and over that it is not logged in .. [14:58:37] As if it is loading a cached page from a time it was not logged in [14:58:58] How are you storing your cookies? [15:00:40] is a file in the dir of the bot [15:01:04] but, the login happens through the mainbot, and the modules should pick up the cookie .. one does, another does not [15:01:44] You're using an HTTP::Cookie jar? [15:02:44] yep [15:03:25] Did you create it with autosave=>1 ? 
[15:03:35] yes [15:03:42] I even tried deleting it and re-creating it [15:04:54] Hm. I've had race conditions about /when/ cookies are saved when doing that if you don't have explicit saves. Perhaps that's what's happening? One of your subprocesses tries to eat the jar before the cookie is (correctly) serialized and on-disk? [15:05:47] And to make it even more crazy, that module is not logged in on meta, it is on en.wikipedia .. [15:05:53] The other module is logged in on both wikis [15:07:09] Whee .. and now suddenly, without code change .. it is logged in on both [15:07:16] Some strange, strange caching .. [15:07:25] Yep, that sounds like a race condition all right. [15:08:08] Sigh [15:08:15] Ah well, it works [15:08:38] writing something like 20 debug lines in the code to see where exactly it fails .. can now all be turned off [15:10:03] I wouldn't count on it. If it's a race condition - and I'm fairly certain it is - the context switches caused by the system calls for the prints are likely to be part of the sequencing and will affect whether the bug shows up or not. [15:11:08] I just turn them off, not really remove them. Can always turn them on again [15:11:25] Add to that the fact that it's not immediately clear from the doc that an HTTP::Cookie jar can be accessed concurrently at all in the first place, and you're just begging for a heisenbug. :-) [15:12:27] Well, probably I should write simpler bots (and everybody should stop using external links on wikipedia so we don't have a spam-problem) [15:14:31] But my last code-upgrade (to make the bots really get all the external links that are added) made one of the bots too heavy for the original box it was on .. and even here it fills 80% of bots-2 memory .. [17:39:28] ugh [17:41:27] johnduhart: Is there an easy way (that you know of) to import the MediaWiki namespace to beta for all running wikis? [17:41:44] We're probably going to need a lot of gadget testing [17:44:02] hexmode, setup the same i18n caching? 
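Editor's note on the cookie jar race suspected earlier in this log: the bots in question are written in Perl, so the following is only a Python analogue of the idea (save explicitly right after login, and never let a reader see a half-written file), not the bot's actual code. It uses the standard library's `http.cookiejar.LWPCookieJar`; the helper names `make_cookie`, `save_atomically` and `load_fresh` are invented for this sketch, and whether this is really the cause of the login problem remains the hypothesis raised in the channel.

```python
import http.cookiejar
import os
import tempfile


def make_cookie(name, value, domain):
    """Build a session cookie by hand (a real bot gets it from the login response)."""
    return http.cookiejar.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=True,
        domain_initial_dot=domain.startswith("."),
        path="/", path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )


def save_atomically(jar, path):
    """Write the jar to a temp file in the same directory, then rename it into place.

    Readers either see the old complete file or the new complete file,
    never a partially serialized one.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".cookies-")
    os.close(fd)
    try:
        # ignore_discard=True keeps session cookies, which a login cookie usually is.
        jar.save(tmp, ignore_discard=True, ignore_expires=True)
        os.replace(tmp, path)
    except BaseException:
        if os.path.exists(tmp):
            os.remove(tmp)
        raise


def load_fresh(path):
    """Load the jar from disk into a new object, as a separate module would."""
    jar = http.cookiejar.LWPCookieJar()
    jar.load(path, ignore_discard=True, ignore_expires=True)
    return jar
```

The design choice here is that the process doing the login calls `save_atomically` itself as soon as the cookie is set, and every other module calls `load_fresh` just before it needs the session, instead of relying on an automatic save at some unspecified later moment.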
[17:44:06] though [17:44:16] just having the extensions enabled should do it [17:44:32] Reedy: I'm looking for gadget testing [17:44:38] https://yi.wikipedia.org/wiki/%D7%9E%D7%A2%D7%93%D7%99%D7%A2%D7%B0%D7%99%D7%A7%D7%99:Gadgets-definition [17:44:46] ah [17:44:56] that isn't on yiwiki.beta [17:49:15] hi hexmode [17:49:22] need anything asap? [17:49:32] because I need to go :o I hope everything is ok [17:50:21] petan: unless you know of a way to import that namespace right now, no, there is no emergency [17:50:29] which one [17:50:36] MW? [17:50:41] I am working on a tool for that [17:50:44] I'm thinking I'll have to script up a MW import for all wikis [17:50:51] that what I want to make [17:50:54] I would insert it to svn [17:51:01] which language you prefer? [17:52:16] petan: I don't really care as long as it works. I'm programming in php, after all. ;) [17:52:45] ok... I have no idea how to do this in php anyway need to go now [17:52:54] k [17:53:09] tell krinkle to log all changes he makes :P [17:53:34] :) [18:17:55] there's a maintenance script which could export the whole MediaWiki ns for a wiki [19:12:31] hexmode: regarding that mail on wikitech-l about testing gadgets, the environment should be already set up? [20:35:30] !log deployment-prep released unused IP address from project [20:35:33] Logged the message, Master [23:17:14] Ryan_Lane: can you insert me to open stack [23:17:23] btw which IP was unused [23:19:48] sure. 
[23:19:55] 215 [23:20:23] I thought it was release week ago [23:20:36] release IP was broken [23:20:39] ah [23:20:40] I fixed that too :D [23:20:42] :) [23:20:52] I also fixed reassociate address [23:20:55] I just returned from czech wiki meet up, heh funny thing [23:21:02] ok [23:21:10] so that you don't need to manually disassociate and reassociate [23:21:15] it will do that for you [23:21:18] that's cool [23:21:28] I also changed the new instance page [23:21:33] check it out, it's way better now :) [23:22:44] sure [23:23:13] btw Ryan_Lane I want to make nagios configurable but it parse the info from the nova [23:23:16] using smw [23:23:26] is there a way to let users check what they want to monitor [23:23:39] on instance config page so that nagios could get it from json? [23:24:49] like "nagios config" and there would be bunch of values to check [23:25:06] 01/16/2012 - 23:25:06 - Creating a home directory for petrb at /export/home/openstack/petrb [23:25:21] well, we could have a puppet variable, where users can enter values [23:25:34] then you can parse that SMW property [23:25:49] I would prefer to have check boxes [23:25:53] yeah, me too [23:26:05] 01/16/2012 - 23:26:05 - Updating keys for petrb [23:26:06] but that would be puppet classes [23:26:09] hmm [23:26:16] let's just bypass puppet [23:26:22] hm, ok btw regarding bots [23:26:25] I can add another interface for monitoring [23:26:31] so.... [23:26:32] I need to create class for application server for bots [23:26:38] nova-dev1 and nova-dev2 are for andrewbogott [23:26:46] so that we can eventually start playing with "non root" project [23:26:46] nova-dev-3 is hyperon's [23:26:59] nova-ldap1 is for nova-dev1 and nova-dev2 [23:27:05] nova-production1 is mine [23:27:08] ok [23:27:18] petan: yeah [23:27:29] petan: talk to Krinkle. 
his web interface is interesting [23:27:33] ok [23:27:38] I'd really like to go a little further than what they are doing [23:27:46] I'd like an API, with a backend service [23:27:48] anyway I don't want to keep all stuff web clickable, people will need to use shell [23:27:53] the backend service would run on all the bot nodes [23:28:02] for maintenance [23:28:09] the api would take requests, and would stick them into a queue [23:28:17] Ryan_Lane: you will cry when you look at the source though - I certainly have when I looked at it after 2 years. [23:28:22] you know it's cool to control stuff using browser, but you can't grep logs in that [23:28:33] then the backend service on the bots would handle the actions [23:28:36] petan: everything is possible [23:28:40] hi Krinkle [23:28:49] re deployment, did you reset pw? [23:28:51] I'd like to treat it like an openstack service [23:28:55] btw log everyting pls [23:29:16] ok [23:30:03] petan: no i remembered the password, at last [23:30:08] cool [23:30:37] I'm away-ish, btw. we are doing SOPA blackout stuff [23:30:42] Ryan_Lane: I think we should put it on proposal and discuss [23:30:47] petan: indeed [23:30:55] ok [23:31:03] I've been wanting to spend some time on all the proposals, but haven't gotten time just yet [23:31:09] Ryan_Lane: SOPA was a part of discussion on wiki meetup :D [23:31:26] Ryan_Lane: they were like: are they going to shut off our wiki too? 
:D [23:31:34] just enwiki [23:31:35] heh [23:31:36] I know [23:31:39] I told them [23:31:53] I've seen mail [23:31:54] from you [23:31:56] on wikitech [23:32:11] also the beta on test wiki [23:32:15] of blackout screen [23:33:09] btw Krinkle if you wanted to add to bots project let me know [23:33:19] there is a lot of stuff to work on [23:33:29] we have like 10 production bots [23:33:44] and we have really troubles to utilize vm's well [23:33:58] because it's always like one vm is idle and another has load 7 [23:34:01] or more [23:34:04] Yeah, CVN is quite big though. may be better to create a separate project (or what it is called ? Ryan_Lane knows from yesterdays convo) [23:34:22] oh ok, anyway if you needed help let me know [23:34:33] usefull on bots is that we already have nfs and apache server [23:34:35] since CVN has 20 IRC bots, a php-powered web control panel and a php-powered read xml/json API [23:34:37] with public ip [23:34:55] well, I'd prefer it all to be in bots project [23:34:59] ok, 20 irc bots in nothing compared to cluebot and coren bot :D [23:35:17] the way I have the bots service mapped out in my mind allows us to move bots at will between instances [23:35:29] Krinkle: even wm-bot is running on bots [23:35:34] it's irc bot [23:35:48] 20 irc bots, joining almost a 1000 channels (850+ on irc.wikimedia.org and some on freenode) [23:35:48] ) [23:36:02] Krinkle: no reason to make another project for them [23:36:11] we can put them all to own instances if you want [23:36:29] like I said. project / instance, whatever you call it [23:36:34] Krinkle: wanna add to project so that you can start on that? [23:36:34] group [23:37:17] can users be granted access to it ? I don't want to have them run on my personal account (ok for now but not when we switch for primary from toolserver to labs) so that my fellow staffers can access it as well. [23:37:37] right that's exactly what I have problem with too [23:37:40] there's sub-access within the bots projects. 
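Editor's note on the bots service described in this conversation (an API that puts requests on a queue, plus a backend service on every bot instance that acts on them): the "move" operation can be sketched as below. Everything here is hypothetical, since no such service exists in the log: the class `BotManager`, the message fields, and the use of an in-process `queue.Queue` as a stand-in for a central queue service are assumptions made only to illustrate the flow.

```python
import queue


class BotManager:
    """Hypothetical per-instance service; it only tracks which bots it 'runs'."""

    def __init__(self, instance, bus):
        self.instance = instance
        self.bus = bus  # shared queue standing in for a central queue service
        self.running = set()

    def start(self, bot):
        self.running.add(bot)

    def stop_for_move(self, bot, target):
        """Stop a bot locally and announce that it may now start on target."""
        self.running.discard(bot)
        self.bus.put({"event": "stopped", "bot": bot, "target": target})

    def handle(self, message):
        """Start the bot if this instance is the move target; return True if handled."""
        if message["event"] == "stopped" and message["target"] == self.instance:
            self.start(message["bot"])
            return True
        return False


def drain(bus, managers):
    """Deliver every queued message to the first manager that accepts it."""
    while True:
        try:
            message = bus.get_nowait()
        except queue.Empty:
            return
        for manager in managers:
            if manager.handle(message):
                break
```

For example, with managers for `instanceA` and `instanceB` sharing one queue, calling `stop_for_move("clogger", "instanceB")` on the first and then `drain(...)` leaves the bot running only on the second. The point of the ordering is that the target never starts a bot before the source has confirmed it stopped.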
[23:37:42] my prod bots are running there too [23:37:53] but people who currently have access there are pretty trustworthy [23:38:01] in future it will run on non root vm's [23:38:08] other cvn members need access to them and at the same time ideally other bots users would not. [23:38:11] for now, all people have access to all vm's [23:38:18] who are project members [23:38:48] huh? So I can access any project? I thought I had to be granted access to each vm separately [23:38:52] no [23:39:04] all vm's in a project can be accessed by members of that project [23:39:12] I remember hexmode adding me to 'beta' (deployment-web) [23:39:14] you can't access vm's in other projects [23:39:20] deployment is a project [23:39:20] oh, 1 project, multiple vms [23:39:24] yeah [23:39:25] you have access to all vm's there [23:39:27] Krinkle: ^ [23:39:33] I am in that project too [23:39:33] yeah, that makes sense [23:39:41] not only -web [23:39:42] to all [23:39:44] some projects will be considered "global", which means no one but ops has root [23:39:57] btw Krinkle next time you change something on deployment, log it :) [23:40:02] bots, eventually, will be "global" [23:40:03] if one should not have access to another vm in the same project, it sounds like it should be a separate project. but that's just my POV from what it sounds like [23:40:12] petan: I didn't change anything afaik. [23:40:15] really? [23:40:18] no [23:40:26] I thought like stuff from /mnt/www got lost [23:40:28] but ok [23:40:33] probably was someone else [23:40:35] I logged in and snooped around trying to find maintenance [23:40:43] but couldn't run changePassword.php [23:40:46] it's in /usr/local/apache/common [23:40:47] how is that set up? 
[23:40:52] like on prod [23:40:55] it's similar [23:41:04] btw never use -web [23:41:06] yeah, I don't have access to prod (not yet at least) [23:41:07] use -dbdump [23:41:08] !log deployment-prep to solve the trusted XFF problem, I installed tinycdb and created an 0 length file in the right place [23:41:09] Logged the message, Master [23:41:17] that's a maintenance vm [23:41:22] -web is only for apache [23:41:23] petan: documentation :) [23:41:26] I know [23:41:43] Krinkle: http://labs.wikimedia.beta.wmflabs.org/w/index.php?title=Help [23:41:47] that's a beginning I made [23:41:48] :D [23:41:54] I need to expand it [23:42:04] petan: I was gonna try to mirror the MediaWiki: ns for all supported wikis unless you have already done it [23:42:12] especially the gadget stuff [23:42:22] hexmode: we talked about it a few hours ago [23:42:33] I told you I'm working on that :) [23:43:54] [18:50:44] I'm thinking I'll have to script up a MW import for all wikis [23:43:55] [18:50:50] that's what I want to make [23:43:57] [18:50:54] I would insert it into svn [23:44:00] [18:51:01] which language do you prefer? [23:44:02] [18:52:15] petan: I don't really care as long as it works. I'm programming in php, after all. ;) [23:44:28] petan: oh, I know, but I want to [23:44:38] if you haven't already started [23:44:38] btw hexmode if you know how to do that, please do it :) [23:44:45] no I didn't start [23:44:52] k [23:44:53] but I would be happy to help you if you put it in svn [23:44:55] or something [23:44:56] * hexmode takes over [23:45:03] sure [23:45:05] ok [23:45:38] anyway I am drunk now, and I need to sleep because tomorrow I have to wake up early :D [23:45:44] wiki meetup was really crazy [23:46:04] better not ssh to any vm :D [23:46:29] Krinkle: do you want to be added to the project or not? 
[23:46:30] bots [23:46:39] actually I don't care so I added you [23:46:49] you can remove yourself if you need to [23:47:09] 01/16/2012 - 23:47:09 - Creating a home directory for krinkle at /export/home/bots/krinkle [23:47:39] thx :) [23:47:44] Ryan_Lane: new instance window is cool :) [23:47:52] people will not be confused by classes [23:47:56] yeah [23:48:01] petan: Is there curl, mono, xbuild, php here? [23:48:06] and svn? [23:48:08] 01/16/2012 - 23:48:08 - Updating keys for krinkle [23:48:16] btw Ryan_Lane I wanted to help with the extension but I needed to test it somewhere :> [23:48:26] petan: use nova-production1 [23:48:33] Krinkle: mono is there, php cli too, the others you need to install and log [23:48:40] make a copy of the 1.18wmf1 directory [23:48:46] Krinkle: everything that is not logged will not be puppetized [23:48:51] ok [23:48:56] and change apache to have another wiki [23:49:02] ok [23:51:01] !log bots new vm for dozen of bots of Krinkle created :o [23:51:02] Logged the message, Master [23:51:20] Krinkle: I created bots-4 for you, because the other vm's are already a bit busy [23:51:27] ok [23:51:29] bots-3 is half full so you can use it too [23:51:42] !nagios [23:51:42] http://nagios.wmflabs.org/nagios3 [23:51:58] btw if you want an account on nagios let me know; I'm trying to use ldap but it doesn't work [23:52:49] you need to use a proxyagent [23:52:54] ah [23:52:56] see /etc/ldap.conf [23:53:01] or /etc/ldap/ldap.conf [23:53:01] I used apache authz_ldap [23:53:06] yeah, that's fine [23:53:12] but it doesn't work [23:53:12] but you need a proxyagent for lookups [23:53:21] anonymous searches aren't allowed [23:53:32] AuthName "Nagios restricted" [23:53:34] AuthLDAPURL "ldap://virt0.wikimedia.org:389/ou=people, o=wikimedia" [23:53:46] maybe it's wrong? 
[23:54:14] ok I will try to find it [23:54:58] not o=wikimedia [23:55:07] look at ldap.cond [23:55:09] *conf [23:55:12] I did [23:55:22] but no idea how to configure it for the apache mod [23:55:28] dc=wikimedia,dc=org [23:55:35] ou=people,dc=wikimedia,dc=org [23:55:45] ok [23:55:54] also, you want to use CN as the search attribute [23:55:55] not uid [23:56:02] for consistency [23:56:09] since all other web services are using cn [23:56:45] (&(cn=)(objectclass=inetorgperson)) [23:56:53] that can be used as a search [23:57:11] hm... [23:57:16] it should use TLS, if possible, as well [23:57:26] ok I will look into that tomorrow [23:57:30] * Ryan_Lane nods
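Editor's sketch of the LDAP settings discussed above: the base DN `ou=people,dc=wikimedia,dc=org`, `cn` as the search attribute, and the `(&(cn=...)(objectclass=inetorgperson))` search filter. The host and port are the ones pasted in the chat. The helper names are hypothetical, and the URL layout `ldap://host:port/basedn?attribute?scope?filter` assumes the Apache module in use is mod_authnz_ldap; the proxyagent bind credentials and TLS that Ryan_Lane mentions would still need to be configured separately and are not shown here.

```python
HOST = "virt0.wikimedia.org"                 # host as pasted in the chat
PORT = 389                                   # port as pasted in the chat
BASE_DN = "ou=people,dc=wikimedia,dc=org"    # corrected base DN from the chat


def escape_filter_value(value: str) -> str:
    """Escape the characters RFC 4515 reserves inside an LDAP filter value."""
    replacements = {
        "\\": r"\5c",
        "*": r"\2a",
        "(": r"\28",
        ")": r"\29",
        "\0": r"\00",
    }
    return "".join(replacements.get(ch, ch) for ch in value)


def user_filter(cn: str) -> str:
    """Build the search filter from the chat for one user's cn."""
    return f"(&(cn={escape_filter_value(cn)})(objectclass=inetorgperson))"


def auth_ldap_url(attribute: str = "cn", scope: str = "sub",
                  extra_filter: str = "(objectclass=inetorgperson)") -> str:
    """Assemble an AuthLDAPURL-style value (mod_authnz_ldap layout assumed)."""
    return f"ldap://{HOST}:{PORT}/{BASE_DN}?{attribute}?{scope}?{extra_filter}"


if __name__ == "__main__":
    print(user_filter("krinkle"))
    print(auth_ldap_url())
```

The escaping step is there because a user-supplied name containing `*` or parentheses would otherwise change the meaning of the filter.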