[00:00:14] :) [00:00:34] * addshore cant wait for it all to cometogether :) [00:03:56] Ryan_Lane: you could always write me a faultless php framework for wiki api and wikidata also incorperating the db repls ;p [00:04:25] hahaha. good luck on that one ;) [00:05:34] addshore: Why isn't it working on tools? [00:06:13] Coren: my scripts run if I run them straight on an exec instance but not through qsub or jsub [00:06:28] addshore: What is the result when you try? [00:06:35] * addshore is trying to find it [00:07:11] 10:1 you're bumping your head against the allocation limit. What language is this written in? [00:07:17] libgcc_s.so.1 must be installed for pthread_cancel to work [00:07:28] php [00:07:31] Yep. Running against the ulimit. Lemme guess: php? [00:07:34] * Coren nods. [00:07:39] :D [00:07:48] php is teh evil. It never error checks its malloc()s [00:08:16] Best way: try once with -mem 2G, and when it's running check it's actual usage. [00:08:40] it uses about 16mb tops [00:08:55] addshore: Heh. PHP won't even *start* with just 16m [00:09:09] thats what it uses when its running though ;p [00:09:25] *correction 19 [00:09:45] addshore: More precisely it's using PHP overhead + 19M. :-) [00:09:53] Try it with -mem 2G [00:10:19] jsub -mem 2G 'php /data/project/addbot/wikidata/g.php --lang=de' ? [00:10:53] Yep. Then qstat -j jobnumber to look at its vmem and maxvmem [00:11:34] I'm pretty sure you're going to be surprised at how much a glutton php is. [00:11:42] (Though, admitedly, nowhere as bad as mono) [00:12:22] Coren: same error when running it [00:12:30] o_O [00:12:37] Mind if I try? [00:12:40] go for it [00:16:20] * Coren is confused [00:17:32] :D [00:18:16] Ah-ha! [00:18:31] I didn't even realize it! You're overquoting! :-) [00:18:57] You're trying to run a program named 'php /data/project/addbot/wikidata/g.php --lang=de'. Literally! :-) [00:19:13] Remove the quotes and you get it to run. It doesn't work, but for a real reason. :-) [00:19:45] PHP Warning: require(/home/addshore/.secure/.stathat.key): failed to open stream: Permission denied in /data/project/addbot/config/stathat.php on line 2 [00:20:06] hmm, thats because you dont have permission to that file :P [00:20:49] The problem isn't that /I/ don't have permission for the file, but that local-addbot doesn't. :-) [00:21:21] You /are/ running your bot through the tool account, aren't you? :-) [00:21:43] ahhhhh, is that my problem? :P [00:22:15] ... :-) [00:22:24] so, su to local-addbot? ;p [00:22:32] shorthand: become addbot [00:22:35] :D [00:23:04] >.< I need a password to su to addbot? [00:23:21] su always needs a password. You can /sudo/ to it though. [00:23:26] sudo -iu local-addbot [00:23:37] :) [00:23:51] Or, again, you know... 'become addbot' [00:23:57] Which does that. :-) [00:25:20] oh :P you didnt put it in quotes the first time ;p [00:26:40] addshore: That's because you seem to be overly fond of quotes. :-) [00:27:21] :( I cant chown my files to have addbot as the owner :< [00:30:02] I can also chown them for you if you want. [00:30:36] addshore: ^^ [00:38:46] What does it mean that a file has a T bit in the file permissions and only that bit set? [00:40:13] apmon: which project is this? [00:40:28] it usually means that a brick is hung for that gluster volume [00:41:09] maps [00:41:26] it is only a few files in /data/project/repo of maps [00:42:15] apmon: That the file has been broken by gluster. [00:42:15] apmon: Add to the long list of reasons to be happy we are getting rid of gluster. 
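A sketch of the job-submission steps worked out in the exchange above, using only the commands quoted there; the job number is whatever jsub reports back:

    # work as the tool account so the job can read the tool's files
    become addbot                    # shorthand for: sudo -iu local-addbot

    # submit without wrapping the whole command in quotes, otherwise the grid
    # tries to execute a single program literally named
    # "php /data/project/addbot/wikidata/g.php --lang=de"
    jsub -mem 2G php /data/project/addbot/wikidata/g.php --lang=de

    # once it is running, compare actual memory use against the 2G request
    qstat -j <jobnumber> | grep -E 'vmem|maxvmem'

As noted above, files under the tool's directory that are still owned by a personal account need an admin to chown them to local-addbot.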
[00:42:37] can I just delete or overwrite those files? [00:42:58] apmon: one sec [00:48:19] Coren: moved on to another error now :P [00:48:31] addshore: Pray tell? [00:49:25] well, something isnt working quite right with my db but its probably my fault ^^ [00:49:54] Probably. :-P [00:50:07] apmon: try now? [00:51:03] cd: project: Transport endpoint is not connected [00:51:38] before I could at least see some files and write to the directory... ;-) [00:54:49] Ryan_Lane: I can access the directory from maps-tiles2, but not maps-tiles1 [00:55:08] but I see the same 0 byte T files on maps-tiles2 as I saw on maps-tiles1 [00:55:12] gimme a sec [00:55:13] ah ok [00:56:18] maps-tiles1 is back again [00:56:27] I guess those files were split-brained before the shrink [00:56:34] you'll likely need to remove them [00:56:59] OK, no problem [00:58:46] thanks for looking into it [00:59:28] yw [00:59:55] committing the shrink on bots-project [01:02:39] * Hazard-SJ says hello and seeks assistance [01:03:05] :O [01:03:08] * Krenair waves [01:05:57] hey Hazard-SJ [01:06:21] Coren: all volume shrunk [01:06:27] except for the stopped one [01:06:29] *ones [01:06:34] and I can't fucking delete those [01:06:41] so, we'll just consider it done [01:06:49] it doesn't hurt to have down'd peers [01:06:56] Ah. [01:07:14] Although, history has proven /that/ statement wrong at regular intervals. :-) [01:07:34] What's the problem, Hazard-SJ? [01:07:50] Do we now call labstore[34] ripe for reaping? :-) [01:08:08] yes [01:08:18] have at em [01:08:30] first, though... [01:08:43] let me kill all the gluster processes [01:08:51] and let's leave that over night [01:09:01] Ah. See if something goes boom. [01:10:39] Although, again from experience, we know that gluster-related things don't go boom, they go a sort of squishy-wet "splorch". [01:11:52] seems to be fine so far [01:12:01] oh. they go "boom". trust me [01:24:28] "Boom" implies some sort of failing solidity; gluster fails like a cardboard box full of overripe tomatoes; it doesn't explode, it just collapses under its own weight and leaves mush all over the place. :-) [01:26:35] :D [01:26:38] true [01:27:24] Krenair: Sorry for the delay. Would it be better if I ran by scripts in the bastion cron or from an instance? I'm not entirely familiar with some of these things :P [01:27:36] definitely not the bastion [01:27:44] other non-bastion instance [01:28:12] ah [01:28:47] Ryan_Lane, Krenair: And in that case, how would I put myself on another instance? I think I remember seeing something saying not to use bastion for such things, as they'd slow down things for everyone. [01:29:19] ssh instance-name [01:30:03] looks like paravoid disabled one of your crons on november 27th for dosing bastion :) [01:40:46] So shouldn't something like ssh bots-bnr3 work? [01:40:54] Krenair: Yes :P [01:41:10] If you're in the bots project, yes [01:41:30] I am, but I got an error [01:41:45] Oops [01:42:04] Wait, logged in to something else :P [01:42:07] error? [01:42:32] (you are in the bots project btw) [01:42:35] I was on a different server [01:42:42] Yes, I'm in the project [01:43:19] I got in [01:56:54] Am I supposed to use /data/project/bot_name or just use my home? [01:58:54] use /data/project storage [02:00:44] legoktm: That's where I'm slightly puzzled - there is only one directory in /data/project as far as I see :/ [02:01:31] errr [02:01:34] which host are you on? [02:01:47] and when was your account created? 
it might take up to an hour to show up [02:02:57] legoktm: I'm currently exploring on bots-bnr3, so I guess that's why? My account's fairly old by now :P [02:03:16] * legoktm looks [02:03:31] legoktm: Wait [02:03:41] http://dpaste.de/wiqen/raw/ [02:03:46] wfm [02:04:33] That's when I login on bastion itself in WinSCP [02:06:01] And no matter what I ssh I still have (username)@bastion1:~$ ... is that wrong? [02:07:23] idk [02:07:32] im not sure how WinSCP works [02:07:53] https://wikitech.wikimedia.org/wiki/Access#Accessing_public_and_private_instances [02:07:54] legoktm: The (username)@bastion1:~$ is from PuTTY [02:07:54] try that [02:08:17] and the dir only gives me asher's dir [02:08:19] * Hazard-SJ looks [02:08:45] winscp probably can't connect all the way through [02:08:51] there's a good tutorial for windows somewhere [02:08:54] one sec [02:09:31] https://wikitech.wikimedia.org/wiki/User:Wikinaut/Help:Access_to_instances_with_PuTTY_and_WinSCP [02:10:58] :| [02:11:10] https://wikitech.wikimedia.org/wiki/User:Wikinaut/Help:Access_to_instances_with_PuTTY_and_WinSCP#How_to_set_up_WiNSCP_for_tunneling_through_bastion.wmflabs.org_to_your_instance [02:21:30] I'm getting somewhere now [02:22:12] Thanks :D [02:26:35] yw [02:29:53] I can't create a folder, though :/ [02:29:56] Command 'mkdir "hazard-bot"' [02:29:57] failed with return code 1 and error message [02:29:59] mkdir: cannot create directory `hazard-bot': Permission denied. [02:33:45] Ryan_Lane: ^ [02:34:04] it should already exist... [02:34:08] where? in /data/project? [02:34:25] you'll likely need to talk to petan about that [02:34:27] he's asleep, though [02:35:24] yes, in bots-bnr4 [02:36:02] Ryan_Lane: I'll just email him and give up for the night then :) [03:45:25] Coren|Sleep: I can't create a screen as my tool [03:45:25] local-legobot@tools-login:~$ screen [03:45:26] Cannot open your terminal '/dev/pts/2' - please check. [03:46:22] legoktm: That's normal since your tool doesn't own the controlling terminal -- you do. You can start a screen /before/ you become your tool though. [03:46:44] legoktm: But, more importantly, you aren't supposed to run a tool in a detached screen. :-) [03:48:40] Ok :P [07:42:05] !log tools petrb: removed reboot information from motd [07:42:07] Logged the message, Master [08:18:57] [bz] (NEW - created by: Peter Bena, priority: Unprioritized - normal) [Bug 47115] emails must not be delivered to wmf sysadmins mail - https://bugzilla.wikimedia.org/show_bug.cgi?id=47115 [09:36:40] !ping [09:36:40] pong [13:52:11] [bz] (NEW - created by: Peter Bena, priority: High - major) [Bug 45768] console doesn't show proper errors - https://bugzilla.wikimedia.org/show_bug.cgi?id=45768 [14:03:55] ps faux [14:10:52] hi [14:10:54] Oren_Bochman, do you have a labs shell account already? [14:11:01] yes [14:11:07] been in active for a while [14:11:11] What's your on-wiki username? [14:11:16] oren [14:11:40] That's your wiki name or your shell name? (There is constant confusion between the two...) [14:11:47] Also, what's your project for, and what would you like it called? [14:12:08] it is a MediaWiki Moodle integration [14:12:19] and please call it moodle [14:12:46] Would you like shared storage between instances? [14:13:10] I think so [14:13:45] I'll probably have a dev inst production inst and a MW inst [14:14:08] ok. Should be all set. Lemme know if it misbehaves. [14:14:15] one more thing [14:14:42] can you add me to the Tools project [14:14:52] Sure. Just a second... [14:15:12] ok, done. 
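For the command-line route described above (as opposed to the WinSCP tunnelling guide), the hop through the bastion looks roughly like this; <shellname> and the instance name are placeholders:

    # two-step: log in to the bastion, then ssh to the instance by name
    ssh <shellname>@bastion.wmflabs.org
    ssh bots-bnr3

    # or in a single command from your own machine (OpenSSH 5.4 or later)
    ssh -o ProxyCommand='ssh -W %h:%p <shellname>@bastion.wmflabs.org' <shellname>@bots-bnr3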
[14:15:27] is there any info about migrating stuff to this project ? [14:15:35] like bots or a scripts [14:16:20] There's a bit of documentation here: http://www.mediawiki.org/wiki/Wikimedia_Labs/Tool_Labs/Help [14:16:45] You'll probably want a tool group created; at the moment Coren is the one to ask about that. [14:17:06] He's in EDT so should be online pretty soon. [14:17:10] well it is enough to get me started [14:17:58] I want to make a new bot to push and pull stuff beteween moodle and media wiki [14:18:20] but I need to set things up first [14:18:23] ttl [14:22:38] Would it be possible to get gnuplot for the tools project? [14:25:17] fwilson: Probably; best to ask Coren when he appears. [14:25:36] Alright, I will [14:27:22] automatic plotting of recent vandalism statistics :) [14:28:23] andrewbogott: I need an instance with LAMP [14:28:49] I'm not sure what I need to choose from the Puppet info [14:29:04] Oren_Bochman: First you'll want to set up a web security group [14:29:47] Oren_Bochman: Here's a guide to setting up a mediawiki instance: https://wikitech.wikimedia.org/wiki/Help:Single_Node_MediaWiki [14:30:02] For LAMP w/out mediawiki you'll want to follow that guide but stop short of doing the MW puppet config [14:30:26] ok [14:30:42] And instead you'll want to add… webserver::php5-mysql I think. [14:30:45] I'll probably be adding a Moodle pupet conf [14:30:45] Just that one option should do it. [14:31:02] Sure, if you make a Moodle conf then it can just include a webserver class. [14:31:07] I thought so too [14:31:27] but I forgot about adding a security group [14:31:41] Yeah, awkwardly you need to do that before you create the instance. [14:32:03] Fortunately instance creation is now speedy, thanks to Ryan_Lanes work last week. [14:32:17] I read about that [14:33:59] MaxSem: I have updated the packages in /data/project/repo of the maps project now [14:34:28] The packaging scripts I used to build them are at https://github.com/apmon/OSM-rendering-stack-deplou/tree/master/wikipedia/debian [14:35:59] I also tried to set them up on maps-tiles1. Unfortunately something went wrong and now puppet doesn't work anymore [14:36:06] Is Coren here/ [14:36:19] In meeting, will be back in 5 [14:36:46] hi [14:39:51] Sorry. Back. [14:40:22] fwilson: gnuplot you say? [14:41:56] Change on 12mediawiki a page Wikimedia Labs/Tool Labs/Notepad was modified, changed by MPelletier (WMF) link https://www.mediawiki.org/w/index.php?diff=672365 edit summary: [+12] +gnuplot-nox [14:42:20] which puppet defs do I need to be able to config my instance with puppet? [14:42:36] is puppetmaster::self enough ? [14:42:59] Oren_Bochman, I'm not sure I understand the question. [14:43:16] If you want to write puppet code, then you want puppetmaster::self [14:43:20] I want to config my moodle instance with puppet [14:43:23] ok [14:43:30] If you just want to configure using puppet, then… everything already does that. [14:43:41] I want to code it [14:43:58] https://wikitech.wikimedia.org/wiki/Help:Self-hosted_puppetmaster [14:44:01] since we might need a bunch of these for different languages [14:45:15] For me the addition of puppet classes via labsconsole didn't work [14:45:32] apmon: How come? [14:45:45] It said the class wasn't found. [14:45:59] Whereas when I added it directly to site.pp it did find it and run through fine [14:46:07] apmon: Ah, yes, well you need to have it present in the operations/puppet git /before/ you add it to your node. :-) [14:46:20] I did [14:47:07] apmon: Oh, I see what you mean. 
Yeah, adding the class via the labconsole adds it you your /node/ definition, but it still has to be reachable from site.pp (or indirectly, if it is a module). [14:47:54] apmon: The labsconsole basically just adds "include your_class" to your node definition, it still has to be visible to puppet in the general case. [14:47:56] Was that always the case? I thought the above used to work (around March), but I might have forgotten a step that was necessary [14:48:43] That was all I did in site.pp to make it work. Added role::osm:db to include of the default node def [14:48:44] apmon: Well, you might have been using a module; puppets autoloads them. If your class isn't in a module, then it needs to be added to site.pp [14:48:44] apmon, thanks! [14:49:02] It is a module [14:49:29] https://gerrit.wikimedia.org/r/#/c/36222/ is the module in question [14:49:41] apmon: o_O then it should have worked. Odd. You added it through "Manage Puppet Groups" then turned it on in "Configure Instance"? [14:49:51] yes [14:50:04] What is the project/instance? [14:50:10] apmon, I've already applied puppetmaster::self to maps-tiles[12] [14:50:51] I tried it on maps-tiles1 (which now has a broken puppet altogether after a restart) and on map-tiles2 [14:50:57] sorry maps-tiles3 [14:51:02] broken how? [14:51:03] * Coren checks. [14:51:08] which is a new instance I created yesterday [14:52:29] after a node is self-hosted, you configure it by editing site.pp [14:52:31] apmon: https://wikitech.wikimedia.org/w/index.php?title=Special:NovaInstance&action=configure&instanceid=a43c1612-c13e-4aeb-ae0b-b2a86a14b36f&project=maps®ion=pmtpa [14:52:44] Oh, wait, let me try and restart maps-tiles1. Because it was doing something strange, I might have killed puppet, which might explain why it sais it can't connect [14:52:45] Note that you didn't turn on any of the classes. :-) [14:52:57] I turned them off again, as it didn't work [14:53:01] Ah. [14:53:10] Mind if I try it? [14:53:25] yes, go for it. [14:53:39] Oh, wait, it's puppetmaster::self [14:53:39] It should be role::osm:db and role::osm:tileserver that should work [14:53:43] yes [14:54:02] If a node is puppetmaster::self, it no longer gets update from git automatically -- you have to update it yourself! :-) [14:54:34] Well, the module in question isn't in the main repository (only in gerrit so far) which is why I need the self hosted puppet [14:54:46] Ah. Okay. :-) [14:54:54] but it is present in the local repository [14:55:55] apmon, want me to do it or you're trying it yourself? [14:56:19] apmon... It worked. [14:56:27] notice: /Stage[main]/Osm::Db::Setup::Basic/Postgresql::Createuser[osm]/Postgresql::Sqlexec[createuser-osm]/Exec[echo "CREATE USER \"osm\";" | psql postgres]/returns: executed successfully [14:56:34] Coren: where did you try it? [14:56:40] maps-tiles3 [14:56:51] well, that has those in site.pp under the default node [14:56:57] Ah. [14:56:58] which is what worked [14:57:04] * Coren tries it again after removing that. [14:57:10] cheers [14:57:45] MaxSem: Try the puppet class through labsconsole, or where you refering to something else? [14:58:06] nah, labsconsole shouldn't work:) [14:58:36] why not? [14:58:46] apmon: ... still works even after removing it from site.pp [14:59:17] hmm, ok. Perhaps I was doing something wrong then. 
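Since a puppetmaster::self instance no longer updates from git automatically (as noted above), an unmerged change such as https://gerrit.wikimedia.org/r/#/c/36222/ has to be pulled onto the instance by hand. A rough sketch — the checkout path and patchset number are assumptions, and the exact fetch ref should be copied from the change's download links:

    # on the self-hosted puppetmaster (the path may differ on your instance)
    cd /var/lib/git/operations/puppet
    git fetch https://gerrit.wikimedia.org/r/operations/puppet refs/changes/22/36222/1
    git cherry-pick FETCH_HEAD

    # then run the agent to check that the class is now found and compiles
    sudo puppetd -tv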
[14:59:47] you can't apply a class that hasn't been merged in gerrit yet [14:59:51] MaxSem: Do you mind if I reboot maps-tiles1 [15:00:01] MaxSem: You can, iff you're puppetmaster__self [15:00:12] apmon, rebooting [15:00:12] puppetmaster::self* [15:00:25] mmm [15:00:26] err: Could not retrieve catalog from remote server: Connection refused - connect(2) [15:00:42] MaxSem: Is that after rebooting? [15:00:50] no, before [15:00:58] I was getting that error, but it might be because the local puppet master was killed by me [15:01:22] which is why I want to reboot the instance [15:01:42] although it might be easier to just restart puppet... ;-) [15:02:37] aha [15:02:43] now it's clearer [15:02:43] err: Could not retrieve catalog from remote server: Error 400 on SERVER: Invalid parameter day at /etc/puppet/modules/osm/manifests/tileserver.pp:42 on node maps-tiles1.pmtpa.wmflabs [15:03:12] I haven't had that issue before [15:05:49] ho do Is set the labs_mediawiki_hostname to the fully qualified hostname [15:08:38] Oren_Bochman, let me adjust your quotas so you have a public IP [15:08:48] thanks [15:11:42] would he.moodle.labs.org be an ok he.moodle.labs.org [15:12:07] labs_mediawiki_hostname ? [15:15:24] You will need an actual, existing DNS name first… it won't do much if you point labs_mediawiki_hostname to a nonexistent name. [15:15:50] And it will need to be under a domain that exists for labs already. [15:16:08] I'm confused [15:16:24] labs_mediawiki_hostname tells the wiki where it lives. [15:16:29] But it doesn't cause the wiki to live there. [15:16:34] …if that makes sense. [15:17:18] what I'm asking is - what domain would work for this [15:18:06] I guess once it works we can also set the official dns to point at the same address [15:18:55] You should start by configuring your instance with a public IP and assigning it an address, using this page: https://wikitech.wikimedia.org/wiki/Special:NovaAddress [15:19:15] That provides you with a list of high-level domains, and the ability to assign an arbitrary hostname. [15:19:29] Once you have that set, you'll enter that into labs_mediawiki_hostname when you set up the wiki [15:19:58] If you don't like the options in the list, I can create another one. It will always be under wmflabs.org. [15:20:44] (Sorry, not sure if I'm answering your question :( ) [15:21:06] MaxSem: I am now also getting the Invalid day error in puppet [15:21:25] on a new instance I just created maps-tiles4 [15:21:30] no you are very helpfull [15:24:21] Coren: I just tried it on a fresh instance and the labsconsole setting of puppet classes now indeed seems to work [15:25:05] I am reasonably sure I did the exact same thing yesterday, but oh well, as long as it works now it is fine. Thanks for your help. [15:26:01] MaxSem: After commenting out the cron script in tileserver.pp it now seems to run. [15:26:13] apmon: np. That's what they pay me the reasonably-sized bucks for. :-) [15:26:25] Be back in a few. Lunch. [15:26:29] no idea why that no longer works (or newly throws an error) though [15:27:26] hi DarTar [15:30:44] MaxSem: There are a couple of minor things I noticed about the puppet scripts. [15:30:52] yup? 
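When a run fails the way the "Invalid parameter day" error above does, a syntax check plus a no-op run reproduces the compile error without changing anything on the instance; the manifest path is the one from the error message:

    # syntax-check the manifest the error points at
    puppet parser validate /etc/puppet/modules/osm/manifests/tileserver.pp

    # compile the catalog and simulate a run without applying changes;
    # invalid resource parameters still surface at this stage
    sudo puppetd -tv --noop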
[15:31:43] In the mapnik datasource defininition there is a parameter , which if set forces postgresql to use a tcp connection rather than a unix domain socket [15:31:56] with that the automatic user identification no longer works [15:32:17] which means you'll need to actually set passwords and set up postgresql to use them [15:33:30] normally the easiest ist to just comment out the parameter, but as we might want to have the db on a different server than renderd, I guess that wouldn't work [15:34:54] secondly, I have removed the automatic creation of the /var/lib/mod_tile from the packages, as one might want to use a dfferent directory (or a rados store) to store tiles [15:35:45] creating the tiles directory therefore needs to be added to the puppet scripts (as other parts currently rely on its existence) [15:36:25] thirdly, it might be better to run renderd under the osm user and not under www-data. [15:36:31] in such case, you'll need to remove all the default configs that mention this dir. not worth the effort if you ask me [15:37:13] all of the configs need to be set correctly by puppet for the installation anyway [15:37:16] what's the difference between www-data nad osm? [15:37:32] nothing really, it just seems cleaner [15:37:48] but if you are OK with it, then it should work that way [15:38:49] mmm [15:38:59] other than that, I think everything worked fine [15:38:59] all the files appear to be owned by osm [15:39:21] so indeed doing that from osm might make sense [15:39:54] mod_tile, which runs in apache and therefore as user www-data needs to be able to read the files in the tile dir (assuming one is using a file based tile directory). [15:40:03] but that can be done through file permissions [15:40:12] yeah [15:40:29] but allowing just one user to write into these dirs sounds safer [15:40:46] also, there's a dependency on toolserver in load-next. I don't quite understand what it does... [15:41:11] it just translates a date to a replication sequence number [15:41:22] makes it easier to set up replication. [15:41:34] but you can get the correct sequence number through other means as well [15:42:09] e.g. looking in http://planet.openstreetmap.org/replication/minute/ to see what the correct state.txt is to start replication from [15:42:30] yeah, a dependency on third party site is dangerous [15:42:55] No issue with taking that out. [15:43:12] It is only needed once for the intitial import anyway [15:45:44] apmon: [15:45:46] "/var/lib/mod_tile/.osmosis/": [15:45:46] ensure => directory, # so make this a directory [15:45:46] recurse => true, # enable recursive directory management [15:45:58] so it should already be created [15:46:40] hmm, it failed for me yesterday. But the recurse should indeed have worked [15:47:22] Oh thanks Coren :) [15:48:05] err: /Stage[main]/Osm::Importer::Files/File[/var/lib/mod_tile/.osmosis/]/ensure: change from absent to directory failed: Cannot create /var/lib/mod_tile/.osmosis; parent directory /var/lib/mod_tile does not exist [15:48:53] somehow recurse => true doesn't seem to do what I thought recurse would do [15:49:35] lemme prepare a new commit [15:57:05] mhm, puppet docs in their shiny epicness: [15:57:21] "recurse: Whether and how deeply to do recursive management." [16:02:09] apmon, I've updated https://gerrit.wikimedia.org/r/36222 [16:07:41] looks good. [16:08:07] It might be worth making the tiles directory a variable and then use templates to substitute it [16:08:32] what for? [16:09:06] to be able to change it easily [16:09:12] e.g. 
when we move over to rados [16:09:37] or to move it to /data/project/tilesdir to use it as a shared directory accross instances for testing [16:10:00] but perhaps it isn't worth it [16:11:12] eh, project storage is too slow [16:12:53] as long as it works, it should be fine for testing fail-over procedures, even if it is slow [16:13:02] you just can't do load testing with it... ;-) [16:14:16] and load testing wouldn't really work in labs anyway (for one you can't get a performant full planet db on labs) [16:29:16] MaxSem: I see there is a passwords::osm::db class. Do you already use it? I guess that lives in the puppet/private directory? [16:29:31] no I don't [16:39:48] Any ideas on why a gnuplot script wouldn't run from a webtool, but would run from the command line? [17:11:44] [bz] (NEW - created by: Željko Filipin, priority: Unprioritized - normal) [Bug 47129] Upload Wizard broken at commons.wikipedia.beta.wmflabs.org - https://bugzilla.wikimedia.org/show_bug.cgi?id=47129 [17:35:50] fwilson|busy: u can haz gnuplotz! [17:38:00] yes i see :) [17:38:04] but it doesn't work from my webtool [17:38:06] strangely [17:38:32] which is http://tools.wmflabs.org/voxelbot/cgi-bin/voxel.py?dtype=graph [17:39:02] it runs a gnuplot script from recent vandalism stats, which works from the command line but not from the tool [17:39:47] fwilson|busy: Hm. Lemme check something. [17:40:46] fwilson|busy: [Thu Apr 11 16:25:28 2013] [error] [client 10.4.1.89] gnuplot: not found [17:40:46] You should specify the path explicitly on invocation: /usr/bin/gnuplot [17:40:53] * fwilson|busy feels stupid [17:41:18] yay, pretty graph! [17:41:48] o_O I'm pretty sure your axis is lableled wrong. :-) [17:41:54] of course, it probably is [17:41:58] oh [17:42:02] that's relative time [17:42:33] The other one. Reverts/edits > 1 seems... unlikely. I'm pretty sure you mean edits/reverts :-) [17:42:50] oh [17:42:51] hi again [17:42:54] that's not even a proportion :) [17:43:08] you're supposed to be able to pick from either reverts or edits [17:43:11] I'm having trouble accessing my nw instance port 80 from the web [17:43:24] Ah, reverts /or/ edits per minute. :-) [17:43:28] trying to add a security group entry [17:43:34] :) [17:43:43] http://tools.wmflabs.org/voxelbot/cgi-bin/voxel.py?dtype=graph&last=120 is pretty [17:43:55] but I'm not sure what the CIDR should be [17:44:13] Oren_Bochman: You want to allow from the world at large? [17:44:17] Oren_Bochman: 0.0.0.0/0 [17:44:19] yes [17:44:45] Oren_Bochman: Your instance also needs a public IP for that to work, though. [17:44:46] ok [17:45:14] would 0.0.0.0/80 be better ? [17:45:23] I defined the ip already [17:45:33] Nono, it has to be 0.0.0.0/0 -> that's the IP and mask. [17:45:33] and associated it with the instance [17:45:39] ok [17:45:41] You set the port separately. [17:45:53] now to set up unique image names [17:46:49] I get failed to add rule [17:47:58] I use 80,80,tcp, 0.0.0.0/0, web [17:48:52] Oren_Bochman: Wait, you normally add a source route _or_ include a rule. [17:49:02] Oren_Bochman: What instance is this? [17:49:23] it is called moodle [17:49:42] he-moodle [17:49:56] Oren_Bochman: Lemme check. [17:49:59] in project moodle [17:51:57] Oren_Bochman: Okay, as I though. Just use 80,80,tcp,0.0.0.0/0 but don't add a source group [17:52:06] ok [17:52:28] A source group is when you want to include a set of rules; by saying 'web' there you tried to include the rule within itself. 
:-) [17:55:46] strange [17:56:04] any how as it works now [17:56:31] so I'll now try to install moodle using puppet ;-) [17:57:48] Coren: could you tell me what line my error is on :) [17:58:39] oh, nevermind [18:00:06] * Coren doesn't mind, then. [18:00:15] alright, tell me where my error is :) [18:00:48] * fwilson|busy can't find it [18:01:57] wait, does python 3 not support the psuedo-tenary operator? [18:03:08] Coren did you solve that problem of addshore with php [18:03:22] Alright Coren, I can't find my error :) [18:03:39] well. [18:03:41] just kidding. [18:04:37] fwilson|busy: I can't actually help you with the line in your script; you invoke a system() equivalent and it's the *shell* that bombs out because you don't have a $PATH set. [18:04:51] petan: What problem? Lemme check scrollback. [18:04:52] no it's not that anymore [18:04:59] It was a silly syntax error [18:05:11] I fixed the gnuplot-not-found thing [18:05:27] [Thu Apr 11 17:59:53 2013] [error] [client 10.4.1.89] File "/data/project/voxelbot/cgi-bin/voxel.py", line 30 [18:05:27] [Thu Apr 11 17:59:53 2013] [error] [client 10.4.1.89] rvoe = "reverts" if rvOrEdits = "rv" else "edits" [18:05:49] Coren he can run his script on sge [18:05:51] yep [18:05:53] that was it [18:05:57] but he can run it from instance [18:06:03] Coren are you helping out with the toolserver migration [18:06:04] now cgitb is working :) [18:06:38] petan: Ah, yes, that was fixed; the reason it didn't work is because of permissions - he was trying to run it from /his/ account rather than his tool's. :-) [18:07:00] Oren_Bochman: Strictly speaking, that's my primary job. :-) [18:07:11] yay, no more image. [18:07:13] hmmm [18:08:07] permissions :) [18:08:43] maybe... [18:09:30] Interesting. [18:09:33] It's skipping the file... [18:11:05] * Coren doesn't have enough context. [18:11:11] Is this good or bad? :-) [18:14:51] Coren: any idea why I would get 403s when viewing the generated images? [18:15:14] fwilson|busy: I can look. Gimme an url that does that? [18:15:19] sure, one min [18:15:26] http://tools.wmflabs.org/voxelbot/img/9e68517380a29b767250cdee5001a7dd.png [18:17:04] Oh, god, _please_ don't put any directory o+w! :-) You scripts run with the UID of your tool, so it's never needed. [18:17:13] really? [18:17:16] * Coren nods. [18:17:22] oh that's cool [18:17:26] * fwilson|busy did not know that [18:17:53] wheee, prettygraph! [18:17:58] Hm. From what I can tell, at least one of the ways you generate your images creates the files without world read permission. Where in the code do you savethem? [18:18:16] the graph function [18:18:25] See: -rw-rw---- 1 local-voxelbot local-voxelbot 4470 Apr 11 18:13 9e68517380a29b767250cdee5001a7dd.png [18:18:34] hmmm [18:18:49] i'll see what i can do [18:18:54] Can you point me at a line/file? [18:23:53] fwil|actuallybus: Also, +x on pngs seems... a little pointless. No harm in it, though, just a bit unusual. [18:48:00] How i can execute a perl script with local modules with jsub? [18:49:25] UA31_: Well, your script will see the same filesystem on the excution nodes it sees on the login box. It should Just Work(tm) [18:49:56] UA31_: But remember that if you had to set environment variables for the includepath, etc, your script also must set them. [18:51:36] I have installed the modules with cpanm [18:53:00] For local installation of modules [19:01:29] UA31_: Aaah, did you install them for the /elvisor/ account or for youself? 
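The two webtool problems traced above — gnuplot not found, then 403s on the generated images — come down to the webservice environment having no login $PATH and the output files lacking world-read permission. A shell sketch of the fix; the plot script name and image directory are assumed, not taken from the actual tool:

    # create files world-readable (but never world-writable) from the start
    umask 022

    # call gnuplot by absolute path: the CGI environment has no usable $PATH
    /usr/bin/gnuplot /data/project/voxelbot/plot.gp

    # fix up images that were already written without o+r
    chmod o+r /data/project/voxelbot/public_html/img/*.png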
:-) [19:01:57] For main account ua31 [19:02:42] Ah, well then it should work when you submit jobs; but you really should do so from your tool account, not your own. [19:02:54] Want me to switch to your user and see what's wrong? [19:04:38] Switch [19:05:13] UA31_: What is the job that's giving you problems? [19:05:22] UA31_: I.e.: what command line are you trying, exactly? [19:05:49] jsub -continuous botf.sh [19:06:23] in rlinks? [19:16:22] Unable to parse the feed from https://bugzilla.wikimedia.org/buglist.cgi?chfieldfrom=-4h&chfieldto=Now&list_id=151044&product=Wikimedia%20Labs&query_format=advanced&title=Bug%20List&ctype=atom this url is probably not a valid rss, the feed will be disabled, until you re-enable it by typing @rss+ bugzilla [19:22:54] @rss+ bugzilla [19:22:55] This item was enabled now [19:23:15] (bugzilla died, see -operations) [19:28:45] bugzilla is up again [19:29:29] \o/ [19:29:58] Yeay! Praise zombie bugzilla, shambling on after its death! [19:41:33] andrewbogott: so, it should be possible for instances to actually boot on nova-precise2, assuming it has enough memory [19:42:03] also, I did get resize somewhat working, but it's not very reliable [19:42:11] so, if we want to risk killing the instance, we can :D [19:45:04] Ryan_Lane: I think there's not enough memory, but I'll try one right now. [19:45:31] it's going to error [19:45:36] if there isn't enough memory [19:46:52] yep, immediate failure [19:47:43] Do you want to risk a resize? Or should we dig into the process of making a new host? [19:48:02] hm [19:48:57] Puppetizing seems worthwhile in the long run anyway... [19:49:14] And I don't mind spending the time. I'm not sure how much time you burned on nova-precise2 after my initial setup though [19:50:22] a few hours at least [19:50:38] would be nice to have it fully puppetized [19:50:52] what did you need working instances for, btw? [19:51:40] Sometimes when I select classes in the 'configure instance' page, I can't deselect them. Like, I uncheck the box but next time I look it's checked again. [19:52:04] oh? [19:52:06] Pretty minor, but would be easier to test if I could actually use 'configure instance' on a test box. [19:52:24] you can do that on instances in the error state [19:52:32] Really? [19:52:34] * andrewbogott tries it [19:52:34] yeah [19:53:25] Hm, sure enough. Of course, it works properly there :) Have to make some test cases. [19:53:32] heh [19:53:52] Anyway -- go ahead and offer your comments on those three patches, and I'll think about what the next steps for puppetizing are. [19:53:53] if you feel like totally redesigning that page (or entire workflow) have at it [19:53:57] I hate it [19:54:36] it's ugly and the code is complex and error prone [19:55:15] Would be nice if the list of options actually came from the puppet manifests. I don't really know how we'd do that though. [19:55:38] assuming we documented them in some standard way, we could just parse the repo [19:55:43] in a cron [19:56:46] heh. opendj is adding a rest interface [19:56:51] for people who don't like ldap calls [20:06:38] that seems good! [20:06:42] For me at least :) [20:08:52] ok… but reproduced and diagnosed. It's just because the same puppet class appeared twice in the list… I have to uncheck it both places to really turn it off. 
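Picking up the earlier Perl question: the grid execution hosts don't source a login environment, so a wrapper like botf.sh has to export the cpanm include path itself before starting the bot. This sketch assumes a local::lib-style layout under ~/perl5 and an illustrative script name:

    #!/bin/bash
    # botf.sh -- submitted with: jsub -continuous botf.sh
    export PERL5LIB="$HOME/perl5/lib/perl5${PERL5LIB:+:$PERL5LIB}"
    export PATH="$HOME/perl5/bin:$PATH"
    exec perl "$HOME/rlinks/bot.pl"    # placeholder for the actual bot script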
[20:08:58] So this is even less important than I thought [20:09:17] ah [20:09:18] right [20:09:21] yeah, that's a known issue [20:09:33] known to me anyway :) [20:09:40] I don't think it's in bugzilla [20:10:01] making this a javscript interface would be nice [20:10:22] then fallback to a simpler html interface [20:10:26] this is another good use for chosen [20:10:57] put all of the classes into a single multi-select, with optgroups [20:11:22] have text input fields for all of the variables [20:11:35] but only have one variable by default [20:11:44] with "add another variable" link underneath [20:11:48] that adds a variable [20:12:09] have a select box for the variable name and a text input for the variable [20:12:26] in my head it looks simpler, but who knows :) [20:14:10] we could even just drop the groups [20:14:16] it's a confusing concept for users [20:14:47] another option would be to have proper documentation for roles [20:15:10] where we say "for this role, the following variables can be used" [20:15:28] then parse it, and add a role at a time [20:15:53] and fallback to html would just be a list of roles and variables (without groups_ [20:15:55] ) [20:16:17] Ryan_Lane: So, do I get to reap labstore[34] today? [20:16:24] no reported issues, right? [20:16:27] * Coren redies his schythe. [20:16:28] I didn't see any [20:16:35] I'd say yes [20:16:37] go for it [20:16:41] Nobody told me of any, and I don't see a related bz [20:16:53] scythe(?) [20:16:58] it's not like bringing them back up will help anything [20:17:07] I already shrank all the volumes. heh [20:18:19] labstore3: destroy? [Y/s/j]: [20:18:27] kill em both [20:18:37] and put a ticket in to get their shelves reconfigured [20:18:44] Did you power them off already? [20:18:47] in the pmtpa queue [20:18:48] I did not [20:18:57] * Coren can't seem to reach 'em from fenari [20:18:59] but all the glusterfs services are killed on it [20:19:00] no? [20:19:28] I can [20:19:32] Ah, nevermind. Some network burp I guess. [20:21:00] andrewbogott: hm. we could add default puppet config into LDAP from a nova notification, couldn't we? [20:21:23] do we have access to the metadata from there? [20:21:37] and can we get notifications on metadata update? [20:22:04] I'd really like to make it possible for users to use the cli [20:22:09] Currently metadata update doesn't send a notification. [20:22:11] ah [20:22:12] ok [20:22:18] It's a pretty simple hack to make... [20:22:21] * Ryan_Lane nods [20:22:36] once we switch to moniker, puppet is the only thing left [20:24:16] we could also write an ENC to pull the information from the metadata on the puppet server [20:24:26] then we wouldn't need a notification [20:25:20] we have a lot of script that depend on nodes being in ldap. though [20:25:32] I guess this isn't a small change. heh [21:23:09] !log integration -jobbuilder : updated local puppet and running puppetd -tv [21:23:11] Logged the message, Master [21:47:38] Coren: +x on PNGs is pointless. [21:47:40] * fwilson fixes [21:51:49] broke it again. [21:51:51] hmm [21:52:09] oh [21:53:49] Ryan_Lane, i'm only poking because I'm excited and want to start using this. Also its on my analytics scrum todo list this week, so you know, others want me to poke you too: [21:53:50] https://gerrit.wikimedia.org/r/#/c/58540/ [21:53:52] :) [22:00:45] ottomata1: I'm testing it now. 
[22:07:55] ottomata1: I'd prefer the classes be parameterized [22:08:09] ottomata1: and the role use the global variables [22:08:33] and pass them in as parameters [22:08:40] I hate global variables [22:10:33] isn't that how it works? [22:10:44] class puppet::self::master($server) { [22:10:51] default => $::ipaddress, [22:11:05] ah. that's the hosts's ip [22:11:06] hmmmmm, i see [22:11:08] from factor [22:11:09] yeah that's a facter var [22:11:22] the only global variable i had set is $::puppetmaster in the role class [22:11:23] which is optional [22:11:30] is it bad to use the facter var there? [22:11:35] nah [22:11:38] that's fine [22:12:14] I hate that factor variables are just global, but that's puppet [22:12:17] and it's stupid [22:12:24] I wish they were namespaced [22:39:00] !log tools rebooted tools-puppet-test (no end-user impact): hung filesystem prevents login [22:39:03] Logged the message, Master [22:41:36] Coren: so. one thought on mysql access for the replicated dbs [22:41:50] Yes? [22:41:54] the replicated dbs could mount the NFS share [22:42:00] for homedirs [22:42:08] run a script that sets passwords for all users [22:42:21] then writes the credentials into their .my.cnf file [22:43:07] sounds like scale fail [22:43:26] scale fail in which way? [22:43:33] it only needs to mount a single volume [22:43:47] only making replicated dbs avaible to one project? [22:43:51] and it only needs to update based on project/user when that changes [22:44:00] that can be per-project [22:44:18] every project is a mount essentially [22:44:25] no it isn't [22:44:26] Ryan_Lane: That may or may not work well; in practice local-* users will need credentials too. I'd tend to do it the other way 'round instead; have the project create the user over the network and stowe credentials on its home. [22:44:27] not with NFS [22:44:47] Coren: what would to it over the network from where? [22:44:50] *do [22:45:06] we can't assume every project has an instance to do this [22:45:13] Hm. Good point. [22:45:26] and then we have to worry about how to give the project credentials to do so :) [22:45:28] Then we need magic handling for service users. [22:45:44] Which is okay -- the info is in OpenStack [22:45:49] I don't see how this changes my solution [22:45:52] it's also in LDAP [22:46:16] it's actually not in openstack for the service users [22:46:24] Hm. I see; you'd watch ldap for missing users and create the mysql user then stow in ~user/my.cnf [22:46:30] yep [22:46:51] Ryan_Lane: The $HOME pattern is there, but we can actually use the getent LDAP home. [22:47:46] But that means we need to create projects x users credentials. [22:48:25] If we do it preemptively. [22:50:17] we need to do what? [22:50:30] oh, the homedirs? [22:50:34] Right. [22:50:40] one thing we could do is base this off homedirs [22:50:43] and not ldap [22:50:53] or both [22:50:59] inotify on directory creation [22:51:02] check ldap [22:51:19] if the user exists, make the credentials, add it to the file in the directory [22:51:56] when the daemon starts it can pull an entire ldap tree and check all the directories [22:52:02] That then has to live on the NFS server proper. NFS doesn't have inotify support. [22:52:03] to ensure it didn't miss anything while down [22:52:08] ah [22:52:09] crap [22:52:12] that's doable [22:52:25] really where else would we put it? [22:53:17] Well, I wouldn't want the NFS server to play with the DB, as a rule, but it's a fair place to run it. 
It's secure, so mysql credentials to create accounts aren't at risk there. [22:53:24] ah. crap [22:53:25] right [22:53:45] actually, I don't see a problem with it messing with the db [22:54:09] * Ryan_Lane nods [22:54:13] It's... distasteful to my partitioning reflexes. :-) But yeah, it'd work. [22:54:16] heh [22:54:23] well, you need to partition it somewhere, right? :) [22:54:33] do we want the db doing file management :) [22:54:44] Probably not either, for that matter. [22:54:45] s/$/?/ [22:54:55] So yeah. [22:55:15] we have to have some way of getting the credentials to the users [22:55:19] this seems like the easiest way [22:55:36] It's actually fairly easy to implement on the file server because it sees the whole cross-project filesystems. [22:55:41] yep [22:56:06] And this would only create credentials when homes are actually created, which sounds like the right moment to do it. [22:56:28] yeah. no point in having credentials for users who haven't logged into a project [22:56:36] Do we give per-project or global credentials to global users? [22:56:40] now the question is… how do we handle per-project? [22:56:40] hah [22:56:44] same thought pattern [22:56:52] per-project, likely [22:57:04] I'm thinking: global users get global creds; local users get local creds. [22:57:21] well…. the only reason for having creds is to prevent abuse [22:57:32] if we have global creds a root on a random project can steal a user's password [22:57:44] Ah. Good point. [22:57:49] and we'd have to shut that user down on every project [22:57:56] if we do it per-project, we can limit abuse [22:58:03] So we create users patterned project-user%* [22:58:21] the only downside is databases that users create that they want to use between projects [22:58:36] Ryan_Lane: Not really, they can still grant their "other" user access. [22:58:41] yeah. that seems like the best approach (project-) [22:58:46] ah [22:58:46] true [22:58:48] good point [22:59:05] this seems like a reasonable approach [22:59:26] It's trickier from a user-friendly POV, but if you start thinking cross-project databases you're officially not a newbie anymore. :-) [22:59:32] indeed [22:59:39] most people will use databases within a single project [22:59:56] If you do anything cross project it raises security issues [23:00:02] yep [23:00:06] it definitely does [23:00:36] We still need support for database creation; https://bugzilla.wikimedia.org/show_bug.cgi?id=46460 sounds like a sane approach. [23:01:38] seems sane-ish [23:01:43] talk with binasher about this [23:01:53] Yeah, that's the intent. [23:02:24] He's... otherwise occupied atm though and I don't want to distract him away from the golden ring. :-) [23:02:34] * Ryan_Lane nods [23:02:48] I think we can bang out a daemon for rights management pretty quickly [23:02:57] Oh, which raises the extra question: replication is n shards; we have to do this on each. [23:03:10] I have inotify code in ircecho and manage-volumes does similar things to what we want [23:03:56] yeah. we just configure the daemon with all of the servers [23:04:05] * Coren nods. [23:04:07] no biggie [23:04:38] We'll still have to find some way to provide federated databases or some equivalent though. There's an important use case we never got around to consider: [23:04:49] well, that's a binasher question :) [23:04:53] he's designing this [23:05:02] lots of joins between $randomproject and commons, and now $randomproject and wikidata. 
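A minimal sketch of the credential-provisioning idea discussed above, runnable only where the home directories are a local filesystem (as noted, inotify does not work over NFS). Every name here — the watched path, the admin endpoint, the project variable, the user/database naming pattern — is an assumption, and there is no error handling:

    #!/bin/bash
    # Watch for new home directories on the file server and drop per-project
    # MySQL credentials into each new ~/.my.cnf.
    PROJECT=tools
    HOMES=/srv/project/$PROJECT/home        # hypothetical local path
    DBHOST=replica-admin-host               # hypothetical admin endpoint

    inotifywait -m -e create,moved_to --format '%f' "$HOMES" | while read -r user; do
        pass=$(openssl rand -hex 16)
        mysql -h "$DBHOST" -e "CREATE USER '${PROJECT}.${user}'@'%' IDENTIFIED BY '${pass}';
          GRANT ALL PRIVILEGES ON \`${PROJECT}\_${user}\_%\`.* TO '${PROJECT}.${user}'@'%';"
        umask 077
        printf '[client]\nuser=%s\npassword=%s\n' "${PROJECT}.${user}" "$pass" \
            > "$HOMES/$user/.my.cnf"
        chown "$user" "$HOMES/$user/.my.cnf"   # assumes the dir name is a resolvable account
    done

A real version would also reconcile existing directories at startup (the "pull an entire ldap tree" step above) and repeat the grants on every replica shard.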
[23:05:13] yeah, this has been known for a while [23:05:41] TS hacked around it by replicating commons to each shard. This is teh suk. [23:05:49] fuck trainwreck [23:05:59] it can die a thousand deaths in a firey hellhole [23:06:24] a custom daemon to do multiplexing? no thanks [23:06:31] I swear that logic should be in the application and you should die if you hack it into the db [23:07:29] Damianz: Honestly, for /most/ use cases, I'm pretty sure federating commons and wikidata to the other shards would "just work" [23:08:01] But yeah, that's a political problem not a technological one. [23:08:37] If we say "The right thing is to put that logic in your tool", then we'll get a severe treatment of pitchforks and torches no matter how correct we are. [23:09:39] is sudo disabled in the bots project? [23:10:02] no [23:10:50] i can't do it [23:13:28] giftpflanze: What's your wikitech username? [23:13:38] gifti [23:13:56] giftpflanze: The reason why you can't sudo is because you're not listed as a project admin. Did you expect to be? [23:14:23] there was a time where everyone could sudo [23:14:38] and i was never projectadmin [23:15:07] giftpflanze: That was before I got here, so I can't tell you why or whether that was changed. Sorry. :-( You might want to ask Petr, though, since he'd certainly know. [23:15:22] (aka petan on IRC) [23:15:38] mh, ok [23:15:43] Also, Coren, I got it working :) [23:16:00] Coren: it's not based on projectadmin [23:16:05] sudo is separate from that [23:16:06] fwilson: Appropriate exclamation of happiness. [23:16:16] and it has a nice interface too, http://tools.wmflabs.org/voxelbot/ [23:16:20] Ryan_Lane: Oh, right. I keep forgetting. [23:16:47] I forget that the default sudo policy doesn't need to be the /only/ sudo policy. :-) [23:16:55] Coren: maybe you have an idea why neither cron nor fcron is working for me on bots-4? or Ryan_Lane? [23:17:11] maybe petan disabled it? [23:17:18] he wanted people to use grid engine [23:17:48] and the grid engine is usable without cron? [23:17:57] * Ryan_Lane has no clue :) [23:18:44] giftpflanze: Depends on what you are trying to do with it, but yes. You might want to look at this to have an idea how I do it on tools; my understanding is that petan wants to make it pretty much the same on bots: [23:18:48] !toolsdoc [23:18:55] !toolsdocs [23:18:55] http://www.mediawiki.org/wiki/Wikimedia_Labs/Tool_Labs/Help [23:19:48] if i have a job that has to be started once a day cron would still be needed, wouldn't it? [23:20:03] fwilson: Your interface is nice and functionnal. :-) [23:20:27] giftpflanze: Oh, if it's a job that runs at interval, yes. But there's generally a "right" host to submit jobs from. [23:21:01] I think the bots-bnr* are the designated submit hosts for bots. [23:21:44] what a pain … [23:22:55] it should be easier and more reliable, after everything uses it [23:24:30] :) [23:24:39] i plan on bootstrap'ing it or something [23:24:51] after i have more tools [23:26:33] fwilson: Did I tell you about the .description easter egg? :-) [23:27:43] nope [23:27:54] this sounds like fun :) [23:28:12] what is it [23:28:33] fwilson: If you put a .description file in your tool's home, it'll show up at https://tools.wmflabs.org/home.php [23:28:44] ooo cool [23:28:51] * fwilson does so [23:30:47] Ryan_Lane: Oh, I meant to ask you; do we have CPUEntitlement? [23:31:01] what do you mean? [23:31:18] heh [23:31:22] you strip script tags? [23:31:30] Ryan_Lane: Yep. :-) [23:31:43] Ryan_Lane: More precisely, I only let a very few tags through. 
[23:32:29] IIRC,
[23:32:35] that's what i was about to ask. [23:33:14] what's with all the nick changes [23:34:40] Coren: use chrome [23:34:43] and mouseover xss [23:34:46] err [23:34:48] xss link [23:34:50] :) [23:35:14] ugh [23:35:22] Ryan_Lane: Can't solve a human problem with technology. Someone who does something like this on tools will have their ass handed to them. :-) [23:35:34] Can I test your anti-script-tag mechanism? [23:35:42] fwilson: Please do. [23:36:05] Okay, certainly [23:36:16] it works so fra [23:36:17] *far [23:37:00] hmmm [23:37:24] https://www.owasp.org/index.php/XSS_Filter_Evasion_Cheat_Sheet seems useful [23:37:29] let's see what I can do there... [23:38:19] no img script injection for me then [23:40:28] ooo this massive list of events looks like fun [23:42:41] Coren: are you using internet exploder? [23:42:46] because I might put you into a loop [23:44:16] fwilson: Bite your tongue. Of course I don't. :-) [23:44:20] :) good [23:44:32] wait a second [23:44:35] you're a sysadmin [23:44:39] sysadmins don't use Internet Exploder [23:45:04] Nope. It causes the uglies, but no running of scripts. :-) [23:45:34] hmmm [23:45:36] let's see... [23:45:51] nope, [23:45:55] no executing arbitrary commands [23:46:44] what else can I do... [23:48:13] nope. [23:52:58] Coren, I found absolutely nothing. [23:53:16] fwilson: I am, I am sure, devastated. :-) [23:53:23] You need to make your code less secure next time [23:53:30] :)
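A quick way to exercise the .description whitelist discussed above from the command line; the tool name is a placeholder and the payload is only an example:

    # run as the tool account: drop some markup into .description and see what survives
    printf '%s\n' 'demo <b>bold</b> <script>alert(1)</script>' > /data/project/<tool>/.description
    curl -s https://tools.wmflabs.org/home.php | grep -i -A2 '<tool>'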