[00:00:41] I can't find it with other queries too [00:01:01] look at andrew/hosts.txt [00:01:18] Doesn't it contain that entry? [00:02:01] uhm [00:02:17] yes it does... [00:02:34] something's wrong with either indexes or ACLs [00:02:37] this is very wrong [00:02:42] if you search it by key, it fails [00:03:10] well indexes is exactly what Ryan_Lane was working on last night [00:03:14] So… I guess we need to know more about 'monkeying with ldap' [00:03:25] somewhere around here my opendj skills are coming to their max [00:03:29] let's wait for ryan [00:04:02] so aRecord=10.4.0.58 returns [00:04:13] while associatedDomain=deployment-bastion.pmtpa.wmflabs does not [00:04:49] what about associatedDomain=i-00000390.pmtpa.wmflabs [00:04:58] 'cause DNS works for that one -- same instance. [00:10:10] thanks again, paravoid. 'night. [00:10:29] sorry, I don't think I can help much more than that [00:10:34] without digging deep into opendj [00:10:45] and I really don't feel like that right now :) [00:11:00] ryan is getting back soon I think, right? [00:11:09] You confirming what I see is, itself, useful. [00:11:16] And, yeah, I think he'll be back. [00:11:31] right, I hope backlog is going to help him debug this quicker [00:11:34] bye! [00:12:08] (PS1) MarkTraceur: Transfer inline count to a separate field [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/95572 [00:12:20] * marktraceur tired of this (2 comments) bullshit [00:12:50] hah [00:14:07] YuviPanda_zzz: Wake up, I need you to do my bidding [00:15:05] Also, wth is with the code conventions in that repo [00:15:08] It's like [00:15:11] Anarchy [00:16:22] is https://bugzilla.wikimedia.org/52630 puppetized? [00:17:53] huh, Yubipanda [00:18:22] https://bugzilla.wikimedia.org/buglist.cgi?quicksearch=Yubipanda [00:24:08] Coren: how is one supposed to edit https://tools.wmflabs.org/ ? [00:24:17] i.e.
the html [00:24:28] not an individual tool row [00:25:10] Its puppetized in toolabs/www [00:25:52] More precisely in labs/toollabs, the www directory [00:26:06] Why? [00:26:42] i wonder what all this portgrabber/portgranter stuff is [00:27:57] i'm not finding what you pointed too [00:28:00] to* [00:28:36] * jeremyb waits for git grep [00:28:53] yeah, only 1 hit [00:28:54] wmf-puppet$ find -name www | cat -n 1 ./files/powerdns/recursorstats/www [00:29:02] Coren [00:29:22] unless i'm just too out of date [00:29:53] Ah, not the puppet repo, sorry. labs/toollabs. [00:30:02] It's pulled /by/ puppet. :-) [00:30:10] But from a distinct repo. [00:30:20] let's try this again... [00:31:29] jeremyb: https://gerrit.wikimedia.org/r/#/admin/projects/labs/toollabs [00:31:38] right [00:31:50] i went through gitblit instead of gerrit though :) [00:31:57] Same difference. :-) [00:32:07] very fast clone [00:32:34] Very small repo. :-) [00:33:19] portgrabber / portgranter is the magic behind the new one-webserver-per-tool-on-the-grid scheme. [00:34:39] do you have some criteria for what constitutes magic? [00:36:12] hah, x.php!! [00:37:32] jeremyb: Magic is stuff that does something useful but is dreadfully underdocumented. :-) [00:37:44] jeremyb: Alternately, "sufficiently advanced technology" [00:37:58] huh [00:39:30] Coren: what's the rationale for using htmlpurifier? [00:39:52] andrewbogott: hm. bad ldap eh? [00:40:06] So says paravoid, although I didn't quite follow what was bad. [00:40:15] Or, rather, it was misbehaving, but I don't know if that was evident from the data. [00:40:42] jeremyb: To avoid sneakiness or brokenness in .description. It's user-supplied input, I'm not about to stuff it unsanitized on the front page. :-) [00:41:40] Coren: uhhh, and the base? i see some stuff pointing to https and some to http. and i don't know what it's supposed to be doing but i figured maybe fixing that? [00:42:18] hm. indeed. 
I can't find the record on virt0 [00:42:22] but can on virt1000 [00:42:54] Is that because virt1000 is up to date, or more out-of-date? [00:42:55] I can't by doing an exact search [00:43:00] let me check replication [00:43:03] it's not replication [00:43:06] it's probably the indexes [00:43:47] replication is fine [00:43:50] let me rebuild the index [00:45:13] hm. I created the index with the wrong case [00:45:21] I wonder if that caused the problem [00:45:23] how does one rebuild an index? [00:45:56] first you add the local db inde [00:45:58] index [00:45:59] su - opendj [00:46:01] dsconfig [00:46:08] it'll give you a list of options [00:46:12] 19 is local db index [00:46:26] then you go through the menu to add it (I added equality and substring indexes) [00:46:38] rebuild-index --index associatedDomain --baseDN "dc=wikimedia,dc=org" --start 0 --bindDN "cn=Directory Manager" [00:46:42] that rebuilds the index [00:46:48] restarting opendj will also rebuild it [00:47:06] wasn't opendj restarted while troubleshooting? [00:47:10] twice! [00:47:14] Well, twice by me [00:47:17] right, but I think i broke it [00:47:35] by creating an index "associateddomain", rather than "associatedDomain" [00:47:39] ohhh [00:47:59] it now appears in the search [00:48:08] well, there's a lesson learned [00:48:22] but still why is it working on one and not the other? [00:48:30] did only one get the new indexes? [00:48:39] they are local indexes ;) [00:48:47] I created it correctly on one and incorrectly on the other [00:49:06] ok [00:49:20] so... indexes not in version control [00:49:21] :( [00:49:31] not much in opendj is [00:49:42] it's like a database [00:49:44] ok, so restarting rebuilds the existing indexes but only those indexes that were explicitly passed to rebuild-index earlier? 
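Editor's note on the rebuild-index steps above: a minimal Python sketch, assuming only the command line quoted in the log. It builds the argument list for rebuild-index and flags an index name that differs from a known attribute name only by case (the "associateddomain" vs "associatedDomain" slip). Both helper names and the schema_names list are hypothetical, nothing here talks to OpenDJ, and whether the case mismatch was really the cause is only Ryan's suspicion in the log.

```python
def rebuild_index_argv(index, base_dn="dc=wikimedia,dc=org",
                       bind_dn="cn=Directory Manager"):
    """Build the argv for the rebuild-index call quoted in the log.

    Only the flags that appear in the log are used; the command is
    returned as a list and is not executed here.
    """
    return ["rebuild-index", "--index", index, "--baseDN", base_dn,
            "--start", "0", "--bindDN", bind_dn]


def case_mismatch(index, schema_names):
    """Return the schema spelling if `index` differs from it only by case.

    `schema_names` is a hypothetical, caller-supplied list of attribute
    names; returns None when there is no case-only mismatch.
    """
    for name in schema_names:
        if name != index and name.lower() == index.lower():
            return name
    return None


if __name__ == "__main__":
    known = ["aRecord", "associatedDomain"]
    wanted = "associateddomain"
    fixed = case_mismatch(wanted, known)
    if fixed is not None:
        print("index name %r should probably be %r" % (wanted, fixed))
        wanted = fixed
    print(" ".join(rebuild_index_argv(wanted)))
```

Checking the spelling before adding the index via dsconfig would have caught the difference between the two servers, since the indexes are local to each one.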
[00:50:02] andrewbogott: only indexes that are added to the configuration [00:50:14] which is what I did via dsconfig [00:50:16] andrewbogott: rebuild-index is for if you're not restarting [00:50:20] indeed [00:50:32] ah! OK, so the mistake was in dsconfig, now I follow. [00:50:32] the idea is that you should rarely if ever need to restart [00:50:47] what is this, unix? [00:51:03] it's the same concept with databases ;) [00:51:14] I'll write up an outage report for labs-l [00:51:45] OK. The outage was tiny, only a few instances affected I think. [00:51:55] yeah, still worth documenting it :) [00:52:00] yep! [01:01:08] errrrr, what happened to Ryan_Lane. he's writing in euro dates? [01:01:11] and not iso8601? [01:01:23] s/\./?/ [01:02:00] :D [01:03:09] $ echo 100*14/11 | bc [01:03:09] 127 [01:03:18] so, it was an outage of 127% of labs [06:27:25] hrmmmmmmm, i wonder how alioth compares to toolserver wrt use cases/hardware/admins [06:28:09] (alioth is debian's self hosted sourceforge clone using a package called fusion forge) [06:28:35] alioth obviously is all debian whereas toolserver is partly solaris [06:28:57] alioth seems from what i can tell to have a bit less bus factor. but still some [06:29:28] anyway, will think about it overnight :) [12:29:09] Change on mediawiki a page Wikimedia Labs/Tool Labs/List of Toolserver Tools was modified, changed by Coet link https://www.mediawiki.org/w/index.php?diff=820475 edit summary: /* Active Tools on the Toolserver */ [14:05:40] !log deployment-prep upgrading mysql on -sql [14:05:46] Logged the message, Master [14:08:10] !log deployment-prep rebooting sql and sql02 [14:08:15] Logged the message, Master [14:09:41] !log deployment-prep rebooting both apaches [14:09:46] Logged the message, Master [14:51:41] !log deployment-prep rebuilding search indexes using jobs for testing [14:51:46] Logged the message, Master [16:09:20] i need help [16:09:20] Hi Ahmad_Sammour, just ask!
There is no need to ask if you can ask [16:10:01] when i run a (SQL) code in toollabs there is an error. [16:10:49] Traceback (most recent call last): [16:10:51] File "uncatedpages.py", line 35, in <module> [16:10:53] passwd = config.db_password) [16:10:54] File "/usr/lib/python2.7/dist-packages/MySQLdb/__init__.py", line 81, in Connect [16:10:56] return Connection(*args, **kwargs) [16:10:57] File "/usr/lib/python2.7/dist-packages/MySQLdb/connections.py", line 187, in __init__ [16:10:59] super(Connection, self).__init__(*args, **kwargs2) [16:11:00] _mysql_exceptions.OperationalError: (1045, "Access denied for user 'u3871'@'10.4.0.220' (using password: NO)") [16:11:34] what is the reason for this? [16:13:11] Any help? [16:15:08] ? [17:11:50] [16:10:57] _mysql_exceptions.OperationalError: (1045, "Access denied for user 'u3871'@'10.4.0.220' (using password: NO)") [17:11:53] Try sending a password.. [17:20:01] Yeah, "using password: NO" is a dead giveaway. [17:40:09] Hello. I have a performance issue with SQL (server: tools-db). I'm doing a lot of INSERTs and the speed of them is around 100 times slower than the same kind of queries on the same table at the old toolserver (even though the table at the toolserver is double the size). You'll find some logs at http://toolserver.org/~apper/debug_timing_tool_labs.txt (for tools-db at tool labs) and at [17:40:09] http://toolserver.org/~apper/debug_timing_toolserver.txt (for toolserver). [17:41:01] I was told to ask coren about this ;) [17:41:20] I know all, see all, and sometimes even remember some of it! [17:41:28] yes, Coren, i told him to ask you whether you think this is normal [17:41:59] because you are an almighty admin and not just some lowly software developer like me so you might know something about it [17:42:12] ;) [17:42:30] It probably is, if you're doing the inserts one at a time; the DB is currently in a different DC than labs is, so there is an extra 26ms roundtrip for every command.
I know some tool writer have had great success with /batching/ queries though. [17:42:33] most of the inserts are fast on labs, but some take 5 seconds or more [17:42:45] The only difference I just found out is, that the varchar entries at tool labs are not "character set utf8"... but I can't really believe that's the difference [17:43:05] Wait, you mean that you get high /variability/? That's unusual. [17:43:35] Coren: yes, that's what i thought [17:44:10] but the 26 ms would explain, why an average of 1ms (toolserver) is not possible ;) [17:44:47] batching the queries is a good idea [17:44:59] will the DB move to the same datacenter? [17:45:35] apper: No, we're moving labs instead. :-) It should happen within 2 months or so -- but batching queries is good for efficiency even when you don't have the roundtrip time. [17:45:52] Coren: yes, I will do this [17:46:15] apper: I'm still surprised at the queries being very much longer than the others, though. [17:47:09] yes - these ones are the biggest problem, I think... taking 25ms instead of 1ms is not good, but okay - but taking >1 second is way to slow [17:47:25] I'd like to see your numbers once you start batching -- it may just be load related and will drown in the average. [17:48:09] Otherwise, I'll consult with our DBA see if he sees something odd. [17:48:51] okay, I will try this [18:28:52] An error appear when run (SQL) code in toollabs. [18:40:13] Ahmad_Sammour: I think that Coren and Reedy both responded to your earlier question… did you lose the scrollback? [18:40:24] I think their answer was 'use a password'. [18:41:06] Sorry, seem there was a problem in internet. [18:41:07] Ahmad_Sammour: It was. Look at the error message, where it says 'using password: NO' which means that your client is trying to connect without one. 
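Editor's note on the batching advice given to apper above: a minimal Python sketch of sending many rows per INSERT statement instead of one row per statement. It uses the standard-library sqlite3 module purely as a stand-in so the example runs anywhere; the table name `items`, its columns, and the batch size are hypothetical. With MySQLdb the placeholder would be %s rather than ?, and the latency figure is just the 26 ms per command quoted in the log.

```python
import sqlite3

ROUND_TRIP_S = 0.026  # per-command latency quoted in the log (26 ms)


def chunks(rows, size):
    """Yield consecutive slices of `rows` with at most `size` items each."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def insert_batched(conn, rows, batch_size=100):
    """Insert (id, title) rows using one multi-row INSERT per batch.

    Returns the number of INSERT statements sent, which is what the
    per-command round trip multiplies.
    """
    statements = 0
    for batch in chunks(rows, batch_size):
        placeholders = ", ".join(["(?, ?)"] * len(batch))
        flat = [value for row in batch for value in row]
        conn.execute(
            "INSERT INTO items (id, title) VALUES " + placeholders, flat)
        statements += 1
    conn.commit()
    return statements


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, title TEXT)")
    rows = [(i, "page %d" % i) for i in range(1000)]
    sent = insert_batched(conn, rows)
    print("statements sent:", sent)
    print("latency one-by-one: %.2f s" % (len(rows) * ROUND_TRIP_S))
    print("latency batched:    %.2f s" % (sent * ROUND_TRIP_S))
```

This only accounts for the fixed round trip; it would not explain the occasional multi-second inserts apper reports.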
[18:41:18] !log deployment-prep rebuilding Cirrus search indexes to have the 2 replicas like production [18:41:24] Logged the message, Master [18:42:18] Ahmad_Sammour: The password you can use to connect to the databases should be in your tool's home in a file named replica.my.cnf [18:44:40] the problem appear just today. [18:44:51] yesterday it was work. [18:46:22] Ahmad_Sammour: Ah, you were probably bitten by https://bugzilla.wikimedia.org/show_bug.cgi?id=57067 where some recent accounts accidentally had an empty password. It should have never worked before. :-) [18:46:47] Ahmad_Sammour: Look in your replica.my.cnf, the line with password= contains the password you should be using. [18:48:43] excuse me i havn't much experience in toollabs :) [18:48:49] Yes, my accounts had no password until today, too. But when connecting like described, everything works wheater there is a password or not ;) [18:49:26] apper: Yes, mysql doesn't actually make a difference between 'no password' and 'password is empty' [18:59:47] Coren: you should also tell apper that putting data in the replicated dbs means they can go away at any time [18:59:57] and for data that exists permanently it should be on tools-db [19:00:07] oh fuck [19:00:29] that means to copy when joining with replicas [19:00:31] YuviPanda: Well, we do say that people should be backing up ALL THEIR STUFF! :-) [19:00:42] giftpflanze: A regular dump works also. [19:00:44] Coren: yeah, but people generally don't expect DBs to just go away [19:00:51] we really really should mention that prominently in the docs [19:00:53] * YuviPanda goes to check [19:01:02] Coren: how does that work? [19:01:52] giftpflanze: cronjob with mysqldump? [19:02:00] giftpflanze: tl;de: "mysqldump database" at regular interval. [19:02:09] de ? [19:02:20] ah, now i get it [19:02:21] giftpflanze: You may want to read the mysqldump manpage for options that'd fit your requirements best. 
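Editor's note on the replica.my.cnf advice above: a minimal Python sketch that reads the user and password from the text of such a file and refuses to continue when the password is empty, which is the situation behind "using password: NO". The [client] section name and the quoting of values are assumptions about the file layout, and the function name is hypothetical; the log only says the file has a line with password=.

```python
import configparser


def read_mysql_credentials(cnf_text, section="client"):
    """Return (user, password) from the text of a my.cnf-style file.

    Assumes the credentials live in a [client] section (not confirmed by
    the log). Surrounding quotes are stripped, and an empty password is
    treated as an error instead of being passed on silently.
    """
    parser = configparser.ConfigParser(interpolation=None,
                                       allow_no_value=True)
    parser.read_string(cnf_text)
    user = (parser.get(section, "user", fallback="") or "").strip("'\"")
    password = (parser.get(section, "password", fallback="") or "").strip("'\"")
    if not password:
        raise ValueError("empty password in section [%s]" % section)
    return user, password


if __name__ == "__main__":
    # Placeholder contents; a real tool would read the file from its home.
    sample = "[client]\nuser = 'u3871'\npassword = 'not-a-real-password'\n"
    user, password = read_mysql_credentials(sample)
    print("connecting as", user)
```

The returned pair could then be handed to the connect call in place of config.db_password from the traceback earlier in the log.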
[19:02:50] Alternately, for most tools, simply regenerating the data is actually less trouble. [19:02:53] well … i have file data to reconstruct from anyway [19:03:32] because a 1 week db connection always seems to fail [19:07:16] Coren: https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Help#Connecting_to_the_database_replicas doesn't say anything about 'data can go away any time!' [19:07:17] so [19:07:29] should add! [19:07:39] Oh hey, look! It's a wiki! [19:07:41] :-P [19:07:44] ^^ [19:08:16] * Coren chuckles. [19:08:27] I can put it on my TODO list and get around to it eventually. [19:08:44] YuviPanda: I'm only using tools-db [19:08:54] apper: then there's no 26ms delay [19:09:00] (IIRC?) [19:09:02] coren: is sharing a db a thing? [19:09:30] Coren: Is tools-db also in another DC? [19:09:56] apper: No, that one is in the labs proper. [19:10:45] giftpflanze: Yes. You have all rights to your databases so you can give out grants. Also, we hold the toolserver tradition that any database name ending in '_p' is, by default, readable by all users. [19:10:58] mh [19:11:29] do we mention _p in the docs? [19:12:30] Coren: than this is not the explanation - my performance problems are on tools-db. At the moment I'm altering the table to use utf8 as charset, then I will do tests with batching queries. [19:13:45] (after drinking some friday beer with friends, of course ;)) [19:18:42] apper: enjoy your friday beer :) [19:19:08] apper: Beer has priority. [19:19:23] giftpflanze: Hm. Actually, you're right -- I don't think we do. [19:19:45] giftpflanze: This was just an implicit bit of knowledge from the toolserver. [19:29:06] ok, i will tell the people that added me to their tool to share their database with me [19:33:07] erm? [19:33:22] giftpflanze: you can't just become tool and use the DB? [19:33:28] or add grants for yourself? 
[19:33:44] if you're a tool member then you're a tool member [19:34:04] yes, but everyone should be able to access that data [19:34:33] then i don't understand "with me" [19:35:22] it's their database, they didn't know how to share it with me but to add me to the tool. when they grant access to everyone noone has to ask [19:36:16] giftpflanze: they should come to this channel and you should teach them :) [19:36:25] easiest way is to name with a _p suffix [19:36:38] i guess so [19:39:06] names like enwiki_p are also a reminiscent of toolserver, Coren? [19:39:39] giftpflanze: enwiki exists as well [19:39:51] giftpflanze: Yes, for the same reason. [19:39:56] enwiki_p is the version you can read [19:40:04] ok [19:40:09] Coren: will we ever get EXPLAIN support? [19:40:29] nice that we can now show create table\G [19:40:42] not terribly new but i think hasn't been there forever either [19:43:08] does a database have to be in the form user__dbname_p, or is dbname_p sufficient? [19:44:47] andrewbogott: I have a project you can nuke [19:44:49] * YuviPanda goes to check name [19:47:07] giftpflanze: dbname_p is sufficient to grant access, but users can only /create/ user__dbname*. [19:47:29] well, they managed to do that [19:47:45] on tools-db [19:48:10] jeremyb: lack of explain support on views over tables you do not have select access to is an upstream design decision. [19:48:35] * YuviPanda does explains on local db, or on the production slaves [19:48:45] not everyone can do the latter, and then there are differences anyway [19:51:01] <^d> andrewbogott: You said mention any projects that can be nuked from high orbit? The 'gitorious' project & any instances it may still have has no use. [19:51:06] (CR) Yuvipanda: [C: 2 V: 2] Modify access rules [labs/tools/grrrit] (refs/meta/config) - https://gerrit.wikimedia.org/r/95656 (owner: Yuvipanda) [19:51:18] ^d: great!
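Editor's note on the database naming rules Coren describes above: a minimal Python sketch of the two checks, assuming nothing beyond what the log states (names ending in _p are readable by all users by default, and users can only create databases named user__dbname*). The function names and the example database names are hypothetical.

```python
def is_world_readable(db_name):
    """True if the name carries the _p suffix that grants read access to all."""
    return db_name.endswith("_p")


def can_create(user, db_name):
    """True if `user` may create `db_name` under the user__ prefix rule."""
    prefix = user + "__"
    return db_name.startswith(prefix) and len(db_name) > len(prefix)


if __name__ == "__main__":
    user = "u3871"
    for name in ("u3871__stats_p", "u3871__scratch", "stats_p"):
        print(name,
              "creatable" if can_create(user, name) else "not creatable",
              "public" if is_world_readable(name) else "private")
```

A database meant to be shared therefore needs both parts: the user__ prefix so it can be created, and the _p suffix so others can read it without extra grants.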
[19:51:20] * andrewbogott makes a list [19:51:39] andrewbogott: 'whatcanido' can also be nuked [19:51:41] has no instances anyway [19:51:48] YuviPanda: thanks [19:52:49] andrewbogott: any idea how I can give other people +2 on the labs/tools/grrit repo? [19:52:58] andrewbogott: i want anyone who has +2 in core to be able to +2 things there [19:53:08] specifically to help marktraceur, but generally too [19:53:46] YuviPanda, can you view this page? https://gerrit.wikimedia.org/r/#/admin/projects/labs/tools/grrrit,access [19:53:56] yeah I'm there [19:54:24] andrewbogott: I added 'mediawiki' group for 'submit' but apparently that doesn't work [19:54:33] it… should? [19:54:34] I think? [19:54:37] ^d ? [19:54:50] andrewbogott: marktraceur tried to submit, and he couldn't [19:54:59] he should be on group mediawiki, I think [19:55:02] considering he's staff [19:55:16] <^d> mediawiki needs owner + submit. [19:55:22] <^d> owner grants the +2's and so forth. [19:55:25] <^d> submit grants....submit [19:55:29] ah! [19:56:19] (CR) Yuvipanda: [C: 2 V: 2] Modify access rules [labs/tools/grrrit] (refs/meta/config) - https://gerrit.wikimedia.org/r/95658 (owner: Yuvipanda) [19:56:21] marktraceur: try now? [19:56:32] (CR) MarkTraceur: [C: 2] Transfer inline count to a separate field [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/95572 (owner: MarkTraceur) [19:56:33] uzzah! [19:56:49] Coren: _p is only for read access, i guess? [19:57:06] giftpflanze: It is. Anything more needs to have grants added. [19:57:17] thx for clarifying [19:58:14] oh, you actually said that before, i just forgot [19:58:15] ty ^d, andrewbogott [19:58:26] marktraceur: no jenkins, so you need to V+2 and hit submit [19:58:47] Coren: i also can't explain on the real table even though i can show create. i think it's probably worried that i could get some secret data with explain's row count guesstimates [19:58:51] Aw.
[19:58:56] is 'Databases ending in _p are granted read access for everyone.' an correct english sentence for docs? if not, how to improve? [19:59:29] Wait, YuviPanda, it +2d, it just didn't merge [19:59:55] Also you should figure out a way to do configuration without checking it into the repo [20:00:35] giftpflanze: Maybe 'Database whose name end with _p'? [20:01:06] database names ending with _p? [20:01:34] marktraceur: yeah, you need to hit V+2 and hit 'submit' [20:01:49] marktraceur: configs in repo are cool and nice. [20:01:55] and non-headachey [20:02:45] YuviPanda: Only if it doesn't have private stuff [20:03:18] marktraceur: there's no private stuff here [20:03:20] except the password [20:03:26] that definitely needs to live elsewhere [20:03:30] but for now, git stash! [20:03:30] That's what I'm talking about [20:03:40] I committed it so we can rebase [20:03:52] you commited the password?! [20:03:52] I also added a gerrit remote [20:04:02] Yeah. Don't push from the labs instance. :P [20:04:14] brrr marktraceur [20:04:15] bad marktraceur! [20:04:16] :P [20:04:26] marktraceur: if so you need to also mention that on the wikitech thread [20:04:27] err [20:04:29] wikitech page [20:04:30] pull -r [20:04:31] Will do [20:04:33] than just pull [20:04:34] ty [20:04:52] marktraceur: okay, I'll be off now. [20:04:54] Ohh I see [20:04:57] 'kay, cheers [20:05:02] marktraceur: oh, what do you see? [20:05:09] DON'T LEAVE ME HANGING! [20:05:31] YuviPanda: I see where the note about git pull is [20:05:38] right [20:05:42] READ SIR! [20:06:30] Damn it [20:06:48] I was testing [20:07:08] andrewbogott: I don't have any red errors, but I have a bold purple error running sudo puppet -tv on instance oauth-apache01. [20:07:31] purple! What does it say? [20:07:48] Hm, the comment count didn't work. Will refine. 
[20:07:58] andrewbogott: err: Could not retrieve catalog from remote server: Error 400 on SERVER: Duplicate definition: Class[Misc::Deployment::Vars] is already defined; cannot redefine at /etc/puppet/manifests/misc/deployment.pp:164 on node i-00000659.pmtpa.wmflabs [20:08:14] hm, yep, that's broken! [20:08:14] And also err: Could not retrieve catalog; skipping run [20:08:15] marktraceur: testing in production? :P [20:08:27] anomie: I can have a look if you're not used to debugging puppet errors. [20:08:31] YuviPanda: Basically [20:08:37] What else am I supposed to do? [20:08:39] andrewbogott: Please do, thanks [20:08:44] marktraceur: i do that too, but not by +2ing them [20:08:47] Heh. [20:09:05] marktraceur: for each gerrit patch it has a commandline you can copy paste to fetch it and check it out [20:09:06] into a repo [20:09:09] marktraceur: so I use that [20:09:20] code locally, push to gerrit, pull into instance, test, repeat [20:09:34] Yup [20:10:45] Oh, lol, I didn't commit the template change. [20:11:33] anomie: what project is this? [20:11:40] andrewbogott: oauth [20:12:35] (PS1) MarkTraceur: Commit template change for inline comments [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/95661 [20:14:29] (CR) MarkTraceur: "Just a test" (3 comments) [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/95661 (owner: MarkTraceur) [20:14:55] Who's resetting the bot? [20:15:23] marktraceur: that looks like it is crashing [20:15:27] when encountering a message [20:15:27] Wuh oh. [20:15:28] or something [20:15:31] look at the error logs? [20:15:40] Reverting, sec [20:15:44] kk [20:16:10] marktraceur: i had a local setup running, should put it in vagrant puppet at some time [20:16:12] not too hard... [20:16:19] but not documented anywhere, sadl [20:16:20] y [20:16:27] Hrm, no errors [20:16:52] Oh, there they are.
[20:17:09] Not that commit's fault, though [20:17:20] heh [20:17:59] (PS2) MarkTraceur: Commit template change for inline comments [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/95661 [20:18:43] (CR) MarkTraceur: Commit template change for inline comments [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/95661 (owner: MarkTraceur) [20:19:44] (CR) MarkTraceur: "Testing comments again" (3 comments) [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/95661 (owner: MarkTraceur) [20:19:47] Sweet. [20:19:56] (CR) MarkTraceur: "Testing with no comments" [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/95661 (owner: MarkTraceur) [20:20:04] (CR) MarkTraceur: [C: 1] Commit template change for inline comments [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/95661 (owner: MarkTraceur) [20:20:11] (CR) MarkTraceur: [C: 2] Commit template change for inline comments [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/95661 (owner: MarkTraceur) [20:20:18] Testing is fun. [20:26:08] Updated docs on wikitech [20:31:01] (PS1) MarkTraceur: Handle 1 comment too [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/95691 [20:32:13] (CR) MarkTraceur: "Only one" (1 comments) [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/95691 (owner: MarkTraceur) [20:32:19] Sigh. [20:33:43] (PS2) MarkTraceur: Handle 1 comment too [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/95691 [20:34:44] (CR) MarkTraceur: "Only one comment!" (1 comment) [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/95691 (owner: MarkTraceur) [20:34:48] Huzzah. [20:35:13] * YuviPanda disappears [20:35:20] thanks for looking at this, marktraceur!
[20:35:23] Yarp [20:35:55] (CR) MarkTraceur: "Two should still work" (2 comments) [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/95691 (owner: MarkTraceur) [20:36:09] (CR) MarkTraceur: [C: 2 V: 2] Handle 1 comment too [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/95691 (owner: MarkTraceur) [20:40:11] will the datacenter move destroy all instances and build them new? [20:41:17] @andrewbogott [20:41:47] giftpflanze: unlikely. [20:41:56] and i heard /home is also not local [20:42:00] But as a general rule you should be equipped to rebuild instances in any case :) [20:42:09] giftpflanze: it depends on the project. It's local on some projects and not on others. [20:42:18] how to find out? [20:43:22] well… it should be obvious if you have more than one instance. [20:43:28] It's also a checkbox on the project configuration apge [20:43:29] page [20:43:39] mmm [20:44:08] by (un)checking that all data is moved? [20:45:56] why do the tools exec nodes have ip addresses? [20:47:00] Unchecking the box won't move anything, it will just mean that suddenly your home directory is local -- and probably empty. [20:48:10] anomie: Both of your instances (-apache01 and -apache01) include the class 'misc::deployment::scripts.' As best I can tell that class never worked. [20:48:22] Do you know otherwise? Are you using/hoping to use deployment scripts? [20:48:41] Because that class isn't used anywhere else, so I'm inclined to remove it entirely. [20:48:58] andrewbogott: It did work, and it's there because I wanted to have mwscript and the like on the instance. [20:49:14] 162 class misc::deployment::scripts { [20:49:14] 163 include misc::deployment::common_scripts [20:50:43] mutante: ? [20:50:55] just saying all it does is include "common_scripts" [20:51:07] Well, it's the next line that's problematic.
[20:51:13] but as you say it seems like just "misc::deployment" is used right now in site.pp [20:51:40] but that's odd because i thought at some point that was used [20:51:44] common_scripts includes mediawiki which includes mediawiki::sync which defines Misc::Deployment::Vars. Then misc::deployment::scripts tries to redefine it with different args. [20:52:06] can you delete an instance and make a new one with the same name? [20:52:07] ugh [20:52:42] giftpflanze: You can, although there will be a DNS lag during which the new instance may be hard to reach. [20:52:53] mhm [20:53:04] mutante, anomie, the difference is that misc::deployment::scripts defines class { "misc::deployment::vars": system => "git-deploy" } [20:53:11] andrewbogott: I suspect it worked until I2800d01a [20:53:19] Whereas previously it's defined as class { "misc::deployment::vars": system => "scap" } [20:53:33] I tracked down the changes, I'm pretty sure this was broken as long ago as June. [20:54:15] anomie: OK, so, what exactly do you want on those instances? Presumably we should use something that's used in production, which misc::deployment::scripts is not. [20:54:23] uuh.. i see. you'll know better then, but not that surprised about brokeness [20:54:29] Considering the last successful puppet run was "Fri Jul 5 02:17:21 UTC 2013" [20:54:36] (PS1) Aude: Update wikidata irc channel [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/95695 [20:54:59] mutante, do you know what gives with 'scap' vs 'git-deploy'? [20:55:06] andrewbogott: As long as mwscript and that works, it's fine. scap or git-deploy isn't needed. [20:55:57] ok. So, mutante, you think those labs instances should include 'misc::deployment' or just 'misc::deployment::common_scripts'? [20:56:09] anomie, if I try the latter, will you be able to tell if you have what you want? [20:56:28] (that would involve removing scripts you care about and seeing if puppet recreates them.)
[20:56:47] Go for it [20:57:49] andrewbogott: honestly, i dunno how far the switch has gone exactly, but i still see people scapping, so i guess until then should have the previous one [20:58:08] Ryan_Lane: ^? should that be git-deploy already? [21:02:27] anomie: OK, now there's a new failure message which… hopefully you can resolve. [21:11:52] andrewbogott: Hmm. /data/project is screwed up. Which I wouldn't care about except puppet wants /usr/local/apache to symlink there. [21:12:12] 'screwed up' = read only? [21:12:53] andrewbogott: "ls -d /data/project" works, "ls /data/project" says "ls: cannot access /data/project/: No such file or directory" [21:13:03] hm, ok, one minute... [21:13:46] Ah, it's not screwed up, it just never had shared storage to begin with. [21:19:19] anomie: on to the next error now :) [21:20:06] ah, then this comment was probably you andrewbogott [21:20:08] 161 # Can't include this while scap is present on tin: [21:20:08] 162 # include misc::deployment::scripts [21:20:18] just ran across that again actually looking for something different [21:20:28] Nope, wasn't me, but true nonetheless. [21:20:41] Actually I think that's a different conflict from the one I was troubled about. [21:20:41] i just installed unzip on tin for Reedy [21:20:52] i just dunno where to put it in puppet [21:20:59] all those roles on tin :p [21:21:05] Living life dangerously [21:21:12] Outlaw server admin mutante [21:21:17] lol [21:22:01] would throw it even in standard [21:22:07] node tin { package { 'zip': ensure => present } }  # For reedy RT #123456 [21:22:19] andrewbogott: It seems functional now, thanks! [21:22:30] cool [21:39:07] Ryan_Lane, andrewbogott: virt11 is getting full [21:48:18] so full that things are starting to fail? 
[21:49:36] (CR) Legoktm: [C: 2] Update wikidata irc channel [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/95695 (owner: Aude) [21:50:04] thanks legoktm [21:50:08] :) [21:50:31] hm [21:50:31] jenkins doesn't automerge there? [21:51:08] doubt it [22:01:07] paravoid: it's been that way for a while [22:01:29] icinga started warning 1d 3h 4m 3s ago [22:01:42] there's a bug in openstack that I patched [22:01:52] but there's a couple vms I need to deal with to free up space [22:02:01] basically, one of the base images is 160G raw [22:02:03] lol [22:02:12] for a fucking ephemeral disk [22:02:18] which means it's totally empty [22:02:38] I patched it so that ephemeral base images use qcow [22:02:54] but there's a couple images based on the 160G ephemeral [22:03:20] and they need to have their data moved onto a new disk based on the qcow, then the 160G disk can go [22:03:31] virt6 on the other hand is just full :) [22:03:40] and for that one we need to move some vms around [22:03:56] I was just planning on waiting till we move to eqiad and it'll clean itself :) [22:05:11] I wish those ciscos had more disks [22:28:15] Ryan_Lane: some instances in ldap have puppetVar: instancecreator_username and some don't. And, I just created a new instance, and it doesn't… do you know when/why we lost that functionality? [22:28:21] hi! does anybody know, how to start a perl script (that uses a locally installed module) via jstart? [22:29:47] lustiger_seth: I'd expect it to just work… what's happening? [22:30:04] "jstart script.pl" gives an error message, that a module could not be found. how do I tell jstart, that the module is installed locally? [22:31:17] I installed the module at /data/projects/camelbot/perl/lib/ [22:31:39] lustiger_seth: so… jstart doesn't run scripts on your current system. It dispatches them to one of the exec nodes. [22:32:03] When you say 'locally' do you mean in the cwd?
[22:32:39] no, not the cwd, but the directory /data/projects/camelbot/perl/lib/ [22:33:16] OK, so… that doesn't seem like an issue with jstart. Your perl script needs to know where to look... [22:33:45] I don't know enough perl to tell you how to do that, only that you need to do it. Presumably you can set a module path someplace? [22:35:02] in perl I could say "use lib qw(data/projects/camelbot/perl/lib);" to ensure that that path is added to the searched paths [22:35:23] normally that works, but when using jstart, it does not work. [22:36:10] Seems like you're missing a leading / in that path [22:36:10] lustiger_seth: Shouldn't there be a slash before "data" in that? [22:36:21] oh, you're right, even without jstart it does not work... [22:37:14] copy and paste error (vim is sooo slow on that toolserver.) there is a slash. [22:37:54] Also, shouldn't it be /data/project/, not /data/projects/? [22:38:01] oops. [22:38:52] andrewbogott: it was removed because that attribute doesn't support utf [22:39:28] reasonable. But too bad that we don't have a way to track down instance owners :( [22:42:13] ok, my problem was the wrong trailing "s", sorry. [22:42:44] lustiger_seth: no worries, everyone needs a second pair of eyes now and then :) [22:43:16] i wish my problems could go away by a second pair of eyes … [22:44:49] I've got another problem. And this time it seems to appear only when using jstart: there is a perl module called WWW::Mechanize. It is installed system-wide at /usr/share/perl5. [22:45:31] if I login as the tool (become camelbot) I am able to run a perl script that uses WWW::Mechanize [22:45:55] But if I use jstart, the error appears, that WWW:Mechanize cant be found. [22:46:23] lustiger_seth: I'll look to see if it's on the exec nodes. Can you tell me how to determine if it's available on a given system? [22:46:43] oh, nm, I see it on tools-login... [22:48:41] lustiger_seth: Indeed, that module is not available on exec-08. 
[22:48:54] So, best to file a bug requesting whatever ubuntu package contains it. [22:49:22] ok, at https://bugzilla.wikimedia.org/ ? [22:49:25] lustiger_seth: I expect you can also log into exec nodes and check this for yourself. I'm not positive. [22:49:27] Yep! [22:49:35] And bug Coren about it if/when he appears. [22:50:34] how can i log into exec nodes? just "ssh exec-08"? [22:50:59] well… it may not be trivial, you'd need some ssh magic. [22:51:22] no, that's it [22:51:30] just ssh tools-exec-08 i think [22:51:41] legoktm: that's it? Doesn't he need a key forwarded to tools-login first? [22:52:00] i ssh into tools-login.wmflabs.org and it works fine for me [22:52:12] seems to work for me, too. [22:52:16] I'm always around! [22:52:17] curious. [22:52:19] Ok. [22:52:26] lustiger_seth: sometimes saying Coren's name makes him appear. [22:52:26] * anomie|away likes qlogin -q task@tools-exec-08 [22:52:56] andrewbogott: sounds a bit like beetlejuice. [22:53:01] Similar. [22:53:29] And if I ask Coren here, I don't need to file a bug ticket? [22:53:33] lustiger_seth: ssh works to get onto the exec nodes, but you probably shouldn't be there. Why do you think you need it? [22:53:46] lustiger_seth: My answer to most bugging is "file a ticket" :-) [22:53:52] Coren: why do have exec nodes ip addresses? [22:53:58] Except sometimes Coren snatches your still-beating heart straight out of your ribcage. So that's a little different from beetlejuice. [22:54:18] giftpflanze: I'm not sure I understand your question? [22:54:27] meh [22:54:39] (You can only log in to the exec nodes from tools-login or tools-dev btw, not from outside) [22:55:02] Coren: I just wanted to check, why a perl script of mine is not working via jstript. and it seems that the perl module WWW::Mechanize is not installed at the exec nodes. [22:55:19] s/jstript/jstart/ [22:55:20] It's not in puppet. [22:55:50] what does "in puppet" mean? [22:55:52] installed? 
[22:56:22] Coren: I suggested that lustiger_seth could log into an exec node if he wanted to see if a given perl module was available. Bad advice? [22:56:35] lustiger_seth: Things get installed on the exec nodes through the puppet configuration. [22:56:54] oh, that's a software, I see. [22:57:00] andrewbogott: It's a valid quick check, but the only authoritative list is what's in the class. [22:57:13] * andrewbogott nods [22:57:53] ... gitblit just died. [22:58:20] Ah, no... it just breaks when I request HEAD. Odd. [22:58:30] lustiger_seth: https://git.wikimedia.org/blob/operations%2Fpuppet/f3db6e91cdefa91f27e04f639f9de8146d1d14a5/modules%2Ftoollabs%2Fmanifests%2Fexec_environ.pp [22:58:53] Check that list for your dependencies. If they're not all there, you can request new packages (including perl modules) though bugzilla. [22:59:47] andrewbogott: hm. it may be in nova [22:59:50] ok [23:00:26] andrewbogott: it is [23:00:41] oh, cool. [23:00:43] in fact, we can add the info to the wiki via the nova plugin [23:00:55] it's the user_id field [23:01:09] hmm, (filing a bug at bugzilla) "components"... is it bots, tools, Infrastructure or general or other? [23:01:49] * lustiger_seth chooses "tools" [23:02:12] yep, tools. [23:03:16] Coren: https://git.wikimedia.org/blob/operations%2Fpuppet/production/modules%2Ftoollabs%2Fmanifests%2Fexec_environ.pp is a slightly nicer URL, it should always show HEAD. [23:03:37] anomie|away: Indeed. [23:05:17] ok, done, https://bugzilla.wikimedia.org/show_bug.cgi?id=57118 [23:06:05] thanks for your help Coren, andrewbogott, anomie|away! [23:06:15] yep!