[00:00:41] Coren: cool, All files of my service group has been deleted and my jobs are stopped [00:01:08] why this thing happened? [00:01:22] Amir1: None of which has surprised you, since this has been announced repeatedly on labs-l for months, right? [00:01:49] I'm not member of labs-l [00:01:59] Today is migration day to eqiad for the tools which have not yet been migrated by their maintainers during the migration period. [00:02:01] I forgot to subscribe [00:02:28] Amir1: Then you should subscribe. The best prevention against surprise is to be informed, and all Labs-related announcement go on labs-l. :-) [00:02:32] there isn't anything about it in here: https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Help [00:02:42] about hte migration [00:02:53] I'm gonna subscribe [00:03:25] is there any help about the migration [00:03:29] That's because documentation pages are not a good place for announcements. You'll have to be patient; since your tool is in the batch migration. I'll send an email to the list with details and instructions once they are ready for maintainer intervention. [00:03:43] Amir1: There is nothing you can do at this time except wait. [00:04:32] okay, do you have an estimation when it'll be done? [00:08:39] Amir1: I expect at least another 6 hours or so. [00:08:53] okay, thank you [00:12:04] hi Coren , i was reading the documention to migrate ptwikis now [00:12:16] but i jus read you speaking with Amir1 [00:12:27] should I dont try to migrate now and just wait? [00:12:42] HenriqueCrang: Indeed. I'm sorry, you're too late to migrate by hand. So yeah; patience will be needed. :-) [00:12:52] ok :-) [00:53:26] Coren: If you get a chance, could you make that new trebuchet LDAP user a member of the "wikidev" group? [01:04:39] aude: Re deleting tool "aude" (cf. https://wikitech.wikimedia.org/w/index.php?title=Tool_Labs/Migration_to_eqiad&diff=104632&oldid=104628), could you please file a bug instead? 
On that page it will probably fall through the cracks, and with the migration in progress it's hard to do it immediately. [02:24:28] Do I correctly recall there being a pastebin hosted on the labs? [02:26:05] Yes. [02:26:08] paste.wmflabs.org, I think. [02:29:27] Gloria: Not found. [02:30:05] There's ongoing server maintenance. [02:31:02] Gloria: google-public-dns-a.google.com can't find paste.wmflabs.org: Non-existent domain [02:31:19] Maybe it was elsewhere. [02:31:26] I remember seeing a pastebin somewhere. [02:32:31] http://tools.wmflabs.org/paste/ [02:32:34] There it is. [02:34:51] Gloria: "appears to be non-functional at this time" for me? :/ [02:35:37] Is it in batch migration? [02:36:21] http://tools.wmflabs.org/paste/ loads for me. [02:36:31] But there is ongoing maintenance still, I'm fairly sure. [02:39:13] Gloria: Hmm, seem to have it on https. [02:40:47] Exciting. [02:43:36] Gloria: Ta btw. [03:14:40] Hello, all. I'm getting a 400 again on my tools. AND I can't log in via ssh because I'm getting a bad host key. anyone know why? [03:18:27] Magog_the_Ogre: bad host key part is normal in this case. it has actually changed,so you need to delete the old one and accept new one [03:18:50] this is one of those cases where it actually changed due to migration [03:19:12] and the clients will make it sound really bad (for usually good reasons) [03:19:50] why, what happened? [03:19:53] I missed the memo [03:20:18] mutante|away, [03:20:28] mutante|away: Can you log in to tools-login? [03:20:32] It fails for me. [03:20:43] Hmm, actually. [03:20:51] Never mind, wrong key. :-) [03:21:09] I usually use an alias that includes -q and the key. ;-) [03:21:28] Magog_the_Ogre: labs is moving from the old data center in Tampa to the new one [03:21:35] New one in Virginia. [03:21:41] ohhhh ok [03:21:56] Gloria: i'm not sure i have a login there yet [03:22:09] i am a labs user but not really a tool labs user so far [03:22:28] Ah. 
[03:22:29] and ops have other bastions too [03:22:31] I'm the opposite, mostly. [03:22:42] I'm still getting a 400 on my tool [03:22:48] Which tool? [03:22:49] and I ran "webservice start" [03:22:53] er [03:22:55] all of them [03:22:58] Link? [03:23:01] http://tools.wmflabs.org/magog/oldver.php [03:24:01] hmm, can't say i get a 400, it was really slow [03:24:04] but then it worked [03:24:09] i see some form [03:24:27] you know what? [03:24:29] works for me [03:24:33] it's because i typed in webservice start [03:24:37] and then IMMEDIATELY tried to load [03:24:41] I had to give it a minute >_> [03:24:49] yeah, it felt like you just started it [03:24:54] first load slow, now normal [03:24:59] Seems home directories are pretty locked down by default. [03:25:51] there should be a page with the ssh host keys, btw [03:25:57] so you can actually compare them [03:26:04] if you don't feel like blind trust, heh [03:26:30] omg [03:26:34] it deleted my upload log [03:26:36] the whole thing [03:26:44] "it" ? [03:26:44] 11000 entries >_> [03:26:51] it's gone from my home directory [03:27:04] I can restore from backup but like 2 months of logs will be gone :( [03:27:46] I'm 98% sure it wasn't something I did. 
[03:28:28] don't panic yet, maybe something is just not mounted [03:28:41] or can be restored after glusterfs restart whatnot [03:28:44] as it happened before [03:28:59] ok [03:29:12] * Magog_the_Ogre puts his panic hat back in the closet [03:29:15] opening bug is still good idea though [03:29:27] if you are seriously missing data [03:29:35] ok [03:29:37] there is a lot going on currently [03:29:57] so that would be helpful for to keep track [03:29:59] I will open a bug before I go to sleep (i.e., in like the next two minutes) [03:30:09] *few [03:30:11] great [03:30:20] thanks for the help :) [03:30:38] I promise to only cry one bitter tear of remorse over the logs ;) (jk) [03:30:54] !ping [03:30:54] !pong [03:31:37] !log deployment-prep deployment-bastion now using deployment-salt as puppet master [03:35:28] FYI https://bugzilla.wikimedia.org/show_bug.cgi?id=62767 [03:38:03] "deep-sixed". and i learn another English synonym. looks good [03:39:22] http://en.wiktionary.org/wiki/deep_six#Etymology [03:47:12] mutante|away: There is. [03:49:23] https://wikitech.wikimedia.org/wiki/Help:SSH_Fingerprints/tools-login.wmflabs.org [03:49:38] Coren: wiki page? yea, saw people asking before, [03:49:51] just wasnt sure about .. that:) nice [03:50:04] Help namespace it is [03:50:35] Magog_the_Ogre: [03:51:20] And also, Magog_the_Ogre: plz 2 subscribe to labs-l. No surprises that way. Well, /fewer/ surprises. [03:53:24] mutante|away, I try to be creative when I write on the internet, because Heaven knows I don't have much other outlet [03:53:49] I am aware that I probably irk the 3/4th of Wikimedians who aren't EN-N, but I figure they might be able to learn something :) [03:54:21] Magog_the_Ogre: your chance to ask about the file again, heh [03:59:01] mutante|away, I'm sorry? 
[04:00:00] Magog_the_Ogre: you should (nicely) ask Coren about your bug report [04:00:23] oh [04:00:24] :) [04:00:56] Coren, https://bugzilla.wikimedia.org/show_bug.cgi?id=62767 [04:01:18] maybe it was/is in the process of being moved? i wasn't quite sure [04:01:35] still empty directory :-/ [04:02:03] mutante|away, are you implying that programmers aren't especially cordial people? :) [04:03:27] not implying anything, you should just extra because they are ultra busy moving all teh things [04:04:58] at my job, we always joke about how terrible at communicating we programmers are [04:05:15] I am OK in IRC, but in general I'm pretty bad too at it tbh [04:05:53] subscribe to the list:) [04:05:55] and I would never look a gift horse in the mouth. The whole project is FREE for me to use and to add my tools to, so I expect nothing [04:05:57] already did! [04:06:00] hehe, ok [04:06:33] Einem geschenkten Gaul schaut man nicht aufs Maul. [04:07:05] that's the "gift horse" thing:) [04:07:38] lol [04:07:59] but it rhymes [04:08:48] http://en.wiktionary.org/wiki/Gaul#Etymology_2 + http://en.wiktionary.org/wiki/Maul#Etymology [04:09:20] "hack" = bad, old or incapable horse? [04:09:51] that makes it even better [04:09:55] I have never heard the word "hack" or "nag" used for a horse [04:11:05] I have heard hack to mean 1) something done on the computer, 2) a particularly nasty cough, 3) an attempt at something, 4) a swing of the baseball bat, and 5) a really lousy writer [04:11:23] that was tmi [04:11:25] what I meant was [04:11:29] https://en.wiktionary.org/wiki/hack [04:11:36] etymology 1 only is what I know :) [04:11:48] well, Pferd = capable, regular horse. 
Gaul = bad, incapable horse [04:12:19] some day wiktionary will just be a wikidata frontend, but i love both [04:12:20] that sounds like one of those words that used to be more common in English, but has fallen out of usage with people like me who rarely is around a farm [04:12:26] lol [04:13:22] I've often wondered about the "translation" sections I see on Wiktionary [04:13:34] the number of combinations is astronomical [04:14:10] you should add some:) there's a gadget for that that makes it really fast, he [04:14:33] yea, every word of every language in every other language... not more:) [04:16:28] I must sleep [04:16:30] and yes, we include vulgarities ( stops being off-topic) [04:16:32] same here [04:16:45] it's 12:15AM in the US and I have to be awake at 6:45AM [04:16:52] good night, thanks again [04:22:00] Hey folks, what is going on? The page http://tools.wmflabs.org/?list lists no tools. My own tool, http://tools.wmflabs.org/?tool=mathbot does not show up. [04:30:35] Hey folks, what is going on? The page http://tools.wmflabs.org/?list lists no tools. My own tool, http://tools.wmflabs.org/?tool=mathbot does not show up. [04:47:51] mzmcbride@tools-login:~$ become lolrrit-wm [04:47:52] sudo: sorry, a password is required to run sudo [04:47:53] :-( [05:01:44] Same here, both tools and tools-eqiad .wmflabs.org seems to be showing empty lists [05:26:17] !log Created cvn-app3 as the first cvn server in wmflabs-eqiad. Don't use for now (not yet set up with proper packages), but will replace cvn-app1 and cvn-app2 from pmtpa soon. [05:26:17] Created is not a valid project. [05:26:22] !log cvn Created cvn-app3 as the first cvn server in wmflabs-eqiad. Don't use for now (not yet set up with proper packages), but will replace cvn-app1 and cvn-app2 from pmtpa soon. [05:26:24] Logged the message, Master [05:34:43] !log cvn Finished migration of all bots from the old and small app server (cvn-app1) to the large cvn-app2. Instance to be decomissioned. 
[05:34:45] Logged the message, Master [05:36:39] I know I can't create new instances, but I should still be able to grant people access to a project and have them be able to access existing pmtpa instances, right? [05:37:10] I added user 'rxy' to the 'cvn' labs project. When he logs into a pmtpa instance, the connection gets terminated for being unable to create /home/rxy [05:37:31] Hm.. I just created an eqiad instance, and I'm getting the same error when logging in there. So I guess it's not pmtpa related. [05:37:50] https://gist.github.com/Krinkle/1050d5de4f5dacc3479c [05:37:53] rxy: ^ [05:44:00] Coren: andrewbogott_afk: [06:09:49] Coren: andrewbogott_afk: Filed bugs instead; https://bugzilla.wikimedia.org/show_bug.cgi?id=62770 https://bugzilla.wikimedia.org/show_bug.cgi?id=62771 Thanks, [06:37:10] legoktm: do you know what 'SHUT OFF' means, in context of a labs vm? [06:37:37] a vm that isn't running? [06:38:21] legoktm: hmm, how I do get it back up? [06:38:34] there should be a reboot link somewhere I think? [06:38:39] legoktm: nevermind, found it [06:38:42] :D [06:38:50] woo, first login to eqiad [06:43:47] legoktm: do you know the eqiad bastion url? [06:43:59] bastion.wmflabs.org? [06:46:57] legoktm: bastion.wmflabs.org is still pmtpa (208.80./153/.207) [06:47:14] oh :/ [06:47:24] YuviPanda: bastions https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Overview#Bastion_hosts [07:31:18] Is there somewhere a manual telling me how to migrate a db from the old system to the new system? [07:39:56] Or can someone simply move my old database to the new place .. [07:45:32] Coren - is the old database rename still busy? [08:32:55] hello guys, my crontab (as well as tools.spbot's) is empty today. why? does that have something to do with the eqiad migration? i have not start this (https://wikitech.wikimedia.org/wiki/Labs_Eqiad_Tools_Migration) yet [08:57:21] hello [10:25:02] Coren: What I need to for my service groups [10:25:04] ? [10:44:33] is labs DNS still broken? 
[10:54:33] Coren - why can't I ssh to 'tools-exec-03' (I get a DNS-spoofing error and am warned that someone may be doing something nasty) [10:55:46] ah .. remove them from your known_hosts solves it :-) [10:56:45] databases still not moved, though. (forced move this morning) [11:02:40] Beetstra: in process its the last part to get moved [11:03:04] It will then probably take time [11:04:05] yeah, thats what happens when you dont do it yourself :P [11:04:22] If wikimedia would pay me to .. maintain the antispam bots .. [11:04:35] Or friggin' build it into the MediaWiki software .. [11:04:46] Yeah, I know [11:05:06] * Beetstra shuts up [11:05:23] there are several features that could have ground breaking impacts but no one is willing to code them [11:06:46] WikiLove is more interesting and useful [11:07:01] Hey, every spammer is still a new user to cherish! [11:07:10] * Beetstra again tries to shut up [11:07:56] * Beetstra fails [11:08:06] I was amused by the "TODO: Talk with the owners and maintainers of current tools and bots running elsewhere and help learn what is stopping them from migrating to Tool Labs, and file bugs in Bugzilla about those issues." on https://www.mediawiki.org/wiki/Wikimedia_Labs [11:12:46] Beetstra: the two bugs that Ive had open for years that have the biggest potentials are about including images in import/export and the second is about having ISBN links in the database and being able to query them [11:13:14] Lets find all articles that reference a given ISBN [11:13:48] go to special:BookSearch key in the ISBN and poof, every article where there is a link to that ISBN [11:14:05] we now finally have logs on the blacklist, that is massively interesting [11:14:17] Beetstra: linky? [11:14:34] Finally we can say "no, we don't deblacklist this, that guy tried to spam it just yesterday!" 
[11:14:44] https://en.wikipedia.org/w/index.php?title=Special:Log/spamblacklist&limit=500&type=spamblacklist&user= [11:15:53] Beetstra: Not cool [11:16:06] sorry, that is the en.wiki link [11:16:18] Beetstra: thats enwiki only [11:16:24] *admin only [11:16:29] admin only, yes [11:16:33] It exists on all wikis [11:16:40] it shouldnt be admin only [11:16:54] There is no real need for it to be admin-only, indeed [11:17:29] in fact it would make non-admin clerking at the blacklist/whitelist easier [11:17:35] afk [11:18:45] yep [11:28:18] anyone know how i can turn on a shutoff instance? specifically, this one: https://wikitech.wikimedia.org/wiki/Nova_Resource:I-00000962.pmtpa.wmflabs [11:33:49] hashar: do you know how i can turn on a shutoff instance? specifically, this one: https://wikitech.wikimedia.org/wiki/Nova_Resource:I-00000962.pmtpa.wmflabs [11:37:32] dan-nl: reboot it from the wikitech web interface? [11:38:00] thanks, trying that [11:39:29] hashar: that took care of it, thanks. the timeline for migration is by the end of march 31, correct? [11:39:31] hi Coren, the crontabs ar NOT copied. [11:39:33] crap., [11:41:18] Steinsplitter, same here [11:48:42] dan-nl: I think so. [12:39:57] Beetstra: see bug 62781 [12:52:18] Steinsplitter: They have been, but the finishing script won't run until migrations are done and there's a ... ridiculously large tool in progress right now. [12:53:07] Steinsplitter: But you can check for the presence of a ...DATA.crontab file in the tool's home; it contains your crontab. Remove it if you edit your crontab with it, otherwise when the finishing script runs it'll overwrite them. [13:00:17] Coren: i cannot find :/ maby you have a minute to look into it? krin kle is not here and i guess ther ar atm hunderts of redlinks in wikipedia. this is a important bot. [13:01:06] Steinsplitter: All bots are important. Also, what do you mean you cannot find it? 
[13:01:09] -rw-r--r-- 1 root tools.delinker 1132 Mar 18 12:42 /data/project/delinker/...DATA.crontab [13:01:20] Coren: whats the large tool? [13:01:48] enwp10 [13:02:29] Ouch [13:08:45] my tools have been moved. How can I enable them? [13:08:51] Coren: forgotten to type "...", thanks :) [13:13:22] Coren: I'm not able to run become on the new cluster (since I have no password) nor passwd [13:14:04] fale: http://lists.wikimedia.org/pipermail/labs-l/2014-March/002196.html [13:14:36] Coren: thanks :) [13:15:53] Coren: in the tools page, no tool is listed :/ [13:17:05] fale: I'm sorry, what? [13:17:30] Coren: http://tools.wmflabs.org/ <-- no tools here [13:17:56] Huh, interesting. [13:18:00] * Coren looks into it. [13:28:31] Ah. It's my fault. I managed to overwrite some recent changes of mine. /me facepalms. [13:31:56] Coren: I hope nothing big :) [13:33:04] * fale also hopes that eqid will be labs' home for many years since in less than 3 months I did already moved my tools twice :/ [13:33:35] No plans to move anything from eqiad for years. :-) [13:36:30] Coren: :) [13:42:42] ah Coren good morning! [13:43:16] Coren: so I got some issue with the shared NFS server hosting MediaWiki uploaded files. The files are created by the 'apache' user and it does not exist on the NFS server [13:43:40] Coren: you told me about using nobody:nogroup and 777 but I am not sure how to have apache write files with those credentials [13:43:58] and I thought we could just add a apache user on the NFS server just like you did for l10nupdate and mwdeploy user :] [13:45:33] hashar: (a) You have nothing special to do; apache will be able to write there as expected and (b) 'apache' is a poor candidate for a global group; it's not mw-specific really so possibly conflicting, yet is also not used (that way) in ubuntu at all. [13:46:09] Coren: so should I simply chown/chmod the files ? [13:46:09] Although, to be fair, right now much of that user/group thing is a mess -- including production. 
[13:46:28] hashar: ... what files? You need not chown anything. [13:47:10] on the beta cluster, files uploaded by users via Mediawiki are written to the NFS shared directory /data/project/upload7 (we dont have swift) [13:47:44] hashar: Yes, that part I know. What files to you mean? [13:47:52] the files are root:root 777 right now. So that probably works [13:48:03] ... what files? [13:48:07] all of them ? [13:48:08] :-D [13:48:24] You mean the actual /uploaded/ files? [13:48:29] yesss [13:48:38] You don't care about those. :-) [13:49:11] under /data/project/upload7 mediawiki creates directories like / for example wikipedia/en/ [13:49:45] then under that the usual md5 based hierarchy like 0/00/ [13:49:46] Files don't gain anything depending on who owns them. :-) [13:49:55] and that needs to be writable by all apache instances [13:50:10] okk [13:50:20] so the current root:root 777 would work hopefully :] [13:52:57] Coren: the apache user has the same UID/GID (48) on all my instances \O/ [13:53:10] and the files are 777. I guess issue is solved [13:53:10] thx [13:53:19] (there was no issue in the first place hehe) [13:54:30] Coren: on a different subject, I could use a lvm disk on the continuous integration instances in EQIAD :-] [13:54:47] which is all about requiring labs_lvm :] https://gerrit.wikimedia.org/r/#/c/119263/1/manifests/role/ci.pp,unified [13:55:05] on pmtpa we have been using /dev/vdb on /mnt [13:55:52] ... okay? [13:59:09] hashar: Merged. [13:59:49] Coren: https://bugzilla.wikimedia.org/show_bug.cgi?id=62771 you read this already ? and can you fixing it? [14:02:13] rxy: No; I'm busy on the migration. But the last problem being described there is almost certainly incorrect security groups. [14:03:24] Coren: hmm , thanks [14:09:15] Coren: I tryed that method, but the sudo error is still there [14:09:28] fale: What tool? [14:09:45] from fale I try to do "become lists" [14:10:29] Coren: oh, I had to deconnect :) [14:23:36] is DNS still broken? 
[14:25:55] !log [14:25:55] Message missing. Nothing logged. [14:26:35] !log integration creating new slaves integration-slave1001 and integration-slave1002 using role::ci::slave::labs [14:26:37] Logged the message, Master [14:28:42] maxsem@bastion1:~$ ssh jitsu.eqiad.wmflabs [14:28:43] ssh: connect to host jitsu.eqiad.wmflabs port 22: No route to host [14:28:53] Coren, what's broken?^^ [14:28:58] Coren: also got a change to make labs_lvm exec failures to emit stdout/stderr on failure https://gerrit.wikimedia.org/r/#/c/117199/ [14:29:19] aka exec { logoutput => 'on_failure' } [14:30:29] MaxSem: Migrated instance? [14:30:36] yep [14:30:41] worked yesterday [14:31:24] or was that its pmtpa clone that's no more today? [14:32:11] MaxSem: You're connecting from a pmtpa bastion? [14:32:18] yep [14:32:32] is there an eqiad one? [14:32:42] MaxSem: That may or may not work; check the project's security groups. (Also, you should be using an eqiad bastion, the pmtpa ones will disapear soon) [14:32:59] what's its external hostname? [14:33:27] bastion-eqiad [14:35:35] Coren, weird - I connect to bastion.eqiad.wmflabs.org but still see bastion1 in promt [14:36:13] bastion-eqiad you mean? Yeah, they're named 'bastion1-3, etc also. Just .eqiad) [14:37:02] so I still can't connect to jitsu even from eqiad [14:42:50] MaxSem: Okay, did you check the security groups, and is there anything on the console? [14:45:26] security groups look the same as in pmtpa, console output looks empty [14:46:23] Well, that's my point. If they're the same as in pmtpa, they may well not be correct. What's the range from which ssh is allowed? [14:48:29] 0.0.0.0/0 [14:49:12] hmm, what if it's the same group from pmtpa with the same name? [14:50:53] That shouldn't cause an issue. [14:52:20] MaxSem: That VM doesn't even arp; I think it's dead, jim. :-) [14:52:32] MaxSem: And you say nothing on the console either? [14:52:37] нуы [14:52:40] yes [14:53:14] What project is this? 
[14:53:17] mobile [14:58:09] MaxSem: There's definitely something odd going on, afaict no instance in your project is working right. When were those migrated/ [14:58:10] ? [14:58:33] Coren, Sunday evening [14:59:34] I'll have to involve andrewbogott_afk to dig deeper; I'm not sure what state his part of the migration is in and whether that could have this effect. [15:26:58] Hi! I had requested for a labs instance on March 1 (https://wikitech.wikimedia.org/wiki/New_Project_Request/mediawiki-VERP) inorder to setup an environment to test and implement the email functionality of MediaWiki to implement VERP functionality for my proposed GSoC project ( mediawiki.org/wiki/VERP ). It's not yet approved. Can someone look into the same ? [15:30:15] tonythomas: You'll probably have to wait until early next week; we're in the middle of a migration at the moment. :-( [15:30:46] Coren: in that case. ok. Will wait [15:31:21] We /might/ be able to squeeze it in late this week though; but no promises. Sorry about the poor timing. [15:41:20] Coren: ok. Hope to hear the notification soon :) [15:51:18] is there a problem with equids? A migrated instance can not be accessed by host name, floating ip or from bastion with ssh. Wikitech reports it as active. i-00000165.eqiad.wmflabs [16:14:41] hm, anyone know an email address for dan-nl? [16:15:06] I mothballed project yesterday and today he re-started the pmtpa instance :( [16:20:42] slevinski: what is the name and project for that instance, please? [16:22:20] hey andrewbogott, is that because his migrated instance is dead just like mine are?;) [16:22:34] MaxSem: no [16:22:39] But, I'm going to look at yours next [16:22:48] thanks:) [16:23:22] signwriting project [16:23:30] Coren: around? 
[16:23:33] ah, andrewbogott is [16:23:38] something is very broken with labs' DNS [16:23:43] faidon@bastion-restricted1:~$ host catalogcompiler.eqiad.wmflabs [16:23:44] catalogcompiler.eqiad.wmflabs has address 10.68.16.24 [16:23:44] catalogcompiler.eqiad.wmflabs has address 10.68.16.4 [16:23:55] yikes [16:23:55] the latter is tools-webproxy [16:24:14] major confusion when I got two different prompts when SSH'ing in :) [16:32:51] paravoid: that looks like a consequence of https://bugzilla.wikimedia.org/show_bug.cgi?id=58717 [16:33:03] which I couldn't solve last time, but I'll give another look soon :( [16:40:16] hi! I'm the maintainer of the 'svwiktionary' tool. How do I activate it again? (I think it's been auto-migrated to eqiad) [16:43:12] skalman12: I think Coren wants to send a mail to labs-l with instructions once everything's finished; for web access, it's probably enough to "webservice start", but I don't know the status of your tool's migration as to not hit something mid-stream. [16:44:10] andrewbogott: Do we have a test suite that croaks about differences between OpenStack, LDAP and SMW? [16:44:37] scfc_de: I don't think so, I've always done it by hand [16:45:46] That sounds like work :-). [16:46:04] scfc_de: so... should I wait or should i run "webservice start"? [16:53:36] skalman12: 'webservice start' cannot hurt you. The only things that may have been delayed are databases and crontabs. [16:54:05] skalman12: If your tool is a webservice, then it should work. You may want to read this though: [16:54:07] !newweb [16:54:07] https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Help/NewWeb [16:58:39] Coren: thanks. the parts that don't use a database work as expected [17:01:15] Coren: found out a crazy puppet dependency issue with labs_lvm and /mnt. 
I got resources defined on /mnt but they are released before the mount occurs :-]  So I went with a hack at https://gerrit.wikimedia.org/r/#/c/119305/ [17:01:20] Coren: it is not urgent :-) [17:13:03] hashar: +2'ed [17:14:10] (and merged) [17:14:12] Coren: so I forgot to migrate my tool...I just ssh'd into tools-login which I assume points to eqiad now? but after doing $become toolname, I don't see any of my stuff except for a ...DATA.crontab file [17:14:30] am I looking in the wrong place? [17:15:02] You /only/ have a ...DATA.crontab? What's your tool? [17:15:48] legobot [17:16:01] there's also a public_html and replica.my.cnf [17:16:29] but the public_html dir is empty, and I did have stuff in it [17:16:34] Yeah, those are created in eqiad. Your tool is scheduled for copy still; it's amongst the biggest ones so ended up at the tail of the list. [17:16:49] ok. Can I still do it manually? [17:16:54] Nope. [17:16:56] Coren: i ssh into bastion-eqiad.wmflabs.org then ssh tools-login [17:16:58] then become lolrrit-wm [17:17:00] Coren: Apparently, geohack (https://tools.wmflabs.org/geohack/geohack.php?pagename=Gro%C3%9Fe_Moskwa-Br%C3%BCcke&language=de&params=55.748888888889_N_37.624444444444_E_dim:1000_region:RU-MOW_type:landmark) produces a 503, which leads to "The URI you have requested, http://tools.wmflabs.org/?503, doesn't seem to actually exist.". [17:17:04] sudo: sorry, a password is required to run sudo [17:17:11] Sorry. The deadline for that is gone. :-( [17:17:16] * aude would like to try to get grrrrt back [17:17:50] aude: http://lists.wikimedia.org/pipermail/labs-l/2014-March/002196.html [17:18:10] Coren: yeah, my fault. is there an eta on when it'll be migrated? so I can let other people know and figure out whether I should revive it on the TS temporarily [17:18:11] Coren: thanks [17:20:03] legoktm: Not sure, it's currently doing a tool with a huge DB... probably in a couple of hours. [17:20:40] ok, thanks. 
[17:20:55] sorry for not doing this earlier >.> [17:21:09] scfc_de: It looks as though the webserver for geohack is stuck/very overloaded. [17:21:30] Coren: Yeah (works again now), but the 503 shouldn't fail. [17:21:48] (Coming from the proxy, I assume.) [17:22:21] scfc_de: Hm. I just noticed a configuration error that would cause that in some cases. Noted. [17:24:15] scfc_de: fix't. [17:25:34] Geohack itself, otoh, is feeling ill. [17:26:26] Ah, yes, it's reaching its connection limit over and over. [17:28:18] Coren: tools are migrated? [17:28:23] gerrit-to-redis doesn't seem migrated yet [17:28:32] YuviPanda: Most are. [17:28:37] Coren: ah, so some aren't :) [17:28:45] The very biggest ones. [17:28:52] i can't log into it per http://lists.wikimedia.org/pipermail/labs-l/2014-March/002196.html [17:28:59] I don't think any of yours aren't. [17:29:20] Coren: hmm, gerrit-to-redis still seems to tell me 'sudo: sorry, a password is required to run sudo' [17:29:23] aude: Did you update the maintainer list, then log off tools-login? [17:29:28] Coren: i did [17:29:31] YuviPanda: see the mail [17:29:34] oh [17:29:43] aude: What tool is this? 
[17:29:48] now i can get into lolrrit [17:29:53] ah [17:29:57] since YuviPanda is here, i'll let him work on it [17:30:01] aude: grrrit-wm is up, gerrit-to-redis needs to be up [17:30:15] aude: you seem to be more up to date with migration than me (I still haven't caught up on labs-l :() [17:30:16] i wasn't sure which part to work on [17:30:18] aude: so you should do it :) [17:30:38] give me a minute to fix the sudo thing [17:30:42] aude: sweet [17:30:47] oh, i can't [17:30:53] not a member [17:31:09] just go to https://wikitech.wikimedia.org/wiki/Special:NovaServiceGroup and manage members [17:31:13] aude: hmm, let me do it [17:31:14] you can add me :) [17:31:19] aude: yeah doing :) [17:31:30] then it will refresh the ldap stuff for the tool and then [17:31:41] log off / login to tools eqiad again in a few minutes [17:31:47] and the error should be gone [17:32:14] i actually don't know how the redis part works without reading the documentation, though or poking [17:32:31] aude: heh, at this point I might not either :P [17:32:36] heh [17:33:03] aude: added you as well now. [17:33:13] ok [17:33:20] it will take a few minutes for ldap to sync [17:33:22] aude: it might need to get re-initialized. If the sudo thing goes away :) [17:33:24] aude: ah, righto [17:34:50] i'm in [17:37:31] aude: hmm, I'll brb in about 15min. poke around till then? :D The bash history should be illuminating, I hope [17:37:36] ok [17:37:46] * aude being distracted in the office [17:43:10] scfc_de: I've just added the ability for admins to increase per-tool workers. Check /data/project/.system/config [17:43:58] MaxSem: Just an update… it looks like all instances on that host (virt1007) have no network access. [17:44:01] So, you're not alone! 
[17:44:09] I'm trying to scare up a network engineer to help [17:44:19] scfc_de: The default is 5 workers / 20 connections; this allows overriding the number of workers (with connections being 4x) [17:44:49] andrewbogott: It's easy to scare a network engineer; tell them someone is playing with a backhoe. :-) [17:45:01] aude: db: p50380g50686__subscriptions [17:45:04] aude: might be the issue... [17:45:25] probably [17:45:41] i restarted everything and nothing [17:45:59] Coren: is possible to see 500s errors instead of see the default tools page? [17:46:36] Coren: This is not quite a diagonal-cutter scenario, since I can ssh to virt1007 itself. Fortuitously, there are no tools instances on that host. [17:46:41] fale: I don't know what you mean? The default 500 page is no more informative. [17:47:18] Coren: uhm... is there a way to see some informative 500s page to fix the problem? :D [17:49:02] YuviPanda: you working on it? [17:49:11] i think i know what to do now, if not [17:49:52] fale: ... not really. You'll want to check your error logs normally; or if you want to print debugging info not return a 500. [17:50:22] aude: no, please go ahead :D [17:50:47] Coren: on error log nothing appears, only on access log. I'm not able to understand if I'm doing something wrong with the app or with lighttpd [17:51:14] doing [17:51:18] fale: You might want to turn some debugging on to make things more verbose. If your app in php? [17:51:31] Coren: yep, is php [17:52:14] fale: Start your script with: ini_set('display_errors', 'On'); [17:52:29] Coren: thanks for the suggestion :) [17:52:40] fale: This way PHP errors will end up in the actual response, that tends to be useful. :-) [17:52:47] :) [17:53:42] Which tool of yours is giving you trouble? [17:54:33] Coren: lists [17:54:35] andrewbogott: ping? dptypes.eqiad.wmflabs seems unreachable. 
wikitech says it is 'ACTIVE' but puppet status 'unknown'
[17:54:58] Coren: but only when it has to display an actual list (the navigation in folders works fine)
[17:55:07] YuviPanda: what host is it on?
[17:55:26] fale: display_errors should help a lot then.
[17:55:31] andrewbogott: how do i find that out? it says i-000001a3.eqiad.wmflabs
[17:55:58] Oh, um… if you click on the link, there's a 'current host' entry
[17:56:10] fale: Oh, wait, you are using rewrite rules?
[17:56:29] andrewbogott: virt1006
[17:56:39] YuviPanda: dang
[17:57:01] andrewbogott: oh?
[17:57:11] fale: Look at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Help/NewWeb#Enabling_request_logging then
[17:57:28] YuviPanda: virt1007 has, like, an unplugged network cable or something. Was hoping you were the same problem
[17:57:31] fale: That adds debugging information on exactly how lighttpd is processing the requests, so you can see if there is an issue with your rewrite rules.
[17:57:47] Coren: thanks a lot :)
[17:57:56] andrewbogott: hmm, damn. This has been there since morning though. It was also auto-migrated, and 'SHUT OFF' until earlier in the day.
[17:58:17] andrewbogott: it might also be just a puppet issue. if you can login with root key and tell me you can login, then I'll try to fix the other issues (this is running labs-vagrant)
[18:00:13] YuviPanda: is this an instance that I mothballed?
[18:00:31] YuviPanda: there's probably a step now to add subscriptions to the mysql db
[18:00:42] not quite sure how/where to do it
[18:00:42] andrewbogott: I... don't know? :(
[18:00:46] andrewbogott: sorry, been really behind on things
[18:00:53] it seems connected to redis and gerrit
[18:00:56] where did it come from? How did it get shut down?
[18:02:10] YuviPanda: what project?
[18:05:55] andrewbogott: design. I just checked it today.
[18:06:13] andrewbogott: also, terribly terribly sorry, but I need to run now if I have any hope of getting dinner before the city shuts down :( brb in about 45m?
:(
[18:06:14] YuviPanda: as you see from this page, 'design' is mothballed: https://wikitech.wikimedia.org/wiki/Labs_Eqiad_Migration/Progress
[18:06:15] sorry!
[18:06:27] andrewbogott: oh! I can rebuild it, it's not too much of an issue.
[18:06:36] andrewbogott: anyway, brb! again, sorry :(
[18:06:42] Well, check in with me before you do anything so we can get it into a reasonable state.
[18:18:07] MaxSem: Current theory is that a network cable got yanked out of your virt host. Chris is driving to the datacenter to jiggle the connector :)
[18:18:18] :)
[18:19:00] * andrewbogott is over the moon at the prospect that this might not be an OpenStack failure
[18:24:06] Coren: how's progress on the database move/rename? (sorry, I just want to make sure I'm not doing something wrong)
[18:27:35] Earwig: Long, slow and painful, but it progresses. For /most/ tools, though, it's done.
[18:29:00] hm... okay
[18:37:07] andrewbogott / Coren : why is my project rxy's 'public_html' assigned to another UID (my UserID is 51788, but the public_html (and tmp) directories (etc.) are assigned to 51789)? I can't modify/rm/chown them. Could you please fix it?
[18:39:56] @Tool Labs
[18:50:15] MaxSem: try now?
[18:51:26] andrewbogott, can connect now, it just doesn't accept my key
[18:51:40] MaxSem: hm… and it used to?
[18:51:54] jitsu?
[18:52:08] yes
[18:52:28] try now?
[18:52:50] not sure if it used to - I used short name instead of FQDN yesterday and might have ended up on a pmtpa instance
[18:53:28] MaxSem: due to the network outage, /public/keys wasn't mounted.
[18:53:31] I just mounted it.
[18:53:42] andrewbogott, thanks - works now
[18:53:58] MaxSem: great! staging seems troubled, I'm looking at that one now
[18:56:11] rebooting jitsu to see if upstart works
[18:59:40] well, crap, I can't get 'staging' to respond to a ping even :(
[18:59:48] MaxSem: is that something you can rebuild? Or shall I keep working on it?
[19:01:12] andrewbogott, maxsem@bastion1:~$ ssh staging.eqiad.wmflabs
[19:01:12] ssh: Could not resolve hostname staging.eqiad.wmflabs: Name or service not known
[19:01:12] maxsem@bastion1:~$ host staging.eqiad.wmflabs
[19:01:14] Excellent. My instance is back up. It was on virt1007. Thanks for jiggling the cable.
[19:02:09] andrewbogott, I think we can rebuild staging if it is hopeless
[19:02:27] MaxSem: thanks. I don't know what happened, but it was OOM when I looked at it earlier this morning.
[19:02:35] aha
[19:02:36] I've rebooted but it seems not to've survived :(
[19:02:50] yep, it had apache eating all memory
[19:03:12] to the point that mysql crashed :P
[19:04:36] yurik & dr0ptp4kt, did you guys have anything valuable on staging?
[19:05:02] MaxSem, don't think so
[19:05:12] MaxSem, not that i know of
[19:05:45] it had an alias staging-zero.wmflabs.org
[19:10:32] Krinkle|detached: Are you still working on the 'cvn' project? Can you please update the progress page accordingly?
[19:37:34] YuviPanda: i got stuck at how to add subscribers
[19:37:48] i set up a new database with the subscribers.sql
[19:37:53] updated the config with the new db
[19:38:18] my changes should be committed to git also
[19:39:43] idk if the old database comes back... perhaps
[19:39:47] Coren: ^
[19:48:16] aude: back.
[19:48:25] aude: I've forgotten wtf it was doing with subscriptions. looking now
[19:51:50] it otherwise seems to connect to gerrit and redis ok
[19:52:54] aude: grr, for some reason -login doesn't resolve for me.
[19:53:01] aude: my provider's dns acting up again, I think
[19:53:08] no, i think it is labs
[19:53:47] try bastion-eqiad.wmflabs.org
[19:53:54] then ssh tools-login
[19:55:02] * aude would have liked to see the database in tampa
[19:55:41] aude: nah, works now
[19:55:48] ok
[20:00:40] bd808: Regarding your stillborn instances yesterday… it turns out that that labs host had an unplugged network cable.
So every instance on that box was unreachable… once the load balancer swapped you over to a different host, things started working again.
[20:01:06] So -- you weren't doing anything wrong at all (in case you were worried)
[20:01:07] Ha. Mystery solved
[20:01:18] yep
[20:01:25] I knew it wasn't *my* fault :)
[20:01:35] ok, good!
[20:04:14] aude: I've to run registrar.py as a process, then 'subscribe' grrrit-wm to it
[20:04:29] aude: stupid arch decision on my part, since I assumed lots of other tools would also want to listen to the gerrit stream
[20:04:56] aude: meeting, brb :( it should be a 5min fix, I can do it when I'm back.
[20:04:58] yu ok
[20:05:06] gah
[20:14:27] aude: Databases are being copied, but the going is slow.
[20:14:53] Coren: ok
[20:15:06] i don't think it's important for the gerrit bot
[20:31:28] aude: oh, if databases are being copied, and then we have a new one, I wonder if that'll cause problems
[20:34:04] it has a different name
[20:34:09] aude: aah. hmm, ok
[20:34:13] aude: let me try to do it now
[20:54:42] andrewbogott, Coren: Is there currently any way to force all instances created in a given project to apply a puppet role automatically?
[20:55:40] hashar and I would like to figure out how to give all hosts in the deployment-prep project some default configuration: https://bugzilla.wikimedia.org/show_bug.cgi?id=62795
[20:57:30] andrewbogott: Coren (sorry to bother, you must be tired)...
[20:57:41] we are trying to move the wikidata jenkins instance
[20:58:23] in pmtpa, we put stuff in /mnt/ (where most of the storage seemed to be, aside from /data/project shared stuff)
[20:58:53] in eqiad, i don't see /mnt space available
[20:59:46] trying to figure out where we should put stuff
[20:59:48] aude: What image did you use?
My huge image has a /mnt of 150G
[21:00:31] 80gb
[21:00:38] The disk should be /dev/vdb
[21:01:04] http://pastie.org/8948643
[21:01:30] i can't login to tampa, but know we had something for /mnt that was most of the instance space
[21:02:04] seemed the right place for jenkins builds
[21:02:24] Hmm. Oh. I'm looking at a pmtpa instance :/
[21:02:59] bd808: Re puppet role, you could try and submit a conditional check to a core class, or (and I would prefer that) do that on wikitech. You can query the Puppet classes and variables with SMW IIRC, so setting up a watchdog to fix any instance not being properly configured should be easy.
[21:03:04] we made this instance on march 6, so maybe something was not right then
[21:03:56] bd808: I don't think we have that… you can set up a per-project checkbox but you still have to go through and check it for every instance.
[21:04:12] aude: I was looking at a pmtpa instance and not eqiad. This is my huge instance in eqiad http://p.defau.lt/?_c34jGcXLWvC2n2ZF8anXA
[21:04:40] aude: New instances in eqiad have a new drive but it is not partitioned by default. Coren has made some ready-made puppet classes to chop it up and mount it as needed.
[21:04:49] lemme see if I can figure out where those are...
[21:04:56] ok
[21:06:33] scfc_de: I don't know anything about SMW magic but that sounds cool.
[21:08:01] aude: That should be labs_lvm if I'm not mistaken.
[21:08:12] ok
[21:09:44] i don't see a check box for that
[21:10:30] aude: I think it's only for inclusion in other Puppet classes ATM, but I'm not sure.
[21:10:43] ok
[21:10:43] we can do that
[21:11:11] aude: Maybe look at role::ci::slave::labs::common for an example?
[21:11:18] ok
[21:11:39] bd808:
[21:11:39] https://wikitech.wikimedia.org/w/index.php?title=Special%3AAsk&q=%5B%5BResource+Type%3A%3Ainstance%5D%5D%5B%5BProject%3A%3Adeployment-prep%5D%5D&po=%3FInstance+Name%0D%0A%3FPuppet+Class%0D%0A%3FPuppet+Var&eq=yes&p%5Bformat%5D=broadtable&sort%5B0%5D=Modification+date&order%5B0%5D=DESC&sort_num=&order_num=ASC&p%5Blimit%5D=20&p%5Boffset%5D=20&p%5Blink%5D=all&p%5Bsort%5D=Modification+date&p%5Bheaders%5D=show&p%5Bmainlabel%5D=&p%5Bintro%5D=&p%
[21:11:56] bd808: That should give a list of all instances in Deployment-prep with Puppet classes and variables.
[21:12:14] There's CSV exports & Co. as well ("Format as:").
[21:12:29] aude: regarding storage, I'm stumped. Will have to wait for Coren to appear and explain...
[21:12:50] scfc_de: Neat. I'll play with it and see what I can figure out
[21:13:28] the labs jenkins put stuff in /mnt
[21:15:10] andrewbogott: I think role::ci::slave::labs::common shows how to do what aude wants. scfc_de gave the labs_lvm clue.
[21:15:35] bd808: ok, but it's crazy that there isn't a ready-made role class for other kinds of labs instances
[21:15:47] Which, I guess I can make one...
[21:15:48] * bd808 nods
[21:16:04] i just can't find "labs_lvm" in puppet
[21:16:09] Looks like it would be relatively easy, except for the sequencing
[21:16:16] nor a clue in the ci role
[21:16:27] must be something simple though
[21:16:45] aude: modules/labs_lvm/manifests/volume.pp
[21:16:53] hmmm
[21:18:14] git pull would help
[21:20:52] aude: I'll try to make a simple role, give me a few minutes
[21:22:54] ok
[21:34:10] aude: would you mind signing off that everything is ok with the gerrit-to-redis migration? https://wikitech.wikimedia.org/wiki/Tool_Labs/Migration_to_eqiad#Migration_status.2Fnotes
[21:36:00] hedonil: ok
[21:36:12] aude: hey, fine ;)
[21:42:31] andrewbogott: your patch works nicely
[21:42:58] aude: OK, great. I'll merge it so you can use it on a normal instance...
[21:43:10] ok
[21:44:37] aude: added a checkbox in the 'labsdrives' group
[21:45:19] yay
[21:58:28] <^d> foo.wmflabs.org.wmflabs.org
[21:58:31] <^d> Copypaste fail.
[22:01:17] http://tools.wmflabs.org/guc/index.php what is labs trying to tell me? "No webservice": "You have not enabled a web service for your tool, or it has stopped working because of a fatal error." - it's enabled, it did work... what's wrong?
[22:02:23] Luxo: your tool has been migrated to a new datacenter
[22:03:01] Luxo: https://wikitech.wikimedia.org/wiki/Tool_Labs/Migration_to_eqiad
[22:04:16] hedonil: ok, thx. " tool will be left in inactive state until intervention by one of its maintainers. " what do I have to do?
[22:05:00] Luxo: do you use a database on tools-db? if not, simply $ webservice start should do the trick
[22:05:52] hedonil: yeeaah it works!
[22:05:53] <^d> andrewbogott: done with moving gerrit project to eqiad, updating wikitech.
[22:05:58] thx bye
[22:06:08] Luxo: great!
[22:06:15] ^d: thanks!
[22:06:51] Luxo: please leave a note that everything works fine - on the wikitech migration page ;)
[22:07:12] <^d> andrewbogott: I nuked my one pmtpa instance. If you guys need to do anything else it's all yours, I'm off tampa.
[22:09:10] hedonil: done
[22:09:18] Luxo: ;)
[22:13:19] [bleep]ing enwp10 almost done!
[22:18:54] where can I find my old crontab?
[22:19:32] steenth: I think in new crontab but it's just been commented out.
[22:20:48] "crontab -l" gives "no crontab for tools.dawikitool"
[22:21:29] steenth: try $ cat ...DATA.crontab (the dots are significant)
[22:23:24] yes - it works
[22:32:50] steenth: and if everything works fine, please leave a note at https://wikitech.wikimedia.org/wiki/Tool_Labs/Migration_to_eqiad
[23:45:13] Betacommand: Is https://bugzilla.wikimedia.org/show_bug.cgi?id=54056 still an issue for you?
[23:45:53] Betacommand: And https://bugzilla.wikimedia.org/show_bug.cgi?id=54053.