[00:00:18] don't forget to disable them on pmtpa's tools-dev, for that matter.
[00:05:29] Hmm, I found one. Not sure about two others I'm somewhat confident I had.
[00:08:51] If only SGE had a scheduled job type
[00:18:23] Betacommand has there been progress on that bug?
[00:18:38] ToAruShiroiNeko: I havent checked
[00:18:48] Coren: is the filearchive table available yet?
[00:28:01] Betacommand: I've been a bit busy to keep an eye on the bug. I'll take a look early next week as I do bug wrangling to sort what's still relevant and not.
[00:33:15] ToAruShiroiNeko: ^
[00:42:23] thats a shame
[00:42:33] I needed the information quicker than that for the submission
[00:42:38] I'll post what I have got
[00:42:46] I'll note the bug though, which one was it?
[01:01:14] Coren could you tell me the bug?
[01:08:01] the one Betacommand was refering to?
[01:53:52] would anyone else know?
[01:54:28] UTFS?
[01:55:07] https://bugzilla.wikimedia.org/buglist.cgi?quicksearch=filearchive&list_id=283142 ctrl-f labs
[02:31:54] Change on mediawiki a page Developer access was modified, changed by PiRSquared17 link https://www.mediawiki.org/w/index.php?diff=921038 edit summary: [+138] mark for translation :-)
[03:27:55] Is eqiad up and running for new labs instances, or are there known issues?
[04:04:06] tried creating my first new instance in eqiad, i see it being done in console output, but channel 0: open failed: administratively prohibited: open failed
[04:04:16] while i can connect fine to existing pmtpa instance with that key
[04:04:35] in the same project
[04:08:06] puppet status at 'unknown'
[04:23:25] mutante|away, FWIW, the one I created did eventually come up (http://mwui.wmflabs.org/wiki/Main_Page), but it took longer than I expected.
[04:29:51] Coren: wikitech losing sessions was due to a mistake I made when forcibly logging people out
[04:30:29] superm401: ok, thanks, i'll try again in a little while, can still use the pmpta instance for the moment
[04:31:20] Coren: I fixed it last night
[04:38:46] hmm .. Error 400 on SERVER: Could not find class role::labs::instance for i-000000ea.pmtpa.wmflabs
[04:39:46] configured puppet::self on existing pmtpa instance right before this
[04:47:15] eh, other instance in same project, same config, doesn't happen, a bit weird but who knows what else happened there for testig
[04:47:50] and root's bash_history is global across instances?
[04:55:21] Ryan_Lane, mwui is now down again, and when I try to run puppetd --test, I get:
[04:55:25] err: Could not request certificate: getaddrinfo: Name or service not known
[04:55:39] Never mind, forgot to sudo
[05:04:29] mutante|away: Re "channel 0: open failed: administratively prohibited: open failed", a) did you try logging into bastion-eqiad and ssh'ing there to see if you are using the right hostname? b) If it just times out, you might need to check the security groups to see if they allow ssh from bastion-eqiad (10.68.*?)
[05:09:16] scfc_de: it doesn't time out, it really closes on me, i'm using bastion-restricted for all of them
[05:09:35] i guess that is the problem and i need bastion-restricted-eqiad
[05:10:08] the ssh config is identical to the pmtpa instances i can connect to and which are in same project
[05:10:13] mutante|away: Can bastion-restricted resolve the hostname?
[05:10:46] no, it can't
[05:11:00] only the pmtpa instances, not the eqiad instance
[05:11:53] That's odd and a bug :-).
[05:16:13] what times out is bastion-eqiad, because i'm not supposed to be on it
[05:16:25] ok.. hmm.
[05:23:26] hello, i know it sucks to be greeted with bugs :)
[05:23:42] but there may be one for ops people trying to get to eqiad instances
[05:23:54] is there something like bastion-restricted-eqiad?
[05:27:42] mutante|away: I'm behind… what's happening?
[05:28:10] andrewbogott: which bastion should i use to get to eqiad instances?
[05:28:21] because bastion-restricted doesnt work for them
[05:28:25] mutante|away: just use bastion-eqiad.wmflabs.org for now
[05:28:31] and bastion-eqiad doesnt like me
[05:28:34] times out
[05:28:37] I haven't set up a proper restricted bastion yet.
[05:28:39] Really?
[05:28:40] Hm...
[05:28:44] can you ping it?
[05:28:59] it should be 208.80.155.129
[05:29:35] i can, just time 10498ms
[05:29:40] 3 packets transmitted, 3 received, 0% packet loss, time 10232ms
[05:29:41] (I think it must be working for me because otherwise I wouldn't be able to use IRC)
[05:29:58] let me try proxy'ing through it again
[05:31:09] ok, so i can connect to it directly, just not to the new eqiad instance behind it
[05:31:26] which one, for example?
[05:31:37] eris.eqiad.wmflabs has address 10.68.16.62
[05:31:37] dzahn@bastion1:~$ ssh eris
[05:31:45] and then nothing happens
[05:32:00] but i see it being done in console log
[05:32:14] project "planet"
[05:32:16] ssh: connect to host eris port 22: Connection timed out
[05:32:42] it's the first eqiad instance i created but the project existed before
[05:32:57] connecting to the pmtpa instances isnt an issue
[05:33:28] Is the problem maybe this? https://wikitech.wikimedia.org/wiki/Labs_Eqiad_Migration_Howto#Security_Groups
[05:33:31] (I just wrote that last night)
[05:34:05] ah! checks
[05:34:05] also, btw, if you're accessing via bastion-restricted (in pmtpa) you need to specify domain.
So it would be 'eris.eqiad.wmflabs' rather than just 'eris'
[05:34:16] but from bastion-eqiad it's just 'eris' since you're in the same region
[05:34:18] dzahn@bastion1:~$ host eris
[05:34:19] eris.eqiad.wmflabs has address 10.68.16.62
[05:34:27] ok, yea
[05:35:49] trying to proxy via bastion-restricted and using FQDN
[05:35:51] =channel 0: open failed: administratively prohibited: open failed
[05:36:03] looks at security groups
[05:36:53] yes, 10.4.0.0/21 for SSH
[05:37:04] but that is simply the default security group
[05:37:04] So that's tampa only
[05:37:10] that's not custom
[05:37:19] yea, changing rule
[05:37:45] ah, i have 2 security groups, one for pmtpa and one for eqiad
[05:37:57] but both use 10.4.0.0/21
[05:39:13] Failed to add rule.
[05:39:17] tries again
[05:39:39] should i delete the entire default security group for pmtpa and the instances there if i dont care?
[05:39:45] to keep them
[05:41:11] Successfully added rule. (when _not_ selecting default in source group)
[05:42:49] ok, that changed it from timeout to channel 0: open failed: administratively prohibited: open failed
[05:43:08] You should delete the instance if you don't want it. No need to delete the security group.
[05:43:21] which project is this, again?
[05:43:35] ok, i'm just editing the one for eqiad
[05:43:39] project planet
[05:47:18] ssh eris.eqiad.wmflabs.org
[05:47:18] ssh: Could not resolve hostname eris.eqiad.wmflabs.org: Name or service not known
[05:47:22] uhm
[05:47:27] still trying different ways
[05:47:52] 64 bytes from eris.eqiad.wmflabs (10.68.16.62): icmp_req=1 ttl=64 time=0.618 ms
[05:47:56] on the same bastion1
[05:48:00] wth
[05:48:13] mutante|away: I think that the instance may never have come up. Unless you know otherwise.
[05:48:27] It looks like you checked role::puppet::self when creating the instance, is that right?
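Editor's sketch of the naming rule described at [05:34:05] and [05:34:16]: a short name works from a bastion in the same region, while a bastion in the other region needs the fully qualified internal name. The helper `ssh_target` and its parameters are hypothetical and are not part of any Labs tooling; the name pattern `<instance>.<region>.wmflabs` is assumed from 'eris.eqiad.wmflabs' in the log.

```python
def ssh_target(instance, instance_region, bastion_region):
    """Return the hostname to use from a given bastion.

    Assumption: internal names look like <instance>.<region>.wmflabs,
    as with 'eris.eqiad.wmflabs' in the log. Hypothetical helper.
    """
    if instance_region == bastion_region:
        # Same region: the short name is enough.
        return instance
    # Other region: spell out the region-qualified internal name.
    return f"{instance}.{instance_region}.wmflabs"


print(ssh_target("eris", "eqiad", "eqiad"))
print(ssh_target("eris", "eqiad", "pmtpa"))
```

Note that the suffix is the internal `.wmflabs`, not `.wmflabs.org`; the log shows `eris.eqiad.wmflabs.org` failing to resolve.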
[05:48:28] the console log showed "Login:"
[05:48:32] no
[05:48:38] i did on venus and mars
[05:49:02] it is definitely checked
[05:49:19] puppet status is at unknown
[05:49:38] and it never had a different status
[05:50:08] maybe i did after it was up but i know it doesnt work when checking anything during creation
[05:50:13] Mind if I delete eris and build a new one? It's just a fresh empty instance right?
[05:50:15] and i also watched the console log
[05:50:17] until it was done
[05:50:23] no, don't mind
[05:50:27] yes
[05:50:49] ok, just a moment...
[05:53:56] thanks
[05:56:42] mutante|away: try now?
[05:56:53] I changed the security rule again. I don't know what your previous change didn't work...
[05:57:21] *why
[05:58:53] andrewbogott: it works!
[05:59:11] eh, yea, why though:)
[05:59:18] but cool, ty
[05:59:33] i just copy/pasted the CIDR mask and 22,22 tcp
[06:00:28] oh, well, works, via bastion-eqiad
[06:00:50] via bastion-restricted it doesn't, but it's ok, i'm unloading the prod key anyways
[06:05:36] should the puppet status not be unknown anymore?
[06:05:59] tries to apply puppet::self again
[06:06:18] notice: Finished catalog run in 24.35 seconds
[06:08:31] that looks like i works, finished run again after getting ::self
[06:09:39] andrewbogott: yea, it looks all ok, minus this one part that it claims puppet status says Unknown in the web ui, while it actually finishes fine
[06:09:49] s/says/is
[09:04:44] Once migration to eqiad is complete, do we let a root know to purge the pmpta side of things?
[09:05:47] TheLetterE what I have seen is that this will be done eventually ...
no rush
[09:06:11] Thanks GerardM- I will sign on wiki to say I'm done :)
[09:07:33] !log deployment-prep restarted varnish and varnish-frontend on deployment-cache-text1
[09:07:34] Logged the message, Master
[09:08:47] TheLetterE, if you're migrating a whole project then make a note here: https://wikitech.wikimedia.org/wiki/Labs_Eqiad_Migration_Progress
[09:08:58] If you're talking about tools, I think Coren is keeping track of things some other way.
[09:09:28] Just a tool - it just said if we have purged pmtpa to sign on wiki, but I've just signed anyway. I'm all done.
[09:41:54] help!!!!!!
[09:42:00] according to https://wikitech.wikimedia.org/wiki/Labs_Eqiad_Tools_Migration
[09:42:10] first migrate-tool gabrielchihonglee-bot
[09:42:22] then finish-migration gabrielchihonglee
[09:42:43] while I do the second step, it shows finish-migration: command not found
[09:42:51] what should I do?
[09:43:23] gabrielchihonglee@tools-login:~$ migrate-tool gabrielchihonglee-bot Copy started Thu Mar 6 09:38:22 UTC 2014
[09:47:16] Copy complete Thu Mar 6 09:43:12 UTC 2014
[09:52:40] Gabrielchihongle: you may need for Coren to show up here… he's in Montreal so it will be a while.
[09:52:52] okey, thanks!
[10:38:51] (CR) Hashar: "IIRC Carl told me he was not too confortable with nodejs and regex mixed together. Yuvi you might want to take over." [labs/tools/grrrit] - https://gerrit.wikimedia.org/r/116996 (owner: AzaToth)
[13:56:56] "finish-migration" says: "User databases owned by this tool on the replicas are not renamed but the new tool credentials have been granted access.". But my database was renamed from pXXXgYYY_abc to sZZZ_abc. Was this intended, that I have to change all programs?
[13:59:08] the merge script is merging cronjobs too?
[14:00:29] Coren: ping O_O
[14:00:55] Ping.
[14:01:24] apper: That may have been made clearer; databases /on the replicas/ aren't renamed, but those on tools-db are.
[14:01:56] Coren: mabye i go to move delinker too equiad...
will all cronjobs etc by copyed and is the "sys" the completly the same?
[14:01:59] Coren: ahhh, okay, thanks. The replicas are the replicated wiki databases...
[14:02:02] Steinsplitter: Yeah, but you can split them back up again if you want. It just was unreliable to try and make two sets of migration from "stuff-from-login" and "stuff-from-dev"
[14:02:11] Coren: than everything is right ;)
[14:03:05] just migrated my first tool, everything went right
[14:03:13] And yes, the change of DB username is a major annoyance, but this was the only right time to correct those (the current scheme had issues that prevented a number of requested features)
[14:03:51] Coren: yes, it's a bit annoying, but it's okay
[14:04:29] Coren: i see the db for delinker is on equiad but the tool is still on the *old*, is this possible?
[14:04:35] apper: Although, if you wanted your code to be /really/ fance you could have it use table names derived from the username at runtime. :-)
[14:05:02] Steinsplitter: No; but the migration script only /copies/ it does not delete in any way.
[14:05:26] Coren: yes, the scipt dos not change database names etc i guess?
[14:05:35] Steinsplitter: Not in pmtpa no.
[14:05:49] Coren: if i break somthing, you can revert from backup? :O
[14:05:56] Coren: hehe ;). I normally only have one php include file which has functions to give a database connection, so I only have to change one line in one file, this is okay
[14:06:07] Steinsplitter: Not really; but not files I harmed by the process. It's just a bit manual.
[14:07:00] k, thx
[14:09:28] what should I do on pmtpa? delete all files? destroy database? or just leave everything as it is?
[14:11:24] apper: You can do any/all of that, or keep the data around as a safeguard. At the end of the migration period, tools marked as having been migrated will just go away with pmtpa.
[14:12:37] Coren: "marked as having been migrated" where?
Just using finish-migration or in the table at https://wikitech.wikimedia.org/wiki/Tool_Labs/Migration_to_eqiad ?
[14:13:25] apper: finish-migration having ran. Marking it in the table is appreciated though, as it means there is no question.
[14:13:47] okay
[14:14:19] Also, the wiki page gives a good overview but many people haven't marked it.
[14:22:21] Coren: Any news about the mail thingy?
[14:22:39] hedonil: Mail is on todo today.
[14:22:51] You shouldn't get the error anymore, at least, but also no mail.
[14:23:06] There are issues atm with the mail going to the wrong place I'm working out.
[14:24:31] Coren: I still get the error (but it's also the last chance to run ultd. scripts on tools-logon w/o being email'd ;) )
[14:25:10] Ah, right, the destination exists but since you never got mail, the mailbox doesn't.
[14:27:03] Coren: yep yep. mutt is looking for a matching folder in /var/mail, but of course there isn't any yet
[14:50:38] Coren: I'm thinking about a wikipage with some facts & figures ( Volumes, shared Resources, user based api's etc.) as an overview
[14:51:14] hedonil: It's a Wiki™ :-)
[14:51:17] Coren: Would you mind providing a blank page Tool Labs\Resources overview or similar ?
[14:51:49] Coren: ha hello :-] I added you as a reviewer to a bunch of beta eqiad changes :]
[14:52:16] hedonil: https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Overview
[14:52:22] hashar: Lemme go see those.
[14:52:47] Coren: fine. there we go
[15:02:27] hashar: All three should be merged for you now.
[15:02:43] !!!
[15:02:56] Coren: and while you are in an interrupted state I could use https://gerrit.wikimedia.org/r/117193 for contint :]
[15:03:09] it git::clone a bunch of repo on the labs jenkins slaves
[15:05:49] err: /Stage[main]/Role::Applicationserver::Common/Labs_lvm::Volume[second-local-disk]/Exec[create-vd-second-local-disk]/returns: change from notrun to 0 failed: /usr/local/sbin/make-instance-vol 'second-local-disk' '100%FREE' 'ext4' returned 1 instead of one of [0] at /etc/puppet/modules/labs_lvm/manifests/volume.pp:52
[15:05:52] I love puppet
[15:05:55] lack of verbosity :]
[15:07:27] for some reason our puppet as exec { logoutput => False }
[15:07:28] (CR) coren: [C: 2] "ACHIEVEMENT UNLOCKED" [labs/toollabs] - https://gerrit.wikimedia.org/r/113335 (owner: Tim Landscheidt)
[15:11:00] hashar: That error message is completely incomprehensible.
[15:11:31] hashar: Can you tell me what instance this is running on so I can debug?
[15:12:26] Coren: https://gerrit.wikimedia.org/r/117199 would make the labs_vm exec calls to log output on failure
[15:12:35] the instance is deployment-apache02.eqiad.wmflabs
[15:12:44] puppet fails for other reasons as well
[15:13:07] and I forgot to says to you "good morning"
[15:16:50] hashar: Ah, hell. There's something funky going on when the volume name contains dashes.
[15:17:10] ahh
[15:17:13] * Coren did not know that.
[15:17:13] would underscore bellowed ?
[15:17:28] Yes. But you can't change it yet; the volume was already created.
[15:17:32] Hang on.
[15:17:33] I suspected dash would cause an issue but tried nonetheless
[15:17:52] It just causes the device name to become 'vd-second--local--disk'
[15:17:57] Which confuses the mount.
[15:18:36] yeah I rebooted the other instance deployment-apache01.eqiad.wmflabs and it is not happy :D
[15:18:37] I'd rather fix the issue than change the volume name -- you are /not/ going to be the only one who does that.
[15:19:16] ...
that's not related; the worse this can to is fail to mount /srv
[15:22:00] Coren, when you have a minute can you send those two lists of self-hosted puppet instances? (I'm assuming you already have a script handy to produce the lists, otherwise I can sort it out myself)
[15:22:59] andrewbogott: Better yet, I already /have/ the list if only I can remember where it is. :-)
[15:23:07] that'll do
[15:25:13] Coren, indeed /srv can't be mounted :D The disk drive for /srv is not ready yet or not present. Continue to wait, or Press S to skip mounting or M for manual recovery
[15:25:42] hashar: ... stupid ubuntu.
[15:25:46] :-]
[15:26:03] there must be some option in fstab to skip that
[15:26:49] hashar: No, that's part of rc.something rather. It's also comletely insane; not havin /usr is reasonable to stop boot but some random /srv?
[15:28:11] yup that is a bit insane
[15:30:59] hashar: -02 is now correct with the fixes to the class.
[15:31:18] good!
[15:31:23] mind fixing up deployment-apache01.eqiad.wmflabs while at it ?
[15:31:50] hashar: I don't think I can; that idiocy seems to happen before puppet gets a chance to run.
[15:32:21] well puppet did run
[15:32:31] before the boot, no?
[15:32:39] the instance crashed when the labs_lvm got applied and I attempted to reboot it
[15:32:58] Right. But now puppet can no longer run on it; it stops booting /before/ it tries.
[15:33:07] yup
[15:33:19] cant we access to the console to press S (skip) ?
[15:33:36] hashar: No. There is no console to access.
[15:33:52] Hmmm.
[15:34:02] I *might* be able to send it a fake keypress.
[15:34:18] But it's almost certainly easier to delete and recreate.
[15:34:53] Coren: ok deleting :]
[15:35:05] ah
[15:35:17] can you try the fakekeypress or console access?
[15:35:28] cause if I recreate it I got to update a bunch of puppet/mw code to change the IP address
[15:37:42] hashar: What's the kvm instance number?
(I-XXXXXX)
[15:42:49] Coren: I-0000007d.eqiad.wmflabs
[15:43:05] sorry for the trouble :-(
[15:45:43] I just hit 's' on its "keyboard". No help, I'm afraid. If you are quick in delete/recreate you'll probably get the same IP. :-)
[15:47:14] hehe
[15:47:22] deleting it
[15:48:07] Coren, fingers are being pointed at labs for bots editing logged out. :p
[15:48:41] Cyberpower678: And why would labs be pointed at for bugs in bots?
[15:49:00] Coren, because people don't know what they're talking about.
[15:49:24] Because the issue happened simultaneously in 3 different bots/
[15:49:46] No bot should edit without &assert=bot anyways.
[15:49:58] Where is this discussion?
[15:51:42] Coren, https://en.wikipedia.org/wiki/Wikipedia:AN#Possible_bot_malfunctioning.3F
[15:53:30] $ ssh dumps-1.eqiad.wmflabs
[15:53:30] The authenticity of host 'bastion2.wmflabs.org (208.80.153.202)' can't be established.
[15:53:33] RSA key fingerprint is 45:e3:61:b6:e4:0a:69:fd:95:31:89:2c:0b:db:47:3b.
[15:53:38] Someone please update V
[15:53:40] https://wikitech.wikimedia.org/wiki/Help:SSH_Fingerprints
[15:59:08] Cyberpower678: https://bugzilla.wikimedia.org/show_bug.cgi?id=62288
[16:05:57] Hi all. I'm having trouble looging into my newly created eqiad instances
[16:06:15] I have set up my ssh config with the ProxyCommand directive (a long time ago)
[16:06:34] my config includes statements for pmtpa and eqiad
[16:06:52] the former always worked well, the latter not at all
[16:07:13] it seems that I get onto the eqiad bastion but then it disconnects
[16:07:14] dschwen: You likely will have to log in (and/or proxycommand) though bastion-eqiad.wmflabs.org
[16:07:20] Ah.
[16:07:26] Hm. Need dpaste.
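Editor's sketch for the [15:49:46] remark that no bot should edit without `&assert=bot`: the MediaWiki API accepts an `assert` parameter, and with `assert=bot` a request from a session without the bot right is rejected rather than carried out. The helper name `with_bot_assert` and the sample page title are hypothetical; the sketch only builds the parameter dictionary and does not log in, fetch a token, or send anything.

```python
def with_bot_assert(params):
    """Return a copy of MediaWiki API parameters with assert=bot added.

    Hypothetical helper: it only prepares parameters. A real edit also
    needs a logged-in session and an edit token, which are left out here.
    """
    guarded = dict(params)      # do not mutate the caller's dict
    guarded["assert"] = "bot"   # ask the API to verify the bot right
    return guarded


edit_params = with_bot_assert({
    "action": "edit",
    "title": "Sandbox",         # hypothetical page title
    "appendtext": "test",
    "format": "json",
})
print(edit_params["assert"])
```

Routing every write request through one such wrapper is one way to make sure the guard is never forgotten in an individual call.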
[16:08:12] and how do I ssh from a pmtpa instance to an eqiad instance
[16:08:20] I'm sure there is an FM to R
[16:08:33] but I don't see it
[16:09:58] dschwen: Use fqdn; 'instancename.eqiad.wmflabs'
[16:10:00] ok, looks like I've got to update my ssh config
[16:11:51] Coren: thanks for your consistent communication and help throughout this migration (of course)
[16:12:54] :)
[16:12:58] sumanah: All in a day's work, ma'am.
[16:13:10] * sumanah curtsies
[16:16:24] ssh: Could not resolve hostname fastcci-master.eqiad.wmflabs: Name or service not known
[16:17:01] but the instance shows up in my instance list
[16:17:10] created about 30mins ago
[16:26:41] host tools-login.eqiad.wmflabs does not return any result
[16:26:53] host tools-login.pmtpa.wmflabs works as expected
[16:26:54] ... what?
[16:27:15] Ah. It's entirely possible that eqiad names are only visible from eqiad bastions.
[16:27:17] this seems to be the most basic thing needed to migrate
[16:27:31] how am I supposed to rsync my stuff?
[16:27:33] dschwen: login(-)eqiad
[16:27:53] or is /data/projects synced automatically
[16:27:53] hedonil: That's public IP.
[16:28:19] dschwen: Wait, you want to migrate tools from the tools project or a non-tools project?
[16:28:55] no, my instances
[16:29:22] I appreciate your responsiveness, but this is not specifically tools related ;-)
[16:29:57] I would just like to rsync stuff from my old pmtpa instances to my newly created eqiad instances
[16:30:08] Aaah.
[16:30:35] and I would like to be able to log into my eqiad instances
[16:30:50] (but that's a bonus ;-) )
[16:30:56] Logging into your eqiad instances is easy, but you have to do it from the eqiad bastions. :-)
[16:31:11] I have the ProxyCommand setup
[16:31:50] The rsync might need to use IPs; but you might also want to talk to Andrew first. My understanding is that he's doing a rsync of most /data/project things in the background and you might just be a rename away.
:-)
[16:31:53] as I said it does login to the bastion but not one step further
[16:32:27] Before you proxycommand, have you tried doing the steps yourself?
[16:32:35] I really don't get it. Am I exceptionally stupid, or why do these super basic steps fail for me?!
[16:32:53] one sec
[16:32:59] dschwen: I don't know; you're the first to report issues with logging into their eqiad instances.
[16:33:32] ssh bastion-eqiad.wmflabs.org
[16:33:32] If you are having access problems, please see: https://wikitech.wikimedia.org/wiki/Access#Accessing_public_and_private_instances
[16:33:32] Permission denied (publickey).
[16:33:35] ugh
[16:33:37] what?
[16:34:31] You have the same username in labs as you do on your own box?
[16:36:16] of course not ... :-(
[16:36:40] So you're likely to have better luck with ssh labsusername@bastion-eqiad.wmflabs.org :-)
[16:36:57] yeah
[16:37:19] anyhow from my box "ssh tools-login.pmtpa.wmflabs" and "ssh tools-login.eqiad.wmflabs" both work
[16:37:22] so far so good
[16:38:08] "ssh fastcci-master.eqiad.wmflabs" does hang
[16:38:23] (1h old eqiad instance)
[16:39:13] Just hangs there? Have you checked your security groups? IIRC, they are copied verbatim from pmtpa so if you had ssh restricted to pmtpa there, it'll also be the case here.
[16:39:17] "tools-login:~$ ssh fastcci-master" hangs, too
[16:39:29] oooooooooh
[16:39:42] I need a new network in there?
[16:40:01] 10.0.0.0/8 should be okay in practice unless you fear opsen logging into your instances from prod. :-)
[16:40:24] ok
[16:40:31] it was 10.4.0.0/16
[16:40:39] actually /21
[16:41:00] what would be the corresponding subnet on eqiad?
[16:41:08] hiyaa
[16:41:20] couldn't that have been changed automatically?
[16:41:22] andrewbogott: how does an opsen access an eqiad labs instance?
[16:41:26] through bastion-restricted?
[16:41:38] and why am I the only one stumbling over this :-/
[16:42:08] ottomata: bastion-restricted or bastion-restricted-eqiad
[16:42:14] ottomata: andrewbogott just created bastion-restricted-eqiad
[16:42:15] that:)
[16:42:21] If the former you'll have to specific .eqiad.wmflabs as part of the instance name
[16:42:37] yeah having trouble at the moment, will try -eqiad
[16:42:42] dschwen: you're not
[16:43:27] dschwen: I think that was basically impossible without doing it all by hand. IIRC, andrewbogott meant to write an email about it but may have forgotten.
[16:43:38] Unable to create and initialize directory '/home/otto'.
[16:43:56] ok, so what is the corresponding subnet then?
[16:43:59] Coren: I wrote https://wikitech.wikimedia.org/wiki/Labs_Eqiad_Migration_Howto#Security_Groups
[16:44:14] dschwen: 10.68.0.0/20
[16:44:19] dschwen: 10.68.0.0/20
[16:44:22] hmm, andrewbogott, i get that when trying to ssh to bastion-restricted-eqiad.wmflabs.org
[16:44:25] ok, thx
[16:44:38] looks like it kinda works though
[16:44:39] but I guess there os really no harm in expanding to 10.0.0.0/8
[16:44:40] but then it logs me out
[16:44:45] Although I've seens some weird behavior when there are two different rules for the same port. So you may have to use 10.0.0.0/8 -- that can be your fallback
[16:44:52] ottomata, just a minute, I will look.
[16:45:20] since we have 2 separate security group groups, one for pmtpa and oen for eqiad
[16:45:27] ah ok, no probs, i just got in through usual bastion-restricted
[16:45:32] i did not use the 10.0.0.0/8
[16:45:45] but the actual IP range for eqiad labs is 10.68.0.0/20.
[16:45:49] mutante, does bastion-restricted-eqiad.wmflabs.org let you in?
[16:46:18] hold on, gotta disconnect from prod
[16:46:50] Hi All
[16:47:28] Please I have problem in access tool labs
[16:47:34] any one here
[16:47:35] andrewbogott: yes, that works
[16:47:44] changed ssh config
[16:47:52] thanks!
[16:48:04] Hm, so why not for ottomata...
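Editor's sketch of how the three source ranges mentioned in the channel (10.4.0.0/21, 10.68.0.0/20 and 10.0.0.0/8) compare, using Python's standard `ipaddress` module. The constant and function names are hypothetical. The sample address 10.68.16.62 is the one the log shows for eris.eqiad.wmflabs; it stands in for "some eqiad address" only, since the log does not give the bastion's internal address, which is what an SSH source rule would actually be matched against.

```python
import ipaddress

# Ranges quoted in the log.
PMTPA_DEFAULT = ipaddress.ip_network("10.4.0.0/21")   # rule copied from pmtpa
EQIAD_QUOTED = ipaddress.ip_network("10.68.0.0/20")   # range given for eqiad
FALLBACK = ipaddress.ip_network("10.0.0.0/8")         # suggested fallback


def rule_allows(source_ip, rule):
    """True if a connection coming from source_ip falls inside the rule's range."""
    return ipaddress.ip_address(source_ip) in rule


# Sample eqiad address taken from the log (eris); not a known bastion address.
sample = "10.68.16.62"
for label, rule in [("pmtpa default", PMTPA_DEFAULT),
                    ("eqiad quoted", EQIAD_QUOTED),
                    ("fallback", FALLBACK)]:
    print(f"{label}: {rule} contains {sample}? {rule_allows(sample, rule)}")
```

By this arithmetic the sample address lies outside both 10.4.0.0/21 and 10.68.0.0/20 (which ends at 10.68.15.255) and inside 10.0.0.0/8. Whether that matters for a given rule depends on the source address of the connecting bastion, which the log does not state.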
[16:48:21] ottomata: does bastion-eqiad.wmflabs.org work for you, or does it hang up when you connect?
[16:48:26] because of his home dir being gone?
[16:48:44] Sure, but it should create a homedir when he logs in...
[16:48:59] same thing last night for me
[16:49:06] you had to add the security group for me
[16:49:12] for some reason we didnt quite get
[16:49:26] when i tried to add it, something didnt quite work
[16:49:45] then you do seemingly same thing, worked.. hmmm
[16:50:12] Nah, last night you accessed bastion fine, it was the next hop that was the problem I think
[16:51:05] ottomata: can you connect directly to bastion without proxying anything?
[16:51:21] andrewbogott: wasnt sure which error he gets, ok, yea
[16:51:54] re: 2 different rules for same port. i deleted the old one after adding the new one
[16:53:21] andrewbogott: no i cannot connect to bastion-eqiad
[16:53:24] it disconnects
[16:53:30] ottomata: ok, stay tuned...
[16:53:33] debug1: Server accepts key: pkalg ssh-rsa blen 279
[16:53:33] Connection closed by 208.80.155.129
[16:53:33] well now I get a step farther
[16:53:37] please can anyone help me
[16:53:41] I get logged into my new instances
[16:53:45] I still don't see anything in https://wikitech.wikimedia.org/wiki/Help:SSH_Fingerprints
[16:53:47] I get the MOTD
[16:53:49] Ouda, please tell us your actual question
[16:53:51] bastion-restricted-eqiad gets farther
[16:53:55] but then I get "Connection to fastcci-worker1.eqiad.wmflabs closed."
[16:53:59] i see the welcome motd or whtaever
[16:54:01] How am I supposed to ssh in if I don't know what fingerprints to trust? :<
[16:54:04] The last Puppet run was at Thu Mar 6 16:41:06 UTC 2014 (12 minutes ago).
[16:54:05] blabla
[16:54:20] Quiet!. :p
[16:54:22] then just Connection to bastion-restricted-eqiad.wmflabs.org closed.
[16:54:38] Coren, something interesting is happening with bastion-restricted-eqiad.
/home is read-only
[16:54:50] andrewbogott,I want to active my account pecase it is diables
[16:55:00] my username is : Mohame Ouda
[16:55:11] andrewbogott: It might have gotten mounted before manage-nfs-volumes got to it if you were really really fast. :-)
[16:55:55] andrewbogott: Lemme see.
[16:56:02] really fast at what?
[16:56:09] Ouda, disabled how, by whom, why?
[16:56:31] I don't now , I can't login by it now
[16:56:47] could you please check its status
[16:56:56] andrewbogott, do I need to manually copy stuff on /data/project ?
[16:57:07] Ouda: Ah, so the issue is that you cannot log into tools-login?
[16:57:34] yes exactly
[16:57:54] dschwen: Yes. That or open a bugzilla ticket asking me to move everything.
[16:58:10] (But in that case I will probably shutdown all instances temporarily to get a safe copy.)
[16:58:10] andrewbogott: create instance, and have /home mounted before manage-nfs-volumes has a chance to set the ACLs right.
[16:58:31] Coren: Ah, so it's not /me/ being fast, it's Nova. I understand.
[16:58:37] Shall I just reboot in that case?
[16:59:13] andrewbogott: That should work. I can't remount /home because there are people on it anyways.
[16:59:35] well… rebooting will be unpopular then, but...
[16:59:40] is it faster if you move it?
[16:59:50] mutante, ottomata, whoever, I'm about to reboot bastion-restricted-eqiad
[16:59:57] Oh, duh, people is me. :-)
[17:00:02] i.e. more efficient
[17:00:08] dschwen: Not necessarily. I'd encourage you to just copy the files that you need and leave the rest to perish.
[17:00:33] we'll I need everything there. I just copied what I needed from the toolserver
[17:00:49] Coren: hm, still read-only
[17:00:55] I'll reboot and we'll see what happens
[17:01:28] andrewbogott: go ahead i can't log in anyway :)
[17:01:29] The kernel might be too smart with caching for its own good. :-)
[17:01:50] yep, got disconnected
[17:01:58] Coren, reboot, still read-only
[17:02:20] Looking now.
[17:02:33] but then here we go again. Logged into maps-wma1.pmtpa.wmflabs, I try to "ssh maps-wma1.eqiad.wmflabs" and get...
[17:02:35] ssh: Could not resolve hostname maps-wma1.eqiad.wmflabs: Name or service not known
[17:02:45] yeah got some readonly as well
[17:02:54] ottomata: do you what fingerprints to trust for bastion2?
[17:03:05] labstore.svc.eqiad.wmnet both /project/deployment-prep/project and /project/deployment-prep/home
[17:03:11] sounds vaguely familiar, what dschwen said, i also saw that once, but then it worked again
[17:03:13] * do you know
[17:03:26] * Coren is looking at this.
[17:03:43] Nemo_bis: no?
[17:03:46] I think there's a race condition between adding the ACLs that make break things. Shouldn't be hard to track down.
[17:03:47] andrewbogott , could you please check my account
[17:04:10] dschwen: maps-wma1.eqiad.wmflabs resolves for me in both eqiad and pmtpa
[17:04:25] Sob, is everyone creating eqiad instances without ssh'ing into them? Or just blindly typing "yes"?
[17:04:35] I'm glad it does for you ;-)
[17:04:43] ns cache issue?
[17:04:47] Ah, I think I see the issue. mountd is being "helpful" and caching permissions.
[17:04:58] * Coren convinces it not to.
[17:05:01] Ouda, it looks to me like you are trying to connect with your wiki username rather than your shell name.
[17:05:53] andrewbogott, ooh , I think I forget my shell name . please how can I get it
[17:06:29] Ouda: it should be visible here: https://wikitech.wikimedia.org/wiki/Special:Preferences
[17:06:35] 'Instance shell account name"
[17:06:37] andrewbogott: You on bastion-restricted atm?
[17:06:54] Coren: yes, hang on...
[17:06:59] that better?
[17:07:29] I don't get it. It's exported rw, the filesystem is rw, and the mount is rw.
[17:07:37] And it works on most instances.
[17:07:42] * Coren digs deeper.
[17:08:05] andrewbogott .
thank you sooo much I loged it now
[17:08:13] Coren, i don't know how the bastion 'restricted to' settings work… possibly that's messing with us somehow?
[17:08:15] Seems unlikely
[17:08:20] Ouda: Great! You're welcome.
[17:08:40] Got it, filed a bug https://bugzilla.wikimedia.org/show_bug.cgi?id=62328
[17:09:58] rebooted my maps-wma1.pmtpa instance and now I cannot log back in :-(((
[17:10:28] labs is pretty bugged for me right now :-(
[17:11:12] my fastcci-master.eqiad instance was showing a ton of errors in the console, the last being about eth0 not being able to attach to a socket or something
[17:11:17] reboot did not help
[17:11:25] ended uop deleting and recreating the instance
[17:12:16] dschwen: the one about IPv6? afaict that existed before
[17:12:58] it shows up in the new instance and is just ntpd complaining
[17:13:03] but I get the same issue
[17:13:06] Coren: two questions: have i still toollabs database access on pmtpa after i run migration script? how do i know if another tool is migrated or not? My tool i reading database content created by many other tools
[17:13:14] ssh logs in and gets disconnected right away
[17:13:32] sorry, but this is a bit frustratig
[17:15:11] dschwen: What error for ssh?
[17:15:34] on the pmtpa instance it just hangs after offering my private key to my instance
[17:16:03] on the eqiad instance, it just disconnects after giving me the MOTD
[17:16:44] pmtpa sounds like /public/keys is unavailable for key verification, eqiad: not a clue.
[17:16:55] debug1: Exit status 254
[17:18:08] screw this, I will postpone the migration until these kinks are ironed out
[17:18:31] unfortunately even my unmigrated instance is inaccessible :-(
[17:21:19] dschwen: what project and instance?
[17:21:28] maps-wma1
[17:21:38] which is that, the project or the instance?
[17:22:15] maps is the project
[17:22:26] isn't that the standard naming scheme for instances?
[17:22:36] project-instancesuffix [17:22:45] often, not always [17:22:56] we'll I'm a good citizen [17:23:21] :) [17:23:24] thank you [17:23:50] So… you are using proxycommand or key forwarding for access? [17:29:13] dschwen: there was a gluster issue that was messing with maps-wma1.pmtpa.wmflabs. As for maps-wma1.eqiad.wmflabs I can't explain; it is working fine for me. [17:30:29] dschwen: I can ping maps-wma1.pmtpa.wmflabs from the eqiad instance but not the other way around. May be just a dns delay. [17:32:06] * andrewbogott goes back to sleep [17:33:25] thx andrewbogott_afk [17:38:00] Merlissimo: No, the tools-db is (annoyingly enough) a per-datacenter affair because it was virtual and is now a physical server. [17:38:13] (Unlike the replicas which were already physical) [17:38:51] andrewbogott_afk: FYI: It really seems to be a caching issue of some description; bastion-restriced now has writable home and I did nothing. [17:42:28] Coren: yes i know that there are two different tools-db. But how do i know if a tool X is still on tampa or already on eqid, so that i can send a query to the database server used by tool X. if i have stl access on tampa dbs i could simply create an ssh tunnel [17:43:59] Merlissimo: You do; your previous credentials will keep working in pmtpa until it is shut down. Figuring out whether a tool has migrated from eqiad is a bit more complicated, however; there is no definitely correct method because not everyone does it the same way. [17:46:05] i am collecting data from six other toollabs tools. checking manually every day if the maintainer have migrated a tool is not a good idea [17:46:08] Testing for he presence of ~tools.toolname/...DATA.olduser works, if the tool was migrated with migrate-tool, and the tool's home's permissions have not been restricted. [17:47:01] ok, i'll try to implement this [17:47:39] I should expect it might be simplest to test for the appearance of the database on the eqiad tools-db though. [17:49:47] sooooo... 
I added a public key on my pmtpa instance to authorized_keys and tried to use the private key to log in from the equiad instance to the pmtpa instance [17:49:49] no dice [17:50:01] still tells me permission denied [17:50:11] is the local authorized_keys file not sourced? [17:53:28] heya YuviPanda: yt? [17:56:10] hey Coren, do you know if Instance proxy works in eqiad labs? [17:56:46] ottomata: Not as of last night but my understanding is that this is on Ryan's plate. [17:57:03] ah! ok cool [17:57:05] good to know [17:57:11] dschwen: No, only the keys in /public/keys are [17:57:31] ok cool, that is a big deal for most of the analytics instances [17:57:36] i was starting to migrate some of them but will wait [17:58:46] ottomata: My understanding is that this isn't all that far off. [17:58:57] Keys are added to that through the gerrit interface? [17:59:38] ok cool, tahnks [18:01:29] oh, nevermind, found it [18:45:32] Coren: Any clue why wikilink [[toollabs:xxx]] doesn't work on wikitech? How can this be fixed? [18:47:56] coren any known issue with tools-webserver-01 ? [18:48:23] hedonil: Hm; I dunno if that's even in the interwiki table there. [18:48:55] matanya: It's crap? :-) Beyond that no, but as things migrate off it pressure should be lower. Lemme go check. [18:49:41] matanya: It seems okay to me. What issue are you having? [18:50:46] Coren: https://dpaste.de/20gg [18:50:57] i'm not sure it is software issue [18:52:28] matanya: TUSC isn't in pmtpa at all anymore (it migrated) and since you are trying to connect to a webserver rather than the proxy, you don't get the automagic redirection. [18:52:46] matanya: You should be using tools-webproxy not tools-webserver-whatever. [18:53:05] Coren: i just used the web interface of the too [18:53:06] *l [18:53:31] asking as a user rather than as a dev this time :P [18:53:41] Yes, but by connecting to the backend server (tools-webserver-01); it's not on that backend anymore. 
If you use the proxy (tools-webproxy) it'll find the tool where it lives. :-) [18:53:46] Ah! [18:54:03] Then the tool's maintainer needs to do what I just said. :-) [18:56:23] that is magnus masnke [19:11:04] Coren: What's the current amount of storage and RAM for tools-db eqiad? [19:11:18] hedonil: The new one, you mean? [19:11:24] Coren: yep [19:12:29] It's got 32G or ram, but it shares it with the postgresql slave. There's 8T set aside for the db atm. Why? [19:13:04] Coren: filling the doc step by step .. https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Overview [19:19:44] hmm, is eqiad.wmflabs dns messed up or sumpin? [19:19:55] otto@bastion-restricted1:~$ ping wikimetrics-staging1.eqiad.wmflabs [19:19:55] ping: unknown host wikimetrics-staging1.eqiad.wmflabs [19:21:19] you're not the first to report it, seems that happens sometimes but then it works again [19:21:56] i could like look it up in DNS using "host", and then right after when ssh'ing to it and didn't find it [19:22:12] but before we even got to look closer, it was normal again [19:22:19] There is something strange going on with DNS. I'm probably going to set up a caching dns server within the infrastructure to help; the openstack-provided one doesn't seem to handle a lot of load. [19:22:46] otto@bastion-restricted1:~$ host wikimetrics-staging1.eqiad.wmflabs [19:22:46] otto@bastion-restricted1:~$ host wikimetrics-staging.pmtpa.wmflabs [19:22:46] wikimetrics-staging.pmtpa.wmflabs has address 10.4.1.184 [19:34:35] Thanks hashy/Faidon eventhough you're not here right now... [19:35:30] for what? [19:39:44] Finding/fixing the varnish bug that was causing api sessions to not be passed back to the client properly on login so edits failed for bots/people... 
if people use an api [19:40:20] Though technically also to csteipp, for finding the bug in beta prior to it getting on production [19:40:31] * Damianz goes back to finding food [19:50:07] Coren: what is the ip range (ip4 and/or ipv6) bots will use from eqid, so that i can add it to autoblock-whitelist [19:50:46] That's like the whole labs 10. range unless you keep it up to date with everys erver provisioned afaik [20:01:47] Damianz: How's everything going? Cluebots all working OK? [20:03:02] Yes - the first restart (ie login) after paravoid pulled the varnish geoip stuff out, everything went back to awesome [20:06:46] Damianz: \o/ So you have CFT now? :p [20:07:31] cft? [20:09:47] Damianz: Copious Free Time (TM). [20:11:05] Not really... got some projects to work on and some work work todo [20:12:26] Damianz: :D [20:12:49] * a930913 vaguely remembers when he had CFT. [20:14:13] Damianz: Can you try fix the 8KB issue some time tonight though? :) [20:15:03] Maybe... need to fix a couple of other things to tidy stuff up tonight [20:49:53] petan: Is there a chance that you can make http://tools.wmflabs.org/paste/ work properly with https ? ( browser blocks unsecure resources) [21:10:42] * SamB finds the wording on https://wikitech.wikimedia.org/wiki/User_talk:SamB#Your_shell_access_was_granted confusing given that he never filed any request ... [21:12:56] SamB: Perhaps it should mention that one is filed automatically on your behalf. :-) [21:16:21] Just leave out the first half of the sentence? [21:29:26] hmm, I was going to ask about debugging a problem with url2commons, but it looks like it's not crapping out on me today ... [21:30:30] oh but my form contents are lost :-( [21:32:12] * SamB goes to see if he can workaround that with fiddler ... [21:44:33] Coren: I accidentally uploaded a web file to the pmtpa instead of eqiad, and now I can't get my migrated webstuff. [21:45:38] a930913: I'm not sure I understand; nothing you add to pmtpa will affect... 
oh, could it be something that would have created a public_html? [21:46:20] a930913: Just rm the public_html or rename it out of the way (in pmtpa); this is what the proxy uses to decide whether to show the pmtpa or eqiad webservice. :-) [21:49:20] Coren: I tried that, but it still 404s. Do I need to do something else, to kick it? [21:51:33] a930913: There shouldn't be no; what is the exact URL where that happens so I can figure out what's going on? [21:51:55] Coren: One of my tools misses the X-Forwarded-Proto header on eqiad (it was added here: https://bugzilla.wikimedia.org/show_bug.cgi?id=53689) [21:51:56] is some sort of stale cache a possibility? [21:52:27] SamB: It's /possible/, but I'd need to dig a bit deeper to find out. [21:53:29] danmichaelo: Can you try via https://tools-eqiad.wmflabs.org ? (ignore the SSL warning). This will help tell me where it gets dropped. [21:54:37] danmichaelo: (The current setting is "amusing" since it may involve three different layers of proxy) :-) [21:54:42] Coren: Yup, it's there if i use https://tools-eqiad.wmflabs.org [21:54:52] :) [21:55:27] danmichaelo: The amount of necessary trickery to get the proxying to switch automatically between two DCs is annoyingly brittle. Thankfully, it's short-term. [21:56:42] Coren: Must have been slow/cached. It looks fixed now. [21:57:26] a930913: I'm guessing apache is doing at least /some/ caching to avoid having to do a lookup in the filesystem at every request. [21:58:03] danmichaelo: Ah, I see why. Hmmm. [22:07:05] danmichaelo: Urk. The only way I can fix this with the extra layer of proxy would require the tools-eqiad.wmflabs.org to have a valid certificate. [22:07:20] * Coren tries to figure out a workaround. 
[22:07:45] Coren: Ok, I just added a javascript workaround for my tool [22:08:11] It will self-fix when tools-eqiad becomes tools [22:08:19] (In ~2 weeks) [22:08:33] Coren: Yup, no prob for a few weeks [22:09:13] danmichaelo: Sorry about that; the current proxy-to-proxy-to-proxy setup is an ugly hack to ease the transition while tools are split between the two DCs. [22:12:59] Coren: A small sacrifice for having everything else work so well :) [22:13:41] anyone know what would cause the issue would be on a labs (non tools) instance where you can connect to the instance, but it attempts to create your home directory and failes 'Creating directory '/home/jamesur'. Unable to create and initialize directory '/home/jamesur' ' (then sends the welcome message and closes the connection) I'm not sure why it's even attempting to create the home directory... I've been on the instance be [22:13:42] fore [22:14:43] does /home/jamesur exist already? [22:15:06] Coren: the normal setup being just proxy-to-proxy? [22:15:16] huh: it 'should' (I've been there before and put things there) but I can't tell because I can tell because I can't login [22:15:31] *but I can't tell because I can't login [22:15:34] SamB: In most cases, just proxy-to-webserver. [22:15:34] oh [22:15:51] it attempts to create it then closes the connection because it's unable to [22:16:03] Coren: why does the DC-split thing add TWO proxies? [22:16:05] but it connects... so it is clearly accepting my ssh key [22:16:06] jamesofur: Probably gluster being sick again. [22:16:33] I'll send him some flowers [22:16:43] SamB: To circumvent some issues about traffic between the two DCs. [22:17:23] SamB: One of them is completely transparent though. [22:18:04] how completely? like a generic socket port proxy? [22:18:11] s/socket // [22:27:33] Coren: in eqiad-tools tclsh is tclsh8.5, in pmtpa it is tclsh8.6, solution=? [22:36:05] gifti: Bleh. Needz upgrade. [22:38:20] gifti: Actually, both are installed. 
[22:38:54] indeed [22:39:01] but that's not my point [22:39:03] I'm thinking that which it uses by default is annoyingly order-dependent. You should probably use one explicitly. [22:39:32] sigh [22:39:35] ok [22:39:51] shouldn't it be the newest by default? [22:40:27] Arguably so; but that's obviously not how Ubuntu does it. [22:40:51] omg, ubuntu such fail, wow [22:41:17] ok, i will alias tclsh to tclsh8.6 and see if my scripts use it [22:41:50] Also, probably not 'ubuntu' so much as 'debian'. I doubt Ubuntu did something specific to override debian defaults. :-) [22:42:02] hm, ok [22:58:49] What bastion should I be using to log into instances in eqiad? [22:59:10] * bd808 thinks his ssh_config is borked [22:59:30] bd808 [22:59:33] ummm [22:59:39] i think you can use bastion.wmflabs.org [22:59:40] no? [23:00:00] hmmm... [23:02:37] bd808: bastion-eqiad.wmflabs.org , try that [23:03:15] mutante: That seems to get me a step farther. Sorry that I was debugging without reporting results [23:03:27] ottomata: and for ops there would now even be bastion-restricted-eqiad.wmflabs.org [23:03:31] Now it's my old friend: Unable to create and initialize directory '/home/bd808' [23:03:52] * bd808 shakes fist at all shared filesystems everywhere [23:04:02] bd808: my bet on that one is "gluster" [23:04:47] mutante: Could be. I'll try some other boxen. Thanks for the bastion tip though [23:05:13] yw [23:06:28] yeah mutante, that wasn't working earlier today [23:06:38] i couldn't log into it, some home dir problem [23:06:43] havne't tried since htough [23:11:20] ottomata: i think that's the gluster issue then .. [23:13:59] I thought gluster wasn't going to be used in eqiad? 
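On the tclsh8.5 versus tclsh8.6 question above, Coren's suggestion to "use one explicitly" can be sketched in shell. This is a minimal sketch, not something taken from the Tools setup: `first_available` is a hypothetical helper name, and the candidate list is only an assumption based on the two versions mentioned in the log. Note that a shell alias, as gifti proposes, is normally only expanded in interactive shells, so scripts started through a `#!` line would likely not see it.

```shell
#!/bin/sh
# Sketch only: first_available is a hypothetical helper, not an existing tool.
# It prints the path of the first candidate command found on PATH and
# returns non-zero if none of the candidates exist.
first_available() {
    for candidate in "$@"; do
        if command -v "$candidate" >/dev/null 2>&1; then
            command -v "$candidate"
            return 0
        fi
    done
    return 1
}

# Intended use (assumed preference order, newest first):
#   TCLSH=$(first_available tclsh8.6 tclsh8.5 tclsh) || exit 1
#   "$TCLSH" myscript.tcl
```

For a script that is run directly, naming the versioned binary in the shebang line (for example `#!/usr/bin/env tclsh8.6`) avoids depending on which version the plain `tclsh` name happens to point to.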
[23:14:14] * bd808 must have been wrong [23:16:23] eh, my only source is that i heard a gluster issue being mentioned earlier on this channel [23:16:56] and that we have them every once in a while since we have labs [23:17:48] though not the identical one [23:19:19] * bd808 shrugs [23:20:24] I was able get into tools-login. I'll check back in a bit to see if I can get to the new instance I just built in wikimania-support [23:20:48] Coren: Is tools-eqiad.wmflabs.org exactly what will be tools.wmflabs.org once pmtpa is gone? [23:20:56] Argh... really? "Cancel" has href=history.go(-1) on [[Special:FormEdit]] [23:21:02] anomie: That's the intent. [23:21:08] I havent' seen that since I was browsing geocities in 1998 when I was 7. [23:21:09] Damianz: You still around? [23:21:11] Coren: I created a new tool, it seems it's not been replicated to eqiad [23:21:47] Which is probably the first and last bit of javascript I encountered until my 16th [23:21:56] because it usually doens't work! [23:25:25] hedonil: It shouldn't need replication; it gets created there. What tool, and what happens? [23:25:46] Oh, new tool home creation might not have been turned on in eqiad yet. Lemme check. [23:26:03] Coren: hm. name: tools-info. I can't become tool in eqiad [23:27:04] Coren: become: no such tool 'tools-info' [23:27:25] hedonil: Yeah, it's the tool home creation. Give me a minute. [23:27:34] Coren: 'k [23:30:06] FWIW, I needed to add a security group rule allowing port 22 from 10.68.16.0/21 for ssh via bastion-eqiad to my eqiad instances [23:30:36] It looks like the security groups from pmtpa were copied over without that addition [23:30:38] hedonil: Try this? [23:31:03] Coren: yep thx [23:31:11] bd808: You probably want to replace the 10.4.0.0 by 10.0.0.0/8 rather; security groups don't cope with two rules with the same target. [23:31:32] Coren: Ok. 
It's working but I trust you :) [23:32:11] bd808: It only takes one of them, so you just turned off pmtpa (if it mattered); but I don't think you can reliably trust /which/ of the two it will take. [23:33:05] Ok. So a new rule for the whole /8 and drop the old rules [23:34:28] Coren: is the same change needed for the port 5666 rule? (I'm not sure what that allows) [23:35:05] bd808: That's Icinga monitoring, so probably so. [23:35:22] Okey doke [23:37:23] Coren: How about my old nemsis: Unable to create and initialize directory '/home/bd808' [23:37:41] Should I add the nfs role and reboot before I care too much? [23:37:49] Wait, where is this?' [23:37:57] This is on wikimania-scholarships.eqiad.wmflabs [23:37:59] Could someone help me identify what process is leaking memory like crazy? I always thought it was the cvn bot, but I'm thinking it is something else because I see it on the cvn apache instance as well, and I see a ganglia process with 12.8g of vmem usage !? [23:38:04] on cvn-app2 [23:38:13] New instance I just created a bit ago [23:38:14] http://ganglia.wmflabs.org/latest/?r=month&cs=&ce=&c=cvn&h=cvn-app2&tab=m&vn=&mc=2&z=medium&metric_group=ALLGROUPS [23:38:16] There is no gluster in eqiad. That's not supposed to happen. Lemme try to figuring out why. [23:38:24] http://ganglia.wmflabs.org/latest/?r=month&cs=&ce=&c=cvn&h=cvn-apache4&tab=m&vn=&mc=2&z=medium&metric_group=ALLGROUPS [23:38:48] Coren: ok. Thanks [23:39:04] I migrated the apache to an instance with more memory so it's less spiky now, but I really shouldn't have to weekly reboot a plain apache server for going OOM [23:39:16] bd808: what project is this? [23:40:01] Betacommand could you specify the bug # for me, the one needing a fix [23:41:07] Coren: It's the wikimania-support project [23:41:32] bd808: kk. [23:44:53] YEAY! I found the root issue. [23:45:02] \o/ [23:45:26] Our more recent image "helpfully" uses the FSC with NFS. 
This works fine, so long as you don't do a negative cache early before the ACLs for a new instance propagate. [23:45:28] drum roll [23:45:58] Now to find a good long-term fix (hedonil: you should be able to log in now) [23:46:39] * bd808 assumes that was for me actually [23:46:48] Coren: I can log in. Thanks [23:48:16] Coren: for one tool I have really big databases (...DATA.tools-db-dump.sql.gz has 637 MB), the point importing database on eqiad failed - when I came home, ssh connection was closed and not all data is in the database. What's the best way to do it again? [23:48:50] You mean during finish-migration? [23:48:57] ToAruShiroiNeko: give me a sec [23:49:26] Coren: yes [23:49:33] You want: zcat ...DATA.tools-db-dump.sql.gz | /usr/bin/mysql --defaults-file=replica.my.cnf -h tools.labsdb [23:49:43] Coren: thanks [23:49:53] apper: You might want to drop the database(s) first though, but it shouldn't be necessary. [23:50:06] okay, I'll try it [23:51:38] Coren: my unix know-how is very limited, will "nohup zcat ...DATA.tools-db-dump.sql.gz | /usr/bin/mysql --defaults-file=replica.my.cnf -h tools.labsdb" work? Or will "zcat ...DATA.tools-db-dump.sql.gz | /usr/bin/mysql --defaults-file=replica.my.cnf -h tools.labsdb" run even if the connection breaks half-way? [23:52:17] apper: nohup will probably not do what you hope it will. I would suggest running that in a screen instead. [23:52:39] Coren: okay, than I hope the connection will stay alive :). Thanks a lot [23:52:58] Don't worry, if the connection dies, screen will just detach. You can reattach it afterwards with 'screen -R' [23:53:05] ah, okay [23:53:28] ToAruShiroiNeko: bug 61813
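The re-import Coren describes above can be sketched as a small shell helper. This is a sketch under stated assumptions, not part of the migration scripts: `import_dump` is a hypothetical name, and the dump file name and mysql options in the usage comment are copied verbatim from the log. It uses `gzip -dc`, which decompresses to standard output in the same way as `zcat`. In the `nohup zcat ... | mysql ...` form asked about, `nohup` would only apply to the `zcat` side of the pipe, which may be why running the whole thing inside screen is the suggested route.

```shell
#!/bin/sh
# Sketch only: import_dump is a hypothetical helper, not an existing script.
# Usage: import_dump DUMP.sql.gz CLIENT [CLIENT_ARGS...]
# Streams the decompressed dump into the given client command and returns
# the client's exit status (the last command of the pipeline).
import_dump() {
    dump="$1"
    shift
    if [ ! -r "$dump" ]; then
        echo "cannot read dump: $dump" >&2
        return 1
    fi
    # gzip -dc writes the decompressed data to stdout, like zcat.
    gzip -dc "$dump" | "$@"
}

# Intended use, typed into a shell that was started inside `screen`:
#   import_dump ...DATA.tools-db-dump.sql.gz \
#       /usr/bin/mysql --defaults-file=replica.my.cnf -h tools.labsdb
```

If the ssh connection drops while this runs inside screen, the session detaches and the import keeps going; as noted in the log, it can be picked up again with `screen -R`.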