[00:04:28] Hello! I find myself unable to log in today (either to tools-login.wmflabs.org, where I'm used to logging in, or bastion.wmflabs.org, which appears to be its replacement?) [00:07:22] I'm hoping to update the data for my appliication today, so any assistance with the new login procedures would be appreciated. [00:08:31] (Ah, I assume the "eqiad outage underway" message is the explanation for now? Will try again later, then.) [00:12:27] OK, I think the outage is over. [00:12:32] chippy: try now, if you're still here? [00:12:55] trying [00:13:11] andrewbogott, unable to ssh into bastion :( [00:13:24] really? Which one? [00:13:37] bastion.wmflabs.org [00:14:06] says Permission denied (publickey). [00:14:12] chippy: now? [00:14:36] andrewbogott, in. for both. Thanks [00:14:45] great. [00:14:50] andrewbogott, also able to sudo okay [00:14:58] great! [00:15:18] i can't ssh into wikidata-test instance [00:15:19] woo! [00:15:24] can ssh into wdjenkins [00:15:32] (wikidata-dev) [00:18:53] aude: did wikidata-test work for you previously? [00:19:22] It's a migrated self-hosted puppet instance right? [00:21:18] …aude? [00:24:57] it worked [00:25:05] worked an hour or so ago [00:25:37] self hosted, yes [00:26:21] ok, looking [00:26:26] Migrated from pmtpa, or fresh in eqiad? [00:26:45] fresh in eqiad [00:28:01] did puppet run, before? [00:28:31] yes [00:28:39] all puppetized [00:28:44] 'k [00:29:20] mind if I reboot it? [00:29:26] the master died [00:29:27] go ahead [00:29:28] let me see why [00:30:04] I don't understand why this puppet failure would break logins [00:30:21] it shouldn't [00:30:26] permission denied [00:30:31] ok, I started the master [00:30:35] puppet should recover now [00:30:37] let's see auth [00:30:48] ok [00:31:22] try again [00:31:28] or... wait [00:31:47] oh for crying out loud [00:34:28] aude: should be fixed now [00:35:35] paravoid: works for me. Do we need to do whatever you did on other self-hosted boxes? [00:35:39] paravoid: it's good [00:35:40] thanks! [00:35:45] no [00:35:49] andrewbogott: no, bad sed [00:36:02] ok. [00:36:07] thanks for fixing [00:39:26] chippy, aude, does this close out the 'maps' project? Or are you still salvaging data from pmtpa? [00:40:01] andrewbogott, okay, files got and almost all copied across. Services appear to be running. double checking [00:40:29] whatever chippy says [00:40:49] :) [00:41:09] Krinkle: do you still want to do that file transfer tonight? Or are you on to other things/thwarted by the outage? [00:41:25] yeah the only thing im seeing is a very small permission thing but that's something i will fix [00:41:41] so I think I've copied all the data [00:41:45] double checking now [00:44:46] andrewbogott, yes I believe I'm all set now [00:44:58] great -- thank you! [00:45:56] hm, seven left [00:46:23] I can fix things on my end, one thing that may need looking at is getting the maps-warper.instance-proxy.wmflabs.org running. But I would imagine this magic would work when the migrations complete? [00:46:56] instance-proxy is getting phased out, but you should be able to set up a dedicated hostname using the 'manage web proxies' link in the sidebar [00:47:11] okay thank you :) [00:50:11] andrewbogott, done it. That is awesome. Thanks [00:51:12] np [00:54:19] Krinkle: I'll try to keep a window open… beep me if you're still around. [01:03:38] andrewbogott: Yeah, got pulled away. Gonna have to do this tomorrow.. Cutting it close. [01:03:55] tomorrow's ok, just let me know when you're ready. 
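For anyone hitting the same "Permission denied (publickey)" symptom described above, the usual client-side setup of the time hopped through a bastion to reach instances. A minimal ~/.ssh/config sketch; the shell name and key path are placeholders, and none of this helps when the refusal is server-side (a dead puppetmaster or broken key mounts), as it was here:

    # placeholders: use your own wikitech shell name and key
    Host bastion.wmflabs.org bastion-eqiad.wmflabs.org
        User shellname
        IdentityFile ~/.ssh/labs_rsa

    Host *.pmtpa.wmflabs
        User shellname
        IdentityFile ~/.ssh/labs_rsa
        ProxyCommand ssh -W %h:%p bastion.wmflabs.org

    Host *.eqiad.wmflabs
        User shellname
        IdentityFile ~/.ssh/labs_rsa
        ProxyCommand ssh -W %h:%p bastion-eqiad.wmflabs.org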
[01:05:01] andrewbogott: Can you give a prospect of the overall situation in tampa? Is it of the magnitude that the instance host machines are going to be shut down, or would there be a chance (without changing labs plans) to have it running there a little longer? [01:06:36] Well, I want to do a soft shutdown /before/ we start pulling apart the servers so there's a few days for stragglers to notice that things are getting shut down. [01:06:56] But you may get an extra day of slack since I can't shut down anything until beta is finished. [03:18:43] ok, let's build a trusty image! :D [03:19:01] andrewbogott: not sure if you're around, but I'll describe what I'm doing as I do it [03:19:35] I'm going to start off by adding the default trusty image to glance, so that I can use a trusty image to build the trusty image [03:19:59] this will use the cloud-init script that gets injected to bootstrap the system so that I can log in [03:26:21] ugh. mediawiki on wikitech is so old [03:26:38] we really need to get rid of SMW. I don't know how we're going to upgrade it now that composer is required [03:27:32] I don't think you can even include it without composer now [03:27:46] what a fucking idiotic move on their part [03:29:39] oh, a trusty image is already there [03:29:43] and an instance it booted with it [03:30:47] ugh. right. puppet 3 [05:17:01] Coren: are you still around ... https://bugzilla.wikimedia.org/show_bug.cgi?id=63152 seen this one ? [06:29:14] andrewbogott_afk, on toro.eqiad.wmflabs, I am getting an error when puppet tries to mount /data/project: [06:29:15] https://dpaste.de/4g8v [06:29:24] I will continue debugging this tomorrow, but I wanted to let you know. [06:29:59] Reproducible by running sudo puppetd -tv [07:51:49] !ping [07:51:49] !pong [08:42:08] YuviPanda are you aware of bug https://bugzilla.wikimedia.org/show_bug.cgi?id=63152 [08:42:18] * YuviPanda peeks [08:42:31] it is preventing many services from functioning [08:42:33] andrewbogott_afk: Coren: I'm unable to log into the existing cvn instances in tampa (cvn-app2.pmtpa.wmflabs). Everything was fine a few days ago. [08:42:37] Now it suddenly aborts connections and claims to be unable to create /home/krinkle (why would it need to create that? I've logged into that instance 100s of times) [08:43:18] Krinkle it may be that you are affected by that same bug [08:43:21] GerardM-: I don't think I can do anything :( needs andrewbogott_afk or Coren [08:43:42] are they both in the same time zone ? [08:43:57] GerardM-: I think so :) [08:43:58] err [08:43:59] :( [10:48:13] !ping [10:48:14] !pong [10:48:16] ok [10:54:21] !log deployment-prep Deleting job beta-update-databases , replaced by datacenter variants beta-update-databases-pmtpa and beta-update-databases-eqiad [10:54:24] Logged the message, Master [11:10:58] !log deployment-prep deleting job beta-code-update , replaced by datacenter variants beta-code-update-pmtpa and beta-code-update-eqiad [11:11:01] Logged the message, Master [11:17:38] YuviPanda: Just because the bot replied doesn't mean you're actually here [11:17:41] * Reedy glares at YuviPanda [11:18:40] ok. so, looking at moving the "awb" project from toolserver to labs [11:18:52] couple of web php scripts with a mysql backend [11:18:59] used for svn snapshot/tarballs [11:19:10] Should this be a new project, or just use tools? [11:19:46] Then just create a service group? [11:29:34] !log deployment-prep MediaWiki code and configuration are now self updating on EQIAD cluster via Jenkins jobs.
First run: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/4/console [11:29:37] Logged the message, Master [11:39:01] out for lunch [12:54:22] !log deployment-pref Fixed permissions on eqiad bastion for /srv/scap . Others (such as mwdeploy) could not read / execute scap scripts [12:54:22] deployment-pref is not a valid project. [12:54:30] !log deployment-prep Fixed permissions on eqiad bastion for /srv/scap . Others (such as mwdeploy) could not read / execute scap scripts [12:54:33] Logged the message, Master [12:55:12] !log deployment-prep mediawiki l10n cache being rebuild!!! [12:55:15] Logged the message, Master [13:26:08] Krinkle|detached: Might be gluster being ill. Again. [13:26:35] Krinkle|detached: Why do you need to connect to the tempa instances anyways? All the files from there have been copied over to eqiad. [13:43:07] manybubbles: hi did you get the ElasticSearch in eqiad beta warmed up? [13:43:18] hashar: should be ready! [13:43:53] manybubbles: you might be able to try out by tweaking your DNS hosts entry to point to eqiad [13:44:06] I have been using http://paste.openstack.org/show/74443/ [13:44:30] ah [13:53:12] now I got to debug out why ferm iptables rules are not applied [13:53:14] goooood times [13:54:35] Coren: helllooo GlusterFS /home is broken on deployment-prep project in pmtpa :-/ [13:54:44] still got to access the instances there :/ [13:54:58] Bleh. [13:55:06] I got a permission denied [13:55:11] which is usually because home is dead [13:57:34] hashar, Krinkle|detached, superm401, there was an ldap outage yesterday which caused gluster to lose its marbles. I'll catch up... [13:58:01] !log deployment-prep rebased puppetmaster git repository, reapplied ottomata live hacks. [13:58:04] Logged the message, Master [13:58:08] andrewbogott: thanks :-] [13:58:13] andrewbogott: I'm failing the gluster volume start deployment-prep-home force [13:58:33] If fails instantly with "operation failed" [13:59:05] ooo, will unapply those hashar, thanks [13:59:26] hashar, I thought deployment-prep used nfs? [13:59:30] Is it some of each? [13:59:43] Gluster has clearly decided to be a pain until it dies. [13:59:44] ottomata: I rebased the repo and deleted your local 'otto' branch [13:59:52] Hey, wait, andrewbogott is right. That's NFS! [13:59:58] ottomata: if you need hack on the puppet master, just propose a change in Gerrit and cherry pick it :-] [14:00:20] andrewbogott: Coren: oh men sorry. Yeah deployment-prep uses NFS for /home . Sorry :-( [14:00:31] D'oh. :-) [14:01:44] hashar: What instance is giving you issues? [14:01:52] naw, i'm done hashar, thanks, its really hard to do that when I have know idea what the problems are [14:02:09] i was using that to debug, and meant to reset my changes when I was done [14:02:15] forgot (of course) anyway, fixed now [14:03:46] ottomata: thanks . I did some further cleanup . Looks good now. [14:04:00] Coren: deployment-bastion.pmtpa.wmflabs and deployment-apache32.pmtpa.wmflabs [14:04:17] Coren: but most probably any instance on pmtpa :( [14:04:26] hashar: seen my comment in -operations ? [14:04:44] matanya: replying there [14:04:51] thanks [14:06:13] hashar: Nothing wrong with /home there. Lemme see what's up in the logs. [14:06:42] Aha. The ssh keys volume is dead. I.e.: nothing in pmtpa could work atm. [14:07:37] and I thought ssh relied on pam_ldap or something to grab the user public key [14:17:34] hashar: Have you tested https://gerrit.wikimedia.org/r/#/c/119256/ on labs? Shall I merge it? 
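The "propose a change in Gerrit and cherry-pick it" workflow for a self-hosted puppetmaster, mentioned just above, amounts to something like this sketch. It is run on the puppetmaster instance; the change ref is the syslog-ng patch under discussion, and it assumes the checkout's origin points at the Gerrit-hosted operations/puppet:

    cd /var/lib/git/operations/puppet
    # fetch patch set 5 of change 119256 and apply it on top of the local branch
    sudo git fetch origin refs/changes/56/119256/5
    sudo git cherry-pick FETCH_HEAD
    # then run the agent so instances pick it up
    sudo puppetd -tv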
[14:17:52] ah [14:18:08] the relevant diff is https://gerrit.wikimedia.org/r/#/c/119256/4..5/manifests/misc/logging.pp [14:18:19] looking on the puppet master [14:18:43] It looks right, I just want to verify that you've tried it beforehand [14:18:57] I have tested it [14:18:59] testing now [14:21:07] err: Could not retrieve catalog from remote server: Error 400 on SERVER: Invalid parameter after at /etc/puppet/manifests/misc/logging.pp:49 on node i-0000010b.eqiad.wmflabs [14:21:07] :-( [14:21:33] easy fix luckily [14:26:13] puppet doesn't like the 'after => ' parameter in the file {} statement grr [14:26:58] err: Failed to apply catalog: 'mkdir -p /data/project/syslog' is not qualified and no path was specified. Please qualify the command or specify a path. [14:27:51] hashar: /bin/mkdir [14:28:21] andrewbogott: was the downtime of wikitech an april's fool joke ? [14:28:33] matanya: no [14:28:48] ok, so just a nice date then [14:29:07] Good point, though, I guess I should've picked monday :) [14:38:22] sorry net dropped [14:41:14] the syslog-ng patches works for me on deployment-bastion.eqiad.wmflabs https://gerrit.wikimedia.org/r/#/c/119256/ https://gerrit.wikimedia.org/r/#/c/119257/ [14:41:20] for prod that would impact nfs1 and nfs2 [14:41:56] hashar: ok! I will merge... [14:42:36] could use some help with some puppet class not being realized on my instances :/ [14:45:10] hashar, what are you seeing? [14:45:43] some beta instances need iptables rules to rewrite the public IP to the instance private IP equivalent [14:45:52] that is because you can't access the NATed public IP from inside labs [14:46:37] so I got a thin role class role::beta::natfix which invokes the class beta::natfix which create_resources() the ferm rules I need [14:46:40] the definition is in modules/beta/manifests/natfix.pp [14:47:03] apparently the role class is applied properly (the salt grains patch introduced yesterday does add the role::beta::natfix grain on the instance) [14:47:10] but the beta::natfix is not applied [14:47:16] ferm package is not even installed :/ [14:48:04] hashar, what's an example of an instance that applies that role? [14:48:11] or, I mean, tries to apply that role :) [14:48:11] ah of course [14:48:16] deployment-parsoid04.eqiad.wmflabs [14:49:07] where's the logic that would be installing ferm? [14:49:17] modules/beta/manifests/natfix.pp [14:49:25] which is a hash map invoked by create_resources() [14:49:29] This stuff is all merged, right? Or do you have a different version in progess? [14:49:34] beta::natdestrewrite [14:49:42] it is all merged and working on pmtpa [14:49:57] the ferm::rules are realized with vim modules/beta/manifests/natdestrewrite.pp [14:50:08] which include base::firewall [14:50:25] gripping for firewall in /var/lib/puppet/state/* yields nothing :-/ [14:52:17] maybe the puppetmaster does not recognize create_resources() and bails out [14:52:33] Yeah, I'm not very familiar with create_resource but I've seen it used elsewhere... [14:53:17] that is very powerfull [14:53:34] apparently the include beta::natfix is never included :/ [14:54:07] oh my god [14:54:14] I think I know why [14:54:25] class role::beta::natfix { [14:54:25] include beta::natfix [14:54:25] } [14:54:37] the puppet scope might interpret beta::natfix as being role::beta::natfix [14:54:41] so it would include itself [14:54:47] oh, yeah, that's probably right [14:54:57] So I think include ::beta::natfix ? 
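The bug being chased above is Puppet's relative namespace lookup: inside role::beta::natfix, a bare include beta::natfix can resolve against the enclosing role::beta namespace and point back at the role class itself, so the real beta::natfix class is never applied. A minimal before/after sketch (the actual fix is the change logged a little further down):

    class role::beta::natfix {
        # Broken: the relative name can resolve to role::beta::natfix,
        # i.e. this very class, so the beta module's class is silently skipped.
        #include beta::natfix

        # Fixed: the leading "::" anchors the lookup at top scope.
        include ::beta::natfix
    }

And for context on what natfix itself does while the labs DNS lacks split-horizon answers: it rewrites traffic aimed at a service's public (floating) IP so it reaches the instance's private address instead. An illustrative raw-iptables equivalent of the generated ferm rules; both addresses below are placeholders, not values from this log:

    # rewrite locally-generated traffic aimed at the public IP so it goes
    # straight to the instance's private IP instead
    sudo iptables -t nat -A OUTPUT \
        -d 208.80.155.2 -p tcp --dport 443 \
        -j DNAT --to-destination 10.68.17.42:443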
[14:55:01] trying with a :: prefix include ::beta::natfix [14:55:15] damn refactoring broke it [14:55:53] fixed! [14:58:22] !log deployment-prep fixed up role::beta::natfix . Ferm is now being applied again on various application server instances {{gerrit|121378}} [14:58:25] Logged the message, Master [14:58:28] and that of course breaks ssh access [15:00:47] is remapping IPs strictly needed? Could it be done via hostnames instead? [15:01:20] that is done in the network definitions listing all bastion hosts [15:01:30] then ferm by default list of those hosts and allow ssh access from them [15:01:37] bastion-eqiad.wmflabs.org never got added to the list of bastion hosts [15:01:42] patch https://gerrit.wikimedia.org/r/121379 [15:01:54] I tweaked it when alexandros did the ferm patch [15:02:39] hashar: there are several other bastions, do you mind adding them all? [15:02:59] amend my patch? :-] [15:03:25] I would happily copy paste though [15:04:11] 10.68.16.68, 10.68.16.67, 10.68.16.66, 10.68.16.5 [15:04:13] thanks [15:07:34] Oh, FFS. [15:07:49] Hey, that's the sound of Coren fixing a bug! [15:08:08] andrewbogott: Yeah, so the issue is client-side caching after all; but my google-fu fails at figuring out how to turn it off. [15:08:29] andrewbogott: The clients don't even /hit/ the mountd once it decided it failed. [15:08:43] andrewbogott: Reboot does fix, otoh [15:08:56] So, clearly, there is some evil caching going on. [15:08:57] ok… superm401, mind if we reboot toro? [15:09:04] andrewbogott: I just did. [15:09:10] ok then :) [15:09:19] andrewbogott: And, unsurprisingly, it fixed things. [15:09:51] Yeah, that's a mixed blessing. [15:09:54] So now, I really need to find a way to prevent the attempt at the mount until we know it's there -- so I'll be implementing that ugly thing we discussed. [15:10:14] ok :) [15:10:18] andrewbogott: amended https://gerrit.wikimedia.org/r/#/c/121379/2/manifests/network.pp [15:10:23] with all four EQIAD bastions [15:10:33] Gotta go see dentist though, will be back in ~3h [15:11:05] possibly we could get the three public bastions to share the same ssh host key then get bastion-eqiad.wmflabs.org to round robin to all three IP [15:12:31] andrewbogott: and you need the previous patch https://gerrit.wikimedia.org/r/#/c/121378/1 [15:12:48] andrewbogott: which fix up the include beta::natfix , I have picked it on beta puppet master and that works [15:13:22] hashar: ok, merged [15:16:12] my hero [15:16:24] so beta in eqiad is almost competed now [15:16:37] I will craft a mail and find out whether zeljkof is available tomorrow to test it out [15:16:46] I guess we can do the switch on monday [15:16:58] which would delay pmtpa labs shutdown slightly [15:21:03] !log deployment-prep applying role::beta::natfix on deployment-bastion.eqiad [15:21:06] Logged the message, Master [15:21:45] hashar: didn't you said we will abandon that once move to eqiad? [15:21:51] matanya: this ? [15:22:23] natfix [15:22:34] na we need it [15:22:40] until labs supports DNS split horizon [15:23:06] aka have the DNS give back private instances IP instead of the public address whenever a DNS query is made from an instance [15:23:30] yeah, makes sense. i thought it was on the plan for eqiad [15:23:37] !log deployment-prep role::beta::natfix cant run on deployment-bastion.eqiad because the ferm rules conflicts with the Augeas rules coming from udp2log :-( [15:23:39] Logged the message, Master [15:26:38] matanya: ever migrated Augeas iptables rules to ferm ? [15:26:59] not yet, only iptables. 
maybe it is time to start [15:27:24] and i didn't find any puppet-lint RT tickket [15:28:54] yes, Coren and andrewbogott , it does work now :-) [15:37:36] hasteur: The "dns server" that's part of nova-network doesn't have support for split horizons. Then again, I'd really like to have a proper dns server in Labs eventually, so that might get fixed this way. [15:38:50] that is defintily a better way to solve it [15:48:23] hashar: Are you also migrating the 'integration' project? [15:49:09] andrewbogott: yeah [15:49:20] Need me to do anything? Or is it finished? [15:49:32] got patches pending for misc::package-builder [15:49:40] I haven't found out if you guys are using that class [15:49:56] it basically setup pbuilder/cowbuilders images under /var/cache/pbuilder and install all tools needed to build debian package [15:50:07] link? [15:50:18] /var/ on labs is only 2GB which is too small so I need to have the cache under /mnt/ ( which is lvm) [15:50:24] * hashar digs in his emails [15:51:01] I must not be marked as reviewer on them, I don't see anything in my gerrit list [15:51:19] Mandatory linting: [15:51:19] https://gerrit.wikimedia.org/r/#/c/120005/ [15:51:19] Fix an notify {} that was not using message => [15:51:19] https://gerrit.wikimedia.org/r/#/c/120008/ [15:51:19] Replace /var/cache/pbuilder with a variable ($pbuilder_root) [15:51:20] https://gerrit.wikimedia.org/r/#/c/120009/ [15:51:20] Wrap misc::pbuilder with role classes to vary the pbuilder_root: [15:51:21] https://gerrit.wikimedia.org/r/#/c/120013/ [15:52:01] andrewbogott: added you as a reviewer on all four changes [15:52:09] thanks [15:52:12] I made them very atomic to ease merge/deplkoy/Review (hopefully) [15:55:00] !log integration deleting integration-apache1 Was used as a proxy for other instances and as a dev box for integration.wikimedia.org/ . Freeing the public IP address while at it [15:55:02] Logged the message, Master [15:55:08] one less instance! [15:55:39] 208.80.153.222 pmtpa labs IP released. [16:07:00] petan: Two questions…. 1) need any assistance with migrating project 'nagios'? 2) interesting in working on labs ganglia at some point after the dust settles from the migration? [16:07:30] andrewbogott: sign me on #2 [16:07:45] matanya: great! [16:07:54] I set up a new instance but it doesn't quite work and I don't know why :) [16:07:56] andrewbogott: interested? or interesting? :P [16:08:03] interested [16:08:33] andrewbogott: yes I can help you with ganglia, regarding #1 I was in thought that damian was working on that [16:08:57] Damianz: are you? :P [16:12:36] hashar, I'm trying to do a before/after with the lint patch, and the 'before' puppet run is… still running. [16:15:08] andrewbogott: the package creates two cow builder images which is rather slow [16:15:13] basically download and instance Precise images [16:15:27] yeah, I'm going to have some breakfast and will check back after [16:35:45] Yay, puppet magically fixed itself overnight. Thanks to whoever fixed gluster. [16:39:34] superm401: do you need any help with the editor-engagement migration? [16:39:56] andrewbogott, don't think so. I'm working my way through recreating on eqiad right now. [16:40:08] great. Let me know if you run in to any trouble [16:42:14] andrewbogott: great that you can test on labs :] [16:42:32] alexandros is working on a script to easily compare catalogs between two sha1 [16:42:54] will most probably integrate it as a jenkins job that one can manually trigger [16:48:51] error: server certificate verification failed. 
CAfile: /etc/ssl/certs/ca-certificates.crt CRLfile: none while accessing https://git.wikimedia.org/git/mediawiki/extensions/MwEmbedSupport.git/info/refs [16:49:21] After: [16:49:22] git clone https://git.wikimedia.org/git/mediawiki/extensions/MwEmbedSupport.git [16:50:14] Using the gerrit URL works. Is that a known issue? [16:53:32] superm401: On Tools? Yes ... Let me look up the bug. [16:53:47] scfc_de, was on Labs, but not Tools Labs. [16:56:44] hashar: you saw matanya's comment on https://gerrit.wikimedia.org/r/#/c/120009/1/manifests/misc/package-builder.pp ? [16:57:20] superm401: Maybe the same issue? https://bugzilla.wikimedia.org/show_bug.cgi?id=62432 [16:59:01] scfc_de, thanks, commented. [17:00:13] andrewbogott: ah no sorry [17:00:49] having trouble logging in to a labs host -- logs me right out (after converting to local puppet master) [17:00:59] cajoel: hostname? [17:01:07] keys seem to be there, as I get a motd, but it's like the shell is /bin/false [17:01:13] andrewbogott: flow-localpuppet [17:01:59] what project? [17:02:07] netflow [17:02:09] @info [17:02:09] http://bots.wmflabs.org/~wm-bot/dump/%23wikimedia-labs.htm [17:02:17] i-000002a3.eqiad.wmflabs [17:04:28] cajoel: mind if I reboot? [17:04:36] not at all -- I should have tried that.. [17:05:08] andrewbogott: also created a new instance, and it seems to take a long time to put my public key in place. [17:05:14] I'm not sure quite what is wrong, but there was something wrong with the nfs mounts. [17:05:24] interesting [17:05:25] So when it tried to create the homedir it failed and kicked you out. [17:05:28] A reboot should help. [17:05:39] yeah, seems better now. [17:05:45] was working yesterday -- maybe the mounts changed? [17:05:51] Coren is working on a more comprehensive fix for this. [17:05:54] sweet [17:05:56] thanks [17:10:01] hmm, i'm unable to log into bastion.wmflabs only able to log into bastion-eqiad [17:12:04] Me, too. ("Permission denied (publickey).") andrewbogott, did you close down pmtpa-bastion? [17:14:18] nah, it's just gluster still freaking out from yesterday's ldap outage [17:14:20] try now? [17:15:37] is there a way to re-order the images in instance creation so that Trusty 14.04 isn't the default selected image? [17:16:25] andrewbogott, yep, able to login to it now [17:16:36] cajoel: yes, that's a mistake, I thought that Ryan fixed it yesterday. [17:21:46] andrewbogott: I noticed region=eqiad in the url, region=pmtpa still works, undocumented emergency ability (instance creation only hidden in the UI?) Anyway, was just testing. Will delete again :) [17:22:07] Migrating now, hopefully will work [17:22:09] (cvn) [17:23:15] hello [17:23:50] I used to have a php_error.log in my tool home account [17:23:56] it's not there anymore [17:24:27] and my tool is not working properly: https://tools.wmflabs.org/ptools/ [17:24:35] is there something going on ? [17:25:51] pleclown: in the future I'd encourage you to subscribe to the labs-l list so that you can get notices of planned outages and other changes. [17:26:00] In the meantime, though, I believe Coren can help you with useful links. [17:26:14] andrewbogott: sorry got interrupted with Zuul getting wild. [17:26:21] andrewbogott: It's been a while, apparently [17:26:34] re package-builder having a scoping issue . 
I replied to matanya concern on https://gerrit.wikimedia.org/r/#/c/120009/1/manifests/misc/package-builder.pp [17:26:44] not sure what is the best way to fix it up :/ [17:26:56] can someone peek at this file on production /etc/ferm/conf.d/10_nrpe_5666 [17:27:21] in labs, I'm getting proto tcp dport 5666 { saddr $INTERNAL ACCEPT; } -- I expect that $INTERNAL is supposed to be replaced with something... [17:27:37] andrewbogott: I created 3 eqiad instances, the first and third one are up and running with all OK afaics. The second one seems to have /home missing. Should I wait or reboot or something else? [17:27:51] (cvn-app4 is OK, cvn-apache5 is missing /home, cvn-apache5 is OK) [17:27:57] Krinkle: wait and reboot is still the best approach. Coren is working on a better solution. [17:27:59] checked `df` [17:28:12] it has /project volume though, but not home. [17:28:21] Krinkle: I had to reboot a new instance I just created to get /home working (also check that you didn't start 14.04 instances) [17:28:23] the other two have both. I created all three of them just now. [17:28:29] yeah [17:30:00] cajoel: Playing around with ferm? $INTERNAL is a variable only defined in realm production, IIRC. [17:30:04] *ferm variable [17:30:12] ah -- that might explain it [17:30:21] yeah -- trying to get ferm working in labs [17:30:40] I spent some time trying to get ferm running in Labs, but gave up after a while. [17:31:22] wondering how I might work around that... [17:31:32] it looks like it's getting farther that that is used to. :) [17:32:34] INTERNAL is here.. [17:32:34] /var/lib/git/operations/puppet/modules/base/templates/firewall/defs.erb [17:35:09] class base::firewall { [17:35:10] ok [17:35:14] right -- been down this road before [17:35:33] scfc_de: did you try workign with base::firewall [17:36:47] !log cvn Installing packages on new eqiad instances (cvn-app4, cvn-app5, cvn-apache5) [17:36:50] Logged the message, Master [17:37:30] cajoel: Yes; then I locked me out and my mood dropped :-). [17:37:45] hashar: I actually don't understand matanya's comment. pbuilder_root is a param for the class, and you reference it as a variable... [17:37:46] what's the problem? [17:37:46] pleclown: this might be a good place to start reading: http://lists.wikimedia.org/pipermail/labs-l/2014-March/002228.html [17:38:00] scfc_de: I'll get a fresh cup of coffee.. :) [17:38:04] andrewbogott: it is used inside a define under the class [17:38:10] andrewbogott: so the define might have another scope [17:38:18] oh, I see. [17:38:18] andrewbogott: What's the status on ganglia btw? [17:38:31] A define can take a param, can't it? [17:38:33] andrewbogott: so I think I should just add $pbuilder_root as a parameter [17:38:42] Krinkle: Turned off at the moment, pending someone having time to fix it. [17:38:54] hashar: seems like [17:40:00] scfc_de: working! [17:40:48] andrewbogott: It does seem to need special treatment though, no? Not just the link on the web interface (that's indeed something in the config or the osm extension), but also to get the data. [17:40:55] Or could any project do what ganglia does? [17:41:01] I recall seeing deamons on my instances for it [17:41:22] Krinkle: I'm sorry, I don't know much. I'm happy to give you access to the ganglia project if you would like to look around. [17:41:34] No thanks :) [17:41:44] cajoel: Wow! :-) [17:42:18] seems really odd that base::firewall is responsible for definitiions that live inside /etc/ferm/conf.d [17:42:33] should those be migrated in to the ferm modue? 
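The scoping concern on the package-builder change (the $pbuilder_root discussion above, resolved shortly after this by "adding a bunch of parameters to both defines") comes down to passing the cache root into the define explicitly rather than letting it reach into an enclosing class's scope. A simplified sketch; the resource and class names here are illustrative, not the exact ones in the patch:

    define packagebuilder::image($pbuilder_root = '/var/cache/pbuilder') {
        exec { "cowbuilder-create-${title}":
            command => "/usr/sbin/cowbuilder --create --distribution ${title} --basepath ${pbuilder_root}/base-${title}.cow",
            creates => "${pbuilder_root}/base-${title}.cow",
        }
    }

    # the labs role only overrides the root, pointing it at the large LVM-backed /mnt
    class role::packagebuilder::labs {
        packagebuilder::image { 'precise':
            pbuilder_root => '/mnt/pbuilder',
        }
    }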
[17:43:19] Krinkle: I think the only way Ganglia is "special" is that its aggregator server is named in the Puppet manifest for Labs hosts so they know where to send their data. [17:43:41] andrewbogott: I have added a bunch of parameters to both defines : https://gerrit.wikimedia.org/r/#/c/120009/1..2/manifests/misc/package-builder.pp [17:43:44] And the fact that all instances are sending that data [17:43:58] andrewbogott: sorry must escape now :-/ Feel free to merge in, I can catch up on my instances tomorrow morning [17:44:13] Krinkle: I am off see you later :] [17:44:24] hashar: k, gnight [17:44:48] who can allocate public IPs for equiad hosts? (I need it for a UDP service) [17:45:01] cajoel: There's another oddity there that the ferm module defines rules for ntp, while (IMHO) they should be in base::firewall. But -operations may be more knowledgable about that. [17:45:26] cajoel: Project admin when the quota isn't reached. You need andrewbogott or Coren to up your quota. [17:45:35] (Which could be 0.) [17:45:46] cajoel: just a second... [17:47:32] andrewbogott: removed IP from pmta, need similar in equiad when you have a moment [17:48:37] andrewbogott: it worked [17:48:38] thanks [17:50:04] cajoel: ok, done [17:59:21] andrewbogott, I'm trying the scp at https://wikitech.wikimedia.org/wiki/Labs_Eqiad_Migration/Howto#Self-migration_of_shared_data but getting: [17:59:23] ssh: connect to host pronunciationrecording.pmtpa.wmflabs port 22: Connection timed out [17:59:37] Could that be a security group issue, or is it something else? [17:59:43] andrewbogott: Can I do a copy over scp or ssh manaully for part of the data as a test, or is that something you have to do? I imagine there's no route for me to projectstorage.pmtpa.wmnet or cvn-app2.pmtpa.wmflabs from eqiad instances, right? [17:59:45] probably security group. [18:00:01] Krinkle: there is a route, just check your keys and security groups. [18:00:17] superm401: I can open up your pmtpa instances to ssh from eqiad if you like. [18:00:24] This is 'editor-engagement' project, right? [18:00:28] andrewbogott, sure, thanks. [18:00:32] Correct [18:00:34] Lol, I didn/t realise we're doing the same [18:01:29] superm401: better? [18:02:01] Krinkle: the limit is that you probably don't have the ability to do a root->root connection. So if you want a full volume copy it's easier for me to just do that from the outside. [18:03:21] andrewbogott: If it's not too much work, could you do a copy now, and then another one later? The one now would be while pmtpa is still running (so I can try booting one of the bots with debug and dry-run parameters and check it out while the tampa bots are still writing and being the primary databases for now) [18:03:35] Krinkle: what project, volume? [18:03:37] and, nfs or gluster? [18:03:45] I dont know nfs or gluster [18:03:46] andrewbogott: project-cvn [18:04:01] The /data/project/cvn as mounted on cvn-app2.pmtpa [18:04:14] If you didn't specifically configure it for nfs then it's probably gluster [18:04:25] Krinkle: if you want a not-quite-up-to-date copy, you have one of those already [18:04:33] in /data/project/glustercopy [18:04:36] is that enough for your testing? [18:04:51] andrewbogott: I was slightly too old. I deleted it just now. [18:04:54] It* [18:04:57] ok [18:05:05] * YuviPanda makes appropriate Krinkle's age joke [18:05:14] andrewbogott, yep, thanks. [18:07:18] Krinkle: how's that? [18:07:20] Not much in there... 
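On the port-22 timeout above: the fix was opening the pmtpa project's security group to ssh from eqiad. On wikitech that was normally done through the project's security-group page; the novaclient equivalent looked roughly like the line below, where the 10.68.0.0/16 source range is inferred from the eqiad bastion addresses earlier in this log and should be treated as an assumption:

    # allow ssh into instances in the "default" security group from the eqiad labs network
    nova secgroup-add-rule default tcp 22 22 10.68.0.0/16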
[18:07:57] andrewbogott: Looking good [18:07:58] Thanks :) [18:08:02] np [18:08:53] andrewbogott: Hm.. that is small. Interesting.. [18:09:14] 117M on tampa, 125M on eqiad. Funny too. [18:09:44] 'du' doesn't really work on gluster. [18:09:47] so I'd trust the 125 [18:09:52] yeah [18:11:21] andrewbogott: It's fine for now this run, but I noticed grp stuff didn't survive (all root/root now). I suppose the uids won't match, but do you have a recommended way to re-apply or map that? [18:11:45] um… I probably didn't use the right flags in my rsync [18:11:49] want me to wipe and try again? [18:12:45] Actually… there, did that fix it? [18:13:28] Thanks [18:13:50] Ah, lol, you also saved me a git pull. [18:14:00] I just made a new commit to the repo from the one in tampa [18:14:13] It swapped mid-commit xd [18:16:17] andrewbogott: Okay, the app server testing will take a bit. Doing the apache first. Can I just unplug the public ip in tampa, and set up web proxy for the same name ('cvn') in eqiad to the equiv instance? [18:16:28] Krinkle: yep! [18:16:40] Or, if you need things besides http then I can grant you a public IP in eqiad [18:16:50] http and https [18:16:57] Ok, then a proxy should work [18:17:43] Krinkle: Note that some people have had an issue with the proxy: labs instances can't contact public labs IPS. So if the service running on your labs host tries to access itself /via/ the proxy, you'll get a timeout. [18:18:00] andrewbogott: k [18:18:02] There isn't an official solution to that, although adding an alias to localhost in /etc/hosts works pretty well [18:18:06] andrewbogott: Hm.. I'm confused byt he web proxy page [18:18:09] https://wikitech.wikimedia.org/wiki/Special:NovaProxy [18:18:13] yeah, how so? [18:18:13] it says there is one in eqiad already [18:18:27] Yeah, I probably made that. [18:18:45] anyway, if you turn off the old one, things should settle down pretty quickly. [18:19:05] point to a deleted pmtpa instance (the old apache in tampa, instead of the newer cvn-apache4 in tampa) [18:19:41] it is currently being served from cvn-apache4. I intent to set up a different subdomain (like cvn-eqiad) temporarily for now, as the domain is actively serving as an API, so I'm not touching the live one [18:20:13] Cool, that was amazingly fast [18:20:13] https://cvn-eqiad.wmflabs.org/api.php [18:38:40] andrewbogott: OK, I've got web proxy on cvn.wmflabs.org now, seems its still going to pmtpa [18:38:44] I guess that one has precedence [18:38:51] Didn't remove it yet to ensure uptime [18:38:58] the eqiad one is ready for traffic now [18:39:11] You can just switch over the proxy & dns whenever you're ready. [18:39:20] Do you need another rsync or are you happy with what you have? [18:39:53] the api.php is serving based on the data from the copy you made earlier, that's fine for now. I made it read-only for now. [18:40:11] Then once the apache and api.php are switched over, I'll do the bots (which are still running and writing to tampa now) [18:40:57] andrewbogott: How do I switch it over? I setup web-proxy/eqiad/cvn.wmflabs.org > cvn-apache5.eqiad and still have web-proxy/pmtpa/cvn.wmflabs.org > cvn-apache4.pmtpa [18:41:28] cvn.wmflabs.org is pointing at both things right now so it's… just luck that you're getting pmtpa. 
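The "right flags in my rsync" remark above (ownership arriving as root/root) is about preserving owners and groups during the copy. A sketch of a flag set that keeps them, assuming root on the receiving end; -a already implies -o and -g, and --numeric-ids avoids remapping uids/gids through mismatched passwd files:

    rsync -a --numeric-ids --delete \
        cvn-app2.pmtpa.wmflabs:/data/project/cvn/ \
        /data/project/cvn/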
[18:41:36] Just delete the proxy pointing to the pmtpa instance [18:41:38] OK [18:42:01] https://cvn.wmflabs.org/api.php?users=X2 is 404ing now [18:42:24] andrewbogott: [18:42:40] I can't recreate the pmtpa one [18:43:09] Does cvn.wmflabs.org resolve to 208.80.155.156 for you? [18:43:13] that's eqiad, that's what we want [18:43:29] 208.80.153.214 [18:43:38] Ah, so, your dns is just stale. [18:44:12] well, me and everyone else. it doens't magically update. I thought web proxy would cover the internal difference? Isn't *.wmflabs.org served all through eqiad now (even pmtpa web proxies?) [18:44:33] No. [18:44:41] 208.80.153.214 is the eqiad proxy [18:44:49] Can the pmtpa one be re-created? It's okay if it stays on the old apache [18:44:50] ah, sorry, scratch that [18:44:58] the web proxy I just deleted there [18:45:01] 208.80.155.156 is the eqiad proxy, 208.80.153.214 is the pmtpa [18:45:27] It's not easy to recreate… anyway, won't dns update in a minute or two? It already has, for me. [18:45:41] still getting 404 herer [18:45:51] Since when does DNS update that quickly? [18:46:36] Re-created the pmtpa one [18:46:40] Krinkle: you still need access to stat1? [18:46:56] andrewbogott: Looks like I coudl just change the query prameter of the web proxy special page [18:47:02] the link is just hidden by default it seems [18:47:05] at least so it appears [18:47:34] Yep, my pmtpa apache instance is getting traffic again [18:47:40] OK... [18:47:43] Okay, that was only a short outage [18:47:46] I must not understand what you want :) [18:48:07] I want to not have people complain telling me cvn is down for everyone patrolling wikimedia edits via it. [18:48:39] It's quite sensite beause its a live stream, an edit missed means it'll never be looked at again basically (well, sans watchlist and random spots) [18:49:11] Lemme see if I can turn off the dns for pmtpa without breaking access via 208.80.153.214 [18:49:18] then we can have a graceful switchover as dns catches up [18:49:27] Yeah, serving from both is fine. [18:50:15] there, did that break anything? [18:50:16] cvn-api doesn't provide the edit feed, it provides complementary data, but the bots depend on it (they shouldn't but that's how it is and its a bitch to fix because of other legacy stuff, i've been trying to work my way through the pile of stuff for over a year, sanitising it bit by bit) [18:50:52] Works for me still [18:50:59] OK, what does cvn.wmflabs.org resolve to now? [18:51:02] one http request just now made it to the eqiad apache [18:51:07] not mine though [18:51:10] yeah, that was me [18:51:19] traffic is still flowing into pmtpa according to my log tail [18:51:39] ok… we'll watch it for a bit. In theory things should switch over as dns updates [18:51:44] OK [18:51:48] Thanks [18:52:28] andrewbogott: Preparing the bots now, stand by for copying project-cvn data like the last copy (/data/project/cvn, with rights etc) [18:52:48] that's the same data that I copied before, right? [18:52:51] Yep [18:52:55] but not eytyet [18:53:16] matanya: Not sure about the context, will get back to you, doing migration atm. [18:53:36] sure, it is about ottomata's mail. [18:58:34] andrewbogott: There's a couple of requests from users coming in to the eqiad instance (I can tell by the query parameters), still only very few though. [18:59:08] Anyway, not worried for now. It can take a few hours as far as I'm concerned, having the old one serve stale data is fine now. readonly is better than a gap. 
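A quick way to watch a switchover like the one above: check which address the name currently resolves to and how long caches may hold the old answer (the TTL is the second column of the answer):

    dig +noall +answer cvn.wmflabs.org
    # compare against a public resolver to rule out a stale local cache
    dig +noall +answer cvn.wmflabs.org @8.8.8.8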
[18:59:09] almost done [19:01:25] andrewbogott: OK, green ahead. Can you copy it now (replace the previous copy) [19:02:38] Krinkle: done -- look ok? [19:04:25] Yep, I think so. [19:09:40] andrewbogott: Alrighty, cvn-app2 is all cleared and idle now. [19:09:59] Will wait for traffic flow to reduce on cvn-apache4.pmtpa (still hot at the moment) [19:10:07] But otherwise, it's all done. [19:10:39] ok, will check back in half an hour to see if we've safely swapped over [19:11:05] !log cvn Migrated all applications from pmtpa to eqiad. cvn-app*.pmtpa can be shut down. [19:11:07] Logged the message, Master [19:12:00] andrewbogott: btw, is icinga also considered community maintained like ganglia? [19:12:13] yes, at the moment [19:12:27] more so, even, inasmuch as ganglia was originally set up by a wmf contractor [19:13:39] matanya: Hi [19:13:49] matanya: Is this per the RT ticket from Ori about eventlogging? [19:13:50] hi Krinkle [19:14:09] no krinkle. mail from ottomata about stat1 [19:14:29] Krinkle: stat1 will be decommed, replacement is stat1003 [19:14:41] do you still use your stat1 access for anything? [19:14:56] ottomata: sorry for pinging you so much [19:15:09] ottomata: matanya: If I have access to that it's either because it was given to me without my knowledge (I never used it), or because my request from _last week_ to get access to eventlogging things was granted already. [19:15:31] How does my access stand out from other's? [19:15:59] we are asking every one [19:16:35] OK. [19:16:42] either way, don't worry about me. I've never logged in. [19:17:17] thanks [19:17:19] yay, thanks to both of you :) [19:18:44] list shrinks quickly [19:41:59] * Coren returns from the dentist. [19:43:53] Krinkle: https://www.whatsmydns.net/#A/cvn.wmflabs.org says that everyone is now pointed at eqiad. Does it look that way to you as well? [19:44:31] Still getting traffic [19:44:37] a request every other second [19:44:40] different clients [19:45:04] At least 2 users from nl.wikipedia and various users with referal ilo.wikipedia.org [19:45:07] Hm… [19:45:13] I guess their local systems could have caches, huh? [19:45:21] At some point we'll have to force them to update [19:50:38] What's the TTL? One hour? Also, you can't manage those clients that use the IP (or have it hard-coded in /etc/hosts). [19:51:40] I think it's one hour. But it's not an emergency, we can wait a few more hours just in case. [19:51:47] I just want this wrapped up before Krinkle heads home for the day [19:52:12] It's not going to be hardcoded in etc/hosts [19:53:26] * andrewbogott envisions a Very Paranoid user who has decided that DNS just can't be trusted [19:55:36] Or someone who thinks they can speed up their scripts by a millisecond :-). [19:56:47] Where is the MX for tools.wmflabs.org set? I don't see it in https://wikitech.wikimedia.org/wiki/Special:NovaAddress or in operations/dns. [19:57:55] scfc_de: I think you can't view the NovaAddress page for tools without being a project admin [19:58:04] I see it there. [19:59:42] Am I not a project admin in Tools? I only see the public IP mail.tools.wmflabs.org => tools-mail there, but not the MX for tools.wmflabs.org. [20:00:58] You aren't an admin. there are very few. [20:01:08] (Motivation: Where the MX record is set, a SPF record would be set as well.) [20:01:59] So MX is handled by OpenStackManager? 
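For the MX/SPF question above, whatever the backing store turns out to be, the records actually being served can be checked directly:

    dig +short MX tools.wmflabs.org
    dig +short TXT tools.wmflabs.org   # an SPF policy, once added, would appear here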
[20:02:08] Change on 12mediawiki a page Wikimedia Labs/Tool Labs/Needed Toolserver features was modified, changed by 79.21.110.143 link https://www.mediawiki.org/w/index.php?diff=942387 edit summary: /* Database replication/access */ [20:03:32] scfc_de: Yes, looks that way. [20:03:51] Do you want me to dig up the ldap entry for that? [20:05:20] andrewbogott: If you have it handy, yes, thanks. But a grep of "mx" in OpenStackManager turns up empty?! [20:06:37] https://dpaste.de/jr1W [20:10:24] andrewbogott: Thanks! [20:15:26] on our new instance in eqiad loading a page results in 100-200 database queries originating at LCStoreDB::get, what should be configured to fix that? [20:16:01] i know that i18n stuff, but not sure how its supposed to be configured [21:07:03] [11:18:39] ok. so, looking at moving the "awb" project from toolserver to labs [21:07:03] [11:18:51] couple of web php scripts with a mysql backend [21:07:03] [11:18:59] used for svn snapshot/tarballs [21:07:03] [11:19:10] Should this be a new project, or just use tools? [21:07:04] [11:19:46] Then just create a service group? [21:12:17] Reedy: still need access to stat1 ? [21:14:23] #offtopic [21:14:23] ;) [21:17:09] Reedy: #justusetools [21:17:30] pywikibot's nightlies are also there [21:17:57] and as long as the WMF doesn't move data centers again, there is approximately zero maintenance ;-) [21:29:27] valhallasw: heh [21:29:45] It seemed overkill to spin up a seperate box with apache/php/mysql etc [21:32:14] Failed to set group members for local-awb [21:32:15] :( [21:33:18] local-? [21:33:27] That's where are you trying to do this? [21:34:21] https://wikitech.wikimedia.org/wiki/Special:NovaServiceGroup [21:34:29] they're all local-* on there [21:35:28] Oh, ah, on wikitech yeah. [21:35:35] What were you trying to do exactly? [21:36:14] For what it's worth, that bug is next on my list... [21:36:24] of course, it was there when I woke up this morning and I haven't gotten to it yet :( [21:38:13] superm401: All done with Editor-engagement? Can I mark it as fully migrated? [21:38:31] andrewbogott, no, getting there, though. [21:38:36] ok [21:39:09] andrewbogott: We have an issue with gluster; it's no longer exporting /public/keys properly; and I don't see why. [21:39:28] Coren: OK, one minute... [21:40:21] Coren: So, if I can log into bastion-restricted.pmtpa.wmflabs... [21:40:26] what is the symptom for me to test? [21:40:29] remount? [21:44:10] andrewbogott: Wait, you can log in /there/? [21:44:16] sure, working fine. [21:44:50] I can also ls /public/keys [21:44:57] although there is a dramatic pause before it lists [21:44:57] Huh. [21:45:13] which instance is plaguing you? [21:45:13] i-0000031b has it fail, so does tools-login [21:45:20] And every other one I've tried. [21:45:33] Ah, but not bastion-restricted. [21:45:40] Dafu? [21:45:52] Consequence of the ldap fail, maybe? [21:46:03] probably, I've seen similar problems elsewhere... [21:46:22] * Coren tries rebooting one of the affected instances. [21:46:51] That works. [21:47:07] Yeah, restarting autofs doesn't. So I don't know which part is stuck... [21:47:38] Maybe rebooting is fine, though, since this is starting to feel like swabbing the titanic [21:48:04] autofs is a rickety, unstable pos. Why do you think I was so glad to get rid of it? :-) [21:52:41] Krinkle: how about now? [21:53:23] andrewbogott: 3 distinct url patterns coming in still. From meta.wikimedia and ilo.wikipedia [21:53:44] :( [21:53:46] I estimate maybe a dozen users at most. 
But they're actively patrolling though, I can see their edits. [21:53:52] I might be able to reach them and see what's up [21:54:25] Do browsers locally cache dns? Maybe it's just the result of an epic browser session? [21:55:43] andrewbogott, yes, definitely.: https://encrypted.google.com/search?q=firefox+dns#q=firefox+dns+cache [21:55:51] And of course, desktop OSes can too. [21:56:16] superm401: are they smart enough to refresh the cache if they get a 404? [21:56:24] Or will they just 404 until the browser is restarted? [21:56:30] no [21:56:36] Coren: Ping! (Re to -6 hrs ago: The "dns server" that's part of nova network). Did you accidentally mistarget me? [21:56:38] Server not found and 404 are not the same. [21:56:53] It might invalidate on server not found, but I have no idea if it does. [21:57:58] One of those articles says newer Firefox versions don't use their own cache (they rely on the OS), FYI. [21:58:15] And can those users be reached by IRC/village pump/etc.? [21:58:53] Reminder again: hasteur is a editor/bot-operator on en.WP. hashar is a WMF volunteer who deals with puppet and core labs items. Please do not target hasteur for code reviews of core WMF labs items. [21:59:22] You can review them if you want [21:59:45] Except by the time I get to the code review they realize their mistake and I'm not part of the review any more. [22:00:13] hasteur: You don't need to be invited :-). Just share your opinion anyway :-). [22:00:19] Coren: Add another user to the group [22:00:21] i think we should all call Antoine hashy anyway [22:00:52] scfc_de: Can I express tantrum opinions for being notified but then beign uninvited? [22:02:40] Tbh you should be glad for the un-invite, Jenkins is very spammy :D [22:03:49] andrewbogott, I got a MySQL importing a big dump while migrating. [22:04:05] ERROR 1114 (HY000) at line 8177: The table 'text' is full [22:04:25] Does that mean the drive is full, or just some MySQL limit? [22:04:33] superm401: I dont know -- /is/ the drive full? [22:04:38] disk or table [22:05:07] Hmm, /dev/vda1 ... [22:05:08] andrewbogott: Btw you can definatly impose many nukes [22:05:12] No, there's 8K left. :) [22:05:17] that's full [22:05:24] Yeah I know. [22:05:26] Okay, so I probably configured it with a smaller drive. :( [22:05:46] db servers on virtualisation suck... also [22:06:37] andrewbogott, what are 'Allocated Storage' and 'Filled Storage'? [22:06:53] superm401: where are you looking? [22:06:55] On the quota page? [22:06:59] https://wikitech.wikimedia.org/wiki/Nova_Resource:I-000002c2.eqiad.wmflabs [22:07:09] Where's the quota page? [22:07:26] superm401: Ah, ok -- so… your instance has space allocated but not partitioned. [22:07:44] To partition it you can include a puppet class. If you want it in /srv or /mnt that's easy. If you want to customize then you need to set a variable. [22:07:46] What's your pleasure? [22:08:16] The default is to include role::labs::lvm::mnt -- that should be on the instance configuration page already. [22:08:34] That'll give you a great big volume on /mnt for your dbs. [22:09:27] andrewbogott, hmm, that's unchecked for some reason. [22:09:32] Shouldn't mediawiki singlenode handle this? [22:09:36] yes, you have to check it if you want it. [22:09:45] andrewbogott, alright, so I'll check those. [22:09:53] superm401: I don't know, maybe? It depends on how big of a wiki you're handling. 
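Back to the earlier question about pages triggering 100-200 LCStoreDB::get queries: the usual cure is to let MediaWiki's localisation cache write files instead of going to the database on every request. A LocalSettings.php sketch, assuming a stock MediaWiki of that era rather than knowledge of how this particular instance was configured:

    // keep the l10n cache out of the database (LCStoreDB)
    $wgCacheDirectory = '/var/cache/mediawiki';   // must exist and be writable by the web server
    $wgLocalisationCacheConf['store'] = 'files';
    // optionally prime it once with: php maintenance/rebuildLocalisationCache.php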
[22:10:02] andrewbogott, without that, what is: [22:10:12] /dev/vda1 ext4 7.6G 7.2G 8.0K 100% / [22:10:12] superm401: checking that will probably make whatever is in /mnt now vanish, replaced with an empty volume. [22:10:24] So move it away beforehand and then move it back, if you want to save it. [22:10:27] I guess that vda1 is the one that filled up, so what's that. [22:10:45] andrewbogott, it's okay, I have to rerun the SQL which has a drop anyway, so might as well start over. [22:10:49] ok [22:11:09] Instances in eqiad have a default volume for the OS to live on (vda1) and then a bunch of misc. space to do what you want with. [22:11:25] We haven't really settled yet on whether that should start out partitioned by default or remain available to customize. [22:11:34] Right now we've erred towards customization. [22:11:45] Sorry, it's a change from pmtpa, not very well publicized atm :( [22:12:00] Ah, that makes sense. [22:14:14] andrewbogott, actually, I'm going to blow them away so puppet will reinstall MW for me. [22:14:42] Hmm, wait, no I don't need to do that since the SQL import already recreates everything. [22:27:30] superm401: is partitioning working properly? [22:28:48] andrewbogott, I rebuilt them after all for simplicity, since I didn't want to have a full drive, and if I migrated /mnt over to the lvm, I was worried about old stuff languishing in the MySQL (apparently it's not good at reclaiming space) [22:29:07] As soon they're done spawning, I'll reconfigure, this time with lvm on both. [22:29:14] Only one of them has a big import, which I'm going to start ASAP. [22:29:36] superm401: that all sounds reasonable, should work. [22:45:09] andrewbogott, there was some kind of Puppet error after enabling MW and lvm. Not sure if everything is running in the right order. [22:45:12] Trying to rerun now. [22:45:24] notice: /Stage[main]/Mediawiki_singlenode/Exec[mediawiki_setup]/returns: PHP Notice: Uncommitted DB writes (transaction from DatabaseInstaller::createTables). in /srv/mediawiki/includes/db/Database.php on line 4131 [22:51:09] superm401: There won't be any deterministic link between the lvm mount role and other roles you apply. The safest thing to do is apply the lvm role; force a puppet run; apply other roles; force another puppet run. [22:51:37] bd808, alright, thanks. [22:52:12] Alternately, making your role require -> Mount['/mnt'], while ugly, will enforce order. [22:52:51] (And is also strictly correct, you /do/ require /mnt to be mounted after all) [22:53:09] require => * [22:53:27] Yeah, that sounds right, I don't have time to work on it now, though. [22:56:18] Is there a *good* reason for betalabs not having any CSS? [22:56:23] Or just a really shitty reason [22:58:31] marktraceur, it's probably still the SSL bug. [22:58:43] marktraceur, try visits https://bits.wikimedia.beta.wmflabs.org first. [22:59:14] Yep, https://bugzilla.wikimedia.org/show_bug.cgi?id=48501 is still open. [22:59:16] Hm, nope [22:59:20] Still borked [22:59:33] https://en.wiktionary.beta.wmflabs.org/wiki/Wiktionary:Main_Page [23:01:03] superm401: itym https://bits.beta.wmflabs.org [23:01:14] Better now, ta [23:01:44] marktraceur, oh, thanks. [23:03:38] Also++ upload.beta.wmflabs.org [23:08:22] But that didn't work apparentnly [23:08:33] Or maybe the fails are cached [23:15:41] Works for me in Firefox. [23:15:56] Alright, my long import is finally running again, and this time I'm pretty sure it will have enough space. [23:30:19] quick q: for subsequent patch sets, it's just git push right? 
(no 'review'?) [23:30:44] is there a page somewhere that walks through the iterative gerrit best practice? [23:32:36] cajoel: subsequent = changes that depend on uncommited changes? Then yes. For uploading new patch sets to a change, git review should work fine. [23:37:58] <^d> `git push`? You should never be typing `git push` unless it's part of `git push [remote] [sha1]:refs/for/[branch]` [23:39:52] andrewbogott, is it possible to increase the /mnt capacity of an existing instance? [23:40:22] superm401: not really. [23:40:35] If you're storing things that aren't a database, you can stow it in /data/project [23:41:01] It's in the DB. It's good enough for now (the import finished fine), I'll keep an eye on it. [23:59:10] andrewbogott: cvn-apache2 access log has been idle for the past 30 minutes [23:59:19] woo! [23:59:31] So, shall I turn off the dns for cvn, and mark that project as fully migrated? [23:59:35] Yep
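Picking up the ordering hint bd808 gave earlier ("making your role require -> Mount['/mnt']"): anything a role puts on the LVM volume should declare that dependency so Puppet cannot create it before role::labs::lvm::mnt has mounted the disk. A sketch with illustrative names, assuming the LVM role really does declare a Mount['/mnt'] resource as suggested:

    class mywiki::dbserver {
        file { '/mnt/mysql':
            ensure  => directory,
            owner   => 'mysql',
            group   => 'mysql',
            # wait for the volume added by role::labs::lvm::mnt
            require => Mount['/mnt'],
        }
    }

And on the closing git-review question: subsequent patch sets are uploaded by amending the same commit (keeping its Change-Id footer) and re-running git review, or with the bare push form ^d quotes. A sketch:

    # iterate on an existing Gerrit change
    git add -u
    git commit --amend      # keep the Change-Id so Gerrit records a new patch set
    git review

    # equivalent without git-review
    git push origin HEAD:refs/for/master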