[00:00:35] Damianz: I'm reading that you can broadcast to a port, and multiple clients can listen. [00:01:49] broadcast is to all, multicast is to a group, unicast is to one [00:01:56] Coren: ok, updated [00:02:32] Damianz: Yeah, so we bag a port for the relay, and broadcast to it. [00:02:42] Then anybody who wants to use can tune in. [00:03:17] ermm [00:03:21] we'd want multicast [00:03:25] and yes if labs supports it [00:03:38] Do you have some example code to read from a redis key? [00:03:44] apparently none of tools has redis-cli installed :( [00:04:04] Damianz: Python, import redis. [00:17:23] Damianz: Would it be worth just making a twisted server to connect to? Or perhaps we're overengineering, and we ought to just have a local IRC? [00:18:53] Anyone know how to reach eqiad web server internally (e.g. wget)? http://tools-webserver-01/TOOLNAME used to work, but no more? [00:20:22] a930913: I've got an example client in node that isn't very flexible - anything that can listen on a socket and figure out where to send data would work though [00:20:29] tools-webproxy-eqiad [00:21:58] Damianz: "unable to resolve host address `tools-webproxy-eqiad'" [00:22:41] tools-webproxy.eqiad.wmflabs has address 10.68.16.4 [00:22:56] There isn't really 'the webserver' to access now since it's all just new web stuff [00:23:23] a930913: So if you subscribe to 'testkey' you'll get json of all edit data [00:26:25] Damianz: Thanks, the IP seems to do the job. But, there should really be a better way to reach labs tools from within labs tools... [00:26:44] * magnusmanske_ calls it a day... [00:27:42] Interesting, tools-webproxy.eqiad.wmflabs isn't resolvable from pmtpa... hmm maybe Coren can fix that [00:28:06] Coren: So are we supposed to delete everything in pmtpa once we're done moving to eqiad? [00:28:25] anomie|away: You can, or you can be paranoid and leave it there. 
[00:28:37] * anomie|away will be lazy and leave it there [00:32:10] I removed my xxxG logs from pmtpa [00:32:32] a930913: If you're interested/feel like hacking php https://github.com/DamianZaremba/cluebotng/commit/35f3768b1a461fddf93a3923bf22a6f1bfd05323 [00:34:58] Damianz: 'testkey'? [00:35:22] it's a redis 'channel' or w/e it's called [00:35:36] because I'm awesome at naming things [00:36:47] Damianz: On tools-redis? [00:36:53] mhm [00:37:04] >>> r.exists("testkey") [00:37:05] False [00:37:06] ? [00:37:35] in eqiad? [00:37:59] Damianz: Wha? On tools-dev. [00:38:06] Connecting to tools-redis. [00:38:44] https://gist.github.com/DamianZaremba/d59946bccb69ce63c020 [00:44:38] Damianz: I'm not getting anything :/ [00:45:59] I had it stopped for a min to add extra data... but it's definitely pushing atm. [00:47:02] Well, I'm not receiving anything using your code. :/ [00:47:38] Where are you running it from? [00:47:56] Damianz: tools-login atm. [00:48:10] run it from tools-login-eqiad [00:48:34] Should there be a difference? [00:48:48] yeah [00:48:52] different dc = different box [00:49:10] apparently you can't even connect cross dc to the redis port for some reason (I tried from a bots server earlier) [00:49:21] Damianz: Aren't we both connecting to the same server? [00:50:16] If you're using my code exactly on tools-login, you're connecting to tools-redis.pmtpa.wmflabs. I'm connecting to tools-redis.eqiad.wmflabs [00:50:27] They aren't clustered/replicated [00:51:09] Two servers with the same name, in a similar place, why? 
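For reference, the exchange above turns on two points: 'testkey' is a pub/sub channel rather than a stored key (which is why r.exists("testkey") returns False), and the pmtpa and eqiad redis servers are separate, unreplicated boxes. The following is a minimal sketch of a subscriber, not the code from the linked gist. It assumes the redis-py package and the default redis port 6379; the helper names decode_edit and listen are made up for illustration.

```python
import json

# Hostnames as quoted in the log; the two servers are not clustered or replicated.
REDIS_HOSTS = {
    "pmtpa": "tools-redis.pmtpa.wmflabs",
    "eqiad": "tools-redis.eqiad.wmflabs",
}


def decode_edit(message):
    """Return the decoded JSON payload of a pub/sub message, or None.

    redis-py hands back dicts with 'type', 'channel' and 'data' keys;
    only entries of type 'message' carry published data.
    """
    if message.get("type") != "message":
        return None
    data = message["data"]
    if isinstance(data, bytes):
        data = data.decode("utf-8")
    return json.loads(data)


def listen(datacenter="eqiad", channel="testkey"):
    """Yield decoded edits published on `channel` (needs the redis package)."""
    import redis  # imported here so decode_edit can be used without it

    # Port 6379 is the redis default and an assumption, not taken from the log.
    client = redis.Redis(host=REDIS_HOSTS[datacenter], port=6379)
    pubsub = client.pubsub()
    pubsub.subscribe(channel)
    for message in pubsub.listen():
        edit = decode_edit(message)
        if edit is not None:
            yield edit


if __name__ == "__main__":
    for edit in listen():
        print(edit)
```

A subscriber only sees messages published to the server it is connected to, so a client on the pmtpa host would stay silent while the publisher pushes to eqiad.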
[00:52:01] same reason tools-webproxy, tools-exec-0{1..8}, tools-login etc are all named the same but have different domains [00:52:46] We kinda do the same thing at work, but also number them offset depending on which side so you definitely know where you're connecting because dns is silly [00:53:49] In our case, the primary concern for the same hostnames is to ensure that the scripts will work with as little change as possible during and after migration. The *.pmtpa will just go away at the end. [00:54:17] Wait, everything's moving? [00:54:55] ... a930913, it has been announced at regular intervals on labs-l over the past several months, and explained in detail howto this morning. [00:55:00] I kinda think it would be cool if pmtpa just became eqiad2... still isolated for the purposes of like omfg nfs is broken, lets test ceph [00:55:26] Damianz: We'd rather physically move the servers to make labs bigger. :-) [00:56:11] Also, the actual datacenter itself is being decommissioned. [00:56:25] Coren: Where am I supposed to be signed up to to hear about these things? :p [00:56:41] labs-l. You should be signed up there anyways. [00:56:43] As far as the datacenter is concerned good riddance... considering the amount of fibre issues in the last few years [00:57:35] Though it will sadly mean the death of some instances that have been around since the early years of labs, which are no longer actually used but left hovering [00:57:53] I'm not sure why that'd be sad. :-) [00:59:29] Coren: How do I sign up to labs-l? [00:59:32] !labs-l [00:59:32] https://lists.wikimedia.org/mailman/listinfo/labs-l [01:00:10] The salient email being: http://lists.wikimedia.org/pipermail/labs-l/2014-March/002160.html [01:08:44] * a930913 panics and breaks the glass. [01:09:03] Coren: "sudo: sorry, a password is required to run sudo" on finish-migration. [01:09:16] a930913: What tool? [01:09:22] Coren: referencebot. 
[01:09:50] Just so I can track down that infrequent bug, were you only recently added as maintainer to that tool? (Or was it created recently)? [01:10:05] Coren: Nope, months ago. [01:10:10] Hm. Odd. [01:10:45] Doomed? [01:10:48] All is lost? [01:13:21] Hardly. All I have to do is poke the maintainer list. :-) You'll have to log off and back on though. [01:13:48] Probably a stupid question, but how do I access my eqiad account once I run the migration script from my tools-login account? [01:13:51] For some reason, /some/ service groups have an outdated maintainer list. Simply adding and removing a maintainer fixes it. [01:14:21] TCN7JM: You can log in from the 'net via tools-login-eqiad.wmflabs.org, or from the old server by simply 'ssh eqiad' [01:15:43] Alright, got it. Thanks. [01:17:18] Coren: So what now? [01:17:36] a930913: ... ? Your issue isn't fixed? [01:18:03] Coren: I'm just being a muppet :p But it should be fixed without me doing anything more, yes? [01:18:18] Hardly. All I have to do is poke the maintainer list. :-) You'll have to log off and back on though. [01:21:13] Coren: LDAP is the authoritative source for service group members? [01:21:40] Ah, wonderful. I guess we find out tomorrow if it explodes. [01:21:51] scfc_de: It is, but right now there are two copies due to the migration between local-foo and project.foo [01:22:04] And for a while, only the local- was being correctly updated. [01:22:35] k [01:31:59] Coren: What was the command to start the lighttpd? [01:32:24] "webservice start"? [01:32:40] * Coren nods. [01:37:36] Coren: I thought log files weren't meant to be copied? [01:37:58] They aren't, by default. [01:38:41] Oh silly me, they're the new ones :p [01:39:00] #2AMproblems [01:42:44] petan: When are the wm-bots migrating? [01:43:11] When one of their maintainers migrates them. 
:-) [01:44:09] I'm running blind until then :p [02:00:26] UDP I said, as it won't break anything, but no, he had to use TCP, so my code now throws a hissy fit when it can't connect. D: [02:47:16] andrewbogott: We fixed the proxy stuff. You can now access drmf.wmflabs.org [02:47:33] Howie_: great! Are you still hitting the timeout, or is that resolved? [02:48:26] hold on let me test [02:48:53] yes that worked! [02:49:00] cool [02:50:39] You can see ... ? http://drmf.wmflabs.org/wiki/Zeta_and_Related_Functions ? [02:53:21] Howie_: Yep, looks good. [02:53:51] If there's a hope of rolling whatever that is into production it'll probably need some performance work :) [02:54:54] andrewbogott: How would that work? [02:55:18] I think it's slow just because of the number of equations which need to be rendered. Do you think it could still be made faster somehow? [02:55:48] Howie_: Um… I have no context for what that page is doing or what it is for. I'd recommend you check in with Ori about performance, he can probably judge whether or not it will be a problem. [02:56:13] Yeah, could be just the large number of equations. [02:56:13] In that case, one way of making this faster would be just to have a smaller number of equations on a given page. This is easily doable. [03:26:37] Hi Coren, still busy with the migration? Remember you installed fastcgi for me a while ago. On eqiad I get "/data/project/zoomviewer/cgi-bin/iipsrv.fcgi: error while loading shared libraries: libfcgi.so.0: cannot open shared object file: No such file or directory" when lighttpd tries to launch my fastcgi program [03:27:07] I think I used to link it statically (but cannot remember how I did that) [03:27:28] is libfcgi missing on the web nodes? [03:27:51] dschwen: It might never have been installed there; if you linked statically you only needed it where you built. 
[03:33:50] yeah [03:34:06] ok, I'll try to get it to link everything statically for now [03:34:50] but there would be less surprises if the lib was actually installed on the web nodes [03:36:24] dschwen: That can be done relatively easily -- please open a bugzilla for tracking. I'll be able to hack through install requests in a few days once most people have migrated. [03:36:42] ok, thx [03:46:57] Coren, I just migrated orgchart.eqiad.wmflabs to eqiad and I can't access it. Can you? [03:47:09] I wonder if I'm making some dumb mistake w/the security group... [03:47:41] lemme check. [03:48:10] thanks [03:49:14] andrewbogott: What project is it in? [03:49:20] 'orgcharts' [03:51:55] Works from tampa. Maybe it doesn't allow two different security groups for the same port/proto and only picks the one? [03:52:25] I put 10.0.0.0/8 in for the other ones, I haven't tried two. [03:52:26] I just added the rule, maybe there's a lag [03:52:32] although I don't know why... [03:52:36] But, ok, I'll try that. [03:52:42] http is working so it's clearly a firewall thing. [03:52:46] thanks! [03:53:03] andrewbogott: In other news, tools migration is proceeding smoothly. [03:53:34] Yeah, so I see! Nice work -- I hope you weren't sleepless all weekend writing those scripts. [03:53:56] Monday was long. :-) [03:54:14] Project migration seems to be going OK. There are going to be a TON of orphan projects. [03:54:23] Or, at least, projects that no one cares about until I break them :) [03:54:43] andrewbogott: It's not like that's a bad thing really. [03:54:54] nope, should be fine. [03:55:48] Coren, good guess. Replacing the rule rather than adding a second solved the problem. [03:57:06] andrewbogott: We probably want to change the default. [03:57:41] I think the default is already correct for new projects. But I'll make sure. [03:59:43] * Coren hasn't checked. [04:36:34] 77 tools migrated; seemingly without issues. Yeay. [04:38:55] 77/how many? 
[04:39:06] soon to be 78 [04:51:52] andrewbogott: 600-odd, I think. Though I expect that many of those will end up orphaned too. [04:52:31] Oh, 77/600 is pretty good! [04:59:03] For a first day, yeah, I'm happy. I was expecting a slower and bumpier ride; but it's fairly smooth sailing to date. [04:59:52] Coren++ [05:09:06] Whee, the database performance on the new Tool Labs is amazing. <3 [05:09:39] because no one is using it yet ;) [05:09:48] give it a few weeks [05:09:57] well, fewer people are using it anyway [05:10:00] Pfft. Let's keep it that way then. :p [05:10:58] But I'd say https://bugzilla.wikimedia.org/show_bug.cgi?id=55929 is resolved now. [05:21:58] Ryan_Lane: I think he's referring to the fact that the DB is now feet away instead of 0.026 light-seconds away. :-) [05:22:07] ah, yeah [05:22:13] that makes a lot of sense ;) [06:01:57] Coren: for some reason directory URLs that don't end with a slash get redirected to http://tools-webgrid-01:4068/. For example, this URL is broken: http://tools.wmflabs.org/pathoschild-contrib/stewardry but this one is fine: http://tools.wmflabs.org/pathoschild-contrib/stewardry/ [06:02:27] Is this a known issue, or something I'm doing wrong, or should I file a bug? [06:41:50] Ryan_Lane: (repeat of email question): When you twiddled wikitech login sessions the other day, what did you do exactly? [06:51:44] Um, crap, got kicked and can't tell if my last message was sent. So, sorry if this is a repeat... [06:51:52] Ryan_Lane, when you twiddled wikitech login sessions the other day, what did you do exactly? 
[07:17:32] andrewbogott: three things [07:18:36] use labswiki; [07:18:46] update user set user_token=null; [07:18:53] truncate openstack_tokens; [07:19:02] (that's in mysql, of course) [07:19:09] * andrewbogott nods [07:19:10] then I purged memcache by restarting it [07:19:38] your session should work past a single browser session, but only if you opt to stay logged in [07:20:10] Well, doesn't sound like anything you did could affect future behavior anyway. [07:20:20] But I'm pretty sure the behavior is different… want to check for yourself? [07:20:32] Maybe I've varied my behavior in some way that I'm unaware of... [07:22:29] I logged in with "Keep me logged in", closed my browser, then re-opened it [07:22:33] still logged in [07:23:30] hm [07:23:34] * andrewbogott tries it yet again [07:23:57] which browser are you using? [07:24:09] I tried in chrome and firefox [07:24:32] ff [07:24:50] yeah, logs me out. I'm sure I ticked the box [07:25:49] ah. indeed. in safari it's not working properly for me [07:26:59] Ryan_Lane: you can look at it if it interests you… if not, then not :) Pretty unlikely that you caused the change, now that I know what you did. [07:27:23] hm. maybe I wasn't supposed to set it to null [07:28:41] yeah, it's not updating my token [07:30:21] in fact, it's nulling the token [07:31:12] my openstack token is being set fine [07:31:18] but my mediawiki token is not [07:35:06] So 'user_token' isn't the current token, it's a template of some sort? [07:35:17] Or is it just that null vs '' is tripping a bug someplace? [07:39:15] andrewbogott: gixed [07:39:16] *fixed [07:39:21] apparently you can't null the tokens [07:39:27] it won't regenerate them [07:39:42] there's a resetUserTokens.php maintenance script [07:39:47] I just ran that [07:39:55] then I logged out and back in with the option selected [07:40:19] good thing there's mediawiki devs around to help (aaron in this case :)) [07:40:44] Ryan_Lane: that's a bit obscure, but, great! thank you. 
[07:41:04] yeah, so in the future the proper way to handle the user tokens is via that script [07:41:14] but the truncate of openstack_tokens is still needed [07:41:16] and the purge of memcache [07:42:12] * andrewbogott looks on wikitech for a sensible place to note that down [07:42:24] hm, but first I have to log in again [07:47:13] https://wikitech.wikimedia.org/wiki/Help:Force_all_users_to_log_in_afresh [07:51:21] you want to restart memcache last [07:51:23] * Ryan_Lane edits [07:51:30] first I have to log in ;) [07:55:38] OK, I'm out for a bit, will be back tomorrow AM. Thanks again for fixing! [08:02:35] !ping [08:02:35] !pong [11:51:55] hmm Coren in eqiad puppet seems to run with 2 errors on a fresh instance [11:52:27] https://www.irccloud.com/pastebin/YYgPDRV8 [11:54:11] oh wait, I realise you're not here yet xD [11:54:47] addshore: puppet might be triggered before /home had an opportunity to be configured/mounted [11:54:53] ah no [11:55:02] the NFS server has no home dir for wikidata :] [11:55:06] labstore.svc.eqiad.wmnet:/project/wikidata-build/home failed, reason given by server: [11:55:07] No such file or directory [11:55:28] ;_; [11:56:24] I assume that should be created on project creation (well before instance creation); could you file a bug? [11:58:57] (Or for legacy projects that were created before eqiad Labs: By the Labs admins :-).) [12:02:03] Will file a bug, is there any way you could poke the dir into existence now so I could carry on migration? ;p [12:04:07] I don't have access to the NFS server; only andrewbogott_afk and Coren probably. [12:04:21] ahh okay :) that's fine! [12:04:31] (And other ops if it is documented :-).) [12:08:27] for your reference https://bugzilla.wikimedia.org/show_bug.cgi?id=62252 [12:12:21] How can I add migrated tools to service groups? They don't exist atm, according to Special:NovaServiceGroup (functionality was broken before anyway, so perhaps best just to wait until pmtpa is RIP?) 
[12:39:08] Hi, is it normal for migrate-tool to delete the public_html folder of the tool? [12:41:02] Second question: is there a "standard" UI framework for the web tools of Tool Labs? I'm using Bootstrap now. [12:49:23] pietrodn: I think it just moves public_html (could be wrong on that though) [12:50:07] Okay, other question, when projects get mothballed, does that mean the instance gets mothballed and can be rebooted? [12:54:24] jarry1250__: yes it will be in a shutdown state but you can always start it up again [12:54:57] addshore: Okay, cool. It's just that I guess I don't need the instance rebooted for a while, but I will want it rebooted at some point :) [12:55:29] should be fine then :) might be worth adding a note to https://wikitech.wikimedia.org/wiki/Labs_Eqiad_Migration/Progress [13:03:25] !migration is https://wikitech.wikimedia.org/wiki/Labs_Eqiad_Migration [13:03:25] This key already exist - remove it, if you want to change it [13:03:30] !migration [13:03:31] https://wikitech.wikimedia.org/wiki/Labs_Eqiad_Migration_Progress [13:03:34] :O [13:25:18] Damianz: Should I be getting the redis string in chunks, or all at once? [13:26:38] 'morning labs. [13:26:45] * Coren reads backscroll. [13:27:56] addshore: The project was created without the 'shared home' and 'shared project dir' option? [13:28:15] * addshore does not recall [13:28:25] Coren: waiting forever in finish-migration [13:29:22] addshore: Manage projects -> configure -> checkboxes at the top. :-) [13:29:36] zhuyifei1999_: Do you have databases on tools-db? [13:29:56] likely yes [13:30:16] That may take some time then, because it requires a dump and restore. [13:30:29] lovely Coren :D so tick those 2 boxes and re run puppet and magic things will whur and things shall be fixed? ;p [13:30:59] coren: it shows "That tool doesn't seem to be migrated yet" [13:31:02] addshore: There's a delay before those get created though; ~5min as a rule. But yeah. 
[13:31:09] awesome :) [13:31:12] zhuyifei1999_: Which tool? [13:31:27] yifeibot [13:31:32] * Coren looks. [13:31:55] zhuyifei1999_: It's still being copied between the two datacenters. [13:32:30] Coren: when will it finish? [13:33:26] That depends only on how much data there is to move. It takes several minutes /GB [13:33:40] Coren: is it possible to make a copy of /shared/pywikipedia/ on eqiad, without losing the other one, so that bots don't break on migration? [13:34:23] Alchimista: Yes. I saw the email. That's not a difficulty and in fact I have to do it since it's not 'part of a tool'. I'm going to do it shortly. [13:35:02] Oh, you have *GOT* to be shitting me. [13:35:09] XFS decides to break *now*? [13:35:22] zhuyifei1999_: That's why it stalled. [13:35:27] thanks Coren, any ETA? i forgot to check if /shared/pywikipedia/ was already on eqiad, so my bots are stopped now [13:35:53] Alchimista: I fix the pmtpa server, then about ~15m later. So within half an hour. [13:36:23] Coren: perfect, that way i don't need to install it locally, and then change all back :D [13:36:28] Lol, wikipedia is going to break because all the bots are going down :p [13:36:32] intersect-contribs: Scheduled for copy Wed Mar 5 12:28:39 UTC 2014 [13:36:41] It's stalled I think [13:36:50] no DBs [13:36:53] XFS broken. Will return shortly. [13:37:12] Does anybody know redis here? [13:37:44] a930913: Yuvipanda is the resident expert. [13:38:08] @notify YuviPanda [13:38:08] This user is now online in #wikimedia-dev. I'll let you know when they show some activity (talk, etc.) [13:38:09] * anomie just looks things up at http://redis.io/commands when necessary [13:38:36] * Coren roars at the pmtpa NFS server. 'Your days are numbered!' [13:38:57] anomie: Either Damianz is doing something weird, or redis is splitting the messages to me :| [13:41:41] @notify andrewbogott_afk [13:41:42] This user is now online in #wikimedia-labs. I'll let you know when they show some activity (talk, etc.) 
[13:42:01] petan: wm-bot on eqiad? [13:42:08] a930913: not yet [13:42:16] I know. [13:42:27] a930913: there is this bug https://bugzilla.wikimedia.org/show_bug.cgi?id=62234 [13:42:32] Your TCP idea means BracketBot is now broken :( [13:42:37] until it's fixed wm-bot doesn't move anywhere :/ [13:42:47] a930913: what do you mean? [13:42:50] what TCP idea [13:43:28] @replag [13:43:28] Replication lag is approximately 00:00:00.6917770 [13:43:44] this one is on eqiad ^ [13:43:48] petan: Sending the, erm, relays? [13:44:05] * Coren power cycles the stupid pmtpa NFS server. [13:44:10] a930913: ok I have no idea what you talk about, what is the context? [13:44:27] petan: @relay I think. [13:44:29] Coren: xfs again ? [13:44:30] so far I know you talk about something related to BracketBot (what is it?) and TCP idea [13:44:43] matanya: Yeah. While in the middle of migration, natch. [13:44:46] a930913: it perfectly works [13:44:57] matanya: We're getting rid of that filesystem in <3 weeks. [13:45:09] petan: Yeah, but it used TCP which means breaking pipes. [13:45:17] Coren: ext4 instead? [13:45:18] Silly emoticons making it hard to type less-than-three. :-) [13:45:23] a930913: I see it's definitely sending in chunks of 8192 bytes of message payload. Do you know where his source is? [13:45:33] a930913: you just need to relay to i-00000816.pmtpa.wmflabs [13:45:36] matanya: Ayup. XFS has gotten really flaky under load in modern kernels. [13:45:46] a930913: what do you mean by "breaking pipes" [13:46:04] petan: UDP doesn't "break" but TCP does. 
[13:46:12] define "break" [13:46:15] Coren: interesting, rhel 7 ships xfs as default [13:46:40] while deb based sticked to ext for years [13:46:41] a930913: the difference between TCP and UDP is that UDP is simple data stream with no verification [13:46:55] so it's not so reliable, but faster [13:46:58] i wonder why is it this way [13:47:01] good for video streaming etc [13:47:23] not good for any kind of client / server communication where data are never supposed to be malformed [13:47:32] sending data to IRC using UDP is not a good idea [13:48:10] a930913: however, neither TCP nor UDP performs any "break" or whatever you mean [13:48:41] petan: "IOError: [Errno 32] Broken pipe" [13:48:46] pmtpa NFS on its way back up. [13:49:08] a930913: what were you trying to do? [13:49:44] zhuyifei1999_: migrations will resume as soon at it finishes starting up. [13:51:14] anomie: https://github.com/DamianZaremba/cluebotng/commit/35f3768b1a461fddf93a3923bf22a6f1bfd05323 [13:51:20] Coren: is there an outage? [13:51:30] Betacommand: XFS went down [13:51:42] Betacommand: Yeah, pmtpa NFS server needed reboot. It's almost completely back up now. [13:51:43] pietrodn: ah [13:51:46] petan: Send to wm-bot. [13:51:56] a930913: ok but how [13:52:06] a930913: from which server to which server [13:52:52] !newweb [13:52:52] https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Help/NewWeb [13:53:23] petan: eqiad grid to lab-bots and then to the one you mentioned above. [13:53:46] NFS restarted. Things will unclog shortly. [13:53:55] a930913: sending to 10.4.0.81 100% works [13:54:03] a930913: that is what I do now [13:54:27] @help [13:54:27] I am running http://meta.wikimedia.org/wiki/WM-Bot version wikimedia bot v. 2.0.0.4 my source code is licensed under GPL and located at https://github.com/benapetr/wikimedia-bot I will be very happy if you fix my bugs or implement new features [13:55:05] Is it good or not to embed jQuery code from an external CDN (Google, …) on Tool Labs? 
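On the "Broken pipe" point above: writing to a TCP socket whose peer has gone away raises an error, whereas a UDP send to an unreachable listener generally does not report one. A sender that talks TCP therefore needs to handle the error and reconnect. The following is a minimal sketch of that idea, not a930913's or petan's actual code. The class name ReconnectingSender and the injectable connect parameter are made up for illustration, and newline-terminated messages are an assumption.

```python
import socket


class ReconnectingSender:
    """Send lines over TCP, reconnecting once if the connection has gone away."""

    def __init__(self, host, port, connect=socket.create_connection):
        self.address = (host, port)
        self._connect = connect  # injectable so the class can be exercised offline
        self._sock = None

    def send(self, line):
        """Return True if the line was handed to the socket, False otherwise."""
        payload = line.encode("utf-8") + b"\n"  # newline framing is an assumption
        for _attempt in range(2):
            try:
                if self._sock is None:
                    self._sock = self._connect(self.address)
                self._sock.sendall(payload)
                return True
            except OSError:  # covers BrokenPipeError and ConnectionRefusedError
                self._close()
        return False

    def _close(self):
        """Drop the current socket so the next attempt opens a fresh one."""
        if self._sock is not None:
            try:
                self._sock.close()
            except OSError:
                pass
            self._sock = None
```

A caller would construct it with the relay's host and port and call send() for each message; a False return means the message was dropped rather than the whole program failing.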
[13:55:25] ok, maybe not 100% :P [13:55:29] now it just stopped working [13:56:16] zhuyifei1999_: Migration copies restarted. Sadly, the order is pretty much random so you ended up at some random spot in the queue. Sorry. :-( But the queue is pretty short. :-) [13:56:34] coren ... labs seems to be down [13:56:36] I can SSH now, good [13:56:41] a930913: Well, I've determined it's something in his code. I just sent a message (manually) between two session with 9000 bytes of payload. [13:56:49] GerardM-: XFS in pmtpa died again; it just restarted. [13:57:39] thanks [13:57:43] !log tools petrb: test [13:57:47] Logged the message, Master [13:57:55] pietrodn: use http://tools.wmflabs.org/static/ [13:57:56] a930913: yes you are correct [13:58:04] a930913: that port is firewalled out for whatever reasons [13:58:25] Coren: /shared/pywikipedia/ is actually currently linked to the pywikibot project, but I don't have the time to fix it now (at a conference this week, and holidays next week). If you could just make a static copy that would be awesome [13:58:41] Coren: why is port 64834 firewalled pmtpa - eqiad [13:58:43] valhallasw`cloud: Will do. [13:58:44] sitic: didn't know of that, thank you! [13:58:56] Coren: thanks! [13:59:12] petan: (And there I was, thinking I was going mad. :p ) [13:59:12] a930913: I will eventually change the port... that is probably only solution now [13:59:29] petan: Random traffic isn't allowed between datacenters; only within projects in the same DV. [13:59:29] I don't expect any ops to fix it nor explain it [13:59:32] DC* [13:59:34] Coren: is there a way to access the status page for eqiad? [13:59:36] Well, the database on eqiad is just FAST! 
[13:59:49] Betacommand: Yes, through http://tools-eqiad.wmflabs.org/ [13:59:54] Coren: but another port works just fine [14:00:03] Coren: only this one doesn't [14:00:12] I mean maybe all in this high range [14:00:21] but 5xxxx ports are fine [14:00:39] a930913: If I had to guess, I'd guess it's his socket listener is reading buffers of 8192 bytes and then blindly publishing them, rather than trying to accumulate full records before publishing in redis. [14:02:28] Aha! It looks like the actual XFS problem lives in yifeibot! [14:02:37] (Or at least "one of") [14:03:08] A tool is managing to break XFS? [14:03:39] That's plain evil XD [14:03:40] anomie: Well, no, the XFS bug is broken somewhere in that part of the filesystem. It's obviously not the tools itself. :-) [14:03:50] s/broken/triggered/ [14:04:55] Ah, so there's some sort of corruption in the FS, and it just so happens that the corrupted file/block/whatever is accessed by that tool. [14:05:32] anomie: I don't think there's actual corruption; xfs_repair never sees anything wrong. But there's something around there that makes the driver think so. [14:05:32] a930913: fixed! [14:06:06] Coren: after the migration my tool runs well with the OLD sql credentials… is that ok? [14:06:18] a930913: wait, maybe no :o [14:06:50] pietrodn: No credentials were remove yet; but you want to switch to the new ones before migration is complete. [14:06:59] a930913: now it's fixed! [14:07:05] a930913: same port [14:07:35] a930913: the IP address will however change once wm-bot is in eqiad [14:08:39] Holey carps! How many millions of files are there in that directory?! [14:09:38] petan: \o/ The best part is that it autoconnected. Which means I must have done something right when making it :p [14:12:09] Coren, ping [14:12:30] !ask [14:12:30] Hi, how can we help you? Just ask your question. [14:12:30] a930913: autoconnected? :) [14:13:47] Cyberpower678: ping [14:14:03] petan: As soon as the port opened, it connected by itself. 
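anomie's guess at 14:00:39 is that the relay publishes each 8192-byte socket read as-is instead of accumulating full records first. A minimal sketch of the accumulate-then-publish idea follows. It assumes records are newline-delimited, which the log does not state, and the class name RecordBuffer is made up for illustration.

```python
class RecordBuffer:
    """Accumulate raw socket reads and release only complete records."""

    def __init__(self, delimiter=b"\n"):
        self.delimiter = delimiter  # assumed record separator
        self._pending = b""

    def feed(self, chunk):
        """Add one read's worth of bytes; return the records it completes."""
        self._pending += chunk
        # Everything before the last delimiter is complete; the tail is kept
        # until a later read supplies the rest of that record.
        *complete, self._pending = self._pending.split(self.delimiter)
        return complete


if __name__ == "__main__":
    buf = RecordBuffer()
    # A record split across two reads is only released once its end arrives.
    print(buf.feed(b'{"page": "A"}\n{"pa'))
    print(buf.feed(b'ge": "B"}\n'))
```

In a relay, each record returned by feed() would be published as one redis message, so subscribers never see a record cut at a read boundary.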
[14:14:10] aha ok [14:18:24] BracketBot lives \o/ [14:20:02] Before the migration was announced, I could access my instances without a problem. After the migration was announced, my instances are no longer accessible from bastion with a reported error of Permission denied (publickey). Could this be related or is it just a coincidence? [14:21:51] * a930913 goes off to uni. [14:26:22] slevinski: It shouldn't be related; the announcement didn't actually scare the instances as far as I know. :-) What instances/project? [14:26:35] slevinski: It's probably just gluster being broken again. [14:27:14] the signwriting project, instance i-0000070c and i-00000322 [14:27:34] Coren: virt10 is out of disk space ( 44 GB left ) [14:27:46] DISK CRITICAL - free space: /var/lib/nova/instances 44146 MB (3% inode=99%): [14:28:18] hashar: They're *all* out of disk space. Not caring, they're getting decommissioned soon and instances are being migrated off. :-) [14:28:28] ok ok :D [14:28:30]