[00:00:02] hedonil1, you there? [00:06:05] CP678: found something. [00:06:20] Ooh. Lemme hear it. :-) [00:06:34] CP678: g.raphael.js line 788 ff [00:06:39] CP678: http://pastebin.com/8UvVFkCU [00:07:06] CP678: FF is entering the while loop (text.push ...) [00:07:17] CP678: Chrome doesn't [00:08:00] That while loop controls the positioning of the text on the Y axis doesn't it? [00:08:21] So the question is, what the fuck is causing it? [00:08:28] CP678: the loop fills the array [00:08:56] It can't be a fault of the library, as it works perfectly on a different tool. :/ [00:15:24] CP678: the var conditions for entering the loop are different [00:16:03] hedonil1, you likely have a better debugger than I do so what's causing it? [00:16:33] CP678: FF: y=209,2 Y=209,2 length=191 (all values positive) [00:16:47] Ok. [00:17:18] CP678: Chrome: y=-302,8 Y=-302,8 length=-312 (all values negative) [00:17:24] WTF? [00:18:12] That would explain why it is running off of the screen. [00:18:19] Hold on. [00:19:09] CP678: even if I didn't check the values, at least (Y >= y - length) will then be different (subtraction of negative value) [00:19:35] I've removed the hacky fix. [00:19:50] Can you look at it again and tell me what software you're using? [00:20:15] hedonil1, ^ [00:20:36] CP678: ok, now it enters the loop [00:21:00] But are the charts aligned now? [00:21:07] They are likely not. [00:21:38] CP678: at least the values appear now, but unaligned [00:22:28] And that is the problem I've been trying to fix. Why do they align differently in different browsers, yet operate normally in a different tool? :/ [00:36:54] hedonil1, do you see what's causing it? [00:37:10] 'Cause I've been searching forever for it. [00:47:57] hedonil1, still looking. Stepping through the script code. :/ [01:47:21] Ah, finally, incoming mail should now work correctly. [01:56:56] hedonil1, any luck? I'm stepping through the js scripts. 
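The loop-entry condition quoted in the chat, (Y >= y - length), can be checked against the two sets of values reported for Firefox and Chrome. This is only a sketch in Python: the function name enters_loop is made up for illustration, the decimal commas from the chat are written as decimal points, and the real g.raphael.js loop may test more than this one comparison.

```python
def enters_loop(Y, y, length):
    """Evaluate the single comparison quoted in the chat: Y >= y - length.

    Hypothetical helper; it models only that one expression, not the
    actual while loop in g.raphael.js.
    """
    return Y >= y - length


# Values reported for Firefox: everything positive.
firefox = enters_loop(Y=209.2, y=209.2, length=191)

# Values reported for Chrome: everything negative, so "y - length"
# subtracts a negative number and ends up above Y.
chrome = enters_loop(Y=-302.8, y=-302.8, length=-312)

print("Firefox enters loop:", firefox)
print("Chrome enters loop:", chrome)
```

With a negative length the right-hand side becomes roughly 9.2, which Y = -302.8 can never reach, so the comparison fails in Chrome and holds in Firefox, matching what was observed.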
[02:06:31] * hedonil1 switched from coffee to vodka+redbull, has fun debugging and tries some more mniutex [02:06:40] *minutes ;) [02:07:14] until duty calls [02:40:43] andrewbogott_afk: FYI, roughly 50% of tools instances got their puppet.conf munged by the 500 grabbing certificate name. [11:05:04] * Beetstra softly pokes Coren [12:08:30] Beetstra: Take a look in /public/backups/tools-db [12:08:32] !log ganglia Removed IP 208.80.155.168 from aggregator instance, now use the labs proxy. [12:08:34] Logged the message, Master [12:09:15] !log ganglia releasing IP address 208.80.155.168 from the project [12:09:18] Logged the message, Master [12:09:53] Coren: -rw------- 1 tools.linkwatcher root 518223420 Apr 1 17:36 p50380g50621__linkwatcher.sql.gz [12:11:56] There ya go. You can restore it (and rename it at the same time) with something like: [12:13:22] zcat /public/backups/tools-db/p50380g50621__linkwatcher.sql.gz|sed -e s/p50380g50621_/sXXXXX_/|mysql --defaults-file=replica.my.cnf -h tools-db [12:13:34] where sXXXXX is your new db username. [12:18:35] Coren - that is running now, cheers [12:22:59] I'm still not clear on why it didn't get copied through the automated migration, but at least it was saved from the grave. :-) [12:23:48] I wonder how long this will take [12:26:15] hello [12:26:45] Ah, I wasn't going mad with missing dbs. [12:27:13] Coren: may I get the parsoid user created in LDAP please ?:-] [12:27:30] ... oh d'oh! I didn't push that did I? [12:27:36] * Coren goes do it now. [12:28:57] will let you solve https://bugzilla.wikimedia.org/show_bug.cgi?id=63329 :D [12:35:29] hashar: uid=605(parsoid) gid=605(parsoid) groups=605(parsoid) [12:35:54] Coren: rocks! [12:36:20] !log deployment-prep Manually deleting parsoid user/group on deployment-parsoid04. Will use the LDAP uid/gid instead. 
[12:36:23] Logged the message, Master [12:38:37] and the next crazy question is whether coren ever attempted to resize an image in nova :-D [12:38:44] that is apparently possible: http://docs.openstack.org/user-guide/content/nova_cli_resize.html [12:38:54] aka change the profile of an instance from m1.small to m1.large :-] [12:39:14] I have not, and last time Ryan attempted to do so it resulted in the instance being destroyed. :-) [12:39:36] sounds an acceptable risk [12:39:43] :-D [12:40:08] I'm in the middle of something, I'll try it for you in a little bit if you want? [12:40:30] But if everything is in puppet, wouldn't it be easier to just blow it up and rebuild it anyways? [12:42:50] that is for ganglia, I wanted to avoid having to update all the configuration files with the new IP address / instance hostname. [12:43:01] not urgent, ping me whenever you are done with whatever you do :] [12:50:07] !log deployment-prep restarted parsoid daemon on deployment-parsoid04.eqiad.wmflabs. It also now log to /data/project/parsoid/parsoid.log [12:50:10] Logged the message, Master [13:05:14] /shared permissions in tool labs seem to totally messed up [13:05:19] +be [13:17:03] gifti: In what way? [13:17:31] you cannot create directories and not access the mediawiki directory [13:17:55] create what directory? And what mediawiki directory? [13:18:04] * Coren needs a bit more detail than this. [13:18:08] and where is the pagecount data that is described in the help? [13:18:20] create: any directory [13:18:33] /shared/mediawiki [13:19:05] Ah, indeed. Odd, I wonder what happened to mediawiki. /me fixes it. [13:20:59] gifti: /public/dumps/pagecounts-raw is the answer to your other question. [13:31:00] Coren: so, would you have to ask for a directory to be created under /shared? [13:32:09] and i wonder where the mediawiki core is to be found [13:49:10] gifti: Yes, but you don't need a bugzilla for it. Any project admin can do so trivially. 
[13:49:42] gifti: As for the core, I don't know -- YuviPanda created the mediawiki directory. [13:53:43] And of course /shared isn't something magical. Any directory can be made read-/writeable. [13:54:18] scfc_de: ... and now that I consider it, I see no reason to not just make it a+w,+t [13:55:25] * Coren did so. [13:57:06] Is there a policy on fetching external webpages? [13:57:42] a930913: Not directly; what did you have in mind? [13:58:55] Coren: I had/have/etc. a script that checks whether URLs can be made relative by checking the difference between the HTTP and HTTPS version. [13:59:54] a930913: I see no reason why that's a problem so long as you have a reasonable rate limit (or you're only fetching pages from our projects) [14:00:29] Coren: Yeah, what's your definition of reasonable? [14:01:12] No more than a few pages per second at the most. Also, if you wanted to be really smart about it you could probably use HEAD rather than GET [14:01:33] The point is "don't hammer other websites". :-) [14:01:39] At the moment, it checks each URL in parallel, so if I did use it repeatedly I could enforce gaps between. [14:01:57] Coren: I thought as much, just wanted to make sure there were no hard limits. [14:02:24] Coren: HEADs just mean the page exists, not that it's the same though. [14:02:48] I'm being smarter than just testing for existence. [14:04:12] a930913: There are lots of goodies in the headers you could use to verify; including content-length and timestamps. But yeah, it's not an issue so long as you make sure you can't accidentally slashdot a site. [14:04:50] I think any website that delivers different content to anonymous users depending on the protocol is broken. [14:05:03] you could also just hash the header itself and compare [14:05:10] if it's truly meant to be identical [14:05:46] scfc_de: Yeah, but one protocol could be under construction or something. [14:08:55] cia.gov is a 301 to the https version. 
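The suggestions in this exchange (use HEAD instead of GET, compare headers such as content-length and timestamps, hash them, and keep a gap between requests) can be put together roughly as follows. This is a sketch under assumptions, not a930913's actual script: the function names, the choice of ETag as a third header, and the one-second default delay are all made up for illustration.

```python
import hashlib
import time
import urllib.request

# Headers compared in this sketch. The chat names content-length and
# timestamps; including ETag is an extra assumption.
COMPARED_HEADERS = ("content-length", "last-modified", "etag")


def header_fingerprint(headers):
    """Hash the selected response headers into one hex digest."""
    lowered = {name.lower(): value.strip() for name, value in headers.items()}
    parts = [f"{name}={lowered.get(name, '')}" for name in COMPARED_HEADERS]
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()


def looks_identical(http_headers, https_headers):
    """True if both header sets carry the same values for the compared headers.

    If neither response has any of the compared headers there is nothing
    to go on, so the answer is False rather than a false positive.
    """
    names = {n.lower() for n in http_headers} | {n.lower() for n in https_headers}
    if not names & set(COMPARED_HEADERS):
        return False
    return header_fingerprint(http_headers) == header_fingerprint(https_headers)


def head(url, timeout=10):
    """Send a HEAD request; return (final URL, headers as a dict).

    urlopen follows redirects, so a 301 from one protocol to the other
    shows up as a final URL that differs from the requested one.
    """
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.geturl(), dict(response.headers.items())


def check_hosts(hosts, delay=1.0):
    """Check hosts one at a time, sleeping between them to avoid hammering."""
    for host in hosts:
        http_url, http_headers = head(f"http://{host}/")
        https_url, https_headers = head(f"https://{host}/")
        yield host, looks_identical(http_headers, https_headers), http_url, https_url
        time.sleep(delay)
```

A header match is only a hint that the two versions are the same page; it says nothing for sites that omit these headers, and redirect cases like the cia.gov one need the final URL checked separately.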
[14:09:55] a930913: That's a worthwhile case to check for. [14:09:57] a930913: Yes, and IMVHO I don't think that it is our obligation to spider the whole web to detect and work around those fringe cases. [14:10:09] !log deployment-prep Fixed database updating job https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/ . It was not running on the proper node. [14:10:12] Logged the message, Master [14:13:03] And https://www.bbc.co.uk is a 301 to the http version... [14:28:21] hoi, Magnus indicates to me that there are likely issues at labs [14:28:31] many of his tools do not operate properly [15:09:11] GerardM-: That's vague. Nobody else reports issues; what is he seeing? [15:09:40] a query like this one does not run [15:09:41] http://tools.wmflabs.org/wikidata-todo/autolist.html?mode=and&cat_name=2014%20deaths&cat_lang=en&cat_project=wikipedia&cat_depth=1&q=claim%5B31%2C5%5D%20and%20noclaim%5B570%2C4294967295%5D [15:09:47] it did [15:12:40] it is "waiting" [15:14:01] Since I have no idea what it's trying to do, nor does it give any indication of what might be wrong, it's a little hard (read: impossible) for me to know what issues it might have. [15:17:07] It might be DB related. I see catscan2 is holding tons of locks. Again. [15:21:28] GerardM-: The problem is that magnus's catscan tool holds locks on databases, sometimes for hours. [15:22:15] I need to have a talk with him and find better solutions. [15:24:08] I note queries that do not depend on a category work fine, so that's almost certainly it. [15:38:40] Coren: https://bugzilla.wikimedia.org/show_bug.cgi?id=63421 [15:40:24] GerardM-: Commented there. [15:40:34] thanks [15:41:53] Coren: Looking at that, port* seems to emit "Argument "\nD^P5" isn't numeric in subroutine entry at /usr/lib/perl/5.14/Socket.pm line 260, line 1." to error.log. [16:03:12] Coren: jmail seems to be running stuff as debian-exim. Is that by design? [16:27:19] coren.. it is working again .. 
[17:09:19] I gave ssh username@bastion.wmflabs.org -L 8080:your-instance:80 as given in https://wikitech.wikimedia.org/wiki/Access#Accessing_services_using_port_forwarding and the connection ended me in bastion, but not into my instance [17:10:07] I was trying to access the MediaWiki running on my instance box1, on my browser, which is not happening [17:10:23] I end up at tonythomas01@bastion1 [17:10:36] tonythomas: you can go ahead and set up a proxy for your instance, that'll be much easier. Just visit the 'manage web proxies' link in the sidebar [17:10:50] Um… presuming you want web access. Am I understand correctly? [17:11:06] andrewbogott: yeah ! will I need to have additional permissions ? [17:11:24] Probably need to be a project admin... [17:11:29] what project is this? [17:12:03] hi Coren [17:12:10] andrewbogott: Its a project to implement VERP for MW [17:12:18] Coren: I'm in a meeting for 2 hours (ugh) but I can help finish up the proxy after [17:12:29] tonythomas: I mean, what is the name of the project? On wikitech? [17:12:43] YuviPanda: Cool beans. I'm just back from lunch. [17:12:48] andrewbogott: mediawiki-verp [17:12:53] Coren: sweet :) [17:13:31] tonythomas: looks like you're a project admin, it should just work. [17:13:41] andrewbogott: ok. I will try to add the proxy [17:17:21] tonythomas: Of course that will mean that the whole world can access port 80, not just you. [17:17:26] Usually people are fine with that. [17:18:01] !log deloyment-prep testing https://gerrit.wikimedia.org/r/#/c/121436/ against beta's Elasticsearch servers [17:18:02] deloyment-prep is not a valid project. [17:18:34] andrewbogott: oh! ok. I got the public IP as http://208.80.155.156/ a 404 page now though [17:20:09] tonythomas: that's the right IP. If you're getting a 404 that's probably because your instance is actually producing a 404 :) [17:20:16] You can test by doing a local wget on the instance... 
[17:21:42] andrewbogott: ok [17:23:04] tonythomas: Since proxies are all in the same namespace, I'd encourage you to use a proxy name that's actually meaningful and identifies your project. [17:23:33] tonythomas: really, the same goes for instance names as well. [17:23:50] andrewbogott: ok. I will try to get that changed in the earliest. [17:24:41] tonythomas: your instance is also firewalled off from port 80, so nothing can access it (including the proxy). https://wikitech.wikimedia.org/wiki/Special:NovaSecurityGroup [17:25:30] andrewbogott: The 404 at http://208.80.155.156/ is the proxy not finding a Host: header it has in its list? [17:25:55] scfc_de: first of all… you can't visit the IP because that's the IP of the proxy machine (which proxies a million different addresses.) [17:26:07] You need to use the actual url, which in your case is http://box1.wmflabs.org/ [17:26:13] And, also, note the firewall issue [17:26:31] So each webproxied hostname gets its own IP? [17:27:20] andrewbogott: I will try to get that fixed [17:27:20] scfc_de: no, that's the opposite of what I'm trying to say :( [17:27:20] The /proxy/ has a fixed ip. [17:27:20] It looks at the URL to determine how to relay requests. [17:28:16] andrewbogott: Okay, so you second my guess that the 404 at http://208.80.155.156/ comes that the client provides no Host: header that the proxy has in its list of proxied hosts, and thus says "not found"? [17:29:03] Oh, English language. Let's say: support my guess. [17:29:14] scfc_de: the proxy has a db of servers that it proxies for. None of them are named '208.80.155.156', hence, 404. [17:29:31] andrewbogott: Okay, then we're on the same page. [17:49:22] I’m looking for someone familiar to the web server setup on Tool Labs. volunteers? :-) [17:51:58] ireas: Do you mean the lighttpd server run by individual tools or the proxy setup surrounding them? 
[17:52:24] scfc_de, I’m not quite sure yet ^^ probably the proxy [17:52:53] short version of the problem: Java 7 denies to connect to tools.wmflabs.org via HTTPS because the Tools web server does not accept "tools.wmflabs.org" as its host name [17:52:58] (→ extended SNI) [17:54:39] !log deployment-prep done installing plugins on Elasticsearch in beta [17:54:42] Logged the message, Master [17:54:51] ireas: From outside? [17:55:08] ireas: It seems likely that your java client needs to add the RapidSSL CA cert to it's local trust store. [17:55:24] bd808: No, that would be our job. [17:56:17] scfc_de: Ah. Sure if the request is originating from a tool. [17:57:50] bd808: No; webservers shouldn't require the installation of client-side certificates. tools.wmflabs.org uses (at least in the past) a run-of-the-mill certificate that should work in any browser. [17:58:30] https://www.ssllabs.com/ssltest/analyze.html?d=tools.wmflabs.org looks alright to me. ireas, still there? [18:04:25] scfc_de, sorry, I was afk [18:05:45] scfc_de, hm, strange. I’m receiving ‘handshake alert: unrecognized_name’ from the server. see http://bugs.java.com/bugdatabase/view_bug.do?bug_id=7177232 for a detailed description of the problem [18:06:20] (a workaround is to disable the SNI checking; but it would be better if the web server behaves correctly, if possible) [18:08:07] ireas: Do you have some source code for testing? [18:10:27] If I "openssl s_client -servername tim-landscheidt.de -connect tools.wmflabs.org:443", the connection isn't refused by the webserver, so the problem doesn't seem to lie in the webserver's configuration. [18:11:27] I stand corrected: It fails on Toolserver. Hmmm. [18:11:41] scfc_de, Java is one of only a few clients that close the connection after a unrecognized_name error, though this seems to be the behaviour required by the standard [18:11:53] scfc_de, preparing some source code … [18:12:43] ireas: Don't worry, openssl shows the error just fine, I think. 
[18:14:27] Coren: did you see my note on jmail? [18:15:04] valhallasw: Not yet. [18:15:33] valhallasw: How... dumb. [18:15:53] valhallasw: Do you have a reporting link I can use to yell at Yahoo? [18:17:46] Coren: err? [18:18:04] Yahoo? [18:24:00] ireas: https://bugzilla.wikimedia.org/show_bug.cgi?id=63435 [18:24:42] scfc_de, great, thanks! :-) [18:25:36] http://korma.wmflabs.org/ is not loading. Is this part of a wider Labs problem? [18:25:43] valhallasw: Hm. That wasn't /your/ note. [18:25:51] Coren: oh, that's not what I meant. I meant jmail running as the exim user. [18:25:53] qgil: Doesn't look like it is. [18:26:02] Wait what? [18:26:04] hm [18:26:10] valhallasw: That's a bug. [18:26:13] Coren: ok. [18:26:13] * Coren fixes the bug. [18:44:54] Coren: available in about 15 mins? I'll be out of meetings then :) [18:45:01] Coren: saw my patch yesterday with the init script? [18:46:22] YuviPanda: I probably did, but it's not on my radar atm. linky? (And yes, I did see it) [18:46:29] Coren: yeah moment [18:46:32] (Also, 15m is okay) [18:46:46] Coren: https://gerrit.wikimedia.org/r/#/c/123112/ [18:51:57] valhallasw: Should be fixed. I like it when a bug is easy to fix like that. :-) [18:52:51] Coren: How can I get the 80 port open for my instance mediawiki-verp ? [18:53:07] tonythomas: You need to setup a proxy for it. [18:53:49] Coren: I was trying about tunneling via ssh [18:53:54] that should work ? [18:54:17] tonythomas: Well, as long as your security rules allow incoming port 80 in the first place. You might want to check that. [18:54:29] I gave ssh tonythomas01@bastion.wmflabs.org -L 8080:box1.eqiad.wmflabs:80 [18:54:40] and the connection ends me in tonythomas01@bastion [18:56:12] fixed [18:56:13] sigh [18:56:15] so many gmail filters [18:56:22] sorry, wrong channel, I do apologize [18:59:59] Coren: cool. Seems to be working now. [19:01:32] tonythomas: Well yeah, that's what you asked for. 
:-) Your local port 8080 should be forwarding though, while the connection is up. [19:03:04] Coren: ok. opening up the 80 and 443 ports [19:10:33] Coren: I added the port 80 rule as per https://wikitech.wikimedia.org/wiki/Help:Security_Groups#Examples and I get this error [19:10:36] Failed to add rule. [19:11:12] Make sure you don't already have a port 80 rule, and that you do not include a group (there's a pulldown, you should leave it at no value) [19:12:29] Coren: I think I added 'default' as the group [19:12:40] correcting [19:16:48] Coren: it worked ;) strange that the configuration never went under default, even though I selected default from the drop down [19:16:53] bug ? [19:18:02] No, unless it's a /documentation/ bug; that dropdown isn't for picking where a rule goes, but for "inclusion" of groups in groups. [19:21:50] Coren, http://korma.wmflabs.org must be down because of the migration, I guess. See https://bugzilla.wikimedia.org/show_bug.cgi?id=63441 [19:21:57] Is there anything we should do? [19:24:57] I can understand that tool maintainers miss the notices of the migration, but as a project owner ... [19:27:21] qgil: Yeah, it'll need TLC from Andrew, but it's not an overly complicated maneuvre. But honestly, project owners have no excuse -- the migration was announced *months* in advance, in every channel we could think of *and* email were sent to every project owner. [19:28:26] Coren, yep, the problem is that I project owner assumed that all project members where receiving the email as well. But yeah, I'm not blaming anyone else than ourselves. :) [19:29:25] Wait, that one's yours? :-P [19:29:51] Honestly, it's usually easier/cleaner to just spin a new instance up and reapply puppet. Everything /was/ in puppet, right? [19:30:07] scfc_de, I am the project owner nominally, but the actual admins in charge of the instance are another team not following the MediaWiki / wikitech channels. I got the emails, I thought they were receiving them as well. 
[19:30:56] Coren, no idea, I'm just "a customer". [19:40:20] YuviPanda: lighttpd 22369 tools.admin 4u IPv4 254363000 0t0 TCP 10.68.16.28:35394->10.68.16.53:8282 (ESTABLISHED) [19:40:24] So it's holding it open at least. [19:44:40] qgil: I meant the technical admins, sorry if that wasn't clear. IMHO if you install servers in an environment, you want to make sure that you get notified about anything on the horizon that may affect you. [19:46:26] scfc_de, yes, the "problem" is that someone like me succeeded creating the Labs instance (yay for Labs documentation and ease of use) even if the regular sysadmin work would be performed by someone else. I did that because, at the time, we had a project before we had a sysadmin confirmed. :) [19:49:13] qgil: I'll catch up with you in a few minutes… do you want to just rebuilt your instances, or do you need them revived? [19:53:35] Coren: hey! [19:53:40] hi andrewbogott if our sysadmin can access to the instance, then they will be able to rebuild whatever is needed [19:53:41] Coren: sorry, back on now. [19:54:03] qgil: ok, so connection is open. let me look at redis [19:54:10] qgil: what instance, what project? [19:54:29] Coren: hmm, nothing there. I wonder if I can see logs from proxylistener? [19:54:40] hmm, let me start it locally. [19:54:48] andrewbogott, YuviPanda https://bugzilla.wikimedia.org/show_bug.cgi?id=63441 [19:54:51] Want me to restart the lighttpd then? [19:54:58] * Coren will be using csbot's for that. [19:56:05] Coren: no am trying to run proxylistener interactively [19:56:06] so moment [19:56:16] Coren, what tool was it that you used to magically edit ldap records in place last month? [19:56:25] * andrewbogott wonders if it's easier than the ldapmodify rigamarole I usually use [19:56:38] Coren: ok, try now? [19:57:16] andrewbogott: ldapvi [19:57:33] andrewbogott: Easy++. Check the history on virt1000, there's one there. [19:57:57] Not scriptable, but full of easy for one-off fixes. 
[19:58:15] YuviPanda: Should be working. [19:58:57] hmm [19:59:26] Coren: looking at nginx logs... moment [19:59:51] Action? [yYqQvVebB*rsf+?] <- unix ftw! [20:00:08] Coren: I see it's 404 too. [20:00:32] Coren: hmm, redis is still empty. [20:00:48] YuviPanda: Lemme confirm what I send with a quick tcpdump [20:00:56] Coren: ok [20:02:39] Ah, I get "Identd authentication failed. Please contact an administrator" [20:02:46] * Coren contacts YuviPanda. [20:02:51] Coren: aaah. [20:02:55] hmm, let me find logs [20:04:54] Coren: is there somewhere the output of stdout for a init script is saved? [20:05:21] YuviPanda: No; but that should end up on the server console. [20:05:36] Coren: hmm, let me live hack to log to file [20:07:36] Coren: ok hit it again? [20:07:36] webgrid has security group execnode, so it let should port 113 pass. [20:09:13] YuviPanda: Ah, I have a different issue as well: I'm sending a newline between the hostname and the : [20:09:18] Coren: hmm, 'INFO:root:Identd auth failed, sent 59558, 8282 got back 59558 , 8282 : ERROR : NO-USER' [20:09:28] But that shouldn't give me "Identd authentication failed. Please contact an administrator" should it? [20:09:43] Coren: no, identd checking is done before it reads anything [20:09:56] Coren: identd on the machine says nothing for those ports [20:10:06] That's odd. [20:10:09] Coren: I wonder if identd is setup right on the servers. I only testd it locally on my machine [20:10:40] YuviPanda: I know it worked well in the past, but I haven't checked past migration. [20:10:47] Coren: ah, hmm. Maybe that's an issue? [20:10:53] * Coren tests. [20:11:54] YuviPanda: Hm, I think your /query/ is wrong because I definitely don't have the same ports here. [20:12:26] * Coren was on 35394 [20:12:49] Coren: uh, tha'ts weird [20:13:00] needs moar tests. 
[20:13:29] yeah [20:13:32] * YuviPanda considers [20:14:08] 22,45609 [20:14:08] 22 , 45609 : USERID : UNIX , UTF-8 :root [20:14:16] WFM [20:14:55] Coren: hmm, I'm looking at proxylistener now [20:16:07] * Coren tests another. [20:17:25] * Coren rages at someone running a bot on tools-login. Again. [20:18:35] heh [20:19:34] 47629 , 22 : USERID : UNIX , UTF-8 :tools.csbot [20:19:38] Works for tools too. [20:19:47] hmm [20:20:26] the client_address variable seems wrong [20:22:29] Hm. I dunno about what your python tools give you, but you remember that ports are in NBO right? [20:23:11] Coren: hmm, I don't know what type it is giving me, and I also do not know if it is handling it for me [20:25:54] Coren: try now? [20:27:22] How's that? [20:27:49] Coren: weird, that didn't hit it at all [20:28:21] Coren: oh, nevermind [20:28:22] Coren: INFO:root:Identd auth failed, sent 31194, 8282 got back 31194 , 8282 : ERROR : NO-USER [20:28:23] hmm [20:28:33] Coren: was that the right port that was opened? [20:28:51] nc has --source-port, if that is useful for testing. [20:28:58] 56850 this time [20:29:18] Wait, or 55929 [20:29:28] Clearly, neither is 31194 [20:29:55] indeed [20:30:13] Aha! but you _are_ bytesex inverted [20:30:29] 55929 -> 0xda79 [20:30:40] 31194 is after a ntohs [20:30:43] 31194 -> 0x79da [20:30:55] so in both cases client_address told me the same thing [20:30:59] Ah, so you already have the right port in the right order. This time. [20:31:15] * Coren is confused. [20:31:17] yaeh [20:31:20] let me get rid of that now [20:31:23] and see what happens [20:31:35] Coren: I just got a 'INFO:root:Identd auth failed, sent 4830, 8282 got back 4830 , 8282 : ERROR : NO-USER' [20:31:57] YuviPanda: You seem to be lagging, that's the other port I have you (I had accidentally started twice) [20:32:02] gave* [20:32:15] Coren: aah, might just be my tail being slow [20:32:30] Coren: right. so let me get rid of the ntohs [20:33:17] Coren: try now? 
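The two port numbers compared in this exchange, 55929 and 31194, differ only in byte order. A small sketch, assuming nothing about the proxylistener code itself (swap16 is a hypothetical helper), shows that reversing the two bytes of one gives the other, which is what an extra ntohs does on a little-endian host:

```python
import struct


def swap16(value):
    """Reverse the two bytes of a 16-bit unsigned integer.

    Packs the value big-endian and reads it back little-endian, so the
    result does not depend on the byte order of the machine running it.
    """
    (swapped,) = struct.unpack("<H", struct.pack(">H", value))
    return swapped


real_port = 55929            # port seen on the client side in the chat
print(hex(real_port))        # 0xda79
print(swap16(real_port))     # 31194
print(hex(swap16(real_port)))  # 0x79da
```

If the library already hands back the port in host order, applying ntohs on top swaps it a second time, and identd is then asked about a port pair that does not exist.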
[20:35:03] Coren or YuviPanda, does either of you have any understanding of how smw is supposed to work? I'm baffled by the query results I'm getting for labs stats. [20:35:08] https://wikitech.wikimedia.org/wiki/Special:Ask/-5B-5BResource-20Type::instance-5D-5D/-3FInstance-20Name/-3FInstance-20Type/-3FProject/-3FImage-20Id/-3FFQDN/-3FLaunch-20Time/-3FPuppet-20Class/-3FModification-20date/-3FInstance-20Host/-3FNumber-20of-20CPUs/-3FRAM-20Size/-3FAmount-20of-20Storage/searchlabel%3Dinstances/offset%3D0 <- includes many deleted pages! [20:38:27] YuviPanda: Luck? [20:38:54] Coren: my file logger is laggy, waiting for it to appear [20:38:57] andrewbogott: Not more than 'how to look up a property' [20:39:04] andrewbogott: I think SMW got out of sync some time ago. Is there some maintenaince script to reparse that. [20:39:20] YuviPanda: I'm not convinced it's your file logger; tcpdump shows you haven't sent a response at all. [20:39:28] scfc_de: I can't tell if you're telling me or asking me...? [20:39:31] Coren: oh, hmm. [20:39:53] Coren: did tcpdump show an incoming connection to identd? [20:40:00] andrewbogott: Asking :-). Otherwise I'd have a filename handy. [20:40:01] I wasn't watching for that. [20:40:06] Lemme try again. [20:40:08] ok [20:40:19] INFO:root:Identd auth failed, sent 60996, 8282 got back 60996 , 8282 : ERROR : NO-USER [20:40:22] Ah, and your daemon gives me a response as I close the socket. Amusing. [20:40:22] I see just now [20:40:44] scfc_de: What would 'out of sync' mean? It has a cache someplace rather than looking in the actual wiki db? [20:40:52] YuviPanda: That's the right port, at least... [20:40:56] Wait. [20:41:23] hmm, Coren maybe add a newline to end of message? [20:41:36] There's one. [20:42:01] andrewbogott: That's my assumption (wiki page save => update DB). I've asked on #semantic-mediawiki just now, maybe they know more. 
[20:42:28] Coren: so , 'INFO:root:Identd auth failed, sent 60996, 8282 got back 60996 , 8282 : ERROR : NO-USER' has right port pair but... got wrong response from identd? [20:42:37] * YuviPanda is confused [20:42:59] YuviPanda: I don't get it, everytime I test identd it works. [20:43:15] Lemme try it now from the proxy box. [20:43:23] ok [20:43:51] 40628,8282 [20:43:52] 40628 , 8282 : USERID : UNIX , UTF-8 :tools.csbot [20:44:07] o.O [20:44:20] ... [20:44:35] ah hjmmm [20:44:40] I've a space afte rmy , [20:44:45] *after [20:44:52] don't think that should affect things [20:44:54] echo '40628,8282' | nc tools-webgrid-01 ident [20:44:54] 40628 , 8282 : USERID : UNIX , UTF-8 :tools.csbot [20:45:06] Nope. Also works. [20:45:11] (with a space) [20:45:18] right [20:45:57] But also, you're not sending me a response immediately anymore; so perhaps it only tries after I close the socket (which would fail) [20:46:24] Coren: right but you *are* getting the response so the socket can't close beofre then? [20:46:34] It's half-closed. [20:46:57] It's already been torn down; I get lingering packets after you got my RST [20:47:09] ah, hmm [20:47:49] How many newlines are you expecting? I'm sending you exactly two: [20:48:42] 2f 2e 2a 0a 74 6f 6f 6c 73 2d 77 65 62 67 72 69 64 2d 30 31 3a 34 30 30 30 0a <-- exact payload [20:49:41] That is: /.*\ntools-webgrid-01:4000\n [20:49:44] Coren: two in total, right? [20:49:45] yeah [20:49:52] that should be what I am getting, *I think* [20:50:16] Well, that's what tcpdump shows you get. :-) [20:50:58] Coren: ah, I mean that is what I should be expecting, I think. [20:51:18] Coren: I bet it's just something massively but trivially wrong with my socket code. [20:51:21] Hm. It didn't use to stall like this when I had the extraneous newline [20:51:51] Coren: hmm, add an extraneous newline? [20:52:12] About to try just that [20:53:03] YuviPanda: Wait, your daemon is down. [20:53:05] BTW, as you're messing with identd as well, that uses CRLF IIRC. 
So you might want to use that for the proxy granter as well, or put a huge warning sign on the code so you don't get confused :-). [20:53:06] SYN -> RST [20:53:37] Coren: try again. [20:53:47] scfc_de: The actual implementation gleefuly accepts just NL, at least. [20:54:04] Coren: But sends out CRLF, doesn't it? [20:54:13] It should. [20:54:22] scfc_de: identd response is fine here, I guess [20:54:36] YuviPanda: Nope, same behaviour -- I see my request go out but you're not answering it. [20:54:55] YuviPanda: I think you broke your code by accident. Revert to known good? [20:55:03] Coren: hmm, yes [20:55:42] Say when. [20:55:58] Coren: reverted [20:56:05] * YuviPanda has no logs tho [20:56:36] * Damianz wonders why his bot is running twice hmm [20:58:02] YuviPanda: dafu? I get the same behaviour. I connect, send the request, then nothing. And you don't do an identd. [20:58:08] ... [20:58:11] wat [20:58:31] Coren: ok, I think I'm too sleepy to debug right now :( I'll write some tests for it tonight and see what's up. [20:58:40] Coren: how do I test new portgranter? [20:58:44] Coren: just have it locally and run it? [20:58:48] Perhaps that's what the issue has always been, just didn't notice the sequence. The reason identd fails is because your proxy never tries until I tear the connection down (by which time it's too late) [20:59:11] YuviPanda: It's already running in general -- you just need to start a webservice and it will try to use your proxy. [20:59:18] ah, hmm [20:59:19] Coren: ok [20:59:36] Coren: yeah, needs a bit more measured approach. I'll look at it tomorrow? [20:59:40] Sure. [20:59:45] Coren: This might also be a consequence of the threaded server, etc. [20:59:56] I'll take a debugger to it and see what's happening [21:00:28] thanks for the help, Coren! :)
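The ident exchanges pasted in this session follow one simple line format: a port pair goes out, and either a USERID or an ERROR line comes back. The sketch below builds such a query and parses the two reply shapes seen in the log. It is not the proxylistener code; the function names are hypothetical, and the CRLF terminator follows scfc_de's remark about the protocol.

```python
def build_ident_query(port_on_ident_host, port_on_querying_host):
    """Build the query line: the port on the machine running identd first,
    then the port on the machine asking, terminated with CRLF."""
    return f"{port_on_ident_host},{port_on_querying_host}\r\n".encode("ascii")


def parse_ident_reply(line):
    """Parse one reply line into ((port, port), user).

    user is None when identd answers with an ERROR line such as NO-USER.
    """
    line = line.rstrip("\r\n")
    ports_field, kind, rest = line.split(":", 2)
    first, second = (int(part) for part in ports_field.split(","))
    if kind.strip() != "USERID":
        return (first, second), None
    # rest looks like " UNIX , UTF-8 :tools.csbot"; the user name is
    # everything after the next colon.
    _opsys, user = rest.split(":", 1)
    return (first, second), user


print(parse_ident_reply("40628 , 8282 : USERID : UNIX , UTF-8 :tools.csbot\r\n"))
print(parse_ident_reply("60996 , 8282 : ERROR : NO-USER\r\n"))
```

Timing matters as much as format here: as noted in the chat, the query has to be sent while the client connection is still open, because once the socket is torn down identd no longer has an owner to report and answers NO-USER.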