[00:05:08] Coren: lighttpd? :D [00:05:14] why lighttpd? :) [00:05:42] because it's really, really lightweight so there is little cost to starting a bunch of them. [00:05:57] Works very well too. [00:12:23] * Ryan_Lane shrugs [00:12:26] lighter than nginx? :) [00:12:48] (wouldn't surprise me) [00:12:54] sounds good, though [00:13:10] it's much better than running everyone on the exact same web server [00:13:19] Coren: you're running them all on different ports? [00:13:54] so, this uses the grid to schedule a lighttpd on a node? [00:13:56] Yes, sent as a job to the gridengine (on a dedicated queue). [00:14:07] how are you routing to it? [00:14:15] yuviproxy? [00:14:44] No, but it'll be adaptable for it. Right now, the startup script registers itself in a flat file the webproxy uses for its rewriterules. [00:14:51] ah, ok [00:15:01] yuviproxy should make that easier :) [00:15:04] It should be trivial to tweak that to register with the yuviproxy instead. [00:15:30] this sounds like a much saner approach [00:15:37] it's still possible for one to take down an entire node, though [00:15:47] maybe launch the lighttpd in a cgroup? [00:16:02] this is definitely an improvement, though :) [00:16:07] and more scalable [00:16:37] you could probably also make subqueues for more expensive ones [00:24:15] Coren: +1 good work :) [00:27:06] so, how can I get my exterior IP address then? [00:27:50] it turns out I will need to know what my outfacing ip address is because i must use it as part of the token/hash protocol that the xISBN API uses [00:28:08] sorry, to be clear this is for a tools project [00:34:18] notconfusing: is this a web tool or bot? [00:34:33] for bots there's no way to know what the public IP is [00:34:56] it seems you can hack it with checkip.dyndns.org and REs [00:35:05] * Ryan_Lane shudders [00:35:12] that's one way to do it, yeah [00:35:15] but it can change [00:35:22] so i [00:35:32] 'll just recheck ever so often [00:35:44] every time your bot launches [00:35:55] once your bot is running it's stable [00:36:01] when your bot starts it could be somewhere else [00:36:13] Ryan_Lane, thanks, that's good to know [00:36:22] perfect so it won't add to the runtime then [00:36:23] hm. I wonder if there's a way we can list the IP somewhere [00:58:33] Ryan_Lane: Couldn't they be found buried somewhere in LDAP? [01:08:45] Coren: only kind of [01:08:59] Coren: you'd need to know the public hostname [01:09:11] and if you had that, you could just do a reverse lookup in dns [01:09:15] err [01:09:18] a lookup in dns [01:09:21] Ah, you can't find it from looking at the association to the instance? [01:09:27] that isn't in ldap [01:09:30] that's in openstack [01:09:41] only dns and puppet is in ldap, [01:31:36] I'm getting a "Cannot allocate memory" error :/ [01:32:29] https://gist.github.com/theopolisme/c604cb45dac86c219163 [01:34:33] Segmentation fault... [01:34:39] Any ideas? [04:42:43] [bz] (NEW - created by: Krinkle, priority: Unprioritized - normal) [Bug 55547] tools: Disable the default no-op output buffer - https://bugzilla.wikimedia.org/show_bug.cgi?id=55547 [04:50:57] YuviPanda|sick: you there? [05:16:09] !log cvn Moving CVNBot14 from cvn@willow.toolserver.org to cvn-app1.wmflabs [05:16:10] Logged the message, Master [08:42:33] Hi all [08:42:39] i have a Q [08:43:14] Why do I face "recieving incomplete xml data.sleeping for x seconds" when I want to run my bot on labs?
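For readers following the exterior-IP thread above, here is a minimal sketch of the approach Ryan_Lane and notconfusing settle on: fetch checkip.dyndns.org once when the bot launches, pull the address out with a regular expression, and cache it for the life of the process (per the discussion, the address is stable while the bot runs but may differ on the next launch). Python 2 style, matching the default interpreter on the grid hosts of that era; the exact markup returned by checkip.dyndns.org is an assumption, and the xISBN token/hash step itself is not shown.

```python
# Sketch of the "checkip.dyndns.org and REs" idea from the discussion above.
# The response markup of checkip.dyndns.org is an assumption here.
import re
import urllib2

_CACHED_IP = None

def public_ip():
    """Return the outward-facing IP address, fetched once per bot run."""
    global _CACHED_IP
    if _CACHED_IP is None:
        page = urllib2.urlopen('http://checkip.dyndns.org/', timeout=10).read()
        match = re.search(r'(\d{1,3}(?:\.\d{1,3}){3})', page)
        if match is None:
            raise RuntimeError('could not parse an IP address from the page')
        _CACHED_IP = match.group(1)
    return _CACHED_IP

# The cached value can then feed whatever token/hash the xISBN API expects.
```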
[09:25:51] [bz] (RESOLVED - created by: bgwhite, priority: Immediate - critical) [Bug 55498] Webserver is down - https://bugzilla.wikimedia.org/show_bug.cgi?id=55498 [09:49:50] !log tools tools-webserver-01 is getting a 500 Internal Server Error again [09:49:54] Logged the message, Master [09:50:30] petan: around? [09:54:49] or Coren, although I feel it may be a little early :) [09:58:01] hello [09:58:24] how do I download a dump of zh-yuewiki? [09:59:08] qwebirc1035095: http://dumps.wikimedia.org/zh_yuewiki/ [09:59:37] I mean on labs [09:59:45] to my tool name space [10:01:59] qwebirc1035095: /public/datasets/public/zh_yuewiki [10:02:09] Thanks! [10:02:19] np [10:45:14] i [10:54:19] [bz] (NEW - created by: Magnus Manske, priority: Unprioritized - blocker) [Bug 55556] Extremely/unusually slow SQL on categorylinks table - https://bugzilla.wikimedia.org/show_bug.cgi?id=55556 [11:29:25] I am having problems sshing into some labs instances that I could ssh into before. [11:29:32] They are in the analytics project [11:29:49] So I can no longer log in to for example: kraken-namenode-standby [11:29:55] https://wikitech.wikimedia.org/wiki/Nova_Resource:I-00000862 [11:30:20] But I still can log into some other instances of the analytics project (e.g.: limn0) [11:31:00] The error message I get when trying to log in on kraken-namenode-standby is "Permission denied (publickey)." [11:46:37] Output of ssh -v [...] is at http://dpaste.com/1412029/plain/ [11:59:25] legoktm: You have queries on revision WHERE rev_user=; you realize that this will do full table scans every time (= very very long) unless you use revision_userindex instead? [12:34:47] https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Help/NewWeb [13:11:20] [bz] (RESOLVED - created by: Dirk Beetstra, priority: Highest - blocker) [Bug 54690] Tools-exec instances are not running the same software installs - https://bugzilla.wikimedia.org/show_bug.cgi?id=54690 [13:11:21] [bz] (RESOLVED - created by: bgwhite, priority: Immediate - critical) [Bug 55498] tools-webserver-01 is down - https://bugzilla.wikimedia.org/show_bug.cgi?id=55498 [13:12:49] [bz] (RESOLVED - created by: Liangent, priority: Normal - major) [Bug 54934] Wikimedia Labs database replication has seemingly stopped (s1 and s2?) - https://bugzilla.wikimedia.org/show_bug.cgi?id=54934 [13:14:54] [bz] (RESOLVED - created by: Robin Krahl, priority: Normal - normal) [Bug 54107] *_userindex tables need to be documented - https://bugzilla.wikimedia.org/show_bug.cgi?id=54107 [13:15:12] [bz] (RESOLVED - created by: Sumana Harihareswara, priority: Unprioritized - normal) [Bug 54700] Add "Labs seems slow, what do I do?" to Help doc - https://bugzilla.wikimedia.org/show_bug.cgi?id=54700 [13:17:01] [bz] (RESOLVED - created by: Liangent, priority: Unprioritized - normal) [Bug 54451] User databases on s3 were lost (in the outage?) - https://bugzilla.wikimedia.org/show_bug.cgi?id=54451 [13:18:51] [bz] (NEW - created by: Betacommand, priority: Unprioritized - normal) [Bug 54133] Web-servers do not treat .htaccess consistently - https://bugzilla.wikimedia.org/show_bug.cgi?id=54133 [13:20:42] Coren: https://bugzilla.wikimedia.org/show_bug.cgi?id=54451 - why can't they be backed up? [13:21:24] Because we don't do that, and most data *is* ephemeral. Users are, OTOH, welcome to dump any important data they want at interval.
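valhallasw's hint to legoktm about revision_userindex is worth spelling out, since it comes up often (Bug 54107 above is about documenting it). Roughly, the plain revision view on the Labs replicas blanks the user fields for suppressed revisions, which prevents the user index from being used, so a WHERE rev_user = ... filter scans the whole table; the _userindex variant keeps those columns indexable. A rough sketch, assuming the usual Tool Labs conventions (credentials in ~/replica.my.cnf, the enwiki_p database on enwiki.labsdb) and the MySQLdb driver:

```python
# Minimal sketch: filter revisions by user via revision_userindex instead of
# revision, so the replica can use the user index rather than scanning.
import os
import MySQLdb  # python-mysqldb, as commonly used on Tool Labs

conn = MySQLdb.connect(
    host='enwiki.labsdb',    # assumed replica host alias
    db='enwiki_p',
    read_default_file=os.path.expanduser('~/replica.my.cnf'),
)
cur = conn.cursor()
# Same WHERE clause shape as the one flagged above, different view.
cur.execute(
    'SELECT rev_id, rev_timestamp '
    'FROM revision_userindex '
    'WHERE rev_user = %s '
    'ORDER BY rev_timestamp DESC LIMIT 50',
    (12345,),  # placeholder user ID
)
for rev_id, rev_timestamp in cur.fetchall():
    print rev_id, rev_timestamp
```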
[13:23:41] [bz] (RESOLVED - created by: Jesse PW (Pathoschild), priority: Unprioritized - normal) [Bug 53668] Some replicated databases are missing tables - https://bugzilla.wikimedia.org/show_bug.cgi?id=53668 [13:24:23] Coren: is a mysqldump command in crontab good enough? [13:25:32] liangent: It should, although if you wanted to be /really/ cool and the data is of general value, pushing the result to git would be (a) extra redundantly delicious and (b) possibly useful to others as well. [13:28:12] Coren: is there a reason not to run lighttpd as the user? then the user could just use jkill to kill the server [13:28:33] Coren: and how is the server started? automatically when an HTTP request is made? [13:30:19] valhallasw: (a) There's a couple, mostly needed to prevent some attack vectors, and (b) it's started if you've got a configuration file (although, once that becomes the default, it will be 'if you have a config file or a non-empty public_html') [13:31:09] ah, OK [13:31:34] Coren: no. data there are my bot execution states [13:32:10] liangent: So then just a mysqldump at interval and you're all set. [13:32:59] Coren: but what caused the loss? [13:33:04] on s3 [13:33:48] liangent: The actual database went exploody; we had to rebuild it entirely. [13:34:21] Coren: including user databases? [13:34:28] liangent: It's the same DB. [13:34:49] s/DB/cluster in mysql parlance/ [13:35:20] In other words, the entire slice was broken. [13:35:39] Coren: ok [13:52:58] Yup. That new webservice thing is going to end up working fine. [13:58:32] [bz] (NEW - created by: Johannes Kroll (WMDE), priority: Unprioritized - normal) [Bug 55562] 'take' subfolder bug - https://bugzilla.wikimedia.org/show_bug.cgi?id=55562 [14:03:34] Coren: Hmm. How do I tell if my tool is actually being served from the new webgrid thing or if it's still being served by apache? [14:04:01] anomie: Easy way: the format of the access.log is subtly different. [14:04:38] Also, 'qstat -u root -q webgrid' will show you if your server is running. [14:05:10] Coren: Well, either it's extremely subtle or oauth-hello-world is still being served from apache even though I see a job for it for webgrid. [14:05:27] I shall check. :-) [14:06:09] anomie: Ah, no, it keeps failing to work and restarts in a loop. (It shouldn't do that) [14:06:24] No wait, I just lied. :-) [14:06:31] Coren, anomie: see also http://tools.wmflabs.org/?status under tools-webgrid-01 [14:06:58] which is slightly easier than remembering qstat's invocation in my experience [14:07:08] * valhallasw likes how ?status just automagically works [14:07:11] anomie: I'm seeing it as up and serving requests. [14:08:09] anomie: (!) [14:08:33] You started a server for *anomiebot* not oauth-hello-world [14:08:34] :-) [14:08:42] Coren: I started one for both ;) [14:09:08] Well, you have stuff in your error.log [14:09:24] Coren: I think I see the problem. https://tools.wmflabs.org/oauth-hello-world/foo.php goes to apache, while http://tools.wmflabs.org/oauth-hello-world/foo.php goes to lighttpd [14:09:46] Oh, DOH! [14:10:23] fix't [14:10:49] * Coren pretends that worked all along and he just didn't forget to enable it there. [14:14:56] Coren: Hmm. What's the replacement for $_SERVER['SCRIPT_NAME'] in lighttpd/FCGI? [14:15:33] ... I honestly don't know. [14:15:42] anomie: phpinfo() might help for that [14:16:54] valhallasw: Hmm. SCRIPT_NAME is coming through but without the "/oauth-hello-world/" prefix.
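Coren's backup advice to liangent (a periodic mysqldump, ideally with the result pushed to git) fits in one small script run from crontab. A sketch only: the database name, host, and dump directory below are placeholders, it assumes ~/backups is already a git checkout, and it reads credentials from ~/replica.my.cnf as is usual for tools:

```python
# Sketch: dump a tool's user database at interval and commit the dump to git,
# per the backup advice above. Host, database, and paths are placeholders.
import os
import subprocess

HOME = os.path.expanduser('~')
DB_HOST = 'tools-db'        # assumed host of the tool's user database
DB_NAME = 's12345__mybot'   # placeholder database name
REPO = os.path.join(HOME, 'backups')        # assumed to be a git checkout
DUMP_PATH = os.path.join(REPO, DB_NAME + '.sql')

def dump_and_commit():
    with open(DUMP_PATH, 'w') as out:
        subprocess.check_call(
            ['mysqldump',
             '--defaults-file=' + os.path.join(HOME, 'replica.my.cnf'),
             '-h', DB_HOST, DB_NAME],
            stdout=out)
    subprocess.check_call(['git', 'add', DUMP_PATH], cwd=REPO)
    subprocess.check_call(
        ['git', 'commit', '-m', 'periodic backup', '--allow-empty'], cwd=REPO)

if __name__ == '__main__':
    dump_and_commit()   # run this from crontab (or jsub) at whatever interval
```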
[14:17:04] anomie: That'd be in PATH_INFO [14:17:16] PATH_INFO has no value [14:17:20] Hm. [14:17:52] phpinfo() will show you everything the server throws at you; lemme try something. [14:19:32] Ah, oh, doh. That's unavoidable; unlike with apache the setup with lighttpd /does/ give you the documentroot. [14:19:48] * anomie suspects that http://tools.wmflabs.org/oauth-hello-world/foo.php is effectively being forwarded to something like http://tools-webgrid-01:98765/foo.php [14:20:11] It is. [14:20:16] * Coren wonders if there is a workaround. [14:20:27] Coren: you can add a header to the request [14:20:58] X-REQUEST-URL, or something like that. [14:21:37] Actually, now that I think of it, there is no overriding reason to not simply put the documentroot one level higher and keep the path the same. [14:21:56] It's not an issue because the proxy will hardcode the component anyways. [14:22:31] This way things will remain compatible regardless of the scheme in use. [14:22:41] * Coren does just that. [14:22:50] * anomie \o/ [14:23:09] Coren: btw, why doesn't the request just get forwarded to the web server? then it just receives the full 'http://tools.wmflabs.org/etc/' url [14:24:05] Ohwait. That won't work. [14:24:07] * Coren ponders. [14:25:31] Ah, yes, easily done after all. [14:28:36] * Coren restarts the webservers. [14:32:48] Hm. That didn't quite work. [14:37:40] Ah, typo. Restarting. [14:41:05] There we go. [14:42:04] That should have SCRIPT_NAME and SCRIPT_URL consistent with running under apache now. [14:42:10] * Coren needs to restart one last time. [14:43:02] anomie: ^^ does it work for you? [14:43:28] Coren: It looks like it does, yes [14:44:00] It needed a bit of alias trickery to keep the documentroot correct, but it should now work with consistent semantics to apache's [15:55:38] I got a race condition I'm not getting. :-( [15:58:51] gah. Has anyone been able to get mw.o's oauth working in python? [15:59:14] apparently it's nonstandard enough to bring the standard flask-oauth library to tears [16:38:53] [bz] (NEW - created by: MZMcBride, priority: Unprioritized - normal) [Bug 55567] gerrit-stats (gerrit-stats.wmflabs.org) appears to be broken - https://bugzilla.wikimedia.org/show_bug.cgi?id=55567 [16:41:30] Coren: Oh. I'll try and figure out which script is doing that... [16:41:56] Oh, it's working now. [16:42:10] NFS file locking semantics are... not useful for what I was trying to do. [17:02:35] [bz] (ASSIGNED - created by: Tim Landscheidt, priority: Unprioritized - normal) [Bug 52976] Provide user_slot resource in grid - https://bugzilla.wikimedia.org/show_bug.cgi?id=52976 [17:03:43] [bz] (RESOLVED - created by: MZMcBride, priority: Unprioritized - normal) [Bug 53640] "links" MariaDB database view is broken on Wikimedia Labs - https://bugzilla.wikimedia.org/show_bug.cgi?id=53640 [17:06:39] [bz] (NEW - created by: Tim Landscheidt, priority: Unprioritized - trivial) [Bug 48625] Provide namespace IDs and names in the databases similar to toolserver.namespace - https://bugzilla.wikimedia.org/show_bug.cgi?id=48625 [17:08:07] [bz] (RESOLVED - created by: Daniel Schwen, priority: Unprioritized - blocker) [Bug 52944] Please enable mod_fastcgi support - https://bugzilla.wikimedia.org/show_bug.cgi?id=52944 [17:11:29] [bz] (RESOLVED - created by: Tim Landscheidt, priority: Low - normal) [Bug 48105] "sudo chown ..."
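valhallasw's question about getting mediawiki.org's OAuth working from Python is answered later in the log (the patch uploader announced around 21:06 uses it). For anyone stuck at the same point, a hand-rolled handshake with requests-oauthlib looks roughly like the sketch below. The consumer key/secret are placeholders and the URLs assume the standard Special:OAuth endpoints; note also that MediaWiki's identify step returns a signed JWT rather than plain JSON, which is part of what trips up generic helpers such as flask-oauth.

```python
# Sketch of a manual OAuth 1.0a handshake against mediawiki.org with
# requests-oauthlib. Consumer key/secret are placeholders; the endpoint URLs
# assume the standard Special:OAuth paths.
from requests_oauthlib import OAuth1Session

CONSUMER_KEY = 'your-consumer-key'        # placeholder
CONSUMER_SECRET = 'your-consumer-secret'  # placeholder
MW_INDEX = 'https://www.mediawiki.org/w/index.php'

oauth = OAuth1Session(CONSUMER_KEY,
                      client_secret=CONSUMER_SECRET,
                      callback_uri='oob')

# 1. Fetch a request token.
oauth.fetch_request_token(MW_INDEX + '?title=Special:OAuth/initiate')

# 2. Send the user to the authorization page; they come back with a verifier.
print oauth.authorization_url(MW_INDEX + '?title=Special:OAuth/authorize')
verifier = raw_input('verifier code: ')

# 3. Trade request token + verifier for an access token.
tokens = oauth.fetch_access_token(MW_INDEX + '?title=Special:OAuth/token',
                                  verifier=verifier)
print tokens  # oauth_token / oauth_token_secret for signing API requests
```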
asks for password which doesn't exist - https://bugzilla.wikimedia.org/show_bug.cgi?id=48105 [17:37:28] Coren: If you have a minute, https://gerrit.wikimedia.org/r/#/c/89023/ [17:40:25] anomie: merged [17:40:36] Coren: Thanks! [18:11:20] * Coren considers giving users quotas. 100K should do it. [18:13:19] in which units? [18:15:11] bytes. :-) I'm just groaning because a rsync of the shared filesystem is... lots of stuff. [18:18:17] Ryan_Lane: Did you migrate projects or is this all switches to NFS the users made themselves? [18:18:46] themselves [18:19:04] I'm waiting for it to be stabel [18:19:07] *stable [18:20:44] Ryan_Lane: Looks like you won't have much work to do; I'm seeing actual homes for at least 123 projects [18:20:51] Issues on tools-login - no response [18:21:16] hedonil1: Planned maintenance in progress; this was announced on labs-l and is in the channel topic. :-) [18:21:58] ok then 8-) [18:28:16] What does "couple of hours" mean in detail 1, 2, 10 20? [18:28:36] Ryan_Lane: we're organizing an event in November targeting researchers / data hackers. What's the current processing time for new tool labs requests? [18:28:47] [bz] (RESOLVED - created by: Dereckson, priority: Unprioritized - normal) [Bug 53793] Users with a former SVN account not migrated can't create an account - https://bugzilla.wikimedia.org/show_bug.cgi?id=53793 [18:29:29] hedonil1: I'm hoping less than 1, but we might be stuck around 2. [18:29:56] DarTar: depends on who is online, really. About 4-5 people can approve requests... [18:30:15] DarTar: so if you've one of those people online.... Pretty instantaneous. [18:30:25] DarTar: If we have a heads' up and we expect it, pretty much instant. [18:30:43] Coren: Thanks. [18:30:44] k cool, I don't know how many external people are planning to attend, but we should start advertising soon [18:31:20] DarTar: Otherwise, the list tends to be look at about every day or so. [18:31:37] looked* [18:31:49] will give you the heads up once we make the announcement, thanks [18:31:58] (With occasional bouts of "but I thought someone else was doing it") :-) [18:32:14] Anybody care to explain? [18:33:04] !newlabs [18:33:04] This is labs. It's another version of toolserver. Because people wanted it just like toolserver, an effort was made to create an almost identical environment. Now users can enjoy replication, similar commands, and bear the burden of instabilities, just like Toolserver. [18:34:20] Coren: oh. wait. I think I had rsync'd them a while aho [18:34:22] *ago [18:34:28] because I was going to move everyone [18:34:34] then it started having instability issues [18:34:56] Ah, well at least it's going to be a fairly swift rsync since nothing would have changed. [18:35:01] yeah [18:35:07] that hasn't changed in ages [18:44:28] Coren, why is labs not working? [18:45:12] petan, ^ [18:45:29] did you read the mailing list? [18:45:31] Cyberpower678: scheduled maintenance [18:45:34] also /topic [18:45:49] I didn't get an email. [18:46:05] see labs-l [18:47:28] Oh wait. It's coming in now as well as a flood of other labs-l messages. WTF? [18:47:37] WTF? [18:47:43] >:( [18:47:53] It just keeps coming, [18:48:42] WTF is catfoot. [18:49:00] I've got zillions of messages about that. [18:49:22] That's not why I subscribed to the list, to be spammed like that. [18:49:54] * YuviPanda takes labs-l to arbcom [18:51:18] * Cyberpower678 backs YuviPanda and also takes Coren to ArbCom. *oh the irony [18:51:21] three, two, one, FIGHT! [18:51:26] that's not irony!
[18:52:03] YuviPanda, Coren used to be a member of ArbCom if I recall correctly. Now he's being taken to it. [18:52:08] hi labs people! [18:52:13] who can help qchris fix his labs stuff? [18:52:15] he can't access some nodes [18:52:15] Cyberpower678: :) I know [18:52:17] we aren't sure why [18:52:20] If that's not ironic [18:52:23] what is [18:52:24] ottomata: NFS maintenance now, perhaps that [18:52:30] its been a day or two? [18:52:43] oh [18:52:48] probably not :) [18:53:15] Ok so I'll try in a few hours when NFS maintenance is over. [18:53:17] Thanks. [18:53:31] Cyberpower678: https://en.wikipedia.org/wiki/File:Stop_Defacing_Signs.jpg is ironic. [18:54:17] There's a catch 22 here. [18:55:10] In order to tell people to stop defacing stop signs, you need to deface a stop sign. But defacing a stop sign is going against the sign. [18:56:20] * Cyberpower678 hacks YuviPanda  [18:56:33] * YuviPanda takes away Cyberpower678's node [18:56:48] * Cyberpower678 wasn't aware he had one. [18:58:00] * Coren looks at the not-fast rsync with a bit of concern. [18:58:06] * Cyberpower678 digitializes YuviPanda  [18:58:51] Digitilization complete. Digital YuviPanda is 4 PB big. [18:59:12] * Cyberpower678 creates 5 MB chunks and uploads YuviPanda to Wikipedia. [19:00:12] YuviPanda, what's it like there? [19:03:27] Coren, eta? [19:04:22] Cyberpower678: Unknown. There seems to have been tools that ran since last night that created gigs of output that now has to be rsynced while the filesystem is off. :-( [19:04:57] Umm... Would that be me? [19:05:12] I know some log files of mine are couple gigs big. [19:06:15] Coren, eta on the system being switched back on? [19:11:38] I've just answered that Cyberpower678 [19:11:41] beta labs is 503 right now? [19:12:07] chrismcmahon: As announced on labs-l some time ago, and today, and on the channel topic: planned maintenance. :-) [19:12:18] thanks Coren. I are slow. [19:17:59] While waiting: https://commons.wikimedia.org/w/index.php?title=File%3ABeethoven%2C_Sonata_No._8_in_C_Minor_Pathetique%2C_Op._13_-_II._Adagio_cantabile.ogg [19:19:24] Coren, I thought that was an eta on labs being operational again, not when the new hardware gets turned on to start resyncing. [19:23:40] Cyberpower678: One requires the other. [19:24:13] * Coren stares at rsync, hoping to scare it into being done faster. [19:25:23] * Cyberpower678 sends a stare through XChat to rsync alongside Coren's [19:25:38] >:| [19:25:41] >:| [19:25:43] >:| [19:26:48] The whole process is, unsurprisingly, IO-bound [19:26:51] * chrismcmahon cheers on rsync from Utah [19:27:07] go rsync go. go rsync go... [19:27:17] Coren, don't tell me you're pushing enter all the time. [19:28:05] Pikachu! I choose you!!. [19:28:14] Use thunderbolt on rsync [19:29:37] Hedonil sends digital german beer to strenghthen rsync's virility [19:29:48] my ssh to tools-dev is down, any known issue? [19:29:57] Danny_B: NFS maintenance. see labs-l [19:30:00] Danny_B, scrollback [19:30:22] sigh, again no notifications :-/ [19:30:35] Danny_B, but there were [19:31:02] did not get any [19:36:15] Connection closed by remote host when trying to ssh to tools-login? [19:36:50] * Krenair should pay more attention to his emails [19:37:12] Danny_B: The first notification was two weeks ago; then a reschedule for this week (the leak shuffled everyone's schedules), then a warning this morning before it started, then a notice when it started. [19:37:49] Danny_B: I'm trying hard, but you guys have *got* to read labs-l. 
[19:40:17] Coren: I actually just skim labs-l but istr reading it on Yet Another List. I just spaced out that it was RIGHT NOW. [19:40:58] You know, the delay is all the users' faults. If only they didn't keep doing things like writing to files, and all! [19:40:59] :-) [19:41:23] yeah, this job would be great if it weren't for the users. and the programmers. [19:48:25] chrismcmahon: and the ops people [19:53:48] Coren: last time we talked about this you guys promised you will be sending those server notifications (i don't know the proper term, sorry) - some message which pops up on you in your ssh window [19:55:37] Danny_B: That I *did* forget. Mea culpa. [19:56:24] Funnily enough, it's deployment-prep and not tools that has the heaviest catchup to do. [19:56:33] happens... np... just pls try to not forget again in future ;-) [19:57:51] Coren: deployment-prep stores a ton of useless crap in its filesystems [19:58:02] and I have doubts they ever garbage collect their repos [19:58:25] Ryan_Lane: The problem is that rsync doesn't have a "meh, not worth copying" discriminator module. :-) [19:58:30] yeah [19:58:32] I know :( [19:58:36] it totally should [20:01:11] andrewbogott: were you still working on the OpenStackManager interface to invisible unicorn? [20:01:42] Ryan_Lane: https://gerrit.wikimedia.org/r/#/c/88187/ [20:02:06] let me review. we can write a package for yuvi's api [20:02:15] ok [20:02:22] we need to start working on labs migration soon [20:02:27] [bz] (NEW - created by: Antoine "hashar" Musso, priority: Highest - major) [Bug 48501] [OPS] beta: get SSL certificates - https://bugzilla.wikimedia.org/show_bug.cgi?id=48501 [20:03:29] hashar: so... there's a number of options on the table regarding beta [20:03:41] and none of them really require much ops work [20:03:55] hashar: how are the beta domains setup? [20:04:13] ..beta.wmflabs.org? [20:04:30] and commons.beta.wmflabs.org, wikidata.beta.wmflabs.org, etc ? [20:04:31] Ryan_Lane: e.g. en.wikipedia.beta.wmflabs.org [20:04:35] yes [20:04:39] Ryan_Lane: For future reference, it's not clear that rsync then rsync after for migrating is measurably faster than just doing a tar over ssh and copying the whole thing. [20:04:45] getting a cert is the easy thing [20:04:58] en.m.wikipedia.beta.wmflabs.org also. [20:05:08] It's too late now for me to restart this, but rsync wastes more time figuring out what to diff than it does actually copying stuff. :-( [20:05:25] yeah, so .beta.wmflabs.org [20:05:42] Coren: you have it actually doing diffs? [20:05:48] you aren't using the timestamp options? [20:06:09] yeah, that's going to be slow as shit :) [20:06:27] Ryan_Lane: we have mobile as well [20:06:30] Ryan_Lane: The problem with timestamps is that most of the big stuff that has to be sent is gigs and gigs of appended-to logs. [20:06:58] hashar: right, so, same url scheme ;) [20:07:05] Ryan_Lane: i should move invisible unicorn to gerrit [20:07:11] Ryan_Lane: and commons.wikimedia.beta.wmflabs.org [20:07:12] YuviPanda: yes, please :) [20:07:18] wonder where it should live [20:07:23] labs/invisible-unicorn? [20:07:29] Ryan_Lane: but yeah the aim is to mimic production as close as possible [20:07:29] wikidata.beta.wmflabs.org :) [20:07:37] Ryan_Lane: and thanks for your Sartoris puppet change :] [20:07:43] oh, fuck me, the url scheme is somewhat different?
[20:07:48] that needs to get fixed [20:08:03] no idea the details there [20:08:08] Ryan_Lane: the URL for bits I think is also different, not sure [20:08:16] yeah, that all needs to get fixed [20:08:36] Coren: whats the eta for labs? [20:08:56] everything should match the url scheme like this: s/.org/.beta.wmflabs.org/ [20:09:07] Betacommand: Uncertain. The copy is taking way longer than I was hoping. Judging by the disk space, it should be aboute 20-30 minutes. [20:09:23] But it has been anything but linear. :-( [20:09:37] At least we get working hardware this time. [20:09:46] Coren: http://xkcd.com/612/ [20:09:47] Ryan_Lane: iirc bits is at https://bits.beta.wmflabs.org [20:09:49] (BTW Ryan_Lane, no controller stalls no matter how hard I pushed it) [20:09:56] Coren: excellent [20:10:01] Coren: so, that controller is fucked [20:10:10] let's let it run for a while and ensure that's the case [20:10:27] Ryan_Lane: requested at https://www.mediawiki.org/wiki/Git/New_repositories/Requests [20:10:38] ^d: new repo? (scroll to end of https://www.mediawiki.org/wiki/Git/New_repositories/Requests) [20:10:44] Either that, or there is a driver regression with the newer kernels (I haven't taken any chances and did not enable thin volumes on labstore4) [20:10:56] But now that labstore3 will be free, we can experiment. [20:11:55] indeed [20:12:07] hashar: so, the systems that have the certs need to be controlled [20:12:17] I'd imagine that right now the ssl terminators are on the squids? [20:12:29] na its varnish doing the terminaison [20:12:37] everything is on varnish now? [20:12:41] yup :-] [20:12:44] excellent [20:12:48] worked with mark to migrate everything [20:12:52] ok, so we need to control access to the varnish boxes [20:12:59] he even tested text varnish using beta :-] [20:13:02] and we caught bugs! [20:13:05] nice [20:13:12] then got mobile to test it out as well iirc [20:13:16] there's a few options here: [20:13:40] 1. We disallow volunteers from having projectadmin in deployment-prep and we limit sudo access on those systems [20:13:52] 2. We require all volunteers in the project with projectadmin to have NDAs [20:14:15] 3. We move the varnish systems out of deployment-prep and into a locked down project [20:14:21] those are solutions, what are you attempting to prevent ? [20:14:31] volunteers getting access to the * beta certs [20:14:40] can't we just generate random certs using the Labs certificate authority ? [20:14:40] then add the CA public cert to the test browsers? [20:14:55] I suggested something similar [20:14:56] do we really have to use a star ? [20:15:02] twice. in the bug. [20:15:26] if you want to go that route, you can generate your own CA [20:15:36] we killed off the * wmflabs cert [20:16:10] hashar: if we don't get a real cert, then end-users will always get warning pages [20:16:26] so manual testing will be harder [20:16:33] hashar Ryan_Lane my concern is the group of "test browsers" for beta labs is far larger than just automated tests or users we know of. [20:16:44] chrismcmahon: eh? what do you mean? [20:16:51] you mean manual testers? 
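Ryan_Lane's target scheme ("s/.org/.beta.wmflabs.org/") is concrete enough to write down; here is a small sketch of the intended production-to-beta hostname mapping, using the hosts mentioned in this discussion (bits is one of the inconsistencies called out above, since per hashar it currently lives at bits.beta.wmflabs.org):

```python
# Sketch of the intended production -> beta cluster hostname mapping:
# replace the trailing ".org" with ".beta.wmflabs.org".
def beta_host(prod_host):
    assert prod_host.endswith('.org')
    return prod_host[:-len('.org')] + '.beta.wmflabs.org'

for host in ('en.wikipedia.org', 'en.m.wikipedia.org',
             'commons.wikimedia.org', 'bits.wikimedia.org'):
    print host, '->', beta_host(host)
# en.wikipedia.org      -> en.wikipedia.beta.wmflabs.org
# en.m.wikipedia.org    -> en.m.wikipedia.beta.wmflabs.org
# commons.wikimedia.org -> commons.wikimedia.beta.wmflabs.org
# bits.wikimedia.org    -> bits.wikimedia.beta.wmflabs.org (today: bits.beta.wmflabs.org)
```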
[20:17:09] Ryan_Lane: what you said about manual testers, or anyone who wants to use beta labs wikis casually [20:17:14] yeah [20:17:33] I'm not opposed to real certs, but if we have them, we need to protect them [20:18:16] so far no one has really made a plan of action, though all of the solutions have been mentioned in the bug [20:18:18] Ryan_Lane: part of the utility of beta labs is as a resource for existing and potential community, so the lower the barrier to entry, the better. [20:18:51] chrismcmahon: I'm not disagreeing. I'm just saying there's steps we need to do if we want to have real certs. [20:19:05] Ryan_Lane: yep [20:19:20] and whoever is going to be doing those steps needs to agree to one and start doing them [20:19:25] yep [20:19:27] we can get the certs in a day or so usually [20:19:57] i see what you mean now [20:20:14] so the idea would be to get real certificates for *.beta.wmflabs.org and *.wikimedia.beta.wmflabs.org etc [20:20:21] Ryan_Lane: at a casual glance, your solution #1 above looks like the least expensive and least disruptive [20:20:22] get them published on the varnish instances [20:20:34] make sure only people under NDA are project admin and know about it [20:20:40] then restrict sudo access on those boxes [20:20:52] sounds plausible to me [20:21:00] hashar: it's more than just *.wikimedia *.wikipedia [20:21:09] don't we also have wikibooks, wikinews, etc. ? [20:21:16] it'll be one large combined cert [20:21:19] just like production [20:21:46] solution #2 seems unworkable to me. solution #3 sounds like a hassle but maybe I'm mistaken [20:21:47] yeah hence 'etc' : [20:21:58] it can be a combo of #1 and #2 [20:22:07] or we could just ask tester to manually download Labs CA cert [20:22:10] #2 is what we require on tools [20:22:13] and self sign [20:22:19] hashar: we don't want people to install our CA [20:22:23] it's dangerous [20:22:32] then any web site they go to can be MITM'd with our CA [20:23:07] isn't our CA bound to wmflabs.org ? [20:23:36] CAs don't work that way [20:24:04] it's a hierarchical trust system [20:24:12] if you trust a CA, you trust everything it signs [20:25:45] * aude prefers combo of option #1 and #2 [20:29:24] Ryan_Lane: if you get NDA for project admins in tools, I guess beta should do the same [20:29:38] Ryan_Lane: much easier to handle since both projects will have the same policy [20:30:21] Ryan_Lane: looking at the list of projectadmin for beta, seems it is mostly wmf folks [20:30:28] aude: aren't you under NDA with wmf ? [20:30:36] hashar: we are not but should be [20:30:54] hashar: yeah, and there's actually relatively few volunteers in deployment-prep [20:31:03] it's important to not shut out volunteers from helping, at same time security is important [20:31:13] well, it's more than security [20:31:19] having some procedure for that seems reasonable and not a burden [20:31:21] Ryan_Lane: yeah [20:31:27] if we apply that policy, we can change the privacy policy of beta [20:31:32] to be the same as the sites [20:31:41] Ryan_Lane: would you mind taking the lead on that ? 
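As a side note to the CA discussion above: for scripted checks (as opposed to the browsers Ryan_Lane is worried about), a Labs-only CA would not have to be installed into any system or browser trust store; it can be trusted per request. A sketch with the requests library, where the bundle path and hostname are illustrative only:

```python
# Sketch: a scripted check can trust a project-specific CA bundle for just
# this request instead of installing the CA system-wide. Paths/hostnames are
# illustrative only.
import requests

LABS_CA_BUNDLE = '/path/to/labs-ca.pem'   # assumed location of the CA cert

resp = requests.get('https://en.wikipedia.beta.wmflabs.org/wiki/Main_Page',
                    verify=LABS_CA_BUNDLE)  # only this call trusts that CA
resp.raise_for_status()
print resp.status_code
```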
It is well over my tech knowledge [20:31:42] like tools [20:31:56] hashar: this isn't a tech thing [20:32:03] Ryan_Lane: and I barely know CA / SSL stuff (like I thought sub CA could be restricted to a domain hehe) [20:32:14] hashar: we're going to get real certs [20:32:21] and robh will generate them [20:32:49] past that, you need to go through the projectadmin list and remove any volunteers [20:32:58] + clean out sudo [20:33:00] then you need to change the sudo policy to only allow specific people [20:33:02] yes [20:33:21] but then, until a few weeks ago I thought that granting projectadmin was giving sudo rights as well :D [20:34:54] heh [20:36:10] I guess we can still give sudo access to selected servers [20:36:18] seems NovaSudoer let us do that [20:38:03] Gaaaaah! So many minuscule files!!!! [20:38:17] it would be rare or never that i need it [20:38:28] but you never know if/when a need arises [20:40:51] Ryan_Lane: could you post a summary on https://bugzilla.wikimedia.org/show_bug.cgi?id=48501 please ? [20:41:22] aude: and you might want get in touch with legals @ wmf to get WMDE folks under NDA as welll [20:41:26] aude: that would be hlepful [20:42:07] Ryan_Lane: and I barely know CA / SSL stuff (like I thought sub CA could be restricted to a domain hehe) [20:42:12] yeah that's what I thought as well [20:42:16] why isn't that possible? [20:42:24] ... why does deployment-prep have thousands and thousants of thumbnails? [20:42:34] Coren: delete them all [20:42:40] Coren: it is not important [20:42:41] hashar: i'll be at the office in a week 1/2 [20:42:54] Coren: thumbs might never get purged on deployment-prep [20:42:55] can ask to push these things along [20:43:12] hashar: I can't, the filesystem is readonly now. [20:43:23] aude: you might want to email beforehand :-) this way when you show up you just have to sign :-D [20:43:27] Krenair: because that's not how X509 is designed :) [20:43:30] Coren: arch sorry :-( [20:43:43] I think I'm going to abort deployment-prep and leave them for later. [20:44:15] when you trust a CA, you don't trust it for a set of subdomains, but you trust any certificate it signs [20:44:22] hashar: sure [20:44:24] Coren: if that can help do :) [20:44:39] Coren: and whenever it is writable again, feel free to purge all thumbnails. [20:44:56] Coren: might want to fill a bug if they are not purged. Not sure we do in prod either [20:45:11] hashar: I'd rather someone who knows deployment-prep to do it though. [20:45:13] Krenair: this is why people complain very loudly about the chinese CAs that were added to most browsers recently [20:45:26] What chinese CAs? [20:45:35] the chinese CAs in question are corporations that are mostly PLA operated [20:45:44] wtf [20:45:51] CNNIC [20:46:04] https://bugzilla.mozilla.org/show_bug.cgi?id=542689 [20:47:03] recently, January 2010? :P [20:47:10] that gives them the ability to do legitimate MITM attacks [20:47:19] that is crazy [20:47:21] :-] [20:47:40] I'm always surprised when people don't know how SSL works :) [20:47:44] I guess I shouldn't be [20:48:09] the IP address could have a DNS entry listing the SSL fingerprint it serves :-] [20:48:26] it couldn't, actually [20:49:08] well, I guess it's more possible now: http://datatracker.ietf.org/wg/dane/ [20:49:12] but DANE is kind of a bad idea [20:49:22] Why isn't the SSL certificate system designed in such a way that certificates can be signed to only validate signing specific domains? 
[20:49:46] and dane still requires a hierarchal trust [20:50:00] Krenair: where would you buy certs from? [20:50:29] or how would browsers know what to trust? [20:50:45] companies which have certificates trusted to issue certificates for * [20:50:55] which brings you back to the same problem [20:51:02] not really [20:51:05] we'd still end up trusting some people [20:52:10] but you could get a certificate that can only sign for validation of specific domains [20:52:40] I guess that would be helpful for some things. for instance, it would be nice for the US government CAs to be included in browsers, but only valid for US government domains [20:52:44] * Coren skips deployment-prep. [20:52:52] exactly [20:53:31] they could be restricted to *.gov (or the opposite - I'd really love to see .gov, .edu and .mil go away or have their US-only restriction removed) [20:54:59] but yeah, not how the system works, so MITM's for everyone! [20:55:04] :) [20:55:38] Coren: got to head to bed [20:55:49] Coren: bon courage pour la suite! [20:55:54] hashar: Anyone on your side that can help with deployment-prep? [20:56:05] hashar: I'd rather not leave it down until tomorrow. [20:56:23] Coren: not sure what is left to do ? [20:56:31] Coren: all instances can be rebooted safely [20:56:38] Coren: though the SQL ones might be an issue [20:56:55] hashar: Allright then, I'll just reboot them once your copy is over and hope for the best. [20:57:18] Coren: I can check that things are at least mostly valid after reboot [20:57:22] Coren: yeah that should be fine. You might want to shutdown mysql properly on -sql and -sql02 [20:57:36] Coren: and do delete thumbnails if it can help [20:57:49] hashar: Surely, you didn't but their databases on NFS? [20:57:52] put* [20:57:56] I have idea [20:57:59] chrismcmahon: Excellent. Thanks. [20:58:02] I think it is on /dev/vdb [21:01:00] Coren: and I have a bunch of automated builds against deployment-prep kicking off in the middle of the night PDT. It's be nice to have deployment-prep up for those, and we can check a whole lot of stuff about beta after those run. [21:02:05] chrismcmahon: Once the NFS server is back to full operational status, my first order of business is to get you guys synced back up. I dunno how long this will take, but it's certainly today. [21:05:51] WHOOOOOOOOOOO [21:06:30] My patch uploader now speaks OAuth for authentication, and uses that to set the committers' identity -- https://gerrit.wikimedia.org/r/#/c/89110/ [21:06:40] Coren: bed for me. [21:06:45] Now if only I could deploy ;-) (no worries, I'll do it tomorrow) [21:06:46] will look at beta tomorrow [21:06:51] hashar: Good sleeps. [21:07:05] Coren: eventually leave some notes somewhere for european ops to be able to help if needed [21:07:14] but I am sure it will be fine :-] [21:07:24] valhallasw: excellent work :) [21:07:38] valhallasw: congratulations! [21:09:23] OK, so to do now: deploy, and document how I got this auth thing to work. [21:09:30] Probably in reverse order *grin* [21:18:57] Why the [bleep] [bleep]ing [bleep] is there a an install of MXE on tools, with the full source of Qt?! [21:19:30] Oh! And windows exe! [21:20:10] petan: ^^ [21:20:18] Srsly, dude? [21:20:49] petan: What in blazes do you need a Windows cross-compiler for? [21:37:58] is the maintenance still running? [21:38:12] i still can't ssh to tools dev [21:42:11] Coren, I'm guessing it's for Huggle? [21:42:39] Danny_B: Yeah; I didn't count on ~2.1 million files having been touched in the past 24h. 
[21:42:45] http://www.gossamer-threads.com/lists/wiki/wikitech/389495 [21:42:54] rsync doesn't like git pull. :-/ [21:43:42] Coren: any eta? no pushing, i just want to know if it's worth to stay awake or go sleep ;-) [21:44:08] Danny_B: It might be another hour still. [21:44:33] You're welcome to push hard on rsync, but I doubt it's going to work any faster for it. :-P [21:47:25] valhallasw, what's the use case for your oauth commit identity thing? [21:47:32] as i said no rush, just informational question [21:49:28] Krenair: people can submit patches to gerrit via his web tool [21:51:03] ah, ok [21:52:44] Krenair: it's basically two things: a) an oauth based identity tool, as openid is not functional at the moment, and b) a tool to upload patches [21:53:08] and I hope b) will lower the barrier-of-entry of gerrit some more [21:53:26] How will a web tool help? [21:54:01] Krenair: you don't need to use git-review, or git at all - you can just paste a patch file [21:54:32] I'd link you to it, but tool labs is down ;-) [21:54:45] so instead I'll link you to it tomorrow :-) [21:56:12] Danny_B: No, seriously. Anything helps. :-) [22:08:39] Hah. And I was worried that the switchover might be too quick and I'd have trouble with the ARP cache. [22:08:58] hahaha [22:09:25] * Ryan_Lane uses labs with his gluster filesystem [22:09:40] ;) [22:09:52] * Coren would like to see how quickly you'd be able to copy it all. :-) [22:09:56] is it back up? [22:10:02] Coren: that would take ages [22:10:03] ah [22:10:08] i guess no [22:10:30] So many small files! [22:35:20] Hey guys, is tools.wmflabs.org down? [22:36:32] TParis, ping Coren [22:40:21] TParis: Maintenance. Copying the filesystem to new hardware. [22:40:47] The rsync I did in advance should have kept that short. Didn't count on over 2M files being touched since yesterday. :-( [22:41:34] Coren Coren Coren - I'm disappointed. If you can't predict the unpredictable, well, I just don't know... [22:41:36] ;) [22:41:45] Okay, well I'll go mow the law while you work, thanks ;) [22:43:31] andrewbogott: you can actually test your patch ;) [22:43:37] andrewbogott: openstack project uses gluster [22:43:54] so does the bastion [22:45:06] Ryan_Lane: but proxy-dammit is on NFS [22:45:23] it tests against the proxy in use? [22:45:32] IIRC yeah :P [22:45:44] and shouldn't it still work, even without homedirs? [22:45:50] are you running it as your own user? [22:45:54] or as a system user? [22:45:56] Ryan_Lane: the proxy itself works [22:46:03] Ryan_Lane: but the api is just running in screen as me [22:46:04] so [22:46:11] stabby stabby stab [22:46:12] :) [22:46:13] Ryan_Lane: metrics.wmflabs.org works, for example. [22:46:29] Ryan_Lane: I am still not sure how to deploy it :P needs uwsgi [22:46:33] git deploy? [22:46:36] you know, we could run the api as a wsgi behind nginx or apache [22:46:45] indeed, that's how we should do it [22:46:47] no apache please [22:46:48] git-deploy inside of labs is.... difficult right now [22:46:53] I see [22:47:03] I could deploy via Puppet... :P [22:47:07] you could, yes [22:47:13] specify a hash [22:47:19] I'm not opposed to that for now [22:47:23] ideally we'd make a package [22:47:36] I don't know. This is a webapp, shouldn't be deployed as a package. [22:47:43] should be deployed as how we deploy other cod [22:47:44] e [22:47:45] why not? [22:47:54] that's how the other openstack services are deployed [22:48:17] why? 
[22:48:23] an API should be relatively stable [22:48:38] because it's easier to handle dependencies and such via packages [22:48:49] and they do proper code releases [22:49:17] and the apis can run standalone, or via wsgi behind nginx/apache [22:50:19] Ryan_Lane: packaging web apps always felt 'wrong' to me [22:50:29] Ryan_Lane: perhaps because I'm yet to see it done right [22:50:37] mw, wp, drupal... [22:50:41] it's a web service, not necessarily a web app [22:50:52] hmm, it's not even a 'web' service, just uses http [22:51:04] it just happens to use http to make an API available [22:51:12] right [22:51:25] so I'd be hard pressed to call it a web app :) [22:51:28] I've nothing against it, but it's going to be a while before I can figure out how to make it a package [22:51:43] I'm pretty sure we're going to do it for you [22:51:57] that'll be nice :D [22:52:05] I put in a gerrit request [22:52:09] a while ago [22:52:11] we need to finish this work up so that we can move onto migration [22:52:25] yeah [22:52:37] YuviPanda: Ah, we/you're blocked by waiting for gerrit things to happen? [22:52:38] this is going to be really helpful for migration, so I want to finish it first :) [22:52:50] the less floating IPs we have in use the easier things are [22:52:59] andrewbogott: one of the things, yeah :) [22:53:07] andrewbogott: to move invisible-unicorn into gerrit [22:53:12] andrewbogott: Ryan_Lane i requested labs/invisible-unicorn [22:53:16] rather than labs/tools [22:53:18] seemed ap [22:53:19] t [22:53:35] * Ryan_Lane nods [22:54:35] Ryan_Lane: i bet system packaged versions of flask *and* py redis are too old for it. [22:54:42] YuviPanda: OK, I can create that right now, except I don't know what to put in 'rights inherit from...' [22:54:55] hmm, i've no idea what that means either :| [22:55:01] andrewbogott: can you also do the import? [22:55:50] YuviPanda: I could, but it should be easy enough for you to do in just a minute... [22:55:54] btw, where is the proxy code hosted? [22:55:59] hmm, never worked for me :) [22:56:06] andrewbogott: github.com/yuvipanda/invisible-unicorn [22:56:13] andrewbogott: the *proxy* code itself is in operations/puppet [22:56:16] dynamicproxy module [22:56:29] ah, ok. [22:57:04] YuviPanda: can you now git clone https://gerrit.wikimedia.org/r/labs/invisible-unicorn, add files, and git review? [22:58:23] andrewbogott: rights inherit from a parent repo [22:58:30] which I'd imagine should be labs [22:58:36] if it's in labs/ [22:59:16] Hm, now that it's created I wonder if it's too late to change that [22:59:39] cloning [23:00:51] pushing [23:01:17] remote: You are not allowed to perform this operation. [23:01:20] remote: To push into this reference you need 'Push' rights. [23:01:23] remote: User: yuvipanda [23:01:26] andrewbogott: ^ [23:01:49] well, you should review rather than push... [23:01:56] I don't think anyone ever pushes directly to gerrit do they? [23:01:58] that's a lot of commits to review [23:01:58] sigh [23:01:59] ok [23:02:00] let me do that [23:02:00] Or does 'push' do the same as review? [23:02:05] Can't it just be one big commit? [23:02:10] andrewbogott: no!!!!! [23:02:13] that's terrible :P [23:02:23] andrewbogott: you need 'push' if you want to import repo with history [23:02:30] Hm, the project configuration GUI is surprisingly sparse [23:02:31] andrewbogott: so usually ^d sets it up so you push once and then you can't [23:02:35] ah, I see. [23:02:42] andrewbogott: heh, Gerrit's UI sucks. who would'v ethought :P [23:02:45] Hm. 
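On Ryan_Lane's point that the proxy API could "run as a wsgi behind nginx or apache" instead of in a screen session: all that takes on the application side is a WSGI entry point. A sketch only; the module and attribute names below are guesses about invisible-unicorn's layout, not its actual API:

```python
# wsgi.py -- minimal sketch of a WSGI entry point for the Flask-based proxy
# API, so it can run under uwsgi/nginx rather than in a screen session.
# The import below is a guess at invisible-unicorn's layout.
from invisible_unicorn import app as application  # hypothetical module/attr

if __name__ == '__main__':
    # Convenience for local testing only; uwsgi imports `application` itself.
    application.run(host='127.0.0.1', port=5000, debug=False)
```

Something like `uwsgi --socket 127.0.0.1:8081 --module wsgi` behind an nginx `uwsgi_pass` location would then serve it; whether that gets deployed from a Debian package or straight from puppet is exactly the question left open above.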
[23:02:48] <^d> YuviPanda: Basically you make an empty repo. [23:03:00] ^d, that part's done… [23:03:12] https://gerrit.wikimedia.org/r/#/admin/projects/labs/invisible-unicorn [23:03:17] <^d> The behavior is the same if you push to a branch that doesn't exist yet, it treats it as a branch creation. [23:03:20] <^d> So just push to master. [23:03:33] 04:31 YuviPanda: remote: You are not allowed to perform this operation. [23:03:39] 04:31 YuviPanda: remote: To push into this reference you need 'Push' rights. [23:03:42] 04:31 YuviPanda: remote: User: yuvipanda [23:03:49] Maybe by creating the initial empty commit I ruined something? [23:03:51] ^d: problem might be that there's an 'empty initial commit'? [23:03:52] yeah [23:04:01] <^d> Yeah. [23:04:07] right [23:04:08] dang [23:04:10] so review and manual +2 [23:04:16] <^d> So either you change permissions on the repo to allow direct pushing to the branch in question. [23:04:22] <^d> Or rebase on top of the empty commit. [23:04:32] How do I change the permissions? [23:04:39] ^d: i rebased on top of the empty commit [23:05:05] <^d> andrewbogott: When you have a repo open like the url you gave me earlier...click the "Access" tab. [23:05:20] Oh, those are tabs! [23:05:25] Now I see where the rest of the gui is [23:05:38] hehe [23:07:10] andrewbogott: found it? [23:07:13] Um, ok, it wants a group, do I need to create a group and add yuvi to it? [23:07:23] I should already be in a bunch of groups [23:07:49] Right, but I want to add just you, not everyone in labs [23:08:49] and 'Group Name' is just a freeform text field, no idea what to add [23:09:54] andrewbogott: hmm, I could just push the commits and you can merge them :D [23:09:56] it's not too many [23:10:05] inherits from YuviDirectPushOverrideGroup [23:10:15] just 16 [23:10:40] YuviPanda… assuming I can merge, let's do that. [23:10:44] But I also have to go in about 1 minute [23:10:57] AzaToth's quit message needs "Repeat ad nauseam." at the end. :-) [23:11:17] Just send me an email with the last patch in your patchset and I'll merge later on... [23:11:21] andrewbogott: whelp, that's not going to happen anytime soon :| [23:11:22] how many files left? [23:11:28] * Coren makes a note to tell whoever maintains checkwiki that they don't *need* to keep dumps of every wiki in their account. [23:11:32] andrewbogott: because I need to add a commit message for each of those :| [23:11:39] yikes [23:11:50] Well… maybe you can get ^d to give you permissions... [23:12:00] * YuviPanda pokes ^d [23:12:11] Danny_B: It looks about 90% done, size wise.