[01:47:42] PAWS: Do not require users to type passwords into PAWS - https://phabricator.wikimedia.org/T120331#1856152 (yuvipanda) \o/ https://test.wikipedia.org/w/index.php?title=Test&type=revision&diff=253779&oldid=253311 :D :D :D :D
[01:53:10] PAWS: Do not require users to type passwords into PAWS - https://phabricator.wikimedia.org/T120331#1856153 (yuvipanda) Open>Resolved This is deployed now, however a bunch of upstream patches/work still needs doing. See https://github.com/jupyter/jupyterhub/pull/346 and linked PRs/Issues.
[01:57:49] PAWS: PAWS 404 for users with special characters in their names - https://phabricator.wikimedia.org/T120066#1856155 (yuvipanda) This is probably https://github.com/jupyter/jupyterhub/issues/345
[03:57:39] PAWS: PAWS with bot accounts - https://phabricator.wikimedia.org/T120558#1856277 (jayvdb) NEW
[04:09:10] Is it okay that I use Tool Labs to temporarily mirror Tor Browser downloads, for a couple of editors in China?
[04:10:19] By the way, Tor is working in China with the meek pluggable transport (and obviously you will need IPBE to edit)
[04:33:42] PAWS: PAWS with bot accounts - https://phabricator.wikimedia.org/T120558#1856329 (yuvipanda) I think the simplest would be to login directly as the bot account. This means we can enforce a 1:1 mapping between paws user and the wiki user, which is useful for anti-abuse and other features.
[04:34:02] Zhaofeng_Li: hey! yes, do so.
[04:34:41] YuviPanda: Thanks!
[04:35:12] Zhaofeng_Li: keep me informed if you take it down / put it back on, though :)
[04:35:36] YuviPanda: Understood!
[04:37:35] PAWS, Documentation, Pywikibot-documentation: Expose the fact that PWB is available more - https://phabricator.wikimedia.org/T120072#1856330 (yuvipanda) I wanna reword it to be more comprehensive and then close this.
[04:39:31] PAWS: Make the default PS1 more helpful - https://phabricator.wikimedia.org/T120560#1856331 (yuvipanda) NEW
[04:40:01] PAWS: Make the default PS1 more helpful - https://phabricator.wikimedia.org/T120560#1856338 (yuvipanda) and current working dir, probably :)
[04:41:46] bd808: around?
[04:42:06] just got here :)
[04:42:31] bd808: :D I have tools.wmflabs.org/paws working with OAuth integrated :D wanted to tell/show
[04:43:20] still doesn't work with (WMF) accounts though
[04:43:23] or anything with urlencoding
[04:47:18] PAWS: Implement a 'signing OAuth Proxy' for PAWS - https://phabricator.wikimedia.org/T120469#1856341 (yuvipanda) p:Triage>High
[04:47:31] * bd808 logs in to check it out
[04:49:35] the shell feels pretty responsive
[04:50:02] bd808: yeah, I'm impressed. emacs and vim (and nano) both work pretty well
[04:51:48] would it be crazy pants to spin up MediaWiki in an environment like that?
[04:52:02] it's just a docker container right?
[04:52:36] bd808: yeah, so that's a possibility I want to investigate at some point.
[04:52:53] bd808: the problem is how to persist the database, mostly
[04:53:01] bd808: if this were on AWS/GCE I'd just use EBS/similar.
[04:53:03] slow but ok
[04:53:21] bd808: for PAWS I'm using NFS for $HOME, which sucks but is abstracted enough I can replace with cinder/ceph at some point in the far future.
[04:53:33] hmmm
[04:53:33] bd808: but I don't want to put a DB on NFS, and esp. on *our* NFS
[04:53:58] bd808: but yeah, totally. it should be something we can do in the future
[04:54:08] bd808: ori is trying to do something similar to help him with mw multidc work
[04:54:51] related to hashar's hope for isolated jenkins environments too I guess
[04:55:31] bd808: but I think we can't use our current vagrant stuff directly. to do this 'properly' each thing has to be its own container (so, hhvm, mysql, redis/memcached) grouped into one pod...
[04:55:40] and if we want mysql to persist it's gotta be its own RC...
[04:55:56] bd808: we can get around this by just providing mysql externally :D Just buy a few boxes and put mysql in it
[04:55:59] or just use tools-db
[04:56:14] then it's a problem of just how do you get 'vagrant role' equivalent functionality
[04:57:10] agreed that using puppet inside a docker container is not awesome. I had mw-vagrant doing it for a while and killed that for the LXC stuff
[04:57:51] yeah
[04:58:11] but we need to factor it out, and factor out the 'state' (list of roles, actual code, localsettings.php, etc.) somehow
[04:58:31] vagrant + docker is really sketchy because of the way that port forwards work in that combo
[04:58:39] yep
[04:58:54] vagrant + k8s is even more sketchy IMO since it kills containers if they look at it the wrong way :)
[04:59:09] docker as a provider does too
[04:59:43] right
[05:00:07] bd808: definitely a more complex problem than pwb, by dint of being a more complex application :D
[05:01:05] if you had external redis & mysql it might not actually be too hard...
[05:01:16] but a problem for another time
[05:01:19] indeed
[05:01:32] * bd808 pokes at elasticsearch some more
[05:01:38] bd808: I think in my etherpad of 'things people use toollabs for', if it was expanded to 'labs for' it'll probably be more clear
[05:02:15] yeah. I think tools are the right thing to focus on first though. More bang for the buck
[05:02:40] yeah
[05:02:49] bd808: that's what I tell jdlrobson every time he becomes sad.
[05:03:07] running a wiki farm is more for testing than for helping the on-wiki communities
[05:03:20] yeah
[05:04:05] I've found some regression bugs in the lxc vagrant stuff that I should work on tomorrow
[05:04:12] \o/
[05:04:18] bd808: I'm going to try to fix the puppet stuff today
[05:04:21] things that labs-vagrant did well are broken now :/
[05:04:47] :(
[05:10:54] I made the nginx proxy a lot less complicated
[05:11:12] cooool :D
[05:11:24] I tried to use it to do some stuff and found out that it was ... not quite right
[05:12:16] if we want more than honor system protection it will be harder though
[05:12:39] hmm
[05:12:41] I found a couple of mostly abandoned projects that tried to make fancy auth proxies
[05:12:43] was that the basic-auth stuff?
[05:13:17] yeah. so what I have now is GET wide open, POST/PUT/DELETE behind a simple password
[05:13:37] and nothing that will "namespace" the users
[05:13:44] so you are either in or not
[05:13:46] right
[05:13:52] I guess that's an ok start
[05:14:01] and we'll have to figure out the namespacing proxy ourselves later on
[05:14:15] and see how we can do it with the least amount of code possible, in a nice upstream friendly way.
[05:14:19] might end up being a fun project :D
[05:14:27] the other thing I've been thinking about is how cumbersome tuning ES can be
[05:14:43] in what sense
[05:15:39] it's easy to swamp the cluster with too many shards or shards that are too big or indexes that aren't optimized for the queries that are being run
[05:16:03] that all makes a shared instance with lots of tenants a bit scary
[05:16:10] hmm
[05:16:10] PAWS: PAWS network error: - https://phabricator.wikimedia.org/T120561#1856354 (zhuyifei1999)
[05:16:42] when we were talking about a log storage cluster that was a single application really
[05:16:45] PAWS: PAWS network error: - https://phabricator.wikimedia.org/T120561#1856355 (yuvipanda) p:Triage>High Yeah, I think this is underlying network issues. Investigating...
[05:16:53] PAWS: PAWS network error: - https://phabricator.wikimedia.org/T120561#1856357 (yuvipanda) a:yuvipanda
[05:16:55] but this server is potentially quite different
[05:17:00] bd808: yeah
[05:17:41] bd808: but I had similar fears about redis, but it ends up being ok. Large scale users stop using it once they disrupt it enough times :)
[05:17:48] and move to their own project or something else
[05:19:08] brb food
[05:58:05] MediaWiki-extensions-OpenStackManager: Kill action=novaprojects - https://phabricator.wikimedia.org/T102404#1856375 (yuvipanda) Open>declined a:yuvipanda No point spending any effort on this, let's just spend it on horizon instead...
[07:34:39] PROBLEM - Host tools-worker-04 is DOWN: CRITICAL - Host Unreachable (10.68.18.165)
[07:35:12] ^ is me
[07:35:14] it's ok
[07:35:16] there'll be two more
[07:36:41] PROBLEM - Host tools-worker-05 is DOWN: CRITICAL - Host Unreachable (10.68.18.146)
[07:44:17] bd808: ok, so the puppet issues are all gone now, including the sad commit :)
[09:14:19] grrrit-wm: hello!
[09:14:22] welcome back my friend
[09:35:41] PAWS: PAWS network error: - https://phabricator.wikimedia.org/T120561#1856408 (yuvipanda) So... Due to massive cert upheaval (due to T120159), flannel and kube-proxy had stopped working. I spent the last few hours totally destroying and rebuilding our cluster, and everything's sparkly and shiny now! I teste...
[09:53:17] YuviPanda: Forgot to say that the mirror is at https://tools.wmflabs.org/zhaofeng-test/torbrowser/ . Feel free to take it down if anything goes wrong :)
[09:56:05] Zhaofeng_Li: alright! :)
[10:23:26] PAWS: Add a way to expose static files to the internet from PAWS - https://phabricator.wikimedia.org/T119859#1856442 (yuvipanda) Must be very careful to ensure that there's no cookie leaking / XSS stuff.
[10:25:02] YuviPanda: ping ?
[10:26:37] matanya: about to head to bed...
[10:26:40] matanya: pong?
[10:27:51] YuviPanda: need help with restarting webserver
[10:28:24] ?
[10:28:25] https://tools.wmflabs.org/derivative
[10:28:48] hmmm
[10:28:51] that should never happen
[10:28:53] let me look
[10:29:52] !log tools did webservice start on tool 'derivative', was missing service.manifest
[10:29:55] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL, Master
[10:30:04] matanya: it should autorestart from now
[10:30:28] thanks!
[10:30:38] yw!
[10:30:45] i'm off now
[20:43:53] PAWS: PAWS network error: - https://phabricator.wikimedia.org/T120561#1856770 (yuvipanda) Hmm this just happened again. Investigating.
[20:54:10] PAWS: PAWS network error: - https://phabricator.wikimedia.org/T120561#1856778 (yuvipanda) Ok, so this failed when it was scheduled on tools-worker-05 but *not* on tools-worker-01. Network requests can come in but do not go out. This makes me suspect this is some form of SNAT issue.
[21:00:43] !log rcm deleted rcm-3 (Not needed)
[21:00:47] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Rcm/SAL, Master
[21:01:18] !log rcm Enable rcm-5, try to replicate phabricator update issue with puppet
[21:01:21] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Rcm/SAL, Master
[21:02:44] PROBLEM - Host tools-master is DOWN: CRITICAL - Host Unreachable (10.68.16.9)
[21:05:56] PROBLEM - Host tools-shadow is DOWN: CRITICAL - Host Unreachable (10.68.16.10)
[22:15:57] * gwicke wonders where the labs ui elements in the wikitech sidebar went
[22:28:09] gwicke: what's missing? I see lots of sidebar links (Labs, Labs Cloudadmins, Labs Users, Labs Projectadmins)
[22:28:50] are you maybe logged out?
[22:28:55] in the "Labs Users" sidebar, I only see "Manage Service Groups"
[22:29:17] I still managed to get to the instance administration via Special:Specialpages
[22:29:37] that link is in Labs Projectadmins
[22:29:42] "manage instances"
[22:29:58] yeah, I'm evidently not seeing that sidebar
[22:30:45] I already tried logging out & back in, as that solved labs problems before
[22:31:02] :) it does fix some things
[22:31:18] PAWS: 'Stop my Server' in PAWS should wait until server is actually stopped - https://phabricator.wikimedia.org/T120351#1856888 (yuvipanda) Open>Resolved a:yuvipanda This is done now.
[22:31:19] yeah, like no instances listing
[22:32:36] how long does new instance creation normally take these days?
[22:33:16] I scheduled a small instance 15 minutes ago, but it's still in the 'scheduling' stage
[22:33:31] hmmm... that doesn't sound normal
[22:34:48] normally my rule of thumb was that it shows up as ACTIVE quite quickly, but SSH might not work for the first 15 minutes or so
[22:35:04] yeah that sounds about right
[22:35:38] oh well, I guess I'll circle back to this tomorrow
[22:35:55] I have lots of rights to poke around on the wiki but don't have access to look at the backing servers where that gets handled
[22:36:06] just created a docker image that runs restbase & parsoid in a single node process
[22:36:16] now looking into linking that up with mw & mysql
[22:36:22] using docker compose
[22:36:24] bd808: gwicke I ran into this just now too, some nodes get stuck in scheduling forever
[22:36:31] err
[22:36:33] instances
[22:36:38] created another one and it went to active quickly enough
[22:36:46] gwicke: I suggest leaving your stuck node as is and creating another one
[22:37:06] okay, let me try
[22:48:21] that's also still scheduling after 10 minutes
[22:51:45] instance names are docker-testing and docker-testing01
[22:52:05] both in project "services"
[22:52:40] gwicke: can you file a bug?
[22:52:58] I can look into it if it's super urgent, but would prefer waiting for tomorrow
[22:53:08] sure; it's not super urgent
[22:55:00] project "labs" ?
[22:55:05] gwicke: yeah
[22:55:30] Labs: Instance creation fails - https://phabricator.wikimedia.org/T120586#1856910 (GWicke) NEW
[23:11:36] PAWS, Patch-For-Review: Terminal gets 'stuck' after a few minutes without activity - https://phabricator.wikimedia.org/T120335#1856927 (yuvipanda) Open>Resolved a:yuvipanda Ok, @legoktm just confirmed that this works. The current timeout is at 1h, which I think is reasonable.
[23:14:27] Labs, Labs-Infrastructure, MediaWiki-Vagrant: InstantCommons stopped working on Labs-Vagrant, now lots of missing images. - https://phabricator.wikimedia.org/T118226#1856932 (bd808) > even though I made no changes to the configuration on developer-doc-devhub.developer-doc.eqiad.wmflabs and haven't upd...
[23:18:42] PAWS: PAWS network error: - https://phabricator.wikimedia.org/T120561#1856347 (yuvipanda) Ok, I think I've isolated it - I had forgotten that I had upgraded the kernels on the old nodes (kube-proxy needs at least 3.18 I think, and default is 3.16). The older nodes are on 3.19, but that doesn't seem to work a...