[00:16:43] please restart it, hopefully it will help
[00:18:05] if there are no processes in D state I doubt it's an NFS issue, but I'll double check the metric in grafana to see if there were D state processes earlier on that node
[00:30:15] there are a few dots in the graph of D-state processes for that worker, I don't think they are enough to explain the issue with the tool
[00:31:08] https://grafana.wmcloud.org/goto/I6tJjDGNk?orgId=1
[00:34:40] the pod is indeed in CrashLoopBackOff, and has already restarted 44 times
[00:35:43] kubectl describe pod shows "Back-off restarting failed container webservice in pod wikiloves-6849f4ccb4-9w6b6_tool-wikiloves"
[00:36:19] I will try the stop+start myself while I'm here
[00:37:18] the pod is now rescheduled on tools-k8s-worker-nfs-74 and it seems more healthy
[00:38:30] I'm going to bed for now, but we'll have to debug this more tomorrow and/or next week
[08:18:51] !log jeanfred@tools-sgebastion-10 tools.wikiloves Run webservice stop ; webservice --backend=kubernetes python3.9 start for T379452
[08:18:55] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.wikiloves/SAL
[08:23:53] Thanks dhinus for intervening! Tool is still not accessible :/ I have now backfilled information at https://phabricator.wikimedia.org/T379452
[12:26:12] @JeanFred is it okay if I look a bit closer (when I have some time)? will probably involve a few more restarts
[12:26:32] (I would try to summarize the results but I think logging everything to the SAL in detail would get a bit noisy)
[12:29:01] oh, I just saw this in `kubectl describe pods`
[12:29:02] Last State: Terminated
[12:29:05] Reason: OOMKilled
[12:29:32] Sure, have fun :) thanks! (re @lucaswerkmeister: @JeanFred is it okay if I look a bit closer (when I have some time)? will probably involve a few more restarts)
[12:29:38] Ooh (re @lucaswerkmeister: Reason: OOMKilled)
[12:29:45] is the tool known to need a lot of memory?
(it currently wants 256-512MiB, which I think is the default)
[12:30:28] Not that I recall, no. It's just reading an (increasingly big) JSON file and serving stats out of it
[12:30:45] But perhaps the increasingly big tipped it over the default
[12:31:34] ok, but the code looks like it loads it at startup, not just when a request comes in
[12:31:43] so that could indeed explain it crashing
[12:33:10] !log lucaswerkmeister@tools-bastion-13 tools.wikiloves added mem: 1Gi to service.template (T379452); webservice stop && webservice start
[12:33:18] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.wikiloves/SAL
[12:33:54] hm, https://wikiloves.toolforge.org/ just gives me 403 Forbidden ^^
[12:34:07] but at least it doesn’t seem to be crashing anymore AFAICT?
[12:34:18] (idk if the 403 is expected or not. maybe the tool is only for WL* admins, idk)
[12:36:25] Yes that's correct (re @lucaswerkmeister: ok, but the code looks like it loads it at startup, not just when a request comes in)
[12:36:49] (added a comment to the task too btw)
[12:36:52] No it is public (re @lucaswerkmeister: (idk if the 403 is expected or not. maybe the tool is only for WL* admins, idk))
[12:36:57] hm, okay…
[12:37:08] nothing in uwsgi.log
[12:37:39] Can you double check whether it started a py container? Yesterday it started a php container when I used a bare 'webservice start'
[12:37:41] o_O wtf
[12:37:46] “Your webservice of type php7.4 is running on backend kubernetes”
[12:37:52] why is it not using the service.template file??
[12:38:06] 🤷‍♀
[12:38:16] ah.
I believe that should be `type: python3.9`, not `web: python3.9`
[12:38:21] at least when comparing to one of my tools
[12:38:43] !log lucaswerkmeister@tools-bastion-13 tools.wikiloves sed -i s/^web:/type:/ service.template # T379452
[12:38:46] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.wikiloves/SAL
[12:39:00] !log lucaswerkmeister@tools-bastion-13 tools.wikiloves webservice stop && webservice start # T379452
[12:39:02] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.wikiloves/SAL
[12:39:15] yaaaaaay https://wikiloves.toolforge.org/
[12:39:30] Aaaand it's back
[12:39:44] Thanks @lucaswerkmeister <3
[12:40:00] So good old OOM in the end.
[12:40:09] looks like it yeah
[12:40:30] A bit unfortunate this is not surfaced more clearly somehow
[12:40:34] yeah :/
[12:40:46] this should not have needed two toolforge admins to figure out
[12:42:56] Well tbf perhaps I should get used to digging a bit in kubectl etc
[12:46:23] oh, I missed this comment actually, you had the right idea there at the same time I noticed it too :D (re @JeanFred: Can you double check whether it started a py container? Yesterday it started a php container when I used a bare 'web service sta...)
[15:56:30] I'm trying to follow this tutorial - https://wikitech.wikimedia.org/wiki/Help:Toolforge/Database#MySQL_Workbench - to connect to databases on toolforge using mysql workbench but I'm having a little trouble... forgive me in advance, I don't really understand the things I'm working with, I'm just trying my best to follow the tutorials
[15:57:11] and I get the error result: "Could not access the SSH tunnel: Access denied for 'none'. Authentication that can continue: publickey,hostbased" [retry] [cancel]
[15:58:54] does anyone have any troubleshooting ideas they could give me :(
[16:00:38] I don’t have much :/ but that “none” sounds weird… did you put the right SSH username into MySQL Workbench?
[16:00:47] it has to be the shell username, not the Wikitech user name
[16:01:22] yeah my UNIX shell username is f4udeveloper and i put in f4udeveloper
[16:01:28] copied and pasted those here just to double check
[16:01:33] ok
[16:02:07] then I think I’m out of ideas at the moment, sorry :(
[16:02:18] (I don’t use MySQL Workbench myself… maybe someone else who is around does)
[19:03:30] Can anyone !help me out on trying to connect to databases using MySQL Workbench? I've been troubleshooting (trying different keys, double-checking passwords), and it's still returning the same "Could not access the SSH tunnel: Access denied for 'none'. Authentication that can continue: publickey,hostbased" error
[19:31:47] https://forums.mysql.com/read.php?152,683814,683814#msg-683814 suggests you may be using the wrong key file and/or the wrong key format
[19:42:30] Could that be possible even if I've been able to use the key to access toolforge on the command line? I am using an openssh key and not a putty key and the key file seems to be the correct one? (i.e. putting in the wrong key file gets a different error message)
[19:48:55] Well, no one can see exactly what you've set up on your local machine
[19:49:01] So it is some amount of "educated guessing"
[19:53:02] sorry uhh lemme see if i can send a screenshot
[19:53:42] https://imgur.com/a/ppKo64r
[19:55:34] the SSH key file goes to a file that was generated like this: https://imgur.com/a/rJzfm3C
[19:56:31] The Username and Password parameters were generated using 'cat replica.my.cnf' in the command line in toolforge
[19:57:40] all other parameters are unchanged from their defaults https://imgur.com/a/sO0ikPK
[19:58:34] (y)
[20:00:50] Are you using an ssh agent?
[20:01:07] And/or have you given mysqlworkbench the password for the key?
[20:02:08] No, I'm not, and yes, I have (in the SSH Password parameter)
[20:04:01] At least, I don't think I am using an ssh agent.
Sorry, I don't know very much about this stuff--but I presume it wouldn't be possible for me to accidentally be using one
[20:07:58] Just double checking... the private key you're currently trying to use is the one you've uploaded, right?
[20:08:05] ie ssh still works (rather than it only did initially)
[20:08:49] Does the test connection give you more advanced logs? It's a while since I touched workbench
[20:09:23] By uploaded, do you mean if I've added the public key to toolforge? (yes, the private key corresponds to that key)
[20:09:31] Yeah
[20:09:38] and the key works when i use it on command line
[20:09:44] Like you say, it was obviously right at one point
[20:09:55] I just wanted to make sure you hadn't accidentally overwritten it generating other keys with the same name
[20:10:22] i.e. ssh -i "C:\Users\sench\.ssh\id_rsa" f4udeveloper@login.toolforge.org still works in powershell
[20:11:12] i'm not sure about the more advanced logs, i'll try to see if it does
[20:14:56] you might be pleased to know... I'm getting the exact same error
[20:15:35] that's great haha
[20:15:50] or maybe not, would've been nicer if it was just some mistake on my part
[20:15:52] I wonder if it's just broken on the latest build
[20:15:58] should i downgrade?
[20:16:00] i've gotten the logs now
[20:16:07] I'm just trying it to see
[20:16:20] would it be unwise to paste them all in this chat or is that fine
[20:16:26] it's not thaaat big
[20:16:27] yeah, use a pastebin ;)
[20:16:33] unless it's only a few lines
[20:16:34] ah ok
[20:16:56] https://pastebin.com/bASVzQzR here
[20:19:03] what type of key are you using?
[20:19:04] https://bugs.mysql.com/bug.php?id=94620
[20:19:26] as mine is ed25519
[20:19:38] i saw that, i'm using rsa as a result
[20:20:04] Not according to those logs
[20:20:12] 20:14:44 [INF][ SSHCommon]: libssh: ssh_kex_select_methods ssh_kex_select_methods: Negotiated curve25519-sha256,ssh-ed25519,chacha20-poly1305@openssh.com,chacha20-poly1305@openssh.com,aead-poly1305,aead-poly1305,none,none,,
[20:20:27] huh this was how i generated it: https://imgur.com/a/rJzfm3C
[20:20:44] look at the .pub file... what does it say?
[20:21:00] oh
[20:21:02] (The easiest workaround here.. may be to just do the ssh tunnel and mysql parts separately)
[20:21:09] no
[20:21:12] it starts with ssh-rsa
[20:21:53] the .pub file consists of 'ssh-rsa [the key] [my email]'
[20:22:03] Some weird caching in workbench?
[20:22:22] possibly? i'll try redoing this from scratch
[20:22:30] restart it too
[20:22:41] will do
[20:22:59] I think it could indeed be an RSA key… I think kex methods are different from key types, and I’m not sure if they’re even connected at all
[20:23:14] Wouldn't there be some mention in there...
[20:23:16] (some RSA key exchange got deprecated and eventually removed due to being SHA1-based, but RSA keys still work)
[20:23:42] >My inability to use an ed25519 key was because MySQL Workbench doesn't seem to deal with key passphrases. It worked after I removed passphrase from the key.
[20:23:44] 2019
[20:23:45] * Reedy sighs
[20:24:01] oh no
[20:24:08] is that why?
[20:24:25] march 2023
[20:24:26] >Version 8 now supports ed25519 but only if the private key is not password protected. ed25519 should also be able to be used if the private key is password protected.
[20:24:44] [20:21:02] (The easiest workaround here.. may be to just do the ssh tunnel and mysql parts separately)
[20:25:05] https://wikitech.wikimedia.org/wiki/Help:Toolforge/Database#SSH_tunneling_for_local_testing_which_makes_use_of_Wiki_Replica_databases
[20:25:11] Then you can use workbench on top...
[20:25:49] Which may be easier than trying to work out Oracle not wanting to fix bugs
[20:27:16] Sorry in advance that I'm very unfamiliar with all of this, would I just enter "ssh -N f4udeveloper@dev.toolforge.org -L 3306:enwiki.analytics.db.svc.wikimedia.cloud:3306" in powershell?
[20:27:49] That looks... right
[20:28:14] “Version 8 now supports ed25519 but only if the private key is not password protected” wtf oracle
[20:28:22] ikr?
[20:29:06] https://imgur.com/a/XbqxcDf this is what i get when i do that...
[20:29:36] You've not got mysql installed locally, do you?
[20:32:45] it's either you need to pick another port
[20:32:55] Or your powershell isn't running with enough (admin) permissions
[20:34:25] >You've not got mysql installed locally, do you?
[20:34:25] I might? Honestly no clue
[20:35:12] Get-Process -Id (Get-NetTCPConnection -LocalPort 3306).OwningProcess
[20:35:52] Handles  NPM(K)    PM(K)      WS(K)     CPU(s)     Id  SI ProcessName
[20:35:52] -------  ------    -----      -----     ------     --  -- -----------
[20:35:53]     526     332   619344      11396      65.75   7232   0 mysqld
[20:36:11] yeah
[20:36:42] So change that first 3306 after -L for something else
[20:36:45] (check if it's free first)
[20:39:03] uninstalled mysql server instead, dunno when it got installed, port is free now, and i've run the command, and now it's just still
[20:39:30] just still what? :P
[20:39:41] you might need to kill the process in task manager
[20:39:51] still as in the adjective lol https://imgur.com/a/6mHl899 here's what it looks like
[20:40:07] Ohhh
[20:40:11] Right, yeah, that's what we'd expect
[20:40:19] you can set up workbench now
[20:40:40] but use standard (TCP/IP)
[20:40:48] hostname 127.0.0.1...
[20:40:55] username and password as per your replica.my.cnf file
[20:44:30] caught up (just reinstalling workbench lol)
[20:44:34] do i put enwiki_p still in default schema?
[20:45:01] Yeah, but possibly not completely necessary
[20:45:46] i think it's worked omg
[20:46:09] lemme run something to test
[20:46:26] yes it works omg
[20:46:27] i've been trying to get this to work all day thank you
[20:47:41] https://wikitech.wikimedia.org/w/index.php?title=Help%3AToolforge%2FDatabase&diff=2243176&oldid=2243170
[20:47:46] Hopefully try and prevent someone else failing over this
[20:48:40] Just try and remember for future.. You need to set up that tunnel manually every time etc
[20:49:47] yeah I'm writing down everything I've done so I can remember this in the future, I'd definitely forget otherwise
[20:49:59] \o/
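
(Editor's note, not part of the log.) The tunnel workaround above comes down to two steps: make sure the local port is not already taken (here a forgotten local mysqld was holding 3306), then forward it through the bastion with `ssh -N ... -L`. The rough Python sketch below only illustrates that sequence. The function names `is_port_free`, `pick_local_port` and `build_tunnel_command`, and the fallback ports 3307 and 3308, are hypothetical and do not come from the chat or from any Toolforge tooling; the hostnames and the username are the ones quoted in the conversation. The bind-based check is approximate and may not catch every way a port can be in use.

```python
import socket

# Hostnames as quoted in the chat; treat them as examples, not as an API.
BASTION = "dev.toolforge.org"
DB_HOST = "enwiki.analytics.db.svc.wikimedia.cloud"


def is_port_free(port, host="127.0.0.1"):
    """Rough check: True if we can bind host:port ourselves right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def pick_local_port(candidates=(3306, 3307, 3308)):
    """Return the first candidate port that looks free (candidates are arbitrary)."""
    for port in candidates:
        if is_port_free(port):
            return port
    raise RuntimeError("none of the candidate local ports is free")


def build_tunnel_command(user, local_port, db_host=DB_HOST,
                         bastion=BASTION, remote_port=3306):
    """Build the argument list for the ssh port forward used in the chat."""
    return ["ssh", "-N", f"{user}@{bastion}",
            "-L", f"{local_port}:{db_host}:{remote_port}"]


if __name__ == "__main__":
    port = pick_local_port()
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(build_tunnel_command("f4udeveloper", port)))
```

With the tunnel running, Workbench is pointed at 127.0.0.1 and the chosen local port using the standard (TCP/IP) connection method, with the username and password from replica.my.cnf, as described in the chat.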