[07:18:15] <Alien333>	 I'm probably being stupid again, but I can't ssh to toolforge since yesterday. All I get is "Connection closed by 185.15.56.62 port 22"
[07:19:27] <Alien333>	 can someone tell me what exactly that means?
[07:31:33] <Alien333>	 to be precise, the complete log ends with
[07:31:34] <Alien333>	 debug1: Server accepts key: /home/a/.ssh/id_ed25519 ED25519 SHA256:[...] agent
[07:31:34] <Alien333>	 debug3: sign_and_send_pubkey: using publickey-hostbound-v00@openssh.com with ED25519 SHA256:[...]
[07:31:35] <Alien333>	 debug3: sign_and_send_pubkey: signing using ssh-ed25519 SHA256:[...]
[07:31:35] <Alien333>	 debug3: send packet: type 50
[07:31:36] <Alien333>	 Connection closed by 185.15.56.62 port 22
[11:50:19] <lucaswerkmeister>	 (FTR, the above messages were also reported at T393829 which was then marked as a duplicate of T393732)
[11:50:20] <stashbot>	 T393829: Ssh to toolforge failing with "Connection closed by 185.15.56.62 port 22" - https://phabricator.wikimedia.org/T393829
[11:50:20] <stashbot>	 T393732: Toolforge bastion sssd/LDAP flakiness (May 2025) - https://phabricator.wikimedia.org/T393732
[11:52:20] <lucaswerkmeister>	 !log root@tools-bastion-13:~# systemctl restart sssd-pam{,{,-priv}.socket} # all three failed with start-limit-hit / Start request repeated too quickly; T393732?
[11:52:22] <stashbot>	 lucaswerkmeister: Unknown project "root@tools-bastion-13:~#"
[11:52:38] <lucaswerkmeister>	 !log tools root@tools-bastion-13:~# systemctl restart sssd-pam{,{,-priv}.socket} # all three failed with start-limit-hit / Start request repeated too quickly; T393732?
[11:52:40] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[11:53:16] <lucaswerkmeister>	 !log tools T393732 note: restart of sssd-pam.service actually failed, “may be requested by dependency only”; overall it still seems to have worked though (so next time restarting the sockets is probably sufficient)
[11:53:19] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[11:57:57] <lucaswerkmeister>	 right now it seems to be working for me again
[14:07:39] <kanashimi>	 Hi, I can't run `become <mytool>`. Have we changed the way to log in?
[14:08:18] <kanashimi>	 It says "sudo: a password is required"
[14:08:22] <wm-bb>	 <nokibsarkar> try `dev.toolforge.org` instead of `login.toolforge.org`
[14:08:56] <wm-bb>	 <nokibsarkar> @kanashimi
[14:10:29] <lucaswerkmeister>	 !log tools root@tools-bastion-13:~# systemctl restart sssd-sudo.socket # service-start-limit-hit, T393732?
[14:10:33] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[14:10:33] <stashbot>	 T393732: Toolforge bastion sssd/LDAP flakiness (May 2025) - https://phabricator.wikimedia.org/T393732
[14:10:57] <kanashimi>	 Thank you. It works. What happened to login.toolforge.org?
[14:11:12] <lucaswerkmeister>	 it’s been having issues for a few days (see the task above) :(
[14:13:38] <taavi>	 lucaswerkmeister: sigh, so my fix attempt from earlier clearly didn't work?
[14:13:47] <kanashimi>	 OK I see
[14:13:49] <lucaswerkmeister>	 seems so, yeah :(
[14:14:07] <lucaswerkmeister>	 I’ve just been blindly restarting the stuff in `systemctl --failed` in the hope that it helps at least temporarily
[14:14:28] <lucaswerkmeister>	 (though this time I actually wasn’t able to reproduce the error, `become` still worked for me. maybe it was cached somewhere)
[14:32:52] <taavi>	 https://phabricator.wikimedia.org/T393732#10809252, unfortunately any of those will probably have to wait until business hours on Monday
[16:20:46] <wm-bb>	 <marufhasan24> I'm getting Connection closed by 185.15.56.62 port 22
[16:22:02] <lucaswerkmeister>	 !log tools systemctl restart sssd-{pam{,-priv},sudo}.socket # service-start-limit-hit, T393732?
[16:22:06] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[16:22:06] <stashbot>	 T393732: Toolforge bastion sssd/LDAP flakiness (May 2025) - https://phabricator.wikimedia.org/T393732
[16:22:48] <lucaswerkmeister>	 went right down again :(
[16:23:17] <wm-bb>	 <lucaswerkmeister> you can try dev.toolforge.org instead, that might work better at the moment (re @marufhasan24: I'm getting Connection closed by 185.15.56.62 port 22)
[17:33:58] <lucaswerkmeister>	 !log tools root@tools-bastion-13:~# systemctl reset-failed sssd-{pam,sudo}.service && systemctl restart sssd-pam{,-priv}.socket # try to reset the rate limits this way (T393732)
[17:34:02] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[17:34:02] <stashbot>	 T393732: Toolforge bastion sssd/LDAP flakiness (May 2025) - https://phabricator.wikimedia.org/T393732
[17:35:53] <lucaswerkmeister>	 !log root@tools-bastion-13:~# systemctl restart sssd-sudo{,.socket} # looks like the reset-failed didn’t work properly, systemd didn’t even try to start the service again afaict (T393732)
[17:35:53] <stashbot>	 lucaswerkmeister: Unknown project "root@tools-bastion-13:~#"
[17:35:56] <lucaswerkmeister>	 !log tools root@tools-bastion-13:~# systemctl restart sssd-sudo{,.socket} # looks like the reset-failed didn’t work properly, systemd didn’t even try to start the service again afaict (T393732)
[17:35:59] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[17:43:01] <lucaswerkmeister>	 (FTR, these restarts didn’t work, I left a comment on the task)
[17:44:14] <lucaswerkmeister>	 !status login.toolforge.org bastion unstable, dev.toolforge.org may work better
[17:47:15] <taavi>	 lucaswerkmeister: you need to start/stop the socket unit (sssd-sudo.socket), doing anything on the service itself is going to do nothing useful
[17:47:42] <lucaswerkmeister>	 I had restarted both (brace expansion)
[17:48:15] <lucaswerkmeister>	 but restarting just the socket didn’t seem to bring the service out of the rate-limited state
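(Note: the `systemctl` commands logged above rely on bash brace expansion, which is why one command can touch "all three" or "both" units. A minimal sketch for previewing what each pattern expands to, without running anything against systemd; it only assumes `bash` is installed, and goes through `bash -c` because brace expansion is a bash feature rather than POSIX sh.)

```shell
# Preview the unit names bash generates from each pattern used in the log.
# Nothing here talks to systemd; it only prints the expanded words.

bash -c 'echo sssd-pam{,{,-priv}.socket}'
# sssd-pam sssd-pam.socket sssd-pam-priv.socket

bash -c 'echo sssd-{pam{,-priv},sudo}.socket'
# sssd-pam.socket sssd-pam-priv.socket sssd-sudo.socket

bash -c 'echo sssd-sudo{,.socket}'
# sssd-sudo sssd-sudo.socket
```

A name with no suffix, such as `sssd-sudo`, is treated by `systemctl` as `sssd-sudo.service`, so the last pattern covers both the service and its socket unit.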
[18:48:58] <Guest85>	 Looking for a FOSS website analytics solution that doesn't need a lot of storage space on the database, is easy to set up on Toolforge and that can be used with static webpages. Any recommendations?
[18:56:52] <Guest85>	 or maybe there are statistics for 'tools-static' published somewhere already?
[19:29:17] <LD>	 Hi, I might need some help. I changed my SSH key on Toolforge but I can't connect to the server.
[19:30:43] <lucaswerkmeister>	 the main bastion server is having some issues at the moment; dev.toolforge.org might work better
[19:31:11] <LD>	 oh, that might explain it
[19:31:18] <LD>	 what about login.toolforge.org?
[19:31:56] <lucaswerkmeister>	 that’s the one with issues
[19:32:32] <LD>	 any phab ticket related to it?
[19:33:27] <lucaswerkmeister>	 yes, T393732
[19:33:27] <stashbot>	 T393732: Toolforge bastion sssd/LDAP flakiness (May 2025) - https://phabricator.wikimedia.org/T393732
[19:34:14] <LD>	 thanks
[19:46:52] <LD>	 by any chance, can we webrestart or even stop letaxobot.toolforge.org?
[19:58:30] <lucaswerkmeister>	 !log tools.letaxobot webservice restart (per request on behalf of tool maintainer, as the bastion is having issues atm)
[19:58:32] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.letaxobot/SAL
[19:58:47] <lucaswerkmeister>	 seems to be responding again, yay
[19:59:36] <LD>	 thanks lucaswerkmeister; btw something is weird about this tool: I have to webrestart it from time to time or it fails at some point
[20:00:47] <lucaswerkmeister>	 hm, there’s a bunch of noise in error.log that doesn’t look related
[20:01:05] <lucaswerkmeister>	 possibly there was something else in the pod logs that is now gone due to the restart :S
[20:02:18] <lucaswerkmeister>	 a health check (https://wikitech.wikimedia.org/wiki/Help:Toolforge/Web#Health_checks) *might* help? but I’ve never tried them on a PHP / lighttpd webservice
[20:02:29] <lucaswerkmeister>	 (that’s a suggestion for later, once the bastion works again, not now ^^)
[20:02:58] <LD>	 indeed, thanks I'll try later