[02:39:44] Anyone else experiencing issues with toolforge login? Currently I'm not able to SSH -- getting `Connection closed by 185.15.56.62 port 22`
[03:29:50] Looks like dev.toolforge.org is working, but login.toolforge.org is down.
[05:59:41] my cloud VPS proxy is not accessible anymore. Logging into horizon, it seems like the VM is online, and the networks are active. This is for ipoid-opensearch.wmcloud.org
[07:01:48] kostajh: i do not see anything on the target instance listening on port 9200, where the proxy is set to send requests to
[07:15:28] taavi: thanks for looking. I didn’t change anything, though.
[07:41:30] I guess OpenSearch crashed, but I also cannot SSH into the box, seems to be a timeout
[07:44:09] kostajh: weirdly enough i'm seeing no attempts for your user to log in there.. can you try with `ssh -v` and paste the output somewhere?
[07:45:43] taavi: sorry, was using `ctrl-r` which had `ssh ipoid-opensearch.wmcloud.org` in my history, which does not work, but `ssh ipoidopensearch.ipoidopensearch.eqiad1.wikimedia.cloud` does.
[07:46:01] ah, that'd do it
[07:46:51] and indeed, OpenSearch was offline due to OOM killer
[08:02:57] kostajh: btw, I note your security group uses a source IP other than what we recommend at https://wikitech.wikimedia.org/wiki/Help:Using_a_web_proxy_to_reach_Cloud_VPS_servers_from_the_internet#Security_groups, it'll break whenever we replace the proxy VMs (i.e. what I'm about to do later today)
[08:06:34] mdaniels5757: I can login to both without issues, if you are still having problems, can you try adding `-vv` to your ssh command and pasting the output somewhere?
[08:07:52] I see some ldap issues around the time you tried too :/
[08:08:04] `May 08 02:12:39 tools-bastion-13 sssd[2845878]: Child [3111197] ('wikimedia.org':'%BE_wikimedia.org') was terminated by own WATCHDOG. Consult corresponding logs to figure out the reason.` (UTC time)
[08:08:09] that might have caused login issues
[08:22:57] !log project-proxy migrating 185.15.56.49 floating IP to the new proxy cluster T379175
[08:23:01] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Project-proxy/SAL
[08:23:03] T379175: Enable IPv6 for the Cloud VPS web proxy - https://phabricator.wikimedia.org/T379175
[08:36:15] !log project-proxy flip redis/API primary status to proxy-5 T379175
[08:36:17] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Project-proxy/SAL
[08:36:17] T379175: Enable IPv6 for the Cloud VPS web proxy - https://phabricator.wikimedia.org/T379175
[08:40:45] !log project-proxy shutdown proxy-03/04
[08:40:47] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Project-proxy/SAL
[09:22:08] Hi guys, can't `become` any tool. It is warning me that sudo requires a password. What can I do?
[09:24:25] try now?
[09:24:35] !log tools root@tools-bastion-13:~# systemctl restart sssd-sudo.socket # was in failed state
[09:24:38] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[09:24:42] It worked. Thank you!
[10:36:13] taavi: thanks, I have now updated the security groups per the doc you linked
[10:53:43] !log wmflabsdotorg backfilling AAAA records for web proxies using wmflabs.org T379175
[10:53:46] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Wmflabsdotorg/SAL
[10:53:47] T379175: Enable IPv6 for the Cloud VPS web proxy - https://phabricator.wikimedia.org/T379175
[11:31:59] !log cloudinfra backfilling AAAA records for web proxies using wmcloud.org T379175
[11:32:02] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Cloudinfra/SAL
[11:32:02] T379175: Enable IPv6 for the Cloud VPS web proxy - https://phabricator.wikimedia.org/T379175
[13:05:34] !log lucaswerkmeister@tools-bastion-13 tools.ranker deployed d1e60efda3 (l10n updates: nl)
[13:05:36] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.ranker/SAL
[13:07:04] !log lucaswerkmeister@tools-bastion-13 tools.wd-image-positions deployed a330358f39 (l10n updates: hu, ka); also includes 8865bb0c67 (l10n updates: kaa) – apparently I forgot to `git rebase` after `git fetch` last time 🤦
[13:07:05] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.wd-image-positions/SAL
[13:07:22] (insert use case for push-to-deploy here)
[14:43:40] !log lucaswerkmeister@tools-bastion-13 tools.wdactle deployed 054a3064ca (minor code fixes, setting up ESLint and GitLab CI; should have no functional changes)
[14:43:43] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.wdactle/SAL
[15:54:23] !log lucaswerkmeister@tools-bastion-13 tools.wdactle deployed 7ef1b88fa0 (logical viewport lengths) [an hour ago, the initial dologmsg failed and I didn’t notice]
[15:54:25] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.wdactle/SAL
[15:59:29] !log lucaswerkmeister@tools-bastion-13 tools.wdactle deployed b6a2243821 (better mobile layout)
[15:59:32] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.wdactle/SAL
[19:14:38] can't ssh to toolforge bastion 13, connection just closes
[19:15:50] looks like it accepted my key first, then connection closed
[19:17:04] huh, indeed
[19:17:05] sssd's sad, again
[19:17:12] I still have an open connection and that one’s working fine though
[19:18:22] we've seen this enough recently that I've created the very bare-bones T393732 to track them
[19:18:22] T393732: sssd/LDAP flakiness (May 2025) - https://phabricator.wikimedia.org/T393732
[19:19:04] that particular host seems to be back working
[20:21:45] toolforge issues? can't `ssh login.toolforge.org`
[20:23:48] T393732 strikes again
[20:23:49] T393732: sssd/LDAP flakiness (May 2025) - https://phabricator.wikimedia.org/T393732
[20:26:14] I'm not sure it's ldap, that host just seems generally upset.
[20:28:11] it seems to have been having issues on and off for the last ~18 hours
[20:33:56] There was a very cpu-hungry job running there that I killed. Any better?
[20:34:36] still getting "Connection closed by 185.15.56.62 port 22"
[20:35:24] hmmm
[20:35:27] and I can't sudo
[20:35:35] So maybe two problems at once
[20:38:19] oh i just got in. it's very slow though
[20:38:35] I restarted sssd, seems to have improved things somewhat
[20:38:46] but still doesn't seem quite right
[20:40:09] ok, basically everything seems fast and normal now. Not sure if I fixed it or if ldap got healthy on its own...
[22:31:01] !log bd808@tools-bastion-12 tools.gitlab-content Build new image from 2952bfc9
[22:31:03] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.gitlab-content/SAL
[22:32:12] !log bd808@tools-bastion-12 tools.gitlab-content Restarted to deploy new image from 2952bfc9
[22:32:13] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.gitlab-content/SAL