[00:20:12] Hello?
[00:22:21] I'm using Tool Labs. I had created a table at host tools-db.
[00:22:26] I find that I can insert data into the table, but I can't "LOAD DATA INFILE" into the table.
[00:22:31] Can someone help me?
[00:22:31] Hi kanashimi, just ask! There is no need to ask if you can ask
[00:24:33] kanashimi, with the LOCAL keyword?
[00:24:45] It says "ERROR 1045 (28000): Access denied for user 'user'@'%' (using password: YES)".
[00:25:22] You're not going to be allowed to read files remotely on tools-db
[00:27:22] Thank you Krenair. It's OK now.
[00:28:41] Krenair: May I store file at tools-db?
[00:28:56] no
[02:19:07] Labs, Discovery, Maps: Replacements for a.toolserver.org, b.toolserver.org, c.toolserver.org not available - https://phabricator.wikimedia.org/T103272#2135999 (scfc) >>! In T103272#1896916, @Yurik wrote: > @scfc - I'm working on that as part of the [[ https://www.mediawiki.org/wiki/Extension:Kartograph...
[06:48:35] PROBLEM - Puppet run on tools-exec-gift is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[06:50:29] PROBLEM - Puppet run on tools-webgrid-lighttpd-1415 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[08:34:53] PROBLEM - Host tools-bastion-01 is DOWN: CRITICAL - Host Unreachable (10.68.17.228)
[08:53:02] PROBLEM - Puppet run on tools-worker-1008 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [0.0]
[08:56:00] PROBLEM - Puppet run on tools-exec-1205 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[08:56:00] PROBLEM - Puppet run on tools-proxy-01 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [0.0]
[08:57:04] PROBLEM - Puppet run on tools-webgrid-lighttpd-1404 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[08:57:20] PROBLEM - Puppet run on tools-webgrid-lighttpd-1414 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[08:58:52] PROBLEM - Puppet run on tools-webgrid-lighttpd-1208 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[09:03:00] PROBLEM - Puppet run on tools-exec-1202 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [0.0]
[09:03:24] PROBLEM - Puppet run on tools-exec-1403 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[09:03:46] PROBLEM - Puppet run on tools-worker-1003 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[09:05:18] PROBLEM - Puppet run on tools-bastion-05 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0]
[09:07:06] PROBLEM - Puppet run on tools-bastion-mtemp is CRITICAL: CRITICAL: 70.00% of data above the critical threshold [0.0]
[09:07:23] PROBLEM - Puppet run on tools-precise-dev is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[09:07:23] PROBLEM - Puppet run on tools-webgrid-lighttpd-1405 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[09:07:29] PROBLEM - Puppet run on tools-exec-1407 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[09:07:57] PROBLEM - Puppet run on tools-exec-1221 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[09:08:07] PROBLEM - Puppet run on tools-exec-1401 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0]
[09:08:47] where do I find the lag time between labs, and English Wikisource?
[09:08:54] Labs, Patch-For-Review: Convert all ldap globals into hiera variables instead - https://phabricator.wikimedia.org/T101447#2136156 (yuvipanda) a:yuvipanda>None Unlicking cookie!
[09:09:35] PROBLEM - Puppet run on tools-exec-1201 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[09:11:21] PROBLEM - Puppet run on tools-exec-1406 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[09:13:03] PROBLEM - Puppet run on tools-exec-1410 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0]
[09:13:49] PROBLEM - Puppet run on tools-exec-1217 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[09:13:49] PROBLEM - Puppet run on tools-k8s-master-01 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0]
[09:13:50] PROBLEM - Puppet run on tools-mail-01 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[09:14:35] PROBLEM - Puppet run on tools-exec-gift is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[09:19:02] PROBLEM - Puppet run on tools-worker-1006 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [0.0]
[09:19:37] bah I cannot login to labs
[09:21:00] PROBLEM - Puppet run on tools-webgrid-lighttpd-1210 is CRITICAL: CRITICAL: 90.00% of data above the critical threshold [0.0]
[09:21:34] PROBLEM - Puppet run on tools-exec-1220 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0]
[09:21:44] PROBLEM - Puppet run on tools-exec-1209 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[09:23:00] PROBLEM - Puppet run on tools-k8s-bastion-01 is CRITICAL: CRITICAL: 70.00% of data above the critical threshold [0.0]
[09:23:01] PROBLEM - Puppet run on tools-webgrid-lighttpd-1411 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[09:27:03] Disconnected: No supported authentication methods available (server sent: publickey)
[09:27:20] using Putty, and I have changed nothing since I last logged in a couple of weeks ago
[09:31:38] and ConnectBot on my phone doesn't work either
[09:53:33] hi there, I’m unable to ssh onto tools-login.wmflabs.org and login.tools.wmflabs.org (user ireas)
[09:53:39] did I miss something?
[09:53:45] Permission denied (publickey,hostbased).
[10:04:11] Labs, Labs-Infrastructure: Unable to SSH onto tools-login.wmflabs.org - https://phabricator.wikimedia.org/T130446#2136202 (Ireas)
[10:07:05] ireas: as you reported in your bug, yep, problem for me too
[10:07:48] there were a lot of shinken alerts from 8:53 to 9:23 utc
[10:14:15] oh, I see. what does ‘data above threshold’ mean? memory? disk space?
[10:21:26] i guess it means files managed by puppet
[10:43:49] Labs, Labs-Infrastructure: Unable to SSH onto tools-login.wmflabs.org - https://phabricator.wikimedia.org/T130446#2136276 (hashar) The labs LDAP has some kind of troubles apparently. [08:52:28] PROBLEM - Labs LDAP on seaborgium is CRITICAL: Could not bind to the LDAP server I can't authent...
[10:48:13] Labs, Labs-Infrastructure: Unable to SSH onto tools-login.wmflabs.org - https://phabricator.wikimedia.org/T130446#2136280 (hashar) I have poked the internal operations list. Can't further babysit this task right now though :-(
[10:51:52] !log Labs LDAP is probably down. T130446 Can't log to tools-login.wmflabs.org / Jenkins interface and Nodepool yields error 500 communicating with OpenStack API
[10:51:53] T130446: Unable to SSH onto tools-login.wmflabs.org - https://phabricator.wikimedia.org/T130446
[10:51:53] Labs is not a valid project.
[11:05:20] Hi, can someone tell me, why all my public keys at wikitech are gone?
[11:05:43] Luke081515: https://phabricator.wikimedia.org/T130446
[11:05:47] LDAP seems to be down
[11:06:22] ireas: Thanks
[11:06:40] Labs, Labs-Infrastructure: Unable to SSH onto tools-login.wmflabs.org - https://phabricator.wikimedia.org/T130446#2136316 (Luke081515) p:Triage>Unbreak!
[11:38:31] Luke081515 ireas can you try again?
[11:43:59] Labs, Labs-Infrastructure: Unable to SSH onto tools-login.wmflabs.org - https://phabricator.wikimedia.org/T130446#2136202 (fgiunchedi) looks like slapd got oom-killed, I've restarted it on seaborgium ```lines=4 Mar 19 08:48:29 seaborgium puppet-agent[8502]: Caching catalog for seaborgium.wikimedia.org Ma...
[11:53:03] RECOVERY - Puppet run on tools-exec-1410 is OK: OK: Less than 1.00% above the threshold [0.0]
[11:53:47] RECOVERY - Puppet run on tools-k8s-master-01 is OK: OK: Less than 1.00% above the threshold [0.0]
[11:56:01] RECOVERY - Puppet run on tools-webgrid-lighttpd-1210 is OK: OK: Less than 1.00% above the threshold [0.0]
[11:57:58] RECOVERY - Puppet run on tools-k8s-bastion-01 is OK: OK: Less than 1.00% above the threshold [0.0]
[11:59:27] RECOVERY - Puppet run on tools-webgrid-lighttpd-1415 is OK: OK: Less than 1.00% above the threshold [0.0]
[12:03:03] RECOVERY - Puppet run on tools-webgrid-lighttpd-1411 is OK: OK: Less than 1.00% above the threshold [0.0]
[12:03:03] RECOVERY - Puppet run on tools-worker-1008 is OK: OK: Less than 1.00% above the threshold [0.0]
[12:06:04] RECOVERY - Puppet run on tools-exec-1205 is OK: OK: Less than 1.00% above the threshold [0.0]
[12:07:04] RECOVERY - Puppet run on tools-webgrid-lighttpd-1404 is OK: OK: Less than 1.00% above the threshold [0.0]
[12:07:50] Labs, Labs-Infrastructure: Unable to SSH onto tools-login.wmflabs.org - https://phabricator.wikimedia.org/T130446#2136202 (MoritzMuehlenhoff) serpens is still running fine. All labs instances use both serpens and seaborgium in their LDAP client config. tools-login and nodetool should also be converted to...
[12:12:04] RECOVERY - Puppet run on tools-bastion-mtemp is OK: OK: Less than 1.00% above the threshold [0.0]
[12:12:20] RECOVERY - Puppet run on tools-webgrid-lighttpd-1405 is OK: OK: Less than 1.00% above the threshold [0.0]
[12:13:06] RECOVERY - Puppet run on tools-exec-1202 is OK: OK: Less than 1.00% above the threshold [0.0]
[12:16:34] Labs, Labs-Infrastructure: Unable to SSH onto tools-login.wmflabs.org - https://phabricator.wikimedia.org/T130446#2136415 (Billinghurst) Now working for me. Thanks.
[12:30:14] !log zulip restart nslcd on zulip-01
[12:30:17] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Zulip/SAL, Master
[12:34:50] !log zulip service supervisor stop, causing high traffic from ldap server T130446
[12:34:51] T130446: Unable to SSH onto tools-login.wmflabs.org - https://phabricator.wikimedia.org/T130446
[12:34:53] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Zulip/SAL, Master
[12:35:12] Labs, Labs-Infrastructure: Unable to SSH onto tools-login.wmflabs.org - https://phabricator.wikimedia.org/T130446#2136420 (fgiunchedi) possibly related, `nslcd` on `zulip-01` was causing ~3MB/s of outgoing traffic on serpens and now seaborgium after a `service nslcd restart`. Likely due to fast-respawning...
[12:42:49] Labs, Labs-Infrastructure: Unable to SSH onto tools-login.wmflabs.org - https://phabricator.wikimedia.org/T130446#2136426 (Ireas) Working for me too, thanks! Do you want to leave this task open to investigate the cause of the problem, or should I close it?
[12:59:58] Labs, Labs-Infrastructure: Unable to SSH onto tools-login.wmflabs.org - https://phabricator.wikimedia.org/T130446#2136433 (tom29739) Working for me now.
[13:00:55] Labs, Labs-Infrastructure: Unable to SSH onto tools-login.wmflabs.org - https://phabricator.wikimedia.org/T130446#2136435 (fgiunchedi) p:Unbreak!>Normal thanks @ireas we can leave it open as there's some followup to do still!
[13:05:54] Labs, Labs-Infrastructure: Unable to SSH onto tools-login.wmflabs.org - https://phabricator.wikimedia.org/T130446#2136446 (hashar) All back for me as well. Thanks @fgiunchedi and @MoritzMuehlenhoff For the record, Jenkins solely relied on `ldap-labs.eqiad.wikimedia.org`, (seaborgium), I have added the o...
[13:14:22] i changed my pw and fixed the my.replica file... but sometimes (!!!) i get _mysql_exceptions.OperationalError: (1045, "Access denied for user 's51916'@'10.68.23.74' (using password: YES)")
[13:14:30] maybe there is a synchronization issue?
[13:15:51] tools.sbot@tools-bastion-05:~$ sql metawiki
[13:15:51] ERROR 1045 (28000): Access denied for user 's51916'@'10.68.23.74' (using password: YES)
[13:15:59] but it works with commons. STRANGE.
[13:17:19] Steinsplitter: 'changed your password'?
[13:17:45] i changed the mysql pw for s51916, yes
[13:18:13] SET PASSWORD = PASSWORD('mynewpw');
[13:18:17] On all five (I think?) servers?
[13:18:52] just on commonswiki.labsdb (which is s4)
[13:19:03] okay, looks like i have to change on all dbs :(
[13:19:08] Yes, so now on all other servers you still have your old password.
[13:19:16] ok, thx
[13:19:28] changing mysql passwords is not really a supported operation
[13:31:07] Labs: designatedashboard monkeypatch for proxy records - https://phabricator.wikimedia.org/T130151#2136457 (Danny_B)
[15:07:38] !log rcm.cac preparing git pull
[15:07:39] rcm.cac is not a valid project.
[15:07:43] meh
[15:07:50] valhallasw`cloud: tools.ircredirector is burning 90% cpu on the login node
[15:07:53] Can you kill it?
[15:08:23] 4525 tools.i+ 20 0 43044 5512 1664 R 94.5 0.1 10871:27 bash
[15:09:07] !log rcm.cac preparing git pull
[15:09:07] rcm.cac is not a valid project.
[15:09:13] strange
[15:10:33] tom29739: please don't run high-cpu loads on login nodes :-)
[15:11:27] valhallasw`cloud, what is running? I'm not on that user at the moment, so I don't know what's causing it.
[15:11:42] tom29739: a bash process at 99% cpu
[15:11:53] (killed now)
[15:12:04] Strange. Thanks for letting me know.
[15:16:30] multichill: I cleaned up some other long-running cpu-slurping processes while I was at it
[15:32:43] !log ores workers_per_core 32 --> 28
[15:32:46] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Ores/SAL, Master
[15:33:31] * halfak re-runs puppet
[15:34:01] can someone help me with sshing as root to an instance?
[15:34:30] Luke081515, maybe I can. What instance? Is it a project you are an admin of?
[15:34:47] I'm admin of the rcm project, and the instance is cac
[15:35:14] Luke081515, when you say "ssh as root", is it OK to just ssh as yourself and then run commands via "sudo"?
[15:35:32] luke081515@bastion-01:~$ ssh cac
[15:35:32] ssh_exchange_identification: read: Connection reset by peer
[15:35:38] :-/
[15:36:18] Hmmm... Do you have an alias in your ssh for cac?
[15:36:29] I don't think so
[15:36:57] ssh cac connects me normally to cac.rcm.eqiad.wmflabs
[15:37:04] What machine are you ssh-ing from?
[15:37:11] from bastion-01
[15:37:28] Gotcha. Ever logged into the machine before?
[15:37:42] yeah, that worked
[15:38:33] OK. So I suspect that there is something up with the instance.
[15:39:26] Luke081515, would it be OK to reboot it via wikitech?
[15:43:03] valhallasw`cloud, around?
[15:43:08] halfak: si
[15:43:19] Looks like this instance might have been borked, but Luke081515 needs to get a copy of a DB from it.
[15:43:38] I wonder if you can use your viking powers (root) to help us.
[15:43:39] ;)
[15:44:16] root access doesn't help if the instance just closes the connection ;-)
[15:44:34] heh. Was hoping you had some ideas/tricks
[15:44:51] apart from rebooting not really
[15:44:51] Or maybe a way to mount the VM's filesystems elsewhere
[15:45:06] rebooting might be a fine option
[15:45:12] those are viking powers I do not possess, unfortunately :-)
[15:45:20] Luke081515, what do you think about a reboot?
[15:45:31] I can try it
[15:45:42] Can't really get worse.
[15:45:54] reboot started
[15:45:55] I suppose we might not get to a useful run-lebel
[15:45:59] *level
[15:48:12] !log ores Ran puppet and restarted uwsgi on web-01 with 28 forks rather than 32
[15:48:14] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Ores/SAL, Master
[15:49:10] halfak: The reboot helped, I get another error now ;)
[15:49:13] luke081515@bastion-01:~$ ssh cac
[15:49:13] Permission denied (publickey).
[15:49:15] Yay!
[15:49:19] That's a good error :)
[15:49:23] valhallasw`cloud, ^
[15:49:33] sec
[15:50:19] it doesn't take my root key
[15:50:37] is it possible that puppet hasn't run there for a while?
[15:52:11] possible
[15:52:40] seems like other processes stopped too, NAGF shows less CPU usage
[15:56:04] RECOVERY - Puppet run on tools-exec-1401 is OK: OK: Less than 1.00% above the threshold [0.0]
[16:03:07] If I want to enable 2FA, do I have to install an app or something else?
[16:06:23] google authenticator or similar
[16:06:41] but you probably shouldn't, because it will make recovery difficult if you ever lose your phone
[16:07:12] otherwise I can't use horizon :-/
[16:07:19] and there currently is no good policy in place for recovery (other than 'show up at the WMF and have someone vet it's really you')
[16:07:35] horizon will require it.
[16:08:00] So it's not just a temporary requirement andrewbogott
[16:08:21] the 'no recovery policy' part should be temporary ;-)
[16:10:45] ^ +1
[16:12:25] is there a way to fix my instance, so that I can login again?
[16:15:08] Labs, Horizon: Horizon - Can't execute actions - https://phabricator.wikimedia.org/T127440#2136690 (Luke081515) Open>Resolved I enabled 2FA, it works now.
[16:17:17] Luke081515: file a bug and ask andrewbogott to take a look at it
[16:17:33] ok
[16:20:47] andrewbogott: Are you here?
[16:33:51] Anyone here have +2 for puppet? I'm blocked on getting this merged: https://gerrit.wikimedia.org/r/#/c/278413/
[16:43:30] !log ores Manually ran `sudo apt-get install aspell-ar aspell-pl` across web and worker nodes
[16:43:33] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Ores/SAL, Master
[17:27:57] !log list
[17:27:58] Message missing. Nothing logged.
[17:28:07] log list
[17:28:44] hey, anyone can add "!log" command to #wikimedia-ai for https://wikitech.wikimedia.org/wiki/Nova_Resource:Ores ?
[18:40:07] Labs-Other-Projects: Succesful pilot of Discourse on https://discourse.wmflabs.org/ as an alternative to wikimedia-l mailinglist - https://phabricator.wikimedia.org/T124690#1962887 (ThurnerRupert) discourse has too many weak points imo: 1. it is not a proper mailing list server. it e.g. reformats mails, cuts...
[19:48:34] RECOVERY - Puppet run on tools-exec-1220 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:52:08] ores-web-04.eqiad.wmflabs seems to be unreachable
[19:52:16] I can't reboot it with the wikitech interface.
[19:52:20] Any ideas?
[19:52:31] I'd really like to get the darn thing rebooted.
[19:53:13] halfak: does it get in state 'REBOOT' or does it stay in 'RUNNING'?
[19:53:32] valhallasw`cloud, I get a little modal that says "Failed to reboot instance ores-web-04."
[19:53:38] No state changes or anything.
[19:53:39] O_o
[19:53:42] I can't even get console output
[19:53:48] Kill it?
[19:54:38] halfak: https://wikitech.wikimedia.org/w/index.php?title=Special:NovaInstance&action=consoleoutput&instanceid=7d8fe37f-e3fd-4fb4-ad9f-6de4e1e88acb&project=ores&region=eqiad works for me
[19:54:48] OK if I try to reboot it?
[19:55:28] valhallasw`cloud, yes please :)
[19:55:44] "Rebooted instance 7d8fe37f-e3fd-4fb4-ad9f-6de4e1e88acb."
[19:55:47] * valhallasw`cloud crosses fingers
[19:56:05] halfak: could be a weird wikitech state issue, I guess? logging out/in again might have solved it
[19:56:20] valhallasw`cloud, can't access the machine via SSH either
[19:56:25] That's what brought me to wikitech
[19:56:36] halfak: no, it was hanging according to the console output
[19:57:04] but because you also couldn't get console output (while it worked for me), that suggests to me it might have to do something with wikitech rather than openstack itself
[19:57:25] Oh. It didn't work for me for a while, but right after you said it worked for you, I tried again and it did work
[19:57:30] But reboot still did not work
[19:57:34] RECOVERY - Puppet run on tools-exec-1209 is OK: OK: Less than 1.00% above the threshold [0.0]
[19:57:47] mmm.
[19:58:02] * halfak watches uwsgi come back to life
[19:58:02] one option is to right-click the link and open in new tab
[19:58:03] :)
[19:58:11] that forces a full request rather than something ajax-y
[19:58:41] again, not sure if that would have fixed it, but it's always good to have some other tricks up a sleeve ;-)
[19:59:07] Good to know. Thanks
[19:59:15] * halfak waits for new HORIZONs
[20:04:20] !log ores-web-01/02 and ores-worker-01/02/03/04 deleted. ores-web-03/04/05 and ores-worker-05/06/07/08/09/10 started and configured as replacements.
[20:04:22] ores-web-01/02 is not a valid project.
[20:04:27] !log ores ores-web-01/02 and ores-worker-01/02/03/04 deleted. ores-web-03/04/05 and ores-worker-05/06/07/08/09/10 started and configured as replacements.
[20:04:30] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Ores/SAL, Master
[20:04:33] <3 labs-morebots
[20:32:57] Labs, Labs-Infrastructure, Labs-Other-Projects: Cannot login into cac.rcm.eqiad.wmflabs - https://phabricator.wikimedia.org/T130471#2136953 (Luke081515)
[21:05:18] Coren: andrewbogott: There's no way to call the instance cac.rcm.eqiad.wmflabs / what is wrong, 502 Bad Gateway
[21:05:38] can you help please
[21:06:23] valhallasw`cloud : maybe you can help, too, can't you?
[21:06:52] doctaxon: 21:31 Labs, Labs-Infrastructure, Labs-Other-Projects: Cannot login into cac.rcm.eqiad.wmflabs - https://phabricator.wikimedia.org/T130471#2136953 (Luke081515)
[21:08:04] Labs, Labs-Infrastructure, Labs-Other-Projects: Cannot login into cac.rcm.eqiad.wmflabs - https://phabricator.wikimedia.org/T130471#2136979 (doctaxon) p:Triage>Unbreak!
[21:08:39] Labs, Labs-Infrastructure, Labs-Other-Projects: Cannot login into cac.rcm.eqiad.wmflabs - https://phabricator.wikimedia.org/T130471#2136953 (doctaxon) The instance is urgently needed for bot services
[21:09:52] doctaxon: please don't set priority if you're not in the group of people who are actually going to fix it
[21:10:49] who does fix it?
[21:12:23] either Luke081515 builds a new host, or one of the people with access to the virtualization hosts figures out what's going on
[21:12:35] the latter is most likely not going to happen before monday
[21:15:27] Labs, Labs-Infrastructure, Labs-Other-Projects: Cannot login into cac.rcm.eqiad.wmflabs - https://phabricator.wikimedia.org/T130471#2136983 (valhallasw) p:Unbreak!>Triage This seems to be another case of IP reuse: ``` valhallasw@tools-bastion-05:~$ ping cac.rcm.eqiad.wmflabs PING cac.rcm.eqiad....
[21:20:09] valhallasw`cloud: is it possible to generate a new publickey to access the instance
[21:20:57] a different key won't help
[21:21:16] what else?
[21:21:22] doctaxon: no, because there seems to be an IP mixup somehow, so it's impossible to contact the host over the network at the moment
[21:21:41] at the moment, apart from rebuilding the host, there's not much that can be done
[21:30:53] Hello. When I do "sql metawiki" labs tell me to enter a password. Which one is the system referring to?
[21:32:24] mafk: the sql password in your replica.my.cnf
[21:32:35] ah, I don't have that
[21:32:41] if it asks for a password, that suggests it can't read that file somehow
[21:32:44] new tool?
[21:32:56] I'm doing it from my labs account directly
[21:33:16] I guess I can "become stewardbots" and do it?
[21:33:34] yes, but your regular user should also have a replica.my.cnf
[21:34:18] maurelio@tools-bastion-05:~$ ls --> gives no files
[21:34:40] Some of the
[21:34:59] *Sometimes it doesn't get generated for some reason.
[21:35:31] mafk: odd. Please file a bug -- it should be there, and it's a bug if it's not
[21:35:49] Okay, I'll do that. Private bug I guess?
[21:35:51] it's probably an issue where the credentials were created but the file was not written yet, or something like that
[21:35:58] mafk: Are you logged into a tool?
[21:35:58] There is an existing bug report for it I think.
[21:36:13] Luke081515: nope, just to maurelio@tools-bastion05
[21:36:22] I can become a tool if needed
[21:36:29] * Luke081515 can see his repilca
[21:36:31] *replica
[21:37:35] mafk: just a regular bug is fine -- it's not a security issue
[21:40:30] Labs, Labs-Infrastructure, Operations: Some labs instances IP have multiple PTR entries in DNS - https://phabricator.wikimedia.org/T115194#2137010 (hashar)
[21:43:29] Labs, Labs-Infrastructure, Labs-Other-Projects: Cannot login into cac.rcm.eqiad.wmflabs - https://phabricator.wikimedia.org/T130471#2137014 (hashar) Here what DNS gives us for the name -> IP: ``` $ dig +short ALL cac.eqiad.wmflabs 10.68.17.44 $ dig +short ALL cac.rcm.eqiad.wmflabs 10.68.17.44 ``` Th...
[22:11:35] Labs, Labs-Infrastructure, Labs-Other-Projects: Cannot login into cac.rcm.eqiad.wmflabs - https://phabricator.wikimedia.org/T130471#2137032 (Luke081515) @hashar I added you as a member, maybe you can take a look again?
[22:13:03] Labs, Labs-Infrastructure, Labs-Other-Projects: Cannot login into cac.rcm.eqiad.wmflabs - https://phabricator.wikimedia.org/T130471#2137035 (valhallasw) You're right, 10.68.17.44 does refer to cac: - cac's eth0 mac is fa:16:3e:a4:0a:1e - thus its ipv6 link-local address is fe80::f816:3eff:fea4:a1e...
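
Note on the last message: going from the eth0 MAC to the IPv6 link-local address looks like the usual modified EUI-64 derivation (flip the universal/local bit of the first octet, insert ff:fe in the middle, prepend fe80::/64). Below is a minimal Python sketch of that step, assuming the instance really uses EUI-64-based link-local addressing; the function name `mac_to_link_local` is made up for illustration and is not part of any Labs tooling.

```python
import ipaddress


def mac_to_link_local(mac: str) -> ipaddress.IPv6Address:
    """Derive the modified EUI-64 link-local address for a 48-bit MAC.

    Hypothetical helper for illustration; assumes colon-separated hex octets.
    """
    octets = [int(part, 16) for part in mac.split(":")]
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC such as fa:16:3e:a4:0a:1e")

    # Flip the universal/local bit (0x02) of the first octet.
    octets[0] ^= 0x02

    # Insert ff:fe between the two halves of the MAC to get 64 bits.
    eui64 = octets[:3] + [0xFF, 0xFE] + octets[3:]
    interface_id = int.from_bytes(bytes(eui64), "big")

    # Combine with the link-local prefix fe80::/64.
    prefix = int(ipaddress.IPv6Address("fe80::"))
    return ipaddress.IPv6Address(prefix | interface_id)


# MAC taken from the task comment quoted in the log.
print(mac_to_link_local("fa:16:3e:a4:0a:1e"))  # fe80::f816:3eff:fea4:a1e
```

With the MAC quoted in the task comment, this reproduces the address given there, which is consistent with 10.68.17.44 and the link-local address belonging to the same interface.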