[09:52:19] Hi, is there a way to see more detailed log messages from a maintenance job? I'm running something like kubectl logs jobs/testkitchen-updateconfigs-29582510 mediawiki-main-app but I only see the messages thrown via $this->output. I wanted to see some log messages thrown via $logger->debug and so on. Is that possible?
[09:52:21] Thanks!
[09:56:56] hmm, i'd expect that to be available somewhere in logstash
[09:58:52] i'm not sure if https://logstash.wikimedia.org/app/dashboards#/view/d51552d0-e309-11ef-87d0-9371e01d3c68?_a=h@5d63f37&_g=(time:(from:now-24h,to:now)) might have that information, it might be identical to the kubectl logs output...
[10:00:19] <_joe_> sfaci: those would be in logstash yeah
[10:00:54] <_joe_> if the job is running in debug logging mode, which I am not sure given I don't think we set loglevel: notice on cli, or that we ever did
[10:01:13] <_joe_> but yes, logger messages get sent to logstash
[10:05:39] Thanks!
[10:05:45] I guess the job is not running in debug logging mode
[10:05:49] is there a way to change this?
[10:06:27] I have seen there is a way to run the job manually but I don't know if it's a safe thing considering that it's also running automatically as scheduled
[10:06:52] I was wondering if I can run it manually in debug logging mode to have the opportunity to see the whole output at least once
[10:08:08] or maybe locally would be enough to see it in my local environment
[10:11:11] I'm seeing already that locally it is in debug mode automatically. I can see the whole output there
[11:19:44] Hi again
[11:19:56] Yesterday I mentioned something about a maintenance job failing some days ago
[11:21:32] Now that I'm investigating an issue on production (where this job could be involved), I'm seeing that the job has failed the same way several times since then and two of them today
[11:21:50] the reason is Warning: EtcdConfig failed to fetch data: (curl error: 28) Timeout was reached in /srv/mediawiki/php-1.46.0-wmf.21/includes/Config/EtcdConfig.php on line 180
[11:22:37] should we be worried about this? Can it affect what the job is doing? it's fetching/caching some data that TestKitchen extension needs
[11:22:45] It's also true that the rest of the time it's running fine
[11:23:29] but according to logstash the job started failing some days ago; I would say it never failed before
[11:26:49] (i'm not an SRE but) T346971 seems similar
[11:27:19] https://phabricator.wikimedia.org/T346971 (i forgot this channel doesn't have stashbot)
[11:27:44] from https://phabricator.wikimedia.org/maniphest/query/xv2ahEu0h0Km/#R it looks like that error has been reported for other maintenance scripts before as well
[11:28:04] I see. Pretty old ticket but it's still happening
[11:28:35] in some cases it's just a warning and the job seems to be running fine (at least the ones I have mentioned above)
[11:28:41] Thanks for that context
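
Editor's sketch, on the earlier question of why only the $this->output messages showed up: a logger with a level threshold drops debug messages before they reach any handler, while direct output is never filtered. The snippet below is a generic Python analogy and not MediaWiki's code; the names run_job, lines_at and "sketch.job" are hypothetical, and it writes both kinds of message to a single stream for simplicity, whereas in the conversation above the logger messages go to logstash rather than to the kubectl logs output.

```python
import io
import logging


def run_job(stream, level):
    """Hypothetical job: one direct write plus two logger calls."""
    logger = logging.getLogger("sketch.job")
    logger.handlers.clear()          # drop handlers left over from earlier runs
    logger.propagate = False         # keep messages out of the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    logger.setLevel(level)           # the threshold that decides what survives

    stream.write("fetching configs\n")   # direct output: never filtered
    logger.debug("cache key computed")   # emitted only when level <= DEBUG
    logger.warning("fetch was slow")     # emitted when level <= WARNING
    return stream.getvalue()


def lines_at(level):
    """Run the job against a fresh in-memory stream and return its lines."""
    return run_job(io.StringIO(), level).splitlines()


if __name__ == "__main__":
    print(lines_at(logging.WARNING))
    print(lines_at(logging.DEBUG))
```

With the threshold at WARNING the debug line is missing; at DEBUG all three lines appear, which would be consistent with the local environment showing the whole output.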