[09:54:09] serviceops, MediaWiki-Special-pages, ServiceOps new, Wikimedia-production-error: MediaWiki periodic job update-special-pages-s2 failed - https://phabricator.wikimedia.org/T416440#11582344 (A_smart_kitten) Hello serviceops! Please could an SRE grab the stack trace & error time when you have som...
[12:41:42] serviceops, SRE, Datacenter-Switchover: Investigate failed maintenance jobs discovered during DC switchback - https://phabricator.wikimedia.org/T335409#11582959 (Blake) Open→Invalid Given that the maintenance servers are going to be decommissioned (https://wikitech.wikimedia.org/wiki/Main...
[15:31:34] hmm, it's a bit weird. I ran puppet agent on deploy2002 and it correctly removed the script:
[15:31:37] https://www.irccloud.com/pastebin/kIoJOUht/
[15:31:59] but deploy only shows something unrelated
[15:32:02] https://www.irccloud.com/pastebin/hy2kkqLS/
[15:41:04] swfrench-wmf: for when you have time ^ sorry for this :(
[15:42:49] o/
[15:43:01] Amir1: ah, interesting ... lemme take a look
[15:43:12] Thank you <3
[15:45:47] Amir1: heh, I think this scap deployment that was happening concurrently actually picked up and deployed your change: https://sal.toolforge.org/log/U6JDKZwBvg159pQrRQqG
[15:46:09] \o/ Less work for me :D
[15:46:23] * swfrench-wmf thumbs up
[15:46:30] I'll check the scap logs to confirm
[15:51:29] yup, that's exactly what happened: https://logstash.wikimedia.org/app/discover#/doc/0fade920-6712-11eb-8327-370b46f9e7a5/ecs-default-1-1.11.0-7-2026.06?id=dpRAKZwBFM-LSVmLxYGz
[15:52:13] (expand the message, look for `mw-cron, updatequerypages-deadendpages-s4, CronJob (batch) has been removed:`)
[16:01:04] Thanks for checking <3
[16:40:00] serviceops: Investigate outgoing discarded packets in the codfw kubernetes cluster - https://phabricator.wikimedia.org/T226237#11584213 (akosiaris) Stalled→Declined I'll just close this as declined. It's close to 6 years, I won't have the time to work on this.
[17:19:44] serviceops, MediaWiki-Special-pages, ServiceOps new, Wikimedia-production-error: MediaWiki periodic job update-special-pages-s2 failed - https://phabricator.wikimedia.org/T416440#11584313 (Blake) a:Blake Hello! It looks like the MariaDB server was briefly read-only for some maintenance during...
[17:34:15] serviceops, MediaWiki-Special-pages, ServiceOps new, Wikimedia-production-error: MediaWiki periodic job update-special-pages-s2 failed - https://phabricator.wikimedia.org/T416440#11584370 (A_smart_kitten) Thanks for fetching that! From my perspective this task can probably now be closed (although...
[18:05:06] Hi all! Would someone here be able to be around for a service deployment I am planning to do in about 2 hours? I started the deployment a few hours ago, but testing on staging took longer than expected. It's all looking good, so the deployment should be trivial, but I'd need someone around to rescue me in an emergency...
[18:06:10] swfrench-wmf: thanks for reviewing the patches for redioscope, btw! I'm hoping to deploy these tomorrow. Today is about JWT for the rest gateway. Hugh reviewed earlier. Would you be around in a couple of hours?
[18:11:14] duesen: no problem re: redioscope! I'm afraid I need to run out for an errand around 20:00 UTC, so I'm not quite sure at this time whether I'll be around. I might also not be the best resource for troubleshooting rest-gateway issues =/ (if that's what you mean)
[18:11:57] ... but if it's more just first-principles awareness of how to undo a deployment that goes sideways, I can definitely help with that :)
[18:12:09] swfrench-wmf: yea, I just need someone with root to do a quick revert if needed. I can do the rest.
[18:12:24] * swfrench-wmf thumbs up
[18:12:43] I can also do it tomorrow, but I don't want to let production be out of sync with the master branch for a full day...
[18:12:51] duesen: I'll be around at that time
[18:13:33] cdanis: ah, excellent! I'll ping you then.
[18:13:45] duesen: agreed, yeah - it's non-ideal to leave the diffs lingering. feel free to boop me here when you're ready to go, as it's still TBD when I need to run
[18:13:51] or Chris :)
[18:13:54] thanks, Chris!
[18:18:11] swfrench-wmf, cdanis: actually, turns out I'm free for half an hour right now, if that works for either of you
[18:18:19] sure
[18:18:25] * swfrench-wmf thumbs up
[18:18:37] we're not using the infra window today, so all yours
[18:20:55] ok, applying to eqiad
[18:24:01] ...watching metrics on https://grafana.wikimedia.org/goto/G7MaEyHvR?orgId=1
[18:25:09] metrics are normalizing.
[18:26:08] how do I send a curl request from the deployment host to the rest gateway? for staging I was using https://staging.svc.eqiad.wmnet:4113. What's the equivalent for production?
[18:26:33] rest-gateway.svc.eqiad.wmnet:4113 should work
[18:27:43] Yes, works. Thank you!
[18:27:44] you might need a `-k` on your curl if you were not doing so already, depending on how you're calling it (I forget exactly what we put in the SAN list)
[18:27:47] awesome
[18:29:15] ok, looking good.
[18:29:22] applying to codfw
[18:29:32] nice!
[18:36:02] ok looking good as well!
[18:41:13] swfrench-wmf: i'm calling this done, thank you for your support :)
[18:41:23] thanks for doing that hard part! :)
[22:47:38] I just merged matthieulec's config changes for wikibugs (T415352). The bot is still idling here until the next time its bouncer is restarted, but task reports are now going into the new #wikimedia-serviceops-feed channel. Join there for all your bot spam needs.
[22:48:28] thanks bd808 :)