[07:30:39] <_joe_> Krinkle: weren't you in the email thread Lars started? updates on that topic are there
[07:32:17] there is an alert for dumpsdata1003 for disk usage of /data
[07:32:18] https://grafana.wikimedia.org/d/000000377/host-overview?viewPanel=12&orgId=1&var-server=dumpsdata1003&var-datasource=eqiad%20prometheus%2Fops&refresh=5m&from=now-7d&to=now
[07:32:22] <_joe_> and specifically, yes, cronjobs will not be useful, and yes, we will stop revalidating opcache as soon as we begin doing restarts
[07:32:33] but it seems nothing really happened during the past hours
[07:32:39] what should we do?
[07:32:45] <_joe_> apergos: do you know something about that server?
[07:33:06] yes, I'll take care of it
[07:33:23] apergos: thank youuuuu
[07:33:54] I'm mostly not here because of migraines again today, but I can do this much
[07:34:42] <_joe_> :((
[07:35:10] <_joe_> if you tell us what to look for, we can take care of it
[07:39:02] yes exactly, please don't work if you don't feel good :(
[07:40:56] well first off it won't explode in a day or a week
[07:41:36] if I don't want to reduce the number of dumps kept on the internal nfs shares, and I'd prefer not to at this point,
[07:42:03] then the best thing to do is to make sure I clean up the remainder of the bz2 page content files from the oldest dumps we keep
[07:42:24] I'll be sorting that out now, or if you think you can live with the alert until tomorrow, then tomorrow
[07:43:35] apergos: yes please, it is fine tomorrow :)
[07:50:50] ok great, because honestly I can't think clearly enough to do this right. I should probably just change the settings on the one host to keep one less round of dumps, but I will be able to reason it out better tomorrow when I have a brain
[07:51:35] if you open a task for me in the dumps-generation project I'll see it tomorrow when I do my daily scan of stuff (in case this falls out of my head today)
[07:51:46] thanks for the heads up
[07:59:41] oh, note that the reason it's only that server is that it is the fallback server for both misc and xml/sql dumps, so it gets copies of them from both the other nfs servers
[07:59:47] gone now
[13:28:10] elukey, volans, moritzm: I'm happy to review and make suggestions based on additional flake8 rules and clean up stuff if you want, but let me know how crazy you want me to go, because I can go very crazy.
[13:28:21] lol
[13:29:13] RhinosF1: rewrite it in rust
[13:31:28] kormat: you haven't seen what I've done to code before. I'm very good at refactoring python in a way that makes lines disappear.
[13:31:36] Majavah: I'm still a python dev!
[13:32:02] RhinosF1: "My favourite linter is rm!"
[13:32:33] RhinosF1: import clean_code
[13:33:33] kormat: hehe, it's what happens as I've tried to learn good habits from early on and pushed myself to learn appropriate styles.
[13:33:57] kormat: my fave is /bin/false
[13:34:53] RhinosF1: can I point your attention to https://gerrit.wikimedia.org/r/plugins/gitiles/labs/tools/stewardbots/+/refs/heads/master/StewardBot/StewardBot.py
[13:36:34] RhinosF1: 👍
[13:36:42] Majavah: stop using else: return
[13:37:23] And variables called x and i!
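[Editor's note: a hedged illustration of the "else: return" pattern criticized above. This is a made-up function, not code from StewardBot.py; the point is that the else is redundant because the if branch already leaves the function via return.]

    # Redundant pattern: the "else" only adds nesting, since the
    # "if" branch already exits the function with "return".
    def handle(command):
        if command is None:
            return "no command given"
        else:
            return "running " + command

    # Flatter equivalent with identical behavior: one indentation
    # level and one keyword fewer.
    def handle(command):
        if command is None:
            return "no command given"
        return "running " + command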
[13:37:42] RhinosF1: that's not my project :P
[13:38:19] Majavah: I got to about line 60 before finding that
[13:38:25] Heh
[13:38:48] I won't go crazy without the code owners being happy
[13:50:34] moritzm: puppet is failing with:
[13:50:36] > Could not evaluate: Could not retrieve information from environment production source(s) puppet://acmechief1001.eqiad.wmnet/acmedata/orchestrator
[13:50:45] do I need to run puppet on acmechief1001 first?
[13:53:40] looks like yes
[13:56:52] yes, once the cert is registered there, it should work
[14:14:00] <_joe_> I love how none of us uses the indicative when talking about puppet stuff
[14:14:33] <_joe_> "should / would / could"
[14:16:30] it works every time, 75% of times
[14:19:59] moritzm: thanks for your help! https://orchestrator.wikimedia.org/ is live.
[14:20:08] \o/
[14:20:08] jbond42: and thanks for your pointers last quarter on where to look for this
[14:20:29] <_joe_> kormat: can I play with databases?
[14:20:58] <_joe_> drag to move replicas
[14:21:01] <_joe_> seems easy enough
[14:21:03] _joe_: it's now a multiplayer game, why not
[14:21:06] <_joe_> lemme try
[14:21:45] also double-checked with my non-priv test user, which correctly gets rejected with "Service access denied due to missing privileges"
[14:22:09] moritzm: ah, useful, thanks :)
[14:23:17] <_joe_> kormat: jokes aside, that's pretty neat
[14:23:23] _joe_: thanks :)
[14:23:29] <_joe_> now, how can we feed this info directly back to mediawiki?
[14:23:40] marostegui helped (mostly by not getting in the way _too_ much)
[14:24:01] <_joe_> kormat: that looks like a huge effort on his part
[14:24:07] _joe_: aiui we just need to get orchestrator to call restbase, and restbase to call orchestrator, and it'll be as good as any other service
[14:24:10] <_joe_> https://orchestrator.wikimedia.org/web/keep-calm 🤦
[14:24:30] <_joe_> kormat: wait, you mean this doesn't pass through restbase?
[14:24:45] not currently, which is why it's obviously not production-ready
[14:24:56] <_joe_> definitely
[14:39:10] np kormat :)
[14:41:00] _joe_, I believe you are mistaken; to be a native mw service, it has to go through a SPOF, undocumented web proxy
[14:41:32] but please be easy on kormat, it is only his first attempt
[14:41:36] *her
[14:41:37] sorry
[14:44:37] <_joe_> that's why I was trying to make it simple
[14:44:45] <_joe_> of course we also need a mediawiki extension
[15:09:12] _joe_: ok, that matches my expectation as well (re: opcache). I guess there's nothing else to do before then, thanks!
[16:25:40] how often does ferm resolve addresses?
[16:25:43] every puppet run?
[16:33:31] no, just on service start/reload (and the latter gets triggered automatically if e.g. some rule or central definition changes (bastions or so))
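[Editor's note: a minimal ferm sketch of what "resolve" means in that last exchange, using hypothetical hostnames. Names wrapped in ferm's @resolve() are looked up in DNS when ferm parses its configuration, i.e. at service start/reload, so a changed DNS record only takes effect after the next reload.]

    # Hypothetical ferm.conf fragment. The hostnames below are resolved
    # once, when ferm loads this file -- not per packet, and not on
    # every puppet run (puppet only triggers a reload when rules change).
    @def $BASTIONS = @resolve((bastion1.example.org bastion2.example.org));

    domain ip chain INPUT {
        # Allow ssh from the addresses the bastion names resolved to.
        saddr $BASTIONS proto tcp dport ssh ACCEPT;
    }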