[09:29:43] I'm not 100% sure this is what we ended up preferring, but let's try it out
[09:29:57] I just reset the GDoc for the Monday meeting
[09:30:16] and will not reset it until we next meet
[09:30:23] ack
[09:30:37] so folks can add stuff to it this week, to be discussed in person next
[09:31:31] I thought we said that it should be reset after the meeting so that we can add stuff in the 2-week period which leads to the next meeting, but I'm not 100% sure either :)
[09:49:39] linux networking docs are being converted to ReST: https://git.linuxtv.org/mchehab/experimental.git/log/?h=net-docs
[09:49:51] html output here: https://www.infradead.org/~mchehab/kernel_docs/networking/
[09:49:55] volans: yeah I think so too
[09:50:45] (was just off last week)
[11:34:45] What is our current preference for jobs to be run at low frequency (once a day or less) between a crontab and a systemd timer?
[11:34:51] if we have one :)
[11:36:24] given that there is an ongoing task to convert all the maintenance crons to timers i guess it is systemd timer
[11:48:20] https://trstringer.com/systemd-timer-vs-cronjob/ is a nice concise writeup on the merits of systemd timers over cron
[11:50:35] yeah I like them more, was just wondering if it's ok for any kind of frequency, like also once a month
[11:50:59] <_joe_> volans: in our case we also get monitoring of the job for free
[11:51:02] yeah, I think they are useful for any frequency
[11:51:07] <_joe_> I strongly advise to never use cron
[11:51:16] <_joe_> esp given how it's implemented in puppet
[11:51:55] ehehe
[11:52:01] <_joe_> now, we could also write a resource provider for cron in puppet that creates a systemd timer :)
[11:52:18] although we still miss the splay for systemd timers right?
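(Editor's note: the per-cluster splay raised at this point in the log means giving every host in a cluster its own fixed start time for a daily job. The block below is a minimal sketch of that idea, not the actual Puppet function used here. The function name `daily_splay`, the `seed` argument, and the host names are all hypothetical; only the `*-*-* HH:MM:00` calendar format is taken from the conversation.)

```python
import hashlib


def daily_splay(hosts, seed):
    """Map each host to a distinct daily start time in systemd calendar format.

    Hypothetical helper: hosts are ordered by a hash of seed + hostname, so the
    order is stable across runs but differs per job, then spread evenly over
    the 1440 minutes of a day. Times are distinct for up to 1440 hosts.
    """
    if not hosts:
        return {}
    ordered = sorted(
        hosts,
        key=lambda h: hashlib.sha256(f"{seed}:{h}".encode()).hexdigest(),
    )
    step = (24 * 60) / len(ordered)  # minutes between consecutive hosts
    schedule = {}
    for index, host in enumerate(ordered):
        hour, minute = divmod(int(index * step), 60)
        schedule[host] = "*-*-* %02d:%02d:00" % (hour, minute)
    return schedule


if __name__ == "__main__":
    # Example host names are made up for illustration.
    for host, when in daily_splay(["app1", "app2", "app3"], "nightly-job").items():
        print(host, when)
```

(Unlike a random delay, this assignment is deterministic, so a given host always runs at the same time, and no two hosts in the list share a slot.)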
[11:55:33] volans: I see we are doing things like this, with 'hour' and 'minute' coming from cron_splay:
[11:55:36] $systemd_timer_interval = sprintf('*-*-* %02d:%02d:00',$times['hour'], $times['minute'])
[11:57:50] systemd timer support splays via RandomizedDelaySec
[11:58:00] but only in systemd in stretch and later
[11:59:04] <_joe_> splay is one thing, cron_splay is another
[11:59:21] <_joe_> that's our function to guarantee servers in a cluster will run things at different times
[12:05:33] which of the two splays do we miss then? :)
[12:12:17] the latter :)
[12:58:33] Could someone create a repo in gerrit for me please? 'gerrit create-project -d "Package helm3" -n operations/debs/helm3 -o ldap/ops -p operations/debs'
[13:08:11] <_joe_> jayme: so you decided for the two separate repos after all? makes sense
[13:08:48] _joe_: yes. I think it's way more clear that way
[13:08:58] <_joe_> +1
[13:11:10] <_joe_> jayme: created :)
[13:11:11] plus: we don't have enough helm* packages currently :-)
[13:11:22] <_joe_> jayme: right :D
[13:11:36] _joe_: thanks <3
[13:24:46] my understanding was that we should follow the create repo procedure anyway, also for tracking purposes.
[13:30:10] volans: the question here was if we maintain both version trees of helm (2.x and 3.x) within the same gerrit repo or not
[13:30:58] jayme: that's totally up to you, I'm merely talking about the gerrit creation operation ;)
[13:31:52] Ah, I might have got you wrong then. What's the "create repo procedure"? :)
[13:33:13] that's ok, was mostly for j.oe, not for you :) https://wikitech.wikimedia.org/wiki/Gerrit#Creating_new_repositories fwiw
[13:38:17] <_joe_> volans: my understanding is I never cared about that for things like operations/debs
[13:39:16] <_joe_> volans: what did I do wrong btw?
[13:39:24] would be nice to have it documented to have a shared understanding ;)
[13:39:27] <_joe_> volans: what you linked tells you
[13:39:43] <_joe_> that I, being a gerrit admin, can just follow https://www.mediawiki.org/wiki/Git/Creating_new_repositories
[13:40:35] <_joe_> so, what exactly did I do wrong?
[13:40:50] <_joe_> you should check things before nitpicking ;)
[13:41:18] I checked, and my understanding is still that the table at https://www.mediawiki.org/wiki/Gerrit/New_repositories/Requests was used for tracking purposes of repo allocation
[13:41:50] and the part saying that if you are an admin who has to create a repo for someone else, use those commands
[13:41:57] is just for that
[13:42:20] <_joe_> no you did not
[13:42:25] <_joe_> your link says explicitly
[13:42:41] <_joe_> "See mw:Git/Creating new repositories. Those without administrative access to Gerrit should use mw:Gerrit/New repositories instead. "
[13:43:29] <_joe_> so yes, technically jayme should've followed that procedure and waited days
[13:43:57] <_joe_> let's try to keep the bureaucrazy to a manageable level, thanks
[13:44:34] +1
[13:46:07] <_joe_> the only rookie mistake jayme made was asking for it publicly, so now I had to justify myself with you :P
[13:46:33] of course :D
[13:47:26] anyway, my probably too subtle underlying hinting was that the procedure is "overcomplicated" in our normal use case
[13:48:12] too busy for a longer discussion right now, but just wanted to drop a comment to say that I hear that
[13:48:16] and agree
[13:48:59] something that we need to work on -- I'll discuss that with my peers at the next chance
[13:49:25] in the meantime feel free to apply some sort of common sense
[15:34:51] _joe_, akosiaris: do you know if there is an "upgrade to helm 3" phab task? Could not find one but I'm not sure about my phab skills, though :)
[15:35:25] <_joe_> jayme: well what better chance to learn something new :D
[15:36:30] ahh, the joys of phab search
[15:37:04] jayme: the first thing to know about phab search is when searching for tasks, don't use the box in the upper right, instead go to https://phabricator.wikimedia.org/maniphest/query/advanced/
[15:37:12] then you have possibly half a chance
[15:37:25] jayme: use google with site:phabricator.w.o ;) in the future search on gmail is also a good option ;)
[15:38:01] :-o
[15:40:08] I'm impressed there even is a difference between "advanced search" and ... "andvanced search"
[15:43:53] that is wrong- it means you don't have it configured correctly on your profile
[15:46:22] absolutely possible as I did not configure anything by now
[15:46:27] make sure you default to tasks and state:open
[15:46:36] and I always get what I need as the first result
[15:47:14] you can create your own profile and then pin it to the home page
[15:47:28] at https://phabricator.wikimedia.org/search/query/edit/
[15:48:02] it works every time, 75% of the times
[15:48:23] > Your ticket #18622981 has been successfully created. about the level3 link down
[15:49:09] thanks!
[15:50:09] jynus: thank you!
[15:50:45] the other part is to create tasks and rename them with things that are easy to search
[15:51:05] like "upgrade to helm three"
[15:51:06] :D
[15:51:36] what was the ticket to reimage databases to buster? Search: databases+buster
[15:51:43] pum, first entry
[15:52:46] what was the ticket with db1114 issues? Search:db1114 + tickets closed
[15:53:01] 3 top results are hw issues
[15:53:05] of that server
[15:59:30] I see, thanks jynus. But I still have to select the saved query every time (in the upper right box) - or am I missing something?
[16:01:18] nope, you can mark it as default
[16:01:26] on that screen
[16:01:47] that I did...
[16:02:01] is it pinned green on that screen?
[16:02:04] yep
[16:02:08] maybe it just needs a page reload?
[16:02:17] i had to set it twice before it worked, for some reason
[16:02:35] something something cache
[16:02:40] :-D
[16:03:26] reload does not work. Trying the kormat way :)
[16:04:27] this is my setup: https://phab.wmfusercontent.org/file/data/ozuzflkk2pc2ft5264dw/PHID-FILE-xf4kfz5vpglulpnckkpb/Screenshot_20200428_180349.png
[16:04:52] it goes to "Open tasks" for me by default
[16:05:03] on every phab link
[16:08:17] Ah. The trick is to click the pin and then select the saved query in the drop-down *and* search for something. That persists the selection
[16:12:46] it's So Obvious now. :)
[16:13:05] isn't it :)
[16:14:05] _joe_: Are you sure there is one (binary answer welcome)? Or do you just want me to try harder? :D
[16:16:28] jayme: imo just create one, can always dupe it into the old one if it is found
[16:18:37] would like to figure it out anyways :)
[16:21:35] o/
[16:27:11] <_joe_> sorry I wasn't following
[16:27:29] <_joe_> oh the task, no please create it
[19:31:04] there's more appserver latency today
[19:31:23] began just before 16:00, got bad around 18:30
[19:31:51] some kind of odd traffic patterns against the external store sections
[19:48:55] something common to MW and recommendation_api seems slow
[19:49:21] i believe recommendation api calls into the mediawiki api?
[19:50:39] it's a part of restbase; seems likely
[20:08:20] afaict, recommendation api appears to be the victim here
[20:38:45] * addshore reads up
[20:39:25] ooh, there is an odd pattern against the external stores?
[20:45:36] cdanis: ^^
[20:47:03] I think it might be a red herring, but there's something trying to write in small bursts to es2/es3: https://grafana.wikimedia.org/d/000000278/mysql-aggregated?orgId=1&var-dc=eqiad%20prometheus%2Fops&var-group=core&var-shard=es1&var-shard=es2&var-shard=es3&var-shard=es4&var-role=All
[20:49:53] I think all of these current slownesses are caused by s8, and particularly db1111
[20:49:58] https://grafana.wikimedia.org/d/XyoE_N_Wz/wikidata-database-cpu-saturation?panelId=21&fullscreen&orgId=1&from=now-2d&to=now
[20:50:30] it's hard to say more than that right now, i wonder if we should generally have a tracking ticket for this
[20:51:23] hm, I think you are right
[20:51:43] https://grafana.wikimedia.org/d/RIA1lzDZk/application-servers-red-dashboard?panelId=9&fullscreen&orgId=1&from=now-30d&to=now you can see some other spikes from much earlier in the month
[20:51:56] less spikes and more blips
[20:53:08] it looks to me like db1114 shouldn't be depooled. db1104 is running an alter table and is behind about a day
[20:53:15] indeed
[20:53:58] when looking at https://phabricator.wikimedia.org/T232446#6083381 it looks like that might have been a mistake and might be causing the increased load for the last 2 days
[20:54:30] since the s8 load issues ~ 1 month ago, s8 got another replica, but now it's down 2
[20:54:52] Are the effects enough to warrant waking a dba?
[20:55:12] IMO probably not, as everything survived 24 hours ago just fine
[20:55:49] and comparing the increased pressure last night vs tonight, we are almost through it anyway
[20:55:59] I am going to increase weight some on db1101 and db1099, they look to have plenty of CPU headroom
[20:57:04] cdanis: ack, sounds good
[20:58:08] looks good to me, I don't know if that will end up shifting the load around that much as I don't fully understand how the "groups" end up playing into this
[21:04:24] all looks good to me and on the way down, so I'm probably going to head to bed! :D
[21:04:34] cheers addshore