[07:35:42] Trey314159, pfischer, mind taking a look at https://docs.google.com/presentation/d/1acRg1twb-x3iLVtbjoFBJcC-wg9BY9hZcXAU6a86p1A/edit#slide=id.g2e747bf5a7e_0_23 (slide 15 of the upcoming DP staff meeting)
[09:25:17] getting different numbers for the query_cache metric between prom & graphite... 176(miss)/349(hit) for prom vs 146/292 for graphite...
[10:02:02] lunch
[12:56:57] inflatador: updated https://grafana-rw.wikimedia.org/d/dc04b9f2-b8d5-4ab6-9482-5d9a75728951/elasticsearch-percentiles-wip-prom-metrics to remove any remaining graphite usage, simplified the variables as well
[13:10:13] dcausse excellent, thanks! I'll update the ticket and start looking at the remaining dashboards.
[13:18:33] firefox is doing this "fun" new thing where it forgets how to use DNS and has to be restarted
[13:21:33] o/
[13:31:31] dcausse re: dashboards, any idea where we might get the doc size metric? I vaguely remember a mtg where we discussed whether or not we need this. https://grafana.wikimedia.org/goto/z_C4PMZHg?orgId=1
[13:32:21] inflatador: sure, Peter added some info at T376189#10247214
[13:32:22] T376189: Replace the "CirrusSearch.$cluster.updates.all.doc_size" with a new metric that works with SUP - https://phabricator.wikimedia.org/T376189
[13:33:27] imo it's not necessary to pull this metric into the dashboard you're migrating; if we're interested in this data we can just look at it there (on the cirrus-streaming-updater dashboard)
[13:34:03] ACK, will remove it then
[13:41:33] guessing https://grafana.wikimedia.org/goto/gupQPMZHR?orgId=1 is another panel we don't need?
[13:47:59] inflatador: this link is for the whole dashboard
[13:52:17] dcausse damn! The panel I meant to link is "CirrusSearch updates - most active wikis"
[13:53:15] inflatador: I think we could migrate this one, but noting that it's for the "archive" index
[13:53:46] the SUP does not support the archive index, so it still relies on Cirrus to send updates
[13:54:08] that's what we see there (+weighted_tags, but those should go away soon)
[13:54:56] ACK, what is the replacement metric in Prometheus? Or do we still need to create one? I'm not finding anything in explorer
[13:56:27] inflatador: should be mediawiki_CirrusSearch_update_total
[14:02:33] hmm, I'm not finding it
[14:03:09] ah, nm, looks like it's in the k8s instance
[14:03:39] hmm, or maybe not
[14:03:57] yep, it's there
[14:19:32] dcausse I'm not seeing a label for wikis in mediawiki_CirrusSearch_update_total? https://paste.opendev.org/show/825915/ is what I'm getting
[14:21:00] inflatador: sigh... fixing
[16:49:09] heading out, have a nice week-end
[20:05:57] .o/
[20:07:35] do we need to care about these `mediawiki_job_cirrus_build_completion_indices_codfw.service on mwmaint2002:9100` errors?
[20:07:42] or alerts, I should say
[20:12:30] inflatador: yes, I think we should take a look
[20:16:18] dcausse ACK, checking now
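A minimal sketch of how the failing units could be inspected on mwmaint2002, assuming standard systemd tooling is available there; the unit name is taken from the alert mentioned at 20:07 and the exact pattern may differ:

    # list any failed cirrus completion-index build units (unit name pattern assumed)
    sudo systemctl list-units --state=failed 'mediawiki_job_cirrus_build_completion_indices*'
    # show recent output from the codfw unit named in the alert
    sudo journalctl -u mediawiki_job_cirrus_build_completion_indices_codfw.service -n 100 --no-pager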
[20:18:02] ouch: zhwiki Fatal error: Out of memory (allocated 233308160) (tried to allocate 10502144 bytes) in /srv/mediawiki/php-1.43.0-wmf.28/vendor/ruflin/elastica/src/Transport/Http.php on line 162
[20:19:29] interesting, where did you find that?
[20:20:04] ah nm, found it
[20:20:40] * inflatador should probably configure that to log to the systemd journal as well
[20:22:18] not sure what's happening... probably related to the data on zhwiki, nothing we can fix before the week-end anyway... :(
[20:23:06] dcausse yeah, that was my next question. I could probably add some swap to make it go through, but that would be frowned upon unless it's an emergency
[20:25:30] inflatador: I'd suggest waiting; I see warwiki failing with an OOM too, so let's see what we can do on Monday. It only affects completion searches, in that new pages won't be findable in completion until the indices rebuild
[20:26:56] dcausse ACK, I started T378227 for Monday
[20:26:57] T378227: Investigate failed Cirrus index build services on mwmaint2002 (WIP) - https://phabricator.wikimedia.org/T378227
[20:27:05] thanks!
[20:27:47] np, see you Monday. I'm going to the kids' Halloween carnival now ;)
[20:27:58] have fun! have a nice week-end!
[20:28:13] you too
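If the per-wiki label dcausse was fixing at 14:21 does land on the counter, the migrated "most active wikis" panel could be sketched with a query along these lines; the `wiki` label name and the 5m rate window are assumptions, not anything confirmed in the log:

    # hypothetical PromQL for the "most active wikis" panel, assuming mediawiki_CirrusSearch_update_total
    # gains a `wiki` label once the fix mentioned at 14:21 is deployed
    topk(10, sum by (wiki) (rate(mediawiki_CirrusSearch_update_total[5m])))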