[00:17:20] FIRING: [2x] PuppetFailure: Puppet has failed on ms-be1056:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[01:21:56] FIRING: [2x] SystemdUnitFailed: systemd-timedated.service on ms-be1075:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[04:17:20] FIRING: [2x] PuppetFailure: Puppet has failed on ms-be1056:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[05:21:56] FIRING: [2x] SystemdUnitFailed: systemd-timedated.service on ms-be1075:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[07:45:16] <_joe_> is anyone going to review elukey's change?
[08:01:56] FIRING: [4x] SystemdUnitFailed: systemd-timedated.service on ms-be1075:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[08:07:54] internet down at home
[08:13:31] back on 5G, maintenance is going on on my ISP's network
[08:17:20] FIRING: [2x] PuppetFailure: Puppet has failed on ms-be1056:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[09:37:47] > dump wrong_size 8 hours ago 70.2 GB -6.9 % The previous backup had a size of 75.4 GB, a change larger than 5.0%.
[09:38:06] That's migration to ES for labswiki
[09:38:10] ack
[09:38:29] we should optimize text table there but after the MCR work is done in that section
[11:10:59] hello, I have an interesting one for you all from the wonderful world of dumps
[11:11:09] Got another one of these: `getting/checking text tt:3823741 failed (Received text is unplausible for id tt:3823741) for revision 3823741` like we do a bunch of times a month
[11:11:41] and so I checked like this (now with k8s!): `mwscript-k8s -- maintenance/findBadBlobs.php --wiki=enwiki --revisions 3823741`
[11:11:56] and I was told by that: `Found 0 bad revisions.` which is new
[11:12:49] and figured I should tell you because Amir's been telling me about the stuff happening with content table, text table, and all that. We're seeing a ton of these this month, and they're all retrying 5 times
[11:13:14] hi, we haven't started the migration on enwiki yet
[11:13:27] so the retroactive migration is not the reason
[11:13:35] ok, interesting
[11:14:18] what does "unplausible" even mean here
[11:15:33] oh yeah, didn't notice but the other errors just say "generic error" (https://phabricator.wikimedia.org/T315902)
[11:16:06] oh no, in this comment there's evidence of both: https://phabricator.wikimedia.org/T315902#8185439
[11:17:02] https://gerrit.wikimedia.org/g/mediawiki/core/+/c74cab847c7e4b675d24d0822c8b76a3a978bde5/maintenance/includes/TextPassDumper.php#595
[11:17:03] I love that there's a single line we use that word in all of mw: https://gerrit.wikimedia.org/g/mediawiki/core/+/c74cab847c7e4b675d24d0822c8b76a3a978bde5/maintenance/includes/TextPassDumper.php#690
[11:17:19] it's actually not a real word :D
[11:18:04] it's French for probable
[11:18:16] they seem to be bad blobs
[11:18:30] oh TIL it's also usable in English
[11:18:37] so many transparent words
[11:19:13] plausible yes, but the antonym is implausible not unplausible I think
[11:19:34] https://fr.wiktionary.org/wiki/implausible it's used
[11:19:42] will let you work, but please give me a heads up if you want me to keep some backups for longer
[11:20:11] plausibility is not even highlighted by my syntax hl, so it must be an uncommon usage
[11:21:31] you say this revision has a bad blob? 3823741 on enwiki?
[11:21:41] findBadBlobs.php can lie?!
[11:24:28] I guessed based on the ticket. But it is reachable without issues https://en.wikipedia.org/w/index.php?oldid=3823741
[11:31:32] milimetric: enwiki dumps has been lagging when dumps happen because of the dumping process, maybe related to that?
[11:32:17] or it could just be outdated logic, sorry, I just read your last message
[11:38:03] ok, yeah, clearly something's up where dumps thinks all of these are wrong, so there's more of them than usual, and so it's causing extra load while it retries them 5 times (and also of course broken dumps because they don't make it in the output probably)
[11:40:31] I can try to take a look but I'm busy with other things right now, once I free up some time, I can take a look unless someone beats me to it and figure out what's going on
[11:48:00] nono Amir, that's our job, I just got made manager and I'm trying to figure out how this delegate button works :P
[11:48:00] https://phabricator.wikimedia.org/T368098#10210281
[11:48:37] Thank you <3
[11:48:38] Amir would benefit from knowing how to delegate more 0:-)
[11:52:50] I'm trying :D
[12:01:56] FIRING: [2x] SystemdUnitFailed: systemd-timedated.service on ms-be1075:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[12:17:20] FIRING: [2x] PuppetFailure: Puppet has failed on ms-be1056:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[12:59:11] trying a second time - anybody up for a Swift infra review? :) https://gerrit.wikimedia.org/r/c/operations/puppet/+/1078380
[13:26:18] thanks!
[13:28:59] fwiw :D
[15:57:13] is there anyone around that can sanity check https://gerrit.wikimedia.org/r/c/operations/puppet/+/1078706 for me?
[16:01:56] FIRING: [2x] SystemdUnitFailed: systemd-timedated.service on ms-be1075:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[16:17:20] FIRING: [2x] PuppetFailure: Puppet has failed on ms-be1056:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[20:01:56] FIRING: [2x] SystemdUnitFailed: systemd-timedated.service on ms-be1075:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[20:17:20] FIRING: [2x] PuppetFailure: Puppet has failed on ms-be1056:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[21:16:56] FIRING: [2x] SystemdUnitFailed: systemd-timedated.service on ms-be1075:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[21:51:56] FIRING: [2x] SystemdUnitFailed: systemd-timedated.service on ms-be1075:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed