[01:20:12] <jinxer-wm>	 (SystemdUnitFailed) firing: (10) monitor_refine_event_test.service Failed on an-test-coord1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[05:20:12] <jinxer-wm>	 (SystemdUnitFailed) firing: (10) monitor_refine_event_test.service Failed on an-test-coord1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[09:05:42] <wikibugs>	 10Data-Engineering-Planning, 10Data Pipelines (Sprint 12): Support for moving data from HDFS to public http file server - https://phabricator.wikimedia.org/T317167 (10JAllemandou) This is done :) @Htriedman you can now move your files to `hdfs:///wmf/data/published/datasets/...` and they'll be synchronized to...
[09:05:54] <wikibugs>	 10Data-Engineering-Planning, 10Data Pipelines (Sprint 12): Support for moving data from HDFS to public http file server - https://phabricator.wikimedia.org/T317167 (10JAllemandou)
[09:06:47] <btullis>	 joal: I'm going to go for a refinery deploy now, if that's OK with you. Thanks for merging my pageview allowlist patch yesterday.
[09:07:49] <btullis>	 I'm going to be on the lookout for T334493 and I'll investigate if I see it happening.
[09:07:50] <stashbot>	 T334493: anlytics/refinery deployment broken at refinery-deploy-to-hdfs - https://phabricator.wikimedia.org/T334493
[09:12:23] <btullis>	 Ah, I remember. Wednesdays. :)
[09:12:55] <btullis>	 !log deploying refinery
[09:12:56] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[09:13:05] <joal>	 Hi btullis :)
[09:13:26] <btullis>	 Hello. Sorry if I pinged when you were busy :-)
[09:13:37] <joal>	 no no all good :)
[09:16:36] <btullis>	 So you're ok for me to proceed? Everything looks ok so far.
[09:18:12] <joal>	 For sure! I don't think there is anything wildly particular for this week's deploy
[09:18:42] <joal>	 And as you were saying, the train allows you to investigate our issue - all good :)
[09:19:02] <joal>	 I'm around today if you need me btullis - kids are on holidays, so I'll be almost classical schedule :)
[09:19:05] <btullis>	 Ack, thanks. Will let you know if anything unusual pops up.
[09:20:12] <jinxer-wm>	 (SystemdUnitFailed) firing: (10) monitor_refine_event_test.service Failed on an-test-coord1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[09:21:16] <wikibugs>	 10Data-Engineering, 10DBA, 10Data-Services: Prepare and check storage layer for kbdwiktionary - https://phabricator.wikimedia.org/T333270 (10Marostegui) Database `_p` created and grants created. This is ready for views creation.
[09:22:50] <wikibugs>	 10Data-Engineering, 10DBA, 10Data-Services: Prepare and check storage layer for fatwiki - https://phabricator.wikimedia.org/T335018 (10Marostegui) 05Open→03Resolved a:03Marostegui
[09:23:41] <wikibugs>	 10Data-Engineering, 10DBA, 10Data-Services: Prepare and check storage layer for ckbwiktionary - https://phabricator.wikimedia.org/T331834 (10Marostegui) 05Open→03Resolved a:03Marostegui Just checked and all good
[09:24:26] <wikibugs>	 10Data-Engineering, 10DBA, 10Data-Services, 10cloud-services-team: Prepare and check storage layer for azwikimedia - https://phabricator.wikimedia.org/T330442 (10Marostegui) 05Open→03Resolved a:03Marostegui Just checked and all good
[09:24:38] <wikibugs>	 10Data-Engineering, 10DBA, 10Data-Services, 10cloud-services-team: Prepare and check storage layer for vewikimedia - https://phabricator.wikimedia.org/T330704 (10Marostegui) 05Open→03Resolved a:03Marostegui Just checked and all good
[09:24:50] <wikibugs>	 10Data-Engineering, 10DBA, 10Data-Services: Prepare and check storage layer for guwwikinews - https://phabricator.wikimedia.org/T334408 (10Marostegui) 05Open→03Resolved a:03Marostegui Just checked and all good
[09:25:22] <wikibugs>	 10Data-Engineering, 10DBA, 10Data-Services: Prepare and check storage layer for kcgwiktionary - https://phabricator.wikimedia.org/T334739 (10Marostegui) 05Open→03Resolved a:03Marostegui  Just checked and all good
[09:25:59] <wikibugs>	 10Data-Engineering, 10DBA, 10Data-Services: Prepare and check storage layer for kbdwiktionary - https://phabricator.wikimedia.org/T333270 (10Marostegui) 05Open→03Resolved a:03Marostegui Just checked and all good
[09:28:02] <wikibugs>	 10Data-Engineering, 10Data-Services, 10cloud-services-team: Drop several views from ptwikisource - https://phabricator.wikimedia.org/T332596 (10Kizule) Can someone sort this out? There are still related tables on cloud.  `lines=10 MariaDB [ptwikisource_p]> SHOW TABLES; +--------------------------+ | Tables_i...
[09:39:27] <jinxer-wm>	 (HiveServerHeapUsage) firing: Hive Server JVM Heap usage is above 80% on an-coord1002:10100 - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hive/Alerts#Hive_Server_Heap_Usage - https://grafana.wikimedia.org/d/000000379/hive?panelId=7&fullscreen&orgId=1&var-instance=an-coord1002:10100 - https://alerts.wikimedia.org/?q=alertname%3DHiveServerHeapUsage
[09:49:28] <jinxer-wm>	 (HiveServerHeapUsage) resolved: Hive Server JVM Heap usage is above 80% on an-coord1002:10100 - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hive/Alerts#Hive_Server_Heap_Usage - https://grafana.wikimedia.org/d/000000379/hive?panelId=7&fullscreen&orgId=1&var-instance=an-coord1002:10100 - https://alerts.wikimedia.org/?q=alertname%3DHiveServerHeapUsage
[09:56:19] <wikibugs>	 10Data-Engineering, 10DBA, 10Infrastructure-Foundations, 10Machine-Learning-Team, and 9 others: codfw row C switches upgrade - https://phabricator.wikimedia.org/T334049 (10klausman)
[09:56:42] <wikibugs>	 10Data-Engineering, 10DBA, 10Infrastructure-Foundations, 10Machine-Learning-Team, and 9 others: codfw row C switches upgrade - https://phabricator.wikimedia.org/T334049 (10klausman)
[09:57:59] <wikibugs>	 10Data-Engineering, 10DBA, 10Infrastructure-Foundations, 10Machine-Learning-Team, and 9 others: codfw row D switches upgrade - https://phabricator.wikimedia.org/T335042 (10klausman)
[09:58:34] <wikibugs>	 10Data-Engineering, 10DBA, 10Infrastructure-Foundations, 10Machine-Learning-Team, and 9 others: codfw row D switches upgrade - https://phabricator.wikimedia.org/T335042 (10klausman)
[09:59:01] <wikibugs>	 10Data-Engineering, 10DBA, 10Infrastructure-Foundations, 10Machine-Learning-Team, and 9 others: codfw row C switches upgrade - https://phabricator.wikimedia.org/T334049 (10klausman)
[10:28:31] <btullis>	 FYI, the `refinery-deploy-to-hdfs` step of the refinery deploy still isn't working. It's related to this: https://phabricator.wikimedia.org/T335354
[10:28:53] <btullis>	 I'm investigating solutions and I've added a comment to the ticket, describing how it affects us.
[10:48:05] <icinga-wm>	 PROBLEM - Check systemd state on an-launcher1002 is CRITICAL: CRITICAL - degraded: The following units failed: gobblin-webrequest.service,produce_canary_events.service,refine_netflow.service https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[10:50:12] <jinxer-wm>	 (SystemdUnitFailed) firing: (13) gobblin-webrequest.service Failed on an-launcher1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[10:55:40] <btullis>	 !log deploying refinery to hdfs
[10:55:41] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[11:00:12] <jinxer-wm>	 (SystemdUnitFailed) firing: (13) gobblin-webrequest.service Failed on an-launcher1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[11:05:12] <jinxer-wm>	 (SystemdUnitFailed) firing: (13) gobblin-webrequest.service Failed on an-launcher1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[11:12:14] <btullis>	 !log restart refine_netflow service on an-launcher1002.
[11:12:16] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[11:12:41] <icinga-wm>	 RECOVERY - Check systemd state on an-launcher1002 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[11:15:17] <jinxer-wm>	 (SystemdUnitFailed) firing: (13) gobblin-webrequest.service Failed on an-launcher1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[11:28:33] <moritzm>	 btullis: If I understand correctly, the only git command executed in that "post deploy" workflow is the "git log" which validates that the checkout to be pushed to HDFS is up-to-date, right?
[11:28:42] <moritzm>	 I see two options:
[11:29:07] <moritzm>	 1. we provide a wrapper which runs that "git-log" under the analytics-deploy user
[11:29:57] <btullis>	 moritzm: I think that there is something in the `refinery-deploy-to-hdfs` script that doesn't like it either. I'll see if I can find something...
[11:30:02] <moritzm>	 2. we ship a git::systemconfig which sets safe.directory for /srv/deployment/analytics/refinery
[11:31:43] <moritzm>	 yeah, that script runs git describe at least
[11:32:57] <moritzm>	 I'm leaning towards 2. in that case, let me propose a patch in Gerrit
[11:32:59] <btullis>	 So I think that (2) is probably best for this.
[11:33:05] <btullis>	 Snap!
[11:35:27] <btullis>	 Another alternative would be to stop using an-launcher1002 and update the process to use a deployment host instead, but that's a bit more involved.
[11:45:10] <joal>	 btullis: at some point, hopefully not too far in the future, we wish to not have to deploy refinery as often as we do
[11:45:39] <joal>	 btullis: we wish to separate the airflow HQL code from other code, which will probably lead to a reorganization of repos
[11:45:50] <joal>	 Having a dedicated deployment host feels overkill
[11:47:09] <btullis>	 joal: Great! I'm all in favour. What about adding in a bit of gitops and continuous deployment? :-)
[11:47:23] <joal>	 btullis: Would love that :)
[11:48:20] <joal>	 btullis: it's relevant for some of our code, while for some other we need an HDFS integration (done through airflow)
[11:48:47] <joal>	 This entails a good bit of work though, maybe/hopefully this fiscal year?
[11:49:15] <btullis>	 Ack. Count me in.
[12:33:17] <wikibugs>	 10Data-Engineering-Planning, 10API Platform, 10GraphQL, 10Pageviews-API: Responses on pageview API should be lighter - https://phabricator.wikimedia.org/T145935 (10VirginiaPoundstone)
[12:33:46] <wikibugs>	 10Data-Engineering-Planning, 10API Platform, 10GraphQL, 10Pageviews-API: Responses on pageview API should be lighter - https://phabricator.wikimedia.org/T145935 (10VirginiaPoundstone)
[12:40:20] <ottomata>	 btullis: reading https://phabricator.wikimedia.org/T335354#8807059, running refinery-deploy-to-hdfs from deployment server sounds cool, but might not be great because generally the git fat artifacts are not synced to the deployment server
[12:40:33] <ottomata>	 they are pulled as a post deploy step on each deploy host
[12:46:02] <wikibugs>	 10Data-Engineering, 10Anti-Harassment, 10Event-Platform Value Stream, 10Privacy Engineering, and 3 others: Exposing revIDs (nothing more) of deleted/suppressed edits for research to respect their removal - https://phabricator.wikimedia.org/T200559 (10Ottomata)
[12:46:15] <wikibugs>	 10Data-Engineering, 10Anti-Harassment, 10Event-Platform Value Stream, 10Privacy Engineering, and 3 others: Exposing revIDs (nothing more) of deleted/suppressed edits for research to respect their removal - https://phabricator.wikimedia.org/T200559 (10Ottomata) cc @lbowmaker @gmodena
[12:54:34] <btullis>	 ottomata: OK, got it. Thanks. So I think that the option (2) suggested by m.oritzm above sounds like the best approach then.
[12:56:36] <ottomata>	 ya sounds good
[13:16:31] <moritzm>	 quick followup question, from which host are the steps outlined at https://wikitech.wikimedia.org/wiki/Data_Engineering/Systems/Cluster/Deploy/Refinery#How_to_deploy usually done?
[13:16:41] <moritzm>	 profile::analytics::refinery is added on 12 hosts in total
[13:17:06] <moritzm>	 airflow1001/1005, an-coord1001, an-launcher1002 and the stat hosts
[13:17:17] <moritzm>	 plus unrelated an-test* ones
[13:17:41] <moritzm>	 so should I add the git config to the profile so that it's available on all or specifically only to an-launcher1002?
[13:17:56] <moritzm>	 which seems to have been the server from which this was first noticed/reported
[13:18:34] <btullis>	 It's usually done from an-launcher1002 for the prod-hadoop cluster and from an-test-coord1001 for the test hadoop cluster, I believe. However, other people have more experience of deploying this than I have.
[13:22:06] <btullis>	 I'm tempted to say an-coord100[1-2], an-test-coord1001, an-launcher1002 - I think that this task //could// be run on an-coord100[1-2] as it already has the right keytabs.
[13:23:41] <moritzm>	 ok, then I'm adding the git config to a separate profile and will add this to the respective roles
[13:24:48] <btullis>	 Ack, many thanks.
[14:03:31] <wikibugs>	 10Data-Engineering, 10Event-Platform Value Stream: Upgrade Flink Image to 1.17 - https://phabricator.wikimedia.org/T335408 (10Ottomata)
[14:03:47] <wikibugs>	 10Data-Engineering, 10Event-Platform Value Stream (Sprint 12): Upgrade Flink Image to 1.17 - https://phabricator.wikimedia.org/T335408 (10Ottomata)
[14:11:51] <wikibugs>	 10Data-Engineering, 10Event-Platform Value Stream (Sprint 12): mediawiki-event-enrichment: issue async requests from ProcessFunction - https://phabricator.wikimedia.org/T332948 (10Ottomata)
[14:27:08] <moritzm>	 ottomata, joal: when debugging the PCC to add the git config (https://puppet-compiler.wmflabs.org/output/912301/1759/) I realised https://github.com/wikimedia/operations-puppet/commit/a9f74b682de91317d8df9785fe9afd6cf321ee73 broke things:
[14:27:32] <moritzm>	 this renames profile::analytics::hdfs_tools to "class hdfs_tools"
[14:28:11] <moritzm>	 but profile::analytics::hdfs_tools is still included in profile::analytics::cluster::client
[14:28:20] <moritzm>	 profile::analytics::hdfs_tools
[15:01:14] <wikibugs>	 (03CR) 10Clare Ming: Creates web schema fragment (031 comment) [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/911412 (https://phabricator.wikimedia.org/T335309) (owner: 10Kimberly Sarabia)
[15:01:29] <ottomata>	 oh ho
[15:15:12] <jinxer-wm>	 (SystemdUnitFailed) firing: (10) monitor_refine_event_test.service Failed on an-test-coord1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[15:15:19] <joal>	 moritzm: I'm very sorry - I completely missed that
[15:34:39] <ottomata>	 moritzm:  fixed https://gerrit.wikimedia.org/r/c/operations/puppet/+/912316
[15:35:49] <moritzm>	 thx
[15:36:01] <moritzm>	 joal: no worries, easy to miss :-)
[16:20:13] <joal>	 mforns: Would you have a minute for me?
[16:20:22] <mforns>	 sure! batcave?
[16:20:26] <joal>	 OMW
[16:26:10] <wikibugs>	 (03CR) 10Snwachukwu: Migrate pageview druid load hql queries to Airflow (033 comments) [analytics/refinery] - 10https://gerrit.wikimedia.org/r/910520 (https://phabricator.wikimedia.org/T334104) (owner: 10Snwachukwu)
[16:40:13] <wikibugs>	 10Analytics-Radar, 10Data-Engineering-Icebox, 10Machine-Learning-Team, 10Patch-For-Review: Upgrade ROCm to 4.5 - https://phabricator.wikimedia.org/T295661 (10elukey) The last issue has been fixed in T333009: for k8s nodes we just allow `others` to read the devices.  The new ROCm suite has been imported for...
[16:51:54] <wikibugs>	 (03PS2) 10Snwachukwu: Migrate pageview druid load hql queries to Airflow [analytics/refinery] - 10https://gerrit.wikimedia.org/r/910520 (https://phabricator.wikimedia.org/T334104)
[16:51:56] <icinga-wm>	 PROBLEM - IPMI Sensor Status on aqs2008 is CRITICAL: Sensor Type(s) Temperature, Power_Supply Status: Critical [Status = Critical] https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook%23Power_Supply_Failures
[17:48:12] <joal>	 Hi mforns and xcollazo - would you have a minute now?
[17:48:26] <mforns>	 I can
[17:49:59] <joal>	 pinging xcollazo on slack
[17:51:21] <mforns>	 joal: do you have a couple minutes to re-review https://gerrit.wikimedia.org/r/c/analytics/refinery/+/910092 please? I'd like to deploy today if possible.
[17:55:00] <joal>	 mforns: batcave with xcollazo ?
[17:55:07] <mforns>	 ok
[18:03:11] <wikibugs>	 (03CR) 10Joal: "Commented on the first file" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/910092 (https://phabricator.wikimedia.org/T334096) (owner: 10Mforns)
[18:08:37] <joal>	 oh, mforns - batcave again for the CR?
[18:09:34] <mforns>	 heya joal I'm in it
[18:09:53] <mforns>	 but I've read your comments, they make total sense. Will change those
[18:11:27] <jinxer-wm>	 (HiveServerHeapUsage) firing: Hive Server JVM Heap usage is above 80% on an-coord1002:10100 - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hive/Alerts#Hive_Server_Heap_Usage - https://grafana.wikimedia.org/d/000000379/hive?panelId=7&fullscreen&orgId=1&var-instance=an-coord1002:10100 - https://alerts.wikimedia.org/?q=alertname%3DHiveServerHeapUsage
[18:31:27] <jinxer-wm>	 (HiveServerHeapUsage) resolved: Hive Server JVM Heap usage is above 80% on an-coord1002:10100 - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hive/Alerts#Hive_Server_Heap_Usage - https://grafana.wikimedia.org/d/000000379/hive?panelId=7&fullscreen&orgId=1&var-instance=an-coord1002:10100 - https://alerts.wikimedia.org/?q=alertname%3DHiveServerHeapUsage
[18:34:53] <icinga-wm>	 PROBLEM - Check systemd state on an-web1001 is CRITICAL: CRITICAL - degraded: The following units failed: hdfs_rsync_analytics_hadoop_published.service https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[18:35:13] <jinxer-wm>	 (SystemdUnitFailed) firing: (11) monitor_refine_event_test.service Failed on an-test-coord1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[18:59:37] <wikibugs>	 (03PS4) 10Mforns: Migrate unique devices druid loading queries to Airflow/SparkSQL [analytics/refinery] - 10https://gerrit.wikimedia.org/r/910092 (https://phabricator.wikimedia.org/T334096)
[19:04:50] <wikibugs>	 (03CR) 10Mforns: Migrate unique devices druid loading queries to Airflow/SparkSQL (032 comments) [analytics/refinery] - 10https://gerrit.wikimedia.org/r/910092 (https://phabricator.wikimedia.org/T334096) (owner: 10Mforns)
[19:17:52] <wikibugs>	 (03PS1) 10Milimetric: Adapt virtualpageview druid scripts to spark [analytics/refinery] - 10https://gerrit.wikimedia.org/r/912360 (https://phabricator.wikimedia.org/T334105)
[19:46:16] <wikibugs>	 (03PS5) 10Mforns: Migrate unique devices druid loading queries to Airflow/SparkSQL [analytics/refinery] - 10https://gerrit.wikimedia.org/r/910092 (https://phabricator.wikimedia.org/T334096)
[19:49:48] <wikibugs>	 (03CR) 10Mforns: "For the record: Antoine was concerned that if Hive does not recognize tables created with Spark syntax, HiveToDruid would not be able to r" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/910092 (https://phabricator.wikimedia.org/T334096) (owner: 10Mforns)
[20:50:31] <wikibugs>	 (03PS4) 10Mforns: Migrate queries for webrequest_sampled_128 to /hql (Airflow/Spark3) [analytics/refinery] - 10https://gerrit.wikimedia.org/r/911890 (https://phabricator.wikimedia.org/T334106)
[20:52:54] <wikibugs>	 (03CR) 10Mforns: "Hey Antoine :]" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/911890 (https://phabricator.wikimedia.org/T334106) (owner: 10Mforns)
[20:56:41] <wikibugs>	 (03PS3) 10Kimberly Sarabia: Creates web schema fragment [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/911412 (https://phabricator.wikimedia.org/T335309)
[20:57:12] <wikibugs>	 (03CR) 10CI reject: [V: 04-1] Creates web schema fragment [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/911412 (https://phabricator.wikimedia.org/T335309) (owner: 10Kimberly Sarabia)
[21:00:45] <wikibugs>	 (03PS5) 10Mforns: Migrate queries for webrequest_sampled_128 to /hql (Airflow/Spark3) [analytics/refinery] - 10https://gerrit.wikimedia.org/r/911890 (https://phabricator.wikimedia.org/T334106)
[21:39:44] <wikibugs>	 10Data-Engineering-Planning, 10XTools, 10Chinese-Sites: Run maintain-views on zhwiki, newiki - https://phabricator.wikimedia.org/T334041 (10MusikAnimal) @lbowmaker Any chance we could get an estimate on when you think this task can be fulfilled? My naive understanding is that it's as simple as running a sing...
[21:45:42] <wikibugs>	 (03PS4) 10Kimberly Sarabia: Creates web schema fragment [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/911412 (https://phabricator.wikimedia.org/T335309)
[22:00:54] <icinga-wm>	 RECOVERY - Check systemd state on an-web1001 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[22:05:13] <jinxer-wm>	 (SystemdUnitFailed) firing: (11) monitor_refine_event_test.service Failed on an-test-coord1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[22:48:35] <wikibugs>	 10Data-Engineering-Icebox: Improve Bot Detection Heuristics - https://phabricator.wikimedia.org/T310846 (10Mayakp.wiki) In 2023 pageview data, we are seeing spikes in automated traffic that are now affecting external (search engine) referrer traffic ([[ https://w.wiki/6dbK | chart ]])   {F36963923}  We need to i...