[00:56:11] RECOVERY - Check unit status of monitor_refine_eventlogging_legacy_failure_flags on an-launcher1002 is OK: OK: Status of the systemd unit monitor_refine_eventlogging_legacy_failure_flags https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[01:50:03] Analytics-Radar, Dumps-Generation, Okapi, Platform Engineering: HTML Dumps - June/2020 - https://phabricator.wikimedia.org/T254275 (RBrounley_WMF) Yes, thanks @ArielGlenn
[01:50:32] Analytics-Radar, Dumps-Generation, Okapi, Platform Engineering: HTML Dumps - June/2020 - https://phabricator.wikimedia.org/T254275 (RBrounley_WMF) Open→Resolved
[06:51:42] Analytics-Radar, SRE, ops-eqiad: Try to move some new analytics worker nodes to different racks - https://phabricator.wikimedia.org/T276239 (ayounsi)
[07:03:17] Analytics: Upgrade Druid to 0.20.1 (latest upstream) - https://phabricator.wikimedia.org/T278056 (elukey) Changelogs: https://github.com/apache/druid/releases/tag/druid-0.20.0 https://github.com/apache/druid/releases/tag/druid-0.20.1
[07:03:59] good morning
[07:50:58] Good morning
[07:51:04] elukey: Hi :)
[07:51:14] I'm gonna read some more on Alluxio this morning
[07:51:27] elukey: and thank you for the task about druid-0.20 :S
[07:51:39] joal: bonjour!
[07:52:00] at least we know what it was, those cache misses now make more sense!
[07:52:14] indeed :)
[07:52:18] upgrading to 0.20.1 shouldn't be that hard, but of course it is a plus maybe for next Q
[07:52:28] yup
[07:52:38] (even if I'd do it now :P)
[07:52:58] thinking about a broker cache in memcached/redis could be a nice idea
[07:53:27] even one node
[07:53:38] sure elukey - I wonder about overall gain in comparison to hardware cost
[07:54:37] ah yes that needs to be taken into consideration
[07:55:37] I am wondering how the whole caching setup could affect druid-based dashboard loading
[07:55:50] in theory it should be a good use case
[07:55:55] yup
[07:56:20] On the other hand, we're moving away from druid for many cases, preferring Presto
[07:57:18] Analytics-Clusters, Analytics-Kanban, Patch-For-Review: Upgrade to Superset 1.0 - https://phabricator.wikimedia.org/T272390 (elukey) Adding some notes from IRC: Superset is kerberized so the move to kubernetes is a little trickier, since we would need to figure out how that works (passing the keytab,...
[07:58:06] joal: sure sure
[08:00:52] Analytics-Radar, DBA, Patch-For-Review: mariadb on dbstore hosts, and specifically dbstore1004, possible memory leaking - https://phabricator.wikimedia.org/T270112 (elukey) Restarted all instances on dbstore1004 today :(
[08:08:22] Thanks for the changes on capacity-scheduler elukey :)
[08:10:34] joal: thanks for your patience! Ok to deploy it to the test cluster?
[08:10:48] Yes elukey! let's test :)
[08:17:59] all right let's start with
[08:18:00] java.io.IOException: mapping contains invalid or non-leaf queue production.analytics
[08:18:03] ahahahha
[08:18:10] :/
[08:18:22] ok - not thorough enough of a review :)
[08:18:50] (CR) Bharatkhatri: "any body tell me the reason of merge conflict" [analytics/wikistats2] - https://gerrit.wikimedia.org/r/656541 (https://phabricator.wikimedia.org/T263973) (owner: Bharatkhatri)
[08:18:58] I am going to live hack on the test node
[08:19:33] I don't understand how production.analytics is non-leaf
[08:19:35] weird
[08:21:53] joal: so it wants only the name of the leaf
[08:21:56] like "analytics"
[08:22:04] elukey: in user mapping
[08:22:13] exactly yes
[08:22:14] capacity-queue-mapping sorry
[08:22:33] ok makes sense - I was about to say, maybe we should have used the full path (root.production.analytics)
[08:22:39] but it's actually the opposite :)
[08:22:44] so this has a non-trivial corollary, namely that leaf queues have unique names
[08:23:03] I tried to use the full path at first, didn't work
[08:23:23] elukey: indeed (see https://issues.apache.org/jira/browse/YARN-6325)
[08:24:50] (CR) Muehlenhoff: [C: +1] "LGTM" (1 comment) [analytics/udplog] - https://gerrit.wikimedia.org/r/673596 (https://phabricator.wikimedia.org/T276623) (owner: Majavah)
[08:37:54] joal: all up and running!
[08:38:02] \o/
[08:38:08] ssh -L 8088:an-test-master1001.eqiad.wmnet:8088 an-test-master1001.eqiad.wmnet
[08:38:08] * joal is gonna have a look
[08:38:32] * joal doesn't even have to make the ssh line - thank you elukey :)
[08:39:35] elukey: so cool! Andrew's job got reassigned correctly :)
[08:40:01] elukey: I think our oozie jobs might not however - let's check
[08:43:19] elukey: actually camus failed
[08:44:11] joal: I think it is because of the 'essential' queue
[08:44:22] elukey: I think so too - checking
[08:44:28] I need to change it, lemme try a quick manual run
[08:44:56] elukey: we don't even have logs for the job
[08:45:51] joal: we do, on an-test-coord
[08:46:02] I mean the systemd timers one
[08:46:07] Ah of course
[08:46:20] elukey: I was checking logs on yarn
[08:50:44] (Abandoned) WMDE-leszek: Review access change [analytics/wmde/scripts] (refs/meta/config) - https://gerrit.wikimedia.org/r/672824 (owner: WMDE-leszek)
[08:51:40] elukey: new config works in production.ingest queue
[08:52:23] joal: goood
[08:52:50] I am currently applying the setting via puppet for all test camus timers
[08:56:19] aaand first refine job
[08:56:20] org.apache.hadoop.security.AccessControlException: User analytics does not have permission to submit application_1616402022758_0004 to queue production
[08:56:23] :P
[08:56:39] all as expected :)
[08:57:37] changing puppet
[08:58:23] joal: ingest or analytics?
[08:58:39] refine should be production.analytics elukey key please :)
[08:58:41] I suppose analytics
[08:58:44] ack perfect :)
[09:08:05] refine now works!
[09:11:25] need to change druid load too of course
[09:16:04] joal: so druid load seems working, both analytics and druid users in the analytics queue
[09:28:52] !log move the yarn scheduler in hadoop test to capacity
[09:28:54] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[10:29:09] (CR) Phuedx: [C: -2] "Per Ottomata's comment above. I'll update this patch and the one that relies on it." [schemas/event/secondary] - https://gerrit.wikimedia.org/r/668743 (https://phabricator.wikimedia.org/T275766) (owner: Phuedx)
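
For context on the "non-leaf queue" error worked through above: a minimal sketch of the capacity-scheduler mapping property involved, written in the same 'key' => 'value' style quoted later in the log. The property names are the upstream Hadoop ones; the specific users, groups and queue layout are illustrative assumptions, not the exact production config.

    # Rejected: mappings must reference a LEAF queue by its short name, not a
    # path, so "production.analytics" (or "root.production.analytics") fails
    # with "mapping contains invalid or non-leaf queue" (see YARN-6325).
    # 'yarn.scheduler.capacity.queue-mappings' => 'u:analytics:production.analytics',

    # Accepted: leaf name only ("u:<user>:<queue>" / "g:<group>:<queue>"),
    # which is also why leaf queue names have to be unique across the tree.
    'yarn.scheduler.capacity.queue-mappings' => 'u:analytics:analytics,u:druid:analytics,g:analytics-privatedata-users:default',
    'yarn.scheduler.capacity.queue-mappings-override.enable' => 'false',
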
[11:04:12] (PS10) Phuedx: universalLanguageSelector: Add new properties [schemas/event/secondary] - https://gerrit.wikimedia.org/r/668743 (https://phabricator.wikimedia.org/T275766)
[11:09:02] (PS2) Phuedx: universalLanguageSelector: Add timeToChangeLanguage property [schemas/event/secondary] - https://gerrit.wikimedia.org/r/672740 (https://phabricator.wikimedia.org/T275794)
[11:56:09] * elukey lunch
[12:44:51] (CR) Ottomata: [C: +1] "+1 oh but one more thought. You could use the major version bump to remove token from the list of required fields. At least this way you" [schemas/event/secondary] - https://gerrit.wikimedia.org/r/668743 (https://phabricator.wikimedia.org/T275766) (owner: Phuedx)
[12:48:37] mornin! o/ mforns https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/673075 ?
[13:34:42] o/
[13:35:07] elukey: heya - ok if I restart oozie webrequest job in test cluster?
[13:35:26] joal: sure! I didn't check, is it stuck?
[13:35:35] elukey: IIRC you even had a script to do so?
[13:35:59] elukey: it is stuck yes, with errors: org.apache.hadoop.yarn.exceptions.YarnException: org.apache.hadoop.security.AccessControlException: User analytics does not have permission to submit application_1616402022758_0010 to queue production
[13:36:11] joal: I have one yes, if you give me the start time I'll kick it off
[13:36:24] ah of course it runs in "production"
[13:37:14] elukey: the interesting part is that the job doesn't fail, it goes into suspended mode (oozie-launcher fails to start)
[13:37:25] ouch, this means that we'll need to roll restart all jobs if we keep this config /o\
[13:37:44] elukey: expected start time would be: 2021-03-22 08:00
[13:37:49] elukey: indeed :(
[13:38:02] elukey: it was kinda known, and it confirms
[13:38:05] joal: what if we rename 'analytics' to 'production' ?
[13:38:06] hm
[13:38:34] elukey: why not,
[13:38:34] so we won't need to change the spark/druid timers too
[13:38:45] I mean restarting all oozie jobs is horrible
[13:38:50] elukey: giving it the leaf-queue is enough?
[13:39:03] joal: I hope so yes
[13:39:08] elukey: restarting all oozie jobs is indeed not nice, but feasible :)
[13:39:33] ah nono ok of course, root.production.production
[13:39:35] uff
[13:39:47] this is not nice in all directions
[13:39:56] right :(
[13:40:21] elukey: I'm gonna kill the current webrequest in test - ok ?
[13:40:50] joal: yes yes let's think about a more conservative solution
[13:41:54] ok elukey, killed
[13:42:27] elukey: I can do the modifs and restart if you wish
[13:43:32] joal: restarted
[13:43:53] it should work, but restarting all oozie jobs..
[13:44:16] actually elukey, 2 coords running - weird
[13:45:04] up elukey - 2 coords started - shall I kill one?
[13:45:12] yep yep probably my bad
[13:46:20] elukey: confirmed webrequest starts flowing again
[13:46:58] ack!
[13:51:10] interesting elukey!
[13:51:16] I can't submit spark jobs
[13:51:24] User joal does not have permission to submit application_1616402022758_0090 to queue default
[13:52:35] joal: I am pretty sure because you are a bad person :D
[13:52:49] I know :)
[13:52:58] elukey: this is why I ask you ;)
[13:53:16] but you should be in privatedata-users mmm
[13:53:57] joal: have you specified the queue?
[13:54:06] or did it work by itself via mappings?
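
A quick illustration of the two submission paths being compared here (explicit queue vs. relying on the scheduler mappings), sketched with spark-submit; the jar, class and queue names are placeholders, not the actual jobs being run.

    # Explicit queue: the target must be a leaf queue's short name
    # ("default", "analytics", ...), the same short-name rule as the
    # scheduler mappings - a path like "users.default" is not a valid target.
    spark-submit --master yarn --queue default --class org.example.Job job.jar

    # No --queue (or spark.yarn.queue): YARN places the application through
    # the capacity-scheduler user/group mappings instead.
    spark-submit --master yarn --class org.example.Job job.jar
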
[13:54:10] both specifying and not
[13:54:17] queue specified: default
[13:54:28] I tried with others: users.default (doesn't exist)
[13:55:10] joal: I suspect that the syntax to allow groups in the acl stuff is different
[13:55:22] mornin yall
[13:55:27] 'yarn.scheduler.capacity.root.users.default.acl_submit_applications' => 'analytics-privatedata-users',
[13:55:30] good morning milimetric
[13:55:35] this probably means "user analytics-privatedata-users"
[13:55:40] milimetric: o/
[13:55:48] shameless plug for https://phabricator.wikimedia.org/T265765#6930945 (the visidata thing I was messing with)
[13:55:56] I'd love to get some eyes on it and spiff it up
[14:03:53] ottomata: o/
[14:03:57] do you have a minute?
[14:03:58] hello!
[14:04:03] ya!
[14:04:21] I am trying to drop some files on thorium that are rsynced from stat1006
[14:04:33] so I have already moved them on stat1006 to /srv/backup just in case
[14:04:40] hey all!
[14:04:46] ottomata: looking at the change
[14:04:56] on thorium though, I keep seeing things like 632G .hardsync.datasets.99pIBqzpm8R7
[14:05:01] joal: we have an interview today, wanna sync before?
[14:05:06] in /srv, that holds the files there
[14:05:17] sure mforns
[14:05:19] I don't recall how the script works though
[14:05:29] i have to open it to recall too, let's see
[14:06:46] mforns: now?
[14:06:56] joal: ok!
[14:06:58] bc?
[14:07:08] OMW
[14:09:53] huh, interesting elukey, those should be moved into the destination datasets dir
[14:10:09] all the files in there are hardlinks to the source dirs
[14:10:14] the ones named for each stat box
[14:10:34] once the script makes hardlink copies of the files in the source dirs into that tmpdir
[14:10:52] it should move that tmp dir into place at dest dir
[14:10:59] temp_dest=$(mktemp -d$mktemp_dry_run $base_temp_dir/.hardsync.$(basename $dest_dir).XXXXXXXXXXXX)
[14:11:04] later
[14:11:04] cmd mv -f "$temp_dest" "$dest_dir"
[14:11:28] also, those trash ones
[14:11:30] should be removed
[14:11:30] cmd rm -rf "$temp_dest_trash"
[14:11:54] elukey: also, afaics, those are all old?
[14:12:03] oh
[14:12:06] no there are new ones sorry
[14:12:20] but, they aren't every day?
[14:13:53] elukey: hardsync-published is still a cron
[14:13:55] but...
[14:14:01] it is commented out (twice) in root's crontab?
[14:14:24] ottomata: I have commented it (with puppet disabled)
[14:14:28] to drop files :)
[14:14:28] oh ok
[14:14:54] elukey: the script output is just being dumped to /dev/null
[14:14:59] and, it runs with set -e
[14:15:00] so maybe
[14:15:06] it is failing sometimes in between steps?
[14:15:16] and that is causing those temp files to remain
[14:15:19] in any case...they are all hardlinks
[14:15:26] so they should be safe to remove
[14:16:17] ottomata: okok I just wanted an extra pair of eyes before taking action
[14:16:21] will do thanks :)
[14:16:24] k cool
[14:23:13] (CR) Mholloway: [C: -1] Image recommendations table for android (2 comments) [schemas/event/secondary] - https://gerrit.wikimedia.org/r/668244 (owner: Sharvaniharan)
[14:35:53] elukey: syntax for ACLs is "user1,user2,... group1,group2,.." - We need a space before the first group (even if there is no user)
[14:36:09] joal: I was wondering the same, but I've read
[14:36:30] (can't find it)
[14:36:41] elukey: found it here: https://partners-intl.aliyun.com/help/doc-detail/62958.htm
[14:36:44] anyway, will try with a manual hack adding space blabla
[14:36:45] (CR) Milimetric: [C: -2] "This is not how we make changes on wikistats. If you would like to edit the translations, you can do so through translatewiki, some docum" [analytics/wikistats2] - https://gerrit.wikimedia.org/r/656541 (https://phabricator.wikimedia.org/T263973) (owner: Bharatkhatri)
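
The one-character fix being discussed: YARN queue ACLs are parsed as "<comma-separated users><space><comma-separated groups>", so a group-only ACL needs a leading space, and a lone space means "nobody". A sketch in the same puppet-hash style as the snippet quoted at 13:55; the exact keys in hiera may differ.

    # Before: parsed as a USER called analytics-privatedata-users (no group part).
    # 'yarn.scheduler.capacity.root.users.default.acl_submit_applications' => 'analytics-privatedata-users',

    # After: empty user list, then the group - this is the "manual hack adding
    # space" being tested here.
    'yarn.scheduler.capacity.root.users.default.acl_submit_applications' => ' analytics-privatedata-users',

    # Disallow-all (mentioned just below): a single space, i.e. no users and no groups.
    'yarn.scheduler.capacity.root.users.default.acl_administer_queue' => ' ',
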
[14:36:48] joal: ah okok!
[14:37:00] (Abandoned) Milimetric: Remove typo error [analytics/wikistats2] - https://gerrit.wikimedia.org/r/656541 (https://phabricator.wikimedia.org/T263973) (owner: Bharatkhatri)
[14:37:39] elukey: disallow-all is space-only, as in "no-user no-group"
[14:37:55] joal: a clear syntax yes
[14:38:01] how can people get it wrong
[14:38:06] Mwahahaha
[14:40:24] joal: can you re-test?
[14:40:34] sure elukey
[14:42:00] elukey: success (error from another bit, but success for yarn)
[14:43:36] joal: filing a change for puppet
[14:43:48] is the error related to the scheduler? I mean, should I wait?
[14:44:18] nope elukey, all good
[14:44:47] elukey: testing resource allocation: all good
[14:45:17] elukey: the fact that our users.default queue is named default means that most job launches will succeed without problems
[14:46:39] elukey: I'm gonna try to use the production queue as my user, just to check
[14:46:45] elukey: and then from analytics
[14:48:16] joal: +1
[14:56:21] elukey: confirmed it all works as expected
[14:56:27] \o/
[14:56:56] elukey: things I'd like to triple check: killing jobs (actually, being refused to), and moving jobs between queues
[14:57:04] leaving for kids now, will do it later
[14:59:30] Analytics, Analytics-EventLogging, Better Use Of Data, Event-Platform, and 4 others: KaiOS / Inuka Event Platform client - https://phabricator.wikimedia.org/T273219 (SBisson)
[15:01:33] Analytics, Analytics-EventLogging, Better Use Of Data, Event-Platform, and 4 others: KaiOS / Inuka Event Platform client - https://phabricator.wikimedia.org/T273219 (SBisson) a: SBisson https://github.com/wikimedia/wikipedia-kaios/pull/358
[15:07:20] Analytics-Radar, Better Use Of Data, Product-Analytics, Product-Data-Infrastructure, and 3 others: prefUpdate schema contains multiple identical events for the same preference update - https://phabricator.wikimedia.org/T218835 (Mholloway) @nray Do you plan to review updated patch (updated only to...
[15:20:27] Analytics, Analytics-Kanban, Patch-For-Review: Clean up issues with jobs after Hadoop Upgrade - https://phabricator.wikimedia.org/T274322 (Milimetric)
[15:20:35] Analytics, Product-Analytics: Default table creation settings results in warnings when querying - https://phabricator.wikimedia.org/T277822 (Milimetric) (rolled up into bigtop migration cleanup task (not messing with subtasks))
[15:23:18] Analytics, Product-Analytics, Structured-Data-Backlog: Create a Commons equivalent of the wikidata_entity table in the Data Lake - https://phabricator.wikimedia.org/T258834 (Milimetric) p: Triage→Medium
[15:23:48] razzi: o/
[15:24:05] I tried to connect to clouddb1021 from an-launcher and it works
[15:27:05] so there are two accounts/sqoop scripts
[15:27:07] elukey: what credentials did you use?
[15:27:14] /srv/deployment/analytics/refinery/bin/sqoop-mediawiki-tables -> uses clouddb1021
[15:27:35] sorry
[15:27:39] /usr/local/bin/refinery-sqoop-mediawiki
[15:27:49] and we also have, to keep things not confusing at all
[15:28:00] /usr/local/bin/refinery-sqoop-mediawiki-production
[15:28:09] that hits the dbstore100x nodes
[15:28:23] the former uses the credentials to test, the latter a research/password combination
[15:28:29] that works only on dbstores
[15:31:26] (CR) Mholloway: [C: -1] Image recommendations table for android (1 comment) [schemas/event/secondary] - https://gerrit.wikimedia.org/r/668244 (owner: Sharvaniharan)
[15:40:23] Analytics-Clusters: Migrate eventlog1002 to buster - https://phabricator.wikimedia.org/T278137 (Ottomata)
[15:40:41] Analytics, Product-Analytics (Kanban): Hive table neilpquinn.toledo_pageviews missing almost all data - https://phabricator.wikimedia.org/T277781 (nshahquinn-wmf) @Ottomata and @JAllemandou, thank you very much for investigating this! I realized that on 2020-12-10, I was working on T261953/T267940. As p...
[15:52:59] !log rebalance kafka partitions for webrequest_text partition 2
[15:53:01] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[16:20:05] razzi: how many partitions left? Are you able to do two at a time now?
[16:21:10] elukey: 21 partitions to go; each partition for webrequest_text is about twice as big as webrequest_upload which we used to test 2 at a time, but I could try 2 at a time for webrequest_text anyways
[16:21:58] razzi: I'd be +1 to test, let's get ottomata's approval as well, so you'll be able to cut down the time (hopefully)
[16:22:15] test == try one time with 2 text partitions at the same time
[17:30:47] ottomata: if you have time, I'd need some extra eyes for thorium
[17:31:06] so I see a big dir under /srv
[17:31:07] 620G .hardsync.datasets.d6TU2Sb3m5Dw
[17:31:50] now inside it there is the big dir that I want to delete, so if I cd in there and rm -rf it, it returns instantly
[17:32:05] and if I re-run du, I find another .hardsync.etc.. hardlink
[17:32:32] so I am very puzzled about what's happening :D
[17:32:45] I didn't try to rm .hardsync.etc.. directly
[17:33:34] razzi / elukey: just got out of meeting, will retry sqoop shortly and let you know
[17:34:59] milimetric: I'd like to watch over your shoulder, I'm still confused on how to test sqoop
[17:35:37] razzi: omw cave
[17:36:03] elukey: razzi +1 with 2 text partitions
[17:36:37] elukey: not sure i understand
[17:36:46] about hardsync
[17:43:30] ottomata: so what I did is
[17:43:31] cd /srv
[17:43:40] du -sch .[!.]* * |sort -h
[17:43:59] and the biggest dir is a .hardsync.dataset.etc..
[17:44:14] I am wondering what to do with it
[17:45:17] PROBLEM - AQS root url on aqs1011 is CRITICAL: connect to address 10.64.16.201 and port 7232: Connection refused https://wikitech.wikimedia.org/wiki/Analytics/Systems/AQS%23Monitoring
[17:45:43] new node :
[17:46:32] Analytics, Data-Services, Machine-Learning-Team, ORES, and 2 others: Generate dump of scored-revisions from 2018-2020 for English Wikipedia - https://phabricator.wikimedia.org/T277609 (calbon) This looks super interesting, when the data is out there I'd love to have it posted for a potential inte...
[17:48:21] ACKNOWLEDGEMENT - AQS root url on aqs1011 is CRITICAL: connect to address 10.64.16.201 and port 7232: Connection refused Hnowlan Host not in use yet. https://wikitech.wikimedia.org/wiki/Analytics/Systems/AQS%23Monitoring
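
For reference while following the thorium cleanup: a condensed sketch of the hardsync-published flow ottomata outlined earlier (around 14:10), showing why a leftover /srv/.hardsync.* dir is only a tree of hardlinks and safe to delete. The mktemp/mv/rm lines are adapted from the fragments quoted above; the loop, the cp -al stand-in and the trash handling are assumptions, not the real script.

    set -e                       # any failure aborts the run mid-flow
    temp_dest=$(mktemp -d "$base_temp_dir/.hardsync.$(basename "$dest_dir").XXXXXXXXXXXX")

    # Hardlink-copy each per-stat-box source dir into the temp dir
    # (cp -al here as a stand-in; the real script may use rsync or similar).
    for src in "${source_dirs[@]}"; do
        cp -al "$src/." "$temp_dest/"
    done

    mv -f "$temp_dest" "$dest_dir"     # move the freshly linked tree into place
    rm -rf "$temp_dest_trash"          # then drop the previous tree's links

    # With output sent to /dev/null and set -e in effect, a failure between the
    # mktemp and the final rm leaves .hardsync.* behind; every file inside is a
    # hardlink, so the data itself still lives in the source dirs.
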
[17:50:34] Analytics-Radar, Better Use Of Data, Product-Analytics, Product-Data-Infrastructure, and 3 others: prefUpdate schema contains multiple identical events for the same preference update - https://phabricator.wikimedia.org/T218835 (nray) @Mholloway Sorry for the delay on that, I will for sure review...
[17:54:02] (PS4) Milimetric: Update mysql resolver to work with cloud replicas [analytics/refinery] - https://gerrit.wikimedia.org/r/666209 (https://phabricator.wikimedia.org/T274690)
[18:01:58] !log drop /srv/.hardsync*trash* on thorium - old hardlinks that should have been trashed
[18:02:01] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[18:03:41] ottomata: ok now I get what's happening
[18:04:17] we have 100+ .hardsync* dirs under /srv, all containing a reference to the data dir that I want to delete
[18:06:44] Analytics-Radar, Better Use Of Data, Product-Analytics, Product-Data-Infrastructure, and 3 others: prefUpdate schema contains multiple identical events for the same preference update - https://phabricator.wikimedia.org/T218835 (Mholloway) No worries, @nray! Thanks for the update.
[18:07:19] !log run rm -rfv .hardsync.*/archive/public-datasets/* on thorium:/srv to clean up files to drop (didn't work)
[18:07:21] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[18:07:38] elukey: everything in there should just be hardlinks
[18:07:38] so
[18:07:54] i'd just delete all the .hardsync dirs
[18:08:00] they shouldn't be there anyway
[18:08:16] ottomata: ok I'll proceed then
[18:12:53] !log drop /srv/.hardsync* to clean up hardlinks not needed
[18:12:54] /dev/mapper/thorium--vg-data 7.2T 266G 6.6T 4% /srv
[18:12:55] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[18:12:58] ottomata: success :)
[18:18:16] Analytics, Analytics-Kanban: Check data currently stored on thorium and drop what it is not needed anymore - https://phabricator.wikimedia.org/T265971 (elukey) ` elukey@thorium:/srv$ sudo du -hs * 177G analytics.wikimedia.org 8.0K deployment 4.0K log 16K lost+found 3.5G org.wikimedia.community-analytics...
[18:19:32] ottomata: the next step is to figure out the max space that we'll allow for thorium's published-datasets
[18:19:56] would something like 480G work ? (On raid 1)
[18:20:32] we currently use around 240G
[18:20:33] ish
[18:25:06] * elukey afk!
[18:25:10] have a good rest of the day folks
[18:25:15] elukey: that could be good
[18:25:18] i wonder if it would grow
[18:25:39] yes I wonder the same, in case we'll need to order a beefier node
[18:25:39] but i think we should just see what standard hw willy comes up with and pick something close to that
[18:25:40] laters!
[18:25:45] +1
[19:08:35] Gone for tonight - see you tomorrow team
[20:56:00] (PS3) Ottomata: Add support for finding RefineTarget inputs from Hive [analytics/refinery/source] - https://gerrit.wikimedia.org/r/673604 (https://phabricator.wikimedia.org/T212451)
[21:00:24] mforns: just pushed up a less WIP patch
[21:00:43] sorry it is so much...but i think i removed some extra code and made config loading much simpler
[21:00:51] ok, ottomata, lookin, didn't finish yet because of interview, sorry
[21:00:53] lemme know if discussing would help review
[21:01:23] ottomata: sure! if you want to walk me through the changes :]
[21:01:35] wanna take a look first? or bc now? either is good for me
[21:02:53] ottomata: let's chat for 10 mins now, if OK
[21:03:27] k
[22:24:05] (CR) Jdlrobson: [C: +2] "Tested against current master of WikimediaEvents and then again with patch https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Wikimedia" [schemas/event/secondary] - https://gerrit.wikimedia.org/r/668743 (https://phabricator.wikimedia.org/T275766) (owner: Phuedx)
[22:24:44] (Merged) jenkins-bot: universalLanguageSelector: Add new properties [schemas/event/secondary] - https://gerrit.wikimedia.org/r/668743 (https://phabricator.wikimedia.org/T275766) (owner: Phuedx)
[22:38:15] PROBLEM - Check unit status of produce_canary_events on an-launcher1002 is CRITICAL: CRITICAL: Status of the systemd unit produce_canary_events https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[23:00:43] RECOVERY - Check unit status of produce_canary_events on an-launcher1002 is OK: OK: Status of the systemd unit produce_canary_events https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers