[07:11:56] Good morning
[07:13:21] bonjour
[07:56:18] joal: I am trying to add the worker node daemons on the backup master/standby
[07:56:34] so hopefully +96TB
[08:00:35] \o/
[08:00:55] elukey: I'm doing a gentle start, then will start moving data around with our permission
[08:01:34] sure sure!
[08:01:55] I didn't mean to rush, just updating on the backup cluster status :)
[08:02:03] yessir
[08:02:10] this evening I'll try to follow up with dcops for the other two nodes
[08:03:07] ack elukey
[08:03:41] elukey: also, thanks for the good call on AQS this weekend - reducing the query-span really worked, without having to reduce the QPS
[08:05:30] joal: I am so sad about AQS, after bigtop I'll focus on getting us to Cassandra 3 + new cluster
[08:06:15] elukey: the 2y span allows for ~200 successful QPS with good latencies
[08:06:41] joal: yes yes but I am very scared about new traffic coming in and bringing AQS down
[08:06:44] I can think of improvements to be better at fast-404, therefore reducing the load on cassandra
[08:06:51] nice :)
[08:06:54] for sure elukey
[08:07:18] * elukey loves Joseph the Cassandra plumber
[08:08:03] * elukey also loves regular Joseph
[08:08:52] :)
[08:08:53] Analytics: Increase segment replication factor on Druid Public to 3 - https://phabricator.wikimedia.org/T272670 (JAllemandou) Replication factor has been changed to 3 on Friday 22nd of January 2021.
[08:09:05] elukey: shall we close that task? --^
[08:12:17] joal: yep!
[08:12:38] I'd be curious to see if rebooting leads to a better resiliency
[08:13:09] I think it will not, but I might be very wrong
[08:14:22] yes you are right, my hope is probably only a wish more than something based on solid proofs :(
[08:14:42] nobody answered me in druid users@
[08:14:43] lovely
[08:14:47] :(
[08:19:23] https://grafana.wikimedia.org/d/000000585/hadoop?viewPanel=25&orgId=1&var-datasource=eqiad%20prometheus%2Fanalytics&var-hadoop_cluster=analytics-backup-hadoop&var-worker=All
[08:19:27] \o/
[08:19:45] Indeed!!!
[08:20:01] elukey: do we have an HDFS balancer running on backup-cluster?
[08:20:24] elukey: I think it's very important we have one, running almost all the time, as data will flow in big
[08:20:33] joal: good point we don't, I'll add it to one of the master nodes
[08:27:53] joal: added
[08:27:58] Thanks a lot elukey
[08:54:50] elukey: our new machine for the test cluster, they are 36-cores/192Gb-RAM/24Disks is that correct?
[08:55:09] elukey: for the BACKUP cluster excuse me
[08:55:37] correct but 12 4TB disks
[08:55:41] ack
[08:56:25] elukey: and we have 18 of them on the test-cluster currently - right?
[08:57:41] joal: 16 at the moment, including the masters, but I hope to get 2 more this evening
[08:58:14] ok great
[09:04:05] elukey: I'm gonna start by replicating the base hdfs roots we need with correct ownership /perms (/wmf /wmf/data etc)
[09:05:07] elukey: I just created /wmf - I think we need to apply 022 umask on backup cluster as well :)
[09:09:34] ah I might have missed it since it was an old change, will do it in a second
[09:09:40] ;)
[09:17:17] (PS5) WMDE-Fisch: Update schema with core bucket labels [schemas/event/secondary] - https://gerrit.wikimedia.org/r/656901 (https://phabricator.wikimedia.org/T269986)
[09:17:25] Analytics: Make sure pageview API limits are defined and well documented - https://phabricator.wikimedia.org/T261681 (JAllemandou)
[09:18:29] (CR) jerkins-bot: [V: -1] Update schema with core bucket labels [schemas/event/secondary] - https://gerrit.wikimedia.org/r/656901 (https://phabricator.wikimedia.org/T269986) (owner: WMDE-Fisch)
[09:21:42] (CR) WMDE-Fisch: "recheck" [schemas/event/secondary] - https://gerrit.wikimedia.org/r/656901 (https://phabricator.wikimedia.org/T269986) (owner: WMDE-Fisch)
[09:21:50] joal: change deployed, I didn't restart the namenodes yet, maybe we can try if it is needed?
[09:21:59] I can check elukey
[09:23:19] elukey: all good!
[09:23:22] super :)
[09:23:27] Thanks
[09:23:36] thank you for reviewing my misses :D
[09:23:43] :)
[09:32:32] (CR) WMDE-Fisch: "I don't get that failing test. 😕" [schemas/event/secondary] - https://gerrit.wikimedia.org/r/656901 (https://phabricator.wikimedia.org/T269986) (owner: WMDE-Fisch)
[09:32:58] elukey: about distcp load - I'm thinking of using 128 mappers - My thinking is: the throttling will be writing (less writing nodes than reading nodes), and the limit in writing nodes is the number of disks (12) - So max 16 hosts * 12 disks: 192, with some space to not overwhelm
[09:33:08] How do you feel about that elukey --^ ?
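The mapper sizing proposed above can be checked with a little arithmetic. This is only a sketch: the host, disk, and mapper counts are the ones quoted in the conversation, and the variable names are made up for illustration.

```python
# Back-of-the-envelope check of the distcp mapper count discussed above.
# All figures are quoted from the chat; the names are illustrative only.
backup_hosts = 16       # "16 at the moment, including the masters"
disks_per_host = 12     # "12 4TB disks"
proposed_mappers = 128  # "I'm thinking of using 128 mappers"

# Upper bound on concurrent writers if each disk serves one mapper.
max_parallel_writers = backup_hosts * disks_per_host

# Fraction of that bound deliberately left unused.
headroom = 1 - proposed_mappers / max_parallel_writers

print(max_parallel_writers)
print(f"{headroom:.0%}")
```

With these numbers the proposal uses two thirds of the theoretical write parallelism, which matches the stated intent of leaving some space.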
[09:33:20] sure makes sense
[09:33:40] let's monitor if it is too aggressive or not
[09:34:21] for sure!
[09:37:40] elukey: starting with a super-small dataset, to check functionally correct commands and setting
[09:37:45] super
[09:40:00] * elukey bbiab!
[09:51:08] (CR) Awight: "> Patch Set 5:" [schemas/event/secondary] - https://gerrit.wikimedia.org/r/656901 (https://phabricator.wikimedia.org/T269986) (owner: WMDE-Fisch)
[09:51:21] (CR) Awight: [C: +1] "Change looks good." [schemas/event/secondary] - https://gerrit.wikimedia.org/r/656901 (https://phabricator.wikimedia.org/T269986) (owner: WMDE-Fisch)
[09:57:01] !log Changing ownership of archive WMF files to analytics:analytics-privatedata-users after update of oozie jobs
[09:57:06] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[10:17:36] Analytics, Patch-For-Review: Remove support for the (deprecated) Druid datasources (in favor of Druid Tables) on Superset - https://phabricator.wikimedia.org/T263972 (elukey) Added https://wikitech.wikimedia.org/wiki/Analytics/Systems/Superset#Migrate_a_chart_to_Druid_tables
[10:18:48] !log restart superset to remove druid datasources support - T263972
[10:18:52] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[10:18:52] T263972: Remove support for the (deprecated) Druid datasources (in favor of Druid Tables) on Superset - https://phabricator.wikimedia.org/T263972
[10:20:50] !log restart memcached on an-tool1010 to flush superset's cache
[10:20:52] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[10:23:32] joal: I am going to send an email but I just disabled the Druid datasources support in Superset, that is the old pydruid-like way of defining things
[10:23:40] to force people to only use Druid tables
[10:23:54] all the charts are rendering fine afaics
[10:24:15] but clicking on old datasource names will lead to a 404 of course
[10:27:54] ack elukey - thanks for that - IIRC this is in preparation for moving to superset 1.0 right?
[10:28:28] joal: yes exactly, even if in theory they should still support the old defs, but the code is not maintained anymore :(
[10:28:39] I'll send an email to announce and inform PA
[10:28:42] ack
[10:31:17] elukey: functional code works great - Starting a heavier copy (32Tb)
[10:32:31] joal: ah you are copying via scala?
[10:32:42] nonono
[10:33:02] ok sorry distcp, I misunderstood
[10:33:07] elukey: distcp command works as expected functionally
[10:33:09] yessir
[10:33:12] I thought you were writing code like crazy
[10:33:15] :)
[10:41:15] ok elukey - job started with 7M files to copy
[10:42:20] elukey: also, shall we merge https://gerrit.wikimedia.org/r/c/operations/puppet/+/656376?
[10:44:12] (CR) Joal: "Ping @milimetric for the header-line detail - then merge!" [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/656166 (https://phabricator.wikimedia.org/T271571) (owner: Milimetric)
[10:47:38] joal: merging
[10:51:23] joal: deployed
[10:51:30] \o/
[10:56:23] (CR) Joal: "An idea about sorting and a demand for precision in writing. I also agree with Dan's comments :) Thanks a lot @LexNasser :)" (3 comments) [analytics/aqs] - https://gerrit.wikimedia.org/r/657228 (https://phabricator.wikimedia.org/T207171) (owner: Lex Nasser)
[11:01:35] Amir1: o/ do you have a min?
[11:01:44] elukey: sure, what's up
[11:02:50] hello :) so I noticed https://phabricator.wikimedia.org/T257118#6632388 only last week since Lex (our intern) was trying to use AQS in deployment-prep
[11:03:21] we are thinking about re-creating it somewhere for testing, do you think it is better not to use deployment-prep anymore?
[11:04:03] yeah, beta cluster is a mess
[11:04:26] it would be great if you get a dedicated project. Similar to what we did with mailman
[11:04:33] Analytics, Analytics-Kanban, Patch-For-Review: Follow up on hdfs:///tmp perms issues after umask change on HDFS - https://phabricator.wikimedia.org/T271560 (JAllemandou)
[11:05:32] Amir1: yep yep we have our Analytics one, will use it, thanks :)
[11:06:02] (unless you need direct access to beta cluster's mediawiki's database or something like that)
[11:06:14] nono
[11:06:51] joal: an-worker1137 started to fill up disks :D (backup cluster)
[11:07:03] hehe :)
[11:07:14] (CR) Svantje Lilienthal: "> Patch Set 1:" (6 comments) [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/657362 (https://phabricator.wikimedia.org/T271902) (owner: Svantje Lilienthal)
[11:07:24] elukey: in an ok way I hope?
[11:08:01] well it alarmed in #operations
[11:08:05] so we crossed the 90% mark
[11:08:16] Ah - crap :(
[11:08:22] How come :(
[11:08:40] ah snap /tmp!
[11:08:48] UH?
[11:09:01] elukey@an-worker1137:/tmp$ sudo du -hs *
[11:09:01] 4.0K disable_learn
[11:09:01] 39G hadoop-hdfs
[11:09:07] we have a ~50G root
[11:09:36] elukey: so distcp uses /tmp to store its files?
[11:09:40] or maybe logs?
[11:09:54] I see stuff like
[11:09:55] ./BP-1041722270-10.64.5.7-1611317608813/current/finalized/subdir1/subdir29/blk_1073823027
[11:10:21] looks like distcp uses /rmp
[11:10:23] looks like distcp uses /tmp
[11:10:24] MEH
[11:11:05] it is a problem since it fills up the root :(
[11:11:14] there is space in the lvm so I can expand it in theory
[11:11:14] of course
[11:11:20] but it is not infinite
[11:11:28] elukey: I'm gonna kill the job now
[11:12:17] elukey: done - distcp job killed
[11:12:29] elukey: you can drop from /tmp
[11:12:50] elukey: actually, you may drop /tmp/hadoop-hdfs on all nodes
[11:13:12] sure
[11:14:30] done
[11:14:41] Wait, didn't we talk about something like this last week?
[11:14:54] Like (under hdfs) data_something?
[11:15:03] hello :)
[11:15:22] klausman: can you give us more info?
[11:17:22] In a meeting last week, we talked about using a different directory for (IIRC) copies/data storage instead of piling it into the hdfs root
[11:17:56] It's likely entirely unrelated beyond "it happened on analytics machines", but the coincidence is tickling my apophenia
[11:18:24] ahhhhh
[11:18:51] yes we discussed about the tmp directory names for druid indexations
[11:19:05] that were /tmp_something vs /wmf/tmp/something
[11:19:13] Yeah, that
[11:19:38] okok now I get it, in this case we are copying data from hadoop analytics to hadoop backup
[11:19:57] right, and the copying tool uses /tmp
[11:20:00] and the tool used (distcp) seems to copy data on the root tmp
[11:20:10] on the host itself, on the target node
[11:20:22] filling it up in few mins (since it is ~50G)
[11:20:36] so Joseph I think is trying to figure out how to tune it
[11:20:41] Yeah :-/ We can't even fix this by using /var/tmp, since that likely is the same fs
[11:21:14] yep.. and we have space in the LVS volumes, but not enough for the general use case
[11:21:20] Could the copies be split up? Something like cp a* target && cp b* target? Instead of cp * target
[11:21:43] I know I'm being POSIXy here :)
[11:21:46] ahahahah
[11:22:27] so distcp uses multiple mappers to fetch the data, I hope that there is a way to use one of the datanode disk partitions (with 4TB each) for these things
[11:22:35] otherwise it is really a little cumbersome to use
[11:22:44] (Joseph knows better though)
[11:24:33] It might be that /tmp is just a default somewhere, yeah. Again speaking from POSIX POV, programs *usually* honor the TMPDIR environment var
[11:26:34] e.g. from mktemp(1):
[11:26:35] -p DIR, --tmpdir[=DIR]
[11:26:37] interpret TEMPLATE relative to DIR; if DIR is not specified, use $TMPDIR if set, else /tmp. With this option, TEMPLATE must not be an absolute name; unlike with -t, TEMPLATE may contain slashes, but mktemp creates only the final component.
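The TMPDIR convention mentioned above can be illustrated with Python's standard `tempfile` module, which consults the `TMPDIR` environment variable first when choosing its default directory. This is only a sketch of the general POSIX habit: nothing here implies that distcp or Hadoop honor the variable, and the scratch directory is a hypothetical stand-in for a larger data partition.

```python
import os
import tempfile

# Hypothetical stand-in for a roomier partition; created so the sketch runs.
scratch_dir = tempfile.mkdtemp(prefix="bigger-partition-")

# Point TMPDIR at it and clear tempfile's cached default so it is re-read.
os.environ["TMPDIR"] = scratch_dir
tempfile.tempdir = None

chosen_dir = tempfile.gettempdir()
print(chosen_dir)

# New temporary files are now created under the scratch directory.
with tempfile.NamedTemporaryFile() as handle:
    temp_file_dir = os.path.dirname(handle.name)
    print(temp_file_dir)
```

Resetting `tempfile.tempdir` matters only because the module caches its choice on first use; a freshly started process would pick up `TMPDIR` without it.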
[11:28:29] Analytics-Radar, SRE, Wikimedia-Logstash, observability, Performance-Team (Radar): Retire udp2log: onboard its producers and consumers to the logging pipeline - https://phabricator.wikimedia.org/T205856 (fgiunchedi)
[11:29:16] klausman: yes I am pretty sure that there is a way to tune this, we are not the first ones to copy TBs of data
[11:31:29] klausman: it is weird that the default is so brittle
[11:31:49] elukey: could it be that nodes got started with no values for dirs?
[11:32:04] joal: what do you mean?
[11:32:06] elukey: cause there is no reference to /tmp anywhere in distcp
[11:32:17] In hadoop conf I mean
[11:32:35] the only one that I see is using -atomic, or one person mentioning a copy buffer for S3
[11:32:45] we do not use atomic elukey
[11:33:15] yep
[11:33:55] joal: what do you mean with "with no values for dirs" ?
[11:37:15] I see a lot of people having the same issue but from HDFS to S3
[11:37:15] elukey: /tmp/hadoop-$USER folders are typically created when using hadoop.tmp.dir
[11:37:31] and they tuned fs.s3a.buffer.dir
[11:38:20] afaics we don't define it anywhere in puppet
[11:38:39] and I checked in /etc on the node
[11:38:41] nothing
[11:38:46] yup
[11:39:03] But, our data-dirs for both HDFS and YARN are explicitly defined
[11:40:18] This is why I asked about node config elukey -
[11:40:29] elukey: batcave for a minute (should be easier)?
[11:40:55] joal: yes sure but "no values for dirs" was a little generic :D
[11:41:19] indeed - I should have been more precise :)
[11:53:32] klausman: we found the problem, wrong node configured by me
[11:53:44] so it turns out I am the issue
[11:53:47] :D
[11:54:09] Well well, a self-fixing issue. I like those.
[11:54:14] :)
[11:54:21] Thanks for chiming in klausman :)
[11:57:24] (CR) Awight: added editor type preferences (1 comment) [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/657362 (https://phabricator.wikimedia.org/T271902) (owner: Svantje Lilienthal)
[12:03:39] (CR) Awight: "Very minor fix-ups, comments inline." (3 comments) [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/657362 (https://phabricator.wikimedia.org/T271902) (owner: Svantje Lilienthal)
[12:25:17] !log Copy /wmf/data/archive to backup cluster (32Tb) - T272846
[12:25:20] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[12:25:20] T272846: Backup HDFS data before BigTop upgrade - https://phabricator.wikimedia.org/T272846
[12:29:21] Analytics, Analytics-Kanban: Backup HDFS data before BigTop upgrade - https://phabricator.wikimedia.org/T272846 (JAllemandou)
[12:30:47] Analytics, Analytics-Kanban: Backup HDFS data before BigTop upgrade - https://phabricator.wikimedia.org/T272846 (JAllemandou) a:JAllemandou
[12:31:11] (PS6) WMDE-Fisch: Update schema with core bucket labels [schemas/event/secondary] - https://gerrit.wikimedia.org/r/656901 (https://phabricator.wikimedia.org/T269986)
[12:33:11] Analytics, Analytics-Kanban: Backup HDFS data before BigTop upgrade - https://phabricator.wikimedia.org/T272846 (JAllemandou)
[12:33:13] Analytics-Clusters: Upgrade the Hadoop Analytics cluster to BigTop - https://phabricator.wikimedia.org/T255142 (JAllemandou)
[12:34:47] (CR) jerkins-bot: [V: -1] Update schema with core bucket labels [schemas/event/secondary] - https://gerrit.wikimedia.org/r/656901 (https://phabricator.wikimedia.org/T269986) (owner: WMDE-Fisch)
[12:46:39] * elukey lunch!
[13:00:10] (CR) Joal: "I finally got to that - Sorry for the delay @LexNasser. A bunch of things, nothing major." (13 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/654924 (https://phabricator.wikimedia.org/T207171) (owner: Lex Nasser)
[13:14:09] (PS7) WMDE-Fisch: Update schema with core bucket labels [schemas/event/secondary] - https://gerrit.wikimedia.org/r/656901 (https://phabricator.wikimedia.org/T269986)
[13:15:30] (CR) jerkins-bot: [V: -1] Update schema with core bucket labels [schemas/event/secondary] - https://gerrit.wikimedia.org/r/656901 (https://phabricator.wikimedia.org/T269986) (owner: WMDE-Fisch)
[13:45:15] good morning team
[14:08:25] Hi fdans :)
[14:17:23] (CR) Ottomata: "> Patch Set 5:" [schemas/event/secondary] - https://gerrit.wikimedia.org/r/656901 (https://phabricator.wikimedia.org/T269986) (owner: WMDE-Fisch)
[14:21:24] (CR) Awight: [C: +1] "Thanks! We also realized that the failure is also in analytics/legacy/test, https://gerrit.wikimedia.org/r/plugins/gitiles/schemas/event/" [schemas/event/secondary] - https://gerrit.wikimedia.org/r/656901 (https://phabricator.wikimedia.org/T269986) (owner: WMDE-Fisch)
[14:22:22] (CR) Ottomata: "Oh, and this seems to be a problem with tests in CI, not locally (at least for me). If tests pass for you locally, you can skip Jenkins Ve" [schemas/event/secondary] - https://gerrit.wikimedia.org/r/656901 (https://phabricator.wikimedia.org/T269986) (owner: WMDE-Fisch)
[14:34:55] Analytics, WMDE-Templates-FocusArea, MW-1.36-notes (1.36.0-wmf.26; 2021-01-12), Patch-For-Review, WMDE-TechWish (Sprint-2021-01-20): Add edit count bucketing to all metrics - https://phabricator.wikimedia.org/T269986 (Ottomata)
[14:35:25] Analytics, WMDE-Templates-FocusArea, MW-1.36-notes (1.36.0-wmf.26; 2021-01-12), Patch-For-Review, WMDE-TechWish (Sprint-2021-01-20): Add edit count bucketing to all metrics - https://phabricator.wikimedia.org/T269986 (Ottomata) @awight seeking feedback from Analytics, perhaps @mforns has some...
[14:37:56] <- groceries
[14:38:06] Analytics, Analytics-EventLogging, Analytics-Kanban, Event-Platform, and 4 others: Migrate legacy metawiki schemas to Event Platform - https://phabricator.wikimedia.org/T259163 (Ottomata)
[14:38:16] Analytics, Event-Platform, Language-analytics, MW-1.36-notes (1.36.0-wmf.27; 2021-01-19), Patch-For-Review: UniversalLanguageSelector Event Platform Migration - https://phabricator.wikimedia.org/T267352 (Ottomata)
[14:40:00] (CR) Awight: [C: +1] "> Patch Set 7:" [schemas/event/secondary] - https://gerrit.wikimedia.org/r/656901 (https://phabricator.wikimedia.org/T269986) (owner: WMDE-Fisch)
[14:41:46] ebernhardson: FYI, deployed the eventgate-main image; your schema should be avail now
[14:46:19] Analytics, Analytics-EventLogging, CSS: Schema code samples popup appears under the JSON table - https://phabricator.wikimedia.org/T272857 (He7d3r)
[14:47:58] mforns: we still need to wait on PHP migrations
[14:48:08] it looks like the code is serializing booleans as integers in json
[14:48:09] :(
[14:48:31] hellooo team!
[14:48:39] ottomata: hi! oh, ok
[14:49:09] will work on the other schemas, no problemo
[14:49:38] Hi mforns
[14:49:45] hey joal
[14:50:35] mforns: Wanted to let you know that I've changed archived files ownership this morning, as I started to copy data to the backup cluster
[14:50:46] mforns: no need to do it twice :)
[14:51:14] :/ sorry
[14:51:23] no worry :)
[14:51:32] I prioritized other stuff on friday
[14:59:38] Analytics, ci-test-error: Failing CI: schema compatibility test in analytics/legacy/test - https://phabricator.wikimedia.org/T272861 (awight)
[14:59:51] (CR) Awight: [C: +1] "Made a task about this T272861 and pasted my observations there." [schemas/event/secondary] - https://gerrit.wikimedia.org/r/656901 (https://phabricator.wikimedia.org/T269986) (owner: WMDE-Fisch)
[15:00:56] joal: last diff for Bigtop 3.x
[15:00:58] Components in v1.5 in v3.0
[15:00:58] alluxio 1.8.2 => 2.4.1
[15:01:02] \o/
[15:01:20] Awesome elukey :)
[15:03:24] Analytics, Better Use Of Data, Event-Platform, Product-Infrastructure-Data, and 2 others: Set up an instance of EventStreams in beta that will allow for consuming any stream - https://phabricator.wikimedia.org/T253069 (Ottomata) Open→Resolved
[15:03:29] Analytics, Better Use Of Data, Event-Platform, Product-Infrastructure-Data, Product-Infrastructure-Team-Backlog: Develop test environment solution for MEP analytics events - https://phabricator.wikimedia.org/T238837 (Ottomata)
[15:08:08] Analytics, Better Use Of Data, Event-Platform, Product-Infrastructure-Data, MW-1.36-notes (1.36.0-wmf.27; 2021-01-19): EventLogging PHP EventServiceClient should use EventBus->send(). - https://phabricator.wikimedia.org/T272863 (Ottomata)
[15:08:58] mforns: I'm going to focus on ^ hard.
[15:09:55] so i'm not going to start any more migrations (including the Performance team ones) right now. if you get through the ones you are working on, lemme know and...maybe you can take mine? :)
[15:11:11] yes ottomata, no problem by me, just will see if I have enough bandwidth given ops week
[15:11:59] Analytics, ci-test-error: Failing CI: schema compatibility test in analytics/legacy/test - https://phabricator.wikimedia.org/T272861 (Ottomata) What is strange is that 1.0.0 does not need to be compatible with 1.2.0, that is wrong. I think something is incorrectly sorting the schema versions on the CI se...
[15:12:43] Analytics, ci-test-error: Failing CI: schema compatibility test in analytics/legacy/test - https://phabricator.wikimedia.org/T272861 (Ottomata) @awight you can repro this locally? I cannot!
[15:13:36] Analytics, ci-test-error: Failing CI: schema compatibility test in analytics/legacy/test - https://phabricator.wikimedia.org/T272861 (Ottomata) I get: ` analytics/legacy/test Major Version 1 ✓ 1.1.0 must be compatible with 1.0.0 ✓ 1.2.0 must be compatible with 1.1.0 `
[15:13:44] mforns: right sounds good
[15:13:47] no hurry on them of course
[15:13:52] ottomata: o/
[15:13:55] hello!
[15:14:04] I am going to look in a bit about adding the TLS certs for es-internal
[15:14:18] I hope to finish it this week, thanks for the patience
[15:14:23] but it is a good learning experience :)
[15:14:51] (CR) Awight: [C: +1] "More good news about this patch: in I9759381ce2 we will consolidate to one very performant query, which can process (very) roughly 500k us" [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/655886 (https://phabricator.wikimedia.org/T271894) (owner: WMDE-Fisch)
[15:17:39] Analytics, ci-test-error: Failing CI: schema compatibility test in analytics/legacy/test - https://phabricator.wikimedia.org/T272861 (awight) >>! In T272861#6773655, @Ottomata wrote: > @awight you can repro this locally? I cannot! I can, and it starts very clearly at the patch linked above, using git b...
[15:26:07] Analytics, ci-test-error: Failing CI: schema compatibility test in analytics/legacy/test - https://phabricator.wikimedia.org/T272861 (awight) >>! In T272861#6773670, @Ottomata wrote: > I get: > > ` > analytics/legacy/test > Major Version 1 > ✓ 1.1.0 must be compatible with 1.0.0 >...
[15:29:47] Analytics, ci-test-error: Failing CI: schema compatibility test in analytics/legacy/test - https://phabricator.wikimedia.org/T272861 (Ottomata) Ok, can you run this in a Node repl from your schemas/event/secondary clone and paste the output? `lang=javascript const jsonschemaTools = require('@wikimedia/j...
[15:29:57] Analytics-Radar, Better Use Of Data, Instrument-ClientError, Wikimedia-Logstash, and 3 others: Documentation of client side error logging capabilities on mediawiki - https://phabricator.wikimedia.org/T248884 (fgiunchedi)
[15:36:55] ottomata: thanks! testing
[15:38:33] Analytics, ci-test-error: Failing CI: schema compatibility test in analytics/legacy/test - https://phabricator.wikimedia.org/T272861 (awight) >>! In T272861#6773696, @Ottomata wrote: > Ok, can you run this in a Node repl from your schemas/event/secondary clone and paste the output? `lang=javascript [ '1...
[15:45:28] Analytics, ci-test-error: Failing CI: schema compatibility test in analytics/legacy/test - https://phabricator.wikimedia.org/T272861 (Ottomata) > it starts very clearly at the patch linked above And if you checkout a commit before 778a881737a1f9c845795cb41bc00200ed81b989, is the order returned different?
[15:45:55] (PS1) Fdans: Add monthly pageview complete job [analytics/refinery] - https://gerrit.wikimedia.org/r/658348 (https://phabricator.wikimedia.org/T265732)
[15:46:03] Analytics, Analytics-Kanban, Patch-For-Review: Create monthly job for canonical pageviews - https://phabricator.wikimedia.org/T265732 (fdans) a:fdans
[15:53:37] Analytics, ci-test-error: Failing CI: schema compatibility test in analytics/legacy/test - https://phabricator.wikimedia.org/T272861 (awight) >>! In T272861#6773727, @Ottomata wrote: >> it starts very clearly at the patch linked above > > And if you checkout a commit before 778a881737a1f9c845795cb41bc00...
[15:58:23] Analytics, ci-test-error: Failing CI: schema compatibility test in analytics/legacy/test - https://phabricator.wikimedia.org/T272861 (Ottomata) Interesting. I have the same lodash version (but I'm running on MacOS). In lib/jsonschema-tools.js on Line 996 in `groupSchemasByTitleAndMajor`, what happens if...
[16:01:49] fdans: standup?
[16:02:35] razzi
[16:09:43] Analytics, ci-test-error: Failing CI: schema compatibility test in analytics/legacy/test - https://phabricator.wikimedia.org/T272861 (Mholloway) The compatibility tests assume that schemaInfo objects will be sorted by version number when iterating through them to check backwards compatibility. But there...
[16:21:21] !log restart yarn and hdfs daemon on analytics1058 (canary node for new openjdk)
[16:21:32] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[16:22:58] !log drain+restart cassandra on aqs1004 to pick up the new openjdk (canary)
[16:23:01] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[16:27:02] Analytics, ci-test-error: Failing CI: schema compatibility test in analytics/legacy/test - https://phabricator.wikimedia.org/T272861 (Mholloway) Actually, it's more than just the common stuff. Commenting out that part fixes this test case, but it looks like there's still other weird stuff going on on th...
[16:36:59] Analytics, ci-test-error: Failing CI: schema compatibility test in analytics/legacy/test - https://phabricator.wikimedia.org/T272861 (awight) `_.groupBy` returns an object so that exact patch didn't work, but I get the idea. Unfortunately, lodash doesn't support a custom comparator. Here's an attempted...
[16:41:20] Analytics-Clusters, SRE, vm-requests: Eq: new Druid test VM for analytics - https://phabricator.wikimedia.org/T266771 (Ottomata)
[16:43:37] Analytics: Add time interval limits to pageview API - https://phabricator.wikimedia.org/T261681 (fdans)
[16:45:28] Analytics: Add time interval limits to pageview API - https://phabricator.wikimedia.org/T261681 (fdans) a:Milimetric→lexnasser
[16:50:25] Analytics-Clusters: Improve logging for HDFS Namenodes - https://phabricator.wikimedia.org/T265126 (Ottomata) After ops-sync today here's what we want. We want to put Hadoop logs on their own LVM partition. - Create a new LVM partition and mount it at /var/log/hadoop - Symlink /var/log/hadoop-* into /var/l...
[16:54:37] Analytics-Radar, Add-Link, Growth-Structured-Tasks, Growth-Team (Current Sprint), Patch-For-Review: Add Link engineering: Pipeline for moving MySQL database(s) from stats1008 to production MySQL server - https://phabricator.wikimedia.org/T266826 (fdans)
[16:57:07] Analytics: Increase segment replication factor on Druid Public to 3 - https://phabricator.wikimedia.org/T272670 (fdans) Open→Resolved a:fdans
[17:00:24] Analytics-Clusters: Re-create deployment-aqs cluster - https://phabricator.wikimedia.org/T272722 (fdans)
[17:03:31] Analytics, Analytics-Kanban, Product-Analytics: Superset presto error: Failed to list directory: hdfs://analytics-hadoop/wmf/data/event/... - https://phabricator.wikimedia.org/T272741 (fdans) a:mforns
[17:05:41] elukey: triple checking something if you have a minute
[17:05:42] Analytics-Radar, WMDE-Templates-FocusArea, MW-1.36-notes (1.36.0-wmf.26; 2021-01-12), Patch-For-Review, WMDE-TechWish (Sprint-2021-01-20): Add edit count bucketing to all metrics - https://phabricator.wikimedia.org/T269986 (fdans)
[17:06:00] elukey: all users available in the main cluster are also in backup, right?
[17:07:00] joal: nope, only ours for the moment
[17:07:28] we might want to have analytics-privatedata-users though
[17:07:44] for the moment only the system ones and analytics-admins
[17:07:44] elukey: analytics-privatedata-users group is usable
[17:08:13] Analytics, Analytics-Kanban: Backup HDFS data before BigTop upgrade - https://phabricator.wikimedia.org/T272846 (fdans) p:Triage→High
[17:08:19] elukey: I'm asking for users to be able to copy /user (with plenty different user perms
[17:08:55] elukey: maybe we don't even need the users to be created on system?
[17:10:00] Analytics-EventLogging, Analytics-Radar, CSS: Schema code samples popup appears under the JSON table - https://phabricator.wikimedia.org/T272857 (fdans)
[17:10:13] joal: yes it is but it is not deployed on the namenode, and probably hdfs accepts any unknown/new groups
[17:10:48] I'll deploy analytics-privatedata-users to the master after the meeting so it will be more consistent
[17:10:54] is it needed now ?
[17:11:09] Analytics, ci-test-error: Failing CI: schema compatibility test in analytics/legacy/test - https://phabricator.wikimedia.org/T272861 (Ottomata) @awight what about: `lang=javascript function groupSchemasByTitleAndMajor(schemaInfos) { const schemaInfosByTitle = groupSchemasByTitle(schemaInfos);...
[17:12:28] elukey: not really - I used it already and it worked
[17:12:53] elukey: I'm gonna do it without users existing in system - if hadoop accepts that, let's go!
[17:13:34] !log Copy /user to backup cluster (92Tb) - T272846
[17:13:36] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[17:13:36] T272846: Backup HDFS data before BigTop upgrade - https://phabricator.wikimedia.org/T272846
[17:15:31] Analytics, ci-test-error: Failing CI: schema compatibility test in analytics/legacy/test - https://phabricator.wikimedia.org/T272861 (fdans) p:Triage→High
[17:15:32] joal: the thing that may happen, I think, is that the namenode will assign random gid, but let's see how it goes
[17:15:41] hm
[17:15:46] not nice :S
[17:15:59] elukey: let's triple check now (don't know how though)
[17:16:01] I think not sure, maybe let's copy some data and see
[17:16:24] (currently in a meeting but I'll be able to check it in a bit)
[17:16:39] sure, I'll be there elukey
[17:16:59] <3
[17:24:08] Analytics, ci-test-error: Failing CI: schema compatibility test in analytics/legacy/test - https://phabricator.wikimedia.org/T272861 (awight) This was the problem, it fixes my local test failures: https://github.com/wikimedia/jsonschema-tools/pull/25
[17:35:41] joal: I have https://gerrit.wikimedia.org/r/c/operations/puppet/+/658394 ready
[17:38:55] Analytics, ci-test-error: Failing CI: schema compatibility test in analytics/legacy/test - https://phabricator.wikimedia.org/T272861 (Ottomata) NIIICE THANK YOU FOR FINDING THAT!
[17:44:54] Analytics, ci-test-error: Failing CI: schema compatibility test in analytics/legacy/test - https://phabricator.wikimedia.org/T272861 (Ottomata) Published as version 0.9.0 on npm.
[17:45:14] joal: deployed!
[17:52:53] mforns: just FYI, awight fixed that schema CI issue we had
[17:52:54] https://phabricator.wikimedia.org/T272861
[17:54:46] ottomata: great, yes, we read that in grosking
[17:56:02] elukey: please excuse me I lied - I was not there :S
[17:57:34] ottomata: that was tricky to find
[18:00:49] indeed!
[18:01:43] joal: Here are the NOT IN and NOT EXISTS explains. NOT EXISTS and LEFT JOIN are optimized to the same execution plan https://usercontent.irccloud-cdn.com/file/Z6wOX0II/not_exists.txt https://usercontent.irccloud-cdn.com/file/8vQZLXIY/not_in.txt
[18:02:18] And here are the benchmarks I ran, two of each type: ```NOT IN:
[18:02:18] Total MapReduce CPU Time Spent: 1 days 13 hours 1 minutes 39 seconds 750 msec
[18:02:19] Time taken: 508.684 seconds, Fetched: 261 row(s)
[18:02:19] Total MapReduce CPU Time Spent: 1 days 12 hours 3 minutes 44 seconds 610 msec
[18:02:19] Time taken: 654.88 seconds, Fetched: 261 row(s)
[18:02:20] NOT EXISTS:
[18:02:21] Total MapReduce CPU Time Spent: 1 days 5 hours 5 minutes 1 seconds 640 msec
[18:02:21] Time taken: 264.204 seconds, Fetched: 261 row(s)
[18:02:22] MapReduce Total cumulative CPU time: 0 days 14 hours 29 minutes 6 seconds 610 msec
[18:02:23] Time taken: 354.249 seconds, Fetched: 261 row(s)```
[18:03:07] Seems like NOT EXISTS or LEFT JOIN is the way to go... if you agree, do you think one's better than the other?
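The wall-clock figures quoted above can be summarized with a quick calculation. This is a rough sketch only: it uses just the four "Time taken" values from the benchmark, and with two runs per variant the result is an indication, not a measurement with any statistical weight.

```python
from statistics import mean

# "Time taken" values in seconds, as quoted in the chat (two runs each).
not_in_runs = [508.684, 654.88]
not_exists_runs = [264.204, 354.249]

not_in_avg = mean(not_in_runs)
not_exists_avg = mean(not_exists_runs)

# How many times faster NOT EXISTS ran on average in these two samples.
speedup = not_in_avg / not_exists_avg

print(f"NOT IN avg:     {not_in_avg:.1f}s")
print(f"NOT EXISTS avg: {not_exists_avg:.1f}s")
print(f"speedup:        {speedup:.2f}x")
```

On these samples NOT EXISTS finishes in a little over half the time of NOT IN, which is consistent with the conclusion drawn in the conversation.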
[18:05:06] * razzi afk for lunch [18:10:09] lexnasser: You can go for NOT EXISTS - It'd be great if you can write a comment explaining that you've checked the plan and that execution is a map-join :) [18:10:52] joal: Will do, thanks! [18:11:01] Thanks for the check lexnasser ) [18:15:39] (03CR) 10Awight: [C: 03+2] "recheck" [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/656901 (https://phabricator.wikimedia.org/T269986) (owner: 10WMDE-Fisch) [18:17:26] (03PS8) 10Awight: Update schema with core bucket labels [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/656901 (https://phabricator.wikimedia.org/T269986) (owner: 10WMDE-Fisch) [18:17:54] (03CR) 10Lex Nasser: Create and configure Oozie job to load 'Top Articles by Country Pageviews API' data into Cassandra (0312 comments) [analytics/refinery] - 10https://gerrit.wikimedia.org/r/654924 (https://phabricator.wikimedia.org/T207171) (owner: 10Lex Nasser) [18:18:03] (03CR) 10jerkins-bot: [V: 04-1] Update schema with core bucket labels [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/656901 (https://phabricator.wikimedia.org/T269986) (owner: 10WMDE-Fisch) [18:26:25] going to have dinner earlier today, will check later on if I am needed :) [18:49:03] ottomata: I'm going to start migrating some low-traffic kafka partitions now if you're cool with that [18:49:13] yeah! [18:49:23] go for it! [18:49:33] use the ol !log :) [18:49:44] ottomata: sounds good [18:53:40] !log rebalance kafka partitions for eqiad.mediawiki.job.ChangeDeletionNotification [18:53:42] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [18:55:57] (03PS7) 10Neil P. 
Quinn-WMF: Set up and document deployment strategy for jobs [analytics/wmf-product/jobs] - 10https://gerrit.wikimedia.org/r/651794 (https://phabricator.wikimedia.org/T261953) [18:56:15] 10Analytics-Clusters: Balance Kafka topic partitions on Kafka Jumbo to take advantage of the new brokers - https://phabricator.wikimedia.org/T255973 (10razzi) [18:58:08] !log rebalance kafka partitions for eventlogging_ExternalGuidance [18:58:09] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [19:00:31] 10Analytics-Clusters, 10Analytics-Kanban: Deprecate the anaytics-users POSIX group - https://phabricator.wikimedia.org/T269150 (10fdans) 05Open→03Resolved [19:00:33] 10Analytics, 10Analytics-Kanban: Analytics Ops Technical Debt - https://phabricator.wikimedia.org/T240437 (10fdans) [19:00:35] 10Analytics, 10Analytics-Kanban, 10Analytics-Wikistats: Static vertical footer position ignores list length, overlaps with list, makes lists on Wikistats unreadable - https://phabricator.wikimedia.org/T267467 (10fdans) 05Open→03Resolved [19:00:38] 10Analytics, 10Analytics-Kanban: EventStreams UI - https://phabricator.wikimedia.org/T268255 (10fdans) 05Open→03Resolved [19:00:45] 10Analytics-EventLogging, 10Analytics-Kanban, 10Event-Platform, 10Goal, and 3 others: Modern Event Platform - https://phabricator.wikimedia.org/T185233 (10fdans) [19:00:49] 10Analytics, 10Analytics-Kanban: Improve webrequest-refine shuffle-sort - https://phabricator.wikimedia.org/T267008 (10fdans) 05Open→03Resolved [19:00:51] 10Analytics, 10Analytics-Kanban: Add caching for maxmind functions used on cluster - https://phabricator.wikimedia.org/T267009 (10fdans) 05Open→03Resolved [19:00:59] 10Analytics-Clusters: Review recurrent Hadoop worker disk saturation events - https://phabricator.wikimedia.org/T265487 (10fdans) [19:01:01] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Investigate oozie banner monthly job timeouts - https://phabricator.wikimedia.org/T264358 (10fdans) 
05Open→03Resolved [19:01:03] 10Analytics, 10Analytics-Kanban, 10Product-Analytics, 10Structured-Data-Backlog, 10Patch-For-Review: Add image table to monthly sqoop list - https://phabricator.wikimedia.org/T266077 (10fdans) 05Open→03Resolved [19:01:05] 10Analytics, 10Analytics-Kanban: Make druid mediawiki-history-reduced segments smaller - https://phabricator.wikimedia.org/T268813 (10fdans) 05Open→03Resolved [19:01:07] 10Analytics, 10Analytics-Kanban: AQS pageview default caching is one day - https://phabricator.wikimedia.org/T268809 (10fdans) 05Open→03Resolved [19:01:09] 10Analytics, 10Analytics-Kanban: AQS should be more resilient to druid nodes not available - https://phabricator.wikimedia.org/T268811 (10fdans) 05Open→03Resolved [19:01:11] 10Analytics, 10Analytics-Kanban: Fix pageview title accepted values (trailing EOL) - https://phabricator.wikimedia.org/T268630 (10fdans) 05Open→03Resolved [19:01:13] 10Analytics, 10Analytics-Kanban: pageviews complete have irregular lines - https://phabricator.wikimedia.org/T267575 (10fdans) [19:01:18] 10Analytics, 10Analytics-Kanban: Analytics Ops Technical Debt - https://phabricator.wikimedia.org/T240437 (10fdans) [19:01:20] 10Analytics-Clusters, 10Analytics-Kanban: Refactor puppet profiles to reduce hiera pollution - https://phabricator.wikimedia.org/T268220 (10fdans) 05Open→03Resolved [19:01:22] 10Analytics, 10Analytics-Kanban: pageviews complete have irregular lines - https://phabricator.wikimedia.org/T267575 (10fdans) 05Open→03Resolved [19:01:24] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Move oozie's hive2 actions to analytics-hive.eqiad.wmnet - https://phabricator.wikimedia.org/T268028 (10fdans) 05Open→03Resolved [19:01:28] 10Analytics, 10Analytics-Kanban, 10Event-Platform, 10SRE: Reduce cache TTL of schema.wikimedia.org - https://phabricator.wikimedia.org/T267557 (10fdans) 05Open→03Resolved [19:01:30] 10Analytics-Clusters, 10Analytics-Kanban, 10Patch-For-Review: Review an-coord1001's usage and failover plans - 
https://phabricator.wikimedia.org/T257412 (10fdans) [19:01:32] 10Analytics, 10Analytics-Kanban: Refine should report about malformed records and continue if possible - https://phabricator.wikimedia.org/T266872 (10fdans) 05Open→03Resolved [19:01:34] 10Analytics, 10Analytics-Kanban: [Data quality stats] Add dsaez to receive traffic anomaly alarms - https://phabricator.wikimedia.org/T267356 (10fdans) 05Open→03Resolved [19:01:36] 10Analytics, 10Analytics-Kanban: Traffic anomaly alarms - https://phabricator.wikimedia.org/T267355 (10fdans) [19:01:38] 10Analytics, 10Analytics-Kanban, 10Event-Platform, 10Patch-For-Review: eventgate-analytics-external occasionally seems to fail lookups of dynamic stream config from MW EventStreamConfig API - https://phabricator.wikimedia.org/T266573 (10fdans) 05Open→03Resolved [19:01:40] 10Analytics, 10Analytics-Kanban, 10Event-Platform, 10Patch-For-Review: Make node-rdkafka an optional dependency of EventGate - https://phabricator.wikimedia.org/T266058 (10fdans) 05Open→03Resolved [19:01:42] 10Analytics, 10Analytics-Kanban, 10Event-Platform, 10MW-1.36-notes (1.36.0-wmf.20; 2020-12-01), 10Patch-For-Review: Instrumentation development environment on EventGate server - https://phabricator.wikimedia.org/T259202 (10fdans) [19:01:46] wow go fdans go! 
[19:01:46] 10Analytics, 10Analytics-Kanban, 10Product-Analytics: Add data quality alarm for mobile-app data - https://phabricator.wikimedia.org/T257692 (10fdans) 05Open→03Resolved [19:01:48] 10Analytics, 10Analytics-Kanban, 10Event-Platform, 10MW-1.36-notes (1.36.0-wmf.20; 2020-12-01), 10Patch-For-Review: Instrumentation development environment on EventGate server - https://phabricator.wikimedia.org/T259202 (10fdans) 05Open→03Resolved [19:01:50] 10Analytics, 10Analytics-Kanban, 10Dumps-Generation: Document missing project types in pagecount dumps - https://phabricator.wikimedia.org/T249984 (10fdans) 05Open→03Resolved [19:01:52] 10Analytics-Kanban: Neflow data pipeline - https://phabricator.wikimedia.org/T257554 (10fdans) [19:01:58] 10Analytics-EventLogging, 10Analytics-Kanban, 10Event-Platform, 10Goal, and 3 others: Modern Event Platform - https://phabricator.wikimedia.org/T185233 (10fdans) [19:02:00] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Migrate pagecounts-ez generation to hadoop - https://phabricator.wikimedia.org/T192474 (10fdans) [19:02:04] 10Analytics, 10Analytics-Kanban, 10SRE, 10netops, 10Patch-For-Review: Add more dimensions in the netflow/pmacct/Druid pipeline - https://phabricator.wikimedia.org/T254332 (10fdans) 05Open→03Resolved [19:02:06] 10Analytics, 10Analytics-Kanban: Evaluate possible replacements for Camus: Gobblin, Marmaray, Kafka Connect HDFS, etc. - https://phabricator.wikimedia.org/T238400 (10fdans) 05Open→03Resolved [19:02:08] 10Analytics, 10Analytics-Kanban: Address refinery-source security vulnerabilities - https://phabricator.wikimedia.org/T237774 (10fdans) 05Open→03Resolved [19:02:10] 10Analytics-Kanban, 10Analytics-Wikistats: Wikistats 2.0. 
- https://phabricator.wikimedia.org/T130256 (10fdans) [19:02:13] 10Analytics, 10Analytics-EventLogging, 10Event-Platform, 10Goal, 10Services (watching): Modern Event Platform: Stream Connectors - https://phabricator.wikimedia.org/T214430 (10fdans) [19:02:15] 10Analytics, 10Analytics-Kanban, 10Analytics-Wikistats: Wikistats 2.0: Add statistics for the geographical origin of the contributors - https://phabricator.wikimedia.org/T188859 (10fdans) 05Open→03Resolved [19:02:17] 10Analytics, 10Analytics-Kanban: Analytics Ops Technical Debt - https://phabricator.wikimedia.org/T240437 (10fdans) [19:02:19] 10Analytics, 10Analytics-Kanban, 10Event-Platform, 10Patch-For-Review: Refine should add field to indicate if event is from wikimedia domain instead of filtering - https://phabricator.wikimedia.org/T256677 (10fdans) 05Open→03Resolved [19:02:23] 10Analytics-Clusters, 10Analytics-Kanban, 10Patch-For-Review: Deprecate the 'researchers' posix group - https://phabricator.wikimedia.org/T268801 (10fdans) 05Open→03Resolved [19:02:52] (03CR) 10Joal: Create and configure Oozie job to load 'Top Articles by Country Pageviews API' data into Cassandra (033 comments) [analytics/refinery] - 10https://gerrit.wikimedia.org/r/654924 (https://phabricator.wikimedia.org/T207171) (owner: 10Lex Nasser) [19:02:57] (03PS8) 10Neil P. Quinn-WMF: Set up and document deployment strategy for jobs [analytics/wmf-product/jobs] - 10https://gerrit.wikimedia.org/r/651794 (https://phabricator.wikimedia.org/T261953) [19:04:21] (03CR) 10Neil P. Quinn-WMF: [V: 03+2 C: 03+2] "Thanks, Joseph and Luca, for the thoughtful review!" (032 comments) [analytics/wmf-product/jobs] - 10https://gerrit.wikimedia.org/r/651794 (https://phabricator.wikimedia.org/T261953) (owner: 10Neil P. Quinn-WMF) [19:10:11] ottomata: rip my phone [19:10:24] haha [19:10:28] you get notifications? [19:10:41] yea for irccloud [19:10:45] nasty [19:20:08] how goes razzi? 
[19:20:50] ottomata: good, the second topic, eventlogging_ExternalGuidance, is 25% of 4.4G [19:20:56] cool [19:21:50] Meanwhile I'm trying to set up logical volumes like they are on an-master1001 on a virtual machine [19:23:41] ah nice [19:25:42] 10Analytics, 10Data-release, 10Privacy Engineering, 10Research, 10Privacy: Evaluate a differentially private solution to release wikipedia's project-title-country data - https://phabricator.wikimedia.org/T267283 (10TedTed) Ah, I see. Yeah, no anonymization notion can change the fact that some data might... [19:34:43] 10Analytics, 10Analytics-Kanban, 10Product-Analytics: Superset presto error: Failed to list directory: hdfs://analytics-hadoop/wmf/data/event/... - https://phabricator.wikimedia.org/T272741 (10mforns) Hi @kzimmerman! ===== In the case of `session_length`: It was a problem of permissions, indeed. Thanks for... [19:45:43] (03PS2) 10Fdans: Add monthly pageview complete job [analytics/refinery] - 10https://gerrit.wikimedia.org/r/658348 (https://phabricator.wikimedia.org/T265732) [19:55:44] 10Analytics-Clusters, 10DC-Ops, 10SRE, 10ops-eqiad: (Need By: TBD) rack/setup/install an-worker11[18-41] - https://phabricator.wikimedia.org/T260445 (10Jclark-ctr) replaced Dac cable for an-worker1119 and an-worker1131 @elukey confirmed both are seeing network [19:56:13] \o/ joal --^ [19:56:28] Awesome elukey :) [19:56:31] tomorrow I'll hopefully add 2 workers to backup [19:56:41] it was on both a bad copper cable -.- [19:56:47] elukey: I'm copying a big bunch of data, with a big bunch of different users (/user) [19:56:54] perfect [19:57:20] elukey: I'll need your help tomorrow morning to triple check the user are correct in HDFS despite not yet existing on the systems [19:57:32] Have a good end of day elukey :) [19:57:39] you too joal :) [20:07:24] Gone for tonight [20:23:11] 10Analytics, 10Analytics-Kanban, 10Event-Platform, 10Structured-Data-Backlog, 10Patch-For-Review: SuggestedTagsAction Event Platform Migration - 
https://phabricator.wikimedia.org/T267351 (10mforns) [20:39:55] 10Analytics-Clusters: Balance Kafka topic partitions on Kafka Jumbo to take advantage of the new brokers - https://phabricator.wikimedia.org/T255973 (10razzi) [20:41:35] !log rebalance kafka partitions for codfw.mediawiki.page-properties-change [20:41:37] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [20:42:11] !log rebalance kafka partitions for eqiad.mediawiki.page-properties-change.json [20:42:13] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [20:44:41] 10Analytics-Clusters: Balance Kafka topic partitions on Kafka Jumbo to take advantage of the new brokers - https://phabricator.wikimedia.org/T255973 (10razzi) [20:56:11] 10Analytics, 10Analytics-Kanban, 10Event-Platform, 10Patch-For-Review: DesktopWebUIActionsTracking Event Platform Migration - https://phabricator.wikimedia.org/T271164 (10mforns) [20:56:23] 10Analytics, 10Analytics-Kanban, 10Event-Platform, 10Patch-For-Review: MobileWebUIActionsTracking Event Platform Migration - https://phabricator.wikimedia.org/T267347 (10mforns) [20:58:34] 10Analytics, 10EventStreams, 10Services: To provide performer array in RC stream - https://phabricator.wikimedia.org/T218063 (10Ottomata) Hi! Sorry I didn't see this before; I don't often look at the EventStreams tag. Hm, so either we add `performer` to recentchange, or add the info you need to revision-cr... [21:16:40] 10Analytics, 10Analytics-Kanban, 10Event-Platform, 10Patch-For-Review: eventgate-wikimedia should emit metrics about validation errors - https://phabricator.wikimedia.org/T257237 (10Ottomata) @fgiunchedi hello! Been reading some alert documentation stuff and I have some questions. - https://wikitech.wikim... 
[21:27:44] (03PS1) 10Mforns: Add RUMSpeedIndex to analytics legacy [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/658429 (https://phabricator.wikimedia.org/T271208) [21:29:15] (03CR) 10Mforns: [V: 03+2 C: 03+2] Add RUMSpeedIndex to analytics legacy [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/658429 (https://phabricator.wikimedia.org/T271208) (owner: 10Mforns) [21:31:35] 10Analytics, 10EventStreams, 10Services: To provide performer array in RC stream - https://phabricator.wikimedia.org/T218063 (10Pchelolo) Hm, so for `patrolled` - technically this would be possible to add to revision-create, will require quite some coding to pass the info around. Also, the patrolled status c... [21:35:03] (03PS1) 10Mforns: Add PaintTiming to analytics/legacy [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/658430 (https://phabricator.wikimedia.org/T271208) [21:35:06] 10Analytics-Clusters: Balance Kafka topic partitions on Kafka Jumbo to take advantage of the new brokers - https://phabricator.wikimedia.org/T255973 (10razzi) [21:35:57] (03CR) 10Mforns: [V: 03+2 C: 03+2] Add PaintTiming to analytics/legacy [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/658430 (https://phabricator.wikimedia.org/T271208) (owner: 10Mforns) [21:41:42] (03PS1) 10Mforns: Add ElementTiming to analytics/legacy [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/658431 (https://phabricator.wikimedia.org/T271208) [21:43:34] (03CR) 10Mforns: [V: 03+2 C: 03+2] Add ElementTiming to analytics/legacy [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/658431 (https://phabricator.wikimedia.org/T271208) (owner: 10Mforns) [21:46:33] (03PS1) 10Mforns: Add LayoutShift to analytics/legacy [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/658434 (https://phabricator.wikimedia.org/T271208) [21:47:38] (03CR) 10Mforns: [V: 03+2 C: 03+2] Add LayoutShift to analytics/legacy [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/658434 
(https://phabricator.wikimedia.org/T271208) (owner: 10Mforns) [21:53:15] (03PS1) 10Mforns: Add FeaturePolicyViolation to analytics/legacy [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/658435 (https://phabricator.wikimedia.org/T271208) [21:54:31] (03CR) 10Mforns: [V: 03+2 C: 03+2] Add FeaturePolicyViolation to analytics/legacy [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/658435 (https://phabricator.wikimedia.org/T271208) (owner: 10Mforns) [22:00:06] (03PS1) 10Mforns: Fix examples of analytics/legacy/elementtiming [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/658438 (https://phabricator.wikimedia.org/T271208) [22:01:04] (03CR) 10Mforns: [V: 03+2 C: 03+2] Fix examples of analytics/legacy/elementtiming [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/658438 (https://phabricator.wikimedia.org/T271208) (owner: 10Mforns) [22:04:38] (03PS1) 10Mforns: Correct examples of analytics/legacy/LayoutShift [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/658439 (https://phabricator.wikimedia.org/T271208) [22:05:22] (03CR) 10Mforns: [V: 03+2 C: 03+2] Correct examples of analytics/legacy/LayoutShift [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/658439 (https://phabricator.wikimedia.org/T271208) (owner: 10Mforns) [22:11:02] (03PS1) 10Mforns: Add FirstInputTiming to analytics/legacy [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/658442 (https://phabricator.wikimedia.org/T271208) [22:13:12] (03CR) 10Mforns: [V: 03+2 C: 03+2] Add FirstInputTiming to analytics/legacy [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/658442 (https://phabricator.wikimedia.org/T271208) (owner: 10Mforns) [22:14:31] 10Analytics, 10Analytics-EventLogging, 10Event-Platform, 10Performance-Team, 10Patch-For-Review: NavigationTiming Extension schemas Event Platform Migration - https://phabricator.wikimedia.org/T271208 (10mforns) [23:54:50] (03PS9) 10Awight: Update schema with core bucket labels 
[schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/656901 (https://phabricator.wikimedia.org/T269986) (owner: 10WMDE-Fisch)