[04:53:19] 10Analytics, 10Analytics-Kanban, 10Product-Analytics, 10Patch-For-Review: Add dimensions to editors_daily dataset - https://phabricator.wikimedia.org/T256050 (10cchen) hi @JAllemandou , i put some time in you calendar. feel free to move it! [07:05:36] goood morning [07:46:17] Hello [07:52:56] bonjour! [07:53:22] How are you elukey ? [07:54:25] good! And you [07:54:26] ? [07:54:37] All good :) [08:06:21] elukey: not sure if you've seen - I deleted logs from the wikidata job causing trouble: 10Tb per job, 7 jobs (I kept one for analysis) - This makes ~210Tb counting replication [08:06:45] joal: I saw briefly but I didn't get it was so much, nice! [08:06:48] I continue to maintain that this is not about cluster being full, but that the job has an issue [08:07:50] one thing that I am wondering is if our default log4j.properties are picked up by spark workers on yarn [08:08:09] hm [08:08:47] what I mean is if something like [08:09:17] --conf spark.executor.extraJavaOptions=-Dlog4j.configuration=/etc/spark2/defaults/log4j.properties may make the debug disappear [08:09:27] if so it would be an option to add to our defaults [08:09:36] (the driver picks it up) [08:09:42] I hear that [08:09:59] do you think that it could be something to let Goran test, or totally not needed? [08:10:17] the thing that I don't get is why it logs DEBUG when we have the root logger in the defaults at info [08:10:31] this is why I am thinking about the above [08:11:51] elukey: currently we supposedly log at INFO level, right? [08:14:00] joal: yep exactly [08:14:32] hm [08:15:30] elukey: Can you please check in Goran's cron the command he uses to launch that job? [08:16:00] elukey: I think the job is using spark with R, with which we are not familiar [08:16:44] Ah actually from the task it looks like it is python sorry [08:17:59] yes it should be pyspark [08:34:19] 10Analytics, 10Analytics-EventLogging: Delete tofu table from staging database after research is done - https://phabricator.wikimedia.org/T70441 (10Amire80) I forgot about this... Is this still relevant? I can't find `staging` at all, and I suspect that it was just deleted completely. [09:56:59] Hi gmodena - Trying to answer some of your questions in here about ssh access and cloud projects [09:57:47] joal ack - thanks [09:57:55] About ssh access: Your config as described here https://wikitech.wikimedia.org/wiki/Production_access#SSH_configuration using bast3004 should work broadly across production [09:58:00] gmodena: --^ [09:59:38] joal got it - that's the config I copied (with bast3004) [09:59:56] gmodena: About getting access to data, I just realized there is no link from the onboarding page to this page: https://wikitech.wikimedia.org/wiki/Analytics/Data_access [10:00:07] gmodena: sorry, yet another page to read :S [10:00:23] np.
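For context on the ssh setup joal points gmodena to just above, a minimal sketch of the kind of ~/.ssh/config stanza the Production_access page describes, assuming bast3004.wikimedia.org as the bastion FQDN and gmodena as the shell user; the exact host patterns should be checked against the wiki page rather than taken from here.

# Hedged sketch: route ssh to production hosts through the bast3004 bastion.
# The bastion FQDN, the *.wmnet pattern and the user name are illustrative
# assumptions, not values verified against the Production_access page.
cat >> ~/.ssh/config <<'EOF'
Host bast3004.wikimedia.org
    User gmodena

Host *.wmnet
    User gmodena
    ProxyJump bast3004.wikimedia.org
EOF

With a stanza like this, something like ssh stat1004.eqiad.wmnet should hop through the bastion transparently.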
On it :) [10:01:20] just to recap: for now I created a ./ssh/config, and uploaded a pub key to wikited [10:01:26] *wikitech [10:01:31] gmodena: This will explain the various data access patterns we use - IMO you should request access to analytics-privatedata-users [10:02:07] Ah gmodena - The public key also needs to be sent to the SRE team for it to be added to the keys able to access prod [10:02:48] gmodena: the procedure is above on the https://wikitech.wikimedia.org/wiki/Production_access page [10:03:30] I had a look at the phab task linked in the doc - https://phabricator.wikimedia.org/T96053 [10:03:38] elukey: asking for confirmation that gmodena should be added to the analytics-privatedata-users group [10:04:16] gmodena: this task looks good as an example :) [10:05:15] ack [10:05:24] do I understand correctly that I'll need manager approval? Can I create a phab task / request myself, or do I need my manager to do it on my behalf? [10:05:50] gmodena: hi :) You can create the task yourself, and ask your manager to comment [10:06:06] Andrew Otto is our approver as well, we'll need his green light before proceeding [10:06:13] elukey roger that [10:06:34] feel free to include me or ping me in any step with SRE if you have doubts [10:06:38] or just ask in here :) [10:06:58] i will :) [10:07:26] you'll also need a kerberos principal [10:07:42] that you can ask in the task as well [10:07:54] About being added to the analytics VPS/cloud project, it needs to be done by someone in our team having admin rights over the project - It looks like I can do that [10:08:08] gmodena: --^ [10:08:42] gmodena: Can you tell me your shell-name and user-name for wikitech? [10:08:57] With that I can add you to the analytics cloud project :) [10:09:03] elukey ack [10:09:20] joal username is gmodena [10:09:24] shell name should be the same [10:09:30] ack - trying to add you [10:10:01] gmodena: https://wikitech.wikimedia.org/wiki/Analytics/Systems/Kerberos/UserGuide is a guide for all the "nice" things that kerberos requires from users [10:11:04] Also gmodena - When your access-request task is done, it'd be awesome if you could update the doc with it instead of the previous one (having kerberos listed in it) - Please :) [10:11:26] gmodena: You're listed as a member of the analytics cloud project :) [10:13:10] joal woot! [10:13:28] i'm taking notes & will update the process once i'm done [10:13:42] elukey thanks for the pointer!
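Once the access request is granted and the kerberos principal mentioned above exists, a quick sanity check along these lines (run from a stat/notebook host) confirms that authentication and HDFS access work; the commands are standard kerberos/HDFS tooling and the path is only an illustration.

# Hedged sketch: verify the new kerberos principal and basic HDFS access.
# Assumes a principal already created as described in the Kerberos UserGuide above.
kinit                       # prompts for the temporary kerberos password received during setup
klist                       # the freshly obtained ticket should be listed here
hdfs dfs -ls "/user/$USER"  # any readable HDFS path works; the home directory is illustrative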
[10:14:03] Thanks gmodena :) [10:26:20] so joal I was reading https://docs.cloudera.com/documentation/enterprise/5-16-x/topics/admin_ha_hivemetastore.html#concept_jqx_zqk_dq [10:26:35] that essentially states to use org.apache.hadoop.hive.thrift.DBTokenStore [10:26:48] but IIRC it caused some issues when we tried it (even with only one metastore) [10:27:04] webrequest_load was failing for krb creds once in a while [10:27:19] so no bueno, for that we might need to wait for bigtop [10:29:31] ack elukey [10:30:04] not sure that bigtop will solve the issue nonetheless - higher versions might help [10:30:54] joal: I think it will, I am 99% positive :) [10:31:02] great :) [10:31:47] 10Analytics, 10Analytics-Kanban: Move oozie's hive2 actions to analytics-hive.eqiad.wmnet - https://phabricator.wikimedia.org/T268028 (10elukey) [10:32:34] 10Analytics, 10Analytics-Kanban: Move oozie's hive2 actions to analytics-hive.eqiad.wmnet - https://phabricator.wikimedia.org/T268028 (10elukey) p:05Triage→03High [10:32:47] (03PS1) 10Elukey: Move webrequest_laod to analytics-hive hive2 credentials [analytics/refinery] - 10https://gerrit.wikimedia.org/r/643013 (https://phabricator.wikimedia.org/T268028) [10:33:52] (03CR) 10Joal: [C: 03+1] "LGTM!" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/643013 (https://phabricator.wikimedia.org/T268028) (owner: 10Elukey) [10:34:09] \o/ [10:34:11] all in [10:36:55] o/ would you have any objections to importing commons RDF to hdfs using hdfs_rsync like what's done for wikidata? [10:37:14] dcausse: 5 euro and you have the green light [10:37:18] :D [10:37:19] :) [10:37:35] dcausse: I possibly can make it cheaper ;) [10:37:36] jokes aside, how much data/files are you importing? [10:37:38] I have a patch (not urgent at all) [10:37:43] lemme see [10:38:09] dcausse: I think commons RDF are small in comparison to the dumps we import regularly [10:38:36] yes [10:38:47] they are around 15G apparently [10:38:48] (03PS2) 10Elukey: Move webrequest_load to analytics-hive hive2 credentials [analytics/refinery] - 10https://gerrit.wikimedia.org/r/643013 (https://phabricator.wikimedia.org/T268028) [10:38:48] So that shouldn't be problematic [10:39:10] https://dumps.wikimedia.your.org/other/wikibase/commonswiki/ the -mediainfo.ttl.bz2 ones [10:39:36] dcausse: I'm surprised it's already that big! [10:39:46] yes, it's growing fast [10:39:59] lots of bots adding licence info and such [10:40:39] dcausse: Would that be the cause for the last 2-month spike in edits for commons? (see https://stats.wikimedia.org/#/commons.wikimedia.org/contributing/edits/normal|bar|2-year|~total|monthly) [10:40:57] dcausse: I assumed it was, but I'd love confirmation :) [10:41:03] wow [10:41:21] I can ask someone there to be sure [10:41:22] The spike is real, isn't it :) [10:44:32] I asked the SD team, I'll report back to you here [10:44:38] ack [10:44:50] As per your request for hdfs_rsync - Please go ahead [10:44:54] dcausse: --^ [10:45:04] joal: thanks! [10:46:12] I'll add you both to the patch if you don't mind (it's really not urgent) [10:46:19] sure! [10:46:50] thanks! [10:46:56] dcausse: I think I will also create an oozie job to regularly convert TTLs to avro triples - It'll be useful :) [10:47:12] it's done already :) [10:47:19] WUT? [10:47:26] great :) [10:47:31] sorry forgot to tell you [10:47:37] dcausse: no problem at all! [10:47:56] dcausse: can you tell me where the data is stored please?
[10:48:42] Getting kids for lunch [10:48:43] sure: show partitions discovery.wikidata_rdf; [10:49:14] we keep the last 4 dumps there [10:49:32] partition is like: date=20201116 [10:58:29] joal elukey fyi: i created a phabricator task for onboarding - https://phabricator.wikimedia.org/T268453 [10:58:37] thanks for helping me out [10:58:39] ack! [10:59:23] yep looks good! [11:02:55] Pathname /user/elukey/thorium_backup/backup_wikistats_1/htdocs/archive/backup/bash_[2011-01-17][03:00].zip from hdfs:/user/elukey/thorium_backup/backup_wikistats_1/htdocs/archive/backup/bash_[2011-01-17][03:00].zip is not a valid DFS filename. [11:02:59] lol [11:04:45] to be noted that the files are [11:04:55] elukey@stat1004:/srv/thorium_backup/backup_wikistats_1/htdocs/archive/backup$ ls [11:04:58] 'bash_[2011-01-17][03:00].zip' 'csv_[2011-01-17][03:00].zip' 'perl_[2011-01-17][03:00].zip' projectcounts-2009.tar projectcounts-2010.tar projectcounts-2011.tar [11:05:02] even with quotes [11:05:11] beauty [11:38:13] * elukey lunch! [12:05:45] (03PS1) 10Gerrit maintenance bot: Add skr.wiktionary to pageview whitelist [analytics/refinery] - 10https://gerrit.wikimedia.org/r/643025 (https://phabricator.wikimedia.org/T268448) [12:08:57] (03CR) 10Urbanecm: [C: 03+1] Add skr.wiktionary to pageview whitelist [analytics/refinery] - 10https://gerrit.wikimedia.org/r/643025 (https://phabricator.wikimedia.org/T268448) (owner: 10Gerrit maintenance bot) [13:00:59] (03PS1) 10Joal: [WIP] Update sqoop adding tables and removing timestamps [analytics/refinery] - 10https://gerrit.wikimedia.org/r/643029 (https://phabricator.wikimedia.org/T266077) [13:04:35] 10Analytics, 10Analytics-Wikistats, 10I18n: WikiReportsLocalizations.pm still fetches language names from SVN - https://phabricator.wikimedia.org/T64570 (10Paladox) a:05Paladox→03None [13:22:38] joal: confirmed, the spike on commons edits is most likely due to bots, context: https://commons.wikimedia.org/wiki/Commons_talk:Structured_data#Structured_copyright_and_licensing_for_search_indexing [13:22:55] awesome dcausse - many thanks :) [13:24:35] PROBLEM - Check the last execution of produce_canary_events on an-launcher1002 is CRITICAL: CRITICAL: Status of the systemd unit produce_canary_events https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers [13:35:15] RECOVERY - Check the last execution of produce_canary_events on an-launcher1002 is OK: OK: Status of the systemd unit produce_canary_events https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers [13:50:07] (03PS1) 10Joal: Refactor oozie mediawiki-history-load job [analytics/refinery] - 10https://gerrit.wikimedia.org/r/643033 [13:53:29] elukey: Would you by any chance be back? [13:58:58] joal: yep :) [13:59:08] o/ [13:59:20] I've been trying to repro the problem of logs [13:59:38] could you please try to get the cron command from goran?
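On the "not a valid DFS filename" errors at 11:02 above: HDFS rejects the bracket and colon characters in those Wikistats backup archive names, so one pragmatic workaround is to rename the archives to safe names before copying them up. A rough sketch, with the renaming scheme purely an illustration and the paths taken from elukey's listing:

# Hedged sketch: strip the "[", "]" and ":" characters from the archive names,
# then copy the directory contents to HDFS. The "_"/"-" replacements are an
# arbitrary choice, not an established convention.
cd /srv/thorium_backup/backup_wikistats_1/htdocs/archive/backup
for f in *'['*; do
    safe=$(echo "$f" | sed 's/[][]/_/g; s/:/-/g')   # bash_[2011-01-17][03:00].zip -> bash__2011-01-17__03-00_.zip
    mv -- "$f" "$safe"
done
hdfs dfs -mkdir -p /user/elukey/thorium_backup/backup_wikistats_1/htdocs/archive/backup
hdfs dfs -put ./* /user/elukey/thorium_backup/backup_wikistats_1/htdocs/archive/backup/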
[13:59:52] elukey: actually some more info, so that you understand [14:00:15] elukey: I've been doing tests of logs with pyspark, and it all works as expected (info level everywhere) [14:00:53] I wish to test a cron job using the analytics-privatedata user, as I think the cron job might not have the correct env var set, and therefore default to unexpected logging [14:02:41] ahh yes makes sense [14:23:58] 10Analytics: Avro Deserializer logging set to DEBUG in pyspark lead to huge yarn stderr container files (causing disk usage alerts) - https://phabricator.wikimedia.org/T268376 (10elukey) Joseph and I checked a bit, and we are interested to make the following test: @GoranSMilovanovic can you add the following con... [14:30:15] 10Analytics: Avro Deserializer logging set to DEBUG in pyspark lead to huge yarn stderr container files (causing disk usage alerts) - https://phabricator.wikimedia.org/T268376 (10Ottomata) Two guesses that might help fix this. 1. [[ https://stackoverflow.com/a/4527351 | Explicitly set the log level ]] for org.a... [14:57:39] 10Analytics: Avro Deserializer logging set to DEBUG in pyspark lead to huge yarn stderr container files (causing disk usage alerts) - https://phabricator.wikimedia.org/T268376 (10GoranSMilovanovic) @elukey @Ottomata The tests as you have suggested will be carried out today, later CET hours. Thank you! @Ottomata... [15:01:37] 10Quarry, 10cloud-services-team (Kanban): Do some checks of how many Quarry queries will break in a multiinstance environment - https://phabricator.wikimedia.org/T267989 (10dcaro) I did a preliminary check on all quarry queries, and found 42 that seem to use cross DB (I have to check a bit more the code to see... [15:17:38] 10Analytics, 10Patch-For-Review: Kerberos identity for fkaelin - https://phabricator.wikimedia.org/T268365 (10Ottomata) FYI. I added fab's principal following https://wikitech.wikimedia.org/wiki/Analytics/Systems/Kerberos#Create_a_principal_for_a_real_user. [15:20:09] 10Analytics, 10Patch-For-Review: Kerberos identity for fkaelin - https://phabricator.wikimedia.org/T268365 (10Ottomata) 05Open→03Resolved a:03Ottomata [15:29:51] (03CR) 10Mforns: ""+332 -1181" Awesome!!" (031 comment) [analytics/refinery] - 10https://gerrit.wikimedia.org/r/643033 (owner: 10Joal) [15:33:18] (03CR) 10Mforns: [C: 03+1] "LGTM!" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/643013 (https://phabricator.wikimedia.org/T268028) (owner: 10Elukey) [15:34:14] (03CR) 10Elukey: [V: 03+2 C: 03+2] Move webrequest_load to analytics-hive hive2 credentials [analytics/refinery] - 10https://gerrit.wikimedia.org/r/643013 (https://phabricator.wikimedia.org/T268028) (owner: 10Elukey) [15:54:15] PROBLEM - Check the last execution of produce_canary_events on an-launcher1002 is CRITICAL: CRITICAL: Status of the systemd unit produce_canary_events https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers [15:55:08] ottomata: hi! is --^ something that we can silence for the time being?
[15:58:55] elukey: i can maybe change the email to just me [15:58:56] i am working on it [15:59:06] each time it happens (and I'm online) i go and try to find more information [15:59:15] i keep adding more logging to eventgate-analytics-external [15:59:27] so i need to check on today's [15:59:29] haven't had time yet [15:59:45] but i need to know when it happens to be able to go and find the logs [16:04:53] RECOVERY - Check the last execution of produce_canary_events on an-launcher1002 is OK: OK: Status of the systemd unit produce_canary_events https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers [16:04:56] 10Analytics: Avro Deserializer logging set to DEBUG in pyspark lead to huge yarn stderr container files (causing disk usage alerts) - https://phabricator.wikimedia.org/T268376 (10JAllemandou) hm, I don't think it was a discussion with me as `--files` works in our settings for what it is intended to, as in provid... [16:13:29] (03CR) 10Joal: Refactor oozie mediawiki-history-load job (031 comment) [analytics/refinery] - 10https://gerrit.wikimedia.org/r/643033 (owner: 10Joal) [16:15:01] 10Analytics, 10Event-Platform, 10Product-Infrastructure-Data: Automate EventGate validation error reporting - https://phabricator.wikimedia.org/T268027 (10mpopov) +1 to starting the practice of adding stream ownership info to stream config, for data governance purposes [16:15:28] (03CR) 10Milimetric: [C: 03+2] Disable chart movement on scrolling when on table [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/641434 (https://phabricator.wikimedia.org/T267467) (owner: 10Fdans) [16:17:41] (03Merged) 10jenkins-bot: Disable chart movement on scrolling when on table [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/641434 (https://phabricator.wikimedia.org/T267467) (owner: 10Fdans) [16:34:05] elukey: will you join us for ops sync? [16:36:36] ottomata: nope I asked to skip a moment ago, I am doing the host moves with John [16:37:54] ottomata: am I needed? I can jump now in the call if you want [16:37:59] (just shutdown a node) [16:38:25] k no sok [16:38:59] super thanks :) [17:05:06] :( Spawn failed: Server at http://127.0.0.1:36101/user/addshore/ didn't respond in 30 seconds (My notebook is apparently not loading correctly) [17:12:45] addshore: on what node? 1007? [17:13:34] addshore: can you try https://wikitech.wikimedia.org/wiki/Analytics/Systems/Jupyter#Resetting_user_virtualenvs ? [17:14:09] * addshore will [17:14:12] yus was 1007 [17:16:40] seemingly worked, ty! :) [17:17:51] gooood [17:58:32] 10Analytics, 10Analytics-Kanban, 10Operations, 10netops, 10Patch-For-Review: Add more dimensions in the netflow/pmacct/Druid pipeline - https://phabricator.wikimedia.org/T254332 (10mforns) Hi @ayounsi, The new data is already in Hive's netflow table. And I'm about to enable Druid loading for the new fie... [18:18:24] joal: I am done with the host moves for today, IIUC from standup you needed the ops oncall today? :) [18:44:29] Heya elukey - sorry I missed the ping [18:44:39] elukey: it's late - shall we talk tomorrow? [18:52:28] 10Analytics, 10Analytics-Kanban, 10Event-Platform, 10Patch-For-Review: eventgate-analytics-external occasionally seems to fail lookups of dynamic stream config from MW EventStreamConfig API - https://phabricator.wikimedia.org/T266573 (10Ottomata) Ok, got some pretty whacky logs finally. ` [2020-11-23T15:4... [19:05:51] joal: sorry I started doing another thing, just seen the pings [19:06:00] sure it is ok!
Let's do tomorrow when you have time :) [19:06:29] Sure elukey - Will ping you in the morning :) [19:06:32] Thanks! [19:06:41] Gone for tonight [19:06:58] afk as well! [19:31:48] 10Quarry, 10cloud-services-team (Kanban): Do some checks of how many Quarry queries will break in a multiinstance environment - https://phabricator.wikimedia.org/T267989 (10Bstorm) As an aside, if you use the `webservice shell` command, you can get it working (https://wikitech.wikimedia.org/wiki/Help:Toolforge... [19:33:21] 10Analytics, 10Analytics-EventLogging, 10Analytics-Kanban, 10Event-Platform, and 4 others: Migrate legacy metawiki schemas to Event Platform - https://phabricator.wikimedia.org/T259163 (10Ottomata) [19:33:36] 10Analytics, 10Analytics-Kanban, 10Operations, 10netops, 10Patch-For-Review: Add more dimensions in the netflow/pmacct/Druid pipeline - https://phabricator.wikimedia.org/T254332 (10ayounsi) I don't understand the difference :) Why can't we do the same as T231339#6612105 ? [19:34:03] 10Analytics, 10Anti-Harassment, 10Event-Platform: SpecialMuteSubmit Event Platform Migration - https://phabricator.wikimedia.org/T267350 (10Ottomata) [19:34:05] 10Analytics, 10Anti-Harassment, 10Event-Platform: SpecialInvestigate Event Platform Migration - https://phabricator.wikimedia.org/T267349 (10Ottomata) [19:34:07] 10Analytics, 10Anti-Harassment, 10Event-Platform: AutoblockIpBlock Event Platform Migration - https://phabricator.wikimedia.org/T267340 (10Ottomata) [19:39:08] 10Analytics, 10Anti-Harassment, 10Event-Platform: CookieBlock Event Platform Migration - https://phabricator.wikimedia.org/T267341 (10Ottomata) Hi @Niharika I reached out to both of them and neither of them knew anything about this schema. According to meta.wikimedia.org history, you created and edited this... [19:46:07] 10Analytics, 10Analytics-Kanban, 10Growth-Team, 10Product-Analytics, 10Patch-For-Review: Migrate Growth EventLogging schemas to Event Platform - https://phabricator.wikimedia.org/T267333 (10Ottomata) [19:46:45] 10Analytics, 10Anti-Harassment, 10Event-Platform: CookieBlock Event Platform Migration - https://phabricator.wikimedia.org/T267341 (10Niharika) >>! In T267341#6642064, @Ottomata wrote: > Hi @Niharika I reached out to both of them and neither of them knew anything about this schema. According to meta.wikimedi... [19:46:54] 10Analytics, 10Analytics-Kanban, 10Growth-Team, 10Product-Analytics, 10Patch-For-Review: Migrate Growth EventLogging schemas to Event Platform - https://phabricator.wikimedia.org/T267333 (10Ottomata) [19:50:33] 10Analytics, 10Analytics-Kanban, 10Growth-Team, 10Product-Analytics, 10Patch-For-Review: Migrate Growth EventLogging schemas to Event Platform - https://phabricator.wikimedia.org/T267333 (10Ottomata) Hiya @nettrom_WMF, we'd like to migrate these schemas during the week of Dec 7 - Dec 11. Let us know if... [19:51:30] 10Analytics, 10Analytics-Kanban, 10Growth-Team, 10Product-Analytics, 10Patch-For-Review: Migrate Growth EventLogging schemas to Event Platform - https://phabricator.wikimedia.org/T267333 (10Ottomata) @nettrom_WMF I may have already asked you this elsewhere, but I'll ask again here so we have an officiall... [19:58:09] 10Quarry, 10cloud-services-team (Kanban): Do some checks of how many Quarry queries will break in a multiinstance environment - https://phabricator.wikimedia.org/T267989 (10dcaro) I'm crossing query.latest_query_rev with query_revision, that shrinks the number quite considerably: ` MariaDB [quarry]> select co... 
[20:02:44] 10Quarry, 10cloud-services-team (Kanban): Do some checks of how many Quarry queries will break in a multiinstance environment - https://phabricator.wikimedia.org/T267989 (10dcaro) And thanks for the shell tip! (found it a bit earlier, and was able to git it working \o/): ''' dcaro@vulcanus$ curl --silent http... [20:06:46] 10Analytics, 10Analytics-Kanban, 10Growth-Team, 10Product-Analytics, 10Patch-For-Review: Migrate Growth EventLogging schemas to Event Platform - https://phabricator.wikimedia.org/T267333 (10nettrom_WMF) >>! In T267333#6642131, @Ottomata wrote: > @nettrom_WMF I may have already asked you this elsewhere, b... [20:07:20] 10Analytics, 10Analytics-Kanban, 10Growth-Team, 10Product-Analytics, 10Patch-For-Review: Migrate Growth EventLogging schemas to Event Platform - https://phabricator.wikimedia.org/T267333 (10Ottomata) Thank you! [20:12:43] 10Analytics, 10Analytics-Kanban, 10Operations, 10netops, 10Patch-For-Review: Add more dimensions in the netflow/pmacct/Druid pipeline - https://phabricator.wikimedia.org/T254332 (10mforns) We could do the same as T231339#6612105 if necessary. However, netflow's datasource in Druid is already big (1.9TB)... [20:50:39] 10Analytics, 10Anti-Harassment, 10Event-Platform: CookieBlock Event Platform Migration - https://phabricator.wikimedia.org/T267341 (10ifried) @Niharika: So, if I understand correctly, it should not be marked as Anti-Harassment, and it should be marked as Community Tech? If so, we can wait on responses from... [20:57:11] 10Analytics, 10Anti-Harassment, 10Event-Platform: CookieBlock Event Platform Migration - https://phabricator.wikimedia.org/T267341 (10Ottomata) Ah, great, thank you! @samwilson @MusikAnimal : 1. is the CookieBlock schema still used, and if not can we disable it altogether? 2. If it is used, do you need the... [20:57:16] 10Analytics, 10Anti-Harassment, 10Event-Platform: CookieBlock Event Platform Migration - https://phabricator.wikimedia.org/T267341 (10Niharika) @ifried I think it makes more sense for "cookie blocks" to be an #anti-harassment project. It was taken on by Commtech as it was one of the earlier things that team... [20:58:45] 10Analytics, 10Anti-Harassment, 10Event-Platform: CookieBlock Event Platform Migration - https://phabricator.wikimedia.org/T267341 (10Ottomata) Ah, it looks like maybe @dbarratt originally marked this schema as Anti-Harassment. Maybe he knows more! :) [21:05:18] 10Analytics, 10Analytics-Kanban, 10Operations, 10netops, 10Patch-For-Review: Add more dimensions in the netflow/pmacct/Druid pipeline - https://phabricator.wikimedia.org/T254332 (10mforns) More thoughts... The addition of a field to a data set can have different effects to the data size. 1. If the field... [21:22:04] 10Analytics, 10Analytics-Kanban, 10Operations, 10netops, 10Patch-For-Review: Add more dimensions in the netflow/pmacct/Druid pipeline - https://phabricator.wikimedia.org/T254332 (10mforns) And still more thoughts :) After playing a bit more with the datasource, I think the data is very granular, it is l... [21:58:12] 10Analytics: Avro Deserializer logging set to DEBUG in pyspark lead to huge yarn stderr container files (causing disk usage alerts) - https://phabricator.wikimedia.org/T268376 (10GoranSMilovanovic) @JAllemandou Thank you for your comment. Ok. @elukey @Ottomata Running your test now: ` sudo -u analytics-private... 
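Goran's exact invocation is truncated in the comment above, so purely as an illustration of the approach discussed in T268376 (and not his actual command), here is a hedged sketch that combines elukey's earlier idea of pointing the executors at an explicit log4j config via spark.executor.extraJavaOptions with ottomata's idea of setting the noisy logger's level explicitly. The logger name, the script name and the use of spark2-submit on the launch host are assumptions; note ottomata's follow-up below that the default log4j.properties may not live where first suggested, which is why this sketch ships its own file.

# Hedged sketch of a per-job log4j override for the pyspark job in T268376.
# Write a small log4j.properties, distribute it to the executors with --files,
# and point both driver and executors at it. "org.apache.avro" is an assumed
# logger name for the chatty Avro deserializer; verify the real one from the
# container stderr before relying on this.
cat > /tmp/quiet-log4j.properties <<'EOF'
log4j.rootCategory=INFO, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
log4j.logger.org.apache.avro=WARN
EOF

spark2-submit \
    --master yarn \
    --files /tmp/quiet-log4j.properties \
    --conf 'spark.driver.extraJavaOptions=-Dlog4j.configuration=file:/tmp/quiet-log4j.properties' \
    --conf 'spark.executor.extraJavaOptions=-Dlog4j.configuration=file:quiet-log4j.properties' \
    wikidata_job.py   # placeholder name, not the real script

Kerberos and run-as-user details are left out here; how the job authenticates as its service user stays as in the existing cron entry.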
[22:13:41] 10Analytics: Avro Deserializer logging set to DEBUG in pyspark lead to huge yarn stderr container files (causing disk usage alerts) - https://phabricator.wikimedia.org/T268376 (10Ottomata) Hm, @elukey may have given you the wrong file path for this test? `/etc/spark2/defaults/log4j.properties` doesn't exist, bu... [22:16:49] 10Analytics: Avro Deserializer logging set to DEBUG in pyspark lead to huge yarn stderr container files (causing disk usage alerts) - https://phabricator.wikimedia.org/T268376 (10GoranSMilovanovic) @Ottomata @elukey The only thing that I can report right now - for which I do not know whether it is related to the... [23:18:59] 10Quarry, 10cloud-services-team (Kanban): Do some checks of how many Quarry queries will break in a multiinstance environment - https://phabricator.wikimedia.org/T267989 (10zhuyifei1999) > I'll review the queries too to anonymize if needed and paste the results somewhere. quarry queries are public. I don't th... [23:31:34] 10Analytics, 10Operations, 10SRE-Access-Requests: Requesting access to production shell groups for JAnstee - https://phabricator.wikimedia.org/T266249 (10Dzahn) a:05JAnstee_WMF→03None [23:31:37] 10Analytics, 10Operations, 10SRE-Access-Requests: Requesting access to production shell groups for JAnstee - https://phabricator.wikimedia.org/T266249 (10Dzahn) To add to what @Aklapper said, when we try to verify managers on corp LDAP servers for janstee and nbdubane it tells us that is Dana McCurdy. Sea... [23:35:59] 10Analytics, 10Operations, 10SRE-Access-Requests: Requesting access to production shell groups for JAnstee - https://phabricator.wikimedia.org/T266249 (10Dzahn) 05Stalled→03Open a:03Dzahn [23:45:28] 10Analytics, 10Analytics-EventLogging, 10Platform Team Initiatives (Abstract Schema): Convert EventLogging to AbstractSchema - https://phabricator.wikimedia.org/T268547 (10Reedy)