[06:25:15] Hi team
[06:57:06] Analytics, Product-Analytics: event_user_id is always NULL for anonymous edits in Mediawiki History table - https://phabricator.wikimedia.org/T232171 (JAllemandou) Good point @nettrom_WMF, there is indeed a discrepancy between doc and reality, introduced with the database change of using a normalized tab...
[07:00:38] Analytics, Analytics-Kanban, Product-Analytics: event_user_id is always NULL for anonymous edits in Mediawiki History table - https://phabricator.wikimedia.org/T232171 (JAllemandou)
[07:01:17] Analytics, Analytics-Kanban, Product-Analytics: event_user_id is always NULL for anonymous edits in Mediawiki History table - https://phabricator.wikimedia.org/T232171 (JAllemandou) a: JAllemandou
[07:01:18] Analytics, Analytics-Kanban: Sqoop: remove cuc_comment and join to comment table - https://phabricator.wikimedia.org/T217848 (JAllemandou) a: JAllemandou→None
[09:48:35] (PS5) Fdans: (wip) Add cassandra loading job for requests per file metric [analytics/refinery] - https://gerrit.wikimedia.org/r/533921 (https://phabricator.wikimedia.org/T228149)
[10:32:44] (PS6) Fdans: (wip) Add cassandra loading job for requests per file metric [analytics/refinery] - https://gerrit.wikimedia.org/r/533921 (https://phabricator.wikimedia.org/T228149)
[10:33:17] (PS7) Fdans: (wip) Add cassandra loading job for requests per file metric [analytics/refinery] - https://gerrit.wikimedia.org/r/533921 (https://phabricator.wikimedia.org/T228149)
[12:09:22] Analytics, Analytics-Kanban, Product-Analytics: event_user_id is always NULL for anonymous edits in Mediawiki History table - https://phabricator.wikimedia.org/T232171 (Nuria) Open→Resolved
[13:25:39] Analytics, Analytics-EventLogging, EventBus, CPT Initiatives (Modern Event Platform (TEC2)), Services (watching): Migrate all event-schemas schemas to current.yaml and materialize with jsonschema-tools. - https://phabricator.wikimedia.org/T232144 (Ottomata) Resolved→Open Let's use thi...
[13:25:44] Analytics, Analytics-EventLogging, EventBus, CPT Initiatives (Modern Event Platform (TEC2)), and 2 others: CI Support for Schema Registry - https://phabricator.wikimedia.org/T206814 (Ottomata)
[15:16:10] (PS1) Fdans: Add per file media requests endpoint to AQS [analytics/aqs] - https://gerrit.wikimedia.org/r/534824 (https://phabricator.wikimedia.org/T207208)
[15:17:30] (PS8) Fdans: (wip) Add cassandra loading job for requests per file metric [analytics/refinery] - https://gerrit.wikimedia.org/r/533921 (https://phabricator.wikimedia.org/T228149)
[15:24:20] (PS2) Fdans: Add per file media requests endpoint to AQS [analytics/aqs] - https://gerrit.wikimedia.org/r/534824 (https://phabricator.wikimedia.org/T207208)
[15:24:50] (PS9) Fdans: Add cassandra loading job for requests per file metric [analytics/refinery] - https://gerrit.wikimedia.org/r/533921 (https://phabricator.wikimedia.org/T228149)
[15:38:34] (CR) Nuria: [C: -1] "I have suggested some rewording and have just one question about the code" (11 comments) [analytics/aqs] - https://gerrit.wikimedia.org/r/534824 (https://phabricator.wikimedia.org/T207208) (owner: Fdans)
[15:43:53] (CR) Nuria: [C: -1] Add cassandra loading job for requests per file metric (3 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/533921 (https://phabricator.wikimedia.org/T228149) (owner: Fdans)
[15:46:02] Analytics, Services (watching), cloud-services-team (Kanban): Mediarequests: Add endpoint for aggregated counts per file type per project - https://phabricator.wikimedia.org/T231589 (Nuria) a: fdans
[15:52:47] Hey team - Melissa is stuck on the road behind an accident, I'll miss standup or arrive super late
[15:52:52] sorry for the short notice
[15:53:10] I'll send an escrum if needed when she arrives
[15:55:31] joal: Ok!
[15:56:10] Analytics, Analytics-Kanban, Services (watching): Mediarequests: Add endpoint for aggregated counts per file type per project - https://phabricator.wikimedia.org/T231589 (Nuria)
[15:57:22] Analytics, Analytics-Kanban, Tool-Pageviews: Create new mediarequests table - https://phabricator.wikimedia.org/T229817 (Nuria)
[15:57:54] Analytics, Analytics-Kanban, Tool-Pageviews: Create new mediarequests table - https://phabricator.wikimedia.org/T229817 (Nuria) ping @fdans seems like if the oozie job is working we can move this ticket to done, right?
[15:58:21] Analytics, Analytics-Kanban, StructuredDataOnCommons, Tool-Pageviews: Change name and format of partition column in mediarequest table - https://phabricator.wikimedia.org/T231030 (Nuria) Open→Declined
[15:58:30] Analytics, Analytics-Kanban, StructuredDataOnCommons, Tool-Pageviews, Patch-For-Review: Add literal transcoding to media file properties UDF - https://phabricator.wikimedia.org/T230312 (Nuria)
[15:59:59] Analytics, Tool-Pageviews: Add referrer to mediarequests dataset to inform about project - https://phabricator.wikimedia.org/T228151 (Nuria)
[16:01:38] Analytics, Tool-Pageviews: Add referrer to mediarequests dataset to inform about project - https://phabricator.wikimedia.org/T228151 (Nuria) Ping @fdans this ticket can probably be moved to "done" in the canvas, no?
[16:02:06] ping nuria standuuup!
[16:02:19] and ottomata, I think
[16:35:45] (PS5) Mforns: [WIP] Add spark job to create mediawiki history dumps [analytics/refinery/source] - https://gerrit.wikimedia.org/r/528504 (https://phabricator.wikimedia.org/T208612)
[18:59:52] RECOVERY - Check if active EventStreams endpoint is delivering messages. on icinga1001 is OK: OK: An EventStreams message was consumed from https://stream.wikimedia.org/v2/stream/recentchange within 10 seconds. https://wikitech.wikimedia.org/wiki/Event_Platform/EventStreams/Administration
[19:17:53] looking at alerts
[19:21:17] a-team wiki sites down for you or just me?
[19:21:38] uoo
[19:21:49] fdans: huge issue ongoing - probably DDoS - SRE on it
[19:22:31] :(
[19:22:34] joal: I'm guessing that's where our webrequest data loss errors are coming from?
[19:22:56] it probably is, yes
[19:23:23] issue started at about 18:00 UTC IIUC
[19:23:46] The ops channel is a mess :(
[19:31:56] joal, I changed to gzip and it worked fine
[19:32:13] I think there was some problem in the way I specified Bzip2
[19:32:20] mforns: possible
[19:32:23] because it failed in the end
[19:32:35] maybe it was retrying, and that's why it took so long
[19:32:51] mforns: I've been wondering how your job is structured :)
[19:33:05] mforns: you are generating the base files currently, is that it?
[19:33:21] what do you mean, base files?
[19:33:36] Files split by wiki and possibly time
[19:34:55] joal, yes, and also there's a part that is supposed to rename the files and move them to their final location with prettified names
[19:35:07] but that didn't work :'(
[19:35:18] still, I'm happy that the other part worked
[19:35:39] ok - I was thinking about this last bit and realized I might have misled you
[19:35:43] mforns: --^
[19:36:05] why?
[19:36:23] mforns: While the job runs in Spark for convenience, the renaming/moving code doesn't make use of Spark, only the HDFS API
[19:36:48] yes, I know
[19:36:54] ok - I
[19:36:59] again sorry
[19:37:16] why?
[19:37:17] ok - I had wondered if you had used Spark to read / write the files
[19:37:27] I wasn't expecting... no no
[19:37:31] which would be very inefficient
[19:37:36] cool
[19:37:45] no, it's using FileSystem
[19:37:49] fs.rename
[19:37:51] Awesome
[19:38:21] well, soon awesome, when fixed :)
[19:38:45] mforns: can you explain to me how you specified your bz2 compression?
[19:39:32] mforns: from https://yarn.wikimedia.org/cluster/app/application_1564562750409_160003
[19:39:45] mforns: Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.io.compress.Bzip2Codec not found
[19:39:48] joal, same as with gzip, but replacing org.apache.hadoop.io.compress.GzipCodec with org.apache.hadoop.io.compress.Bzip2Codec
[19:40:00] yea, that's what I saw
[19:40:36] joal, I thought that's how you would specify it, given: http://hadoop.apache.org/docs/current/api/org/apache/hadoop/io/compress/BZip2Codec.html
[19:40:39] mforns: And indeed, 6 AppMaster tries - a long time before failing
[19:40:50] ooooooooh!
[19:41:07] mforns: same page, at the very bottom
[19:41:59] mforns: Spark has an API for compression codecs
[19:42:11] for instance: df.write().option("compression","bzip2").text(filePath)
[19:42:22] I see
[19:42:42] mforns: using gzip should be done the same way
[19:42:48] joal, so it should not be specified in the SparkConf?
[19:42:56] Nope
[19:43:25] mforns: well, no - you can, I guess, but it's probably having side effects
[19:44:21] ok
[19:45:03] will do! thanks :D
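For context, a minimal Scala sketch of the pattern discussed above: writing the per-wiki dump files with the DataFrameWriter's `compression` option (instead of Hadoop codec settings in SparkConf), then prettifying names with the HDFS FileSystem API, as mforns's job does with fs.rename. The `history` DataFrame, column names, and paths are hypothetical; this is not the actual refinery/source job.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.spark.sql.DataFrame

// Assumes `history` has exactly one string column plus a "wiki_db"
// partition column; all paths below are illustrative only.
def writeAndRename(history: DataFrame, baseDir: String): Unit = {
  // Spark resolves the codec from the short name "bzip2", so there is no
  // need to spell out org.apache.hadoop.io.compress.BZip2Codec (note the
  // capital Z: the "Bzip2Codec" spelling used in the SparkConf attempt is
  // exactly what raised the ClassNotFoundException quoted above).
  history.write
    .partitionBy("wiki_db")
    .option("compression", "bzip2")
    .text(s"$baseDir/tmp")

  // The rename/move step needs only the HDFS API, not Spark.
  val fs = FileSystem.get(new Configuration())
  val src = new Path(s"$baseDir/tmp/wiki_db=enwiki/part-00000.txt.bz2")
  val dst = new Path(s"$baseDir/enwiki.txt.bz2")
  fs.rename(src, dst) // returns false instead of throwing if the move fails
}
```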
[19:45:36] Thanks mforns :)
[19:52:40] joal, then, do we need the other options in the SparkConf: spark.hadoop.mapred.output.compress, spark.hadoop.mapred.output.compression.codec, spark.hadoop.mapred.output.compression.type?
[20:03:33] mforns: nope
[20:03:43] ok :]
[20:03:58] mforns: I think those options were useful in previous Spark versions we used (1.6)
[20:04:22] But I have not tested for real (maybe I'm completely mistaken)
[20:13:27] Ok, gone for tonight, team
[20:15:34] byeeeee
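On the trailing question about the mapred.* options, a hedged sketch of why they are redundant with the Spark 2 writer, under the same caveat joal gives above (untested claim about the 1.6-era behavior). The SparkSession, sample data, and output paths are made up for illustration.

```scala
import org.apache.hadoop.io.compress.BZip2Codec
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("compression-demo").getOrCreate()
import spark.implicits._

// Spark 1.x / RDD era: compression for saveAsTextFile came from Hadoop conf
// keys such as the spark.hadoop.mapred.output.compress(.codec/.type) trio
// quoted above, or from passing the codec class explicitly:
spark.sparkContext
  .parallelize(Seq("a", "b"))
  .saveAsTextFile("/tmp/rdd_out", classOf[BZip2Codec])

// Spark 2 DataFrame writer: the "compression" option is sufficient on its
// own; the mapred.* keys in SparkConf add nothing here.
Seq("a", "b").toDF("line").write.option("compression", "bzip2").text("/tmp/df_out")
```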