[05:55:17] goood morning
[06:02:40] Analytics, Analytics-SWAP: Spark Scala kernel dying under Jupyter - https://phabricator.wikimedia.org/T249761 (elukey) >>! In T249761#6053495, @awight wrote: >>>! In T249761#6053290, @elukey wrote: >> Can you please add more info about what does this mean? What error do you get? How can we reproduce? >...
[06:04:03] Morning elukey
[06:04:21] hello!
[06:05:37] How’s elukey today?
[06:06:34] RhinosF1: good, but it is the start of the week for me (yesterday I was on holiday) so I hope it doesn't get worse later on :D
[06:06:38] what about you?
[06:07:59] Not bad, bored of being stuck in the house.
[06:26:22] Analytics, ContentTranslation, Language-Team (Language-2020-Focus-Sprint): Test Performance of Marian NMT translation in stat cluster - https://phabricator.wikimedia.org/T247245 (MoritzMuehlenhoff) Out of interest, what CPU does your personal laptop use? (Ideally /proc/cpuinfo if you're on Linux). I...
[09:09:41] (CR) Thiemo Kreuz (WMDE): [C: +2] Only track unique users disabling TwoColConflict (1 comment) [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/587232 (https://phabricator.wikimedia.org/T247944) (owner: WMDE-Fisch)
[09:23:33] elukey: Hi, first thank you for the extremely generous offer to help debug my SWAP kernel! My day is broken up by IRL things so async in IRC is great for me, too. Or any other day, of course.
[09:24:05] awight: hi! sure anytime!
[09:24:29] I tested yesterday a spark yarn kernel on 1003, with a simple spark.sql query and it worked fine
[09:24:57] local though returned me nothing, that was weird
[09:25:06] oh that's great news! Let me do exactly that. firefox on linux
[09:25:26] also remember to kinit in the jupyter terminal
[09:25:31] before launching the kernel
[09:26:39] I can't log into stat1003 for some reason. maybe a typo?
[09:35:20] notebook1003 :)
[09:35:40] or stat1004->stat1008 (stat1007 doesn't have jupyter yet)
[09:35:49] elukey: It works perfectly ^_^' Must have been failing to kinit in the right order, apologies for the noise.
[09:36:09] awight: goooooooood. local or yarn?
[09:36:16] also, should I update the docs?
[09:36:26] Just to help other people in case
[09:37:52] I'm getting results with both yarn and local.
[09:38:03] \o/
[09:42:13] The docs seem right, too. I'll go invalidate the ticket now :-)
[09:43:33] Analytics, Analytics-SWAP: Spark Scala kernel dying under Jupyter - https://phabricator.wikimedia.org/T249761 (awight) Open→Invalid >>! In T249761#6053901, @elukey wrote: > - Do you authenticate via kinit using the Jupyter terminal? (before launching the kernel) Solved! Thanks for your help, I...
[09:56:46] Pretty sure this is dramatically faster than the pyspark kernel :100%:
[10:09:04] definitely
[10:51:08] Analytics: Add Authentication/Encryption to Kafka Jumbo's consumer - https://phabricator.wikimedia.org/T250146 (elukey) p:Triage→High
[10:51:37] Analytics: Add Authentication/Encryption to Kafka Jumbo's consumer - https://phabricator.wikimedia.org/T250146 (elukey)
[10:51:40] Analytics, Analytics-Kanban, Operations, netops: Move netflow to TLS encryption/authentication via librdkafka - https://phabricator.wikimedia.org/T248980 (elukey)
[10:53:04] I can't ssh into stat1003 either today
[10:53:35] my ssh config has me go through bast3004
[10:54:46] gilles@bast3004:~$ ping stat1003.eqiad.wmnet
[10:54:46] ping: stat1003.eqiad.wmnet: Name or service not known
[10:55:28] Analytics: Add Authentication/Encryption to Kafka Jumbo's consumer - https://phabricator.wikimedia.org/T250146 (elukey)
[10:56:13] Analytics: Add Authentication/Encryption to Kafka Jumbo's consumer - https://phabricator.wikimedia.org/T250146 (elukey)
[10:58:59] Analytics, Product-Analytics (Kanban): Add TLS encryption support to Kafkatee and enable it where possible - https://phabricator.wikimedia.org/T250147 (elukey)
[10:59:21] Analytics, Product-Analytics (Kanban): Add TLS encryption support to Kafkatee and enable it where possible - https://phabricator.wikimedia.org/T250147 (elukey) https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/588086/ https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/588015/
[11:01:56] Analytics: Establish if Camus can support TLS encryption + Authentication to Kafka with a minimal code change - https://phabricator.wikimedia.org/T250148 (elukey)
[11:02:38] gilles: hey, stat1003 was decommed a long time ago - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Clients
[11:02:45] ah ok
[11:06:04] Analytics: Enable TLS encryption from Eventgate Analytics to Kafka Jumbo - https://phabricator.wikimedia.org/T250149 (elukey)
[11:06:53] * elukey lunch!
[11:11:06] * joal is very happy to see others using scala notebooks! Thanks awight )n
[11:40:59] joal: +1, it's awesome that I can now do experiments, then copy the winning code straight into my postprocessing library.
[11:41:07] \o/
[11:41:21] awight: also, I have a graphing package for you if interested ;)
[11:43:10] All ears...
[11:43:45] I tried the Brunel snippet on wikitech:SWAP but it gives multiple syntax errors for some reason.
[11:44:02] awight: the drawback is that it involves creating dedicated kernels in order to import a jar
[11:44:54] That might be convenient for me anyway, to access the library I'm building.
[11:45:01] indeed
[11:45:31] awight: Actually it's a good idea for me to document that - I'm gonna update the swap page on wiki :0
[11:46:06] Analytics, Analytics-Kanban, Patch-For-Review: Druid access for view on event.editeventattempt - https://phabricator.wikimedia.org/T249945 (fdans) Just did a test ingestion of one day with the properties specified in the puppet patch, everything seems good! Feel free to check the datasource in Turnil...
[11:46:10] awight: The tool is vegas: https://github.com/vegas-viz/Vegas
[11:46:48] awight: I have updated a jar to include this one (https://mvnrepository.com/artifact/org.vegas-viz/vegas-spark)
[11:47:32] And loaded that jar onto a personal kernel (I used the doc here to copy the scala-spark kernel: https://wikitech.wikimedia.org/wiki/SWAP#Permanent_custom_Spark_Notebook_kernels)
[11:47:44] and then updated that copied kernel to include the jar
[11:48:08] And finally I can use that to do some graphing (not perfect, but better than nothing)
[11:48:37] Ah nice, this package seems to have been popular for a long time now. Activity is looking low lately, but at least I'll find lots to copy and paste :-)
[11:49:14] awight: I use a lot https://github.com/vegas-viz/Vegas/blob/master/core/src/test/scala/vegas/fixtures/BasicPlots.scala and https://github.com/vegas-viz/Vegas/blob/master/core/src/test/scala/vegas/fixtures/VegasPlots.scala
[11:49:43] Cookbooks! Thanks
[11:49:53] ;)
[11:50:37] awight: on my dream-list there is making sure the package gets updated with newer vega versions
[11:51:03] awight: but there is too much in that list :)
[11:52:30] Stable versions have their benefits, too!
[11:57:24] Analytics: Explore in jupyter notebook whether the raw pageview timeseries can help on outage/censhorsip automatic detection - https://phabricator.wikimedia.org/T249849 (mforns) a:mforns
[12:06:03] joal: bonjour :)
[12:06:10] bonjour elukey :)
[12:06:21] when you have time we'd still need to roll out the kernels to all hosts
[12:06:30] IIRC the newer ones are only on 1004 now
[12:07:02] elukey: have we patched the kernels so that the typo I made has been corrected?
[12:07:18] I can't recall :(
[12:07:21] Let's see
[12:08:57] Analytics: Kerberos-run-command doesn't work with spark-submit [workaround] - https://phabricator.wikimedia.org/T250161 (fdans)
[12:11:53] Analytics, Event-Platform, Inuka-Team (Kanban), KaiOS-Wikipedia-app (MVP): Capture and send back client-side errors - https://phabricator.wikimedia.org/T248615 (AMuigai)
[12:12:42] It has, awesome - elukey we're good to go
[12:15:36] :)
[12:15:53] ok so I'll do it later on!
[12:18:17] Analytics, Inuka-Team, Language-strategy, Tool-Pageviews: Have a way to show the most popular pages per country - https://phabricator.wikimedia.org/T207171 (AMuigai) @Nuria checking where this is on your list of priorities? This is still a feature we would like to use for new readers, more-so in...
[12:24:28] good afternoon!
[12:25:09] elukey: when you ever get into a mood for some shell script review, I got a couple patches for the refinery deploy repo that could use your blessing :-] https://gerrit.wikimedia.org/r/#/c/analytics/refinery/+/585503/ and https://gerrit.wikimedia.org/r/#/c/analytics/refinery/+/585505/
[12:25:10] Analytics, Analytics-Kanban, Product-Analytics: Users having issues with presto dashboards on superset - https://phabricator.wikimedia.org/T249923 (elukey) The next step is to figure out how to handle the extra permissions needed for all users. Options are: * add sqllab's perms to the Alpha role in S...
[12:26:32] Analytics: Deprecate /mnt/hdfs from the Analytics infra - https://phabricator.wikimedia.org/T241040 (elukey) Open→Declined After introducing the hdfs-rsync tool, all the mount points have been really stable. I am inclined to close this task and re-open if needed.
[12:26:34] Analytics, Analytics-Kanban: Analytics Ops Technical Debt - https://phabricator.wikimedia.org/T240437 (elukey)
[12:31:45] Analytics, Analytics-Kanban, Patch-For-Review, User-Elukey: Add SWAP profile to stat1005 - https://phabricator.wikimedia.org/T245179 (elukey) @mpopov hi! Are you using the SparkR kernels that we have on SWAP by any chance? I am asking because on stat100[5,8], our Buster nodes, we have upgraded to...
[12:32:54] (CR) Elukey: [C: +2] Fix wget/chmod to use commit-msg, not the directory [analytics/refinery] - https://gerrit.wikimedia.org/r/585503 (owner: Hashar)
[12:34:14] hashar: hi! qq - is pipefail needed currently?
[12:35:10] elukey: apparently not yet, I guess I have added it out of habit :]
[12:35:22] I always start my shell scripts with -eux -o pipefail
[12:35:34] super strict :D
[12:35:44] yeah
[12:35:51] that has saved a few times though
[12:35:53] I guess it doesn't really hurt, let's do it
[12:36:02] (CR) Elukey: [C: +2] Make update-refinery-source-jars more strict [analytics/refinery] - https://gerrit.wikimedia.org/r/585505 (owner: Hashar)
[12:36:06] and the -eu seems to work all fine
[12:36:08] (PS2) Elukey: Fix wget/chmod to use commit-msg, not the directory [analytics/refinery] - https://gerrit.wikimedia.org/r/585503 (owner: Hashar)
[12:36:15] (CR) Elukey: [V: +2 C: +2] Fix wget/chmod to use commit-msg, not the directory [analytics/refinery] - https://gerrit.wikimedia.org/r/585503 (owner: Hashar)
[12:36:21] (PS2) Elukey: Make update-refinery-source-jars more strict [analytics/refinery] - https://gerrit.wikimedia.org/r/585505 (owner: Hashar)
[12:36:25] (CR) Elukey: [V: +2 C: +2] Make update-refinery-source-jars more strict [analytics/refinery] - https://gerrit.wikimedia.org/r/585505 (owner: Hashar)
[12:36:27] I am going to migrate the update-jars job to use a Docker container as well
[12:36:39] done!
[12:36:44] thank you !!!
[12:36:46] thanks a lot for all the work that you are doing :)
[12:37:00] hashar: --^
[12:37:40] and it is only the tip of the iceberg!
[12:42:18] Analytics: Superset: Repeatedly asking to re-log in - https://phabricator.wikimedia.org/T249824 (elukey) Folding this task into, we'd need to find a different solution for all users, but we hope to fix it soon :)
[12:42:31] Analytics: Check home/HDFS leftovers of anomie - https://phabricator.wikimedia.org/T250167 (MoritzMuehlenhoff)
[12:42:39] Analytics, Analytics-Kanban, Product-Analytics: Users having issues with presto dashboards on superset - https://phabricator.wikimedia.org/T249923 (elukey)
[12:42:41] Analytics: Superset: Repeatedly asking to re-log in - https://phabricator.wikimedia.org/T249824 (elukey)
[12:42:52] Analytics, Analytics-Kanban, Product-Analytics: Users having issues with presto sqllab on superset - https://phabricator.wikimedia.org/T249923 (elukey)
[12:43:35] Analytics, Analytics-Kanban, Product-Analytics: Users having issues with presto sqllab on superset - https://phabricator.wikimedia.org/T249923 (elukey)
[12:44:21] Analytics: Check home/HDFS leftovers of flemmerich - https://phabricator.wikimedia.org/T246070 (elukey) @Isaac @flemmerich ping :)
[13:12:29] Analytics: Enable TLS encryption from Eventgate Analytics to Kafka Jumbo - https://phabricator.wikimedia.org/T250149 (Ottomata) OH! I didn't realize I didn't have this enabled for eventgate-analytics! ...and it also needs enabled for eventgate-main???. I really thought I had this everywhere. Looking forw...
[13:12:55] Analytics: Add Authentication/Encryption to Kafka Jumbo's clients - https://phabricator.wikimedia.org/T250146 (Ottomata)
[13:15:17] Analytics: Enable TLS encryption from Eventgate Analytics to Kafka Jumbo - https://phabricator.wikimedia.org/T250149 (elukey) >>! In T250149#6054907, @Ottomata wrote: > OH! I didn't realize I didn't have this enabled for eventgate-analytics! ...and it also needs enabled for eventgate-main???. I really tho...
[13:20:23] (CR) Ottomata: "Sorry this is such a big commit! It was much easier to do the refactoring and new stuff together here in one commit rather than rebasing " [analytics/refinery/source] - https://gerrit.wikimedia.org/r/586447 (https://phabricator.wikimedia.org/T238230) (owner: Ottomata)
[13:20:33] joal: o/ if you have some time https://gerrit.wikimedia.org/r/c/analytics/refinery/source/+/586447 is ready for review
[13:20:53] i had trouble with a nullpointerexception via the geocode_ip hive udf
[13:21:04] so I worked around it by still doing the map partitions thing there
[13:21:06] Analytics: Check home/HDFS leftovers of anomie - https://phabricator.wikimedia.org/T250167 (elukey) ` ====== stat1004 ====== total 8 drwxrwxr-x 2 2248 wikidev 4096 Oct 12 2019 out drwxrwxr-x 2 2248 wikidev 4096 Sep 10 2019 queries ls: cannot access '/var/userarchive/anomie.tar.bz2': No such file or directo...
[13:25:28] Analytics, Event-Platform, Services (watching): Failure in EventBus schema for mediawiki/revision/visibility-change - https://phabricator.wikimedia.org/T187362 (Ottomata) Stalled→Declined Lots of things have changed since 2018, so this will be hard to reproduce. Feel free to reopen if it hap...
[13:26:24] Analytics, Event-Platform, Growth-Team, Patch-For-Review, and 2 others: Add fields needed by ERI to mediawiki.revision-create - https://phabricator.wikimedia.org/T145164 (Ottomata) Open→Declined No movement since 2016. Closing.
[13:26:53] Analytics, ChangeProp, Event-Platform, Growth-Team, and 3 others: Set up the foundation for the ReviewStream feed - https://phabricator.wikimedia.org/T143743 (Ottomata) Open→Declined No movement since 2016. Closing.
[13:43:33] ottomata: without using UDFs, we're back to iterating twice over the dataframe I guess
[13:44:56] OH i forgot about that
[13:44:58] right.
[13:45:16] ottomata: may I help with trying to make the UDF work?
[13:45:19] yeah unless i do ALL transforms in the same map partition
[13:45:23] yes joal if we can figure that out
[13:45:30] nuria said you made some changes to the geocode udf
[13:45:35] joining the cave now
[13:45:38] a few months ago also
[13:45:38] ok
[14:21:02] all kafkatee instances are using TLS \o/
[14:24:45] Analytics, Event-Platform, Inuka-Team: Create KaiOS error stream - https://phabricator.wikimedia.org/T250177 (SBisson)
[14:28:25] Analytics, Event-Platform, Inuka-Team (Kanban), KaiOS-Wikipedia-app (MVP): Capture and send back client-side errors - https://phabricator.wikimedia.org/T248615 (SBisson) p:Triage→Medium
[14:28:57] Analytics, Analytics-SWAP, Product-Analytics: Provide Python 3.6+ on SWAP - https://phabricator.wikimedia.org/T212591 (elukey)
[14:29:24] Analytics, Analytics-SWAP, Product-Analytics: Provide Python 3.6+ on SWAP - https://phabricator.wikimedia.org/T212591 (elukey) Open→Resolved a:elukey Please log on stat1005 or stat1008 to get a Python 3.7 venv :)
[14:33:59] * elukey afk for a bit!
[14:56:56] (PS1) Joal: Update hive geocoded-data udf [analytics/refinery/source] - https://gerrit.wikimedia.org/r/588715
[15:07:22] Analytics, Operations, Research, Traffic: Wikipedia Accessibility, check false positives and false negatives of traffic alarms - https://phabricator.wikimedia.org/T245166 (Nuria) closing as this is happening as part of our monthly sync up.
[15:08:54] Analytics, Operations, Research, Traffic: Wikipedia Accessibility, check false positives and false negatives of traffic alarms - https://phabricator.wikimedia.org/T245166 (Nuria) Open→Resolved
[15:10:50] Analytics, Analytics-EventLogging, Timeless: EventLogging revision popup gets hidden behind content in Timeless - https://phabricator.wikimedia.org/T249557 (Izno) @Milimetric #timeless
[15:13:22] Analytics, Analytics-EventLogging, Timeless: EventLogging revision popup gets hidden behind content in Timeless - https://phabricator.wikimedia.org/T249557 (Milimetric) Declined→Open oh, doh! Sorry I did know that. Hm... patches welcome, I can try to review, we're still quite swamped. I'll...
[15:15:10] Analytics, Inuka-Team, Language-strategy, Tool-Pageviews: Have a way to show the most popular pages per country - https://phabricator.wikimedia.org/T207171 (Nuria) The bot detection (that is a prerequisite to this work) will be rolled out in the next couple of weeks. Now, we do not have plans to...
[15:19:31] Analytics, Analytics-Kanban, Continuous-Integration-Infrastructure (phase-out-jessie), Patch-For-Review, and 2 others: Migrate analytics/refinery/source release jobs to Docker - https://phabricator.wikimedia.org/T210271 (hashar) A couple patches in the refinery repo got merged earlier today. I s...
[15:34:50] Analytics, Analytics-Kanban, Patch-For-Review: Druid access for view on event.editeventattempt - https://phabricator.wikimedia.org/T249945 (Nuria) Looked at data and it looks good, if @dr0ptp4kt (let's give him a day to respond) sees no issues then let's merge cc @kaldari
[15:38:28] Analytics, Analytics-Data-Quality, Analytics-EventLogging, WikiEditor, Mobile: WikiEditor records all edits as desktop edits in EventLogging - https://phabricator.wikimedia.org/T249944 (Jdlrobson)
[15:41:04] (CR) Ottomata: "Still WIP!" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/588715 (owner: Joal)
[15:43:37] (CR) Mforns: "Code looks good to me! Numbro seems more verbose, but more readable. I like it :]" [analytics/wikistats2] - https://gerrit.wikimedia.org/r/585725 (https://phabricator.wikimedia.org/T199386) (owner: Fdans)
[15:46:33] Analytics, Operations: Broken /a/refinery-source/guard/run_all_guards.sh script on stat1002 - https://phabricator.wikimedia.org/T166937 (fgiunchedi) Open→Resolved a:fgiunchedi Boldly resolving, the class has been removed from puppet in I830a80fd7eb
[15:59:45] ping fdans ottomata
[16:00:02] I like it early ping!
[16:00:10] Analytics, Analytics-Data-Quality, Analytics-EventLogging, Product-Analytics, and 3 others: WikiEditor records all edits as desktop edits in EventLogging - https://phabricator.wikimedia.org/T249944 (JTannerWMF) Putting this on the Editing team's radar
[16:03:59] (PS2) Joal: [WIP] Update hive geocoded-data udf [analytics/refinery/source] - https://gerrit.wikimedia.org/r/588715
[16:06:46] (CR) jerkins-bot: [V: -1] [WIP] Update hive geocoded-data udf [analytics/refinery/source] - https://gerrit.wikimedia.org/r/588715 (owner: Joal)
[16:11:05] Analytics, Analytics-Data-Quality, Analytics-EventLogging, Product-Analytics, and 3 others: WikiEditor records all edits as desktop edits in EventLogging - https://phabricator.wikimedia.org/T249944 (DLynch) This is slightly more complicated than just making WikiEditor detect whether it's being ac...
[16:17:08] joal: elukey: The scala kernel has accelerated my work like crazy, thanks again! I'm copy-pasting in both directions like crazy, it's so nice to get IDE completion etc.
[16:17:23] \o/
[16:29:16] \o/
[16:30:38] Analytics, Analytics-Kanban, Patch-For-Review: Fix non MapReduce execution of GeoCode UDF - https://phabricator.wikimedia.org/T238432 (JAllemandou) Resolved→Open
[16:37:25] (PS3) Joal: Update hive geocoded-data udf [analytics/refinery/source] - https://gerrit.wikimedia.org/r/588715
[16:52:18] (CR) Joal: "Tested on cluster - hive, hive-local, spark, spark-local \o/" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/588715 (owner: Joal)
[17:02:47] (CR) Ottomata: Update hive geocoded-data udf (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/588715 (owner: Joal)
[17:14:29] Analytics, Analytics-Kanban, Product-Analytics: Users having issues with presto sqllab on superset - https://phabricator.wikimedia.org/T249923 (Nuria) >add to existing users the sqllab role (should be easy with a db alter table) and then figure out how to add the role upon user's first login (the Alp...
[17:18:10] * elukey off!
[17:19:32] Analytics, Event-Platform, MediaWiki-extensions-CentralNotice, MediaWiki-extensions-WikibaseRepository, and 3 others: EventBus jobs failing heavily because of CentralNotice and WikibaseRepo - https://phabricator.wikimedia.org/T225195 (awight) Open→Resolved a:awight >>! In T225195#5382...
[17:28:33] Analytics, Analytics-Kanban, Patch-For-Review: Druid access for view on event.editeventattempt - https://phabricator.wikimedia.org/T249945 (dr0ptp4kt) Looking good to me. Looking forward to the fuller data! I'm subscribed on {T208589} for any next step on a v2 of this.
[17:41:58] Analytics, Analytics-Data-Quality, Analytics-EventLogging, Product-Analytics, and 3 others: WikiEditor records all edits as desktop edits in EventLogging - https://phabricator.wikimedia.org/T249944 (kaldari) @Mayakp.wiki - How does Turnillo split between "Mobile web" editing and "Other" editing i...
[17:54:26] (CR) Joal: Update hive geocoded-data udf (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/588715 (owner: Joal)
[17:58:43] (PS4) Joal: Update hive geocoded-data udf [analytics/refinery/source] - https://gerrit.wikimedia.org/r/588715
[18:01:39] Analytics, Analytics-Kanban, Patch-For-Review: Newpyter - SWAP Juypter Rewrite - https://phabricator.wikimedia.org/T224658 (kzimmerman) @Ottomata our team is super interested in this and we'd love to provide more feedback in addition to the comment Shay made about sharing. What's your timeline/when s...
[18:01:58] joal: it worked! :)
[18:02:03] \o/ !
[18:02:04] going to re-patch to use the udf now
[18:02:10] Happy me :)
[18:02:20] sorry for the change-dechange-rechange
[18:02:23] ottomata: --^
[18:05:56] Analytics, Analytics-Kanban, Patch-For-Review: Newpyter - SWAP Juypter Rewrite - https://phabricator.wikimedia.org/T224658 (Ottomata) Hiya! I will likely begin working on this in the next week or two. There's a lot of backend infra work to be done before it will be useable, and I'm sure things will...
[18:15:26] (CR) Ottomata: Update hive geocoded-data udf (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/588715 (owner: Joal)
[18:16:04] joal: one comment, i think this factory -> new reader(factory) -> factory.getReader chain is a bit crazy but i dunno ok java best practices let's not go there
[18:16:47] ottomata: Give me some time to sync with nuria on this - I also have the feeling we should remove the factory given how the code is now formatted
[18:16:54] i really don't know what the point of the factory is
[18:16:54] yeah
[18:16:57] ok
[18:17:28] ottomata: not commenting more on your CR comment, will make sure the road is correct first :)
[18:17:52] i rebased my stuff on an earlier patch of yours, i'll submit it now using it and rebase on your final one later
[18:18:01] ack
[18:18:35] i think you can comment on mine now, the code will be the same in mine independent of what happens in yours
[18:18:38] since it just uses the hive udf now
[18:18:47] trying to submit...
[18:18:51] ok makes sense
[18:21:20] (PS7) Ottomata: Unify Refine transform functions and add user agent parser transform [analytics/refinery/source] - https://gerrit.wikimedia.org/r/586447 (https://phabricator.wikimedia.org/T238230)
[18:21:51] ok submitted joal https://gerrit.wikimedia.org/r/c/analytics/refinery/source/+/586447
[18:22:08] will read that
[18:43:55] (PS8) Ottomata: Unify Refine transform functions and add user agent parser transform [analytics/refinery/source] - https://gerrit.wikimedia.org/r/586447 (https://phabricator.wikimedia.org/T238230)
[18:55:25] Analytics, Analytics-Kanban, Patch-For-Review: Druid access for view on event.editeventattempt - https://phabricator.wikimedia.org/T249945 (Nuria) @dr0ptp4kt to be super clear, the data ingested will look as it does on the link francisco provided as we do not have plans to work on the map columns ing...
[19:01:37] joal: i know ottomata does not like the factory (i heard you!) but the point is to make initialization easier, right now with these changes the maxmind getter is no longer a singleton
[19:01:56] nuria: but what did the singleton even do?
[19:01:59] it didn't hold any state
[19:06:02] nuria: do you want to discuss in batcave?
[19:06:22] joal: sorry on meeting
[19:06:28] np nuria
[19:13:25] Analytics, Analytics-EventLogging, Event-Platform, Patch-For-Review: Vertical: Migrate SearchSatisfaction EventLogging event stream to Event Platform - https://phabricator.wikimedia.org/T249261 (Ottomata) Parking some thoughts. Once the supporting patches are merged and deployed, we can set `"S...
[19:21:12] (CR) Mforns: Unify Refine transform functions and add user agent parser transform (2 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/586447 (https://phabricator.wikimedia.org/T238230) (owner: Ottomata)
[19:21:40] (CR) Mforns: "Didn't give a +1, because it seems you're still adding changes." [analytics/refinery/source] - https://gerrit.wikimedia.org/r/586447 (https://phabricator.wikimedia.org/T238230) (owner: Ottomata)
[19:25:58] Analytics, Analytics-Kanban, Patch-For-Review: Druid access for view on event.editeventattempt - https://phabricator.wikimedia.org/T249945 (dr0ptp4kt) @Nuria thanks - good to know!
[19:27:31] Analytics, Operations, Wikimedia-General-or-Unknown, Readers-Web-Backlog (Needs Product Owner Decisions), SEO: Yoruba Language Wikipedia not being indexed by search engines - https://phabricator.wikimedia.org/T236241 (ovasileva)
[19:31:56] (PS9) Ottomata: Unify Refine transform functions and add user agent parser transform [analytics/refinery/source] - https://gerrit.wikimedia.org/r/586447 (https://phabricator.wikimedia.org/T238230)
[19:32:27] (CR) Ottomata: Unify Refine transform functions and add user agent parser transform (2 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/586447 (https://phabricator.wikimedia.org/T238230) (owner: Ottomata)
[19:33:13] (CR) Ottomata: "I don't expect this file to change anymore (significantly anyway). We'll need to rebase on some of Joseph's stuff as it changes, but this" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/586447 (https://phabricator.wikimedia.org/T238230) (owner: Ottomata)
[19:38:21] Analytics, Analytics-Kanban, Patch-For-Review: Druid access for view on event.editeventattempt - https://phabricator.wikimedia.org/T249945 (kaldari) Looking great so far. Would it be possible to add a description for this dashboard in Turnillo (similar to the other dashboards), something like: "Sampl...
[19:42:01] mforns: how's this one look to ya?
[19:42:02] https://gerrit.wikimedia.org/r/c/schemas/event/secondary/+/585581
[19:42:28] ottomata: was reviewing it right now
[19:42:36] :D
[19:42:50] I mean... looks good to me! not sure I can spot any problems though just by looking
[19:43:03] I like that you can put example events within the schema :]
[19:43:23] why is that the current is only in yaml?
[19:44:05] also, didn't know about the http field, it's cool
[19:48:12] mforns: have you seen https://wikitech.wikimedia.org/wiki/Event_Platform/Schemas ?
[19:48:28] more specifically
[19:48:28] https://wikitech.wikimedia.org/wiki/Event_Platform/Schemas#Creating_a_new_schema
[19:49:08] current.yaml is the main working copy
[19:49:15] it is the only one that should be manually edited
[19:49:19] all the other files are generated from it
[19:49:23] aha
[19:49:32] I saw this docs, but didn't read in depth
[19:50:31] ya i plan to write ones that are more of a tutorial
[19:50:34] for EL analytics stuff
[19:53:44] Analytics, Analytics-Kanban, Patch-For-Review: Druid access for view on event.editeventattempt - https://phabricator.wikimedia.org/T249945 (kaldari) To answer the question above, looks like the data in the `edits_hourly` dashboard comes from the database and mostly relies on `revision_tags`.
[19:53:47] mforns: ya it is hard to review something migrated over like this, mostly i just wanted your +1 especially on the namespacing as we discussed
[19:53:57] general plan is here:
[19:53:58] https://phabricator.wikimedia.org/T249261#6056644
[19:55:05] Analytics, Analytics-Data-Quality, Analytics-EventLogging, Product-Analytics, and 3 others: WikiEditor records all edits as desktop edits in EventLogging - https://phabricator.wikimedia.org/T249944 (kaldari) @Mayakp.wiki - Nevermind, I figured it out. It looks like the edits_hourly dashboard reli...
[19:55:51] ottomata: yes, namespacing looks great, I think there will be no perfect namespacing, anything we choose will include some cases and will exclude others, but it seems to me that analytics/legacy/blah is pretty simple, accurate, and hopefully will make sense in the future when new schemas are added
[19:56:09] and new stuff will be just analytics/
[19:56:10] e.g.
[19:56:23] analytics/mediaview
[19:56:26] analytics/session_ping
[19:56:28] no subfolder?
[19:56:29] whatever
[19:56:33] project?
[19:56:37] like
[19:56:37] if people want sure
[19:56:47] analytics/editing/blah
[19:56:47] i mostly don't want to babysit it :p
[19:56:51] yea yea
[19:56:58] stuff in analytics/ esp folks can decide whatever they want
[19:57:06] yes, makes sense
[19:57:39] ok, if you +1 i'll merge that one, hopefully I can get timo to merge or +2 https://gerrit.wikimedia.org/r/c/mediawiki/extensions/EventLogging/+/585587 today
[19:57:46] then we'll have what we need in beta to start producing events there
[19:58:02] Refine stuff is only for prod anyway, so we'll have to wait a train or two before we do more
[19:58:16] joal:, ottomata : let's see the singleton ensures 1 instance is created, now in our context we are creating a number of instances
[19:58:44] joal, ottomata : i probably do not have time today to talk about this but i can comment on CR later on today and catch up tomorrow
[19:59:02] ottomata: I gave a +1 :], good luck with the other one.
[19:59:05] danke
[19:59:15] nuria: heheh, but one instance of what?! it didn't do anything! :p
[19:59:46] it just gave you an object that you could call methods on that did things
[20:00:40] not sure why MaxmindGeocoderFactory.getInstance().getISPReader().geocode() is better than new ISPReader().geocode
[20:00:56] if all MaxmindGeocoderFactory.getInstance().getISPReader() does is call new ISPReader() anyway
[20:01:03] Analytics, Analytics-Data-Quality, Analytics-EventLogging, Product-Analytics, and 3 others: WikiEditor records all edits as desktop edits in EventLogging - https://phabricator.wikimedia.org/T249944 (kaldari) >... we'd probably still want to differentiate between MobileFrontend and WikiEditor. As-...
[20:01:20] Analytics, Analytics-Data-Quality, Analytics-EventLogging, Product-Analytics, and 3 others: WikiEditor records all edits as platform = desktop in EventLogging - https://phabricator.wikimedia.org/T249944 (kaldari)
[20:05:43] Analytics, Analytics-Kanban, Patch-For-Review: Druid access for view on event.editeventattempt - https://phabricator.wikimedia.org/T249945 (Nuria) @kaldari edits_hourly is a denormalized version of the data in mediawiki updated monthly (the denormalization takes couple days of processing time), Once...
[20:13:30] Analytics, Analytics-Data-Quality, Analytics-EventLogging, Product-Analytics, and 3 others: WikiEditor records all edits as platform = desktop in EventLogging - https://phabricator.wikimedia.org/T249944 (kaldari) FYI, MobileFrontend has code to detect phones and tablets in `UADeviceDetector.php`....
[21:46:02] hue says no? :(
[21:46:16] java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
[21:58:33] ooh, the docs might help me :D https://wikitech.wikimedia.org/wiki/Analytics/Cluster/Hue
[21:59:33] YAY!
[22:02:31] Analytics, Research: Check home/HDFS leftovers of flemmerich - https://phabricator.wikimedia.org/T246070 (leila)
[22:02:50] Analytics, Research: Check home/HDFS leftovers of flemmerich - https://phabricator.wikimedia.org/T246070 (leila) a:Isaac
[23:37:28] Analytics, Analytics-Data-Quality, Analytics-EventLogging, Product-Analytics, and 3 others: WikiEditor records all edits as platform = desktop in EventLogging - https://phabricator.wikimedia.org/T249944 (Mayakp.wiki) Hi @kaldari , you are right. platform in the edits_hourly dashboard is derived f...