[08:27:06] Analytics, Analytics-Kanban: Geoeditors_private deletion scripts scheduled day conflicts with retention period - https://phabricator.wikimedia.org/T231017 (JAllemandou) To keep archive happy: this is the page: https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Data_deletion_and_sanitization T...
[08:34:46] Analytics, Analytics-Kanban, Patch-For-Review: Generate edit totals by country by month/year - https://phabricator.wikimedia.org/T215655 (JAllemandou) Thanks a lot @leila and @diego :) One question left before I can fully complete this (the middle one :), others are comments. >> - for all namespa...
[09:03:58] (PS1) Joal: Update geoditors-yearly oozie job [analytics/refinery] - https://gerrit.wikimedia.org/r/533169 (https://phabricator.wikimedia.org/T215655)
[09:16:50] (PS2) Joal: Update geoditors-yearly oozie job [analytics/refinery] - https://gerrit.wikimedia.org/r/533169 (https://phabricator.wikimedia.org/T215655)
[09:17:40] (CR) Joal: [V: +2] "Tested on cluster" [analytics/refinery] - https://gerrit.wikimedia.org/r/533169 (https://phabricator.wikimedia.org/T215655) (owner: Joal)
[09:23:45] (PS1) Joal: [WIP] Add SLA-email alerts to all oozie jobs [analytics/refinery] - https://gerrit.wikimedia.org/r/533173 (https://phabricator.wikimedia.org/T228747)
[09:27:55] (CR) Joal: [V: +2] "dry-run tested on cluster ok - following with all other jobs." [analytics/refinery] - https://gerrit.wikimedia.org/r/533173 (https://phabricator.wikimedia.org/T228747) (owner: Joal)
[10:21:39] Analytics, Analytics-Dashiki, CX-analytics, Language-analytics: The language-reportcard.wmflabs.org/cx2 chart is stuck at 2018-10-21 - https://phabricator.wikimedia.org/T208324 (Pginer-WMF)
[10:37:21] (PS2) Joal: Add SLA-email alerts to all oozie jobs [analytics/refinery] - https://gerrit.wikimedia.org/r/533173 (https://phabricator.wikimedia.org/T228747)
[12:08:34] (PS7) Fdans: Change partition structure to year/month/day/hour. [analytics/refinery] - https://gerrit.wikimedia.org/r/532725 (https://phabricator.wikimedia.org/T229817)
[12:43:33] Analytics, Analytics-Kanban, Patch-For-Review: Generate edit totals by country by month/year - https://phabricator.wikimedia.org/T215655 (Nuria) @joal: we have data for wikidata, I think there is no need to generate a file at this time.
[12:50:17] Analytics, Analytics-Kanban: Geoeditors_private deletion scripts scheduled day conflicts with retention period - https://phabricator.wikimedia.org/T231017 (Nuria) Open→Resolved
[12:56:24] Hi nuria
[12:56:47] hello joal
[12:57:21] (CR) Nuria: [C: -1] Update geoditors-yearly oozie job (4 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/533169 (https://phabricator.wikimedia.org/T215655) (owner: Joal)
[12:57:25] nuria: Do you mind explaining your comment on T215655? I don't understand "we have data for wikidata"
[12:57:25] T215655: Generate edit totals by country by month/year - https://phabricator.wikimedia.org/T215655
[12:59:07] joal: yes, sorry. since nobody has asked for edits for wikidata for gii I do not think we need to include them. Data is available for wikidata should we decide to compute this at a later time when leila talks to GII folks. Makes sense? (otherwise seems scope creep)
[12:59:46] Ok makes sense - I'm removing it but will keep the project-family parameterization if ok for you (ready for later)
[13:09:39] joal: +1 ya, sounds great
[13:12:44] (CR) Nuria: Change partition structure to year/month/day/hour. (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/532725 (https://phabricator.wikimedia.org/T229817) (owner: Fdans)
[13:13:35] Analytics: Set up automatic deletion for netflow data set in Hive - https://phabricator.wikimedia.org/T231339 (Nuria) If schema and data do not match, unhappy customer cause there is no data.
[13:13:58] Analytics, Analytics-Kanban: Version analytics meta mysql database backup - https://phabricator.wikimedia.org/T231208 (Nuria) Excellent, closing.
[13:14:10] Analytics, Analytics-Kanban: Version analytics meta mysql database backup - https://phabricator.wikimedia.org/T231208 (Nuria) Open→Resolved
[13:14:33] (PS3) Joal: Update geoditors-yearly oozie job [analytics/refinery] - https://gerrit.wikimedia.org/r/533169 (https://phabricator.wikimedia.org/T215655)
[13:14:54] (CR) Joal: [V: +2] "Thanks for the review nuria- testing the new version on cluster." (4 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/533169 (https://phabricator.wikimedia.org/T215655) (owner: Joal)
[13:16:14] Analytics, Analytics-Kanban: Upgrade pandas in spark SWAP notebooks - https://phabricator.wikimedia.org/T222301 (Nuria) Open→Resolved
[13:16:20] Analytics, Analytics-Cluster, Analytics-Kanban, Patch-For-Review: Upgrade Spark to 2.4.x - https://phabricator.wikimedia.org/T222253 (Nuria)
[13:16:24] Analytics, Analytics-Data-Quality, Analytics-Kanban: Anomalous statistics results in eu.wikipedia siteviews - https://phabricator.wikimedia.org/T212879 (Nuria) Open→Resolved
[13:16:39] Analytics, Analytics-Kanban, Product-Analytics: Bug: Superset asking for my credentials on every page load - https://phabricator.wikimedia.org/T224159 (Nuria) Open→Resolved
[13:17:36] Analytics, Analytics-Kanban, Discovery, Operations, and 2 others: Make oozie swift upload emit event to Kafka about swift object upload complete - https://phabricator.wikimedia.org/T227896 (Nuria) Open→Resolved
[13:17:40] Analytics, Discovery, Operations, Research-Backlog, Patch-For-Review: Workflow to be able to move data files computed in jobs from analytics cluster to production - https://phabricator.wikimedia.org/T213976 (Nuria)
[13:20:20] Analytics, Analytics-Kanban, Patch-For-Review: Rebuild spark2 for Debian Buster - https://phabricator.wikimedia.org/T229347 (Nuria) Open→Resolved
[13:20:24] Analytics, Analytics-Cluster, Analytics-Kanban, Patch-For-Review: Upgrade Spark to 2.4.x - https://phabricator.wikimedia.org/T222253 (Nuria)
[13:21:50] (CR) Joal: [V: +2] "Tested" [analytics/refinery] - https://gerrit.wikimedia.org/r/533169 (https://phabricator.wikimedia.org/T215655) (owner: Joal)
[13:26:22] (CR) Nuria: Add SLA-email alerts to all oozie jobs (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/533173 (https://phabricator.wikimedia.org/T228747) (owner: Joal)
[13:28:35] (CR) Nuria: [C: +1] "I think this is ready to merge!" (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/533169 (https://phabricator.wikimedia.org/T215655) (owner: Joal)
[13:35:34] ottomata: Hola! is the migration to eventgate main completed?
[13:36:02] hey team :]
[13:36:37] holaaa mforns
[13:39:16] boa taaarde mforns
[13:48:33] Analytics, Product-Analytics, Reading Depth: Reading_depth: remove eventlogging instrumentation - https://phabricator.wikimedia.org/T229042 (phuedx) a:phuedx→None
[13:55:31] Analytics, Product-Analytics, Reading Depth: Reading_depth: remove eventlogging instrumentation - https://phabricator.wikimedia.org/T229042 (phuedx) Moving this off Reader's Webs board the original request of the team has been fulfilled.
[14:05:03] Analytics: SLF4J errors when querying mediawiki_wikitext_history - https://phabricator.wikimedia.org/T231373 (JAllemandou) Explanations: By selecting all limit 1, you're not actually running a map-reduce job but rather read the file locally to print all columns. Files storing the text are compressing data a...
[14:05:14] Analytics: SLF4J errors when querying mediawiki_wikitext_history - https://phabricator.wikimedia.org/T231373 (JAllemandou) Open→Resolved
[14:12:27] Analytics: SLF4J errors when querying mediawiki_wikitext_history - https://phabricator.wikimedia.org/T231373 (Nuria) @dr0ptp4kt reducing query size (columns, where clause more explicit) will help as well. Closing.
[14:13:04] Analytics: Refine: Use Spark SQL instead of Hive JDBC - https://phabricator.wikimedia.org/T209453 (Ottomata) Yesterday I updated my pull request and pinged more Apache Spark folks review. I was told that they were in the process of completely refactoring the SQL catalog code, and the refactor will include s...
[14:14:41] Analytics, Product-Analytics: Update R from 3.3.3 to 3.6.0 on stat and notebook machines - https://phabricator.wikimedia.org/T220542 (mpopov) Would it be possible to upgrade just SWAP to Buster, or is Spark 2.3-Java 8 issue blocking the upgrade across the board (not just the Hadoop cluster)? I suspect w...
[14:21:05] ottomata: would you have a minute for me now?
[14:22:15] Analytics, Product-Analytics, Reading Depth: Reading_depth: deactivate eventlogging instrumentation - https://phabricator.wikimedia.org/T229042 (Nuria) Open→Resolved
[14:23:37] Analytics, Product-Analytics: Update R from 3.3.3 to 3.6.0 on stat and notebook machines - https://phabricator.wikimedia.org/T220542 (Ottomata) Java 11 is a big blocker to upgrading to Buster anywhere; we essentially have to update everything to Java 11 at once, otherwise Java 8 clients will likely fail...
[14:46:34] Analytics: Install Debian Buster on Hadoop - https://phabricator.wikimedia.org/T231067 (Ottomata) Confirmed that Spark 2.4.3 works with Java 8. I think we can and should upgrade to Spark 2.4.3 before we switch to Java 11 and Buster. As far as I can tell, Spark 2.4.3 also works with Java 11. I can't test...
[14:54:07] Gone for kids
[14:54:57] https://www.irccloud.com/pastebin/CfBqP542/
[14:55:08] why do we keep missing each other!!???
[15:07:17] ottomata: is the oozie workflow to move stuff to swift documented a bit in wikitech
[15:16:36] wikitech no
[15:26:28] ottomata: can we add a bit of docs so we can pass them along to adam's team?
[15:26:58] joal: do we have any docs about how to parse the dumps stored (in parquet i think) in the cluster?
[15:28:53] ottomata: docs can point to code samples from ebernhardson when using that functionality
[15:32:52] aye ok
[15:58:41] ebernhardson: o/ are you using the swift upload workflow yet? if so, can you link?
[16:00:55] (CR) Mforns: "Wrote some potentially useless comments... :D" (11 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/533173 (https://phabricator.wikimedia.org/T228747) (owner: Joal)
[16:00:56] ping ottomata standdduppppp
[16:01:56] ottomata: would be this one https://github.com/wikimedia/wikimedia-discovery-analytics/blob/master/oozie/glent/esbulk/workflow.xml
[16:02:27] ty!
[16:51:44] nuria: https://wikitech.wikimedia.org/wiki/Analytics/Systems/Exporting_from_HDFS_to_Swift
[17:15:17] ottomata: grasias
[17:19:07] nuria: about Leila geoeditors answer- Shall I create a new task and a new patch, or shall I do everything at once in the current patch?
[17:33:51] joal: wanna chat?
[17:51:36] hey ottomata :)
[17:51:47] 10 minutes before monthly metrics?
[17:52:27] yes!
[17:52:30] joal: am here
[17:52:50] ottomata: cave?
[17:53:03] k
[18:03:27] joal: just to emphasize, I don't mean to expand the scope of work. Only if it's easy, it's a nice to have. otherwise, drop it.
[18:04:12] leila: very easy (done already), but still scope creep :) Let's follow nuria's views :)
[18:04:29] ok