[03:26:01] Analytics, Product-Analytics: Bug: Superset asking for my credentials on every page load - https://phabricator.wikimedia.org/T224159 (JKatzWMF) @Nuria It's fixed for me! 😍
[05:34:43] Analytics, Release-Engineering-Team-TODO, Continuous-Integration-Infrastructure (Slipway), Release-Engineering-Team (CI & Testing services): Migrate analytics/refinery/source release jobs to Docker - https://phabricator.wikimedia.org/T210271 (hashar)
[07:16:23] Good morning team
[07:21:35] Analytics, Analytics-Kanban, Product-Analytics: Bug: Superset asking for my credentials on every page load - https://phabricator.wikimedia.org/T224159 (Nuria)
[07:44:24] Hi nuria - Sorry for the early ping - Do you agree I should delete the tracking-failure line on Matomo?
[08:10:10] morning!
[11:20:29] Analytics, Analytics-Kanban, Wikimedia-Portals: Review all the oozie coordinators/bundles in Refinery to add alerting when missing - https://phabricator.wikimedia.org/T228747 (JAllemandou) We need to differentiate between emails sent when an error occurs (I call this error-email) and emails sent when...
[11:29:52] joal: ya, +1
[11:30:32] !log Remove tracking failure for http://Wikipedia/screen/Explore in matomo
[11:30:34] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[11:30:37] Thanks nuria :)
[11:36:39] joal: nice audit on sla alarms. do you think all critical jobs should implement SLA? or is that too much?
[11:37:29] nuria: originally we thought we would only have them for critical, but then we actually prefer to know when things don't work - So possibly implementing on all by default
[11:38:08] it's again some tedious work, but not complicated
[11:58:53] (PS1) Joal: Correct ozzie workflows for error-emails [analytics/refinery] - https://gerrit.wikimedia.org/r/532361 (https://phabricator.wikimedia.org/T228747)
[12:38:41] joal: ok, let's please just do the SLA in jobs we own, and we can open tickets for others to implement it in other jobs
[12:39:16] nuria: Aren't the jobs in refinery/oozie all owned by us?
[12:39:43] joal: some of them are addshore's I think, like wikidata*...
[12:40:24] joal: also article_recomender is an unfinished job
[12:40:40] nuria: true, I should have noted that in comment
[12:41:13] nuria: I don't like the idea that what is in refinery is not to be fully maintained by us
[12:41:19] joal: also "projectview-geo"?
[12:41:43] joal: I think those jobs did not have a place to go earlier on, we can migrate them to an outside package if need be
[12:41:51] nuria: maybe we should ask people willing to have their own job to have their own folder (as discovery does, and as we asked product-analytics)?
[12:42:50] nuria: Ok to create a ticket about the wikidata jobs
[12:43:14] joal: sure, i think there are very few there, i can only think of those two cases
[12:43:16] Hola
[12:43:29] Hi addshore - Sorry for the noise ;)
[12:43:35] ;)
[12:44:08] If they need to be in their own folder that works for us
[12:44:50] projectview-geo works AFAIK nuria
[12:45:28] nuria: I think I didn't get what you meant - projectview-geo is not one of ours?
[12:45:55] nuria: Shall I move /wmf/data/archive/geowiki to /wmf/data/archive/geowiki_legacy ?
[12:46:42] joal: sounds good, please add a note to the docs in wikitech
[12:46:46] addshore: holaaa
[12:46:53] Yes
[12:47:51] Afaik all of the Wikidata ones work, run, and are used :) (Hola today as I'm in Mexico) ;)
[13:19:37] taking a break team - see you at standup
[13:50:57] Analytics, Product-Analytics: Streamline Superset signup and authentication - https://phabricator.wikimedia.org/T203132 (Nuria) New Superset is deployed so going forward accessing superset can be done just like accessing turnilo, >Create a developer account, which involves picking both an overall usern...
[13:51:47] (CR) Nuria: [C: +2] Correct ozzie workflows for error-emails [analytics/refinery] - https://gerrit.wikimedia.org/r/532361 (https://phabricator.wikimedia.org/T228747) (owner: Joal)
[14:18:23] hey alll
[14:21:49] Analytics, Operations, Core Platform Team Legacy (Watching / External), Patch-For-Review, and 2 others: Replace and expand kafka main hosts (kafka[12]00[123]) with kafka-main[12]00[12345] - https://phabricator.wikimedia.org/T225005 (Ottomata) Heya! Yes, that link from Petr is the right one, just...
[14:24:58] Analytics, Analytics-SWAP, Product-Analytics: Upgrade all SWAP users to JupyterLab 1.0 - https://phabricator.wikimedia.org/T230724 (Ottomata) a:Ottomata
[14:26:58] hola mforns
[14:27:25] hello nuria!
[14:28:00] Analytics, Operations, Research-management, Patch-For-Review, User-Elukey: Remove computational bottlenecks in stats machine via adding a GPU that can be used to train ML models - https://phabricator.wikimedia.org/T148843 (Ottomata) > We are testing in https://phabricator.wikimedia.org/T22934...
[14:30:09] hello a-team!
[14:30:18] Am back and ready to kapow!
[14:30:21] emails checked.
[14:30:23] kablammmmm
[14:42:11] welcome back ottomata !
[14:47:50] hey ottoooo!!!
[14:47:56] welcome :]
[14:52:07] a-team the goals meeting thing conflicts with standup
[14:53:39] I kind of wanna go and I see others in the team accepted the invite, can we postpone/cancel it?
[14:54:37] fdans: i need to talk with otto about superset
[14:54:47] fdans: so i am going to go to the next edition of the meeting
[14:55:03] fdans: please send a 1-pager about Wikimania
[14:55:26] nuria: ooh, I didn't know there's another edition
[14:55:28] fdans: if you want to go to the goals meeting please do
[14:55:36] fdans: ya, there will be two
[14:55:51] fdans: this is the 1st one that is going to catch a lot of people on vacation
[14:56:47] nuria: I think I'll go to the next one then, I'll be in standup
[14:58:47] (PS2) Fdans: Transition data rows to using time ranges instead of timestamps [analytics/wikistats2] - https://gerrit.wikimedia.org/r/531148 (https://phabricator.wikimedia.org/T230514)
[15:00:36] (CR) jerkins-bot: [V: -1] Transition data rows to using time ranges instead of timestamps [analytics/wikistats2] - https://gerrit.wikimedia.org/r/531148 (https://phabricator.wikimedia.org/T230514) (owner: Fdans)
[15:02:11] ping joal standduppp
[15:02:17] Hey team - I had answered yes to today's meeting on individual goals, will skip standup
[15:03:30] joal: +1
[15:03:51] sending e-scrum now
[15:03:55] (forgot ...)
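The SLA-alerting thread above (and the error-email patch merged at 13:51 for T228747) is about making missed or failed Oozie runs visible. As a rough illustration of what that buys once SLA definitions exist on coordinators, here is a minimal sketch that polls the Oozie v2 REST SLA endpoint; the server URL, coordinator name, and JSON field names are assumptions, not confirmed by this log:

```python
import requests

# Hypothetical Oozie server URL; the real endpoint depends on the cluster config.
OOZIE_URL = "http://an-coord1001.eqiad.wmnet:11000/oozie"  # assumption

# The Oozie v2 REST API exposes SLA events; the filter syntax can vary by version.
resp = requests.get(
    f"{OOZIE_URL}/v2/sla",
    params={"timezone": "GMT", "filter": "app_name=webrequest-load-coord"},  # illustrative app name
    timeout=30,
)
resp.raise_for_status()

for event in resp.json().get("slaSummaryList", []):
    # Each entry describes one action's SLA status (MET, MISS, IN_PROCESS, ...);
    # key names follow the Oozie SLA docs and may differ on older servers.
    print(event.get("appName"), event.get("nominalTime"), event.get("slaStatus"))
```

The point made in the chat stands either way: with SLA implemented on all jobs by default, misses surface (by email or by a poll like this) without anyone having to notice a gap by hand.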
[15:17:40] Analytics: version analytics meta backup - https://phabricator.wikimedia.org/T231208 (Nuria)
[15:18:54] Analytics: Version analytics meta mysql database backup - https://phabricator.wikimedia.org/T231208 (Ottomata)
[15:20:34] Analytics, Analytics-EventLogging: Port reportupdater queries that use MySQL log eventlogging database to Hive event database - https://phabricator.wikimedia.org/T229862 (Nuria) a:fdans
[15:20:53] Analytics, Analytics-EventLogging, Analytics-Kanban: Port reportupdater queries that use MySQL log eventlogging database to Hive event database - https://phabricator.wikimedia.org/T229862 (Nuria)
[15:24:00] Analytics, Analytics-EventLogging, Analytics-Kanban: Port reportupdater queries that use MySQL log eventlogging database to Hive event database - https://phabricator.wikimedia.org/T229862 (Nuria) We will be migrating queries that have puppet timers. Format is slightly different as hive queries are wr...
[15:30:08] Analytics, Analytics-Kanban: Version analytics meta mysql database backup - https://phabricator.wikimedia.org/T231208 (fdans)
[15:30:27] Analytics, Analytics-Kanban: Version analytics meta mysql database backup - https://phabricator.wikimedia.org/T231208 (fdans) p:Triage→High
[15:31:05] Analytics, Operations, SRE-Access-Requests: Access to HUE for cchen - https://phabricator.wikimedia.org/T231111 (fdans) p:Triage→High
[15:37:55] Analytics, Product-Analytics: Update R from 3.3.3 to 3.6.0 on stat and notebook machines - https://phabricator.wikimedia.org/T220542 (Ottomata) Upgrading to Buster will now probably take us a while due to Java 8 vs Java 11 issues. See also: T231067
[15:38:46] Analytics, Product-Analytics: Update R from 3.3.3 to 3.6.0 on stat and notebook machines - https://phabricator.wikimedia.org/T220542 (Ottomata)
[15:43:33] Analytics, Analytics-SWAP, Product-Analytics: Provide Python 3.6 on SWAP - https://phabricator.wikimedia.org/T212591 (Nuria) Open→Resolved
[16:29:02] something up with HDFS? I can see a file in /mnt/hdfs on stat1007, but `hdfs dfs -ls` can't find it, and an oozie workflow is complaining it doesn't exist: hdfs:/analytics-hadoop/user/ebernhardson/discovery-analytics/2019-08-19T16.56.58+00.00-ae2f7f9/oozie/query_clicks/hourly/workflow.xml
[16:29:43] hmm, wonder why that only has hdfs:/ instead of hdfs:// ... looking
[16:30:25] :)
[16:31:02] ottomata: so the oozie.wf.application.path clearly has hdfs://, but the error report from hue to re-run the failed workflow only reports the single /
[16:31:31] https://hue.wikimedia.org/oozie/list_oozie_workflow/0004006-190822093211873-oozie-oozi-W/?coordinator_job_id=0025649-190730075836326-oozie-oozi-C
[16:32:05] it's been intermittently failing all weekend, against files that were uploaded to hdfs more than a week ago and not deleted
[16:54:01] and now it agreed to rerun one, without changing anything. :S
[16:58:40] elukey: i can see how files would be present on mnt hdfs but not on hdfs, maybe as files on mnt might be stale (totally guessing)
[16:59:15] sorry meant erik not luca
[17:12:36] well for whatever reason the 11 separate hours (non-continuous) of failures decided to re-run just fine... no clue what gremlin mucked it up
[17:33:28] Analytics, Analytics-Kanban, Product-Analytics: Bug: Superset asking for my credentials on every page load - https://phabricator.wikimedia.org/T224159 (kzimmerman) Yay! Working for me too!
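On the /mnt/hdfs vs `hdfs dfs -ls` discrepancy above, one quick way to compare the fuse mount's (possibly stale) view with what the namenode actually reports is sketched below. The path is the one from the failing workflow; the check itself is only a suggested debugging step, not something run in the log:

```python
import os
import subprocess

# Path from the failing workflow, without the hdfs://analytics-hadoop nameservice prefix.
HDFS_PATH = ("/user/ebernhardson/discovery-analytics/"
             "2019-08-19T16.56.58+00.00-ae2f7f9/oozie/query_clicks/hourly/workflow.xml")
FUSE_PATH = "/mnt/hdfs" + HDFS_PATH

# What the fuse mount thinks (this can be stale, as guessed in the chat).
print("fuse mount sees it:", os.path.exists(FUSE_PATH))

# What the namenode thinks; `hdfs dfs -test -e` exits 0 if the path exists.
rc = subprocess.run(["hdfs", "dfs", "-test", "-e", HDFS_PATH]).returncode
print("namenode sees it:", rc == 0)

# A fully qualified URI carries the scheme plus nameservice, e.g.
# hdfs://analytics-hadoop/user/...; the single-slash hdfs:/ form in the Hue
# error report may just be how Hue renders the path rather than the real cause.
```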
[18:07:13] Analytics, Analytics-Kanban, Patch-For-Review: Version analytics meta mysql database backup - https://phabricator.wikimedia.org/T231208 (Ottomata) I've got this working in the analytics cluster, but I can't get it to work with kerberos in the analytics test cluster. Here's why. The MySQL instance h...
[18:07:37] Analytics, Analytics-Kanban, Patch-For-Review: Version analytics meta mysql database backup - https://phabricator.wikimedia.org/T231208 (Ottomata)
[18:18:41] Analytics, Operations, SRE-Access-Requests: Access to HUE for cchen - https://phabricator.wikimedia.org/T231111 (mforns) @cchen You should be able to access Hue now. Please, reach out if you have any problems. Cheers!
[18:19:07] Analytics, Analytics-Kanban, Operations, SRE-Access-Requests: Access to HUE for cchen - https://phabricator.wikimedia.org/T231111 (mforns)
[18:24:41] Hi ottomata :) Nice talking to you anew :)
[18:25:08] I'll grab some of your time tomorrow to talk about pyspark packaging, python version, and all that :)
[18:28:51] Analytics, Analytics-Kanban, Operations, SRE-Access-Requests: Access to HUE for cchen - https://phabricator.wikimedia.org/T231111 (cchen) Thank you @mforns! I am able to access Hue now!
[18:47:08] Analytics, Analytics-Kanban, Patch-For-Review: Rebuild spark2 for Debian Buster - https://phabricator.wikimedia.org/T229347 (Ottomata) Weird! Pretty dumb, it is because python3-tk has `Depends: python3 (>= 3.5), python3 (<< 3.6)`. Even though python3.5 is available, apt is removing the python3.7 fr...
[18:48:54] Analytics, Analytics-Kanban, Operations, SRE-Access-Requests: Access to HUE for cchen - https://phabricator.wikimedia.org/T231111 (mforns) Cool!
[19:06:56] !log update spark2 package to -4 version with support for python3.7 across cluster. T229347
[19:06:59] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[19:06:59] T229347: Rebuild spark2 for Debian Buster - https://phabricator.wikimedia.org/T229347
[19:17:06] Container-meme: https://twitter.com/MSAdministrator/status/1165463915104542720
[19:25:07] Analytics: Upgrade pandas in spark SWAP notebooks - https://phabricator.wikimedia.org/T222301 (Ottomata) After the work for T229347, the version of pandas that pyspark2 will use is 0.25.0.
[19:25:20] Analytics, cloud-services-team (Kanban): Upgrade pandas in spark SWAP notebooks - https://phabricator.wikimedia.org/T222301 (Ottomata)
[19:25:45] Analytics, Analytics-Kanban: Upgrade pandas in spark SWAP notebooks - https://phabricator.wikimedia.org/T222301 (Ottomata)
[19:25:52] Analytics, Analytics-Kanban: Upgrade pandas in spark SWAP notebooks - https://phabricator.wikimedia.org/T222301 (Ottomata) a:Ottomata
[19:33:26] Analytics, Analytics-Kanban, Patch-For-Review: Rebuild spark2 for Debian Buster - https://phabricator.wikimedia.org/T229347 (Ottomata) I installed spark2 2.3.1-bin-hadoop2.6-4 everywhere, and now the numpy and pyarrow/pandas test in yarn works from stretch with python 3.5 and 3.7, and in Buster. Unf...
[19:33:30] Analytics: Wikistats2 time related bugs - https://phabricator.wikimedia.org/T231248 (mforns)
[19:34:16] Analytics, Analytics-Cluster: Pyspark on SWAP: Py4JJavaError: Import Error: no module named pyarrow - https://phabricator.wikimedia.org/T222254 (Ottomata) This was resolved by shipping pyarrow and other python deps with the spark2 package.
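For the "numpy and pyarrow/pandas test in yarn" mentioned on T229347, something along these lines (a sketch, not the exact test that was run) shows which Python and library versions the YARN executors actually import after the spark2 -4 package rollout:

```python
from pyspark.sql import SparkSession

# Build (or reuse) a YARN-backed session; master/config depend on the environment.
spark = SparkSession.builder.appName("python-deps-check").getOrCreate()

def report_versions(_):
    # Runs on the executors, so it reflects what the YARN containers actually import.
    import sys
    import numpy
    import pandas
    import pyarrow
    yield (sys.version.split()[0],
           numpy.__version__, pandas.__version__, pyarrow.__version__)

# Two partitions is enough to touch more than one executor.
for row in spark.sparkContext.parallelize(range(2), 2).mapPartitions(report_versions).collect():
    print("python %s, numpy %s, pandas %s, pyarrow %s" % row)
```

If the shipped dependencies are missing from an executor, the import inside `report_versions` fails with the same kind of Py4JJavaError/ImportError described in T222254, which is why running the check on the cluster rather than only on the driver matters.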
[19:34:26] Analytics, Analytics-Cluster: Pyspark on SWAP: Py4JJavaError: Import Error: no module named pyarrow - https://phabricator.wikimedia.org/T222254 (Ottomata) a:Ottomata
[19:34:33] Analytics, Analytics-Cluster, Analytics-Kanban: Pyspark on SWAP: Py4JJavaError: Import Error: no module named pyarrow - https://phabricator.wikimedia.org/T222254 (Ottomata)
[19:47:42] Analytics, Fundraising-Backlog, fundraising-tech-ops: investigate recent divot in landingpage log activity - https://phabricator.wikimedia.org/T224733 (Jgreen)
[20:23:22] Analytics, Analytics-Wikistats, Operations, Traffic, Performance-Team (Radar): Piwik JS isn't cached - https://phabricator.wikimedia.org/T230772 (kchapman)
[21:01:18] Analytics, Analytics-Cluster, Analytics-Kanban, Patch-For-Review: Upgrade Spark to 2.4.x - https://phabricator.wikimedia.org/T222253 (Ottomata) Based on the work in T229347, I built a Spark 2.4.3 .deb and tried it on stat1005. Since spark.yarn.archive is set in spark-defaults.conf to the 2.3.1...
[21:34:27] Analytics, Analytics-Kanban, Patch-For-Review: Turnilo: Remove count metric for edit_hourly data cube - https://phabricator.wikimedia.org/T230963 (mforns) a:mforns
[22:02:49] Analytics, Analytics-Kanban: Version analytics meta mysql database backup - https://phabricator.wikimedia.org/T231208 (Nuria) And (I know, paranoia) have we tested the restore from one of these backups?
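On Nuria's closing question about exercising the backups: a minimal restore smoke test could look like the sketch below. It assumes the backup is a gzipped mysqldump and that a throwaway MySQL/MariaDB instance is available on a local socket; both paths are illustrative, since the actual backup mechanism for analytics-meta isn't spelled out in the excerpt above.

```python
import subprocess

# Illustrative locations only (assumptions, not the real backup layout).
DUMP = "/srv/backups/analytics-meta/analytics-meta-latest.sql.gz"
SOCKET = "/run/mysqld/mysqld-restore-test.sock"

# Load the dump into the scratch instance.
subprocess.run(f"zcat {DUMP} | mysql --socket={SOCKET}", shell=True, check=True)

# Cheap sanity check: count the restored non-system tables per schema.
query = (
    "SELECT table_schema, COUNT(*) FROM information_schema.tables "
    "WHERE table_schema NOT IN ('mysql','information_schema','performance_schema') "
    "GROUP BY table_schema;"
)
out = subprocess.run(
    ["mysql", "--socket", SOCKET, "-N", "-e", query],
    capture_output=True, text=True, check=True,
)
print(out.stdout)
```

A real answer to the question would also compare the counts (or a few known rows, e.g. from the oozie and hue databases) against the live analytics-meta instance, so the restore is verified against expected content rather than just "it loaded without errors".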