[01:45:51] Quarry: Quarry's download as "Excel xlsx" creates an empty file last 2 days - https://phabricator.wikimedia.org/T257453 (Jarekt) Thanks for the fix
[06:09:27] Analytics, Jupyter-Hub: PySpark Error in JupyterHub: Python in worker has different version - https://phabricator.wikimedia.org/T256997 (elukey) @diego we deployed a new version of spark on stat1008 that should work, but we are not sure if jupyterhub needs more settings or not. Can you retry (possibly fr...
[06:10:08] Analytics-Clusters: PySpark Error in JupyterHub: Python in worker has different version - https://phabricator.wikimedia.org/T256997 (elukey)
[06:12:30] Analytics-Clusters, Product-Analytics: Request admin access to Superset - https://phabricator.wikimedia.org/T255207 (elukey) Open→Resolved a:elukey @cchen closing the task since I think we are done, please re-open if I am missing something :)
[09:41:59] Analytics-Clusters: Review an-coord1001's usage and failover plans - https://phabricator.wikimedia.org/T257412 (elukey) I put some thinking into the host failure use case (hw failure, an-coord1001 down for days), and about the cost of finding a workaround until a replacement arrives. This is a high level lis...
[10:27:30] * elukey lunch
[10:46:27] Analytics-Clusters: PySpark Error in JupyterHub: Python in worker has different version - https://phabricator.wikimedia.org/T256997 (diego) @elukey apparently no changes (I've restarted the server), I'm getting this error: ` Py4JJavaError: An error occurred while calling o75.count. : org.apache.spark.SparkE...
[11:29:02] (PS1) Awight: Change metric for TwoColConflict disables [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/611279 (https://phabricator.wikimedia.org/T257577)
[11:41:47] Is there any way of reconstructing the history of user_properties? Even a single snapshot of properties from a few months ago would solve my current problem.
[12:34:12] Analytics-Clusters: PySpark Error in JupyterHub: Python in worker has different version - https://phabricator.wikimedia.org/T256997 (elukey) @diego if you try on a `pyspark --master yarn` session, do you get the same result? Also, can you post in here how to repro the problem (if it is not a complicated quer...
[13:20:47] Analytics-Clusters, Operations: Segfault for systemd-sysusers.service on stat1007 - https://phabricator.wikimedia.org/T256098 (elukey) The redhat bug report leads to https://github.com/systemd/systemd/issues/6512, I followed the steps outlined in there: ` elukey@stat1007:~$ sudo gdb systemd-sysusers [.....
[13:31:13] Analytics-Clusters, Operations: Segfault for systemd-sysusers.service on stat1007 - https://phabricator.wikimedia.org/T256098 (elukey) Coreos applied a patch to libc: https://github.com/mischief/coreos-overlay/commit/19d5f42d8208334ef8581ba90e01161e00dede71
[13:39:53] Analytics-Clusters, Operations: Segfault for systemd-sysusers.service on stat1007 - https://phabricator.wikimedia.org/T256098 (elukey) ` elukey@stat1006:~$ sudo systemd-sysusers Creating group systemd-coredump with gid 490. Creating user systemd-coredump (systemd Core Dumper) with uid 490 and gid 490. Se...
[13:51:53] Analytics, Product-Analytics, Epic: API pageview counts for 'Mobile app' are incorrect since switch to mobile-html - https://phabricator.wikimedia.org/T256508 (JoeWalsh)
[14:07:33] Analytics-Clusters, Operations: Segfault for systemd-sysusers.service on stat1007 - https://phabricator.wikimedia.org/T256098 (elukey) I checked `/usr/lib/sysusers.d/*.conf` and the last user listed is `systemd-coredump`, plus we still don't use systemd-sysusers in analytics (yet).
[14:28:19] hahaha
[14:30:01] awight: you can get a snapshot from db backups. We don't sqoop in user_properties but we could add it. Maybe we should add everything from the mysql dbs even if we don't use it...
[14:51:42] Analytics-Clusters: PySpark Error in JupyterHub: Python in worker has different version - https://phabricator.wikimedia.org/T256997 (diego) @elukey the ` pyspark --master yarn ` solution means to run another notebook (in another port)? To reproduce the error you can do: ` import pandas as pd changes =...
[14:54:55] (CR) Milimetric: [V: +2] eventlogging: Remove unused props from PrefUpdate [analytics/refinery] - https://gerrit.wikimedia.org/r/588105 (https://phabricator.wikimedia.org/T249894) (owner: Krinkle)
[15:31:55] Analytics, Dumps-Generation: Sample HTML Dumps - Request for feedback - https://phabricator.wikimedia.org/T257480 (RBrounley_WMF) >>! In T257480#6292334, @ArielGlenn wrote: > Couple quick thoughts about the format: it would be good for the articles to be written into subdirectories for the larger wikis,...
[16:01:55] !log deployed, EL whitelist is updated
[16:01:56] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[16:03:46] milimetric: o/ - one qs - what was deployed? :D
[16:04:31] elukey: I did a refinery deploy without any source or restarting of jobs. The updates that had not already been deployed were scripts and jobs that have to wait to be rerun next week anyway
[16:04:42] elukey: so the only functional thing is an update to the eventlogging sanitizing whitelist
[16:05:54] milimetric: okok, maybe let's add a log like "!log refinery deployed something something, EL whitelist updated etc..
[16:06:19] I don't want to nitpick but it is hard to figure out from the SAL what was done with only "Deployed"
[16:06:30] (say if we need to track something down etc..)
[16:08:15] makes sense, will do
[16:08:38] (heh, should I be like !log last log should've mentioned I updated the EL whitelist?)
[16:10:38] Analytics, Dumps-Generation: Sample HTML Dumps - Request for feedback - https://phabricator.wikimedia.org/T257480 (Isaac) > English Wiki has 15m articles (I believe) > a full enwiki dump is clocking in at 944gb or something insanely large I'm pretty sure a large part of this issue is based on how you han...
[16:14:55] Analytics, Dumps-Generation: Sample HTML Dumps - Request for feedback - https://phabricator.wikimedia.org/T257480 (ArielGlenn) @RBrounley_WMF The 15k streams are concatenated together into one file, but it's easy to look for start of bz2 file markers (since they are byte aligned) and process multiple str...
[16:25:06] Analytics, Dumps-Generation: Sample HTML Dumps - Request for feedback - https://phabricator.wikimedia.org/T257480 (R.zhurba) @ArielGlenn regarding HTML cleaning I think about removing attributes that don't hold information, like styles and classes.
[16:55:08] Analytics, Event-Platform, Technical-blog-posts: Story idea for Blog: Wikimedia's Event Platform - https://phabricator.wikimedia.org/T253649 (srodlund) Hey hey @Ottomata would you like to schedule a meeting to discuss an initial outline to work from?
[17:10:10] !log updating the EL whitelist, refinery reploy (but not source)
[17:10:12] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[17:10:15] cc milimetric
[17:11:04] elukey: i am getting an error in jupyter that is probably a symptom of something else: "It is possible the underlying files have been updated. You can explicitly invalidate the cache in Spark by running 'REFRESH TABLE tableName' command in SQL or by recreating the Dataset/DataFrame involved."
[17:16:20] nuria: stat1008?
[17:16:26] elukey: 1007
[17:16:59] elukey: is there a good way to troubleshoot jupyter errors?
[17:17:01] mmmm the only thing that I did yesterday was to upgrade spark on stat1008, but didn't do anything on the rest
[17:17:07] elukey: k
[17:17:13] lemme check
[17:18:30] elukey: nvm restarted it all and things worked
[17:18:43] elukey: this is jupyter getting himself in a horked state
[17:19:00] ah okok!
[17:19:21] turn off and on again is always the most scientific thing to do :D
[17:22:56] if everything is ok, I'd log off for today
[17:25:12] all right, will check later to see if anything is needed :)
[17:34:06] Analytics: User entropy alarms. Evaluate thresholds - https://phabricator.wikimedia.org/T257691 (Nuria)
[17:41:12] Analytics, Product-Analytics, Epic: Add data quality alarm for mobile-app data - https://phabricator.wikimedia.org/T257692 (Nuria)
[17:45:22] Analytics, Analytics-Kanban, Product-Analytics, Epic: Add data quality alarm for mobile-app data - https://phabricator.wikimedia.org/T257692 (Nuria) a:Nuria
[17:51:59] Analytics: User entropy alarms. Evaluate thresholds - https://phabricator.wikimedia.org/T257691 (Nuria) Deviation threshold for this alarm is '10': https://github.com/wikimedia/analytics-refinery/blob/master/oozie/data_quality_stats/hourly/bundle.xml#L66 Seeing graph perhaps we need to think of lowering res...
[17:52:31] Analytics, Analytics-Kanban: User entropy alarms. Evaluate thresholds - https://phabricator.wikimedia.org/T257691 (Nuria) a:mforns
[17:56:42] Analytics, Analytics-Wikistats: "Page views by edition of Wikipedia" for each country - https://phabricator.wikimedia.org/T257071 (A455bcd9) @Aklapper Thanks a lot for reopening the task, I didn't know I could do it by myself. Thanks as well for changing the title. (I guess that's what you meant by "summ...
[17:59:52] Analytics, Analytics-Kanban, Product-Analytics, Patch-For-Review: Add dimensions to editors_daily dataset - https://phabricator.wikimedia.org/T256050 (Milimetric) p:Medium→High a:cchen→Milimetric
[18:05:38] Analytics: Discover why article "-" (redirects to Hiphen article) is one of the most accessed in ptwiki - https://phabricator.wikimedia.org/T257697 (Danilo)
[19:48:35] nuria: I'm looking at this refine failure in /wmf/data/raw/eventlogging/eventlogging_SearchSatisfaction/hourly/2020/07/10/16
[19:48:56] milimetric: aham, i have not seen it one sec
[19:49:07] nono nuria I was just saying so you don't have to look :)
[19:49:12] no worries, I'll ping you if it's anything weird
[19:49:47] milimetric: this is the schema that andrew just migrated: https://schema.wikimedia.org/#!//secondary/jsonschema/analytics/legacy/searchsatisfaction
[19:49:53] yep
[19:50:02] milimetric: so it is not unlikely the code might have a NPE that we have not seen
[19:50:41] I know, no worries, just trying to keep it off your plate and doing a bad job :)
[20:34:39] nuria: I've failed... and I don't understand how/why. If you have some time, I could use a pair to try and figure out what's going on with this job
[20:35:02] milimetric: of course, bc?
[20:35:25] omw
[20:53:46] Analytics, Dumps-Generation: Sample HTML Dumps - Request for feedback - https://phabricator.wikimedia.org/T257480 (Nicolastorzec) @RBrounley_WMF: +1 on publishing the dataset as a small number of large splittable files compressed with a splittable format. The whole download and distributed data processin...
[22:32:32] Analytics, Product-Analytics: Calculate impact of missing mobile app pageviews to high-level metrics - https://phabricator.wikimedia.org/T257373 (SNowick_WMF) (Caveat that there are known issues with how potentially accurate the pageview metric, is given the way offline pageviews are included in counts....