[03:44:14] (PS1) Nuria: [WIP] chopping timeseries for noise detection [analytics/refinery/source] - https://gerrit.wikimedia.org/r/612454 (https://phabricator.wikimedia.org/T257691)
[03:46:05] (CR) Nuria: [C: -1] "Need to test change with data to make sure alarms are what we want them to be" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/612454 (https://phabricator.wikimedia.org/T257691) (owner: Nuria)
[03:47:24] (CR) jerkins-bot: [V: -1] [WIP] chopping timeseries for noise detection [analytics/refinery/source] - https://gerrit.wikimedia.org/r/612454 (https://phabricator.wikimedia.org/T257691) (owner: Nuria)
[06:52:12] goood morning
[07:00:41] Analytics-Clusters: PySpark Error in JupyterHub: Python in worker has different version - https://phabricator.wikimedia.org/T256997 (elukey) @diego the error happens on stat1008 and surely on stat1005 since they are Debian Buster nodes, running Python 3.7, meanwhile the other stats are still on Stretch runni...
[07:19:08] good morning team
[07:20:18] bonjour!
[07:20:25] Bonjour elukey :)
[07:27:31] Analytics-Clusters: PySpark Error in JupyterHub: Python in worker has different version - https://phabricator.wikimedia.org/T256997 (elukey) Andrew's fix for PySpark was https://gerrit.wikimedia.org/r/c/operations/debs/spark2/+/602386/3/debian/conf/spark-env.sh (included in the last version of the `spark2` d...
[07:27:36] !log Restart forgotten unique-devices per-project-family jobs after yesterday deploy
[07:27:37] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[08:34:15] (PS1) Elukey: Set python3.7 for Buster Pyspark Yarn kernels [analytics/jupyterhub/deploy] - https://gerrit.wikimedia.org/r/612484 (https://phabricator.wikimedia.org/T256997)
[08:34:29] this seems to work on stat1008 --^
[08:34:43] (manually applied on the kernel.json pyspark definitions)
[08:36:19] Analytics-Clusters, Patch-For-Review: PySpark Error in JupyterHub: Python in worker has different version - https://phabricator.wikimedia.org/T256997 (elukey) @diego I applied https://gerrit.wikimedia.org/r/c/analytics/jupyterhub/deploy/+/612484 manually on stat1008, and from my tests it seems to work. C...
[10:24:21] Analytics-Clusters: Review an-coord1001's usage and failover plans - https://phabricator.wikimedia.org/T257412 (elukey) I had a chat with Manuel this morning, and one possibility would be the following: * Add dbproxy hosts (very ligthweight) that run basically haproxy (usually there are two for HA). The pup...
[10:24:35] joal: when you have a moment, let me know what you think about https://phabricator.wikimedia.org/T257412
[10:24:43] (not urgent, even later on during the week)
[10:27:04] more opinions are always good, I'd like to be prepared for a an-coord1001 failure :)
[10:27:41] ack! reading
[10:36:16] Analytics, Analytics-Wikistats, Inuka-Team, Language-strategy, and 2 others: Have a way to show the most popular pages per country - https://phabricator.wikimedia.org/T207171 (Aklapper)
[10:41:10] Analytics-Clusters: Review an-coord1001's usage and failover plans - https://phabricator.wikimedia.org/T257412 (JAllemandou) > I like more the manual failover solution, seems a first good step towards better handling SPOF I agree with that - incremental steps looks good :). > if we can move all the daemons...
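(editor's note) The 08:34 messages above describe pinning python3.7 in the PySpark kernel.json definitions. The patch itself is not quoted in this log, so the following is only a rough sketch of the idea: it sets `PYSPARK_PYTHON` in the `env` section of a Jupyter kernel spec. The function name `set_pyspark_python` and the contents of `example_spec` are hypothetical and are not taken from the actual change.

```python
import json


def set_pyspark_python(kernel_spec: dict, interpreter: str = "python3.7") -> dict:
    """Return a copy of a Jupyter kernel spec with PYSPARK_PYTHON pinned.

    The input dict is left untouched so the original spec can be compared
    against the patched one.
    """
    updated = dict(kernel_spec)
    env = dict(updated.get("env", {}))
    env["PYSPARK_PYTHON"] = interpreter  # interpreter used by Spark's Python workers
    updated["env"] = env
    return updated


# Hypothetical kernel spec, for illustration only (not the real stat1008 file).
example_spec = {
    "display_name": "PySpark - YARN",
    "language": "python",
    "argv": ["python3", "-m", "ipykernel", "-f", "{connection_file}"],
    "env": {"PYSPARK_PYTHON": "python3"},
}

patched = set_pyspark_python(example_spec)
print(json.dumps(patched["env"], indent=2))
```

Pinning the interpreter explicitly keeps the driver and the YARN workers on the same Python minor version, which is the mismatch reported in T256997.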
[10:41:23] elukey: added a comment - let me know if it makes sense :)
[10:43:15] joal: makes sense! Thanks :)
[10:43:56] I am still trying to think how to move stuff around, in theory we didn't budget a dedicated node for hive/presto/etc.. (that we migth need now)
[10:44:04] but, we could use the one that we wanted for superset
[10:44:08] keeping superset in a vm
[10:44:14] this use case seem more important
[10:49:42] for sure elukey
[11:26:36] * elukey lunch!
[12:21:07] hey teamm! :]
[12:27:41] mforns: o/
[12:27:50] hey elukey !!
[12:27:54] all good?
[12:28:19] yes yes.. and you??
[12:32:08] hi mforns :)
[13:09:17] (CR) Ottomata: [C: +1] "Hm! Good catch. I bet we could get away with not even setting PYSPARK_PYTHON, now that spark-env.sh does it. But doing it explicitly is" [analytics/jupyterhub/deploy] - https://gerrit.wikimedia.org/r/612484 (https://phabricator.wikimedia.org/T256997) (owner: Elukey)
[13:10:57] (CR) Mforns: "Some nit-picky comments." (6 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/612454 (https://phabricator.wikimedia.org/T257691) (owner: Nuria)
[13:11:09] (CR) Elukey: "> Patch Set 1: Code-Review+1" [analytics/jupyterhub/deploy] - https://gerrit.wikimedia.org/r/612484 (https://phabricator.wikimedia.org/T256997) (owner: Elukey)
[13:11:17] hey joal :]
[13:11:22] elukey: yes, all good
[13:14:46] (CR) Ottomata: [C: +1] "I'm not sure what is best.
Being explicit is good, and we already have these buster specific kernels, but also it would be nice to not ha" [analytics/jupyterhub/deploy] - https://gerrit.wikimedia.org/r/612484 (https://phabricator.wikimedia.org/T256997) (owner: Elukey)
[13:25:10] (PS1) Fdans: [wip] Allow more than one dimension to be filtered in Wikistats [analytics/wikistats2] - https://gerrit.wikimedia.org/r/612574 (https://phabricator.wikimedia.org/T255757)
[13:25:19] (CR) jerkins-bot: [V: -1] [wip] Allow more than one dimension to be filtered in Wikistats [analytics/wikistats2] - https://gerrit.wikimedia.org/r/612574 (https://phabricator.wikimedia.org/T255757) (owner: Fdans)
[13:29:27] heya teammmm, someone wants to brainbounce a bit about session length? I have a question :]
[13:36:01] (PS3) Fdans: [wip] Add filter/split component to Wikistats [analytics/wikistats2] - https://gerrit.wikimedia.org/r/607768 (https://phabricator.wikimedia.org/T249758)
[13:36:03] (PS2) Fdans: [wip] Allow more than one dimension to be filtered in Wikistats [analytics/wikistats2] - https://gerrit.wikimedia.org/r/612574 (https://phabricator.wikimedia.org/T255757)
[13:36:36] mforns: in like 15 minutes ok?
[13:36:43] milimetric: of course
[13:36:46] thanks!
[13:37:23] (CR) jerkins-bot: [V: -1] [wip] Add filter/split component to Wikistats [analytics/wikistats2] - https://gerrit.wikimedia.org/r/607768 (https://phabricator.wikimedia.org/T249758) (owner: Fdans)
[13:37:28] (CR) jerkins-bot: [V: -1] [wip] Allow more than one dimension to be filtered in Wikistats [analytics/wikistats2] - https://gerrit.wikimedia.org/r/612574 (https://phabricator.wikimedia.org/T255757) (owner: Fdans)
[13:43:35] joal: cookbook for rollback bigtop -> cdh finally worked!
[13:44:07] the trick is to issue the command to rollback the namenode (that in turn also causes the jns to rollback), then roll restart the jns, then start the namenode
[13:44:18] this circumvents all the bugs
[13:46:30] (PS3) Fdans: [wip] Allow more than one dimension to be filtered in Wikistats [analytics/wikistats2] - https://gerrit.wikimedia.org/r/612574 (https://phabricator.wikimedia.org/T255757)
[13:47:06] Analytics-Clusters, Analytics-Kanban, Patch-For-Review: Create anaconda .deb package with stacked conda user envs - https://phabricator.wikimedia.org/T251006 (Ottomata) Oh noooo! I just tried installing anaconda from my .deb for the first time. ` [@stat1008:/usr/lib/anaconda-wmf] $ grep --binary-fi...
[13:53:30] (CR) jerkins-bot: [V: -1] [wip] Allow more than one dimension to be filtered in Wikistats [analytics/wikistats2] - https://gerrit.wikimedia.org/r/612574 (https://phabricator.wikimedia.org/T255757) (owner: Fdans)
[13:59:37] mforns: cave?
[13:59:43] milimetric: omw :]
[14:09:00] elukey: I'm interested to understand more about the command order when you have a mintue :)
[14:09:13] some reading for you a-team - https://docs.google.com/document/d/1vahntOx4zZU5Z7Dz5JE03Co5cEcw4kAuHtb-M8-cTRE
[14:09:13] M8: Testing SVGs in mockups - https://phabricator.wikimedia.org/M8
[14:12:04] OH FUN JOAL did you just write that?!
[14:12:11] yessir
[14:12:16] joal: https://github.com/wikimedia/operations-cookbooks/blob/master/cookbooks/sre/hadoop/change-distro-from-cdh.py#L157-L183
[14:12:39] * joal is afraid of elukey's link
[14:12:46] ottomata: yes :)
[14:13:20] * joal is trying to write stuff before holidays, so that it doesn't all fly away :)
[14:14:15] joal: basically 'echo Y | sudo -u hdfs kerberos-run-command hdfs hdfs namenode -rollback' does rollback namenode state and jns state
[14:14:39] but for some bug/reason the jns are left in a weird state after it, like with no cluster id etc..
[14:14:48] I recall that elukey
[14:14:49] and the namenode can't start
[14:15:03] but doing a roll restart in the middle resolves the problem
[14:15:12] ok I get it now :)
[14:15:28] it is kinda turn off/on again solution
[14:15:30] thanks a lot for the gentle walkthrough :)
[14:15:40] but I suspect it is a weird bug that I don't want to look into jira :D
[14:15:52] I probably could have understood through thorough reading
[14:15:54] :)
[14:16:43] ah yes sorry didn't mean to explain the solution, just to avoid the read :)
[14:17:09] that's great elukey - /me prefers talking to elukey than reading code :)
[14:17:57] joal: as Andrew was saying, there are only few references of cdh-specific versions in our pom.xml
[14:18:18] one thing to test would be to build refinery for bigtop without them, eventually
[14:19:29] sounds like a nice idea elukey :)
[14:19:58] not sure if it makes sense for our current version, maybe only for bigtop after the move
[14:20:01] taking a break before meeting evening - see y'all at standup
[14:24:07] (CR) Joal: [C: +1] "All my comments have been taken care of - Ok for me - Maybe someone else could review?" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/607788 (owner: Ottomata)
[14:25:20] (CR) Joal: [C: +1] "I'm not knowledgeable on the camus side - Why were we needing them originaly, avro imports?" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/608725 (owner: Ottomata)
[14:25:29] Analytics, Dumps-Generation: Sample HTML Dumps - Request for feedback - https://phabricator.wikimedia.org/T257480 (RBrounley_WMF) >>! In T257480#6297199, @Isaac wrote: >> English Wiki has 15m articles (I believe) >> a full enwiki dump is clocking in at 944gb or something insanely large > I'm pretty sure...
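(editor's note) The rollback order elukey walks through between 13:44 and 14:15 can be sketched roughly as below. This is not the code from the linked cookbook: only the namenode rollback command is quoted from the log, while `rollback_hdfs`, the `run` callback and the two `step:` labels are hypothetical placeholders for whatever actually restarts the journalnodes and starts the namenode.

```python
from typing import Callable, List

# Quoted from the 14:14:15 message; per the log it rolls back both the
# namenode state and the journalnode (jn) state.
ROLLBACK_NAMENODE = (
    "echo Y | sudo -u hdfs kerberos-run-command hdfs hdfs namenode -rollback"
)


def rollback_hdfs(run: Callable[[str], None], journalnodes: List[str]) -> None:
    """Apply the three steps in the order described in the chat.

    `run` is a hypothetical hook that executes one step; the step labels
    below are placeholders, not real commands.
    """
    # 1. Roll back the namenode (this also rolls back the journalnodes).
    run(ROLLBACK_NAMENODE)
    # 2. Roll-restart the journalnodes one at a time; per the log this is
    #    what clears the odd state (e.g. missing cluster id) left by step 1.
    for host in journalnodes:
        run(f"step: restart journalnode on {host}")
    # 3. Only now start the namenode.
    run("step: start namenode")


# Dry run that records the steps instead of executing anything.
executed: List[str] = []
rollback_hdfs(executed.append, ["jn-a", "jn-b", "jn-c"])
for step in executed:
    print(step)
```

Passing `executed.append` as `run` gives a dry run, which is handy for checking the ordering without touching a cluster.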
[14:30:10] (CR) Ottomata: "Our old avro intergration that was only used for ApiRequest and CirrusSearchRequest had generated Java classes as a built in 'schema regis" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/608725 (owner: Ottomata)
[14:30:21] (CR) Ottomata: [C: +2] Remove unused custom avro camus classes [analytics/refinery/source] - https://gerrit.wikimedia.org/r/608725 (owner: Ottomata)
[14:32:41] hudi eh? :)
[14:33:11] Analytics-Clusters, Patch-For-Review: PySpark Error in JupyterHub: Python in worker has different version - https://phabricator.wikimedia.org/T256997 (diego) @elukey yes! this is working, thank you very much! It would be great if you can apply the same patch in the stat1005.
[14:36:11] very annoying, I keep getting kerberos issues when trying to use pyspark yarn on stat1005
[14:36:54] I applied a change for PYSPARK_PYTHON (namely removing it)
[14:36:59] Analytics-Radar, Better Use Of Data, Product-Analytics, Epic, and 2 others: Session Length Metric. Web implementation - https://phabricator.wikimedia.org/T248987 (mforns) Hi @jlinehan! I reviewed your code, and after a while managed to understand the complexity of the situation. Thanks for the d...
[14:37:04] but it should pick up the ticket
[14:37:12] klist shows it to me in the jupyter terminal
[14:45:43] !log re-create jupyterhub's base kernel directory on stat1005 (trying to debug some problems)
[14:45:44] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[14:52:27] Analytics, Dumps-Generation: Sample HTML Dumps - Request for feedback - https://phabricator.wikimedia.org/T257480 (ArielGlenn) @RBrounley_WMF Page content for redirects consists of the wikitext that contains the directive for the redirect. For example, ` #REDIRECT [[Apgar score]] ` which can be s...
[14:53:42] (CR) Elukey: [V: +2 C: +2] "Tried to remove the var from /usr/local/share/jupyter/kernels/spark_yarn_pyspark/kernel.json, restarted jupyterhub and my notebook, but se" [analytics/jupyterhub/deploy] - https://gerrit.wikimedia.org/r/612484 (https://phabricator.wikimedia.org/T256997) (owner: Elukey)
[14:54:35] (CR) Ottomata: "Sounds good!" [analytics/jupyterhub/deploy] - https://gerrit.wikimedia.org/r/612484 (https://phabricator.wikimedia.org/T256997) (owner: Elukey)
[14:55:03] !log re-create jupyterhub's venv on stat1005/8 after https://gerrit.wikimedia.org/r/612484
[14:57:15] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[14:59:06] Analytics-Clusters, Patch-For-Review: PySpark Error in JupyterHub: Python in worker has different version - https://phabricator.wikimedia.org/T256997 (elukey) @diego done!
[14:59:22] Analytics-Clusters, Analytics-Kanban, Patch-For-Review: PySpark Error in JupyterHub: Python in worker has different version - https://phabricator.wikimedia.org/T256997 (elukey) a:elukey
[14:59:24] (CR) Mforns: "Lookin'!" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/607788 (owner: Ottomata)
[15:01:20] elukey: thanks for fixing the notebooks, I really appreciate it!
[15:02:03] dsaez: only because we <3 researchers
[15:02:16] :D
[15:02:21] haha
[15:02:24] thx
[15:02:30] thanks for the report, the repro was really useful!
[15:03:32] got it, I'll add the "how to reproduce the error" section to my future tickets
[15:03:43] that is always very useful!
[15:07:55] !log upgrade spark2 to 2.4.4-bin-hadoop2.6-3 on stat1004
[15:07:57] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[15:21:16] Analytics-Radar, Better Use Of Data, Product-Analytics, Epic, and 2 others: Session Length Metric.
Web implementation - https://phabricator.wikimedia.org/T248987 (Nuria) I still have a few questions around the code being quite complicated for the edge cases and I really wonder if we need to accou...
[15:26:22] (CR) Mforns: [C: +1] "LGTM! +1 Left a couple probably useless comments :]" (4 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/607788 (owner: Ottomata)
[15:26:24] ottomata: thanks! don't forget https://gerrit.wikimedia.org/r/c/schemas/event/secondary/+/612610 though :)
[15:26:29] a-team: let's met at :30 after the hour today as we have later the flink training
[15:26:55] * milimetric tries in vain to find a thumbs up icon
[15:27:06] 👍
[15:27:18] showoff
[15:27:57] :+1:
[15:28:01] :(
[15:30:41] Analytics-Radar, Better Use Of Data, Product-Analytics, Epic, and 2 others: Session Length Metric. Web implementation - https://phabricator.wikimedia.org/T248987 (mforns) > On a traditional definition of session, no. Inactivity for say, 30 minutes (probably less) means the session has ended. "ses...
[15:30:59] Analytics-Radar, Better Use Of Data, Desktop Improvements, Product-Infrastructure-Team-Backlog, and 7 others: Client side error logging production launch - https://phabricator.wikimedia.org/T226986 (Ottomata) Can we close this task and call the production launch done?
[15:31:59] nuria: we're all here! :)
[15:50:58] !log upgrade spark2 on all stat100x hosts
[15:51:00] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[15:52:16] ottomata: tried to run some numpy examples via pyspark --master yarn (I looked for some examples that needed to distribute on workers simple integers) and all works fine
[15:56:00] (tested on stat1004 and stat1005, so stretch/buster)
[16:05:28] Analytics-Radar, Better Use Of Data, Product-Analytics, Epic, and 2 others: Session Length Metric. Web implementation - https://phabricator.wikimedia.org/T248987 (jlinehan) >>!
In T248987#6304541, @mforns wrote: > Shouldn't the session length for that use case be 60 minutes instead? Given that th...
[16:09:32] Analytics-Radar, Better Use Of Data, Desktop Improvements, Product-Infrastructure-Team-Backlog, and 7 others: Client side error logging production launch - https://phabricator.wikimedia.org/T226986 (Nuria) Closing cc-ing @dcipoletti so he knows that client side error logging is available (which...
[16:09:46] Analytics-Radar, Better Use Of Data, Desktop Improvements, Product-Infrastructure-Team-Backlog, and 7 others: Client side error logging production launch - https://phabricator.wikimedia.org/T226986 (Nuria) Open→Resolved
[16:09:49] Analytics, Analytics-EventLogging, Event-Platform, Services (watching): Modern Event Platform: Stream Intake Service - https://phabricator.wikimedia.org/T201068 (Nuria)
[16:33:47] Analytics-Radar, Better Use Of Data, Product-Analytics, Epic, and 2 others: Session Length Metric. Web implementation - https://phabricator.wikimedia.org/T248987 (jlinehan) >>! In T248987#6304745, @Nuria wrote: > I still have a few questions around the code being quite complicated for the edge ca...
[16:41:24] going off people o/
[16:41:35] bye elukey :)
[17:01:22] Analytics-Radar, Better Use Of Data, Product-Analytics, Epic, and 2 others: Session Length Metric. Web implementation - https://phabricator.wikimedia.org/T248987 (Nuria) >But in the example, the user is reading the wiki page for 20 seconds every 2 minutes, so it has no such inactivity period, rig...
[17:05:34] Analytics, Product-Analytics: Add editors_monthly data to Druid - https://phabricator.wikimedia.org/T256719 (cchen)
[17:06:52] Analytics-Radar, Better Use Of Data, Product-Analytics, Epic, and 2 others: Session Length Metric. Web implementation - https://phabricator.wikimedia.org/T248987 (jlinehan) >>! In T248987#6305252, @Nuria wrote: >>were we trying to capture eyeballs-on-screen-time?
> mmm, no, I do not think so. Se...
[17:14:03] Analytics-Radar, Product-Analytics (Kanban): Calculate impact of missing mobile app pageviews to high-level metrics - https://phabricator.wikimedia.org/T257373 (LGoto)
[17:16:17] Analytics, Product-Analytics, Epic: Definition of what constitutes a mobile pageview - https://phabricator.wikimedia.org/T257277 (LGoto) a:SNowick_WMF
[17:16:52] Analytics, Product-Analytics, Product-Infrastructure-Team-Backlog, Epic: Definition of what constitutes a mobile pageview - https://phabricator.wikimedia.org/T257277 (LGoto)
[17:16:55] Analytics-Radar, Better Use Of Data, Product-Analytics, Epic, and 2 others: Session Length Metric. Web implementation - https://phabricator.wikimedia.org/T248987 (mforns) Thanks @jlinehan for the clarification. I believe (but might be wrong) that the session length metric is a bit simpler to ins...
[17:30:05] Analytics-Radar, Better Use Of Data, Product-Analytics, Epic, and 2 others: Session Length Metric. Web implementation - https://phabricator.wikimedia.org/T248987 (mforns) As an improvement to the algorithm above, I'd remember we spoke about having exponential heartbeat intervals, so that we send...
[17:42:08] Analytics-Radar, Better Use Of Data, Product-Analytics, Epic, and 2 others: Session Length Metric. Web implementation - https://phabricator.wikimedia.org/T248987 (jlinehan) >>! In T248987#6305346, @mforns wrote: > I believe (but might be wrong) that the session length metric is a bit simpler to i...
[17:47:10] Analytics-Radar, Better Use Of Data, Product-Analytics, Epic, and 2 others: Session Length Metric. Web implementation - https://phabricator.wikimedia.org/T248987 (Nuria) The challenge is how to define inactivity such cookies are "reseted". A window open for the whole night is not a"whole night"...
[17:58:10] Analytics-Radar, Better Use Of Data, Product-Analytics, Epic, and 2 others: Session Length Metric.
Web implementation - https://phabricator.wikimedia.org/T248987 (jlinehan) >>! In T248987#6305429, @Nuria wrote: > The challenge is how to define inactivity such cookies are "reseted". A window open...
[18:11:00] Analytics-Radar, Better Use Of Data, Product-Analytics, Epic, and 2 others: Session Length Metric. Web implementation - https://phabricator.wikimedia.org/T248987 (mforns) @Nuria, the algorithm as proposed does stop ticking when the page is not visible. However, as you pointed out, if someone open...
[18:13:58] Analytics-Radar, Better Use Of Data, Product-Analytics, Epic, and 2 others: Session Length Metric. Web implementation - https://phabricator.wikimedia.org/T248987 (Nuria) >The BUOD group is planning to build some notion of session that captures inactivity, which we were going to back-port for this...
[18:34:51] (CR) Nuria: [C: -1] [WIP] chopping timeseries for noise detection (2 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/612454 (https://phabricator.wikimedia.org/T257691) (owner: Nuria)
[18:49:28] Analytics-Radar, Better Use Of Data, Product-Analytics, Epic, and 2 others: Session Length Metric. Web implementation - https://phabricator.wikimedia.org/T248987 (Nuria) Let's please start a new patch going forward as the history of current patch to date would make things unnecessarily confusing.
[18:53:49] Analytics-Radar, Better Use Of Data, Product-Analytics, Epic, and 2 others: Session Length Metric. Web implementation - https://phabricator.wikimedia.org/T248987 (jlinehan) >>! In T248987#6305513, @Nuria wrote: >>The BUOD group is planning to build some notion of session that captures inactivity,...
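(editor's note) The T248987 thread above keeps returning to how inactivity should end a session. As a rough illustration only, and not the algorithm proposed in the task (which is truncated in this log), here is the traditional definition mentioned at 15:30: activity separated by more than a timeout belongs to different sessions. The function name `session_lengths` and the 30-minute default are assumptions for this sketch.

```python
from typing import List


def session_lengths(timestamps: List[float],
                    inactivity_timeout: float = 30 * 60) -> List[float]:
    """Split activity timestamps (in seconds) into sessions.

    A gap larger than `inactivity_timeout` closes the current session;
    each session's length is last activity minus first activity.
    """
    lengths: List[float] = []
    if not timestamps:
        return lengths
    ordered = sorted(timestamps)
    start = previous = ordered[0]
    for ts in ordered[1:]:
        if ts - previous > inactivity_timeout:
            lengths.append(previous - start)  # close the session at last activity
            start = ts
        previous = ts
    lengths.append(previous - start)
    return lengths


# Activity every 2 minutes for an hour: one uninterrupted session.
steady = [i * 120.0 for i in range(31)]
print(session_lengths(steady))

# A gap of more than 30 minutes splits the activity into two sessions.
print(session_lengths([0.0, 60.0, 4000.0, 4100.0]))
```

Under this definition the "reading for 20 seconds every 2 minutes" example from the thread never hits the timeout, so it counts as a single 60-minute session, while a window left open overnight with no activity adds nothing.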
[18:58:12] (CR) Mforns: [WIP] chopping timeseries for noise detection (2 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/612454 (https://phabricator.wikimedia.org/T257691) (owner: Nuria)
[19:01:57] gone for tonight team - see you tomorrow
[19:02:03] I have streams in my head :)
[20:01:45] Analytics, Analytics-Kanban, Product-Analytics, Epic: API pageview counts for 'Mobile app' are incorrect since switch to mobile-html - https://phabricator.wikimedia.org/T256508 (SNowick_WMF)
[20:20:16] byeeee teamm
[21:35:46] I'm just like psychologically incompatible with Scala
[21:36:06] I've looked up how to instantiate a variable like 100 times. I just can't remember it, no matter what I do