[00:14:39] Analytics, ChangeProp, Discovery-Search, EventBus, and 3 others: Better way to pause writes on elasticsearch - https://phabricator.wikimedia.org/T230730 (EBernhardson) > Is it going to back off the same amount of time as the last back-off. So it will start processing messages in a timely manner,...
[02:32:10] Analytics, Analytics-Kanban, Operations, SRE-Access-Requests: Access to HUE for cchen - https://phabricator.wikimedia.org/T231111 (Mathew.onipe) Open→Resolved I'm guessing everyone is happy so I'm going to close this.
[07:14:34] Good morning team
[07:45:59] morning!
[10:01:45] heya teammm, good morning :]
[10:15:49] Analytics, Analytics-Kanban, Patch-For-Review: Generate edit totals by country by month/year - https://phabricator.wikimedia.org/T215655 (JAllemandou) Monthly Jobs have been running monthly (there is data up to now). Datasets are documented here: https://wikitech.wikimedia.org/wiki/Analytics/Data_Lak...
[10:29:03] Analytics, Analytics-Kanban, Patch-For-Review: Geoeditors_private deletion scripts scheduled day conflicts with retention period - https://phabricator.wikimedia.org/T231017 (mforns)
[10:42:32] hellooou
[10:45:41] Analytics, Analytics-Kanban, Patch-For-Review: Geoeditors_private deletion scripts scheduled day conflicts with retention period - https://phabricator.wikimedia.org/T231017 (JAllemandou) @mforns: I wonder if we shouldn't document the deletion strategy in some wikitech page, to help us having a clear...
[10:47:59] Analytics: Add agent dimension to pageviews top-by-country in aqs - https://phabricator.wikimedia.org/T231292 (Nuria)
[10:59:05] Analytics: Add agent dimension to pageviews top-by-country in aqs - https://phabricator.wikimedia.org/T231292 (JAllemandou) The reason the `agent` field is not present in the `top` aqs endpoint is because we decided not to provide the top views for spiders - top is computed for user only as of now. The same...
[11:29:48] nuria: can you remember at all the reason why we decided not to split by agent type in pageviews by country?
[11:37:03] omg joal nuria sorry, I just realized you both asked/answered that question right above mine
[11:49:15] fdans: ya, same here i had forgotten
[12:03:03] (PS2) Joal: Update webrequest oozie job for yarn queue to work [analytics/refinery] - https://gerrit.wikimedia.org/r/531682 (https://phabricator.wikimedia.org/T231002)
[12:03:37] (CR) Joal: "Review needed anew (sorry) - I found a way not to have to modify every hive-query" [analytics/refinery] - https://gerrit.wikimedia.org/r/531682 (https://phabricator.wikimedia.org/T231002) (owner: Joal)
[13:24:29] (CR) Ottomata: [C: +1] "Nice!" [analytics/refinery] - https://gerrit.wikimedia.org/r/531682 (https://phabricator.wikimedia.org/T231002) (owner: Joal)
[14:22:28] PROBLEM - statsv Varnishkafka log producer on cp1081 is CRITICAL: PROCS CRITICAL: 0 processes with args /usr/bin/varnishkafka -S /etc/varnishkafka/statsv.conf https://wikitech.wikimedia.org/wiki/Analytics/Systems/Varnishkafka
[14:23:00] PROBLEM - Webrequests Varnishkafka log producer on cp1081 is CRITICAL: PROCS CRITICAL: 0 processes with args /usr/bin/varnishkafka -S /etc/varnishkafka/webrequest.conf https://wikitech.wikimedia.org/wiki/Analytics/Systems/Varnishkafka
[14:26:35] ottomata: hello :) Is there anything for us to do on thoses alerts?
[14:27:43] wowo
[14:30:06] lookin
[14:32:52] I can not log into that host
[14:33:41] mforns: ya, ottomata might need to take a look
[14:34:57] I can't see anything in https://grafana.wikimedia.org/d/000000253/varnishkafka
[14:39:05] joal, nuria, maybe this issue is related to ema's email to [OPS]
[14:39:34] RECOVERY - statsv Varnishkafka log producer on cp1081 is OK: PROCS OK: 1 process with args /usr/bin/varnishkafka -S /etc/varnishkafka/statsv.conf https://wikitech.wikimedia.org/wiki/Analytics/Systems/Varnishkafka
[14:39:59] oh, ok :]
[14:40:06] RECOVERY - Webrequests Varnishkafka log producer on cp1081 is OK: PROCS OK: 1 process with args /usr/bin/varnishkafka -S /etc/varnishkafka/webrequest.conf https://wikitech.wikimedia.org/wiki/Analytics/Systems/Varnishkafka
[14:42:53] mforns: ops list right?
[14:43:04] nuria, yes
[14:44:18] mforns: process is re-started
[14:44:23] mforns: so we are all good
[14:44:32] aha
[14:44:37] mforns: probably related to ast
[14:44:45] *ats
[14:45:34] the host they mention is cp1075, which does not match, but it could be related anyway?
[14:47:52] Analytics, Operations, Traffic: varnishkafka statsv and webrequest crashed on cp1081 - https://phabricator.wikimedia.org/T231331 (ema)
[14:48:03] Analytics, Operations, Traffic: varnishkafka statsv and webrequest crashed on cp1081 - https://phabricator.wikimedia.org/T231331 (ema) p:Triage→Normal
[14:55:41] Gone to get the kids - see you at standup team
[15:00:56] Analytics, Operations, Traffic: varnishkafka statsv and webrequest crashed on cp1081 - https://phabricator.wikimedia.org/T231331 (Nuria) CPU at 100%: https://grafana.wikimedia.org/d/000000377/host-overview?refresh=5m&orgId=1&var-server=cp1081&var-datasource=eqiad%20prometheus%2Fops&var-cluster=cache...
[15:15:34] joal hey sorry was doing interview!
[15:19:27] looks ok now i guess!
[15:24:11] (PS1) Fdans: Changes related to timestamp partitioning for mediarequests [analytics/refinery] - https://gerrit.wikimedia.org/r/532725 (https://phabricator.wikimedia.org/T229817)
[15:26:12] Analytics, Analytics-Kanban, Patch-For-Review: Geoeditors_private deletion scripts scheduled day conflicts with retention period - https://phabricator.wikimedia.org/T231017 (mforns) @JAllemandou I created this page in Wikitech, explains a bit how data_purge.pp works and how the retention period vs ti...
[15:28:48] (CR) Fdans: [V: -1] "Will +1 verified when I finish testing the job in stat1007, making sure that timestamp format makes sense in partitions and directory name" [analytics/refinery] - https://gerrit.wikimedia.org/r/532725 (https://phabricator.wikimedia.org/T229817) (owner: Fdans)
[15:31:28] (PS2) Fdans: Changes related to timestamp partitioning for mediarequests [analytics/refinery] - https://gerrit.wikimedia.org/r/532725 (https://phabricator.wikimedia.org/T229817)
[15:33:14] (PS1) Ottomata: Updating jupyterlab dependency to 1.0.9 [analytics/jupyterhub/deploy] - https://gerrit.wikimedia.org/r/532726 (https://phabricator.wikimedia.org/T230724)
[15:34:12] Analytics, Analytics-Kanban: Set up automatic deletion for netflow datasource in Druid - https://phabricator.wikimedia.org/T229674 (mforns)
[15:37:48] Analytics, Analytics-Kanban: Set up automatic deletion for netflow datasource in Druid - https://phabricator.wikimedia.org/T229674 (mforns) Hi @ayounsi! Please, review this task and let us know how long would you like to keep the netflow data in Druid/Turnilo. Or put in another way, how interesting is i...
[15:39:23] (PS2) Ottomata: Updating jupyterlab dependency to 1.0.9 [analytics/jupyterhub/deploy] - https://gerrit.wikimedia.org/r/532726 (https://phabricator.wikimedia.org/T230724)
[15:42:43] (PS3) Fdans: Changes related to timestamp partitioning for mediarequests [analytics/refinery] - https://gerrit.wikimedia.org/r/532725 (https://phabricator.wikimedia.org/T229817)
[15:44:16] hey team. I am looking at Turnilo and notice the new "Percentage of identified bot requests". this is a great feature, thanks!
[15:44:39] is there any resource where I can read more on what a "bot" means and how a request is identified as a bot request?
[15:46:07] (PS3) Ottomata: Updating jupyterlab dependency to 1.0.9 [analytics/jupyterhub/deploy] - https://gerrit.wikimedia.org/r/532726 (https://phabricator.wikimedia.org/T230724)
[15:46:39] Analytics, Analytics-Kanban: Set up automatic deletion for netflow data set in Hive - https://phabricator.wikimedia.org/T231339 (mforns)
[15:47:52] Analytics, Research: Recommend the best format to release public data lake as a dump - https://phabricator.wikimedia.org/T224459 (leila) Update: We have 3 responses in the survey and one over email. I sent a reminder to wiki-research-l and an email to analytics. The deadline is in a week from now.
[15:48:00] (CR) Ottomata: "I think there might be an issue with ':' in HDFS pathnames. What will the `dt` partition look like? Just '2019-08-27T14' ? for hourly?" [analytics/refinery] - https://gerrit.wikimedia.org/r/532725 (https://phabricator.wikimedia.org/T229817) (owner: Fdans)
[15:48:40] thank for taking a look ottomata :)\
[15:48:49] thank you*
[15:48:55] Analytics, Analytics-Kanban: Set up automatic deletion for netflow data set in Hive - https://phabricator.wikimedia.org/T231339 (mforns) Hi @ayounsi! Please, review this task and see if it makes sense. I assume from what Luca told me, unless you tell me the opposite, that keeping the data in Hive/HDFS f...
[15:49:31] (CR) Ottomata: [C: +2] Updating jupyterlab dependency to 1.0.9 [analytics/jupyterhub/deploy] - https://gerrit.wikimedia.org/r/532726 (https://phabricator.wikimedia.org/T230724) (owner: Ottomata)
[15:49:34] (CR) Ottomata: [V: +2 C: +2] Updating jupyterlab dependency to 1.0.9 [analytics/jupyterhub/deploy] - https://gerrit.wikimedia.org/r/532726 (https://phabricator.wikimedia.org/T230724) (owner: Ottomata)
[15:53:53] Analytics, Analytics-Kanban: Set up automatic deletion for netflow data set in Hive - https://phabricator.wikimedia.org/T231339 (mforns) So, the idea for solution #2 (see task description) is the following: Use WhitelistSanitization to sanitize the wmf database. And have a white-list that, for now, incl...
[15:54:23] Analytics: Set up automatic deletion for netflow data set in Hive - https://phabricator.wikimedia.org/T231339 (mforns)
[15:56:20] sukhe: those are self-reported bots, see: https://meta.wikimedia.org/wiki/Research:Page_view#Tagging
[15:58:36] Analytics, Research: Recommend the best format to release public data lake as a dump - https://phabricator.wikimedia.org/T224459 (mforns) Cool!
[15:59:26] Analytics: Set up automatic deletion for netflow data set in Hive - https://phabricator.wikimedia.org/T231339 (Nuria) I really would like to avoid setting up custom sanitization, can we rather move netflow data to events database and use the event sanitization process?
[16:00:37] joal: standdupppp
[16:01:52] Analytics, Analytics-SWAP, Product-Analytics, Patch-For-Review: Upgrade all SWAP users to JupyterLab 1.0 - https://phabricator.wikimedia.org/T230724 (Ottomata) Updated to JupyterLab 1.0.9 by following the process at https://wikitech.wikimedia.org/wiki/SWAP#Administration. I also manually upgrade...
[16:02:03] !log restarted turnilo to apply changes to config
[16:02:03] Analytics, Analytics-Kanban, Analytics-SWAP, Product-Analytics, Patch-For-Review: Upgrade all SWAP users to JupyterLab 1.0 - https://phabricator.wikimedia.org/T230724 (Ottomata)
[16:02:09] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[16:05:09] nuria: ah, thanks
[16:07:03] Analytics, Analytics-Kanban, Operations, netops, ops-eqiad: Move cloudvirtan* hardware out of CloudVPS back into production Analytics VLAN. - https://phabricator.wikimedia.org/T225128 (Ottomata) > OO, when we reimage these, let's use Buster! :) I take it back, use Stretch. Buster ships with J...
[16:07:14] Analytics, Analytics-Kanban, Operations, netops, ops-eqiad: Move cloudvirtan* hardware out of CloudVPS back into production Analytics VLAN. - https://phabricator.wikimedia.org/T225128 (Ottomata)
[16:07:16] (CR) Fdans: Transition data rows to using time ranges instead of timestamps (5 comments) [analytics/wikistats2] - https://gerrit.wikimedia.org/r/531148 (https://phabricator.wikimedia.org/T230514) (owner: Fdans)
[17:04:34] Good morning, I'm having trouble accessing Jupyter Lab. I was on Lab as of 9am and at present am not able to use the interface. I'd appreciate any and all feedback for getting back and accessing the files I was working on.
[17:28:48] Analytics, Analytics-Kanban: Set up automatic deletion for netflow datasource in Druid - https://phabricator.wikimedia.org/T229674 (ayounsi) It would be great to have an aggregated/sanitized set of the data for as long as possible. In addition to `ip_src` and `ip_dst` it's fine to remove `port_src` `por...
[17:30:59] Analytics: Set up automatic deletion for netflow data set in Hive - https://phabricator.wikimedia.org/T231339 (ayounsi) Similar to T229674. It would be great to have an aggregated/sanitized set of the data for as long as possible.
In addition to ip_src and ip_dst it's fine to remove port_src port_dst tcp_f...
[18:15:09] Analytics, Analytics-SWAP, Jupyter-Hub: Trouble accessing Jupyter Lab - https://phabricator.wikimedia.org/T231365 (Iflorez)
[18:24:48] (PS4) Addshore: Create script tracking number of slots on wikibase repos [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/523938 (https://phabricator.wikimedia.org/T68025)
[18:27:58] Analytics, Analytics-SWAP, Jupyter-Hub: Trouble accessing Jupyter Lab - https://phabricator.wikimedia.org/T231365 (Ottomata) Something is definitely not working for me on notebook1003, am investigating. In the meantime, can you check if JupyterLab works for you on notebook1004?
[18:41:40] Analytics, Product-Analytics: Make an Analytics Data Lake table to provide meta info about wikis - https://phabricator.wikimedia.org/T184576 (kzimmerman) a:Neil_P._Quinn_WMF→None
[18:41:56] (PS5) Addshore: Create script tracking number of slots on wikibase repos [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/523938 (https://phabricator.wikimedia.org/T68025)
[18:42:01] (CR) Addshore: [C: +2] Create script tracking number of slots on wikibase repos [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/523938 (https://phabricator.wikimedia.org/T68025) (owner: Addshore)
[18:42:06] (PS1) Addshore: Create script tracking number of slots on wikibase repos [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/532763 (https://phabricator.wikimedia.org/T68025)
[18:42:10] (CR) Addshore: [C: +2] Create script tracking number of slots on wikibase repos [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/532763 (https://phabricator.wikimedia.org/T68025) (owner: Addshore)
[18:44:59] (Merged) jenkins-bot: Create script tracking number of slots on wikibase repos [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/523938 (https://phabricator.wikimedia.org/T68025) (owner: Addshore)
[18:45:16] (Merged) jenkins-bot: Create script tracking number of slots on wikibase repos [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/532763 (https://phabricator.wikimedia.org/T68025) (owner: Addshore)
[19:01:40] Analytics: SLF4J errors when querying mediawiki_wikitext_history - https://phabricator.wikimedia.org/T231373 (dr0ptp4kt)
[19:03:50] Analytics, Analytics-SWAP, Jupyter-Hub: Trouble accessing Jupyter Lab - https://phabricator.wikimedia.org/T231365 (Iflorez) no luck on notebook4, unfortunately
[19:20:52] iflorez: yt?
[19:20:57] trying to figure out this jupyterlab thing
[20:00:20] Analytics, Analytics-Kanban, Operations, netops, ops-eqiad: Move cloudvirtan* hardware out of CloudVPS back into production Analytics VLAN. - https://phabricator.wikimedia.org/T225128 (Cmjohnson) a:Cmjohnson→Jclark-ctr @Jclark-ctr Can you move these servers as evenly as you can into r...
[20:02:46] hello @ottomata what is yt?
[20:05:14] Analytics, Analytics-General-or-Unknown, Product-Analytics, Wikimedia-Interwiki-links, and 2 others: there should be a comparison of clicks count on interlanguage links on different platforms - https://phabricator.wikimedia.org/T78351 (LGoto)
[20:05:48] iflorez: hi!
[20:05:50] 'you there'
[20:05:51] ?
[20:05:56] hello
[20:06:38] yes, online and ready to troubleshoot :)
[20:06:38] can you try some things for me? jupyterhub is working for me now, but i'm not totally sure the difference between my notebook server and yours
[20:06:47] ok
[20:06:53] i just restarted your server on notebook1003
[20:06:57] can you try to load jupyterlab now?
[20:07:01] ok
[20:08:21] yes! I'm in JupyterLab on notebook3!
[20:08:48] that worked, hm ok.
[20:08:54] and it still doesn't work on notebook1004, right?
[20:09:18] correct
[20:09:24] ok let me try this and see if it fixes.
[20:10:05] ok, try on notebook1004 now iflorez
[20:11:08] no luck
[20:11:24] no luck with notebook4
[20:12:10] hmm
[20:12:58] maybe it's useful to note that when I logged in, I was moved to this url:
[20:12:58] ...iflorez/lab/workspaces/auto-C
[20:16:12] iflorez: can you try abain?
[20:16:13] again*
[20:16:50] nope.
[20:17:07] no luck on notebook4
[20:17:17] yeah very strange.
[20:17:33] i can see the error, but i'm not sure exactly how i resolved it on 1003 or for myself!
[20:17:47] it looks like some issue with mismatched or partially upgraded version
[20:22:18] oh dear
[20:22:18] can you see my files?
[20:22:18] or is everything offline at present?
[20:22:18] given files, I can work on notebook3 while this issue gets resolved.
[20:22:29] the files are not affected
[20:22:40] its just an issue with the jupyterlab installed in your venv
[20:22:41] hoorah!
[20:22:43] and othres i expected
[20:22:46] expect*
[20:22:51] Analytics: Set up automatic deletion for netflow data set in Hive - https://phabricator.wikimedia.org/T231339 (Nuria) @ayounsi Before we can sanitize this data it needs to be abide to a schema (a schema is a description in json of your data like https://meta.wikimedia.org/wiki/Schema:NavigationTiming) and th...
[20:23:55] There are four files that I was working on that I would like to access. How can I access those while this issue is addressed?
[20:24:27] on 1003 or 1004?
[20:24:32] 1004
[20:25:36] Analytics, Product-Analytics, Product-Infrastructure-Team-Backlog, Reading List Service, and 2 others: [EPIC] Reading List Sync service analytics - https://phabricator.wikimedia.org/T191859 (LGoto)
[20:25:49] does the regulay jupyter notebook (non lab) server show them to you?
[20:25:52] and allow you to access?
[20:26:47] I will try again. As of 10am the jupyter notebook only showed me the output in the files...not the cell contents.
[20:27:09] oh hm ,that is weird.
[20:31:36] iflorez: can you try jupyterlab again for me?
[20:31:47] (btw, thanks for being guinea pig...i can't seem to repro now otherwise)
[20:32:34] nope.
[20:32:36] sigh.
[20:32:50] no luck getting on JupyterLab via notebook4
[20:33:01] yeah
[20:33:27] and am not able to see the cell contents of my files via JupyterNotebook...just seeing the outputs.
[20:34:33] ah ha. i think i see it. i don't really know why it working on 1003 or for me, but i think this version of jupyterlab isn't compatible with our version of nodejs?
[20:34:36] rolling back.
[20:38:37] iflorez: how about now, on notebook1004?
[20:40:08] yes!
[20:40:16] *dancing*
[20:40:16] ok, rolling back for all.
[20:41:40] (PS1) Ottomata: Revert "Updating jupyterlab dependency to 1.0.9" [analytics/jupyterhub/deploy] - https://gerrit.wikimedia.org/r/532785
[20:42:11] (PS2) Ottomata: Revert "Updating jupyterlab dependency to 1.0.9" [analytics/jupyterhub/deploy] - https://gerrit.wikimedia.org/r/532785 (https://phabricator.wikimedia.org/T230724)
[20:42:28] (CR) Ottomata: [V: +2 C: +2] Revert "Updating jupyterlab dependency to 1.0.9" [analytics/jupyterhub/deploy] - https://gerrit.wikimedia.org/r/532785 (https://phabricator.wikimedia.org/T230724) (owner: Ottomata)
[20:46:17] !log rolling back to jupyterlab version 0.32.1, 1.0.x is not compatible with Stretch's version of nodejs - T230724
[20:46:22] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[20:46:23] T230724: Upgrade all SWAP users to JupyterLab 1.0 - https://phabricator.wikimedia.org/T230724
[20:48:32] AHHH i think i know why 1.0.x didn't work. i think my VM's version of nodejs is too new, and built an incompatible jupyterlab wheel
[20:57:14] Analytics, Analytics-Kanban, Analytics-SWAP, Product-Analytics, Patch-For-Review: Upgrade all SWAP users to JupyterLab 1.0 - https://phabricator.wikimedia.org/T230724 (Ottomata) After upgrading this morning, {T231365} was filed. I discovered that JupyterLab 1.0.x isn't compatible with the v...
[20:57:52] Analytics, Analytics-Kanban, Analytics-SWAP, Jupyter-Hub: Trouble accessing Jupyter Lab - https://phabricator.wikimedia.org/T231365 (Ottomata) I rolled back to JupyterLab 0.32.1.
[20:58:41] Analytics, Analytics-Kanban, Analytics-SWAP, Product-Analytics, Patch-For-Review: Upgrade all SWAP users to JupyterLab 1.0 - https://phabricator.wikimedia.org/T230724 (Ottomata)
[20:58:42] Analytics: Install Debian Buster on Hadoop - https://phabricator.wikimedia.org/T231067 (Ottomata)
[20:58:46] Analytics, Analytics-SWAP, Product-Analytics, Patch-For-Review: Upgrade all SWAP users to JupyterLab 1.0 - https://phabricator.wikimedia.org/T230724 (Ottomata)
[21:16:13] @ottomata, I'm happy to troubleshoot this issue so as to figure out how to proceed with the upgrade...it appears that most analysts had no trouble on notebook4 and only a short restart of on notebook3. Not sure why I experienced issues...but I'm happy to troubleshoot so as to keep us moving forward with upgrades.
[21:18:48] well, there does seem to be some incompatibility. it was broken for me on 1003 for a while.
[21:25:30] hmmm i think things are still broken.
[21:26:57] hm, or perhaps jupyterhub just needed a bounce after the downgrade for each user
[21:36:33] At present, my notebook4 is dropping the kernel. Every so often I'm seeing a "No Kernel!" note on the right hand side. And it keeps dropping the files in the list of running files.
[21:36:33] Maybe I should switch to notebook3 for the time being?
[22:51:34] shoot i see that too.
[22:51:41] does notebook1003 not do that to you?
[22:58:31] I'm getting a 503 message on notebook3
[23:18:56] sigh i'm sorry iflorez, i'm out for the day, will fix asap tomorrow morning. what timezone are you in?
[23:57:52] PST