[00:15:29] (PS2) Milimetric: Add avkwiki to analytics whitelist [analytics/refinery] - https://gerrit.wikimedia.org/r/617111 (https://phabricator.wikimedia.org/T257943) (owner: Urbanecm)
[00:15:58] (PS1) Milimetric: Add lld and thankyou wikipedias [analytics/refinery] - https://gerrit.wikimedia.org/r/620806
[00:16:10] (CR) Milimetric: [V: +2 C: +2] Add lld and thankyou wikipedias [analytics/refinery] - https://gerrit.wikimedia.org/r/620806 (owner: Milimetric)
[09:53:31] (CR) DannyS712: [C: +1] Add avkwiki to analytics whitelist [analytics/refinery] - https://gerrit.wikimedia.org/r/617111 (https://phabricator.wikimedia.org/T257943) (owner: Urbanecm)
[09:53:39] hey. I have a question regarding the latest-snapshot of the xmldatadumps (I access via stat1005). I go over all the pages-articles*.xml*.bz2 using the "latest" snapshot (the smaller chunks in order to do distributed processing). while most of the files point to the 20200801-snapshot there is 1 file pointing to the 20200720-snapshot, for example "enwiki-latest-pages-articles27.xml-p63663462p64603917.bz2 ->
[09:53:39] ../20200720/enwiki-20200720-pages-articles27.xml-p63663462p64603917.bz2" (also for other wikis). when removing this file or using the 20200801-snapshot directly everything works as expected. is this expected behavior for the latest-snapshot? Thanks in advance for any insights
[10:23:33] Analytics, CAS-SSO: Allow login to JupyterHub via CAS - https://phabricator.wikimedia.org/T260386 (jbond) > I'm testing out this functionality now, but it isn't clear what is needed to get this authentication working with CAS Looking at the documentation it seems you need the following configuration ap...
[11:55:25] Hi team
[12:03:11] mgerlach: I assume what you've discovered is a bug (maybe?)
- On the stat machine the folder hierarchy is a mount of what is generated/managed by dumps - Ariel (nick apergos) is the go-to person, but he seems away as of now
[12:03:26] mgerlach: oh, and hello (sorry :S)
[12:27:07] joal: thanks. yes that is what I wanted to ask. should have added, I can work around it but it seemed rather unusual so wanted to raise in case it points towards a bug
[13:02:43] Hello, I have tried to set up a virtual environment in stat1005 for the first time. I followed https://wikitech.wikimedia.org/wiki/Analytics/Systems/Anaconda to successfully create a virtual environment named 'keras'. I also added the two proxies to the ~/.config file.
[13:02:43] However, I am running into problems when trying to install packages (once the env is activated) via either conda or pip (e.g. pip install ipdb), as in both cases I am getting "Retrying" type of messages all the time. Any idea?
[13:05:29] Hi agaduran - This feels like a proxy issue
[13:05:41] agaduran: have you tried setting up the http/https proxy?
[13:19:40] Analytics, Event-Platform, Technical-blog-posts: Story idea for Blog: Wikimedia's Event Platform - https://phabricator.wikimedia.org/T253649 (Ottomata) Great stuff thank you!
[13:24:54] Analytics, CAS-SSO: Allow login to JupyterHub via CAS - https://phabricator.wikimedia.org/T260386 (Ottomata) Thanks @jbond, I'll leave this as a low/medium priority one for now and discuss with Luca when he gets back. I'm working on a new JupyterHub setup that should allow us to sort of easily swap out...
[13:27:11] Analytics, CAS-SSO: Allow login to JupyterHub via CAS - https://phabricator.wikimedia.org/T260386 (Ottomata) And, @jbond there's no current 2FA (e.g. google Authenticator app?) support for CAS (yet), right? I think Luca and I both would feel more comfortable if there was some extra security on this one....
[13:29:22] joal: Hi joal. No, I did not.
Any reference page similar to https://wikitech.wikimedia.org/wiki/Analytics/Systems/Anaconda to do this?
[13:41:01] actually agaduran the proxy commands are written in the page you pasted
[13:41:14] agaduran: https://wikitech.wikimedia.org/wiki/Analytics/Systems/Anaconda#Installing_packages_into_your_user_conda_environment
[13:43:12] Yes, that I did. I copied those proxies and pasted to ~/.config
[13:43:47] agaduran: I'm no expert in conda - ca
[13:44:08] agaduran: can you try pasting the export commands in your shell and pip install after?
[13:47:37] joal: I took a look with agaduran; the issue is not only related to conda but we also couldn't create any virtual environments; we added the proxy to ~/.profile but this didn't work either
[13:52:45] mgerlach: let's try to be precise :) agaduran mentioned following the Anaconda page - you mention creating a virtual-env - let me ask what you aim for and where it fails
[13:53:41] I just tested following the commands on the Anaconda page and managed to install a package
[13:53:48] agaduran, mgerlach -- ^
[13:53:50] Aha. I put the exports proxies commands in the bashrc file, and then added source bashrc in bashrc_profile, and now it seems to work @joal @mgerlach Thanks
[13:54:14] agaduran: those are indeed bash
[14:03:48] Analytics, Analytics-EventLogging, Analytics-Kanban, Event-Platform, Product-Infrastructure-Data: Migrate legacy metawiki schemas to Event Platform - https://phabricator.wikimedia.org/T259163 (Ottomata) Hm @jlinehan, maybe we should start by identifying schemas that don't need migrated, and t...
[14:53:32] Hey team - I have an impromptu appointment, will be late some time at standup
[14:53:40] sorry for the late notice
[14:58:36] a-team I'm subbing nuria at tech mgr meeting so I won't be at standup
[14:58:47] (I know, mr bigshot)
[15:01:50] milimetric: standup?
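Re the 09:53 question about "latest" links that point at an older snapshot: a minimal Python sketch, not anything the dumps tooling provides, for spotting such links before processing. It assumes link targets look like the "../20200720/enwiki-20200720-..." example quoted in that message, and it treats the most common snapshot date as the expected one. The function names are hypothetical.

```python
import os
import re
from collections import Counter

# Matches the 8-digit snapshot directory in a link target such as
# "../20200720/enwiki-20200720-pages-articles27.xml-p63663462p64603917.bz2".
SNAPSHOT_RE = re.compile(r"(?:^|/)(\d{8})/")


def snapshot_of(target):
    """Return the snapshot date embedded in a symlink target, or None."""
    match = SNAPSHOT_RE.search(target)
    return match.group(1) if match else None


def stale_links(link_targets):
    """Given {link_name: target}, return the links whose snapshot differs
    from the most common one (assumption: the majority is the expected
    snapshot). Targets with no recognisable snapshot are flagged as well."""
    snapshots = {name: snapshot_of(t) for name, t in link_targets.items()}
    counts = Counter(s for s in snapshots.values() if s is not None)
    if not counts:
        return {}
    majority = counts.most_common(1)[0][0]
    return {name: link_targets[name]
            for name, s in snapshots.items() if s != majority}


def read_links(directory, marker="pages-articles"):
    """Collect {name: target} for symlinks in `directory` whose name
    contains `marker`."""
    links = {}
    for entry in os.scandir(directory):
        if entry.is_symlink() and marker in entry.name:
            links[entry.name] = os.readlink(entry.path)
    return links
```

Calling stale_links(read_links(path_to_latest_dir)) on the directory holding the "latest" links would list the files to skip or to re-point at the newer snapshot; the directory path is left to the caller since it is not given in the log.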
[15:02:14] Analytics, Analytics-EventLogging, Analytics-Kanban, Event-Platform, and 2 others: Automate ingestion and refinement into Hive of event data from Kafka using stream configs and canary/heartbeat events - https://phabricator.wikimedia.org/T251609 (hashar) @mforns and I will pair the crafting of a J...
[15:17:04] oh, I'm sorry, I thought it was at 12 today!
[15:17:40] I'm just working on the review for fdans, yesterday I was able to make a checksum and migrate a hive table to hudi
[15:27:42] Analytics: Establish what data must be backed up before the HDFS upgrade - https://phabricator.wikimedia.org/T260409 (JAllemandou) IMO we should add what is not re-computable. Here is what comes to my mind: - aqs (stats data of AQS usage) - browser-general - events (the ones that are not deleted, and 90 d...
[16:20:35] Analytics, Analytics-Kanban, Patch-For-Review: Order mediawiki_history dumps by event_timestamp - https://phabricator.wikimedia.org/T254233 (JAllemandou) Thanks @marcmiquel for letting us know! I reopen and will investigate.
[16:20:48] Analytics, Analytics-Kanban, Patch-For-Review: Order mediawiki_history dumps by event_timestamp - https://phabricator.wikimedia.org/T254233 (JAllemandou) Resolved→Open
[16:21:21] Analytics, Analytics-Kanban, Patch-For-Review: Order mediawiki_history dumps by event_timestamp - https://phabricator.wikimedia.org/T254233 (JAllemandou) a:mforns→JAllemandou
[16:30:00] Analytics, CAS-SSO: Allow login to JupyterHub via CAS - https://phabricator.wikimedia.org/T260386 (jbond) >>! In T260386#6393108, @Ottomata wrote: > Thanks @jbond, I'll leave this as a low/medium priority one for now and discuss with Luca when he gets back. I'm working on a new JupyterHub setup that sho...
[16:35:49] Willow
[16:38:29] Oak
[16:42:54] Analytics-Radar, Dumps-Generation, Okapi, Platform Engineering: HTML Dumps - June/2020 - https://phabricator.wikimedia.org/T254275 (RBrounley_WMF) Thanks all - added! https://github.com/wikimedia/OKAPI
[17:46:02] Analytics, Analytics-EventLogging, Event-Platform: Migrate EventLogging MediaViewer data to Event Platform - https://phabricator.wikimedia.org/T260582 (Lydia_Pintscher) I don't have a stake in it :D @Ramsey-WMF might though.
[18:02:57] Analytics, Analytics-Kanban, Patch-For-Review: Order mediawiki_history dumps by event_timestamp - https://phabricator.wikimedia.org/T254233 (marcmiquel) You're welcome. On Tue, 18 Aug 2020, 18:21, JAllemandou <no-reply@phabricator.wikimedia.org> wrote: > JAllemandou claimed this task. Vi...
[18:52:37] Analytics, Analytics-EventLogging, Event-Platform, Structured-Data-Backlog: Migrate EventLogging MediaViewer data to Event Platform - https://phabricator.wikimedia.org/T260582 (Ramsey-WMF) @Ottomata I discussed this with @MarkTraceur and we think you can go ahead and deactivate the MediaViewer...
[19:02:55] Analytics-Clusters, Discovery, Discovery-Search (Current work), Patch-For-Review: Move mjolnir kafka daemon from ES to search-loader VMs - https://phabricator.wikimedia.org/T258245 (Gehel)
[19:14:02] Analytics, Analytics-Kanban, Platform Team Workboards (Initiatives): reportupdater Pingback reports are broken and need to be refactored - https://phabricator.wikimedia.org/T246154 (mforns) @CCicalese_WMF We're tackling this task now. IIUC, the pingback heartbeat is sent at first install time, then...
[19:19:52] Analytics, Analytics-EventLogging, Event-Platform, Structured-Data-Backlog: Migrate EventLogging MediaViewer data to Event Platform - https://phabricator.wikimedia.org/T260582 (Ottomata) Awesome, thanks! Are there any other EventLogging schemas you all own that we can get rid of? (See the list...
[19:23:36] Analytics, Analytics-EventLogging, Event-Platform, Structured-Data-Backlog: Migrate EventLogging MediaViewer data to Event Platform - https://phabricator.wikimedia.org/T260582 (MarkTraceur) @Ottomata you can probably axe all of them, we have yet to decide which might be useful and/or a reason for...
[19:47:46] Analytics, Event-Platform, Technical-blog-posts: Story idea for Blog: Wikimedia's Event Platform - https://phabricator.wikimedia.org/T253649 (srodlund) Hey hey, I worked on the 1st post. This is really well written, and I just made minor grammar and style errors. Feel free to accept or decline or mak...
[20:05:02] Analytics, Event-Platform, Technical-blog-posts: Story idea for Blog: Wikimedia's Event Platform - https://phabricator.wikimedia.org/T253649 (srodlund) 2nd post is done! @Ottomata these are really good and needed hardly any editing at all! Do you have a date you want them to go up by? I'm thinking w...
[20:18:17] Analytics, Analytics-EventLogging, Event-Platform, Structured-Data-Backlog: Migrate EventLogging MediaViewer data to Event Platform - https://phabricator.wikimedia.org/T260582 (Tgr) All the other logging mechanisms have been removed from MediaViewer AFAIK. The schemas are all marked as inactive.
[20:22:36] Analytics, Event-Platform, Technical-blog-posts: Story idea for Blog: Wikimedia's Event Platform - https://phabricator.wikimedia.org/T253649 (Ottomata) Hi wow thank you! I am excited to look at your edits, will do first thing tomorrow. I think a week apart would be about right, which means I should...
[20:24:08] Analytics, Event-Platform, Technical-blog-posts: Story idea for Blog: Wikimedia's Event Platform - https://phabricator.wikimedia.org/T253649 (srodlund) Sounds good! They are seriously just minor grammar edits. You did a great job writing these!
[20:36:20] Analytics-Radar, Technical-blog-posts: Story idea for Blog: The Best Dataset on Wikimedia Content and Contributors - https://phabricator.wikimedia.org/T259559 (srodlund) @Milimetric <--I saw that you were editing on this doc, too, so I thought I would check-in and see if you feel it is ready to edit.
[21:04:17] Analytics-Radar, Technical-blog-posts: Story idea for Blog: The Best Dataset on Wikimedia Content and Contributors - https://phabricator.wikimedia.org/T259559 (Milimetric) @srodlund yes, totally ready! Edit away, and feel free to ping me if you want to hack on it together. My main goal with it is to ma...
[21:12:32] Analytics, Analytics-Kanban, Platform Team Workboards (Initiatives): reportupdater Pingback reports are broken and need to be refactored - https://phabricator.wikimedia.org/T246154 (CCicalese_WMF) Excellent! Thank you!
[21:22:24] Analytics-Radar, Technical-blog-posts: Story idea for Blog: The Best Dataset on Wikimedia Content and Contributors - https://phabricator.wikimedia.org/T259559 (srodlund) @Milimetric I will take a first pass at it (asynch), and then if we want to meet (synch) we can. Should I pay any attention to the se...
[22:30:45] Analytics, Discovery-Search, Event-Platform: swift_upload.py events not making it into kafka - https://phabricator.wikimedia.org/T260743 (EBernhardson)
[22:31:34] ottomata: sorry, seems i'm finding problems lately :( in ^ eventgate apparently returns 2xx but no events show up in kafka