[05:25:47] 10Analytics, 10Analytics-EventLogging, 10MW-1.32-release-notes (WMF-deploy-2018-08-07 (1.32.0-wmf.16)), 10Patch-For-Review, 10Performance-Team (Radar): Spin out a tiny EventLogging RL module for lightweight logging - https://phabricator.wikimedia.org/T187207 (10ori) I think that Nuria was right to press... [09:07:56] (03PS24) 10Joal: Add MediawikiHistoryChecker spark job [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/439869 (https://phabricator.wikimedia.org/T192481) [09:11:03] dsaez: Hello :) [09:11:35] dsaez: there are some old and long living spark jobs on the cluster currently - May I kill some of them? [09:34:42] hi joal! bonjour! just to confirm, the host key change for bast1002.wikimedia.org is intentional, correct? [09:38:30] Hi miriam_, bonjour :) [09:39:28] 10Analytics, 10Product-Analytics: Load change_tag tables into the Analytics Data Lake on a daily basis - https://phabricator.wikimedia.org/T201062 (10Neil_P._Quinn_WMF) Okay, that's quite disappointing. I think we need to discuss this more in our team meeting today, but some initial comments: >>! In T201062#4... [09:39:56] miriam_: bast1002 was down yesterday evening and some of us switched to different bastions based on that map: https://wikitech.wikimedia.org/wiki/Bastion [09:40:04] miriam_: sorry for the tech channel ping [09:42:56] miriam_: also from ops team logs, bast1002 got updated and restarted yesterday, so I think the key change is normal [09:45:05] joal: perfect, thanks a lot for your help! 
[10:17:51] 10Analytics, 10Analytics-Data-Quality, 10Analytics-Kanban, 10WMDE-Analytics-Engineering, 10User-GoranSMilovanovic: Data set review for the Wiktionary Cognate Dashboard - https://phabricator.wikimedia.org/T199851 (10GoranSMilovanovic) 05Open>03Resolved [10:49:17] 10Analytics-Tech-community-metrics, 10Upstream: "Wiki Editions" should be "Wiki edits" - https://phabricator.wikimedia.org/T164935 (10Aklapper) a:03Aklapper [12:07:23] Team: I have checked mediawiki-history validity manually using the checker - Everything alright [12:08:00] I have sent https://gerrit.wikimedia.org/r/450960 for ottomata to review, merge and apply whenever we want [12:15:21] (03PS1) 10Joal: Fix oozie mediawiki-history druid indexation job [analytics/refinery] - 10https://gerrit.wikimedia.org/r/450962 [12:52:06] (03PS5) 10Sahil505: Made Bar-chart popup dynamic [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/449029 (https://phabricator.wikimedia.org/T192416) [13:05:20] coool joal i should merge that ya? [13:05:29] ottomata: please :) [13:05:41] ottomata: The thing I don't know however is how it'll be applied [13:05:56] heh, me neither, i guess on aqs nodes [13:05:57] I think AQS needs to be restarted after that (maybe puppet takes care of it?) [13:06:09] oh [13:06:17] ok will run and watch puppet [13:06:25] is there a way to know if it has been applied? [13:06:31] ottomata: I'll check [13:06:39] can you check on just one node? [13:06:43] sure [13:07:17] ottomata: on aqs1004 first? [13:07:55] ya [13:07:58] just ran puppet there [13:08:01] it looks like aqs was not restarted [13:08:20] ya it was not [13:08:21] ottomata: no diff for me [13:08:26] hm [13:08:32] i will depool and bounce aqs [13:08:34] ya? 
[13:08:38] thanks :) [13:08:58] ok done [13:08:59] check it now [13:10:27] ottomata: works for me :) [13:10:51] ok, using scap to bounce the rest of them [13:11:32] ottomata: Once you're done with that, I'll double check a bit on the UI, making sure I don't find something weird (unexpected, but last 2 deploys here was, so) [13:11:56] 10Analytics, 10Product-Analytics: Load change_tag tables into the Analytics Data Lake on a daily basis - https://phabricator.wikimedia.org/T201062 (10Tgr) Change tags are added after the revision has been created and saved (and they can be added or removed much later, although that probably does not really hap... [13:16:37] joal that is done [13:19:03] checking ottomata [13:20:06] 10Analytics, 10Product-Analytics: Load change_tag tables into the Analytics Data Lake on a daily basis - https://phabricator.wikimedia.org/T201062 (10Ottomata) > Not sure if something else needs to be done on the changeprop side change-prop wouldn't need to be involved, unless the event needs to trigger someth... [13:23:46] (03CR) 10Ottomata: [C: 031] "Wow nice and simple. This works!?" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/450962 (owner: 10Joal) [13:24:22] ottomata: currently testing -- One error about null timestamps, trying to fix it [13:24:37] ottomata: Everything looks good about AQS :) Many thanks for the merge/deploy :) [13:24:49] !log Update AQS druid datasource to 2018-07 snapshot [13:24:51] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [13:26:15] yuppppp [13:26:22] maaan i think my ide is all messed up now [13:26:24] i updated java [13:26:27] nothing working [13:29:24] :( [13:31:42] 10Analytics, 10EventBus, 10Product-Analytics, 10Services (watching): Load change_tag tables into the Analytics Data Lake on a daily basis - https://phabricator.wikimedia.org/T201062 (10mobrovac) +1 on creating such an event, it sounds like a useful piece of information for clients to have/be able to react... 
[13:36:36] 10Analytics, 10EventBus, 10Product-Analytics, 10Services (watching): Load change_tag tables into the Analytics Data Lake on a daily basis - https://phabricator.wikimedia.org/T201062 (10Ottomata) Ah no, this would just be produced into Kafka via EventBus. We automatically consume and refine all of the Medi... [13:39:12] 10Analytics, 10Analytics-EventLogging, 10Analytics-Kanban, 10EventBus, 10Services (watching): Modern Event Platform: Schema Registry - https://phabricator.wikimedia.org/T201063 (10Ottomata) The features listed here have definitely grown from what I had originally considered when I proposed this program.... [13:40:41] 10Analytics, 10Analytics-EventLogging, 10Analytics-Kanban, 10EventBus, 10Services (watching): Modern Event Platform: Schema Registry - https://phabricator.wikimedia.org/T201063 (10Ottomata) > it needs to be an editble GUI that will dynamically change the behavior of remote clients Hm, it also might make... [13:41:47] (03PS1) 10Cicalese: Filter out more erroneous pingback data caused by T200864. [analytics/reportupdater-queries] - 10https://gerrit.wikimedia.org/r/450983 [13:54:39] gotta go drop my car off for inspection, back shortly.... [13:59:26] hm, joal I saw mediawiki-history still failed with the parquet fanciness, but I assume you're working on it? [13:59:42] (03PS6) 10Sahil505: Made Bar-chart popup dynamic [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/449029 (https://phabricator.wikimedia.org/T192416) [13:59:59] milimetric: failure is for druid job, not the more important ones [14:00:28] milimetric: I'm indeed working on revamping the druid using parquet, but hit a nasty small thing (null timestamp) [14:00:52] ah, ok [14:01:07] (03PS2) 10Cicalese: Filter out more erroneous pingback data caused by T200864. [analytics/reportupdater-queries] - 10https://gerrit.wikimedia.org/r/450983 [14:02:03] (03CR) 10Milimetric: Filter out more erroneous pingback data caused by T200864. 
(032 comments) [analytics/reportupdater-queries] - 10https://gerrit.wikimedia.org/r/450983 (owner: 10Cicalese) [14:02:50] (03CR) 10Milimetric: "oh, you got it, ignore me :)" [analytics/reportupdater-queries] - 10https://gerrit.wikimedia.org/r/450983 (owner: 10Cicalese) [14:03:27] (03PS3) 10Cicalese: Filter out more erroneous pingback data caused by T200864. [analytics/reportupdater-queries] - 10https://gerrit.wikimedia.org/r/450983 [14:05:56] (03CR) 10Cicalese: "> Patch Set 2:" [analytics/reportupdater-queries] - 10https://gerrit.wikimedia.org/r/450983 (owner: 10Cicalese) [14:06:56] (03PS7) 10Sahil505: Made Bar-chart popup dynamic [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/449029 (https://phabricator.wikimedia.org/T192416) [14:07:54] (03CR) 10Sahil505: "Marcel: Please feel free to review it :]" [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/449029 (https://phabricator.wikimedia.org/T192416) (owner: 10Sahil505) [14:14:02] (03CR) 10Cicalese: [V: 032 C: 032] Filter out more erroneous pingback data caused by T200864. [analytics/reportupdater-queries] - 10https://gerrit.wikimedia.org/r/450983 (owner: 10Cicalese) [14:14:25] anyone available to help me get a post-aggregation going in Superset? following http://druid.io/docs/latest/querying/post-aggregations.html#example-usage and having issues https://usercontent.irccloud-cdn.com/file/XkXlAlXX/superset_stitched.png [14:15:31] Superset's documentation on this is…non-existent [14:21:38] 10Analytics, 10EventBus, 10Product-Analytics, 10Services (watching): Load change_tag tables into the Analytics Data Lake on a daily basis - https://phabricator.wikimedia.org/T201062 (10Milimetric) >> * In the meantime, when we're more ok with our other goals work, we'll figure out how to do the incremental... [14:22:53] bearloga: heya [14:23:25] bearloga: do you want a post-agreggation as a new metric, or do you want to use a post-aggregation as a on-offin a vizu? 
[14:23:46] I think the latter can be achieved in an easier way [14:24:26] joal: I want it as a metric so it's available as an option whenever a new slice is created [14:24:38] ok [14:25:58] joal: I also tried using this as the JSON https://www.irccloud.com/pastebin/mBX6BsN7/ [14:26:20] k [14:26:39] joal: which then gives this query (that also errors) https://usercontent.irccloud-cdn.com/file/8gzKyvpz/Screen%20Shot%202018-08-07%20at%2010.21.37%20AM.png [14:27:58] 10Analytics, 10EventBus, 10Product-Analytics, 10Services (watching): Load change_tag tables into the Analytics Data Lake on a daily basis - https://phabricator.wikimedia.org/T201062 (10Ottomata) > pages get restored in the past, not even sure what happens to tags but we see those as new revisions in EventB... [14:35:39] done bearloga - Solution was to only put json for post-aggregation and reuse existing metrics (you can see it in the gsc_all datasource) [14:37:20] joal: thank you so much!!! you're the best!!! [14:38:22] you're welcome bearloga :) Nice to see those tools used :) [14:38:52] ottomata: when you want, let's discuss the patch of partition-checker [14:40:22] 10Analytics, 10EventBus, 10Product-Analytics, 10Services (watching): Load change_tag tables into the Analytics Data Lake on a daily basis - https://phabricator.wikimedia.org/T201062 (10Tgr) At a glance there's just a page.delete / page.undelete event (or mediawiki.revision-visibility-change if it's revisio... [14:42:39] joal: we use scala 2.11.7 when building and running, yes? [14:42:59] joal: sure, actually, that is WIP especially since I couldn't actually test/run it [14:43:09] i think i want to refactor a little more to make that easier [14:43:13] you think that'd be ok? [14:43:21] maybe make a case class for CamusPartitionStatus [14:43:28] instead of using map and tuples? [14:43:39] hmm, wonder if i could reuse hive partition... 
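For reference, the fix joal describes at 14:35:39 (put only the post-aggregation JSON in the metric and reuse existing metrics) could look roughly like the sketch below. It builds a standard Druid arithmetic post-aggregator. The metric names `clicks`, `impressions` and `clickthrough_rate` are hypothetical and are not taken from the gsc_all datasource.

```python
import json


def ratio_post_aggregation(name, numerator, denominator):
    """Build a Druid arithmetic post-aggregator dividing two existing metrics."""
    return {
        "type": "arithmetic",
        "name": name,
        "fn": "/",
        "fields": [
            # Each field refers to a metric that is already defined elsewhere.
            {"type": "fieldAccess", "name": numerator, "fieldName": numerator},
            {"type": "fieldAccess", "name": denominator, "fieldName": denominator},
        ],
    }


# Hypothetical metric names, for illustration only.
spec = ratio_post_aggregation("clickthrough_rate", "clicks", "impressions")
print(json.dumps(spec, indent=2))
```

Only the post-aggregator object is emitted here; the aggregations it references are assumed to exist already as metrics on the datasource.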
[14:44:25] hm [14:45:36] ottomata: I like the idea of refactoring the thing a bit [14:45:49] I wonder how, and also wonder about email-sending [14:46:00] Should we add refinery-core as a dep for omaus? [14:46:09] camus sorry [14:46:13] that might be nice [14:46:17] i mostly copy/pasted so i wouldn't have to do that [14:46:21] i guess won't hurt, right? [14:46:49] hm, only issue I can think of (I think I recall we hit that before) is dependencies-conflicts [14:47:12] hm [14:52:07] 10Analytics, 10EventBus, 10Product-Analytics, 10Services (watching): Load change_tag tables into the Analytics Data Lake on a daily basis - https://phabricator.wikimedia.org/T201062 (10Pchelolo) a:03Pchelolo > revisions belonging to restored pages emit new revision-create events when the page is restored... [14:52:16] phew, finally got intellij working again [14:54:22] (03PS2) 10Ottomata: Add email error reporting to CamusPartitionChecker [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/450861 (https://phabricator.wikimedia.org/T198908) [14:54:34] just a fix to tests so the thing passes, still working on it [14:55:15] np ottomata - Let me know if you want to brainbounce :) [15:01:42] ping ottomata [15:02:17] 10Analytics: Page creation data no longer updates - https://phabricator.wikimedia.org/T201420 (10Nettrom) [15:13:42] joal: sorry for the delay, for sure, please kill them, I don't know how they stay there. [15:25:37] np dsaez :) I'll kill the old ones (the ones from yesterday will stay) [15:26:41] joal: do you know which machine they were coming from? 
When I run a process on stat100X I'm usually aware of the status, I'm worried that from the notebooks I'm not understanding how to kill them [15:27:24] dsaez: I'm assuming the jobs have been started by notebooks, and that the notebook kernel is still alive [15:40:27] ya dsaez with spark notebooks, you can't just close the browser window [15:40:32] the kernel keeps running in the background [15:40:35] you have to stop the kernel [15:41:51] 10Analytics, 10EventBus, 10Product-Analytics, 10Services (watching): Load change_tag tables into the Analytics Data Lake on a daily basis - https://phabricator.wikimedia.org/T201062 (10Nuria) I am also of the opinion that adding a revision-tag-change event is the easiest thing to do if those tags can easi... [15:43:23] joal: is superset up-to-date? I'm trying to add a filter to a slice but getting "unorderable types: str() < int()" (see 'Top 10 Wikipedias by clickthroughs' slice) and looks like that was an issue that got fixed earlier this year https://github.com/apache/incubator-superset/issues/3029 [15:45:37] joal: when I remove all filters the slice works but when I add filter (project in wikipedia) I get the error [15:45:54] Ottomata, interesting, and can I access a session after closing the browser? [15:45:56] 10Analytics: Rename "new pages" endpoint to "net new pages" to better convey that we are reporting a calculation of pages created- pages deleted + restored pages - https://phabricator.wikimedia.org/T201425 (10Nuria) [15:48:01] 10Analytics: Rename "new pages" endpoint to "net new pages" to better convey that we are reporting a calculation of pages created- pages deleted + restored pages - https://phabricator.wikimedia.org/T201425 (10Nuria) This is probably going to mean maintaining old naming for backwards compatibility and creating... 
[15:48:09] 10Analytics, 10Easy: Rename "new pages" endpoint to "net new pages" to better convey that we are reporting a calculation of pages created- pages deleted + restored pages - https://phabricator.wikimedia.org/T201425 (10Nuria) [15:49:48] bearloga: Not sure about the version [15:49:51] checking [15:52:54] oo joal let's talk revision score stuff soon too ya? [15:52:57] bearloga: we're super late !!!! [15:53:32] ottomata: Yes! Forgot about that during my holidays :( Sorry about that- Will put it on my calendar again :) [15:53:41] ottomata: can we easily update superset? [15:53:57] ottomata: latest release is 0.26.3, we are at 0.20.6 [15:55:19] easily probably! make a task? [15:55:30] yessir [15:55:39] dsaez: yes, but that also leave the spark job running [15:56:25] ottomata: in the notebook UI, go to the "Running" tab [15:56:46] dsaez: main UI I mean, where you browse files [15:57:52] 10Analytics: Update superset (we have 0.20.6, 0.26.3 is available) - https://phabricator.wikimedia.org/T201430 (10JAllemandou) [15:58:08] bearloga, ottomata: --^ [16:00:41] 10Analytics, 10Easy: Rename "new pages" endpoint to "net new pages" to better convey that we are reporting a calculation of pages created- pages deleted - https://phabricator.wikimedia.org/T201425 (10Nuria) [16:02:45] (03CR) 10Milimetric: Fix interval bugs in time range selector (034 comments) [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/450063 (https://phabricator.wikimedia.org/T200497) (owner: 10Mforns) [16:04:10] 10Analytics: Update superset (we have 0.20.6, 0.26.3 is available) - https://phabricator.wikimedia.org/T201430 (10mpopov) Motivation (beyond that it's just nice to have the latest and greatest): I'm trying to add a filter to a slice (which usually works) but when the filter is added, the slice goes from working... [16:05:52] (03CR) 10Milimetric: [V: 032 C: 032] "Looks much better, thank you!" 
[analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/449029 (https://phabricator.wikimedia.org/T192416) (owner: 10Sahil505) [16:08:38] ottomata, joal : I understand that i need to kill the process explicitly.. and that you can do that in the tab that joal is describing. What I'm not sure about is whether you can reconstruct the session when you click there. That would be great, because you can leave something running in the background and come back later. [16:08:54] * dsaez is experimenting with background notebooks :) [16:11:25] oh yeah! they are alive!!! [16:12:48] dsaez: if you click on the link of your "green" notebook, it opens the session, and the fact that the icon is green means its kernel is alive (and spark job as well, if session has not been explicitly closed) [16:13:32] (03PS3) 10Ottomata: Add email error reporting to CamusPartitionChecker [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/450861 (https://phabricator.wikimedia.org/T198908) [16:13:37] a-team: Actually, we already use first-revision timestamp as creation date for users if registration is empty [16:13:57] joal: got it, perfect. [16:14:26] a-team: I'm gonna investigate, but it seems there are quite a few users with no registration date [16:15:03] dsaez: now that you know about zombie-notebooks, do you mind killing the ones that need be? [16:15:07] dsaez: please :) [16:15:34] dsaez: Oh you did already my bad :) [16:15:45] dsaez: Thank you ! [16:15:45] joal: I killed all of them...like in a zombie's movie :D [16:15:51] hehe :D [16:18:03] I'm now fighting with this memoryOverhead, I'm not sure if it is a configuration issue, or something wrong in my code [16:26:03] dsaez: yeah sorry still haven't had time to look at that [16:26:30] ottomata, np [16:27:08] dsaez: this issue looks very weird [16:27:10] the current improvements on notebooks are great [16:27:14] dsaez: can you share the notebook? [16:27:28] joal, yep, give me a second. [16:32:47] joal: I can't connect now, not sure why... 
but I can't ssh to stat1005 either [16:33:06] oh bast1002 issues [16:35:31] 10Quarry, 10Gerrit: Quarry repo access should be cleaned up - https://phabricator.wikimedia.org/T201435 (10Jdforrester-WMF) [16:44:56] ottomata: I see a funny test file in my notebook folder :) [16:45:04] ottomata: What were you testing? [16:51:11] ottomata: I think the spark update broke SWAP integration :( [16:51:19] ottomata: I can't get pyspark working [16:51:29] ottomata: ImportError: No module named 'py4j' [16:56:26] ottomata: works fine with Scala, but can't make it work with toree :( [16:57:02] with python sorry [16:57:46] uh ohhhh [16:57:54] joal: i was trying to test notebook sharing [16:57:59] didn't get anywhere [16:58:01] Ah :) [16:58:15] I assume I can delete that file ;) [16:58:17] ya [16:58:39] permission denied (obviously) [16:58:54] deleting [17:01:18] (03PS4) 10Ottomata: Add email error reporting to CamusPartitionChecker [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/450861 (https://phabricator.wikimedia.org/T198908) [17:01:21] joal ok ^ is ready for review [17:01:26] looking into pyspark now... [17:01:30] for notebook [17:01:42] thanks ottomata - Will review [17:03:26] now [17:08:26] (03CR) 10Joal: "Comment on not using vars, but globally ok :)" (034 comments) [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/450861 (https://phabricator.wikimedia.org/T198908) (owner: 10Ottomata) [17:08:39] gone for dinner, will be back in a bit [17:11:26] 10Analytics, 10EventBus, 10Product-Analytics, 10Patch-For-Review, 10Services (watching): Load change_tag tables into the Analytics Data Lake on a daily basis - https://phabricator.wikimedia.org/T201062 (10Pchelolo) The proposed schema falls into the `revision` hierarchy, so it will inherit from the `revi... 
[17:11:41] 10Analytics, 10EventBus, 10Product-Analytics, 10Patch-For-Review, 10Services (watching): Load change_tag tables into the Analytics Data Lake on a daily basis - https://phabricator.wikimedia.org/T201062 (10Nuria) ping @Neil_P._Quinn_WMF @Tgr take a look at schema proposal: https://gerrit.wikimedia.org/r/... [17:15:01] 10Analytics, 10Analytics-EventLogging, 10Analytics-Kanban, 10Patch-For-Review, 10User-Elukey: Upgrade eventlogging servers to Stretch - https://phabricator.wikimedia.org/T114199 (10Cmjohnson) [17:17:52] 10Analytics, 10Discovery-Search (Current work), 10Patch-For-Review: Use kafka for communication from analytics cluster to elasticsearch - https://phabricator.wikimedia.org/T198490 (10EBernhardson) [17:17:55] 10Analytics, 10EventBus, 10Operations, 10Discovery-Search (Current work), and 2 others: Create kafka topic for mjolinr bulk daemon and decide on cluster - https://phabricator.wikimedia.org/T200215 (10EBernhardson) 05Open>03Resolved [17:20:50] 10Analytics-Kanban, 10Patch-For-Review: Eventlogging's processors stopped working - https://phabricator.wikimedia.org/T200630 (10Nuria) @Volans: much agree with the len limit not being a fix at all, rather a band aid needed to keep service afloat. The best solution on my opinion is to have a timeout when pr... [17:36:41] joal i know why is broken [17:36:51] can make a fast fix, but to do it better i'll make a new deb package [17:36:55] py4j was upgraded [17:37:03] and the kernel.json file referenced the old version path [17:37:12] so i'm going to make the spark2 deb package create a versionless symlink too [17:41:28] 10Quarry, 10Gerrit: Quarry repo access should be cleaned up - https://phabricator.wikimedia.org/T201435 (10Legoktm) I've made https://gerrit.wikimedia.org/r/admin/groups/b992861bb3f5df1d6f3df8b821c0bf919bb713a8 public, I don't know why it was set to private. 
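The versionless symlink ottomata mentions at 17:37:12 can be sketched as follows. This is only an illustration of the idea, not the actual spark2 deb packaging code: the file name pattern `py4j-*-src.zip`, the link name `py4j-src.zip`, and the version `0.10.7` used in the demo are assumptions.

```python
import glob
import os
import tempfile


def link_versionless_py4j(lib_dir):
    """Point lib_dir/py4j-src.zip at the versioned py4j zip in lib_dir."""
    candidates = sorted(glob.glob(os.path.join(lib_dir, "py4j-*-src.zip")))
    if not candidates:
        raise FileNotFoundError("no versioned py4j zip in " + lib_dir)
    # Plain string sort: good enough when only one version is installed.
    target = os.path.basename(candidates[-1])
    link = os.path.join(lib_dir, "py4j-src.zip")
    if os.path.lexists(link):
        os.remove(link)  # replace a stale link left by an older version
    os.symlink(target, link)  # relative link, so it survives moving lib_dir
    return link


# Demo in a throwaway directory with a hypothetical versioned file.
with tempfile.TemporaryDirectory() as demo_dir:
    open(os.path.join(demo_dir, "py4j-0.10.7-src.zip"), "w").close()
    demo_link = link_versionless_py4j(demo_dir)
    demo_target = os.readlink(demo_link)
```

A kernel.json that references the versionless name would then keep working across py4j upgrades, which is the failure mode joal hit with `ImportError: No module named 'py4j'`.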
[17:43:07] 10Quarry, 10Gerrit: Quarry repo access should be cleaned up - https://phabricator.wikimedia.org/T201435 (10Jdforrester-WMF) [17:49:37] (03CR) 10Nuria: [C: 031] Add saltrotate, a script that manages cryptographic salts (031 comment) [analytics/refinery] - 10https://gerrit.wikimedia.org/r/449249 (https://phabricator.wikimedia.org/T199899) (owner: 10Mforns) [17:53:06] 10Analytics, 10Analytics-Kanban, 10wikimediafoundation.org: Measure traffic for new wikimedia foundation site - https://phabricator.wikimedia.org/T188419 (10Nuria) Do file a ticket, it might be you are not on the ldap group needed: https://wikitech.wikimedia.org/wiki/Analytics/Systems/Piwik#Access [17:54:22] 10Analytics, 10Patch-For-Review, 10Wikimedia-Incident, 10cloud-services-team (Kanban): Alarms on throughput on refined data - https://phabricator.wikimedia.org/T198908 (10Nuria) I suggest both NavigationTiming and VirtualPageview which should have data at all times. [18:00:53] joal: should be fixed now [18:06:24] 10Analytics, 10EventBus, 10Product-Analytics, 10Patch-For-Review, 10Services (watching): Load change tags into the Analytics Data Lake on a daily basis - https://phabricator.wikimedia.org/T201062 (10Neil_P._Quinn_WMF) [18:09:57] (03PS1) 10Ottomata: Update wheels with pyhive and impyla for default Hive access in prod [analytics/swap/deploy] - 10https://gerrit.wikimedia.org/r/451060 (https://phabricator.wikimedia.org/T183145) [18:10:03] (03PS1) 10Ottomata: Updating wheels with Apache toree 0.2.0 rc5, and jupyterlab 0.32.1 [analytics/swap/deploy] - 10https://gerrit.wikimedia.org/r/451061 (https://phabricator.wikimedia.org/T198738) [18:10:06] (03PS1) 10Ottomata: Use locally committed toree .tar.gz to build wheel [analytics/swap/deploy] - 10https://gerrit.wikimedia.org/r/451062 [18:10:08] (03PS1) 10Ottomata: Install toree kernels for all users [analytics/swap/deploy] - 10https://gerrit.wikimedia.org/r/451063 (https://phabricator.wikimedia.org/T190443) [18:10:10] (03PS1) 10Ottomata: Add 
spark.executorEnv.PYTHONPATH for pyspark on yarn in jupyter [analytics/swap/deploy] - 10https://gerrit.wikimedia.org/r/451064 (https://phabricator.wikimedia.org/T198909) [18:10:13] (03PS1) 10Ottomata: Add note about PYSPARK_PYTHON [analytics/swap/deploy] - 10https://gerrit.wikimedia.org/r/451065 (https://phabricator.wikimedia.org/T190443) [18:10:14] (03PS1) 10Ottomata: Use ipython for PySpark instead of Toree [analytics/swap/deploy] - 10https://gerrit.wikimedia.org/r/451066 (https://phabricator.wikimedia.org/T190443) [18:10:17] (03PS1) 10Ottomata: Fix path to brunel jar for spark scala jupyter kernels [analytics/swap/deploy] - 10https://gerrit.wikimedia.org/r/451067 (https://phabricator.wikimedia.org/T190443) [18:10:19] (03PS1) 10Ottomata: Use versionless symlink for spark kernels that use py4j [analytics/swap/deploy] - 10https://gerrit.wikimedia.org/r/451068 (https://phabricator.wikimedia.org/T200732) [18:16:38] (03PS1) 10Ottomata: Use versionless symlink for spark kernels that use py4j [analytics/jupyterhub/deploy] - 10https://gerrit.wikimedia.org/r/451069 (https://phabricator.wikimedia.org/T200732) [18:17:01] (03Abandoned) 10Ottomata: Update wheels with pyhive and impyla for default Hive access in prod [analytics/swap/deploy] - 10https://gerrit.wikimedia.org/r/451060 (https://phabricator.wikimedia.org/T183145) (owner: 10Ottomata) [18:17:09] (03Abandoned) 10Ottomata: Updating wheels with Apache toree 0.2.0 rc5, and jupyterlab 0.32.1 [analytics/swap/deploy] - 10https://gerrit.wikimedia.org/r/451061 (https://phabricator.wikimedia.org/T198738) (owner: 10Ottomata) [18:17:16] (03Abandoned) 10Ottomata: Use locally committed toree .tar.gz to build wheel [analytics/swap/deploy] - 10https://gerrit.wikimedia.org/r/451062 (owner: 10Ottomata) [18:17:22] (03Abandoned) 10Ottomata: Install toree kernels for all users [analytics/swap/deploy] - 10https://gerrit.wikimedia.org/r/451063 (https://phabricator.wikimedia.org/T190443) (owner: 10Ottomata) [18:17:25] (03Abandoned) 
10Ottomata: Add spark.executorEnv.PYTHONPATH for pyspark on yarn in jupyter [analytics/swap/deploy] - 10https://gerrit.wikimedia.org/r/451064 (https://phabricator.wikimedia.org/T198909) (owner: 10Ottomata) [18:17:27] (03Abandoned) 10Ottomata: Add note about PYSPARK_PYTHON [analytics/swap/deploy] - 10https://gerrit.wikimedia.org/r/451065 (https://phabricator.wikimedia.org/T190443) (owner: 10Ottomata) [18:17:31] (03Abandoned) 10Ottomata: Use ipython for PySpark instead of Toree [analytics/swap/deploy] - 10https://gerrit.wikimedia.org/r/451066 (https://phabricator.wikimedia.org/T190443) (owner: 10Ottomata) [18:17:33] (03Abandoned) 10Ottomata: Fix path to brunel jar for spark scala jupyter kernels [analytics/swap/deploy] - 10https://gerrit.wikimedia.org/r/451067 (https://phabricator.wikimedia.org/T190443) (owner: 10Ottomata) [18:17:35] (03Abandoned) 10Ottomata: Use versionless symlink for spark kernels that use py4j [analytics/swap/deploy] - 10https://gerrit.wikimedia.org/r/451068 (https://phabricator.wikimedia.org/T200732) (owner: 10Ottomata) [18:17:48] (03CR) 10Ottomata: [V: 032 C: 032] Use versionless symlink for spark kernels that use py4j [analytics/jupyterhub/deploy] - 10https://gerrit.wikimedia.org/r/451069 (https://phabricator.wikimedia.org/T200732) (owner: 10Ottomata) [18:44:57] 10Analytics, 10Operations, 10Traffic: The WMF-Last-Access Set-Cookie header should follow RFC 2965 syntax rather than the pre-RFC Netscape format - https://phabricator.wikimedia.org/T147967 (10Jdforrester-WMF) [18:45:11] 10Analytics, 10Operations, 10Traffic: The WMF-Last-Access Set-Cookie header should follow RFC 2965 syntax rather than the pre-RFC Netscape format - https://phabricator.wikimedia.org/T147967 (10Jdforrester-WMF) [18:45:47] 10Analytics, 10Operations, 10Traffic: The WMF-Last-Access Set-Cookie header should follow RFC 2965 syntax rather than the pre-RFC Netscape format - https://phabricator.wikimedia.org/T147967 (10Jdforrester-WMF) >>! 
In T147967#2710596, @BBlack wrote: > I'd suggest blocking this on the seemingly-unrelated T1471... [19:10:55] ottomata: I confirm it works for me! Many thanks :) [19:15:05] yeehaw [19:19:05] joal: still around? [19:19:12] looking at top of https://phabricator.wikimedia.org/T190443 [19:19:12] sure [19:19:15] we got 2 checkboxes there [19:19:20] have I fixed those while you were gone?! :p [19:19:44] can't remember the status [19:19:52] do those work? [19:20:15] ottomata: IIRC you added the brunel-2.6 jar automatically, but this has not worked for me as is [19:22:27] ottomata: I wonder if your addition of the jar to toree includes it as "magical jar" [19:24:55] 10Analytics, 10Analytics-EventLogging, 10MW-1.32-release-notes (WMF-deploy-2018-08-07 (1.32.0-wmf.16)), 10Patch-For-Review, 10Performance-Team (Radar): Spin out a tiny EventLogging RL module for lightweight logging - https://phabricator.wikimedia.org/T187207 (10Milimetric) The argument that I initially w... [19:26:04] hm [19:26:14] joal: and the matplotlib one? [19:26:17] was that you who added that? [19:26:39] hm - Can't recall - currently doing some stuff that will tell me [19:28:33] ottomata: I think you created it?? 
I'm currently testing python code that will tell me if matplotlib works for me [19:30:29] something else to do ottomata: Limit the number of workers of pyspark-yarn kernel :) [19:30:52] I know it's done for toree, but as of now I'm eating a lot of resources using pyspark [19:33:29] hm joal it has spark.dynamicAllocation.maxExecutors=128 [19:37:16] rm [19:37:32] ottomata: matplotlib working for me :) [19:41:20] great [19:41:26] another check box bites the dust [19:41:46] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Spark Jupyter Notebook integration - https://phabricator.wikimedia.org/T190443 (10Ottomata) [19:43:08] * joal sings tum-tum-tum---tudutudu-tum-tudum [19:50:00] byeee teaaam, see you tomorrow [19:50:08] see you mforns [19:50:17] :] [20:00:04] 10Analytics: Scan npm dependencies for vulnerabilities - https://phabricator.wikimedia.org/T200717 (10Milimetric) [20:03:23] 10Analytics: Piwik user account for Wikimedia.org.il - https://phabricator.wikimedia.org/T199046 (10Milimetric) Piwik (now called Matomo) actually doesn't handle too much traffic so even small sites are a problem for it. The chapter sites should still be ok on it. With our other priorities it will just take a... [20:06:56] joal: were you able to get more than 128 executors for pyspark yarn kernel? [20:07:04] I think I did [20:07:18] hm [20:07:20] Let me try again ottomata [20:07:50] now ottomata --> having more than 200 workers [20:07:58] hm [20:08:05] can you tell which pid this is [20:08:07] on notebook1003? [20:08:15] i see this one [20:08:16] 30378 [20:08:22] which has --conf spark.dynamicAllocation.maxExecutors=128 [20:08:41] was started 6 mins ago [20:09:04] do you know if the scala toree one lets you get more than 128 executors? [20:09:14] I think it doesn't, can check [20:11:56] ottomata: confirm it works with toree [20:13:27] hm ok [20:16:52] joal: try now? have a thought. [20:16:56] or [20:17:00] how can I easily repro? 
[20:17:08] and i will try [20:18:54] ottomata: it works :) [20:18:59] great, out of order args [20:19:30] ottomata: can be easily repro-ed: https://gist.github.com/jobar/f6c53c06340f3d1efa2ca74b17365907 [20:20:40] 10Analytics, 10Analytics-Cluster, 10WMDE-Analytics-Engineering, 10User-GoranSMilovanovic: Can't write from Spark to local FS - https://phabricator.wikimedia.org/T200609 (10Milimetric) Is it possible this might be trying to write on the machine where yarn is executing the program, instead of on stat1005? I... [20:20:51] (03PS1) 10Ottomata: Fix pyspark kernels to properly apply dynamicAllocation.maxExecutors=128 [analytics/jupyterhub/deploy] - 10https://gerrit.wikimedia.org/r/451103 (https://phabricator.wikimedia.org/T190443) [20:21:07] aye :) [20:21:07] Hello A-team! I'm working an oozie job for https://phabricator.wikimedia.org/T186828 and it is working well under my account. I'm thinking it'd be better to move the table to wmf and not run the oozie job under my account. Should I deploy the code as the instruction in https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Deploy/Refinery ? [20:21:24] joal https://gerrit.wikimedia.org/r/#/c/analytics/jupyterhub/deploy/+/451103/1/kernels/spark_yarn_pyspark/kernel.json was the problem [20:22:06] chelsyx: no, we can deploy refinery, but we should get your oozie job into the refinery code: https://github.com/wikimedia/analytics-refinery/tree/master/oozie [20:22:10] ottomata: Ahhhhh ! [20:22:55] (03CR) 10Ottomata: [V: 032 C: 032] Fix pyspark kernels to properly apply dynamicAllocation.maxExecutors=128 [analytics/jupyterhub/deploy] - 10https://gerrit.wikimedia.org/r/451103 (https://phabricator.wikimedia.org/T190443) (owner: 10Ottomata) [20:23:07] chelsyx: so the process is basically: [20:23:09] milimetric: Thx! So I can just commit my code and nothing else need to do from me? 
[20:23:37] chelsyx: yeah, just submit a gerrit change where you add your oozie code under that folder, and then we can help merge/test/deploy [20:24:14] milimetric: Cool. Thanks! [20:24:33] chelsyx: and add some of us as reviewers so we see it [20:24:44] milimetric: Will do! [20:25:05] Analytics, Analytics-Kanban, Patch-For-Review: Spark Jupyter Notebook integration - https://phabricator.wikimedia.org/T190443 (Ottomata) @diego to make it easier to troubleshoot, can you explain how to reproduce? Thanks! [20:26:41] chelsyx: Please make sure you test your code once it is in oozie; you can test it under your user just the same, but it needs to be tested again to make sure paths are correct [20:27:06] Analytics, Patch-For-Review, Wikimedia-Incident, cloud-services-team (Kanban): Alarms on throughput on camus imported data - https://phabricator.wikimedia.org/T198908 (Ottomata) [20:27:26] Analytics, Analytics-Kanban, Patch-For-Review, Wikimedia-Incident, cloud-services-team (Kanban): Alarms on throughput on camus imported data - https://phabricator.wikimedia.org/T198908 (Ottomata) [20:27:38] Analytics, Analytics-Kanban, Patch-For-Review, Wikimedia-Incident, cloud-services-team (Kanban): Alarms on throughput on camus imported data - https://phabricator.wikimedia.org/T198908 (Ottomata) a:Ottomata [20:29:35] nuria_: I've tested it under my username, but I don't have access to some directories (like `/wmf/data/archive`) that the data will be in eventually, so I can't test those paths [20:30:12] chelsyx: you can override the paths when you are testing the code on the oozie dir, as they are passed as properties, right? [20:30:21] chelsyx: does that make sense? [20:30:45] nuria_: yes, I've overridden it with another path when testing it [20:31:16] chelsyx: nuria_ is saying (maybe you got this) that for the stuff in the .properties file, you can override the values on the CLI when you submit the oozie job [20:31:19] e.g.
[20:31:46] -Dmy.data.path=/user/chelsyx/cool/data [20:31:51] without having to edit the file [20:31:53] chelsyx: right, this is to make sure the code you are committing runs as is; overrides are -Dsome.property=?fake/value [20:31:59] exactly [20:32:14] nuria_ ottomata : Got you. Thanks! [20:32:43] chelsyx: btw thanks for braving the nasty world of oozie [20:32:54] chelsyx: indeed, see https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Oozie#Run_oozie_job_overriding_what_pertains [20:32:56] it can be pretty annoying, i know :) [20:33:39] yeah :) [20:35:19] gone for tonight team [20:36:08] laters joal [20:39:11] Analytics, Operations, decommission, ops-eqiad, Patch-For-Review: Decommission stat1002.eqiad.wmnet - https://phabricator.wikimedia.org/T173097 (Cmjohnson) [20:42:19] Analytics, Operations, ops-eqiad: Remove stat1002 - https://phabricator.wikimedia.org/T173094 (Cmjohnson) [20:42:25] Analytics-Cluster, Analytics-Kanban, Patch-For-Review: Replacement of stat1002 and stat1003 - https://phabricator.wikimedia.org/T152712 (Cmjohnson) [20:42:27] Analytics, Operations, decommission, ops-eqiad, Patch-For-Review: Decommission stat1002.eqiad.wmnet - https://phabricator.wikimedia.org/T173097 (Cmjohnson) Open>Resolved This server was given to Stroz and we have a copy of the hard drive in the eqiad data center on an encrypted drive,... [21:31:44] Analytics, Analytics-Kanban, wikimediafoundation.org: Measure traffic for new wikimedia foundation site - https://phabricator.wikimedia.org/T188419 (Varnent) [21:32:07] Analytics, Analytics-Kanban, wikimediafoundation.org: Measure traffic for new wikimedia foundation site - https://phabricator.wikimedia.org/T188419 (Varnent) >>! In T188419#4485799, @Nuria wrote: > Do file a ticket, it might be you are not on the ldap group needed: https://wikitech.wikimedia.org/wiki...
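A minimal Python sketch of the override behaviour discussed at 20:31: values come from the .properties file unless a -Dkey=value argument replaces them at submit time, so the committed file never has to be edited for a test run. This only models the precedence and is not how the oozie CLI itself is implemented. The helper names (parse_properties, apply_overrides), the file contents and the /wmf/data/archive/cool/data default are hypothetical, made up for illustration; only the -Dmy.data.path=/user/chelsyx/cool/data argument is taken from the log.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def apply_overrides(props, cli_args):
    """Return a copy of props where -Dkey=value arguments take precedence."""
    merged = dict(props)  # leave the file-based values untouched
    for arg in cli_args:
        if arg.startswith("-D") and "=" in arg:
            key, _, value = arg[2:].partition("=")
            merged[key] = value
    return merged


# Hypothetical contents of a job's .properties file (assumption, not from the log).
EXAMPLE_PROPERTIES = """\
# production defaults
my.data.path = /wmf/data/archive/cool/data
queue_name = default
"""

base = parse_properties(EXAMPLE_PROPERTIES)
effective = apply_overrides(base, ["-Dmy.data.path=/user/chelsyx/cool/data"])
print(effective["my.data.path"])
```

Properties that are not overridden (queue_name here) keep their file values, which is why a test run under a personal user can exercise the same committed code with only the paths swapped. The exact command line for a real submission is described on the wikitech Oozie page linked at 20:32:54.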
[22:05:35] Analytics: Piwik user account for Wikimedia.org.il - https://phabricator.wikimedia.org/T199046 (Nuria) @Ltzike would you consider installing piwik on a labs host and maintaining it yourself? My main concerns here are not so much priorities or traffic but rather access to our current install of piwik which is... [22:10:29] Analytics, Analytics-Cluster, Analytics-Kanban, Patch-For-Review: Upgrade spark 2.3.0 -> 2.3.1 on analytics cluster - https://phabricator.wikimedia.org/T200732 (Nuria) [22:10:36] Analytics, Analytics-Cluster, Analytics-Kanban, Patch-For-Review: Upgrade spark 2.3.0 -> 2.3.1 on analytics cluster - https://phabricator.wikimedia.org/T200732 (Nuria) Open>Resolved [22:10:59] Analytics-Kanban, Patch-For-Review: Update AQS new-pages endpoint - https://phabricator.wikimedia.org/T200272 (Nuria) Open>Resolved [22:12:35] Analytics, Analytics-Kanban, Analytics-Wikistats, Patch-For-Review: Missing stats for Atikamekw Wikipedia on stats.wikimedia.org - https://phabricator.wikimedia.org/T193625 (Nuria) @Seeris Stats are now there for atjwiki {F24665845} [22:12:46] Analytics, Analytics-Kanban, Analytics-Wikistats, Patch-For-Review: Missing stats for Atikamekw Wikipedia on stats.wikimedia.org - https://phabricator.wikimedia.org/T193625 (Nuria) Open>Resolved [22:13:07] Analytics-Kanban, Patch-For-Review: Eventlogging's processors stopped working - https://phabricator.wikimedia.org/T200630 (Nuria) [22:13:09] Analytics-Kanban, Patch-For-Review: Simplify and document how to increase log verbosity/level for Eventlogging - https://phabricator.wikimedia.org/T200765 (Nuria) Open>Resolved [22:13:24] Analytics, Analytics-Kanban, Patch-For-Review: [EL sanitization] Productionize EventLoggingSanitization.scala - https://phabricator.wikimedia.org/T193176 (Nuria) Open>Resolved [22:13:41] Analytics, Analytics-Kanban, Patch-For-Review: Fix sqoop script so that the 
jar-generation step doesn't print logs (alerts email sent by cron) - https://phabricator.wikimedia.org/T198966 (Nuria) Open>Resolved [22:13:55] Analytics, Analytics-Kanban, User-Elukey: Q1 2018/19 Analytics procurement - https://phabricator.wikimedia.org/T198694 (Nuria) [22:13:57] Analytics, Analytics-Kanban: Order Data Lake Hardware - https://phabricator.wikimedia.org/T198424 (Nuria) Open>Resolved