[06:15:07] morning!
[07:08:30] Hello
[07:08:58] joal: o/
[07:09:05] I found jupyter-joal-singleuser.service failed on notebook1004
[07:09:20] (not sure if it is important)
[07:09:34] Nothing important elukey - I don't know what it could be
[07:09:51] It might be a leftover from our tests with Andrew
[07:10:05] it alarms in icinga due to systemd unit failures, I'll talk with Andrew to figure out what's happening
[07:10:12] Sounds good
[07:10:14] Thanks :)
[07:11:49] (PS18) Joal: Add MediawikiHistoryChecker spark job [analytics/refinery/source] - https://gerrit.wikimedia.org/r/439869 (https://phabricator.wikimedia.org/T192481)
[07:13:10] (PS10) Joal: Add validation step in mediawiki-history jobs [analytics/refinery] - https://gerrit.wikimedia.org/r/440005 (https://phabricator.wikimedia.org/T192481)
[07:13:47] (CR) Joal: "Tested on cluster" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/439869 (https://phabricator.wikimedia.org/T192481) (owner: Joal)
[07:13:58] (CR) Joal: "Tested on cluster" [analytics/refinery] - https://gerrit.wikimedia.org/r/440005 (https://phabricator.wikimedia.org/T192481) (owner: Joal)
[07:16:57] (PS19) Joal: Add MediawikiHistoryChecker spark job [analytics/refinery/source] - https://gerrit.wikimedia.org/r/439869 (https://phabricator.wikimedia.org/T192481)
[07:21:03] Analytics, Analytics-Kanban, Operations, netops, Patch-For-Review: Review analytics-in4 rules on cr1/cr2 eqiad - https://phabricator.wikimedia.org/T198623 (elukey) Fixed archiva and removed puppet in analytics-in4. The last step is to drop Ganglia and git-deploy terms from common-infrastructu...
[07:22:02] joal: did you see https://phabricator.wikimedia.org/T199046 ?
[07:22:55] Nope I had not seen that
[07:31:19] Analytics, Analytics-Kanban, Operations, netops, Patch-For-Review: Review analytics-in4 rules on cr1/cr2 eqiad - https://phabricator.wikimedia.org/T198623 (elukey) a: elukey
[07:35:39] (PS20) Joal: Add MediawikiHistoryChecker spark job [analytics/refinery/source] - https://gerrit.wikimedia.org/r/439869 (https://phabricator.wikimedia.org/T192481)
[07:37:38] Analytics, User-Elukey: Add a safe failover for analytics1003 - https://phabricator.wikimedia.org/T198093 (elukey) Any comment about this? (No rush/hurry, I am just checking my open tasks :)
[07:37:49] joal: (if you have time) - what do you think about https://phabricator.wikimedia.org/T198093 ?
[07:38:18] the TL;DR is that it might be possible to move the an1003's db to a fully replicated mysql setup
[07:38:28] and leave only oozie etc.. on an1003
[07:39:20] (so the mariadb instance on an1003 will be moved to separate misc db hosts)
[07:40:01] elukey: I don't see any particular concern for daily patterns
[07:40:06] Analytics, User-Elukey: Add a safe failover for analytics1003 - https://phabricator.wikimedia.org/T198093 (jcrespo) > This should be ok since we already have some dbproxies whitelisted in the analytics vlan's firewall, so it should be a matter of adding another one. I don't think that is ok- we need at...
[07:40:21] Might be less fluid at upgrade times, but maybe not even
[07:40:49] Analytics, User-Elukey: Add a safe failover for analytics1003 - https://phabricator.wikimedia.org/T198093 (Marostegui) Regarding a possible migration: - Can you provide some details about the size of the databases? (something like a du -sh . on the data directory would be ok to have an estimation). -...
[07:42:02] joal: sure, but the bright side is not having a SPOF anymore
[07:42:04] no?
[07:42:14] elukey: For sure :)
[07:42:44] elukey: I was just thinking in terms of changes for us - Obviously the value of such a change is big :)
[07:45:54] joal: If you think about it, the only time that we think about touching those dbs is if we add a new db (very rare) or if we upgrade cdh (very rare, and I don't recall schema changes mentioned for the past ones)
[07:46:09] Exactly
[07:48:57] Analytics, User-Elukey: Add a safe failover for analytics1003 - https://phabricator.wikimedia.org/T198093 (elukey) >>! In T198093#4407099, @jcrespo wrote: >> This should be ok since we already have some dbproxies whitelisted in the analytics vlan's firewall, so it should be a matter of adding another one...
[09:31:26] (PS21) Joal: Add MediawikiHistoryChecker spark job [analytics/refinery/source] - https://gerrit.wikimedia.org/r/439869 (https://phabricator.wikimedia.org/T192481)
[09:39:49] (PS22) Joal: Update MediawikiHistoryChecker adding reduced [analytics/refinery/source] - https://gerrit.wikimedia.org/r/439869 (https://phabricator.wikimedia.org/T192481)
[09:56:49] Analytics: Order Data Lake Hardware - https://phabricator.wikimedia.org/T198424 (elukey) Another point to figure out is what kind of security level we are aiming for, just to do our homework before ordering hardware and choosing Presto + Hadoop as technology for this project. The public data lake project sh...
[10:13:14] (PS4) Joal: Manual importer of xml dumps to hdfs [analytics/refinery] - https://gerrit.wikimedia.org/r/409960
[10:18:55] (PS5) Joal: Manual importer of xml dumps to hdfs [analytics/refinery] - https://gerrit.wikimedia.org/r/409960
[10:45:20] * elukey afk for 2h! :)
[11:46:37] hi fdans
[11:46:41] that query is awesome!
[12:47:09] joal: o/
[12:52:21] elukey: o/
[13:18:05] Analytics, Analytics-Kanban, Patch-For-Review, User-Elukey: Varnishkafka eventlogging instances delivery failures - https://phabricator.wikimedia.org/T198070 (Ottomata) Hm, didn't quite realize we didn't already compress snappy. I think it'd be wise to just enable this for all producers by defau...
[13:50:45] (PS1) Elukey: Update targets with db1107 and db1108 [analytics/refinery/scap] - https://gerrit.wikimedia.org/r/444589
[13:51:03] (PS2) Elukey: Update targets with db1107 and db1108 [analytics/refinery/scap] - https://gerrit.wikimedia.org/r/444589 (https://phabricator.wikimedia.org/T198766)
[13:52:31] joal: --^
[13:52:35] as FYI
[13:52:42] (CR) Elukey: [V: 2 C: 2] Update targets with db1107 and db1108 [analytics/refinery/scap] - https://gerrit.wikimedia.org/r/444589 (https://phabricator.wikimedia.org/T198766) (owner: Elukey)
[13:53:39] ack elu
[13:53:41] ack elukey
[13:53:42] sorry
[13:54:16] joal: as FYI I am not merging the change to deploy the refinery on those dbs
[13:54:22] and then I'll try to deploy too
[13:54:30] is it ok?
[13:54:44] (it should be handled by puppet in theory but I am sure something will break)
[13:55:02] I didn't get it elukey sorry
[13:55:30] joal: sorry I am self verbosing myself now :)
[13:55:32] You're not merging, and you'll try a manual deploy?
[13:55:57] I am merging https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/444154/, that is a puppet change to force db1107/8 to get the refinery
[13:56:12] in theory puppet should be able to check out the git repo
[13:56:29] but I might need to do something manually since I have the feeling that something will break
[13:56:49] so if you see me attempting a deploy (limited to the dbs) you'll know what I am doing :)
[13:56:52] ok makes sense :)
[13:59:04] joal: totally unrelated, but I think https://gerrit.wikimedia.org/r/444238 is awesome
[14:30:02] mforns: o/
[14:32:28] ottomata: hiiiiiii
[14:34:21] hello elukey :]
[14:35:17] mforns: refinery deployed on the dbs :)
[14:35:28] elukey, I saw.. THANKS :D
[14:35:43] o/
[14:35:51] elukey, is the mysql purging script running on the refinery whitelist already?
[14:36:05] hey ottomata :]
[14:36:27] mforns: it has changed now to use /srv/deployment/analytics/refinery/static_data/eventlogging/whitelist.yaml, so tomorrow will be the first round with the new location
[14:36:40] awesome!
[14:37:28] Analytics, Analytics-Kanban, Patch-For-Review, User-Elukey: Deploy refinery to eventlogging hosts - https://phabricator.wikimedia.org/T198766 (elukey)
[14:39:00] ottomata: when you are caffeinated and ready, shall we deploy the snappy config to vk-eventlogging?
[14:41:10] (PS4) Jonas Kress (WMDE): Track number of editors from Wikipedia who also edit on Wikidata over time [analytics/refinery/source] - https://gerrit.wikimedia.org/r/443069 (https://phabricator.wikimedia.org/T193641)
[14:41:30] elukey: yes!
[14:41:33] rdy now lets do it
[14:43:15] ottomata: https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/444232/ - +1 ?
[14:43:23] (or just check if it is the right one :)
[14:43:30] OH i thought i did!
[14:44:15] ah nice! I can disable puppet where needed, run it on one cp host, check with you, and then apply to all?
[14:44:19] yup!
[14:44:28] all right going to ping traffic just in case
[14:45:59] Analytics, Analytics-EventLogging, Analytics-Kanban, EventBus, Services (watching): Modern Event Platform (with EventLogging of the Future (EoF)) - https://phabricator.wikimedia.org/T185233 (Ottomata) {T198906} Just made me think of another need: auditing. At the very least, we should have a...
[14:50:19] (CR) Mforns: [V: 2 C: 2] "Looks awesome to me! So much more clear and useful. THX" (1 comment) [analytics/wikistats2] - https://gerrit.wikimedia.org/r/443582 (https://phabricator.wikimedia.org/T188928) (owner: Fdans)
[14:51:33] thank you for the review mforns!
[14:51:50] np, sorry for the delay
[14:56:22] ottomata: cp5007 running with snappy
[14:56:38] cool
[14:56:57] (Merged) jenkins-bot: Changes map component to accept concrete numbers [analytics/wikistats2] - https://gerrit.wikimedia.org/r/443582 (https://phabricator.wikimedia.org/T188928) (owner: Fdans)
[14:57:12] and kafkacat looks happy
[15:03:57] ping ottomata
[15:04:54] ping ottomata
[15:05:24] OH BAY
[15:07:36] google kicked me out!
[15:08:03] joal: https://grafana.wikimedia.org/dashboard/db/varnishkafka?panelId=34&fullscreen&orgId=1&var-instance=eventlogging&var-host=cp5007&from=now-3h&to=now
[15:08:12] restarted cp5007 some minutes ago
[15:11:42] \o/ elukey :)
[15:12:13] elukey: I'll be super interested in the monitoring of singapore-events failures :)
[15:19:12] Analytics, Analytics-EventLogging, Analytics-Kanban, EventBus, Services (watching): Modern Event Platform (with EventLogging of the Future (EoF)) - https://phabricator.wikimedia.org/T185233 (Pchelolo) >>! In T185233#4408295, @Ottomata wrote: > {T198906} Just made me think of another need: aud...
[16:13:41] Analytics, cloud-services-team (Kanban): Alarms on throughput on refined data - https://phabricator.wikimedia.org/T198908 (fdans) p: Unbreak! > High
[16:14:47] Analytics, Wikimedia-Stream: EventStreams butcher up some Unicode characters - https://phabricator.wikimedia.org/T198994 (fdans) p: Triage > Normal
[16:15:27] Analytics, Wikimedia-Stream: EventStreams butcher up some Unicode characters - https://phabricator.wikimedia.org/T198994 (fdans) p: Normal > High
[16:15:41] Analytics, Wikimedia-Stream: EventStreams butcher up some Unicode characters - https://phabricator.wikimedia.org/T198994 (fdans) p: High > Normal
[16:17:39] ottomata: o/ - whenever you have time there is a screen/tmux on furud owned by you that is probably super old
[16:21:53] Analytics: Piwik user account for Wikimedia.org.il - https://phabricator.wikimedia.org/T199046 (fdans)
[16:25:30] fdans: --^ - I think that they want to use piwik to track that site no? (I didn't get your comment sorry)
[16:26:40] elukey: the thing is that the site is not a wiki, it's a local chapter website
[16:27:02] elukey screen terminated thanks
[16:27:09] I'm asking because there is a community maintained piwik that is different from piwik.wikimedia.org elukey
[16:29:05] fdans: ah okok thanks :)
[17:03:56] (CR) Jonas Kress (WMDE): "-Yes, only Wikipedia and absolute numbers, because they are required for the report." [analytics/refinery/source] - https://gerrit.wikimedia.org/r/443069 (https://phabricator.wikimedia.org/T193641) (owner: Jonas Kress (WMDE))
[17:06:52] Analytics, Analytics-EventLogging, Page-Previews, Readers-Web-Backlog, Readers-Web-Kanbanana-Board: Some VirtualPageView are too long and fail EventLogging processing - https://phabricator.wikimedia.org/T196904 (ovasileva) a: ABorbaWMF > Ottomata This looks good from our side. Over to y...
[17:21:32] as FYI Arzhel is doing some firewall changes on the routers that are no-op
[17:21:46] if you see anything weird let me know :)
[17:33:47] a-team: vk el looks good :) https://grafana.wikimedia.org/dashboard/db/varnishkafka?panelId=34&fullscreen&orgId=1&var-instance=eventlogging&var-host=All
[17:34:19] the bytes transferred are basically half
[17:34:53] very cool, hope timeouts are a thing of the past
[17:37:43] hopefully!
[17:38:05] in theory, https://grafana.wikimedia.org/dashboard/db/varnishkafka?panelId=20&fullscreen&orgId=1 should remain flat
[17:38:43] err sorry https://grafana.wikimedia.org/dashboard/db/varnishkafka?panelId=20&fullscreen&orgId=1&from=now-2d&to=now&var-instance=eventlogging&var-host=All
[17:38:50] (without those peaks)
[17:54:16] * elukey afk!
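For reference, the change rolled out above amounts to a one-line librdkafka producer setting (varnishkafka forwards its kafka settings to librdkafka). A minimal sketch of the same property exercised from Python via confluent-kafka, not the actual varnishkafka/puppet config; the broker address and topic name below are placeholders:

    from confluent_kafka import Producer  # Python client built on librdkafka

    # compression.codec is a standard librdkafka property; snappy compresses
    # whole message batches, which is consistent with the roughly halved
    # bytes on the wire reported on the grafana dashboard above.
    producer = Producer({
        'bootstrap.servers': 'kafka-jumbo1001.eqiad.wmnet:9092',  # placeholder broker
        'compression.codec': 'snappy',
        'linger.ms': 100,  # batch a bit longer so snappy has more to compress
    })

    producer.produce('eventlogging-client-side', value=b'{"event": {}}')  # placeholder topic/payload
    producer.flush()

Since the codec is recorded in the message set itself, consumers such as kafkacat decompress transparently, which is why the kafkacat check above needed no changes on the consumer side.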
[19:02:07] (PS1) Ottomata: Add spark.executorEnv.PYTHONPATH for pyspark on yarn in jupyter [analytics/jupyterhub/deploy] - https://gerrit.wikimedia.org/r/444651 (https://phabricator.wikimedia.org/T198909)
[19:02:28] (CR) Ottomata: [V: 2 C: 2] Add spark.executorEnv.PYTHONPATH for pyspark on yarn in jupyter [analytics/jupyterhub/deploy] - https://gerrit.wikimedia.org/r/444651 (https://phabricator.wikimedia.org/T198909) (owner: Ottomata)
[19:03:07] Analytics, Patch-For-Review: Errors with the new SWAP notebooks - https://phabricator.wikimedia.org/T198909 (Ottomata) Alright, I think I got this. Since we set the python executable for pyspark on yarn, we also need to explicitly set the PYTHONPATH for executors too: ``` --conf spark.executorEnv.PYTHO...
[19:04:48] Analytics, Discovery, Discovery-Analysis, Product-Analytics: Add referer to WebrequestData - https://phabricator.wikimedia.org/T172009 (Tbayer) To clarify, I assume that this is separate from the general HTTP referrer header that is already recorded in the `referer` field in the [[https://wikitec...
[19:05:30] \o/ ottomata :)
[19:22:06] Analytics, Patch-For-Review: Errors with the new SWAP notebooks - https://phabricator.wikimedia.org/T198909 (diego) @Ottomata , I'm getting the same error: Name: org.apache.toree.interpreter.broker.BrokerException Message: Py4JJavaError: An error occurred while calling o61.showString. : org.apache.spar...
[19:24:23] dsaez: hiyaaa
[19:24:45] yt?
[19:25:10] yep
[19:25:13] hi
[19:25:18] did you restart your notebook?
[19:25:22] use a new one?
[19:26:13] Analytics, Patch-For-Review: Errors with the new SWAP notebooks - https://phabricator.wikimedia.org/T198909 (Ottomata) K, we're looking into it. BTW, for syntax highlighting/autocomplete, not sure. Looks like a known issue upstream. I've not yet been able to find a workaround. https://issues.apache.org/j...
[19:26:17] let me restart again just to be sure
[19:27:05] ottomata, let me stop the server and restart
[19:28:26] ottomata, new error :) but we are progressing
[19:28:30] oh!?
[19:28:45] I'll add it in the ticket
[19:29:02] but this is the same error that I was having before, this is a version compatibility error
[19:29:15] I was having this error when running from stat1005
[19:29:30] oh! hm
[19:30:00] oh oh oh
[19:30:41] yes sorry, i think i know what that is dsaez
[19:30:45] one min
[19:31:06] Analytics, Patch-For-Review: Errors with the new SWAP notebooks - https://phabricator.wikimedia.org/T198909 (diego) After restarting the notebook, now I get this error. I have the same error if I use pyspark from stat1005. The solution that I found for this is working with python2.7 in pyspark Name: or...
[19:31:10] "Exception: Python in worker has different version 2.7 than that in driver 3.5, PySpark cannot run with different minor versions."
[19:31:20] dsaez: restart it again
[19:31:21] try now
[19:32:43] ok, give me a min, the stop button is not working
[19:33:04] oh, i think you don't need to restart your jupyter server
[19:33:05] just the notebook
[19:39:23] dsaez: looking ok?
[19:40:23] ottomata, you need to restart the server, if not I don't see any change
[19:40:56] hm
[19:40:59] ottomata, great, working
[19:41:01] YES
[19:41:01] cool
[19:41:13] what have you done?
[19:41:14] sorry, i fixed your problem on notebook1004, but not on 1003 :)
[19:41:17] i set
[19:41:18] PYSPARK_PYTHO
[19:41:21] PYSPARK_PYTHON
[19:41:27] to the same thing as PYTHON_EXEC
[19:41:32] /usr/bin/python3
[19:42:16] in the driver? I thought that was a client problem
[19:42:19] worker
[19:42:21] (PS1) Ottomata: Add note about PYSPARK_PYTHON [analytics/jupyterhub/deploy] - https://gerrit.wikimedia.org/r/444667 (https://phabricator.wikimedia.org/T190443)
[19:42:37] i also had to set
[19:42:42] --conf spark.executorEnv.PYTHONPATH=/usr/lib/spark2/python:/usr/lib/spark2/python/lib/py4j-0.10.6-src.zip
[19:42:46] to fix the udf problem
[19:42:57] the driver needs to tell the worker which python to use too
[19:43:03] which...i guess is what PYSPARK_PYTHON does
[19:43:03] ?
[19:43:19] I think PYSPARK_PYTHON is for the driver
[19:43:29] not the driver, but the host
[19:43:42] here are the toree kernel settings
[19:45:03] Analytics, Analytics-Kanban, Patch-For-Review: Spark notebook integration - https://phabricator.wikimedia.org/T190443 (Ottomata) BTW, the PySpark on YARN notebook needs `PYSPARK_PYTHON` manually set to the same value of `PYTHON_EXEC`, e.g. `/usr/bin/python3` to avoid version errors. Here's a working...
[19:45:07] dsaez: https://phabricator.wikimedia.org/T190443#4409450
[19:45:57] great! thanks
[19:46:24] ottomata, and do you have any idea why the autocomplete/colors are not working?
[19:46:32] ottomata: I confirm it works for me as well :)
[19:46:37] thanks a mil ottomata :)
[19:46:50] dsaez: posted on your ticket
[19:46:54] i think it's a known issue
[19:46:56] https://issues.apache.org/jira/browse/TOREE-323
[19:47:01] tried some workarounds, but didn't get anything to work
[19:47:27] oh :(
[19:47:55] this is kind of important for usability
[19:49:20] hm
[19:51:38] ottomata: since you're on notebooks (are you?)
[19:51:46] ottomata, what is the workers/memory configuration for these notebooks? I've learnt that memory overhead is very important for pyspark, because that's where the python process runs
[19:51:46] ya
[19:52:06] ottomata: Are there more detailed logs than the ones we have on the notebook itself?
[19:52:16] joal ya that is a problem
[19:52:18] there are
[19:52:22] but you said you don't have perms to read them, right?
[19:52:25] i made a task to fix that
[19:52:27] but
[19:52:52] I managed to have brunel working on a local notebook using docker (https://github.com/jupyter/docker-stacks/tree/master/all-spark-notebook)
[19:52:56] journalctl -f -u jupyter-$USER-singleuser
[19:53:14] However, I have no error when trying on paws, but no chart either :(
[19:54:16] joal: swap you mean?
[19:54:19] which notebook node?
[19:54:20] 1003?
[19:54:20] yes
[19:54:22] 1003
[19:54:38] ok am watching your logs
[19:54:46] seeing you start new notebook... :)
[19:54:54] As of now, it's the test :)
[19:55:03] changing the kernel to scala-yarn
[19:55:09] Importing the jar
[19:55:51] ottomata: YOU FIXED IT!
[19:55:52] :D
[19:56:11] I knew by asking you, things would start working :)
[19:56:15] That's super great
[19:56:57] haha HWA?!
[19:57:02] i fixed it?!
[19:57:05] brunel works?!
[19:57:13] Well, it didn't work last friday, and now, it does
[19:57:16] :D
[19:57:16] uhhh
[19:57:18] oooook
[19:57:31] ottomata: I restarted the server - Might have a role to play
[19:57:45] aye
[20:00:33] ottomata: I however have a new error on something that worked fine before
[20:00:42] ImportError: No module named '_tkinter'
[20:01:13] ?
[20:01:21] when importing matplotlib
[20:03:22] joal: this is just local?
[20:03:23] or yarn?
[20:03:26] yarn
[20:03:32] Didn't test local
[20:03:52] The bizarre thing is that it worked with my own pyspark kernel
[20:04:09] maybe the python version is not the same, and matplotlib is not installed?
[20:04:15] hmmm
[20:04:23] i think we have matplotlib installed from a deb package
[20:04:30] k
[20:04:40] on hadoop nodes too
[20:04:53] joal: yarn app id?
[20:04:54] python version? the notebook tells me 3.5
[20:05:12] ottomata: application_1527607587036_143334
[20:05:16] should be 3.5.3 everywhere
[20:05:20] k
[20:05:23] not that then :)
[20:05:44] _tkinter I think you solve by reinstalling matplotlib
[20:06:04] oh maybe it's a brunel matplotlib dependency mismatch?
[20:06:14] !pip install -U matplotlib
[20:06:16] ottomata: no brunel involved here :)
[20:06:18] oh
[20:07:58] joal, when do you get that error? just with import matplotlib ?
[20:08:03] yes
[20:08:17] actually dsaez: import matplotlib.pyplot as plt
[20:08:19] yeah i can't repro
[20:08:24] :S
[20:08:27] OH
[20:08:28] yes i can
[20:08:30] with that i can
[20:08:59] joal, ottomata, I think this is related with the autocomplete/colors issue
[20:09:11] oh?
[20:09:49] joal: i just installed a package on notebook1003
[20:09:51] python3-tk
[20:10:01] i think that's the problem there
[20:10:03] Restarting
[20:10:10] you prob don't have to restart... :)
[20:10:11] but maybe
[20:10:33] Success :)
[20:10:36] Thanks ottomata :)
[20:10:51] will puppetize
[20:12:02] joal, can you plot inline?
[20:12:06] I get this error:
[20:12:17] plt.plot([1,2],[1,2])
[20:12:36] Name: org.apache.toree.interpreter.broker.BrokerException
[20:12:36] Message: Traceback (most recent call last):
[20:12:36] File "/tmp/kernel-PySpark-75473d11-211f-4a0e-9805-f87529d3ff61/pyspark_runner.py", line 194, in
[20:12:36] eval(compiled_code)
[20:12:50] _tkinter.TclError: no display name and no $DISPLAY environment variable
[20:12:51] Will try dsaez
[20:13:03] I think that last line is the key
[20:13:12] no $DISPLAY environment variable
[20:14:13] Analytics, Analytics-Kanban, Operations, Patch-For-Review: Please install Text::CSV_XS at stat1005 - https://phabricator.wikimedia.org/T199131 (Ottomata) a: Ottomata
[20:14:27] Analytics, Analytics-Kanban, Operations, Patch-For-Review: Please install Text::CSV_XS at stat1005 - https://phabricator.wikimedia.org/T199131 (Ottomata) Done!
[20:15:37] (PS23) Joal: Add MediawikiHistoryChecker spark job [analytics/refinery/source] - https://gerrit.wikimedia.org/r/439869 (https://phabricator.wikimedia.org/T192481)
[20:16:29] https://stackoverflow.com/questions/37604289/tkinter-tclerror-no-display-name-and-no-display-environment-variable
[20:16:32] (PS11) Joal: Add validation step in mediawiki-history jobs [analytics/refinery] - https://gerrit.wikimedia.org/r/440005 (https://phabricator.wikimedia.org/T192481)
[20:18:56] ottomata: something else to consider for the Toree conf is setting spark.dynamicAllocation.maxExecutors to a nice default value
[20:19:26] oh?
[20:19:30] it should be set no?
[20:19:35] ottomata: It's super easy to overload the cluster
[20:19:44] So far I don't think it is
[20:19:48] oh you are right
[20:19:49] hm
[20:19:55] overload! overload!!
[20:20:03] ottomata: I have more than 800 executors now :)
[20:20:05] is it easier to overload the cluster than it is via the shell we have now?
[20:20:05] :D
[20:20:08] :D
[20:20:12] yeah!! power!!!
[20:20:44] * dsaez is holding himself
[20:21:02] same ottomata - It's just that since there is absolutely no feedback on 'power' as dsaez says, you can just easily forget
[20:21:22] ottomata: Plus the idea that more and more people will start to use it
[20:21:46] ottomata: We can wait until it happens to make a decision :)
[20:22:14] aye
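The TclError a few messages above is the standard headless matplotlib failure from the linked stackoverflow question: with no $DISPLAY, the default Tk-based backend cannot start. A minimal sketch of the usual workaround, selecting the non-interactive Agg backend before pyplot is first imported (the output path is illustrative):

    import matplotlib
    matplotlib.use('Agg')  # Agg draws to in-memory buffers: no X display or tkinter needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.plot([1, 2], [1, 2])       # the same call that raised the TclError
    fig.savefig('/tmp/plot.png')  # illustrative output path

In a stock IPython kernel, %matplotlib inline does this wiring automatically; the Toree PySpark interpreter seen in the traceback (pyspark_runner.py) is not IPython, which is presumably why magic-based workarounds come up again just below.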
[20:22:29] ottomata, regarding the visualization problems, check this> https://stackoverflow.com/questions/39570019/how-to-get-ipython-inbuild-magic-command-to-work-in-jupyter-notebook-pyspark-ker
[20:22:43] Analytics: Consider changing EventLogging to encode events using base64 instead of uriEncode - https://phabricator.wikimedia.org/T199148 (mforns)
[20:23:00] with the current setup inline plots are not working, and inline plots are the main reason to use notebooks
[20:24:23] Analytics, Analytics-Kanban, Patch-For-Review: Spark notebook integration - https://phabricator.wikimedia.org/T190443 (diego) Regarding the visualization problems and autocomplete, but especially the inline plots, here is a possible way to explore: https://stackoverflow.com/questions/39570019/how-...
[20:25:43] hm
[20:29:22] Yay nuria_ :) I have a working solution at last for the checker (with numbers and all)
[20:39:58] dsaez: i'm going to merge your ticket into the parent jupyter + spark one
[20:40:02] then we can work on the other stuff together
[20:40:05] thanks, this is all very helpful
[20:40:30] Analytics, Analytics-Kanban, Patch-For-Review: Spark notebook integration - https://phabricator.wikimedia.org/T190443 (Ottomata)
[20:40:32] Analytics, Patch-For-Review: Errors with the new SWAP notebooks - https://phabricator.wikimedia.org/T198909 (Ottomata)
[20:41:22] Analytics, Patch-For-Review: Errors with the new SWAP notebooks - https://phabricator.wikimedia.org/T198909 (Ottomata) I think we got this working! I just merged this task into {T190443} so we can keep tracking more issues there.
[20:41:59] Analytics, Analytics-Kanban, Patch-For-Review: Spark notebook integration - https://phabricator.wikimedia.org/T190443 (Ottomata)
[20:42:25] Analytics, Analytics-Kanban, Patch-For-Review: Spark Jupyter Notebook integration - https://phabricator.wikimedia.org/T190443 (Ottomata)
[20:42:27] Analytics, Analytics-Kanban, Patch-For-Review: Spark Jupyter Notebook integration - https://phabricator.wikimedia.org/T190443 (Ottomata) p: Low > Normal
[21:01:21] (PS2) Joal: Update MediawikiHistoryChecker adding reduced [analytics/refinery/source] - https://gerrit.wikimedia.org/r/441378 (https://phabricator.wikimedia.org/T192481)
[21:16:48] Analytics, MinervaNeue, Readers-Web-Backlog, Design: Sticky header instrumentation - https://phabricator.wikimedia.org/T199157 (Jdlrobson)
[21:24:13] Analytics, Performance-Team (Radar): Archive Kasocki repository - https://phabricator.wikimedia.org/T190365 (Krinkle)
[21:29:02] Analytics, Performance-Team (Radar): Archive Kasocki repository - https://phabricator.wikimedia.org/T190365 (Krinkle)
[21:35:19] Analytics, Cleanup, Performance-Team: Archive Kasocki repository - https://phabricator.wikimedia.org/T190365 (Krinkle) a: Krinkle
[21:35:57] Analytics, Cleanup, Performance-Team: Archive Kasocki repository - https://phabricator.wikimedia.org/T190365 (Krinkle) Open > Resolved
[21:41:47] nice joal
[21:43:26] mforns: is sashil's stuff merged such that we can deploy wikistats?
[22:18:39] *sahil
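To pull the day's notebook fixes together: the PYSPARK_PYTHON pin, the executor PYTHONPATH fix, and the executor cap joal suggested are all plain session settings. A hedged sketch of a hand-built PySpark session under those assumptions (the app name and the cap of 32 are invented for illustration, and the paths follow the /usr/lib/spark2 layout quoted earlier, not necessarily the deployed Toree defaults):

    import os
    # Pin one interpreter for driver and YARN workers, avoiding the
    # "Python in worker has different version 2.7 than that in driver 3.5" error.
    os.environ['PYSPARK_PYTHON'] = '/usr/bin/python3'

    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .master('yarn')
        .appName('swap-notebook-sketch')  # hypothetical app name
        # ship pyspark + py4j to executors, the UDF fix from earlier in the thread
        .config('spark.executorEnv.PYTHONPATH',
                '/usr/lib/spark2/python:/usr/lib/spark2/python/lib/py4j-0.10.6-src.zip')
        # guardrails so a single notebook cannot quietly grab 800+ executors
        .config('spark.shuffle.service.enabled', 'true')  # dynamic allocation needs the external shuffle service
        .config('spark.dynamicAllocation.enabled', 'true')
        .config('spark.dynamicAllocation.maxExecutors', '32')  # illustrative cap
        .getOrCreate()
    )

With a cap like this baked into the kernel defaults, a forgotten notebook holds at most the capped number of containers instead of whatever dynamic allocation can acquire.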
[23:25:33] neilpquinn: yt?
[23:26:17] nuria_: yep
[23:26:31] what's up?
[23:26:48] neilpquinn: see my reply, the mediawiki snapshot is ready for you to use
[23:27:19] neilpquinn: that process is completely decoupled from erik's wikistats files, which are updated on a different schedule using different sources, makes sense?
[23:28:03] nuria_: I know...
[23:28:26] nuria_: I'm happily using the mediawiki_history snapshot right now
[23:28:33] neilpquinn: ah ok
[23:28:58] nuria_: I'm asking specifically about the wikistats files...sorry if that wasn't clear :0
[23:29:00] :0
[23:29:04] Damn it
[23:29:14] my shift key isn't working...so I'm sending people weird smilies
[23:30:20] nuria_: but your response hasn't shown up for me yet. Going into a meeting for the next 30 minutes, so hopefully it'll be here when I'm done :)