[01:03:24] Analytics, Analytics-Kanban, Patch-For-Review: Newpyter - SWAP Juypter Rewrite - https://phabricator.wikimedia.org/T224658 (SNowick_WMF) Thanks for the design doc @Ottomata. The Product Analytics team discussed this and we are hoping we can find a way to include notebook sharing or some sort of cent...
[10:32:09] Hello! Is there any easy-pick task of wikistats for me to get started with the codebase?
[10:45:07] milimetric any small bug easy to tackle?
[11:27:34] Quasipodo: if you want to look around https://phabricator.wikimedia.org/project/view/206/
[11:31:12] elukey yep I already looked around. Still would be nice if someone can point me to an easy one
[11:34:51] Quasipodo: sure makes sense, but it is saturday so few people arounds.. I'd wait for more formal answers on analytics@ :)
[11:37:42] sure, thank you for getting back to me anyway!
[11:37:54] np!
[15:22:14] PROBLEM - Hadoop NodeManager on analytics1071 is CRITICAL: PROCS CRITICAL: 0 processes with command name java, args org.apache.hadoop.yarn.server.nodemanager.NodeManager https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hadoop/Administration
[16:17:52] * addshore cant make installing pip packages in python notebooks on SWAP work :(
[16:52:47] addshore: what do you mean?
[16:54:23] !log restart yarn nodemanger on analytics1071 - network errors in the logs
[16:54:24] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[16:55:20] elukey: Im doing !pip3 install sparql-dataframe SPARQLWrapper and then tyring to import those packages but it doesnt work D:
[16:55:25] on PySpark - Local
[16:55:31] ImportError: No module named 'SPARQLWrapper' :(
[16:55:56] RECOVERY - Hadoop NodeManager on analytics1071 is OK: PROCS OK: 1 process with command name java, args org.apache.hadoop.yarn.server.nodemanager.NodeManager https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hadoop/Administration
[17:00:32] addshore: mmm on what host?
[17:00:43] notebook1003?
[17:00:51] *looks*
[17:01:02] also keep it mind that we have jupyter now on the stat boxes too :)
[17:01:16] yup, 1003
[17:01:26] ooooooh, I did not know that :D
[17:01:46] https://usercontent.irccloud-cdn.com/file/JnimAnGt/image.png
[17:01:50] the pip bit said it did the install
[17:02:10] * addshore hasn't tried installing things with pip in a notebook before
[17:02:41] mmm jupyter-addshore-singleuser on notebook1003 doesn't show any log, weird
[17:03:34] ok so
[17:03:34] (venv) elukey@notebook1003:/home/addshore$ pip3 freeze | grep -i sparql
[17:03:37] sparql-dataframe==0.3
[17:03:40] SPARQLWrapper==1.8.5
[17:03:47] that looks corret
[17:03:54] *correct
[17:04:14] addshore: is you notebook saved? Can I restart it?
[17:04:24] go for it!
[17:05:40] addshore: I stopped it, can you try to login again?
[17:05:44] it should start a new one
[17:05:48] let's see if it works
[17:06:36] ImportError: No module named 'SPARQLWrapper'
[17:06:38] :(
[17:07:05] should I be using pyspark-local? :P
[17:07:24] weird, in your venv the import works
[17:07:35] perhaps my notebook isnt using my venv?
[17:07:52] but also that confusing as I did the pip install in the notebook... so if definitely knows about it
[17:10:17] trying in mine
[17:10:23] otherwise, i realize its saturday, so don't worry about looking too deep :) I can do it another way for now
[17:10:46] sure sure no problem, I can check 5 mins and then we can see on monday
[17:10:53] :)
[17:11:47] I can repro yes
[17:11:53] oooh, interesting :D
[17:13:36] ok so with the Python3 kernel it works
[17:13:51] hmm, okay, I guess i should be able to just use that?
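(Editor's note) The question above, whether the notebook kernel is actually using the venv that pip installed into, can be checked from inside a notebook cell. What follows is a minimal sketch, not the SWAP setup itself: `kernel_env_report` and `pip_install_command` are hypothetical helper names, and nothing in the log confirms that installing this way fixes the PySpark kernels.

```python
import sys


def kernel_env_report():
    """Describe the interpreter that is running this kernel."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        # Inside a venv, sys.prefix differs from sys.base_prefix.
        "in_venv": sys.prefix != getattr(sys, "base_prefix", sys.prefix),
    }


def pip_install_command(*packages):
    """Build a pip command tied to this kernel's own interpreter."""
    return [sys.executable, "-m", "pip", "install", *packages]


print(kernel_env_report())
# Pass this list to subprocess.check_call to run the install for real.
print(pip_install_command("sparql-dataframe", "SPARQLWrapper"))
```

If the reported executable is not the one inside the venv where `pip3 freeze` lists the packages, the import failure is expected: a bare `!pip3 install` runs whichever `pip3` is first on the shell's PATH, which need not belong to the kernel's interpreter.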
[17:14:21] trying with pyspark yarn
[17:15:40] nope it doesn't work
[17:15:46] pyspark yarn doesnt seem to be working for me, which is why I ended up trying out the pyspark -local one
[17:17:35] same problem on stat1004
[17:18:45] I guess i can use the python3 one, but I need to read the docs so I can make my own spark thingy :)
[17:21:42] so there was a similar issue happened a while ago, pip was not available in the python notebook
[17:21:49] but then after a general restart it worked
[17:22:02] in this case it seems that the pyspark kernels do not source the venv
[17:25:37] addshore: so it works on stat1005
[17:25:57] that is on buster, so a lot of newer stuff
[17:26:07] if you want to test there it could unblock you
[17:27:36] thankks!!!
[17:28:33] Quasipodo: let's try to keep communication to one channel and if you don't hear back in a couple of days you can ping on here. It will help limit the noise on the lists, everyone's extra constrained for time these days. I replied via email and we can talk there for now.
[18:44:49] bah
[18:48:06] Anyone got any tips for building an WHERE IN (%s) clause in python for use in a hive query in a notebook?
[18:48:10] I seem to be failing hard
[18:48:21] HiveServer2Error: Error while compiling statement: FAILED: ParseException line 9:19 cannot recognize input near 'IN' '(' '[' in expression specification
[21:21:25] im a fool... fixed it...
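(Editor's note) The log does not say what the fix was. The `'IN' '(' '['` tokens in the ParseException are consistent with a Python list being formatted directly into `%s`, which renders as `IN (['a', 'b'])`. A minimal sketch of one way to build the clause follows, assuming string values; `sql_in_list`, the table name, and the column names are hypothetical.

```python
def sql_in_list(values):
    """Render strings as a comma-separated list of quoted SQL literals."""
    quoted = []
    for value in values:
        text = str(value)
        # Refuse quotes and backslashes rather than guess at escaping rules.
        if "'" in text or "\\" in text:
            raise ValueError("unsafe character in value: %r" % text)
        quoted.append("'" + text + "'")
    if not quoted:
        raise ValueError("an IN clause needs at least one value")
    return ", ".join(quoted)


wikis = ["enwiki", "dewiki"]  # hypothetical example values
query = "SELECT page_id FROM some_table WHERE wiki_db IN (%s)" % sql_in_list(wikis)
print(query)
# prints: SELECT page_id FROM some_table WHERE wiki_db IN ('enwiki', 'dewiki')
```

Rejecting quotes and backslashes keeps the sketch safe without depending on any particular escaping behaviour. If the client library in use supports query parameters, that is generally preferable to string formatting.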