[00:56:38] Analytics-Cluster, Analytics-Kanban: Spike replacing Camus with Gobblin {hawk} [13 pts] - https://phabricator.wikimedia.org/T111409#1643705 (madhuvishy) a:madhuvishy [01:59:09] Analytics-Kanban, Patch-For-Review: Bug: client IP is being hashed differently by the different parallel processors - https://phabricator.wikimedia.org/T112688#1643755 (Milimetric) By the way, for the https://edit-analysis.wmflabs.org/compare/ dashboard, if you clear cache it looks like the data is back in... [08:31:38] Analytics-Tech-community-metrics, DevRel-September-2015: Provide open changeset snapshot data on Sep 22 and Sep 24 (for Gerrit Cleanup Day) - https://phabricator.wikimedia.org/T110947#1644167 (Dicortazar) Sure @Aklapper, I'll start working on the queries. However, I'd like to be sure we're still using the... [08:33:37] Analytics-Tech-community-metrics, DevRel-September-2015: Provide open changeset snapshot data on Sep 22 and Sep 24 (for Gerrit Cleanup Day) - https://phabricator.wikimedia.org/T110947#1644174 (Qgil) It would be good to be consistent with the filters applied: WIP, CR -1, V -1. [10:31:24] Analytics: pagecounts-raw files missing since yesterday (15th September) 18.00 - https://phabricator.wikimedia.org/T112741#1644507 (DianaArq) NEW [10:40:37] (Abandoned) JanZerebecki: Modify access rules [wikidata/analytics/dashboard] (refs/meta/config) - https://gerrit.wikimedia.org/r/238319 (owner: Christopher Johnson (WMDE)) [10:41:03] Analytics: pagecounts-raw files missing since yesterday (15th September) 18.00 - https://phabricator.wikimedia.org/T112741#1644533 (JAllemandou) We are aware of the issue, we are currently working on solving that and catching back on data. [10:56:13] Analytics: pagecounts-raw files missing since yesterday (15th September) 18.00 - https://phabricator.wikimedia.org/T112741#1644553 (DianaArq) Ok, thanks for your quick answer and for letting me know the status of the issue [10:57:29] hey mforns [10:57:34] hi joal [10:57:36] :] [10:57:40] We are experiencing issues with the cluster currently [10:57:44] mmm [10:57:53] If you don't mind, I'll let you lead the interview [10:57:58] do you want me to ok [10:58:03] I'll be there, but trying to fix in the mean time [10:58:13] ok cool [11:09:25] (PS1) Christopher Johnson (WMDE): completed todo data graphs order stacked bar data, added markdown [wikidata/analytics/dashboard] - https://gerrit.wikimedia.org/r/238716 [11:13:54] (PS2) Christopher Johnson (WMDE): completed todo data graphs order stacked bar data, added markdown [wikidata/analytics/dashboard] - https://gerrit.wikimedia.org/r/238716 [11:16:06] Analytics-Engineering, Wikidata: Dashboard repository for limn-wikidata-data - https://phabricator.wikimedia.org/T112506#1644570 (Addshore) After further discussion with @Milimetric I think Dashiki might be the way forward with some / all of this. 
And yet again I may now not end up trying to use limn [11:17:56] (CR) Christopher Johnson (WMDE): [C: 2 V: 2] completed todo data graphs order stacked bar data, added markdown [wikidata/analytics/dashboard] - https://gerrit.wikimedia.org/r/238716 (owner: Christopher Johnson (WMDE)) [11:35:26] Analytics, operations: Moving analysis data from flourine to analytics cluster - https://phabricator.wikimedia.org/T112744#1644598 (Addshore) NEW [11:35:47] Analytics, operations: Moving analysis data from flourine to analytics cluster - https://phabricator.wikimedia.org/T112744#1644605 (Addshore) [11:36:31] Analytics, operations: Moving analysis data from flourine to analytics cluster - https://phabricator.wikimedia.org/T112744#1644598 (Addshore) [11:48:31] mforns: cave in 2 mins ? [11:48:45] joal, xD I'm already there [11:56:58] Analytics-Engineering, Wikidata: Dashboard repository for limn-wikidata-data - https://phabricator.wikimedia.org/T112506#1644630 (Christopher) @Addshore yes, I will just create a separate remote download set function and point it at your sources so that we can use both local and remote data. One... [12:02:20] Analytics-Backlog: Investigate sample cube pageview_count vs unsampled log pageview count - https://phabricator.wikimedia.org/T108925#1644634 (JAllemandou) @TBayer: nothing great from the pageview / webrequest ratio. It is pretty stable, with bumps every now and then, but nothing big enough to explain a diffe... [12:26:06] Analytics-Tech-community-metrics, DevRel-September-2015: Automated generation of (Git) repositories for Korma - https://phabricator.wikimedia.org/T110678#1644655 (Qgil) Going back to {T103292}, if we keep the Git history of inactive repos then we are more certain that i.e. "Authors" data of a year ago at... [12:35:15] Analytics-Tech-community-metrics, DevRel-September-2015: Automated generation of (Git) repositories for Korma - https://phabricator.wikimedia.org/T110678#1644664 (Aklapper) Qgil convinced me. So yeah, we should keep such data. [12:50:13] Analytics, Developer-Relations, MediaWiki-API, Research consulting, and 3 others: Metrics about the use of the Wikimedia web APIs - https://phabricator.wikimedia.org/T102079#1644688 (Qgil) Just checking, by "active users" we mean products using our Web APIs, not the individual users (people) using... [13:22:57] hi ottomata [13:23:28] Hi milimetric and ottomata [13:23:32] so what shall we do about that shared key? Erik suggested etcd [13:24:07] (https://github.com/coreos/etcd) [13:24:25] ottomata: see my email: issues with cluster due to hive server changes :S [13:26:51] (CR) Joal: [C: -1] "Comments inline :)" (5 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/237274 (owner: Nuria) [13:27:38] ottomata, milimetric, I tried to contact alex this morning and afternoon, but no chance :( [13:28:12] milimetric: if you have time, would you give me a few minutes to validate my ideas on guard job ? [13:28:41] (PS1) Christopher Johnson (WMDE): removes smoothing adds mailing lists [wikidata/analytics/dashboard] - https://gerrit.wikimedia.org/r/238739 [13:30:20] joal: to the batcave [13:31:16] (CR) Christopher Johnson (WMDE): [C: 2 V: 2] removes smoothing adds mailing lists [wikidata/analytics/dashboard] - https://gerrit.wikimedia.org/r/238739 (owner: Christopher Johnson (WMDE)) [13:31:56] milimetric: omw ! 
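For the shared-key idea raised above (the hashing key that the parallel EventLogging processors need to agree on), etcd is just an HTTP key-value store. A minimal sketch of what that could look like with the etcd v2 etcdctl client; the key path and value here are hypothetical placeholders, not the real EventLogging setting:

    # write the shared value once (assumes an etcd v2 cluster reachable from the host)
    etcdctl set /eventlogging/shared_key "some-value"
    # any of the parallel processors can then read the same value back
    etcdctl get /eventlogging/shared_key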
[14:09:25] joal, milimetric sorry, was in interview (that i was late for :( ) [14:09:34] need shower and something to eat, back asap [14:09:34] np ottomata1 [14:10:07] milimetric: etcd is an interesting idea! i will talk to _joe_ about it [14:11:13] ottomata1: if etcd, why not zookeeper (already in our stack ...) [14:17:46] joal: i can pick up the guard stuff if you are too busy after i address your changes on udf code which will take no time otherwise i will pick up something else [14:18:24] nuria: we brain bounced with milimetric around guard job [14:18:40] Krinkle: looking at graphs! [14:18:49] nuria: your opinion will be of interest, and whoever does it doesn't matter ;) [14:20:32] Hey halfak, some food for thoughts around zeppelin: https://news.ycombinator.com/item?id=9463809 [14:21:31] joal: i am going to work on udf first so as to make sure that is done today [14:21:50] awesome, I'll deploy either tonight or tomorrow then :) [14:21:54] nuria: --^ [14:22:04] nuria: do my comments make sense ? [14:22:12] joal: ya totally [14:22:16] cool :) [14:22:38] joal, I'm worried this thread is not helping me like Zeppelin. [14:22:49] halfak: :D [14:23:10] halfak: do you use iPython or jupyter? [14:23:36] Yeah... When I can. It doesn't fit into my workflow well because I like to keep my journal on the wiki. [14:23:47] right halfak [14:24:17] I've been considering switching my work such that I store the iPython notebooks on github and just reference them from on-wiki. [14:24:23] do you think Zeppelin would be of use to you if we install it ? [14:24:51] Does it support R and Python? [14:25:23] python yes, sure, R not natively, but there is a module, so it might come soon [14:25:55] Why would we do this instead of Jupyter? [14:26:14] Straight integration with hadoop [14:26:31] I don't know if Jupyter has that though [14:26:39] SQL? [14:26:45] pyspark? [14:27:38] hadoop integration for hive queries for instance [14:27:47] so SQL-ish [14:27:55] Yeah... I suspect that python has an SQL connector for hive [14:27:58] and so would R [14:28:29] AND IT'S THE GROSSEST THING EVER https://cwiki.apache.org/confluence/display/Hive/HiveClient#HiveClient-Python [14:28:35] Seriously. [14:29:01] :) [14:29:05] aouch [14:29:14] If you need 6 import lines just to run a query, your probably not being pythonic. [14:29:15] my eyes still burn [14:30:14] so halfak, zeppelin or Ipython ... debate needs to happen :) [14:30:40] heh. Well the reason I bring up ipython/Jupyter is because it seems to be dominating at the moment. [14:31:05] Also, it targets languages that analysts/researchers use every day over languages used primarily by engineers. [14:31:33] But it could be that Zeppelin does the same and better [14:31:37] halfak: I'm sure it dominates, that's why I am keen to discuss and don't say it will be Zeppelin :-P [14:31:39] I'm not sure. [14:31:51] Oh. Dominant as in that's what's most used across the web. [14:31:59] halfak: I had it :) [14:32:15] I guess I don't have hard numbers, but I get that impression. [14:32:21] halfak: I suspect the only thing Zeppelin will do better is tight hadoop ? spark integration [14:33:15] Seems like that's an issue within the language though, isn't it? Can't we just make the python/R HIVE connector better? [14:36:10] Maybe not. I wonder if you could provide a demo of Zeppelin. [14:36:18] Or if there's a good one you recommend. [14:37:49] halfak: not seen any demo yet, but I'd love to have a playground [14:38:08] Yeah... 
I wouldn't mind trying it out if you think it wouldn't take much time to get an instance going. [14:38:14] I'd like to have a quarry for hive. [14:38:24] halfak: Right [14:38:28] Or maybe even a more powerful UI for quarry. [14:38:33] :) [14:38:35] halfak: I'll surely let you know [14:38:40] kk [14:39:00] * halfak uses the archaic version of git on the altiscale servers [14:39:51] halfak: only concern for quarry is openess - We can't let everybody access wrong data, and possibly also break the cluster because of resource contention :) [14:40:01] So it'll need to be internal first (if not always|) [14:40:10] Makes perfect sense. [14:40:30] halfak: archaic ? [14:41:05] Yeah. It has weirdly formatted messages that set of the anomaly detectors in my brain. [14:41:16] 1.7.1 [14:41:25] Not that old, I guess, but it seems really weird to me. [14:41:33] It doesn't know how to ask for my username. [14:41:46] It doesn't tell me which repo it is interacting with when I push. [14:41:54] mwarf [14:42:03] Just little things that add up to me asking myself "am I doing this right? Something seems wrong." [14:42:14] yeah, I can hear that [14:42:58] joal, re. quarry, IMO, the killer value is the transparency and accountability in querying. [14:43:16] You want to know what questions people have? Go read through the recent queries on quarry. [14:43:30] halfak: perfectly on the spot [14:43:45] You'll learn a lot about (1) what people want to know, (2) how they are going about knowing it and (3) cool SQL tricks you didn't consider! [14:43:56] :D [14:44:05] Now if only I could submit pull requests to their queries. I've become the resident query optimizer for MediaWiki. [14:44:11] At least for quarry. [14:45:17] Good and bad sides of one thing always come together I guess halfak :) [14:46:29] indeed. [14:54:26] joal: i did restart mysql yesterday, and added a couple of settings [14:54:32] looking into it, settings I changed shoudln't have affected [14:54:41] maybe hive metastore just needs a restart too [14:54:47] ottomata: I have guessed that based on the tasks timeline :) [14:54:49] aye [14:54:59] Cna you for that ? [14:55:13] Can you go for that (hard typing again ...) [14:55:32] joal: where did you find that error? [14:55:56] no partition added in webrequest_load since yesterday [14:56:03] So I tried to load on manually :) [14:56:14] and got that error [14:56:39] k [14:56:48] want to reproduce so I can see if anything i do fixes [14:57:53] I precisely did that: https://gist.github.com/jobar/2c53603dd2afa852c376 [15:09:17] ok, joal i think fixed. [15:09:57] i will tell load jobs to rerun. [15:10:08] ottomata: ok, do you want me to do that ? [15:10:40] ottomata: when you have time, please let me know what went wrong :) [15:11:13] k, naw am doing [15:11:17] will tell you in just a few [15:11:46] thx :) [15:13:46] k, they should be rerunning [15:13:58] phew [15:13:58] ok [15:14:10] not entirely sure why this was a problem [15:14:15] but, what I did was to enable mysql binlog [15:14:27] so that I could set up replication of the mysql server there to the new hive/oozie host [15:14:30] ok (ability to replicate, right ?) [15:14:38] :) [15:14:50] i was talking with jaime yesterday, and we decided to do this so that I could move mysql server and apply the innodb_file_per_table setting [15:15:03] which would make our live easier in the future for many things, including backups and potential restores [15:15:20] What is that setting ? binlog related I guess ? 
[15:15:20] i *could* just stop mysql and copy the whole datadir, but we decided to do this more complicated thing to help us in the future [15:15:28] makes sense [15:15:29] no, not binlog related [15:15:37] I think I would have gone the same path [15:15:40] that setting makes it so that innodb stores its table data in separate files [15:15:45] instead of one monolithic file [15:15:55] but, in order to move to that setting [15:15:58] ok ... How is that better for us ? [15:17:02] it makes backups and restores easier...i think. jaime was recommending it, so I said ok [15:17:18] ok, sorry for bothering (I love to learn new stuff :) [15:17:44] yeah, we might want to press him on that. the migration would be much easier if i kept the same settings [15:17:56] anyway, the default binlog_format is statement based [15:18:05] and somehow that is incompatible with whatever query hive was running [15:18:10] so, i set it to ROW based [15:18:27] which is less efficient i think, but it shoudln't really matter for our case here, cause we aren't really going to be running slaves. [15:18:31] this is just for the migration [15:18:38] i'm going to ask jaime about this before I move on [15:19:01] when you say ROW based xsetting, you are talking about the bin log, write, or hive ? [15:19:25] binlog [15:19:29] its the format of the binlog [15:19:31] k, makes sense [15:19:33] statement vs row [15:19:36] yup [15:19:46] are we doing the offsite in nov now? [15:20:16] hm, we disussed that a bit yesterday, and flights were cheaper, so we said why not [15:20:27] not ok for you ? [15:20:42] naw, i'll need to go to virginia mid that week [15:20:50] Mwarf ... [15:20:52] i'm hosting a big friends thanksgiving that weekend [15:21:07] I wonder if Kevin hasn't already asked travel to book [15:21:12] Let's talk about that in standup [15:21:21] k [15:21:29] ottomata: sounds great :) [15:22:15] ottomata: finally, what solved the stuff ? [15:23:36] hive? [15:23:41] yup [15:23:41] setting binlog_format = ROW [15:23:45] instead of STATEMENT [15:23:47] don't really know why [15:23:57] Ohhhhh, sorry, I thought you changed that yesterday, ok makes sense :) [15:24:02] http://stackoverflow.com/questions/19205318/cannot-create-database-with-hive [15:24:07] Analytics-Kanban: Prepare lightning talk on EL audit {tick} [5 pts] - https://phabricator.wikimedia.org/T112126#1645107 (mforns) [15:24:09] no, yesterday i just turned on binlog [15:24:11] which wasn't on before [15:24:16] and the default format is STATEMENT [15:24:30] thank you stack overflow and google. [15:24:39] otherwise that would have probably taken me many hours to solve [15:24:48] Analytics-Kanban: Put toghether an updated documentation on EventLogging {tick} {oryx} [8 pts] - https://phabricator.wikimedia.org/T112124#1645112 (mforns) [15:24:59] ottomata: those two have save my life a good number of time :) [15:26:22] Analytics: pagecounts-raw files missing since yesterday (15th September) 18.00 - https://phabricator.wikimedia.org/T112741#1645134 (JAllemandou) Bug found and solved, related to hive server migration T110090. Data is slowly backfilling, we expect it to be full restored by tomorrow. 
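For reference, the fix described above boils down to switching the metastore database's binary log format from the STATEMENT default to ROW. A minimal sketch of how that looks from the MySQL client, assuming SUPER privileges on the analytics MySQL host; the actual change may well have been applied through my.cnf/puppet rather than at runtime:

    # check the current format (STATEMENT was the default that broke the Hive metastore queries)
    mysql -e "SHOW VARIABLES LIKE 'binlog_format';"
    # switch the running server to row-based logging
    mysql -e "SET GLOBAL binlog_format = 'ROW';"
    # the equivalent persistent setting in my.cnf would be:
    #   [mysqld]
    #   binlog_format = ROW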
[15:27:29] ottomata: next time you deploy stuff on cluster let me know, it'll be less of a surprise in the morning :-P [15:27:42] sure thing, many apologies [15:27:51] i'm surprised that caused problems at all [15:28:00] It was more of a joke, but it's still a good idea :) [15:28:08] ya [15:28:37] yeah, side effects are always unplanned (if they were planned, we wouldn't call them side effects :) [15:31:49] Analytics-Kanban: Document work so far on Last access uniques and validating the numbers {bear} [8 pts] - https://phabricator.wikimedia.org/T112010#1645188 (ggellerman) [15:34:57] Analytics-Cluster, Analytics-Kanban: pagecounts-raw files missing since yesterday (15th September) 18.00 - https://phabricator.wikimedia.org/T112741#1645205 (kevinator) [15:36:06] Analytics-Kanban, Patch-For-Review: Bug: client IP is being hashed differently by the different parallel processors - https://phabricator.wikimedia.org/T112688#1645214 (Milimetric) a:Milimetric [15:42:17] nuria: was going to say: i bet you can test code with the same class names if you unset auxpath when you start hive [15:43:19] ottomata: I'll try that [15:44:56] ottomata: and how do you set it? [15:45:00] ottomata: will test now [15:45:26] ottomata: ah like hive --auxpath $HIVE_HOME/lib/hive-hbase-handler-0.8.0-SNAPSHOT.jar:$HIVE_HOME/lib/hive-contrib-0.8.0-SNAPSHOT.jar [15:45:42] ottomata: i doubt it, but will confirm [15:48:44] ottomata, regarding the documentation about EL on Kafka, I ended up removing lots of stuff, because it was really outdated... so you may find the architecture page short in content. Feel free to ask me to add things! [15:50:21] Analytics-Cluster, Analytics-Kanban: pagecounts-raw files missing since yesterday (15th September) 18.00 [3 pts] - https://phabricator.wikimedia.org/T112741#1645284 (ggellerman) a:Ottomata [15:53:04] nawi think you will need to do it with -D [15:53:10] in order to unset the conf vaule [15:53:10] um [15:53:22] maybe [15:53:28] -Dhive.aux.jars.path='' [15:53:34] or use your jar [15:54:59] cool, ok thanks mforns [16:07:35] Going for "hive --auxpath ''" works ! [16:14:52] ottomata: https://github.com/wikimedia/operations-puppet/blob/production/manifests/role/eventlogging.pp#L161 [16:15:03] there's only 1? [16:15:09] am i missing something [16:26:55] (PS6) Nuria: Make pageview definition aware of preview parameter [analytics/refinery/source] - https://gerrit.wikimedia.org/r/237274 [16:27:14] (CR) Nuria: Make pageview definition aware of preview parameter (5 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/237274 (owner: Nuria) [16:32:29] madhuvishy: that's the hiera default [16:32:45] ottomata: aah, okay it's in hiera [16:32:47] thanks [16:33:11] https://github.com/wikimedia/operations-puppet/blob/production/hieradata/eqiad.yaml#L75 [16:36:28] o/ ottomata, I want a new package installed on stat1002/1003. What are the right projects to tag in phab on the task? [16:36:44] hm, dunno actually. analytics-backlog? [16:37:23] Analytics-Backlog, Privacy: Identify possible user identity reconstruction using location and user_agent_map pageview aggregated fields to try to link to IPs in webrequest - https://phabricator.wikimedia.org/T108843#1645473 (kevinator) p:Normal>High [16:37:42] kk. FWIW, it'll be snzip and it doesn't look like it has a Deb. [16:37:43] https://github.com/kubo/snzip [16:37:55] I've been using it locally and it works pretty well. 
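Since snzip has no Debian package, a rough sketch of building and using it from the upstream 0.9.0 release referenced later in the channel; this assumes the usual autotools steps and the libsnappy-dev build dependency noted in the task, and any flags beyond -d/-c (which appear later in the log) are assumptions:

    sudo apt-get install libsnappy-dev          # snappy headers are a build dependency
    tar xzf snzip-0.9.0.tar.gz && cd snzip-0.9.0
    ./configure && make                         # produces the snzip binary
    ./snzip file.txt                            # compress: writes file.txt.snz
    ./snzip -d file.txt.snz                     # decompress, gzip-style
    ./snzip -dc file.txt.snz > file.txt         # decompress to stdout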
[16:37:59] Analytics-Kanban, Patch-For-Review: Bug: client IP is being hashed differently by the different parallel processors - https://phabricator.wikimedia.org/T112688#1645477 (Ottomata) I just restarted eventlogging client side processor running only 1 process. This should fix the problem until we actually fix t... [16:38:16] Analytics-Kanban, Privacy: Identify possible user identity reconstruction using location and user_agent_map pageview aggregated fields to try to link to IPs in webrequest - https://phabricator.wikimedia.org/T108843#1645479 (JAllemandou) [16:38:48] btw, just deploeyd the change to use only one processor. milimetric, madhuvishyetc. [16:38:59] thx ottomata [16:40:43] I just pinged the thread about it, so they know the time [16:40:49] Analytics-Backlog: Install snzip on stat1002 and stat1003 - https://phabricator.wikimedia.org/T112770#1645489 (Halfak) NEW [16:40:56] * milimetric lunch [16:43:33] milimetric: ottomata it would be great if you could take a look at https://phabricator.wikimedia.org/T112744 :) [16:54:35] addshore: where do the api logs come from? [16:54:40] udp2log from mediawiki? [16:56:51] ah, demux.py [16:57:15] hm [16:57:36] ah yes, udp2log [16:59:01] Analytics, operations: Moving analysis data from flourine to analytics cluster - https://phabricator.wikimedia.org/T112744#1645588 (Ottomata) Yes, we can do this. fluorine already has an rsyncd running that allows stat1002 to copy files. This would just be a matter of adding a cron job to rsync them to... [17:01:30] ottomata: yes! [17:02:51] Analytics, operations: Moving analysis data from flourine to analytics cluster - https://phabricator.wikimedia.org/T112744#1645597 (Addshore) I'm guessing we don't want to rsync the archived logs files themselves (as that is basically 800GB of duplicated data) or does 800GB not matter? And if we just set... [17:05:06] addshore: i think 800G is fine. [17:05:09] 18T avail on stat1002 [17:05:13] cool! :) [17:05:40] I guess I should be able to make a patch for that in puppet later today or tommorrow then! [17:05:52] my day today has not gone to plan at all..... no time.. [17:18:28] Analytics, Analytics-Cluster, Fundraising Tech Backlog, Fundraising-Backlog, operations: Verify kafkatee use for fundraising logs on erbium - https://phabricator.wikimedia.org/T97676#1645798 (Jgreen) [17:19:39] Analytics-Cluster, operations, Patch-For-Review: Turn off webrequest udp2log instances. - https://phabricator.wikimedia.org/T97294#1645804 (Jgreen) [17:24:10] Analytics, Analytics-Cluster, Fundraising Tech Backlog, Fundraising-Backlog, operations: Verify kafkatee use for fundraising logs on erbium - https://phabricator.wikimedia.org/T97676#1645833 (Jgreen) The new kafkatee-based banner log pipeline is up and running on americium.frack.eqiad.wmnet, w... [17:24:29] ottomata: udp2log truned off ! 
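A hypothetical sketch of the cron-driven rsync described in T112744 above; the rsync module and both paths are placeholders for illustration only, not what the eventual puppet patch will use:

    # run hourly from cron on stat1002, pulling the MediaWiki api logs from fluorine's existing rsyncd
    rsync -rt fluorine.eqiad.wmnet::udp2log/api/ /srv/mw-log/api/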
[17:24:32] awesome :) [17:28:04] (back) [17:29:35] Analytics-Backlog, Analytics-Kanban, Privacy: Identify possible user identity reconstruction using location and user_agent_map pageview aggregated fields to try to link to IPs in webrequest - https://phabricator.wikimedia.org/T108843#1645888 (Nuria) a:Nuria [17:37:13] Analytics, operations: Moving analysis data from flourine to analytics cluster - https://phabricator.wikimedia.org/T112744#1645938 (Legoktm) [17:38:25] Analytics, operations, Patch-For-Review: Moving analysis data from flourine to analytics cluster - https://phabricator.wikimedia.org/T112744#1645942 (Addshore) This will actually result in roughly 2.4T currently as the retention is 90 days on stat1002 [17:38:33] joal: eh? [17:38:48] ottomata: will be 2.4T roughly actually, due to the retention of the current rsyncs being 90 days [17:39:14] eh? [17:50:57] Analytics-EventLogging: Promise returned from LogEvent should resolve when logging is complete - https://phabricator.wikimedia.org/T112788#1646019 (Ejegg) NEW [17:55:07] ottomata: I was looking at the ticket and was thinking that udp2log for webrequest were to be closed, but anyway :) [17:55:19] ah, no, that isn't udp2log for web, that is for mw [17:55:20] different [17:55:26] ok makes sense [17:55:42] And in fact, it seems that hive --auxpath doesn't work :( [17:55:52] tyr with -D [17:56:14] hive doesn't accept -D [17:57:35] n? [17:57:37] no? [17:57:37] hm [17:57:40] ok, iwll look into i [17:57:42] t [17:57:44] maybe. :) [17:57:46] :) [17:57:51] i feel like there are a billion things i have to do ahHHhh [17:57:53] and I need to eat [17:57:53] AHHH [17:57:54] and meetings [17:57:55] AHHH [17:57:55] where is that param set up ? [17:58:01] in /etc/hive/conf/hive-site.xml [17:58:01] :D [17:58:02] see bottom [17:58:12] Ok, I'll continue to look into it as well [17:58:14] k [18:01:02] Analytics-Backlog: Investigate sample cube pageview_count vs unsampled log pageview count - https://phabricator.wikimedia.org/T108925#1646071 (Milimetric) @Tbayer: that table was just loaded to answer some answers from legal. I won't be keeping it updated and will probably delete it after you're done with it... [18:02:03] lucnh! [18:04:21] heh :) I'm gonna ping otto the millisecond he comes back online, just to freak him out [18:06:09] Found a way to override the hive settings nuria and ottomata [18:06:27] Here it is: hive --hiveconf hive.aux.jars.path= [18:12:52] i'm sad because no new york [18:13:58] milimetric: that stat paper you were gonna send me ... [18:14:30] nuria: I sent it to internal right during the meeting [18:14:37] k [18:15:01] the subject is "Designing Statistical Privacy for Your Data" [18:15:58] Analytics-EventLogging, Unplanned-Sprint-Work, Fundraising Sprint Snoop (Dogg|Lion), Patch-For-Review: Promise returned from LogEvent should resolve when logging is complete - https://phabricator.wikimedia.org/T112788#1646145 (Ejegg) p:Triage>Normal a:Ejegg [18:16:12] Analytics-EventLogging, Unplanned-Sprint-Work, Fundraising Sprint Snoop (Dogg|Lion), Patch-For-Review: Promise returned from LogEvent should resolve when logging is complete - https://phabricator.wikimedia.org/T112788#1646019 (Ejegg) [18:18:57] (PS7) Nuria: Make pageview definition aware of preview parameter [analytics/refinery/source] - https://gerrit.wikimedia.org/r/237274 [18:22:39] Analytics-Backlog: Install snzip on stat1002 and stat1003 - https://phabricator.wikimedia.org/T112770#1646168 (Halfak) [18:32:19] (CR) Nuria: "Tested #7 on cluster, works w/o issues." 
[analytics/refinery/source] - https://gerrit.wikimedia.org/r/237274 (owner: Nuria) [18:38:01] (PS8) Nuria: Make pageview definition aware of preview parameter [analytics/refinery/source] - https://gerrit.wikimedia.org/r/237274 [18:40:27] (CR) Joal: [C: 2 V: 2] "LGTM !" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/237274 (owner: Nuria) [18:42:27] (PS1) Joal: Update changelog for version v0.0.19 [analytics/refinery/source] - https://gerrit.wikimedia.org/r/238814 [18:43:12] Guys, off for today ! [18:43:19] See you tomorrow :) [18:50:23] (CR) Nuria: [C: 2] "Looks good, @joal to self merge tomorrow." [analytics/refinery/source] - https://gerrit.wikimedia.org/r/238814 (owner: Joal) [19:11:27] Analytics-Backlog: Install snzip on stat1002 and stat1003 - https://phabricator.wikimedia.org/T112770#1646510 (Halfak) It seems like the snappy library is a dependency. See http://packages.ubuntu.com/trusty/libsnappy-dev [19:19:13] looks good, I modified a couple of things, but looks good. [19:20:45] mforns_gym: ^ [19:20:49] re el doc [19:32:00] (PS1) Ori.livneh: Support gauge ('g') metric type [analytics/statsv] - https://gerrit.wikimedia.org/r/238836 (https://phabricator.wikimedia.org/T112713) [19:32:20] (CR) Ori.livneh: [C: 2] Support gauge ('g') metric type [analytics/statsv] - https://gerrit.wikimedia.org/r/238836 (https://phabricator.wikimedia.org/T112713) (owner: Ori.livneh) [19:33:23] (CR) Ori.livneh: [V: 2] Support gauge ('g') metric type [analytics/statsv] - https://gerrit.wikimedia.org/r/238836 (https://phabricator.wikimedia.org/T112713) (owner: Ori.livneh) [19:33:52] Analytics, Analytics-Cluster, Fundraising Tech Backlog, Fundraising-Backlog, operations: Verify kafkatee use for fundraising logs on erbium - https://phabricator.wikimedia.org/T97676#1646579 (ellery) I'll respond later with a more detailed report but here are some initial results based on some... [19:38:49] Quarry: Too much whitespace in Quarry - https://phabricator.wikimedia.org/T112803#1646597 (Spage) NEW [19:39:18] Analytics-Engineering, Performance-Team, Patch-For-Review: Support support for "gauge" metric type in Statsv - https://phabricator.wikimedia.org/T112713#1646604 (ori) Open>Resolved a:ori [19:57:45] Analytics-Cluster, Analytics-Kanban, Patch-For-Review: Create new Hive / Oozie server from old analytics Dell - https://phabricator.wikimedia.org/T110090#1646670 (Ottomata) [20:22:27] Analytics-Backlog: Install snzip on stat1002 and stat1003 - https://phabricator.wikimedia.org/T112770#1646809 (madhuvishy) [20:28:25] Analytics-EventLogging, Performance-Team: Support kafka in eventlogging client on tin.eqiad.wmnet - https://phabricator.wikimedia.org/T112660#1646819 (Krinkle) [20:28:56] Analytics-EventLogging, Patch-For-Review, WMF-deploy-2015-09-08_(1.26wmf22), WMF-deploy-2015-09-15_(1.26wmf23), WMF-deploy-2015-09-22_(1.26wmf24): EventLogging client should log errors (e.g url too long, or validation failure) - https://phabricator.wikimedia.org/T112592#1646824 (Krinkle) Open>... [20:29:32] Analytics-EventLogging, Performance-Team: Support kafka in eventlogging client on tin.eqiad.wmnet - https://phabricator.wikimedia.org/T112660#1646832 (Ottomata) I don't think we even knew that eventlogging was deployed to tin. Both terbium and tin are precise. Is there a Trusty host where we can install... [20:32:13] ottomata: Little scripts like https://gist.github.com/Krinkle/a51a7030a2b11c220379 I run several times a day, almost every. Right now from tin (not sure where else..). 
It's to briefly tap into the stream a fish out a few packets. Either to debug something, or to verify a conclusion, or just to get some sample details from an error schema to then fix. As [20:32:13] opposed to querying from mysql which seems overkill and more cumbersome. [20:33:08] I suppose it would also be nice to not require ssh access to stat1002, but that's doable (we may have to grant a few ppl access). [20:34:23] indeed, Krinkle, you should just use kafkacat and consume from the eventlogging_NavigationTiming topic [20:34:49] I use what I know and is documented. [20:34:53] I've never heard of kafkacat [20:35:01] Krinkle: stat1002 [20:35:02] kafkacat -C -b kafka1012.eqiad.wmnet:9092 -t eventlogging_NavigationTiming | head -n 10 [20:35:57] Cool [20:36:24] thx [20:36:27] that'll do? [20:36:56] For now, yeah. Though it won't do for queries where I'm filtering by a property. [20:37:01] jq ? [20:37:02] :p [20:37:03] for schema DeprecatedUsage I often filter by module or method [20:37:22] Being able to script it, regexes, inarray etc. would be helpful [20:37:30] aye [20:37:37] which the python module is designed to do. But the kafka stream doesn't work on tin. so hence zmq [20:37:48] i mean, you can use python + kafka client [20:38:20] but ja [20:38:22] eventlogging would be nice too :) [20:38:35] if eventlogging was deployed to stat1002, would that work? [20:39:05] I assume, yes. [20:41:11] Krinkle: right now on stat1002 with python you can do [20:41:39] Analytics-EventLogging, Performance-Team: Support kafka in eventlogging client on tin.eqiad.wmnet - https://phabricator.wikimedia.org/T112660#1646895 (Krinkle) [20:41:43] ottomata: K, checking [20:42:07] Mm... my main home directory is on tin and terbium, but I'll set up a small camp on stat1001 [20:42:10] 02* [20:42:14] Krinkle: https://gist.github.com/ottomata/014e813ebcd3d37d499c\ [20:42:49] Hm.. I recall kafka requiring a longer list of hostnames [20:42:58] yeah if you do it right :) [20:42:58] Is this a proxy or loadbalancer of sorts? [20:43:07] kafka clients just take in a list for bootstrapping [20:43:12] it just picks one at random to get real cluster layout [20:43:22] you can give it just one and it works [20:43:27] as long as that one broker is online :) [20:43:27] Ah, I had no idea [20:43:28] okay [20:43:46] presumably with failover? [20:44:00] if you give it a list, it will try each one [20:44:04] before it fails [20:44:11] but, its only for bootstrapping [20:44:24] once the client is created, it talks with kafka cluster to keep layout [20:44:45] if a broker goes down, it will be notice and get new layout [20:45:21] oops [20:45:28] fixed gist, had atypo [20:46:06] i guess eventlogging makes this nicer, because it presents you a stream of dicts, rather than the json strings [20:46:11] but you can json.loads on each message.value :) [20:48:21] Analytics, Analytics-Cluster, Fundraising Tech Backlog, Fundraising-Backlog, operations: Verify kafkatee use for fundraising logs on erbium - https://phabricator.wikimedia.org/T97676#1646933 (Ottomata) Ellery, call to your Senator and tell her you want a realtime streaming analytics cluster. [20:49:05] ottomata: Yeah. 
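For the filtering-by-property case Krinkle mentions (e.g. DeprecatedUsage by module), kafkacat pipes cleanly into jq. A minimal sketch, assuming the topic follows the same eventlogging_<Schema> naming as the NavigationTiming example above and that the field of interest sits under event.module:

    # tap the stream and keep only events for one module
    kafkacat -C -b kafka1012.eqiad.wmnet:9092 -t eventlogging_DeprecatedUsage \
      | jq -c 'select(.event.module == "some.module")' \
      | head -n 20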
[20:49:22] ottomata: But I assume the (new) eventlogging lib does support kafka and probably uses this underneath [20:49:30] ja it does [20:49:44] lemme deploy to stat1002 :) [20:49:51] i would like to ahve it there too [20:49:59] its jsut a same that i have to do sudo python setup.py install on each node [20:50:02] that it is deployed to :/ [20:50:05] I saw madhuvishy wrote a sample program the other day, which didn't work on tin, but I guess it does work somewhere where a current version of the lib is installed [20:51:10] ottomata: does hafnium have updated code now? [20:59:05] Analytics-Backlog: Mark affected schemas as "containing user-inputed textual data" in the schema pagesv {tick} - https://phabricator.wikimedia.org/T112271#1646974 (mforns) Also, we need to add a section in eventlogging documentation that explains how to proceed in the case someone wants to publish any report... [20:59:44] Analytics-Backlog: Mark schemas as "containing user-inputed textual data" and add publish section to the docs {tick} - https://phabricator.wikimedia.org/T112271#1646979 (mforns) [21:00:14] madhuvishy: yes [21:01:00] ottomata: ah cool. can Krinkle run his scripts from there may be? [21:01:18] madhuvishy: yes probably! [21:01:25] * Krinkle doesn't have access to hafnium [21:01:31] I can deploy scripts there through puppet [21:01:32] i don't like hafnium >:( because it has a public IP :) [21:01:41] aah [21:01:43] it's not meant to be played on manaully [21:01:45] but jaaa Krinkle I'd like it to be on stat1002 also [21:01:48] yeah [21:01:55] Yeah, stat2 sounds good [21:01:57] so i will deploy it to stat1002 [21:02:08] Krinkle: woudl yo mind if it wasn't installed globally there, just deployed? [21:02:09] you could do [21:02:09] and perhaps in the future when terbium has trusty there as well. [21:02:19] export PYTHONPATH=/srv/deployment/eventlogging/EventLogging/server [21:02:33] ottomata: could do, but why? [21:02:46] because, right now, for every deploy, we ahve to manually log into each host [21:02:48] cd into that dir [21:02:49] and do [21:02:53] sudo python setup.py install [21:03:05] Eh.. is this not puppetised? [21:03:08] to actually deploy it to global python path [21:03:09] no. [21:03:11] its git deployed. [21:03:15] that is puppetized [21:03:19] but installing like that is not [21:03:23] installing it globally [21:04:27] ottomata: unrelated - where is all our camus config stuff? [21:04:28] what about hafnium [21:04:38] I assume it'd be installed globally there [21:04:52] Krinkle: it is. [21:04:58] madhuvishy: in refinery [21:05:00] we shoudl move it to puppet [21:05:25] I'm working on the spike for camus to gobblin replacement [21:05:53] awesooome [21:06:20] it's pretty cool. wondering if i can try it on vagrant [21:06:29] by enabling our hadoop role [21:06:38] ottomata: is it part of role eventlogging? (which is applied to hafnium) [21:06:42] I can't find it in https://github.com/wikimedia/operations-puppet/blob/HEAD/manifests/role/eventlogging.pp [21:07:42] Ah, https://github.com/wikimedia/operations-puppet/blob/HEAD/modules/eventlogging/manifests/package.pp [21:07:43] Krinkle: sorry, what are you looking for? [21:07:52] madhuvishy: I'm looking how it is installed on hafnium [21:09:22] madhuvishy: i think should be able to [21:09:38] so dependencies are packaged and installed through apt, and el itself is via trebuchet. 
but presumably doesn't run setup.py into global py path [21:09:40] Krinkle: this is how [21:09:40] https://github.com/wikimedia/operations-puppet/blob/HEAD/modules/eventlogging/manifests/package.pp#L22 [21:09:42] ah I thought ottomata logged in and ran python setup.py install. [21:09:45] that's right [21:09:46] it doesn't [21:09:52] you have to manaully run setup.py install [21:10:01] it deploys the raw code, but doesn't install it into place [21:10:02] which we did on eventlog1001 and hafnium the last time we deployed [21:10:11] right [21:10:27] you might be able to puppetise a notify => 'setup.py' or something like that [21:10:34] I think trebuchet puppet class has something like it [21:10:37] i think you can make git deploy hooks [21:10:39] but there's also something to be said about not auto-upgrading. [21:10:42] yeah [21:10:56] (PS1) Milimetric: Revert "Hack around bad Event Logging IP hash problem" [analytics/limn-edit-data] - https://gerrit.wikimedia.org/r/238965 [21:11:03] I'll leave it to you. As long as it works :) [21:11:19] its fine for now, not my ideal, but i'm not going to fix it now :) [21:13:12] * Krinkle is testing on stat1002. [21:13:15] Ah, right. not globally [21:13:38] ottomata: are the camus jobs run by oozie? [21:13:41] Krinkle: , not yet... [21:13:45] madhuvishy: , no by cron [21:13:51] I'll use sys.path.append if that works. [21:14:02] would prefer to keep it as a simple script that works on itsown [21:14:06] Krinkle: that shoudl work [21:14:17] ottomata: hmmm, do we want to run it through oozie [21:14:29] i thikn you'll want to use eventlogging.get_reader() instead of .connect [21:14:33] no [21:14:34] madhuvishy: [21:14:38] is there a reason to? [21:14:46] gobblin ships this scheduler called quartz [21:14:50] its cron like [21:14:59] the reason to use oozie is to depend on existant data or previously run jobs [21:15:00] but it lets you choose your own [21:15:01] camus is the start [21:15:35] okay cool [21:15:47] dunno anything bout quartz, if its cool maybe! [21:15:52] i know very little at this point [21:16:33] Analytics, MediaWiki-Authentication-and-authorization, Reading-Infrastructure-Team, MW-1.26-release, Patch-For-Review: Create dashboard to track key authentication metrics before, during and after AuthManager rollout - https://phabricator.wikimedia.org/T91701#1647027 (csteipp) Is there an estima... [21:17:23] ottomata: Hm.. there's nothing there except srv/deployment/analytics/refinery [21:17:32] Krinkle: not yet! :) [21:17:36] oh! [21:17:42] I thought iy was there but not global [21:17:46] making puppet happy. 
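Until setup.py install is run on stat1002, the git-deployed checkout can be used directly via PYTHONPATH, as suggested above. A minimal sketch using the deployment path quoted earlier in the channel; the stream URI that eventlogging.get_reader() expects is not shown here:

    # point Python at the trebuchet-deployed checkout instead of a global install
    export PYTHONPATH=/srv/deployment/eventlogging/EventLogging/server
    # sanity-check that the library is importable from there
    python -c 'import eventlogging; print(eventlogging.__file__)'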
[21:17:47] Okay, no worries [21:17:51] (PS1) Jforrester: Limit data split to all plus the top 50 wikis [analytics/limn-edit-data] - https://gerrit.wikimedia.org/r/238969 (https://phabricator.wikimedia.org/T112222) [21:18:24] Analytics-Backlog, Analytics-Dashiki, Editing-Analysis, Editing-Department, and 2 others: Limit wikis on https://edit-analysis.wmflabs.org/compare/ to top 50 Wikipedias {lion} - https://phabricator.wikimedia.org/T112222#1647031 (Jdforrester-WMF) a:Jdforrester-WMF [21:18:28] Analytics-Backlog, Analytics-Dashiki, Editing-Analysis, Editing-Department, and 2 others: Limit wikis on https://edit-analysis.wmflabs.org/compare/ to top 50 Wikipedias {lion} - https://phabricator.wikimedia.org/T112222#1629271 (Jdforrester-WMF) [21:18:50] ack no, python-pykafka does not work on precise, right, no hafnium, oops [21:19:36] milimetric: do you have more context for this item: https://phabricator.wikimedia.org/T108843 [21:19:45] (CR) Milimetric: [C: 2 V: 2] Limit data split to all plus the top 50 wikis [analytics/limn-edit-data] - https://gerrit.wikimedia.org/r/238969 (https://phabricator.wikimedia.org/T112222) (owner: Jforrester) [21:20:01] milimetric: That was quick review. :-) [21:20:08] easy change :) [21:21:17] nuria, I don't think so. That task is just basically challenging the assumption that we can keep around the pageview_hourly table forever [21:21:19] * James_F grins. [21:21:48] I'm not sure exactly what Joseph was hinting at in the description [21:21:51] Analytics-Backlog, Research-and-Data: Install snzip on stat1002 and stat1003 - https://phabricator.wikimedia.org/T112770#1647045 (Halfak) [21:22:36] Krinkle: try now [21:25:10] what the, django? [21:26:15] ottomata: https://gist.github.com/Krinkle/1974c00ca470be026cd9 [21:26:20] indeed [21:27:58] where did django come from! [21:28:37] got it [21:28:40] https://github.com/jsocol/pystatsd/issues/24 [21:30:10] Krinkle: https://gist.github.com/Krinkle/1974c00ca470be026cd9#gistcomment-1575480 [21:31:06] ottomata: thx [21:31:54] i guess django is installed on stat1002? and conflicts with statsd [21:31:55] iunnooo [21:32:01] Anyone got a pro-tip for re-compressing data in hadoop? [21:32:12] I've got a big collection of snappy file and I want them to be bz2 [21:32:20] or gz [21:32:22] or 7z [21:32:28] halfak: ja what was your snappy problem? [21:32:30] Anything I can work with on a unix machine [21:32:31] why do you you want to change them? [21:32:42] Because you can't decompress snappy on a unix machine. [21:32:49] Without compiling some guys code yourself [21:33:06] See https://phabricator.wikimedia.org/T112770 [21:33:16] halfak: [21:33:30] /home/otto/snzip-0.9.0/snzip -h [21:33:34] Wut [21:34:27] also, you've copied the snappy files locally? and you are trying to work with them? [21:34:33] -bash: /home/otto/snzip-0.9.0/snzip: No such file or directory [21:34:45] /home/otto/snzip-0.9.0/snzip [21:34:48] ottomata, yeah. Hadoop is no good at the next step in processing. [21:34:53] on stat1002? [21:34:55] stat1003 [21:34:59] ah i'm on 2 [21:35:08] It would be better to do this work on 1003 [21:35:17] halfak: btw, you can do hdfs dfs -text to output a snappy file from hdfs [21:35:20] instead of -cat [21:35:21] Would you mind doing whatever it is that you did on 1002 to get that compiled? [21:35:24] just in case, dunno how big your files are [21:35:31] ottomata, I'm going to multiprocess them. I need file handles. [21:35:46] So dfs -text won't work. 
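For reference, the -text trick ottomata mentions decompresses hadoop-written snappy on the fly, which is handy whenever a streamed dump is enough (it just doesn't give halfak the per-file handles he needs). The part file name below is a guess at the usual MapReduce output naming:

    # stream-decompress one output file straight out of HDFS; -text understands the hadoop snappy codec, plain -cat does not
    hdfs dfs -text /user/halfak/streaming/enwiki-20141106/filtered-diffs-snappy/part-00000* | head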
[21:36:06] k [21:36:15] halfak: , i just copied the binary over to stat1003 [21:36:17] /home/otto/snzip [21:36:36] OK with keeping it there for the forseeable future? [21:36:41] Or do you want it in a folder? [21:36:57] halfak: i won't mess with it, but we probably should have a .deb packge for this [21:38:15] Analytics, MediaWiki-Authentication-and-authorization, Reading-Infrastructure-Team, MW-1.26-release, Patch-For-Review: Create dashboard to track key authentication metrics before, during and after AuthManager rollout - https://phabricator.wikimedia.org/T91701#1647078 (Tgr) Uh, as soon as I don't... [21:38:33] ottomata, I think this'll work for me. Thanks! [21:39:19] Analytics-Backlog, Research-and-Data: Install snzip on stat1002 and stat1003 - https://phabricator.wikimedia.org/T112770#1647081 (Halfak) @ottomata directed me to a compiled binary of snzip he has in his home folder. This will work for me for now. [21:39:24] Analytics-Backlog, Research-and-Data: Install snzip on stat1002 and stat1003 - https://phabricator.wikimedia.org/T112770#1647082 (Halfak) Open>Resolved [21:41:24] ottomata, I get the following on all of my files from hadoop snappy "Unknown file header: 0x00 0x03 0x48 0x97 0x00 0x00 0xfb 0x06 0x97" [21:42:34] https://gist.github.com/halfak/49cc3d3a311eb64ac995 [21:42:39] For the full call and error [21:43:27] Analytics-Backlog, Research-and-Data: Install snzip on stat1002 and stat1003 - https://phabricator.wikimedia.org/T112770#1647090 (Ottomata) Resolved>Open I'd like to leave this open and put it on the analytics backlog with low priority. It'd be nice to have this as a .deb. [21:43:41] Analytics-Backlog: Install snzip on stat1002 and stat1003 - https://phabricator.wikimedia.org/T112770#1647093 (Ottomata) p:Triage>Low [21:43:57] HMM [21:44:01] i haven't used this thing much, halfak [21:44:02] hm [21:44:14] File could be corrupt. [21:44:23] Got a good one from hadoop that we can test with [21:44:35] I'll move the file to stat1002 to see if there's something special about that env. [21:44:43] halfak: how did you create these files? [21:44:57] Snappy output codec in hadoop [21:45:08] Hadoop made no complaints when reading the same files [21:45:25] But I did have to copy them in to an S3 bucket on the way back from altiscale's cluster [21:45:29] from a MR job? [21:45:32] are these sequence files? [21:46:03] https://github.com/halfak/measuring-edit-productivity/blob/master/hadoop/json2diffs.hadoop [21:46:06] See the config there [21:46:15] type=Block [21:46:27] codec=blahblahblah.SnappyCodec [21:49:13] Same error on stat1002 [21:51:01] Analytics-Kanban: Put toghether an updated documentation on EventLogging {tick} {oryx} [8 pts] - https://phabricator.wikimedia.org/T112124#1647128 (mforns) Here is a list of what has been done for this task. All points are related to either Wikitech (WT) or MediaWiki (MW) documentation on EventLogging: - Uni... [21:51:33] ya am trying things. [21:51:35] not really sure. [21:51:44] Did you ever have snzip work? [21:52:10] Yeah... seems to simply just not work [21:52:30] Even my local install on my laptop -- that works with files I compressed directly with snzip -- doesn't work. [21:52:32] yes, and i can make it work with file I create with snzip :p [21:52:33] ARG! [21:52:39] oh, works for me [21:52:45] On the part file? [21:52:49] did you compile that on your laptop? [21:52:53] yeah [21:52:55] no [21:53:00] just a text file i compressed with snzip [21:53:08] Yeah. Same here. 
[21:53:10] haven't found a file generated from hadoop that i can snzip -dc yet [21:53:55] hdfs:///user/halfak/streaming/enwiki-20141106/filtered-diffs-snappy [21:54:35] Yup. Unknown file header on one of those too [21:54:39] *sigh* [21:55:04] So, it looks like I need to re-compress the files. [21:55:31] hmm from http://stackoverflow.com/questions/16674864/how-do-i-read-snappy-compressed-files-on-hdfs-without-using-hadoop [21:55:31] "Thats because snappy used by hadoop has some more meta data which is not undesrtood by libraries like https://code.google.com/p/snappy/, You need to use hadoop native snappy to unsnap the data file that you downloaded. [21:55:31] " [21:55:45] * halfak face palms [21:55:48] kk [21:55:50] So [21:55:59] I have a problem though. They are already sorted and partitioned. I just want to recompress them without changing the order of the contents. [21:56:15] And I'd like to do it with the cluster because it is going to take a lot of CPU to re-compress [21:56:42] Any protips? [21:57:36] hm, i mean, off the top of my head: hive, just cause that is a tool i know a little more [21:57:43] make an external table on top of it [21:57:52] make a new table that is different compression [21:57:56] select from insert into [21:57:57] buuut [21:58:04] there's gotta be a better way [21:58:24] The select would re-order the lines in the files. [21:58:33] I have 2000 files that have the lines in a perfect order. [21:58:45] It took lots of cycles and disk IO to get them in that order. [21:58:55] halfak: what furthre processing are you trying to do? [21:59:00] why not just do it in hadoop too? [21:59:10] This would be a very complicated explanation. [21:59:27] Can we just assume that this is a bad idea? [21:59:33] haha [21:59:34] Altiscale techs would agree. [22:00:01] I've run this exact job in hadoop more than 50 times without success. [22:00:18] I can't debug when I have to wait 30 minutes to see what error occurred. [22:00:50] Also, hadoop doesn't seem to make this go any faster when I run it on samples. [22:00:54] halfak: how big are these files? how much total data? [22:00:57] The server is actually a little bit quicker. [22:01:26] 340GB compressed snappy [22:01:54] 340G spread across 2000 files? [22:02:04] halfak: crappy off the top of my head way: [22:02:26] hdfs dfs -text $file | bzip [22:02:44] But that won't distribute CPU [22:04:26] Meh. It's OK. I guess I'll just burn a little bit of task hours by using the CPU/disks to sort 'em again. [22:04:43] Maybe it'll use a trivial amount of disk/CPU the second time since the data is already sorted. [22:04:49] mkdir jobs; cd jobs [22:05:04] num_cpus=10 [22:05:15] ls files/ | split -l $num_cpus [22:06:06] So... still going to use the local machine, but I appreciate the idea. [22:06:09] I didn't know about split [22:06:15] Analytics-EventLogging, Fundraising-Backlog, Unplanned-Sprint-Work, Fundraising Sprint Snoop (Dogg|Lion), Patch-For-Review: Promise returned from LogEvent should resolve when logging is complete - https://phabricator.wikimedia.org/T112788#1647179 (atgo) [22:06:27] right, i'm making up a crappy shell parallelizer :p [22:06:37] make a script that reads a list of files [22:06:46] Not crappy in that you're not writing a ton of code to do it! [22:06:49] and run one per job ifle in a background [22:07:11] Essentially I want crappy mappers and a queue of files to compress. [22:07:45] On deployment-eventlogging02 are the files in /var/log/eventlogging/ supposed to be empty? 
[22:08:24] this entry in the archive seems weird too: [22:08:28] -rw-r--r-- 1 eventlogging eventlogging 7.9G Sep 8 22:17 server-side-events.log-20150910 [22:08:46] Thanks for taking a look ottomata [22:09:13] Krenair: host has changed, deployment-eventlogging03 [22:09:18] upgraded to trusty [22:09:36] What? The docs say 02... [22:10:16] oh cool [22:10:19] halfak: xargs! [22:10:19] ? [22:10:20] find . -maxdepth 1 -type f -print0 | xargs -0 -n 1 -P 4 echo [22:10:35] Krenair: where, we are in the process of updating docs [22:10:49] https://wikitech.wikimedia.org/wiki/Analytics/EventLogging/TestingOnBetaLabs [22:11:05] got it thanks, fixing [22:11:49] No mysql installed? [22:12:54] mysql? [22:13:05] OH! [22:13:06] hm. [22:13:14] also fixing! geez, this is not well puppetized. [22:13:50] didn't think that this would write to localhost mysql... [22:14:05] So... has this not been recording logs for the last 8 days? [22:14:10] (in the DB) [22:14:15] correct [22:14:37] Where have the errors about mysql not being there been going? [22:15:04] i guess to upstart logs [22:16:15] hm, i actually don't see errors in there [22:17:00] Me neither [22:22:46] Krenair: ok, database is there, restarted eventlogging, can we insert some client side events and see it insert stuff? [22:23:26] hmm, i see stuff in log [22:23:34] not sure how to make it inset [22:23:35] insert [22:23:38] how to trigger [22:24:14] browsing to http://en.m.wikipedia.beta.wmflabs.org/wiki/Special:Random#/editor/0 should be enough [22:26:02] hmm, yeah it is consuming fine, just not inserting hm [22:31:10] oh Krenair events only get inserted every 5 minutes i think... [22:34:46] ah yeah [22:34:47] its working [22:34:50] thanks Krenair sorry about that [22:38:43] OOOk laters all [23:25:34] random noob hive question, say i have a column of type array>. How can i sum the hits? all i've found so far is related to using explode and a lateral view which doesn't seem like the right approach
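On that closing question: in the Hive versions on the cluster at this point, LATERAL VIEW with explode really is the idiomatic way to aggregate over an array of structs. A minimal sketch, with database, table and column names made up since the real schema isn't shown:

    # sum the hits field across every struct in the array column, over all rows
    hive -e "
      SELECT SUM(p.hits)
      FROM some_db.some_table
      LATERAL VIEW explode(pages_column) exploded_pages AS p;
    "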