[00:00:32] hey yurik :-) [00:00:43] qchris, hi! [00:00:57] plane in 20, what's up? [00:01:01] the cluster is currently working hard to catch up what it missied during the day [00:01:11] can I kill your zero-sms job for today? [00:01:20] sec [00:01:43] (It's hard-blocking the catchup, as the jobs are fighting for resources, and currently there is a stale mate) [00:01:47] (all jobs blocked) [00:02:08] qchris, sure, go ahead - i was worried that it hasn't finished yet [00:02:19] qchris, i might have to bug you tomorrow to possibly manually restart it ;) [00:02:20] k. thanks. [00:02:44] k. [00:04:12] qchris, did you kill the python script that starts them? [00:04:21] killing myself ... [00:04:22] no. [00:04:36] the hive job just released the resources it held. [00:05:04] (Cluster is slowly recovering) [00:05:33] qchris, just in case - i have a 48 */4 * * * job that starts it [00:05:55] And it automatically re-runs missing periods? [00:06:03] it runs python script that spawns a hive job per each missing day [00:06:14] Awesome! [00:06:16] Thanks. [00:06:19] np [00:06:40] qchris, double check - it could have started another job when you killed the first [00:07:00] I did not yet kill a job. [00:07:05] I only asked. [00:07:15] But before I could kill it ... it freed the resources. [00:07:15] ah, ok - i killed the python script anyway [00:07:23] it got scared [00:07:27] :-D [02:04:55] * Fiona smiles at https://wikimediafoundation.org/wiki/File:Andreescu,_Dan_January_2015.jpg [10:10:18] Analytics-Tech-community-metrics, ECT-February-2015, ECT-March-2015: Key performance indicator: Gerrit review queue - https://phabricator.wikimedia.org/T39463#1069401 (Qgil) [14:09:20] Analytics-Tech-community-metrics, Wikimedia-Git-or-Gerrit, ECT-February-2015: Active code review users on a monthly basis - https://phabricator.wikimedia.org/T86152#1069777 (Nemo_bis) > Why? Self-merges are not code review. The bug summary says "code review". If you're not interested in code review,... 
[14:10:22] Hi ottomata, yt ? [14:10:28] hiya [14:10:57] You ok this morning ? [14:11:15] 'cause I have code review for yaaa ;) [14:11:19] doing alrigggghhhht! :) [14:11:28] qchris and I are petting the cluster, telling it that everything will be alright [14:11:38] huhu [14:12:04] I don't wanna bother qchris, but I'd like to know how he monitors the fact that our jobs ar late [14:14:40] also, working on the mobile monthly uniques and therefore reading on oozie, I'd like to discuss with you the use of forward/backward frequency counts for datasets in coordinators [14:15:15] sureuuuU!U!! [14:15:22] he monitors because he has datasets he cares about [14:15:24] so he noticed. :) [14:16:04] e.g.: shall we use ${coord:current(0)}-${coord:current(23)} OR ${coord:current(-23)}-${coord:current(0)} [14:17:09] i think the former, since we want the nominal time of the workflow to be the day for which the data is [14:17:10] that is. [14:17:23] Feb 26 hours 0-23 [14:17:30] should generate the fiel for [14:17:32] FEb 6 [14:17:35] Feb 26 [14:17:35] * [14:17:44] (typing hard) [14:18:30] joal: About seeing which datasets are late ... you can run '/srv/deployment/analytics/refinery/bin/refinery-dump-status-webrequest-partitions --datasets all' on stat1002. Once things look wrong there, I typically just look at the oozie output directly. [14:18:52] (as long as someone keeps that file up to date with each dataset...:p) [14:19:15] i think i'm going to revert the vcores change [14:19:30] Thx qchris :) [14:19:31] the change happened the night before the cluster started backing up [14:19:34] I'll have allok [14:20:04] ottomata: so about the cluster deadlock. [14:20:08] ja [14:20:11] It also happened before. [14:20:17] Like once a month or something. [14:20:25] really? and you just managed it and didnt' tell me?! :p [14:21:09] It typically was some random query from some researcher that was too big. 
So I freed up resources for that and once that query finished, the other stuff backfilled automatically. [14:21:33] We discussed memory limitations a few times. [14:22:00] But the issue wasn't as big as it is now, because back then, everything was based on wmf_raw. [14:22:09] ? [14:22:12] So things blocked early, and recovery was easy. [14:22:16] ah [14:22:29] Now with wmf.webrequest, one has to be careful when to run which job, [14:22:39] because more jobs can run at once? [14:22:40] as otherwise, the refining happens on partial data. [14:22:44] right. [14:22:49] ? [14:22:55] O [14:22:56] H [14:23:02] because 2 hours would pass [14:23:07] and camus wouldn't be done. [14:23:09] Right. [14:23:11] that's bad. [14:23:12] hm [14:23:12] ok [14:23:25] hm. [14:23:40] ok, before I try to revert vcores then [14:23:44] i'm going to do some fairscheduler reading [14:23:49] and tweak the queues [14:23:54] cool. [14:23:58] i could just up the priority of essential [14:24:00] but i'll read a bit first [14:29:03] Analytics-Tech-community-metrics, Wikimedia-Git-or-Gerrit, ECT-February-2015: Active Gerrit users on a monthly basis - https://phabricator.wikimedia.org/T86152#1069783 (Qgil) [14:58:16] ottomata: how are things coming along with the queue tuning? Should I start refining now nonetheless, so we get a few partitions done until you have finished your parameter testing? [14:59:00] sure [14:59:02] they are coming along [14:59:14] actually, fair-scheduler allocations will get picked up without restarting resource manager [14:59:17] but i want to enable preemption [14:59:25] yay for preemption! 
[14:59:37] (refining started again) [15:15:29] Analytics-Cluster, Analytics-Kanban: Add 'version' field to refined webrequest table in Hive - https://phabricator.wikimedia.org/T90725#1069884 (ggellerman) Open>Resolved a:JAllemandou [15:18:02] Analytics-Cluster, Analytics-Kanban: Add jar versions as parameters in oozie jobs - https://phabricator.wikimedia.org/T90736#1069899 (JAllemandou) a:JAllemandou [15:18:45] Analytics-Cluster, Analytics-Kanban: Add jar versions as parameters in oozie jobs - https://phabricator.wikimedia.org/T90736#1066190 (JAllemandou) p:Triage>Normal [15:21:00] qchris: would you mind looking this over? [15:21:02] you can say no :) [15:21:02] https://gerrit.wikimedia.org/r/#/c/193109/ [15:21:07] * qchris looks [15:24:15] ottomata: I have to admit that I do not fully understand what the parameters mean. Your comments make sense though. [15:24:41] Since we announced that the cluster has issues ... should we just try those settings in production? [15:28:15] yes i think so [15:28:18] can't hurt :p [15:28:34] so I 90% grasp the params [15:28:43] preemption will be the most useful one here [15:28:54] yup. [15:28:57] there is a min share that an app gets from a queue, and a fair share. [15:29:06] this says that, if an app doesn't get its min share after 60 seconds [15:29:17] other containers (not jobs) from other queues will be killed [15:29:27] then, after 10 minutes, if a job doesn't get its fair share [15:29:30] it will start killing more containers [15:29:42] Yup. Those look good. 
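The two preemption timeouts described above (60 seconds below min share, 10 minutes below fair share) can be sketched as a small decision function. This is a toy model for illustration only, not YARN's Fair Scheduler code; the names `Starvation` and `preemption_target` are made up here, and only the two timeout values come from the conversation.

```python
from dataclasses import dataclass

# Timeouts as described in the chat; the constant names are hypothetical.
MIN_SHARE_TIMEOUT_S = 60     # "doesn't get its min share after 60 seconds"
FAIR_SHARE_TIMEOUT_S = 600   # "after 10 minutes ... doesn't get its fair share"


@dataclass
class Starvation:
    """How long an app has been below each share, in seconds."""
    below_min_share_s: float
    below_fair_share_s: float


def preemption_target(s: Starvation) -> str:
    """Return which share the scheduler would preempt containers up to.

    'none'       -> no containers in other queues are killed yet
    'min_share'  -> kill enough containers to restore the min share
    'fair_share' -> kill more containers, up to the fair share
    """
    if s.below_fair_share_s >= FAIR_SHARE_TIMEOUT_S:
        return "fair_share"
    if s.below_min_share_s >= MIN_SHARE_TIMEOUT_S:
        return "min_share"
    return "none"


if __name__ == "__main__":
    for waited in (30, 90, 700):
        print(waited, preemption_target(Starvation(waited, waited)))
```

The point of the two tiers is that the cheaper guarantee (min share) kicks in quickly, while the more disruptive one (fair share) only applies to apps that have been starved for much longer.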
[15:29:44] this one [15:29:44] yarn.scheduler.fair.locality.threshold.node [15:29:49] But I do not fully grasp [15:29:51] yarn.scheduler.fair.locality.threshold.node [15:29:53] right :-) [15:30:01] makes yarn delay a bit when first scheduling jobs [15:30:10] yarn wants to schedule jobs with data locality [15:30:21] so, in a cluster with nothing running [15:30:29] it will schedule jobs to run where the data that job needs is [15:30:33] but, if a cluster is busy [15:30:41] the node where the data is might be full [15:31:00] so, it will just schedule the job on whichever node is available [15:31:12] which means that the job will have to get the data over the network [15:31:28] Sure. But 1/3 ... seems big. We have jobs with 1500 mappers. [15:31:42] so, it only means [15:31:55] So wouldn't that mean that 500 maps would have to get scheduled locally before the job would start? [15:32:10] that it will wait until it has had opportunities to schedule the job on 1/3 of the cluster before it will schedule something non local [15:32:12] no [15:32:27] it just tells it to not immediately schedule a non local job [15:32:39] it says: wait a bit, maybe a local node will be available elsewhere [15:32:49] it waits until it has a chance to try 1/3 of the cluster [15:32:54] before it schedules non local [15:32:57] Curious to see that in production and make the cluster go zooooooooooooooooom :-) [15:33:07] i think anyway [15:33:09] see [15:33:10] Delay Scheduling [15:33:11] here: [15:33:13] https://www.safaribooksonline.com/library/view/hadoop-the-definitive/9781491901687/ch04.html [15:33:25] Cool. Thanks for the link. [15:34:04] also, correct me if I misunderstood that [15:34:05] :) [15:48:06] ok, qchris, cluster looks pretty empty right now [15:48:09] one of ellery's jobs is running [15:48:16] it just started [15:48:19] i want to restart resource manager [15:48:48] gonna just do it. 
:) [15:48:50] wait [15:48:55] oh [15:48:58] hm [15:49:04] ohi think i am looking at the wrong queue [15:49:11] there are lots of hdfs jobs running too [15:49:29] ah yeah, ah, i didn't realize i could filter on queue in the scheduler ui [15:49:58] will just wait until this camus job is done, then restart, hopefully the ohter jobs will jstu go, if they don't, we can restart them in oozie [15:50:25] Ignore the hdfs jobs for now :-) [15:50:32] Those are mostly refining. [15:50:34] And some camus. [15:50:43] So restart the resource manager at will. [15:50:59] ottomata: ^ [15:55:07] this camus is almost done [15:55:15] also, um, analytlics1011 looks like i has problems? [15:55:17] looking into it: [15:55:24] http://localhost:8088/cluster/nodes [15:55:26] 9/12 local-dirs are bad: [15:55:39] I'd ignore the camus one. It'll just catch up when needed. [15:56:28] !log restarted resourcemanager on analytics1001 to load new fairscheduler settings [15:58:09] would be nice to have HA resourcemanager :/ [15:58:14] :-) [15:58:17] alljobs just got kablaamed. [15:58:23] Application with id 'application_1424120984454_20321' doesn't exist in RM. :( [15:58:32] I am about to restart the jobs. [15:58:36] cool [15:58:36] danke [16:09:04] Analytics-Tech-community-metrics, Phabricator, ECT-February-2015, ECT-March-2015: Metrics for Maniphest - https://phabricator.wikimedia.org/T28#1070074 (Aklapper) Though I played a bit more with Phab's SQL in the last days (phun phun phun!) I won't get into this task in Feb 2015 (nothing done on the... 
[16:35:47] Analytics-Cluster, Analytics-Kanban, Scrum-of-Scrums: Create Daily & Monthly pageview dump with country data - https://phabricator.wikimedia.org/T90759#1070191 (kevinator) p:Triage>Normal [16:37:09] Analytics-Cluster, Analytics-Kanban, Scrum-of-Scrums: Create Daily & Monthly pageview dump with country data - https://phabricator.wikimedia.org/T90759#1066753 (kevinator) [16:39:28] Analytics-Cluster, Analytics-Kanban: Refactor MobileApps uniques HQL to use external table to format data. - https://phabricator.wikimedia.org/T90730#1070204 (kevinator) p:Triage>Normal [16:39:53] btw, I just changed the table: https://wikitech.wikimedia.org/wiki/Analytics/Unique_clients/Last_visit_solution#Deliverable [16:40:01] to provide a visual understanding of what we're saying below [16:40:37] gaahhh [16:40:40] analytics1011 [16:40:44] why you unhealthy? [16:40:50] is it because your disks are at 91% utilization? [16:44:53] Analytics-Cluster, Analytics-Kanban: Refactor MobileApps uniques HQL to use external table to format data [5 pts] - https://phabricator.wikimedia.org/T90730#1070237 (kevinator) [16:46:16] qchris, joal, fyi, I just started a limited balancer job [16:46:39] analytics1011 has a lot of data on it, and i think this is why it isn't showing up for yarn usage [16:46:39] What does that do? [16:46:53] mhmm. ok. [16:46:54] i picked 3 other nodes to consider [16:47:06] Analytics-Cluster, Analytics-Kanban: Refactor MobileApps uniques HQL to use external table to format data [5 pts] - https://phabricator.wikimedia.org/T90730#1070250 (Nuria) - refactor mobile app uniques daily job to take advantage of kind of a "temporary" table (real external table and drop it) on hiv... [16:47:08] it will rebalance blocks between analytics1011 + the other 3 nodes [16:47:16] until each is within about 10% utilization of the others [16:47:21] cool. 
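The balancer's "within about 10%" rule can be illustrated with a small sketch. It assumes the usual HDFS balancer notion that a node counts as balanced when its disk utilization is within a threshold (in percentage points) of the average over the nodes considered; this is a simplified model, not the balancer's actual code. The 91% figure for analytics1011 and the roughly 40% figure for the three other nodes come from the conversation; the names `node-a` to `node-c` and the function `classify_nodes` are hypothetical.

```python
def classify_nodes(utilization, threshold_pct=10.0):
    """Split nodes into over- and under-utilized relative to the average.

    utilization: mapping of node name -> disk utilization in percent.
    Returns (average, over_utilized_names, under_utilized_names).
    """
    average = sum(utilization.values()) / len(utilization)
    over = sorted(n for n, u in utilization.items() if u > average + threshold_pct)
    under = sorted(n for n, u in utilization.items() if u < average - threshold_pct)
    return average, over, under


if __name__ == "__main__":
    # analytics1011 at 91% is from the chat; the other node names are made up.
    nodes = {"analytics1011": 91.0, "node-a": 40.0, "node-b": 40.0, "node-c": 40.0}
    average, over, under = classify_nodes(nodes)
    print(f"average={average:.2f}% over={over} under={under}")
```

Blocks would move from the over-utilized node to the under-utilized ones until every node falls inside the threshold band around the average.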
[16:47:28] i picked 3 nodes that had about 40% util [16:47:32] i could do the whole cluster [16:47:35] which might not be a bad idea [16:47:45] Btw. After the resource manager restart, the cluster is pretty snappy. [16:47:49] but, i'll let this go first, just to clear up analytics1011 [16:47:53] ha, cool [16:47:57] It can take more load than yesterday. [16:48:06] hm [16:48:07] Not sure yet whether or not it can take the full load again. [16:48:12] not sure why that would be though [16:48:23] i mean, it should prefer the essential jobs now much more [16:48:25] that's all I changed [16:48:37] maybe it just has less running cause we killed stuff [16:48:55] Before the restart, I could not refine all five webrequest_sources at once. [16:48:58] Now I can. [16:49:19] i did not explicitly kill stuff and resources look comparable. [16:50:00] hm [16:50:01] weird [16:50:05] yup. [16:50:07] well, balancer could slow it down a bit :/ [16:50:08] Analytics-Cluster, Analytics-Kanban: Update documentation page for the refined webrequest table in hive - https://phabricator.wikimedia.org/T90726#1070263 (kevinator) p:Triage>Normal a:JAllemandou [16:50:12] but that's why i limited it to a few nodes [16:50:14] rather than all of them [16:50:15] oh! [16:50:20] qchris, i was not aware of the data category [16:50:27] i have a bunch of emails yet to respond to today... [16:50:32] :-) [16:50:39] joal: we should just use this [16:50:40] https://wikitech.wikimedia.org/wiki/Category:Data_stream [16:50:44] rather than our Data hierarchy [16:51:23] Sounds good to me [16:51:24] and/or we could move these pages into Data/ [16:51:31] As you prefer [16:51:34] but adding the category sounds like a better way to organize... 
[16:51:40] Analytics-Cluster, Analytics-Kanban: Update documentation page for the refined webrequest table in hive - https://phabricator.wikimedia.org/T90726#1065940 (kevinator) [16:51:41] i am a noob wiki user [16:51:41] Analytics-Cluster, Analytics-Kanban: Add 'version' field to refined webrequest table in Hive - https://phabricator.wikimedia.org/T90725#1070268 (kevinator) [16:51:45] even though i have worked here for 3 years [16:52:13] when's your anniversary?! [16:52:23] (PS1) Jsahleen: Add min and uz to for beta features dashboard. [analytics/limn-language-data] - https://gerrit.wikimedia.org/r/193128 [16:53:41] (CR) Jsahleen: [C: 2] "Simple change." [analytics/limn-language-data] - https://gerrit.wikimedia.org/r/193128 (owner: Jsahleen) [16:54:47] (CR) Ottomata: "Overall, I like." [analytics/refinery] - https://gerrit.wikimedia.org/r/192891 (owner: Joal) [16:56:06] milimetric: We added some new languages for our beta enablements dashboard. Changes have been merged. Can you push to the server? [16:56:22] sure jsahleen, one sec [16:57:23] jsahleen: done, but any data that needs to recompute won't be there for a bit [16:57:28] wow, qchris, a single mediacount day is huge, eh? [16:57:35] (as it waits for crons / rsyncs / etc.) [16:57:39] milimetric: uncertain [16:57:40] Just 300MB (compressed) [16:57:42] depends on when you start counting [16:57:52] i was 50% for a few months when i started [16:57:52] milimetric: Thanks! [16:57:59] i think i count from when I was 100% [16:58:02] which is probably like april something [16:58:12] ooh! coming up. You should write an ori - style email [16:58:17] ha [16:58:28] qchris: its huge! one day of aggregates is 300M compressed! [16:58:40] 300MB is huge? [16:58:45] for an aggregate, no? [16:58:48] pagecounts is 100MB/hour. [16:58:53] haha, is it really? [16:58:53] haha [16:59:00] :-) [16:59:05] its fine for sure, i'm just surprised [16:59:10] i guess there are a lot of images! 
[16:59:33] Analytics-Cluster, Analytics-Kanban, Easy: Mobile Apps PM has monthly report from oozie about apps uniques [8 pts] - https://phabricator.wikimedia.org/T88308#1070303 (kevinator) [16:59:35] There are :-D And videos, and audio files ... and images of math formulae :-D [16:59:41] ori-style email? [16:59:44] * ori blinks [17:00:21] ori: your awesome "I've been here for 2 years" one [17:00:26] Analytics-Cluster, Analytics-Kanban: Refactor MobileApps uniques HQL to use external table to format data [8 pts] - https://phabricator.wikimedia.org/T90730#1070307 (kevinator) [17:00:50] that was a great email, I starred it [17:01:18] Analytics-EventLogging, Analytics-Kanban: Investigate EventLogging Monitoring with Ops DBA - https://phabricator.wikimedia.org/T86200#1070311 (kevinator) This is blocked until we change the DB. [17:01:31] heh [17:05:59] Analytics: Upgrade daily/monthly aggregations of pageview dumps to new data files - https://phabricator.wikimedia.org/T90203#1070325 (Ottomata) Hm, unless it is very easy, I think you should hold off from making this change. pagecounts-all-sites uses the same pageview definitino as pagecounts-raw, except th... [17:10:17] qchris: holaaaaa, can i ask you a question? [17:10:28] !ask | nuria [17:10:28] nuria: Please feel free to ask your question: if anybody who knows the answer is around, they will surely reply. Don't ask for help or for attention before actually asking your question, that's just a waste of time---both yours and everybody else's. :) [17:10:34] qchris: jaja [17:10:41] ;-) [17:11:00] so what's your question? [17:11:01] qchris: rememeber the agreggator github depot that holds pageviews that are later shown in dashiki? [17:11:09] Analytics-Kanban: Analyze device class(mobile/desktop) and how it influences Edit Schema events {lion} - https://phabricator.wikimedia.org/T89728#1070344 (kevinator) a:Jdforrester-WMF [17:11:10] yup. [17:11:26] I guess it stalled for the last two days or so. 
[17:11:39] (Due to the cluster not producing all needed data) [17:11:40] qchriS; the permits on the files are such that apache now cannnot read them [17:12:11] https://www.irccloud.com/pastebin/Hy8APUqI [17:12:15] Analytics-Cluster, Analytics-Kanban: Estimate roughly of how many users might not have javascript capable/enable browsers, use CSS to crosscheck. - https://phabricator.wikimedia.org/T89847#1070351 (kevinator) p:Normal>Low [17:12:24] * qchris looks [17:12:56] No "x" for the directory. [17:13:17] but wait ... daily_temp ... [17:13:33] That does not look like a clone of the repo. [17:13:59] qchris: ah sorry, i think i pasted the wrong one, let me ssh [17:15:47] qchris: actually that is what i see on the machine now. I can reclone but for my life [17:16:08] i could not find the cron jobs that update this [17:16:25] Let me double-check on git.wikimedia.org [17:16:33] Analytics-EventLogging, Analytics-Kanban, Patch-For-Review: Reliable scheduler computes Visual Editor metrics [21 pts] {lion} - https://phabricator.wikimedia.org/T89251#1070361 (kevinator) [17:16:34] Analytics-EventLogging, Analytics-Kanban: Reliable scheduler collects Visual Editor deployments {lion} - https://phabricator.wikimedia.org/T89253#1070360 (kevinator) [17:17:05] https://git.wikimedia.org/tree/analytics%2Faggregator%2Fdata.git/3edec13e98dbbd14852aca7a5e0a0b5215af9ec1 [17:17:16] nuria: ^ is what the clone should look like [17:18:21] qchris: ok, will change, see permits on daily dir: drwxr-xr-x [17:18:44] If you've got that puppetized, you'd need to update that in puppet [17:18:45] qchris: i think we need the last x to be 'rx' [17:19:12] (the git::clone should provide an owner IIRC) [17:19:35] Yes, 'r-x' is the thing you want. [17:20:14] qchris: ok, so i did not find it on puppet but it is because it might not be there ... ahahajam [17:20:23] k :-D [17:20:36] Then chmod is your friend. [17:20:58] qchris: I swear i tried chmod with root and I could not change it ... 
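The permission point above (a directory needs both "r" and "x" for others before a web server running as another user can list and enter it) can be checked against a mode value. A minimal sketch using only Python's standard `stat` module; the function name is made up for illustration, and nothing here reflects how the host in question is actually configured.

```python
import stat


def others_can_list_and_enter(mode: int) -> bool:
    """True if the 'other' class has both read (list) and execute (enter) bits."""
    needed = stat.S_IROTH | stat.S_IXOTH
    return (mode & needed) == needed


if __name__ == "__main__":
    for mode in (0o755, 0o754, 0o750):
        # stat.filemode renders the familiar ls-style string for a directory.
        text = stat.filemode(stat.S_IFDIR | mode)
        print(text, others_can_list_and_enter(mode))
```

A mode of `drwxr-xr-x` passes the check, while a mode that grants others only "r" (or nothing) does not, which matches the "r-x is the thing you want" advice.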
[17:21:11] qchris: and then i was like .... what????? [17:21:28] Mhmm. Root should be able to do that. [17:21:36] What machine is that on? [17:21:47] qchris: wikimetrics1 [17:23:33] I tried [17:23:34] chmod o+r daily [17:23:41] as root in /srv/aggregator-data/projectcounts [17:23:42] and it worked? [17:23:46] and it worked. yes. [17:23:48] Weird. [17:23:55] man , ok, last nite retardation [17:24:25] But someone screwed up that repo. [17:24:36] It has untracked files. [17:24:38] Analytics-Cluster, Analytics-Kanban: Epic: qchris transition - https://phabricator.wikimedia.org/T86135#1070373 (kevinator) [17:24:42] ok, will fix, [17:24:57] my last question is where is the cron that updates It? [17:25:02] k. If you run into troubles with that, let me know. [17:25:08] No clue about the cron. [17:25:11] Let me look. [17:25:40] Not sure ... is there a cron that updates it? [17:25:56] seems to be by looking at git log [17:26:09] https://www.irccloud.com/pastebin/DkiUgkuO [17:26:17] k. [17:26:27] Are you running puppet on that host? [17:26:44] (Because puppet would update it IIRC) [17:27:11] Analytics-EventLogging, Analytics-Kanban: Reliable scheduler collects Visual Editor deployments [8 pts] {lion} - https://phabricator.wikimedia.org/T89253#1070388 (kevinator) [17:27:26] Yup. Seems like puppet is running. [17:27:45] Logs show things like: [17:27:48] Feb 26 07:59:07 wikimetrics1 puppet-agent[6804]: (/Stage[main]/Role::Wikimetrics/Git::Clone[aggregator_data]/Exec[git_pull_aggregator_data]/returns) executed successfully [17:28:02] So puppet is doing the repo updating for you. [17:28:04] nuria: ^ [17:28:32] qchris: Ok, will triple check today , fix repo and change permits, thank you. [17:28:38] Analytics-EventLogging, Analytics-Kanban: Reliable scheduler collects Visual Editor deployments [8 pts] {lion} - https://phabricator.wikimedia.org/T89253#1070391 (Milimetric) We can be relatively confident that the current git command gets a good approximation of "deployment". 
It will be a few days behin... [17:29:22] Relevant file in the puppet repo is manifests/role/wikimetrics.pp . [17:29:38] There is an "ensure => latest" that does the trick. [17:29:54] line 316. [17:29:56] yw. [17:31:12] Analytics-Engineering: udp2log: Announce new stream so people can compare streams - https://phabricator.wikimedia.org/T86205#1070397 (kevinator) @ottomata is working on turning off udp2log [17:35:13] Analytics-Cluster, Analytics-Kanban: Epic: qchris transition - https://phabricator.wikimedia.org/T86135#1070410 (kevinator) [17:36:19] Analytics, Analytics-Kanban: udp2log: Announce new stream so people can compare streams - https://phabricator.wikimedia.org/T86205#1070412 (kevinator) [17:38:20] qchris_away: http://dumps.wikimedia.org/other/mediacounts/daily/2015/ :) [17:38:44] ottomata: YOU THA BEST! \o/ [17:38:56] naw, YOU DA BEST [17:39:02] Now I only need to write some docs for mediacounts. [17:39:47] As reward ... I'll grab some food. (Cluster still catching up. But now catching up blazingly fast.) [17:39:51] :) [17:39:56] weird! [17:39:57] but good! [17:40:01] Hey ottomata, about jar verion [17:40:19] hmm, qchris_away [17:40:25] joal, ja? [17:40:34] so, i'm looking at one of the raw hours [17:40:42] 2015-02-25T21:00 [17:40:45] in sequence stats [17:41:15] Na, nothing ... sorry to bother [17:41:17] looks like a lot of missing data [17:41:27] ~50% for some hosts [17:41:42] i wonder if we might want to go back and reset camus and reload from kafka... [17:41:47] joal, naw, man, what's up? [17:44:33] Re-read your comment, found out that we were going for the same name :) [17:44:41] review comming [17:45:01] (PS2) Joal: Add refinery_jar_version as a parameter in oozie bundle properties. [analytics/refinery] - https://gerrit.wikimedia.org/r/192891 [17:45:05] :) [17:45:43] joal, looks good, i think you missed the change in bundle.properties! :p[ [17:45:57] NOWAYYYYYY ! 
[17:46:10] Sorry :) [17:46:19] np [17:46:45] (PS3) Joal: Add refinery_jar_version as a parameter in oozie bundle properties. [analytics/refinery] - https://gerrit.wikimedia.org/r/192891 [17:47:20] (CR) Ottomata: [C: 2 V: 2] Add refinery_jar_version as a parameter in oozie bundle properties. [analytics/refinery] - https://gerrit.wikimedia.org/r/192891 (owner: Joal) [17:47:54] About docs for the webrequest, I go and add the page to the Data Stream category, and move it back to Analytics roots ? [17:48:01] Ironholds: Woah. There's a user with ~ 10k VisualEditor edits on one wiki? Wow. [17:48:05] naw, i think we can keep Data/ [17:48:14] i already added to the category [17:48:17] so, you can just edit it [17:48:21] i moved the other pages into Data/ too [17:48:29] Riiiight [17:48:40] I need to F5 with you ;) [17:51:57] Anyone know what the max string length for a JSON value in EventLogging data is? Looks like some events are being discarded because the text I added is too long. [17:53:58] bearND: there's no hard limit, but certain browsers and proxy servers impose a limit on URL length [17:54:33] we append a ';' to the URL in EventLogging's JavaScript code as a way of detecting whether the URL was truncated [17:55:07] (since any organic semicolons in the data are encoded as "%3B") [17:55:23] bearND: what's the schema name? I can grep the look [17:55:25] *grep the log [17:56:45] (PS2) Joal: Add client ip and geocoded data to refined webrequest table. [analytics/refinery] - https://gerrit.wikimedia.org/r/192363 [17:59:44] (CR) Ottomata: [C: 2 V: 2] Add client ip and geocoded data to refined webrequest table. [analytics/refinery] - https://gerrit.wikimedia.org/r/192363 (owner: Joal) [18:00:32] Thx for the quick review ottomata [18:00:44] Analytics-Kanban, MediaWiki-General-or-Unknown, JavaScript: mw.user.generateRandomSessionId not so random - https://phabricator.wikimedia.org/T78449#1070504 (kaldari) Has anyone upstreamed this bug to Apple? 
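The truncation check described at 17:54 (append a ';' to the URL, knowing that any semicolon inside the data is encoded as "%3B") can be sketched as an encode/decode pair. This is an illustration of the idea only, not EventLogging's actual client or server code; `build_query` and `parse_query` are hypothetical names, and the event payload is invented.

```python
import json
from urllib.parse import quote, unquote


def build_query(event: dict) -> str:
    """Percent-encode the JSON payload and append a ';' end marker.

    With safe="" every ';' inside the data becomes '%3B', so a literal ';'
    can only ever be the marker at the very end.
    """
    return quote(json.dumps(event), safe="") + ";"


def parse_query(query: str):
    """Return the decoded event, or None if the marker is missing (truncated)."""
    if not query.endswith(";"):
        return None
    return json.loads(unquote(query[:-1]))


if __name__ == "__main__":
    query = build_query({"text": "some shared fact; with a semicolon"})
    print(parse_query(query))        # full query decodes back to the event
    print(parse_query(query[:-5]))   # simulated truncation is rejected
```

A long text field makes the encoded URL longer, so a browser or proxy that cuts it off would drop the trailing marker and the event would be discarded, which is consistent with the symptom reported in the channel.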
[18:00:45] When do you want us to go through the deploy process ? [18:01:45] Thanks ori. Schema name is MobileWikiAppShareAFact. [18:04:20] joal ahhhhhHHHH [18:04:37] um, after plan east coast hackathon meeting today? [18:04:39] ottomata: 5 mins [18:04:47] if that is too late for you, we can do tomorrow [18:05:36]