[01:04:37] Quarry, Labs, Labs-Infrastructure: Long-running query produces strangely incorrect results - https://phabricator.wikimedia.org/T135087#2287871 (Neil_P._Quinn_WMF)
[01:21:34] Quarry, Labs, Labs-Infrastructure: Long-running Quarry query (querry?) produces strangely incorrect results - https://phabricator.wikimedia.org/T135087#2287926 (Neil_P._Quinn_WMF)
[04:45:14] so the "log" database has 94508 tables now? impressive
[04:45:21] mysql:research@analytics-store.eqiad.wmnet [log]> SELECT COUNT(*) FROM information_schema.tables;
[04:50:42] ah, i mean the entire server
[04:50:52] (log has 351: SELECT COUNT(*) FROM information_schema.tables WHERE TABLE_SCHEMA='log';)
[05:23:48] Analytics, Wikipedia-Android-App-Backlog: Investigate recent decline in views and daily users - https://phabricator.wikimedia.org/T132965#2288226 (Tbayer) While this is being sorted out (any updates?), I have added an annotation to the [[https://vital-signs.wmflabs.org/#projects=all/metrics=Pageviews |Vi...
[07:38:26] Analytics-Kanban, EventBus, Patch-For-Review: Propose evolution of Mediawiki EventBus schemas to match needed data for Analytics need - https://phabricator.wikimedia.org/T134502#2288352 (mobrovac)
[07:50:39] joal: good morning
[07:50:46] joal: left some comments over at https://gerrit.wikimedia.org/r/#/c/288210/
[08:05:34] mobrovac: good morning
[08:05:43] hi elukey
[08:10:53] Good morning mobrovac !
[08:11:02] Thanks for the comments, I'll read them
[08:11:43] Good morning elukey !
[08:12:03] nothing better for the end of the week than some bike-shedding :)
[08:12:21] hm, it's only thursday though
[08:12:32] "only"
[08:12:36] mobrovac: I've been doing a lot of it for a while now too :)
[08:43:53] mobrovac: have a minute for some small bikeshedding?
[08:45:56] joal: Good morning joal!
[08:46:01] :)
[08:46:26] I worked a bit with Papaul yesterday and he should have a patch for my partman config
[08:46:35] soooo we might get the host reimaged today
[08:46:38] elukey: great !
[08:46:46] YAYYYY :)
[08:46:53] fingers crossed.. I am going to study the puppet config
[08:47:02] to add basic cassandra multi instance
[08:47:04] and that's it
[08:47:10] elukey: I am also bikeshedding with mobrovac on schemas, but it's good to have some technology moving on :)
[08:47:40] haha
[08:47:49] joal: i do have a moment
[08:48:14] mobrovac: great !
[08:48:18] mobrovac: https://hangouts.google.com/hangouts/_/wikimedia.org/a-batcave ?
[08:48:42] be there in 1 min
[08:48:47] cool
[09:08:31] joal: question about our dear friends oozie and hive - do we affect ongoing jobs if we restart their daemons on 1003?
[09:19:50] that's nothing urgent, but they should pick up the new java from the latest security update some time through a restart
[09:21:03] I would say that it is fine since oozie should just schedule to hadoop, so the in-flight ones will not be affected
[09:21:10] but Jo is the expert :)
[09:24:53] elukey: I think we do
[09:25:37] elukey: oozie should be ok if we ensure no change is ongoing and we stop it properly, but hive, if there are running queries, they'll die I think
[09:27:57] moritzm, elukey: easiest would be to pause the load job from oozie, wait for the currently running job to finish, then safely restart the daemons, and resume the jobs
[09:28:19] moritzm, elukey: Shall I go and pause the jobs ?
[09:32:37] joal: It would be good for me to learn how to do it
[09:32:50] elukey: no problem for me
[09:33:53] elukey: you let me know when :)
[09:36:27] joal: I was checking Hue :) Is it enough to stop the running load workflows or should I also go up to coordinators, bundles?
[09:36:31] * elukey still ignorant
[09:37:08] elukey: you shouldn't stop but suspend :)
[09:37:31] yes sorry!
[09:37:35] I meant to say suspend
[09:37:42] elukey: when doing an action at the top level, it is cascaded to the children: so suspending the bundle is the way to go :)
[09:38:17] ahhhhhhh goooooood
[09:39:01] yeah, oozie still has some reasonably good sides ;)
[09:39:10] elukey: brb
[09:42:09] joal: ok so I suspended webrequest-load-bundle and the two running workflows got suspended too. I checked the Coordinators and all the running ones are waiting for webrequest load sooo I guess that's it right?
[09:43:34] mmm no you mentioned letting the running jobs finish
[09:58:35] correct elukey, need to check yarn
[09:58:53] elukey: still ongoing jobs, let's wait for them to finish
[10:01:15] checking
[10:12:28] elukey: no hive left in yarn, you can proceed :)
[10:14:17] joal: all right, I was waiting for Hue's refresh but it is acting weird now
[10:14:48] I don't see any workflow/coordinator/bundle (keeps spinning)
[10:17:24] [12/May/2016 10:03:12 +0000] workflows ERROR failed to import workflow root
[10:17:31] RuntimeError: No hadoop file system to operate on.
[10:17:33] whaaaa
[10:18:18] elukey: no issue on my side, working fine
[10:18:38] ah no that error is also present way back, weird
[10:18:59] now it works
[10:19:01] -.-
[10:19:12] anyhow, proceeding with oozie/hive restarts
[10:20:23] !log restarted oozie, hive-* daemons on analytics1003 for java upgrades
[10:21:30] resumed the bundle too
[10:24:16] Yarn looks good, seeing new jobs flowing
[10:26:18] Indeed seems good elukey :)
[10:26:27] Thanks for keeping our java up-to-date !
[10:28:38] confirmed, all processes on 1003 use the new java now
[10:29:14] joal: kudos to moritzm for keeping ALL the wikimedia daemons up to date :P
[10:29:27] +1 elukey, thanks moritzm !
[10:31:02] moritzm: https://wikitech.wikimedia.org/wiki/Service_restarts#Oozie and https://wikitech.wikimedia.org/wiki/Service_restarts#Hive
[10:32:03] elukey: nice, thanks!
[10:33:19] could you maybe amend whether the two hive server components can be restarted in arbitrary order or not?
[10:34:44] moritzm: ah yeah I'll do it, probably it doesn't matter but I'd need to research a bit. I restarted the metastore first then the server, but it shouldn't matter
[10:39:20] thanks
[11:01:07] elukey: still a lot ongoing with misc-caches, isn't it?
[11:02:36] joal: they have been completely migrated to V4 but there is an ongoing problem and some caches are being restarted
[11:02:57] (new Varnish version + purges)
[11:02:57] yup, can follow that on our data miss alerts :)
[11:33:32] Analytics: Pageview API: Limit (and document) size of data you can request - https://phabricator.wikimedia.org/T134524#2288722 (JAllemandou) @GWicke : Some more data and more or less expected results. - In current requests patterns, ~80% of requests are for fresh data - (end date either today or yesterday)...
[11:33:53] AFK for some time a-team
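(Editor's sketch: the suspend / drain / restart / resume sequence worked out above, condensed into a script. A hedged illustration only — the bundle ID, Oozie URL, and service names are placeholder assumptions; the oozie and yarn CLI invocations are the stock ones.)

```python
# Sketch of the procedure discussed above; adjust IDs and hosts for the real cluster.
import subprocess
import time

OOZIE_URL = "http://analytics1003.eqiad.wmnet:11000/oozie"  # assumed
BUNDLE_ID = "0000000-000000000000000-oozie-oozi-B"          # placeholder bundle ID

def oozie_job(action):
    # e.g. `oozie job -oozie <url> -suspend <bundle-id>`; acting on the
    # bundle cascades to its coordinators and workflows
    subprocess.check_call(["oozie", "job", "-oozie", OOZIE_URL, action, BUNDLE_ID])

def hive_still_running():
    # wait until no Hive applications remain in YARN
    out = subprocess.check_output(["yarn", "application", "-list", "-appStates", "RUNNING"])
    return b"hive" in out.lower()

oozie_job("-suspend")
while hive_still_running():
    time.sleep(60)
for svc in ("oozie", "hive-metastore", "hive-server2"):  # names assumed from CDH packaging
    subprocess.check_call(["sudo", "service", svc, "restart"])
oozie_job("-resume")
```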
[11:48:15] mobrovac: whenever you have time would you mind telling me whether https://gerrit.wikimedia.org/r/#/c/288373/2 is a bad idea for deploying AQS on the new hosts without affecting the current cluster? The goal is to test cassandra compactions and possibly load-test the cluster
[11:49:36] elukey: i'm deep in some stuff, will take a look at it after that
[11:49:51] elukey: to be safe, add me as a reviewer there
[11:50:14] mobrovac: sure sure, whenever you have time, nothing urgent
[11:50:20] kk
[11:50:38] thanks!
[12:16:01] Analytics, Wikipedia-Android-App-Backlog: Investigate recent decline in views and daily users - https://phabricator.wikimedia.org/T132965#2215521 (dr0ptp4kt) Does https://gerrit.wikimedia.org/r/#/c/285051/ ({T133204}) address the header enrichment? What do the stats say now?
[13:34:32] hmmm misc misc misc misc
[13:35:49] ottomata: o/
[13:36:00] hiyaaa
[13:39:26] am looking into the dataloss alerts
[13:39:29] not sure what's happening yet
[13:43:02] ottomata: I think that there was a re-install of varnish 4 due to a misc bug this morning
[13:43:12] but they are playing with VCL for some issue
[13:44:12] elukey: huh so are they restarting varnish and vk is having trouble?
[13:44:33] ottomata: mmm I didn't see any alert for vk
[13:44:39] oh me neither
[13:44:48] but i'm trying to explain why there is consistent loss on misc
[13:44:51] ottomata: I also restarted oozie and hive today for java upgrades, but stopped the jobs first with joal
[13:44:52] minimal loss
[13:45:16] * elukey invokes ema
[13:45:19] hmmm ok
[13:45:19] ---^
[13:45:37] also fairly consistent loss on maps
[13:45:40] are things changing there too?
[13:45:56] not that I know
[13:46:05] very small loss though
[13:46:30] what does "loss" mean in this context? I mean, practically what is happening?
[13:48:00] the sequence stats check oozie jobs are sending warning emails
[13:48:21] they calculate loss by comparing max_seq - min_seq to count(*) in an hour
[13:48:45] not counting cases where min_seq == 0, because that signifies a vk restart
[13:50:07] ahh ok so from vk
[13:51:35] ja
[13:51:51] each vk instance logs an incrementing seq for each message
[13:51:51] so
[13:51:56] if all is well, during any given hour
[13:52:00] yep yep
[13:52:04] max_seq - min_seq == count(*)
[13:52:06] well
[13:52:09] count(host)
[13:52:20] group by hostname, whateverrr :)
[13:52:29] rrrrrr
[13:52:32] :)
[13:55:35] small loss here is something around 1 percent or less
[13:56:29] for misc it looks like 1% is around 1000 messages per hour
[14:09:23] ottomata: when you get a chance can you kick ldaptestaccount123/chase.mp@gmail from https://gerrit.wikimedia.org/r/#/admin/groups/833,members and add maven-release-user/maven-release-user@wikimedia.org instead?
[14:09:56] back in our channel: ottomata would you mind also pasting the hive queries that you are using somewhere? (not now ofc, whenever you have a min)
[14:10:31] madhuvishy: i think it doesn't like that / in the name
[14:10:35] maven-release-user/maven-release-user@wikimedia.org
[14:10:39] ohhh
[14:10:40] gives me 500 error
[14:10:42] ottomata: no no
[14:10:46] just username/email
[14:10:48] oh
[14:10:55] ?
[14:10:59] maven-release-user@wikimedia.org
[14:10:59] ?
[14:11:04] the username is maven-release-user, email is maven-release-user@wikimedia.org
[14:11:04] ya
[14:11:14] i guess you only need email
[14:11:18] ottomata: ^
[14:11:44] hm, madhuvishy it's not letting me do that either!
[14:11:47] will come back to it.. :/
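(Editor's sketch: the invariant ottomata describes above, as a toy function. Per host, the expected message count is max_seq - min_seq — strictly that is off by one, but it is noise at these volumes — and hours where min_seq == 0 are skipped because they indicate a varnishkafka restart. This is an illustration only; the real check is the Hive-based oozie job.)

```python
from collections import defaultdict

def percent_lost_per_host(messages):
    """messages: iterable of (hostname, seq) pairs seen in one hour.
    Mirrors the sequence-stats check: compare max_seq - min_seq to the
    number of messages actually received, grouped by hostname."""
    seqs = defaultdict(list)
    for host, seq in messages:
        seqs[host].append(seq)
    report = {}
    for host, s in seqs.items():
        if min(s) == 0:  # varnishkafka restart, counter reset: skip this host
            continue
        expected = max(s) - min(s)
        lost = max(expected - len(s), 0)
        report[host] = 100.0 * lost / expected if expected else 0.0
    return report
```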
[14:11:54] ottomata: ya np - no hurry
[14:11:59] elukey: haha, yeah i was trying to use hue to save some queries and graph them, but it wasn't really working, dunno why :/
[14:12:11] i don't use hue much at all, so i don't know if it is supposed to work
[14:12:12] heh
[14:12:24] elukey: example
[14:12:24] select CONCAT(webrequest_source, ":", year, month, day, hour) as d, count_lost, percent_lost from webrequest_sequence_stats_hourly where year=2016 and month = 5 and day >= 8 and percent_lost != 0.0;
[14:12:50] ottomata: the new hue is kinda weird - it sometimes starts the tasks and they show up in running jobs but doesn't populate it in the UI below
[14:16:13] elukey: looking at this now
[14:16:14] select CONCAT(webrequest_source, ":", year, '-', month, '-', day, '.', hour) as d, hostname, percent_different from webrequest_sequence_stats where webrequest_source='maps' and year=2016 and month = 4 and day in (1,2,3,4, 23,24,25,26,27,28,29) and percent_different < 0.0 and sequence_min != 0;
[14:16:39] looking at per host loss for maps on april 1-4 and april 23-29
[14:16:47] not much before april 4
[14:17:03] then starting april 4 cp1043 and cp1044 show up
[14:17:23] then starting april 26 more hosts show up with loss
[14:18:08] very nice, definitely something related to vk and v4 then
[14:19:09] I am wondering how to track down what requests are lost
[14:19:12] maybe there is a pattern
[14:20:25] ottomata, elukey : Maybe it's not real loss but index issues (like a request generates a new index value but doesn't need to generate a log line?)
[14:20:39] could be indeed!
[14:21:43] joal: sorry i missed your point :( can I add --verbose?
[14:22:46] joal, hi!
[14:23:49] joal, I think unique devices are not being calculated or inserted into cassandra any more, because the API returns data only until 2016-05-04
[14:24:12] :[[[[[
[14:24:24] btw, good afternoon :]
[14:27:26] mforns: o/
[14:27:33] hi elukey :]
[14:31:23] joal: you got a sec to talk about the aqs failing tests?
[14:31:52] oh, aqs uniques not being loaded sounds more important, never mind, look at that first
[14:33:32] milimetric, mforns , elukey Hey !
[14:33:39] hello joal :]
[14:33:41] joal: too many people!
[14:33:42] :P
[14:33:45] in order: mforns for uniques :)
[14:33:56] yes! I won!
[14:33:57] Let me double check mforns
[14:34:00] :D
[14:35:44] Good catch mforns !!!! I must have made a mistake when restarting jobs last week: 2 monthly but no daily
[14:36:56] !log Start cassandra unique devices loading oozie job backfilling from 2016-05-05 included onward
[14:37:30] oozie server restarted: small ID numbers ;)
[14:37:52] joal, thanks for looking, it would have taken a while for me to look into it
[14:38:10] no problem mforns, thanks for spotting it !
[14:38:18] :]
[14:38:27] next
[14:38:28] :]
[14:38:48] huhu mforns
[14:39:23] elukey: more in detail: We compute loss / duplication from autoinc numbers sent by VK
[14:39:47] elukey: Let's imagine a request generates such a number but doesn't generate a log line ...
[14:40:18] elukey: We interpret that as dataloss, but in fact, this request not sending a line could have been the correct behaviour
[14:40:25] elukey: makes sense?
[14:41:03] joal: yep yep but I was a bit puzzled by the "generates a number but not a log"
[14:41:11] elukey: I don't know enough of VK to know if this way of thinking could apply
[14:41:43] yeah, that's the thing elukey, I don't know if the latter could be a valid Varnish behaviour
[14:41:43] I'll ask Andrew!
[14:41:47] cool :)
[14:41:59] milimetric: I'm all for you now, other things done :)
[14:42:08] wow, that was fast
[14:42:19] ok, so if you run npm test do your tests pass?
[14:42:30] hmmm, I will test !
[14:42:42] (PS3) Mforns: Fix unique devices bugs [analytics/dashiki] - https://gerrit.wikimedia.org/r/288104 (https://phabricator.wikimedia.org/T122533)
[14:44:37] milimetric: They pass for me, yes
[14:44:42] milimetric: batcave?
[14:45:09] sure
[14:46:49] (PS4) Mforns: Fix unique devices bugs [analytics/dashiki] - https://gerrit.wikimedia.org/r/288104 (https://phabricator.wikimedia.org/T122533)
[14:48:38] changing locations, back in a bit
[14:48:45] (CR) Nuria: "Looks good. I think it will be worth it to add a small unit test for 1)" [analytics/dashiki] - https://gerrit.wikimedia.org/r/288104 (https://phabricator.wikimedia.org/T122533) (owner: Mforns)
[14:49:30] nuria_, hi! thanks for the review, have you seen the latest patch? I did yet another change to the commit message
[14:49:37] there were 3 bugs, not 2
[14:50:12] ok ok I see you commented on patch set 4
[14:50:25] mforns: yes, just looked.
[14:51:05] mforns: changes look good, my comment was to further support what we were talking about: immutability of objects that are read from configuration
[14:51:39] nuria_, agree, will do
[14:51:47] mforns: but not on thsi patch
[14:51:51] *this
[14:52:01] we can do those changes at another time
[14:52:07] nuria_, but I will add a unit test
[14:52:13] k
[14:59:50] Analytics-Kanban, Continuous-Integration-Config: Add a maven-release user to Gerrit {hawk} - https://phabricator.wikimedia.org/T132176#2289305 (madhuvishy)
[15:00:32] mforns: backfilled, you should has DATAZ !
[15:00:42] joal, thanks!
[15:04:23] (PS2) Milimetric: Support YYYYMMDDHH format for the unique endpoint [analytics/aqs] - https://gerrit.wikimedia.org/r/288264 (https://phabricator.wikimedia.org/T134840)
[15:12:53] Analytics-Kanban, Operations, ops-eqiad, Patch-For-Review: rack/setup/deploy aqs100[456] - https://phabricator.wikimedia.org/T133785#2289342 (elukey) a:Cmjohnson>elukey
[15:18:06] (PS3) Milimetric: Support YYYYMMDDHH format for the unique endpoint [analytics/aqs] - https://gerrit.wikimedia.org/r/288264 (https://phabricator.wikimedia.org/T134840)
[15:22:39] (PS5) Mforns: Fix unique devices bugs [analytics/dashiki] - https://gerrit.wikimedia.org/r/288104 (https://phabricator.wikimedia.org/T122533)
[15:26:09] (CR) Mforns: Fix unique devices bugs (1 comment) [analytics/dashiki] - https://gerrit.wikimedia.org/r/288104 (https://phabricator.wikimedia.org/T122533) (owner: Mforns)
[15:26:43] nuria_, milimetric, the patch is finally ready for review again ^
[15:33:58] milimetric: here ?
[15:34:08] yep, what's up
[15:35:25] read your code review, tested it: works fine. I have a suggestion: https://gist.github.com/jobar/017c7d4ae48a1ebf32721ec26668ac36
[15:35:28] milimetric: --^
[15:35:53] milimetric: used that piece of code to actually test the thing :)
[15:36:18] joal: sure, that works for me. I gotta finish up something else but I can patch it after our meetings today
[15:36:41] I'll also ping mobrovac because the problem we see with the cassandra string index doesn't happen with sqlite (so the test only means something when used with cassandra for the moment)
[15:36:46] milimetric: --^
[15:36:53] Cool ! Thanks milimetric :)
[15:41:40] whoo elukey i don't see any ISR shrinks since yesterday's upgrade!
[15:41:54] * elukey dances
[15:43:58] elukey: any objections to restarting one broker with inter-broker protocol version 0.9?
[15:44:32] (CR) Joal: [C: 2 V: 2] "LGTM !" [analytics/aqs] - https://gerrit.wikimedia.org/r/288264 (https://phabricator.wikimedia.org/T134840) (owner: Milimetric)
[15:44:47] thanks milimetric!
[15:45:30] ottomata: LGTM
[15:45:34] uh... but I didn't change the test like you wanted joal
[15:45:56] milimetric: doesn't matter: your version is even better :)
[15:46:25] I just don't know if node tests get run in sequence (and therefore you can reuse inserted data) or not
[15:46:29] milimetric: --^
[15:46:55] Hmm, interesting, dcausse hiya
[15:47:03] just happened to be looking at some kafka broker logs
[15:47:04] and saw
[15:47:06] Topic and partition to exceptions: [mediawiki_CirrusSearchRequestSet,8] -> kafka.common.MessageSizeTooLargeException (kafka.server.KafkaApis)
[15:47:23] damn :(
[15:47:24] default max message size is 1mb
[15:47:28] we can increase it
[15:47:38] ouch more than 1mb
[15:47:43] it was just one message, but i betcha it happens now and then
[15:47:49] but, maybe it shouldn't be so large? dunno
[15:47:53] I would not have expected that
[15:47:59] man, 1mb of search requests :)
[15:48:03] :)
[15:48:14] we just added the results displayed in the logs
[15:48:33] actually dcausse i am seeing a few of those
[15:48:40] not a lot
[15:48:40] dcausse: that explains it then !
[15:48:41] oh maybe some crazy api requests with limit=5000
[15:48:42] but they are in there
[15:49:05] maybe we need to limit what we log here
[15:49:09] thanks for the heads up
[15:49:15] uhhhhh
[15:49:16] also
[15:49:18] just noticed
[15:49:18] [KafkaApi-13] Closing connection due to error during produce request with correlation id 0 from client id kafka-php with ack=0
[15:49:22] ack=0
[15:49:22] ?
[15:49:25] you sure you want that?
[15:49:37] I don't know what it means :/
[15:49:57] ebernhardson: maybe knows?
[15:50:48] ack=0 means that the producer won't wait for the broker to acknowledge that it has accepted the produce request
[15:50:59] ack=-1 means all replicas
[15:51:04] ack=1 just the leader replica
[15:51:27] search logs are UPD style with kafka :)
[15:51:38] *UDP sorry
[15:52:38] dcausse: i would recommend at least acks=1, but acks=0 will make the kafka produce request faster for sure
[15:52:51] ottomata: thanks, for webrequests you use acks=1 ?
[15:52:56] so if you are trying to be done with producing to kafka as fast as possible, since you are doing so in the client request, maybe it makes sense
[15:53:08] hmm
[15:53:12] I have no idea unfortunately
[15:53:21] ja we use 1
[15:53:34] message size too large :S we did just increase the size of the messages but i didn't even know kafka had a limit
[15:54:41] Attempting Boot From Hard Drive (C:)
[15:54:44] ....
[15:54:46] ebernhardson: it is a config limit
[15:54:48] ebernhardson: I think it's caused by some very high limits used by internal api consumers
[15:54:48] can be increased
[15:54:54] also can be increased per topic
[15:55:16] but I think we don't need to store all the results, top 20 seems to be sufficient imho
[15:55:21] i'll check into changing the ack settings
[15:55:29] at least then we get errors on both ends
[15:56:18] looks easy enough to set, we left it with the default of 0
[15:56:59] k
[15:57:01] cool
[15:57:01] :)
[15:59:19] i just need to test what kafka sends back when it fails :)
[16:01:31] elukey: coming to standup
[16:01:32] ?
[16:05:35] ottomata: can you try again to add maven-release-user@wikimedia.org (I believe the gerrit registration was incomplete before - again no hurry)
[16:07:15] madhuvishy: worked!
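(Editor's sketch: the two producer knobs discussed above — acks and message size — written against today's kafka-python API; the 2016 kafka-php producer in the log is configured analogously. The broker address is a placeholder.)

```python
from kafka import KafkaProducer
from kafka.errors import MessageSizeTooLargeError

producer = KafkaProducer(
    bootstrap_servers="kafka1001.example.wmnet:9092",  # placeholder broker
    acks=1,                    # wait for the leader replica; acks=0 is fire-and-forget
    max_request_size=1048576,  # client-side cap, matching the broker's 1 MB default
)
# Broker side the limit is message.max.bytes, overridable per topic with
# max.message.bytes, as mentioned in the discussion above.
try:
    producer.send("mediawiki_CirrusSearchRequestSet", b"...")
    producer.flush()
except MessageSizeTooLargeError:
    # with acks=0 an oversized message is just dropped on the floor;
    # with acks>=1 the producer at least hears about the failure
    pass
```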
[16:07:20] coool
[16:07:30] madhuvishy: standup?
[16:07:52] ottomata: at hacker school - don't have a good place to join from
[16:08:15] aye cool
[16:12:20] Analytics-Kanban, Continuous-Integration-Config: Add a maven-release user to Gerrit {hawk} - https://phabricator.wikimedia.org/T132176#2289575 (madhuvishy) maven-release-user (maven-release-user@wikimedia.org) has been created, credentials are available on jenkins, and the right gerrit permissions have b...
[16:14:55] Analytics-Kanban: Create separate archiva credentials to be loaded to the Jenkins cred store {hawk} - https://phabricator.wikimedia.org/T132177#2289581 (madhuvishy) a:madhuvishy>Ottomata
[16:14:58] Analytics-Kanban: Create separate archiva credentials to be loaded to the Jenkins cred store {hawk} - https://phabricator.wikimedia.org/T132177#2190727 (madhuvishy) @Ottomata Assigning this to you since I don't have powers to make a user.
[16:18:28] a-team: for my jenkins update - https://review.openstack.org/#/c/313196/ got +1-ed! Don't know what happens next but i'm hoping someone will merge it soonish. I had some user creation stuff pushed through with OIT's and ottomata's help - and that task is in Done now.
[16:25:14] joal: it seems that I hit a bug in boot after installing jessie.. those hosts and I are not getting along :P
[16:25:23] Analytics-Kanban, Patch-For-Review: Client values inbound in X-analytics header (pageview and preview) are reflected in outbound X-Analytics on varnish - https://phabricator.wikimedia.org/T133204#2225002 (Nuria) Confirmed these changes make expected headers appear on cluster cc @Tbayer
[16:27:40] Analytics, Wikipedia-Android-App-Backlog: Investigate recent decline in views and daily users - https://phabricator.wikimedia.org/T132965#2215521 (Nuria) @dr0ptp4kt : yes, varnish code publishes appropriate headers to x_analytics field, I confirmed those are present in cluster data.
[16:28:28] elukey: I trust you'll tame them, then everything will be fine :)
[16:28:56] Analytics: Pageview API: Limit (and document) size of data you can request - https://phabricator.wikimedia.org/T134524#2289654 (GWicke) > In current requests patterns, ~80% of requests are for fresh data - (end date either today or yesterday) From what I have seen, there is still a significant spread of val...
[16:31:01] nice madhuvishy!
[16:36:08] Analytics: Test cassandra compactions on new AQS nodes - https://phabricator.wikimedia.org/T135145#2289668 (JAllemandou)
[16:39:54] ottomata: :D Also, I assigned https://phabricator.wikimedia.org/T132177 to you
[16:41:08] k
[16:41:54] Analytics-Kanban, EventBus, Patch-For-Review: Propose evolution of Mediawiki EventBus schemas to match needed data for Analytics need - https://phabricator.wikimedia.org/T134502#2289699 (Nuria) CR in progress, use code to talk, no need to task.
[16:43:41] mobrovac: about my code review - aqs::seeds: needs to contain the cassandra instance domains or the host domains?
[16:43:51] I thought the latter
[16:44:12] cass instances too
[16:44:18] ah ok!
[16:44:24] those are used by the driver to connect to the cass nodes directly
[16:44:39] Analytics: Create edit data schemas for anaylitcs - https://phabricator.wikimedia.org/T134793#2289728 (mforns)
[16:45:22] Analytics: Create edit data hadoop/druid schemas for anaylitcs - https://phabricator.wikimedia.org/T134793#2277198 (mforns)
[16:45:34] Analytics: Create edit data hadoop/druid schemas for anaylitcs - https://phabricator.wikimedia.org/T134793#2277198 (Nuria) Build a schema that would help us do analytics. How do we represent that data into druid? and before that, how do we represent that data into hadoop?
[16:45:36] mobrovac: fixed the code review thanks
[16:46:50] mobrovac: my main concern was not to affect the main AQS cluster with this config but to re-use the aqs role for testing
[16:47:44] oh elukey, one more thing i forgot to write on the PS
[16:47:54] you need to change the cassandra cluster name as well
[16:48:10] ahhh nice I forgot to ask, I thought it was confd stuff
[16:48:10] i believe it's cassandra::cluster_name or some obvious var name like that
[16:48:17] sure sure
[16:48:56] kk, i'm off
[16:48:59] time to go bowling
[16:49:10] good bowling :)
[16:49:13] :)
[16:49:22] elukey: you're becoming french!
[16:49:33] too much joal influence :))))))
[16:59:12] Analytics: Wikistats 2.0. Edit Reports: Setting up a pipeline to source Historical Edit Data into hdfs {lama} - https://phabricator.wikimedia.org/T130256#2289828 (Nuria) 1. DESIGN: 1.1 First team needs to internally define schemas that are to be used to calculate metrics. These are not event-based schema bu...
[16:59:29] Analytics: Create edit data hadoop/druid schemas for anaylitcs - https://phabricator.wikimedia.org/T134793#2289833 (Nuria)
[16:59:31] Analytics: Wikistats 2.0. Edit Reports: Setting up a pipeline to source Historical Edit Data into hdfs {lama} - https://phabricator.wikimedia.org/T130256#2289832 (Nuria)
[16:59:36] Analytics-Kanban: Wikistats 2.0. Edit Reports: Setting up a pipeline to source Historical Edit Data into hdfs {lama} - https://phabricator.wikimedia.org/T130256#2289834 (JAllemandou)
[17:02:05] Analytics: Create edit data hadoop/druid schemas for anaylitcs - https://phabricator.wikimedia.org/T134793#2289854 (Nuria) This task is about schema Design. 1.1 First team needs to internally define schemas that are to be used to calculate metrics. These are not event-based schema but data flowing in them c...
[17:04:23] Analytics: Spike - Slowly Changing Dimensions on Druid - https://phabricator.wikimedia.org/T134792#2289857 (Nuria)
[17:07:35] Analytics: Spike - Slowly Changing Dimensions on Druid - https://phabricator.wikimedia.org/T134792#2289863 (Nuria)
[17:13:47] Hi ATeam! :D
[17:14:31] Will this user agent definitely get marked as a spider? "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)" and is there a way to check if a request has been marked as a spider in hadoop?
[17:18:20] Analytics: Create edit data hadoop/druid schemas for anaylitcs - https://phabricator.wikimedia.org/T134793#2289898 (Nuria) There are three entities: Page, User and Revision.
[17:25:04] Analytics: Test cassandra compactions on new AQS nodes - https://phabricator.wikimedia.org/T135145#2289668 (Nuria) - Basic puppet configuration /deploy / - load data into cassandra -test (via restbase hopefully)
[17:25:37] Analytics: Test cassandra compactions on new AQS nodes - https://phabricator.wikimedia.org/T135145#2289913 (Nuria)
[17:25:39] Analytics-Kanban, Datasets-Webstatscollector, RESTBase-Cassandra, Patch-For-Review: Better response times on AQS (Pageview API mostly) {melc} - https://phabricator.wikimedia.org/T124314#2289912 (Nuria)
[17:37:16] addshore: Hi!
[17:37:27] addshore: yes there is definitely a way to check
[17:37:40] * madhuvishy looks up the udf
[17:38:59] :)
[17:41:40] Analytics: Test cassandra compactions on new AQS nodes - https://phabricator.wikimedia.org/T135145#2290040 (Nuria) Puppet is almost done, tasked assuming that metrics show up on grafana, otherwise we need more work
[17:42:01] addshore: this is the regex that is being matched https://github.com/wikimedia/analytics-refinery-source/blob/0203ffc79f9ba967d26a73ba1012a97383199296/refinery-core/src/main/java/org/wikimedia/analytics/refinery/core/Webrequest.java#L58
[17:42:08] in your case
[17:42:22] it looks like Yahoo will definitely get it marked
[17:42:25] so that should match
[17:42:28] interesting
[17:43:32] basically Wikidata got 87k hits to Special:RecentChangesLinked and I am trying to track down why!
[17:43:43] yesterday and the day before at least!
[17:46:52] and a quick query on hive showed the yahoo UA as top of the list (but my query could be bad) ;)
[17:47:43] Analytics-Kanban: Test cassandra compactions on new AQS nodes - https://phabricator.wikimedia.org/T135145#2290074 (JAllemandou)
[17:47:52] Analytics-Kanban: Spike - Slowly Changing Dimensions on Druid - https://phabricator.wikimedia.org/T134792#2290077 (JAllemandou)
[17:48:02] you can also verify any user agent with the UDF -
[17:48:06] Analytics-Kanban: Create edit data hadoop/druid schemas for anaylitcs - https://phabricator.wikimedia.org/T134793#2290078 (JAllemandou)
[17:48:29] https://www.irccloud.com/pastebin/5miglWgd/
[17:48:42] addshore: aah
[17:48:58] addshore: and it wasn't marked spider?
[17:48:59] madhuvishy: Hiiii :)
[17:49:04] joal: hello :D
[17:49:44] madhuvishy: Quick phab question for you https://phabricator.wikimedia.org/T130123, is currently in "To Task", but I think it's the one you almost finished, no ?
[17:50:03] joal: I just discovered that ADD JAR doesn't seem to work on beeline - need to look into it at some point
[17:50:10] joal: oh that's different
[17:50:59] this is for jenkins to actually upload the jars to our stat box at /srv/deployment/analytics/refinery
[17:51:01] i think
[17:51:12] madhuvishy: wooooow, the ADD JAR thing makes me feel bad !
[17:51:20] oh, ok madhu !
[17:51:32] Leaving it in "To Task" then :)
[17:51:36] Thanks
[17:51:51] madhuvishy: how do I see that? ;)
[17:52:03] joal: i think there are two parts of the jenkins project - make it deploy to archiva, then make it actually upload the jars to where we use it
[17:52:10] oh wait, now I see the paste...
[17:52:36] addshore: uhhh, how do you see if tagged as spider? also select the agent-type column
[17:52:52] agent-type is in webrequest? :D
[17:52:57] addshore: yes yes
[17:53:01] cool ;)
[17:53:12] I should have gone and found the schema again I guess and taken a look!
[17:53:17] https://wikitech.wikimedia.org/wiki/Analytics/Data/Webrequest
[17:53:20] agent_type
[17:53:56] madhuvishy: My guess about the ADD JAR thing is that you need to add an HDFS jar, or that the jar should be accessible on the hive server <-- ottomata
[17:53:57] running again, let's see
[17:54:29] use the udf if you want to find if any arbitrary UA string will be marked as spider or not. otherwise, the agent_type column should be enough
[17:55:02] joal: ah yeah maybe
[17:56:35] hmm madhuvishy yeh it says spider
[17:57:24] does 'desktop' include spiders in the top pageviews api endpoint?
[17:57:57] right now I'm guessing that is the case
[17:59:14] addshore: I'd think so. joal can confirm. I guess there is no agent-type split for that endpoint
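(Editor's sketch: the check addshore is after, in miniature. The authoritative pattern is the Java regex in Webrequest.java linked above, applied by a refinery Hive UDF and materialized as the agent_type column; the regex below is a deliberately simplified stand-in for illustration, NOT the production one.)

```python
import re

# Simplified stand-in for the refinery spider pattern (not the real regex)
SPIDER_RE = re.compile(r"(bot|spider|crawler|slurp|wordpress)", re.IGNORECASE)

def agent_type(user_agent: str) -> str:
    return "spider" if SPIDER_RE.search(user_agent) else "user"

ua = "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"
print(agent_type(ua))  # -> spider, matching what the webrequest data shows
```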
[17:59:55] addshore, madhuvishy : top endpoint of API doesn't contain spiders
[18:00:03] oh interesting
[18:00:11] No agent split, but user only data
[18:00:26] then I'm very confused as to where these 87k views have come from ;)
[18:00:29] thanks joal
[18:00:52] addshore: when looking into webrequest you can have access to referer
[18:00:52] I guess I should dive into the page view tables!
[18:01:06] but that's a huge scan
[18:01:12] addshore: maybe query for top UA's that are not agent_type spider from webrequest?
[18:01:30] addshore: My way of doing it: pinpoint an hour when it occurs, then only query that hour
[18:01:35] madhuvishy: well in 1 hour the top user agent is yahoo with 4.5k, the next down is 675 then 165 and then 7
[18:01:44] actually, top UA's that are not spider won't give you anything
[18:01:51] right
[18:02:05] referrer makes more sense
[18:02:49] ottomata: Do we have the researcher meeting ?
[18:04:18] coming!
[18:04:48] Analytics: Pageview API: Limit (and document) size of data you can request - https://phabricator.wikimedia.org/T134524#2290184 (Nuria) >Introducing fixed windows (single day, single month) could eliminate this fragmentation. But wait, at the cost of reducing functionality right? as it is not the same to ask...
[18:06:17] going offline, byyeee!
[18:06:22] Bye
[18:06:38] bye elukey!
[18:10:46] reports in #-operations that /api/rest_v1/metrics/pageviews/ is down
[18:10:59] bd808: ah!
[18:11:19] a-team: ^^
[18:11:32] milimetric: ^
[18:11:48] ottomata: my internet is somehow not working with Hangout.
[18:11:49] it may not be your fault, but help debugging would be useful
[18:11:52] Are y'all in the meeting?
[18:11:55] hmmm
[18:11:58] ya we here
[18:12:12] (CR) Nuria: Fix unique devices bugs (1 comment) [analytics/dashiki] - https://gerrit.wikimedia.org/r/288104 (https://phabricator.wikimedia.org/T122533) (owner: Mforns)
[18:14:11] ottomata: milimetric it looks like pageview api is down
[18:17:30] madhuvishy: why?
[18:18:08] joal: They are talking about it in #ops
[18:20:20] so joal madhuvishy I just went and looked at all of the data on one of the days and grouped by agent_type
[18:20:29] 299 users and 123k spiders
[18:20:44] something is up ;) but must dash to dinner now! back in a bit!
[18:28:02] a-team, I'm off for today!
[18:29:03] bye joal.
[18:32:56] back!
[18:34:47] madhuvishy: I'm guessing I should file a bug? ;)
[18:35:03] addshore: what's up
[18:35:26] see my messages just above! :)
[18:35:38] only 299 users loaded the page and 123k spiders
[18:35:59] addshore: 123k requests from spiders?
[18:36:01] oh
[18:36:03] yup
[18:36:18] and it shows up in the top endpoint with 73k views you say?
[18:36:39] uhhh 87k
[18:36:44] (CR) Mforns: Fix unique devices bugs (1 comment) [analytics/dashiki] - https://gerrit.wikimedia.org/r/288104 (https://phabricator.wikimedia.org/T122533) (owner: Mforns)
[18:36:53] https://wikimedia.org/api/rest_v1/metrics/pageviews/top/wikidata/all-access/2016/05/10
[18:37:23] yup, (my query was the 10th)
[18:38:20] so it lists 87k via the api, when looking at webrequest I get 299 users and 123k spiders, matching with /wiki/Special:RecentChangesLinked% on wikidata only
[18:38:34] addshore: interesting.
[18:38:51] addshore: is the per-article endpoint giving the right thing?
*looks*
[18:40:11] {"items":[{"project":"wikidata","article":"Special:RecentChangesLinked","granularity":"daily","timestamp":"2016051000","access":"all-access","agent":"all-agents","views":91400}]}
[18:40:26] https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/wikidata/all-access/spider/Special%3ARecentChangesLinked/daily/20160510/20160510
[18:40:33] {"items":[{"project":"wikidata","article":"Special:RecentChangesLinked","granularity":"daily","timestamp":"2016051000","access":"all-access","agent":"spider","views":72}]}
[18:40:37] spider only shows 72 :P
[18:41:07] {"items":[{"project":"wikidata","article":"Special:RecentChangesLinked","granularity":"daily","timestamp":"2016051000","access":"all-access","agent":"bot","views":0}]}
[18:41:16] bot is 0, thus the rest are 'users' apparently ;)
[18:41:47] addshore: so that seems wrong too
[18:42:02] yup
[18:42:10] I'll file a bug :)
[18:42:17] addshore: okay :) thanks!
[18:43:52] addshore: can you paste the query you ran against webrequest?
[18:44:35] yup
[18:46:29] Analytics, Pageviews-API, Wikidata: Pageview API not categorizing spiders correctly - https://phabricator.wikimedia.org/T135164#2290291 (Addshore)
[18:46:31] madhuvishy: https://phabricator.wikimedia.org/T135164
[18:46:35] heh, forgot about the bot :p
[18:47:39] Analytics, Pageviews-API, Wikidata: Pageview API not reporting spiders correctly - https://phabricator.wikimedia.org/T135164#2290291 (Addshore)
[18:47:59] elukey: did you say you looked at pykafka code and saw a place where an exception was uncaught that was causing it to fail during broker restart?
[18:57:35] mforns: teh bug we have now I do not think is in teh new code
[18:57:38] *the
[18:57:47] mforns: rather it was there before
[18:58:29] mforns: and the graph needs to clear plots before data comes back, needs to do it once new metricis selected
[19:03:32] mforns: "once new metric is selected"
[19:06:40] nuria_, I agree the bug was there before, but we should fix it now, because before we only had 1 breakdownable metric, now we have 3
[19:06:48] and the bug can be seen
[19:06:50] nuria_: Just had a quick look, and indeed there is a (bug | definition change) in pageview
[19:07:20] mforns: yes
[19:07:28] joal: i was looking too
[19:07:35] joal: we are missing all restbase pageviews
[19:07:58] but code on isApppageview seems correct unit test wise so chnage must be elsewhere
[19:07:59] mforns: but I think what happens is: the data for the new metric gets requested before the breakdown is reset
[19:08:00] *change
[19:08:32] mforns: mmm, i do not think so but i could be totally off
[19:08:51] nuria_: problem comes from pageview_definition
[19:08:56] nope, app_pageview
[19:09:09] PROBLEM - Check status of defined EventLogging jobs on eventlog1001 is CRITICAL: CRITICAL: Stopped EventLogging jobs: processor/client-side-11 processor/client-side-10 processor/client-side-09 processor/client-side-08 processor/client-side-07 processor/client-side-06 processor/client-side-05 processor/client-side-04 processor/client-side-03 processor/client-side-02 processor/client-side-01 processor/client-side-00 forwarder/legacy-zmq
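(Editor's sketch: the per-article endpoint spot-checks above, scripted for repeatability. The URL is exactly the one pasted in the log; the use of the requests library and the 'user' agent value — inferred from "the rest are 'users'" — are the only assumptions.)

```python
import requests

URL = ("https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/"
       "wikidata/all-access/{agent}/Special%3ARecentChangesLinked/daily/20160510/20160510")

for agent in ("all-agents", "user", "spider", "bot"):
    items = requests.get(URL.format(agent=agent)).json().get("items", [])
    views = items[0]["views"] if items else 0
    print(agent, views)  # values in the log: all-agents=91400, spider=72, bot=0
```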
[19:09:27] logging off for dinner, will be back after
[19:10:27] joal, bye!
[19:11:54] see that
[19:12:38] hm, yeah i must have restarted eventlogging after that broker restart at the wrong time. starting it back up seems to work fine
[19:13:19] RECOVERY - Check status of defined EventLogging jobs on eventlog1001 is OK: OK: All defined EventLogging jobs are running.
[19:16:34] ok
[19:17:11] milimetric: around?
[19:18:40] yes, but in a meeting, sorry
[19:19:30] milimetric: np - i was just wondering how to connect to cassandra to query it from aqs. feel free to reply whenever you are free
[19:21:19] PROBLEM - Difference between raw and validated EventLogging overall message rates on graphite1001 is CRITICAL: CRITICAL: 53.33% of data above the critical threshold [30.0]
[19:21:42] ^ s'ok should go away in a sec
[19:22:52] joal: submitting unit test and bugfix
[19:38:19] RECOVERY - Difference between raw and validated EventLogging overall message rates on graphite1001 is OK: OK: Less than 20.00% above the threshold [20.0]
[19:40:07] (PS1) Nuria: Match paths on request only if it is a web pageview [analytics/refinery/source] - https://gerrit.wikimedia.org/r/288458
[19:41:29] Analytics-Kanban: Pageview definition bug for apps pageviews on rest endpoint - https://phabricator.wikimedia.org/T135168#2290476 (Nuria)
[19:41:55] (PS2) Nuria: Match paths on request only if it is a web pageview [analytics/refinery/source] - https://gerrit.wikimedia.org/r/288458 (https://phabricator.wikimedia.org/T135168)
[19:43:00] (PS3) Nuria: Match paths on request only if it is a web pageview [analytics/refinery/source] - https://gerrit.wikimedia.org/r/288458 (https://phabricator.wikimedia.org/T135168)
[19:45:47] dr0ptp4kt: re https://phabricator.wikimedia.org/T131315#2290387 , what is the default sampling rate?
[19:54:05] nuria_, I found what the problem is
[19:54:13] nuria_, but not sure how to fix it
[19:54:18] mforns: aha
[19:54:25] mforns: we can talk in batcave
[19:54:31] nuria_, was going to ask that
[19:54:33] omw
[19:58:21] nuria_: I investigated a bit more on pageviews
[19:58:27] will join you in the batcave
[19:58:34] joal: did you see my patch?
[19:58:39] noper
[19:59:28] nuria_: joining batcave
[20:15:32] Analytics, Pageviews-API, Wikidata: Pageview API not reporting spiders correctly - https://phabricator.wikimedia.org/T135164#2290582 (JAllemandou) Results look correct to me with that query: ``` SELECT agent_type, count(1) as count FROM webrequest WHERE year = 2016 AND month = 5 AND d...
[20:16:17] Analytics-Kanban, Operations, ops-eqiad, Patch-For-Review: rack/setup/deploy aqs100[456] - https://phabricator.wikimedia.org/T133785#2290595 (RobH) Issue: Post jessie install, system states booting off C, and then fails to boot anything. Troubleshooting done so far: * compared all bios settings...
[20:29:16] (PS4) Nuria: Match paths on request only if it is a web pageview [analytics/refinery/source] - https://gerrit.wikimedia.org/r/288458 (https://phabricator.wikimedia.org/T135168)
[20:33:44] madhuvishy: I'm back and I got an update that everything's more or less ok, if you still wanna poke around AQS we can do it together?
[20:35:22] HaeB: we found the bug in the pageview definition, small detail is that we also need code that parses restbase urls if we want to have page titles on pageview_hourly, as those urls: "/api/rest_v1/page/mobile-sections-lead/Last_of_the_Summer_Wine"
[20:35:32] are different from the php api ones
[20:35:47] HaeB: joal and I would work on this tomorrow
[20:36:30] nuria: awesome
[20:37:02] is there a way to test if such bug fixes have the intended effect of restoring pageviews to plausible numbers?
[20:45:06] ottomata or someone else: I get an error in labs when I try to log to a new schema (it looks like it has trouble making the kafka topic maybe? I'll paste)
[20:45:21] Analytics-Kanban, Operations, ops-eqiad, Patch-For-Review: rack/setup/deploy aqs100[456] - https://phabricator.wikimedia.org/T133785#2290664 (RobH) full log for the aqs1006 install: P3061 full log for the aqs1005 install: P3062
[20:45:43] https://www.irccloud.com/pastebin/zPMU2wb3/
[20:52:44] Analytics-Wikistats: Cross-link stats.wikimedia.org and ee-dashboard.wmflabs.org - https://phabricator.wikimedia.org/T67994#2290684 (Jdforrester-WMF) Open>declined We're killing ee-dashboard (and have been for three years); let's not add more cruft and links to it.
[20:52:46] Analytics-Wikistats, Tracking: Increase WikiStats reports discoverability - https://phabricator.wikimedia.org/T67991#2290686 (Jdforrester-WMF)
[20:53:50] Analytics, Editing-Analysis: Move contents of ee-dashboards to edit-analysis.wmflabs.org - https://phabricator.wikimedia.org/T135174#2290688 (Jdforrester-WMF)
[20:54:05] milimetric: hey
[20:54:10] hey
[20:54:22] uhhh i think it's okay - i wanted to query some stuff for this bug addshore filed
[20:54:31] https://phabricator.wikimedia.org/T135164
[20:54:48] unrelated to pageview api giving 500s
[20:54:48] Hey, I put T135174 in #Analytics because I didn't know if there was a more appropriate place to put it; please move it if there's a better one. :-)
[20:54:48] T135174: Move contents of ee-dashboards to edit-analysis.wmflabs.org - https://phabricator.wikimedia.org/T135174
[20:55:52] James_F: that's good, that's just to ping us, we can file it and triage and prioritize and all that once it's in #Analytics
[20:56:15] madhuvishy: ok, so did you log in to aqs and if not you wanna?
[20:56:27] milimetric: Thanks.
[20:56:43] ok, cool, any problems querying cassandra?
[20:57:36] milimetric: no I just do not know how to get the cqlsh shell - it looks like it expects a connection string - and i don't know what that is :)
[20:57:49] oh! sure, one sec
[20:59:22] (CR) Mforns: [C: 2 V: 2] "LGTM!" [analytics/reportupdater] - https://gerrit.wikimedia.org/r/288133 (https://phabricator.wikimedia.org/T134950) (owner: Neil P. Quinn-WMF)
[20:59:42] madhuvishy: when you're done... any idea why beta EL is failing with that error message I pasted above ^^? :)
[21:00:35] milimetric: uhhhh i think my irccloud didn't pick up your paste, could you paste again?
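(Editor's sketch: one way to answer the cqlsh question from [20:57:36] without hunting down a connection string — point the Python cassandra-driver at a node. Hedged: the contact host, the credentials, and the system table queried (the Cassandra 2.x location) are all assumptions.)

```python
from cassandra.cluster import Cluster
from cassandra.auth import PlainTextAuthProvider

cluster = Cluster(
    contact_points=["aqs1001.eqiad.wmnet"],  # placeholder AQS node
    auth_provider=PlainTextAuthProvider(username="cassandra", password="..."),  # placeholder
)
session = cluster.connect()
# list keyspaces (system.schema_keyspaces is where Cassandra 2.x keeps them)
for row in session.execute("SELECT keyspace_name FROM system.schema_keyspaces"):
    print(row.keyspace_name)
```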
[21:01:59] https://www.irccloud.com/pastebin/zPMU2wb3/
[21:02:55] Analytics, DBA, Editing-Analysis, Patch-For-Review: Reportupdater does not commit changes after each query - https://phabricator.wikimedia.org/T134950#2290755 (mforns) @Neil_P._Quinn_WMF @jcrespo I did some testing against the DB and looks good! I merged the patch. In a while puppet will automati...
[21:07:31] madhuvishy: did you get that last one? is irccloud just failing :)
[21:07:40] milimetric: yes! looking
[21:08:15] oh good, I can give more details if you need
[21:08:20] milimetric: hmmm topic creation timed out?
[21:08:42] looks like a new schema
[21:08:44] interesting
[21:08:52] may be something for kafka 0.9
[21:09:16] ehh?
[21:09:49] ottomata: i don't know - does it look like auto creation of topic failed for some reason?
[21:09:59] do you have a ping for kafka 0.9 setup ;)
[21:10:10] for kafka ya :)
[21:10:20] ha ha
[21:10:31] it's just in beta?
[21:10:48] yes
[21:11:02] well, I donno, could be in prod too, it'd be hard to know until someone made a new schema
[21:14:30] i can reproduce
[21:14:32] on it...
[21:15:25] ottomata: I was doing this on labs if you want an easy way to repro:
[21:15:26] curl -k https://bits.beta.wmflabs.org/beacon/event?%7B%22schema%22%3A%22DiacriticsVisibility%22%2C%22revision%22%3A15594725%2C%22wiki%22%3A%22metawiki%22%2C%22event%22%3A%7B%22issues%22%3A1%7D%7D
[21:19:54] can verify, our version of kafka-python's ensure_topic_exists does not work with kafka 0.9
[21:20:58] aah
[21:21:24] never tested new schema! ah!
[21:21:26] :)
[21:21:35] HMM
[21:22:20] ottomata: so it's just the ensure call
[21:22:30] not the actual topic creation
[21:22:52] can we be like try: create topic; except: go have fun!!
[21:25:17] madhuvishy: i haven't been able to create a topic using our version of kafka-python yet
[21:25:26] ottomata: oh that too
[21:25:27] i think we might need to upgrade kafka-python
[21:25:28] hmmm
[21:25:30] hmm
[21:25:33] oh, we do have pykafka already
[21:25:36] oh
[21:25:39] i bet i can ensure it exists with that instead..hmm
[21:25:50] but then i'd have to instantiate a client with both libs in the kafka writer handler
[21:25:50] hm
[21:25:57] upgrading kafka-python might be better
[21:26:01] but could be a little unsafe to do right now.
[21:26:02] hm
[21:26:03] ottomata: ha ha - ensure and create with pykafka - produce with kafka-python?
[21:26:10] Analytics, Pageviews-API: API incorrectly complains about missing data instead of wrong wiki name - https://phabricator.wikimedia.org/T134926#2290856 (Ijon) (or indeed, if the API was case insensitive re wiki names.)
[21:26:15] that sounds foolproof
[21:26:16] :D
[21:26:40] haha
[21:26:42] oof
[21:26:43] we should revisit producing with pykafka maybe
[21:26:43] hm
[21:26:52] naw, they still don't have dynamic topic support
[21:26:54] i've been tracking that
[21:26:56] ohh
[21:26:59] hmmm
[21:27:01] but, kafka-python has improved a LOT
[21:27:05] someone new took it on
[21:27:07] and it is active again
[21:27:11] oh cool
[21:27:41] HMM
[21:28:03] ok madhuvishy, milimetric, i was about to quit for the day. also i think it would be dangerous to do anything big atm
[21:28:20] new schemas are not created often...how about I manually create this topic in beta for now
[21:28:23] send an email to analytics list
[21:28:39] that new schemas are broken until we fix
[21:28:41] but we will fix asap
[21:28:44] e.g. tomorrow :)
[21:28:49] ottomata: yeah - me too - if not i can manually check out the latest kafka-python version and test it
[21:29:02] madhuvishy: if you have some hours today and want to do that, that would be greatly appreciated
[21:29:34] ottomata: makes sense. Okay i'll try then
[21:29:39] if all goes well i could then fix asap tomorrow morn...you could even install it in beta to have it running
[21:29:39] ?
[21:29:42] "mozilla [en] egranary digital library system" 91297
[21:29:43] Works for me, the beta thing is not urgent don't worry
[21:29:53] looks like that's the UA that's thrown this number off
[21:29:58] ok, thanks madhuvishy, send me an email with how it goes so I know where to pick up in the morning
[21:30:03] i'm going to write an email to analytics public list now
[21:30:06] ottomata: cool
[21:30:11] addshore: oh
[21:30:25] and that was counted as user?
[21:30:31] yup
[21:30:46] * madhuvishy adds library to ua parser regex
[21:31:01] but my initial query to webrequest didn't cover those requests apparently!
[21:31:29] addshore: what was the change?
[21:31:56] See https://phabricator.wikimedia.org/T135164 AND is_pageview AND uri_host LIKE "%wikidata.org"
[21:32:05] addshore: aah
[21:32:21] I guess all of the things I found don't even show as page views by spiders then!
[21:32:34] addshore: hmmm i queried pageview_hourly with project='wikidata'
[21:32:47] numbers were similar but i didn't check uas
[21:33:16] https://www.irccloud.com/pastebin/0saGiosf/
[21:33:35] ok, email sent, laters yall!
[21:33:43] ottomata: byeee
[21:33:48] Analytics, Pageviews-API, Wikidata: Pageview API not reporting spiders correctly - https://phabricator.wikimedia.org/T135164#2290869 (Addshore) Okay, so none of the spiders in my first request are actually evaluated as pageviews? It looks like the UA throwing the data off here is not included in the...
[21:34:38] I'm now slightly confused where spiders are counted as page views and where they are not
[21:34:50] Analytics-Tech-community-metrics, Developer-Relations, Community-Tech-Sprint: Investigation: Can we find a new search API for CorenSearchBot and Copyvio Detector tool? - https://phabricator.wikimedia.org/T125459#2290870 (kaldari) @Earwig: It looks like everyone thinks the Bing workaround is a good id...
[21:35:21] addshore: pageview tagging is independent of agent type
[21:35:30] yeh madhuvishy your query returns 376 users and 130398 spiders
[21:36:00] addshore: ya so i'm confused now - is my project wrong?
[21:36:09] no, I think that's right
[21:36:18] hmmm
[21:36:28] but the other request in the ticket did AND pageview_info['page_title'] = "Special:RecentChangesLinked"
[21:36:38] I did AND uri_path LIKE "/wiki/Special:RecentChangesLinked%"
[21:36:43] and you did and page_title like '%Special:RecentChangesLinked%'
[21:36:58] hmmmm
[21:37:20] Analytics-Kanban, Operations, ops-eqiad, Patch-For-Review: rack/setup/deploy aqs100[456] - https://phabricator.wikimedia.org/T133785#2290875 (RobH) @papaul states he installed aqs1004 without that error, but it has the cannot boot issue: Attempting Boot From Hard Drive (C:) after post and then...
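(Editor's sketch: the "try: create topic; except: go have fun" idea from the discussion above, written with the KafkaAdminClient that later kafka-python releases grew — it did not exist in the 2016 client whose ensure_topic_exists broke against Kafka 0.9. The broker address and the eventlogging_<Schema> topic naming are assumptions.)

```python
from kafka.admin import KafkaAdminClient, NewTopic
from kafka.errors import TopicAlreadyExistsError

admin = KafkaAdminClient(bootstrap_servers="deployment-kafka01:9092")  # placeholder
try:
    admin.create_topics([NewTopic(name="eventlogging_DiacriticsVisibility",
                                  num_partitions=1, replication_factor=1)])
except TopicAlreadyExistsError:
    pass  # already there (or auto-created): nothing to do
```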
[21:38:27] switching your query on pageview_hourly to and page_title = "Special:RecentChangesLinked" then returns 52 users and 11 spiders :P
[21:39:01] addshore: uhhh my brain isn't working now
[21:39:35] My brain is thinking the same ;)
[21:39:43] just doing 1 last query ;)
[21:40:53] ahh so our queries matched things like Special:RecentChangesLinked/Q13215 which is tracked as a page view on a different page
[21:41:14] also hence the thousands of extra spiders there then
[21:42:07] addshore: aah
[21:42:34] updated my comment and I'll retask the ticket to list that UA as a spider
[21:43:35] addshore: cool thanks
[21:43:37] Analytics, Pageviews-API, Wikidata: "egranary digital library system" UA should be listed as a spider - https://phabricator.wikimedia.org/T135164#2290887 (Addshore)
[21:44:02] whatever that script UA thing is doing is crazy :P
[21:44:42] it feels like it is polling the page every second....
[21:45:12] and the page doesn't even show anything... https://www.wikidata.org/wiki/Special:RecentChangesLinked
[21:45:25] heh
[22:45:32] Quarry, Labs, Labs-Infrastructure: Long-running Quarry query (querry?) produces strangely incorrect results - https://phabricator.wikimedia.org/T135087#2287871 (Krenair) I looked through the history of that query in the database, and have no good explanation for this.
[22:47:17] Analytics-Kanban, Operations, ops-eqiad, Patch-For-Review: rack/setup/deploy aqs100[456] - https://phabricator.wikimedia.org/T133785#2291168 (RobH) So @dzahn was able to work around the dependency issue, I've asked him to put an update, but I'll attempt to paraphrase from irc: > when the insta...
[22:47:36] Analytics-Kanban, Operations, ops-eqiad, Patch-For-Review: rack/setup/deploy aqs100[456] - https://phabricator.wikimedia.org/T133785#2291170 (RobH) Additionally, they still have the error of: Attempting Boot From Hard Drive (C:) When they should boot up the OS.
[22:52:34] bye team, c u tomorrow!
[23:00:52] Quarry, Easy: Display time taken to execute a query - https://phabricator.wikimedia.org/T135189#2291243 (Matthewrbowker)
[23:03:00] Analytics-Kanban: Respawn the schema/field white-list for EL auto-purging {tick} - https://phabricator.wikimedia.org/T135190#2291256 (mforns)
[23:04:45] Analytics-Kanban: Notify all schema owners that the auto-purging is about to start {tick} - https://phabricator.wikimedia.org/T135191#2291279 (mforns)
[23:08:23] Analytics-Kanban, DBA: Set up auto-purging after 90 days {tick} - https://phabricator.wikimedia.org/T108850#2291305 (mforns) a:mforns>None I removed myself from the task, because I created another task in our kanban for that. Nevertheless, I am working right now in putting together that white-lis...
[23:08:40] Analytics-Kanban: Respawn the schema/field white-list for EL auto-purging {tick} - https://phabricator.wikimedia.org/T135190#2291312 (mforns)
[23:08:42] Analytics-Kanban, DBA: Set up auto-purging after 90 days {tick} - https://phabricator.wikimedia.org/T108850#2291311 (mforns)
[23:09:25] Analytics, DBA: Set up bucketization of editCount fields {tick} - https://phabricator.wikimedia.org/T108856#2291315 (mforns) @Nuria @jcrespo Shouldn't this be marked as done?
[23:10:36] Analytics, DBA: Set up auto-purging after 90 days {tick} - https://phabricator.wikimedia.org/T108850#1532166 (mforns)
[23:25:18] Analytics: Pageview API: Limit (and document) size of data you can request - https://phabricator.wikimedia.org/T134524#2291391 (GWicke) @MusikAnimal, the trade-off is between performance & client convenience. In either scheme, you can load the full data for any time frame. In the per-month scheme you'll have...
[23:40:57] Analytics-Kanban, Operations, ops-eqiad, Patch-For-Review: rack/setup/deploy aqs100[456] - https://phabricator.wikimedia.org/T133785#2291414 (Dzahn) Yep, so the "install software" / tasksel step of the installer failed and there were the "packages have unmet dependencies:" errors Rob pasted above...