[01:04:37] Quarry, Labs, Labs-Infrastructure: Long-running query produces strangely incorrect results - https://phabricator.wikimedia.org/T135087#2287871 (Neil_P._Quinn_WMF)
[01:21:34] Quarry, Labs, Labs-Infrastructure: Long-running Quarry query (querry?) produces strangely incorrect results - https://phabricator.wikimedia.org/T135087#2287926 (Neil_P._Quinn_WMF)
[04:45:14] so the "log" database has 94508 tables now? impressive
[04:45:21] mysql:research@analytics-store.eqiad.wmnet [log]> SELECT COUNT(*) FROM information_schema.tables;
[04:50:42] ah, i mean the entire server
[04:50:52] (log has 351: SELECT COUNT(*) FROM information_schema.tables WHERE TABLE_SCHEMA='log';)
[05:23:48] Analytics, Wikipedia-Android-App-Backlog: Investigate recent decline in views and daily users - https://phabricator.wikimedia.org/T132965#2288226 (Tbayer) While this is being sorted out (any updates?), I have added an annotation to the [[https://vital-signs.wmflabs.org/#projects=all/metrics=Pageviews |Vi...
[07:38:26] Analytics-Kanban, EventBus, Patch-For-Review: Propose evolution of Mediawiki EventBus schemas to match needed data for Analytics need - https://phabricator.wikimedia.org/T134502#2288352 (mobrovac)
[07:50:39] joal: good morning
[07:50:46] joal: left some comments over at https://gerrit.wikimedia.org/r/#/c/288210/
[08:05:34] mobrovac: good morning
[08:05:43] hi elukey
[08:10:53] Good morning mobrovac !
[08:11:02] Thanks for the comments, I'll read them
[08:11:43] Good morning elukey !
[08:12:03] nothing better for the end of the week than some bike-shedding :)
[08:12:21] hm, it's only thursday though
[08:12:32] "only"
[08:12:36] mobrovac: I've been doing a lot of it for a while now too :)
[08:43:53] mobrovac: have a minute for some small bikeshedding?
[08:45:56] joal: Good morning joal!
[08:46:01] :)
[08:46:26] I worked a bit with Papaul yesterday and he should have a patch for my partman config
[08:46:35] soooo we might get the host reimaged today
[08:46:38] elukey: great !
[08:46:46] YAYYYY :)
[08:46:53] fingers crossed.. I am going to study the puppet config
[08:47:02] to add basic cassandra multi instance
[08:47:04] and that's it
[08:47:10] elukey: I am also bikeshedding with mobrovac on schemas, but it's good to have some technology moving on :)
[08:47:40] haha
[08:47:49] joal: i do have a moment
[08:48:14] mobrovac: great !
[08:48:18] mobrovac: https://hangouts.google.com/hangouts/_/wikimedia.org/a-batcave ?
[08:48:42] be there in 1 min
[08:48:47] cool
[09:08:31] joal: question about our dear friends oozie and hive - do we affect ongoing jobs if we restart their daemons on 1003?
[09:19:50] that's nothing urgent, but they should pick up the new java from the latest security update some time through a restart
[09:21:03] I would say that it is fine since oozie should just schedule to hadoop, so the in-flight ones will not be affected
[09:21:10] but Jo is the expert :)
[09:24:53] elukey: I think we do
[09:25:37] elukey: oozie should be ok if we ensure no change is ongoing and we stop it properly, but hive, if there are running queries, they'll die I think
[09:27:57] moritzm, elukey: easiest would be to pause the load job from oozie, wait for the currently running job to finish, then safely restart the daemons, and resume the jobs
[09:28:19] moritzm, elukey: Shall I go and pause the jobs ?
[09:32:37] joal: It would be good for me to learn how to do it
[09:32:50] elukey: no problem for me
[09:33:53] elukey: you let me know when :)
[09:36:27] joal: I was checking Hue :) Is it enough to stop the running load workflows or should I also go up to coordinators, bundles?
[09:36:31] * elukey still ignorant
[09:37:08] elukey: you shouldn't stop but suspend :)
[09:37:31] yes sorry!
[09:37:35] I meant to say suspend
[09:37:42] elukey: when doing an action at the top level, it is cascaded to the children: so suspending the bundle is the way to go :)
[09:38:17] ahhhhhhh goooooood
[09:39:01] yeah, oozie still has some reasonably good sides ;)
[09:39:10] elukey: brb
[09:42:09] joal: ok so I suspended webrequest-load-bundle and the two running workflows got suspended too. I checked the Coordinators and all the running ones are waiting for webrequest load sooo I guess that's it right?
[09:43:34] mmm no you mentioned letting the running jobs finish
[09:58:35] correct elukey, need to check yarn
[09:58:53] elukey: still ongoing jobs, let's wait for them to finish
[10:01:15] checking
[10:12:28] elukey: no hive left in yarn, you can proceed :)
[10:14:17] joal: all right, I was waiting for Hue's refresh but it is acting weird now
[10:14:48] I don't see any workflow/coordinator/bundle (keeps spinning)
[10:17:24] [12/May/2016 10:03:12 +0000] workflows ERROR failed to import workflow root
[10:17:31] RuntimeError: No hadoop file system to operate on.
[10:17:33] whaaaa
[10:18:18] elukey: no issue on my side, working fine
[10:18:38] ah no that error is also present way back, weird
[10:18:59] now it works
[10:19:01] -.-
[10:19:12] anyhow, proceeding with oozie/hive restarts
[10:20:23] !log restarted oozie, hive-* daemons on analytics1003 for java upgrades
[10:21:30] resumed the bundle too
[10:24:16] Yarn looks good, seeing new jobs flowing
[10:26:18] Indeed seems good elukey :)
[10:26:27] Thanks for keeping our java up-to-date !
[10:28:38] confirmed, all processes on 1003 use the new java now
[10:29:14] joal: kudos to moritzm for keeping ALL the wikimedia daemons up to date :P
[10:29:27] +1 elukey, thanks moritzm !
[10:31:02] moritzm: https://wikitech.wikimedia.org/wiki/Service_restarts#Oozie and https://wikitech.wikimedia.org/wiki/Service_restarts#Hive
[10:32:03] elukey: nice, thanks!
[10:33:19] could you maybe amend whether the two hive server components can be restarted in arbitrary order or not?
[10:34:44] moritzm: ah yeah I'll do it, probably it doesn't matter but I'd need to research a bit. I restarted the metastore first then the server, but it shouldn't matter
[10:39:20] thanks
[11:01:07] elukey: still a lot ongoing with misc-caches, isn't it?
[11:02:36] joal: they have been completely migrated to V4 but there is an ongoing problem and some caches are being restarted
[11:02:57] (new Varnish version + purges)
[11:02:57] yup, can follow that on our data miss alerts :)
[11:33:32] Analytics: Pageview API: Limit (and document) size of data you can request - https://phabricator.wikimedia.org/T134524#2288722 (JAllemandou) @GWicke : Some more data and more or less expected results. - In current requests patterns, ~80% of requests are for fresh data - (end date either today or yesterday)...
[11:33:53] AFK for some time a-team
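(Editor's sketch: the suspend / drain / restart / resume sequence worked out above, condensed into a script. A hedged illustration only — the bundle ID, Oozie URL, and service names are placeholder assumptions; the oozie and yarn CLI invocations are the stock ones.)

```python
# Sketch of the procedure discussed above; adjust IDs and hosts for the real cluster.
import subprocess
import time

OOZIE_URL = "http://analytics1003.eqiad.wmnet:11000/oozie"  # assumed
BUNDLE_ID = "0000000-000000000000000-oozie-oozi-B"          # placeholder bundle ID

def oozie_job(action):
    # e.g. `oozie job -oozie <url> -suspend <bundle-id>`; acting on the
    # bundle cascades to its coordinators and workflows
    subprocess.check_call(["oozie", "job", "-oozie", OOZIE_URL, action, BUNDLE_ID])

def hive_still_running():
    # wait until no Hive applications remain in YARN
    out = subprocess.check_output(["yarn", "application", "-list", "-appStates", "RUNNING"])
    return b"hive" in out.lower()

oozie_job("-suspend")
while hive_still_running():
    time.sleep(60)
for svc in ("oozie", "hive-metastore", "hive-server2"):  # names assumed from CDH packaging
    subprocess.check_call(["sudo", "service", svc, "restart"])
oozie_job("-resume")
```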
[11:48:15] mobrovac: whenever you have time would you mind telling me whether https://gerrit.wikimedia.org/r/#/c/288373/2 is a bad idea for deploying AQS on the new hosts without affecting the current cluster? The goal is to test cassandra compactions and possibly load-test the cluster
[11:49:36] elukey: i'm deep in some stuff, will take a look at it after that
[11:49:51] elukey: to be safe, add me as a reviewer there
[11:50:14] mobrovac: sure sure, whenever you have time, nothing urgent
[11:50:20] kk
[11:50:38] thanks!
[12:16:01] Analytics, Wikipedia-Android-App-Backlog: Investigate recent decline in views and daily users - https://phabricator.wikimedia.org/T132965#2215521 (dr0ptp4kt) Does https://gerrit.wikimedia.org/r/#/c/285051/ ({T133204}) address the header enrichment? What do the stats say now?
[13:34:32] hmmm misc misc misc misc
[13:35:49] ottomata: o/
[13:36:00] hiyaaa
[13:39:26] am looking into the dataloss alerts
[13:39:29] not sure what's happening yet
[13:43:02] ottomata: I think that there was a re-install of varnish 4 due to a misc bug this morning
[13:43:12] but they are playing with VCL for some issue
[13:44:12] elukey: huh so are they restarting varnish and vk is having trouble?
[13:44:33] ottomata: mmm I didn't see any alert for vk
[13:44:39] oh me neither
[13:44:48] but i'm trying to explain why there is consistent loss on misc
[13:44:51] ottomata: I also restarted oozie and hive today for java upgrades, but stopped the jobs first with joal
[13:44:52] minimal loss
[13:45:16] * elukey invokes ema
[13:45:19] hmmm ok
[13:45:19] ---^
[13:45:37] also fairly consistent loss on maps
[13:45:40] are things changing there too?
[13:45:56] not that I know
[13:46:05] very small loss though
[13:46:30] what does "loss" mean in this context? I mean, practically what is happening?
[13:48:00] the sequence stats check oozie jobs are sending warning emails
[13:48:21] they calculate loss by comparing max_seq - min_seq to count(*) in an hour
[13:48:45] not counting cases where min_seq == 0, because that signifies a vk restart
[13:50:07] ahh ok so from vk
[13:51:35] ja
[13:51:51] each vk instance logs an incrementing seq for each message
[13:51:51] so
[13:51:56] if all is well, during any given hour
[13:52:00] yep yep
[13:52:04] max_seq - min_seq == count(*)
[13:52:06] well
[13:52:09] count(host)
[13:52:20] group by hostname, whateverrr :)
[13:52:29] rrrrrr
[13:52:32] :)
[13:55:35] small loss here is something around 1 percent or less
[13:56:29] for misc it looks like 1% is around 1000 messages per hour
[14:09:23] ottomata: when you get a chance can you kick ldaptestaccount123/chase.mp@gmail from https://gerrit.wikimedia.org/r/#/admin/groups/833,members and add maven-release-user/maven-release-user@wikimedia.org instead?
[14:09:56] back in our channel: ottomata would you mind also pasting the hive queries that you are using somewhere? (not now ofc, whenever you have a min)
[14:10:31] madhuvishy: i think it doesn't like that / in the name
[14:10:35] maven-release-user/maven-release-user@wikimedia.org
[14:10:39] ohhh
[14:10:40] gives me 500 error
[14:10:42] ottomata: no no
[14:10:46] just username/email
[14:10:48] oh
[14:10:55] ?
[14:10:59] maven-release-user@wikimedia.org
[14:10:59] ?
[14:11:04] the username is maven-release-user, email is maven-release-user@wikimedia.org
[14:11:04] ya
[14:11:14] i guess you only need email
[14:11:18] ottomata: ^
[14:11:44] hm, madhuvishy it's not letting me do that either!
[14:11:47] will come back to it.. :/
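(Editor's sketch: the invariant ottomata describes above, as a toy function. Per host, the expected message count is max_seq - min_seq — strictly that is off by one, but it is noise at these volumes — and hours where min_seq == 0 are skipped because they indicate a varnishkafka restart. This is an illustration only; the real check is the Hive-based oozie job.)

```python
from collections import defaultdict

def percent_lost_per_host(messages):
    """messages: iterable of (hostname, seq) pairs seen in one hour.
    Mirrors the sequence-stats check: compare max_seq - min_seq to the
    number of messages actually received, grouped by hostname."""
    seqs = defaultdict(list)
    for host, seq in messages:
        seqs[host].append(seq)
    report = {}
    for host, s in seqs.items():
        if min(s) == 0:  # varnishkafka restart, counter reset: skip this host
            continue
        expected = max(s) - min(s)
        lost = max(expected - len(s), 0)
        report[host] = 100.0 * lost / expected if expected else 0.0
    return report
```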
[14:11:54] ottomata: ya np - no hurry
[14:11:59] elukey: haha, yeah i was trying to use hue to save some queries and graph them, but it wasn't really working, dunno why :/
[14:12:11] i don't use hue much at all, so i don't know if it is supposed to work
[14:12:12] heh
[14:12:24] elukey: example
[14:12:24] select CONCAT(webrequest_source, ":", year, month, day, hour) as d, count_lost, percent_lost from webrequest_sequence_stats_hourly where year=2016 and month = 5 and day >= 8 and percent_lost != 0.0;
[14:12:50] ottomata: the new hue is kinda weird - it sometimes starts the tasks and they show up in running jobs but doesn't populate it in the UI below
[14:16:13] elukey: looking at this now
[14:16:14] select CONCAT(webrequest_source, ":", year, '-', month, '-', day, '.', hour) as d, hostname, percent_different from webrequest_sequence_stats where webrequest_source='maps' and year=2016 and month = 4 and day in (1,2,3,4, 23,24,25,26,27,28,29) and percent_different < 0.0 and sequence_min != 0;
[14:16:39] looking at per host loss for maps on april 1-4 and april 23-29
[14:16:47] not much before april 4
[14:17:03] then starting april 4 cp1043 and cp1044 show up
[14:17:23] then starting april 26 more hosts show up with loss
[14:18:08] very nice, definitely something related to vk and v4 then
[14:19:09] I am wondering how to track down what requests are lost
[14:19:12] maybe there is a pattern
[14:20:25] ottomata, elukey : Maybe it's not real loss but index issues (like a request generates a new index value but doesn't need to generate a log line?)
[14:20:39] could be indeed!
[14:21:43] joal: sorry i missed your point :( can I add --verbose?
[14:22:46] joal, hi!
[14:23:49] joal, I think unique devices are not being calculated or inserted into cassandra any more, because the API returns data only until 2016-05-04
[14:24:12] :[[[[[
[14:24:24] btw, good afternoon :]
[14:27:26] mforns: o/
[14:27:33] hi elukey :]
[14:31:23] joal: you got a sec to talk about the aqs failing tests?
[14:31:52] oh, aqs uniques not being loaded sounds more important, never mind, look at that first
[14:33:32] milimetric, mforns , elukey Hey !
[14:33:39] hello joal :]
[14:33:41] joal: too many people!
[14:33:42] :P
[14:33:45] in order: mforns for uniques :)
[14:33:56] yes! I won!
[14:33:57] Let me double check mforns
[14:34:00] :D
[14:35:44] Good catch mforns !!!! I must have made a mistake when restarting jobs last week: 2 monthly but no daily
[14:36:56] !log Start cassandra unique devices loading oozie job backfilling from 2016-05-05 included onward
[14:37:30] oozie server restarted: small ID numbers ;)
[14:37:52] joal, thanks for looking, it would have taken a while for me to look into it
[14:38:10] no problem mforns, thanks for spotting it !
[14:38:18] :]
[14:38:27] next
[14:38:28] :]
[14:38:48] huhu mforns
[14:39:23] elukey: more in detail: We compute loss / duplication from autoinc numbers sent by VK
[14:39:47] elukey: Let's imagine a request generates such a number but doesn't generate a log line ...
[14:40:18] elukey: We interpret that as dataloss, but in fact, this request not sending a line could have been the correct behaviour
[14:40:25] elukey: makes sense?
[14:41:03] joal: yep yep but I was a bit puzzled by the "generates a number but not a log"
[14:41:11] elukey: I don't know enough of VK to know if this way of thinking could apply
[14:41:43] yeah, that's the thing elukey, I don't know if the latter could be a valid Varnish behaviour
[14:41:43] I'll ask Andrew!
[14:41:47] cool :)
[14:41:59] milimetric: I'm all for you now, other things done :)
[14:42:08] wow, that was fast
[14:42:19] ok, so if you run npm test do your tests pass?
[14:42:30] hmmm, I will test !
[14:42:42] (PS3) Mforns: Fix unique devices bugs [analytics/dashiki] - https://gerrit.wikimedia.org/r/288104 (https://phabricator.wikimedia.org/T122533)
[14:44:37] milimetric: They pass for me, yes
[14:44:42] milimetric: batcave?
[14:45:09] sure
[14:46:49] (PS4) Mforns: Fix unique devices bugs [analytics/dashiki] - https://gerrit.wikimedia.org/r/288104 (https://phabricator.wikimedia.org/T122533)
[14:48:38] changing locations, back in a bit
[14:48:45] (CR) Nuria: "Looks good. I think it will be worth it to add a small unit test for 1)" [analytics/dashiki] - https://gerrit.wikimedia.org/r/288104 (https://phabricator.wikimedia.org/T122533) (owner: Mforns)
[14:49:30] nuria_, hi! thanks for the review, have you seen the latest patch? I did yet another change to the commit message
[14:49:37] there were 3 bugs, not 2
[14:50:12] ok ok I see you commented on patch set 4
[14:50:25] mforns: yes, just looked.
[14:51:05] mforns: changes look good, my comment was to further support what we were talking about: immutability of objects that are read from configuration
[14:51:39] nuria_, agree, will do
[14:51:47] mforns: but not on thsi patch
[14:51:51] *this
[14:52:01] we can do those changes at another time
[14:52:07] nuria_, but I will add a unit test
[14:52:13] k
[14:59:50] Analytics-Kanban, Continuous-Integration-Config: Add a maven-release user to Gerrit {hawk} - https://phabricator.wikimedia.org/T132176#2289305 (madhuvishy)
[15:00:32] mforns: backfilled, you should has DATAZ !
[15:00:42] joal, thanks!
[15:04:23] (PS2) Milimetric: Support YYYYMMDDHH format for the unique endpoint [analytics/aqs] - https://gerrit.wikimedia.org/r/288264 (https://phabricator.wikimedia.org/T134840)
[15:12:53] Analytics-Kanban, Operations, ops-eqiad, Patch-For-Review: rack/setup/deploy aqs100[456] - https://phabricator.wikimedia.org/T133785#2289342 (elukey) a:Cmjohnson>elukey
[15:18:06] (PS3) Milimetric: Support YYYYMMDDHH format for the unique endpoint [analytics/aqs] - https://gerrit.wikimedia.org/r/288264 (https://phabricator.wikimedia.org/T134840)
[15:22:39] (PS5) Mforns: Fix unique devices bugs [analytics/dashiki] - https://gerrit.wikimedia.org/r/288104 (https://phabricator.wikimedia.org/T122533)
[15:26:09] (CR) Mforns: Fix unique devices bugs (1 comment) [analytics/dashiki] - https://gerrit.wikimedia.org/r/288104 (https://phabricator.wikimedia.org/T122533) (owner: Mforns)
[15:26:43] nuria_, milimetric, the patch is finally ready for review again ^
[15:33:58] milimetric: here ?
[15:34:08] yep, what's up
[15:35:25] read your code review, tested it: works fine. I have a suggestion: https://gist.github.com/jobar/017c7d4ae48a1ebf32721ec26668ac36
[15:35:28] milimetric: --^
[15:35:53] milimetric: used that piece of code to actually test the thing :)
[15:36:18] joal: sure, that works for me. I gotta finish up something else but I can patch it after our meetings today
[15:36:41] I'll also ping mobrovac because the problem we see with the cassandra string index doesn't happen with sqlite (so the test only means something when used with cassandra for the moment)
[15:36:46] milimetric: --^
[15:36:53] Cool ! Thanks milimetric :)
[15:41:40] whoo elukey i don't see any ISR shrinks since yesterday's upgrade!
[15:41:54] * elukey dances
[15:43:58] elukey: any objections to restarting one broker with inter-broker protocol version 0.9?
[15:44:32] (CR) Joal: [C: 2 V: 2] "LGTM !" [analytics/aqs] - https://gerrit.wikimedia.org/r/288264 (https://phabricator.wikimedia.org/T134840) (owner: Milimetric)
[15:44:47] thanks milimetric!
[15:45:30] ottomata: LGTM
[15:45:34] uh... but I didn't change the test like you wanted joal
[15:45:56] milimetric: doesn't matter: your version is even better :)
[15:46:25] I just don't know if node tests get run in sequence (and therefore you can reuse inserted data) or not
[15:46:29] milimetric: --^
[15:46:55] Hmm, interesting, dcausse hiya
[15:47:03] just happened to be looking at some kafka broker logs
[15:47:04] and saw
[15:47:06] Topic and partition to exceptions: [mediawiki_CirrusSearchRequestSet,8] -> kafka.common.MessageSizeTooLargeException (kafka.server.KafkaApis)
[15:47:23] damn :(
[15:47:24] default max message size is 1mb
[15:47:28] we can increase it
[15:47:38] ouch more than 1mb
[15:47:43] it was just one message, but i betcha it happens now and then
[15:47:49] but, maybe it shouldn't be so large? dunno
[15:47:53] I would not have expected that
[15:47:59] man, 1mb of search requests :)
[15:48:03] :)
[15:48:14] we just added the results displayed in the logs
[15:48:33] actually dcausse i am seeing a few of those
[15:48:40] not a lot
[15:48:40] dcausse: that explains it then !
[15:48:41] oh maybe some crazy api requests with limit=5000
[15:48:42] but they are in there
[15:49:05] maybe we need to limit what we log here
[15:49:09] thanks for the heads up
[15:49:15] uhhhhh
[15:49:16] also
[15:49:18] just noticed
[15:49:18] [KafkaApi-13] Closing connection due to error during produce request with correlation id 0 from client id kafka-php with ack=0
[15:49:22] ack=0
[15:49:22] ?
[15:49:25] you sure you want that?
[15:49:37] I don't know what it means :/
[15:49:57] ebernhardson: maybe knows?
[15:50:48] ack=0 means that the producer won't wait for the broker to acknowledge that it has accepted the produce request
[15:50:59] ack=-1 means all replicas
[15:51:04] ack=1 just the leader replica
[15:51:27] search logs are UPD style with kafka :)
[15:51:38] *UDP sorry
[15:52:38] dcausse: i would recommend at least acks=1, but acks=0 will make the kafka produce request faster for sure
[15:52:51] ottomata: thanks, for webrequests you use acks=1 ?
[15:52:56] so if you are trying to be done with producing to kafka as fast as possible, since you are doing so in the client request, maybe it makes sense
[15:53:08] hmm
[15:53:12] I have no idea unfortunately
[15:53:21] ja we use 1
[15:53:34] message size too large :S we did just increase the size of the messages but i didn't even know kafka had a limit
[15:54:41] Attempting Boot From Hard Drive (C:)
[15:54:44] ....
[15:54:46] ebernhardson: it is a config limit
[15:54:48] ebernhardson: I think it's caused by some very high limits used by internal api consumers
[15:54:48] can be increased
[15:54:54] also can be increased per topic
[15:55:16] but I think we don't need to store all the results, top 20 seems to be sufficient imho
[15:55:21] i'll check into changing the ack settings
[15:55:29] at least then we get errors on both ends
[15:56:18] looks easy enough to set, we left it with the default of 0
[15:56:59] k
[15:57:01] cool
[15:57:01] :)
[15:59:19] i just need to test what kafka sends back when it fails :)
[16:01:31] elukey: coming to standup
[16:01:32] ?
[16:05:35] ottomata: can you try again to add maven-release-user@wikimedia.org (I believe the gerrit registration was incomplete before - again no hurry)
[16:07:15] madhuvishy: worked!
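(Editor's sketch: the two producer knobs discussed above — acks and message size — written against today's kafka-python API; the 2016 kafka-php producer in the log is configured analogously. The broker address is a placeholder.)

```python
from kafka import KafkaProducer
from kafka.errors import MessageSizeTooLargeError

producer = KafkaProducer(
    bootstrap_servers="kafka1001.example.wmnet:9092",  # placeholder broker
    acks=1,                    # wait for the leader replica; acks=0 is fire-and-forget
    max_request_size=1048576,  # client-side cap, matching the broker's 1 MB default
)
# Broker side the limit is message.max.bytes, overridable per topic with
# max.message.bytes, as mentioned in the discussion above.
try:
    producer.send("mediawiki_CirrusSearchRequestSet", b"...")
    producer.flush()
except MessageSizeTooLargeError:
    # with acks=0 an oversized message is just dropped on the floor;
    # with acks>=1 the producer at least hears about the failure
    pass
```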
[16:07:20] coool
[16:07:30] madhuvishy: standup?
[16:07:52] ottomata: at hacker school - don't have a good place to join from
[16:08:15] aye cool
[16:12:20] Analytics-Kanban, Continuous-Integration-Config: Add a maven-release user to Gerrit {hawk} - https://phabricator.wikimedia.org/T132176#2289575 (madhuvishy) maven-release-user (maven-release-user@wikimedia.org) has been created, credentials are available on jenkins, and the right gerrit permissions have b...
[16:14:55] Analytics-Kanban: Create separate archiva credentials to be loaded to the Jenkins cred store {hawk} - https://phabricator.wikimedia.org/T132177#2289581 (madhuvishy) a:madhuvishy>Ottomata
[16:14:58] Analytics-Kanban: Create separate archiva credentials to be loaded to the Jenkins cred store {hawk} - https://phabricator.wikimedia.org/T132177#2190727 (madhuvishy) @Ottomata Assigning this to you since I don't have powers to make a user.
[16:18:28] a-team: for my jenkins update - https://review.openstack.org/#/c/313196/ got +1-ed! Don't know what happens next but i'm hoping someone will merge it soonish. I had some user creation stuff pushed through with OIT's and ottomata's help - and that task is in Done now.
[16:25:14] joal: it seems that I hit a bug in boot after installing jessie.. those hosts and I are not getting along :P
[16:25:23] Analytics-Kanban, Patch-For-Review: Client values inbound in X-analytics header (pageview and preview) are reflected in outbound X-Analytics on varnish - https://phabricator.wikimedia.org/T133204#2225002 (Nuria) Confirmed these changes make expected headers appear on cluster cc @Tbayer
[16:27:40] Analytics, Wikipedia-Android-App-Backlog: Investigate recent decline in views and daily users - https://phabricator.wikimedia.org/T132965#2215521 (Nuria) @dr0ptp4kt : yes, varnish code publishes appropriate headers to x_analytics field, I confirmed those are present in cluster data.
[16:28:28] elukey: I trust you'll tame them, then everything will be fine :)
[16:28:56] Analytics: Pageview API: Limit (and document) size of data you can request - https://phabricator.wikimedia.org/T134524#2289654 (GWicke) > In current requests patterns, ~80% of requests are for fresh data - (end date either today or yesterday) From what I have seen, there is still a significant spread of val...
[16:31:01] nice madhuvishy!
[16:36:08] Analytics: Test cassandra compactions on new AQS nodes - https://phabricator.wikimedia.org/T135145#2289668 (JAllemandou)
[16:39:54] ottomata: :D Also, I assigned https://phabricator.wikimedia.org/T132177 to you
[16:41:08] k
[16:41:54] Analytics-Kanban, EventBus, Patch-For-Review: Propose evolution of Mediawiki EventBus schemas to match needed data for Analytics need - https://phabricator.wikimedia.org/T134502#2289699 (Nuria) CR in progress, use code to talk, no need to task.
[16:43:41] mobrovac: about my code review - aqs::seeds: needs to contain the cassandra instance domains or the host domains?
[16:43:51] I thought the latter
[16:44:12] cass instances too
[16:44:18] ah ok!
[16:44:24] those are used by the driver to connect to the cass nodes directly
[16:44:39] Analytics: Create edit data schemas for anaylitcs - https://phabricator.wikimedia.org/T134793#2289728 (mforns)
[16:45:22] Analytics: Create edit data hadoop/druid schemas for anaylitcs - https://phabricator.wikimedia.org/T134793#2277198 (mforns)
[16:45:34] Analytics: Create edit data hadoop/druid schemas for anaylitcs - https://phabricator.wikimedia.org/T134793#2277198 (Nuria) Build a schema that would help us do analytics. How do we represent that data into druid? and before that, how do we represent that data into hadoop?
[16:45:36] mobrovac: fixed the code review thanks
[16:46:50] mobrovac: my main concern was not to affect the main AQS cluster with this config but to re-use the aqs role for testing
[16:47:44] oh elukey, one more thing i forgot to write on the PS
[16:47:54] you need to change the cassandra cluster name as well
[16:48:10] ahhh nice I forgot to ask, I thought it was confd stuff
[16:48:10] i believe it's cassandra::cluster_name or some obvious var name like that
[16:48:17] sure sure
[16:48:56] kk, i'm off
[16:48:59] time to go bowling
[16:49:10] good bowling :)
[16:49:13] :)
[16:49:22] elukey: you're becoming french!
[16:49:33] too much joal influence :))))))
[16:59:12] Analytics: Wikistats 2.0. Edit Reports: Setting up a pipeline to source Historical Edit Data into hdfs {lama} - https://phabricator.wikimedia.org/T130256#2289828 (Nuria) 1. DESIGN: 1.1 First team needs to internally define schemas that are to be used to calculate metrics. These are not event-based schema bu...
[16:59:29] Analytics: Create edit data hadoop/druid schemas for anaylitcs - https://phabricator.wikimedia.org/T134793#2289833 (Nuria)
[16:59:31] Analytics: Wikistats 2.0. Edit Reports: Setting up a pipeline to source Historical Edit Data into hdfs {lama} - https://phabricator.wikimedia.org/T130256#2289832 (Nuria)
[16:59:36] Analytics-Kanban: Wikistats 2.0. Edit Reports: Setting up a pipeline to source Historical Edit Data into hdfs {lama} - https://phabricator.wikimedia.org/T130256#2289834 (JAllemandou)
[17:02:05] Analytics: Create edit data hadoop/druid schemas for anaylitcs - https://phabricator.wikimedia.org/T134793#2289854 (Nuria) This task is about schema Design. 1.1 First team needs to internally define schemas that are to be used to calculate metrics. These are not event-based schema but data flowing in them c...
[17:04:23] Analytics: Spike - Slowly Changing Dimensions on Druid - https://phabricator.wikimedia.org/T134792#2289857 (Nuria)
[17:07:35] Analytics: Spike - Slowly Changing Dimensions on Druid - https://phabricator.wikimedia.org/T134792#2289863 (Nuria)
[17:13:47] Hi ATeam! :D
[17:14:31] Will this user agent definitely get marked as a spider? "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)" and is there a way to check if a request has been marked as a spider in hadoop?
[17:18:20] Analytics: Create edit data hadoop/druid schemas for anaylitcs - https://phabricator.wikimedia.org/T134793#2289898 (Nuria) There are three entities: Page, User and Revision.
[17:25:04] Analytics: Test cassandra compactions on new AQS nodes - https://phabricator.wikimedia.org/T135145#2289668 (Nuria) - Basic puppet configuration /deploy / - load data into cassandra -test (via restbase hopefully)
[17:25:37] Analytics: Test cassandra compactions on new AQS nodes - https://phabricator.wikimedia.org/T135145#2289913 (Nuria)
[17:25:39] Analytics-Kanban, Datasets-Webstatscollector, RESTBase-Cassandra, Patch-For-Review: Better response times on AQS (Pageview API mostly) {melc} - https://phabricator.wikimedia.org/T124314#2289912 (Nuria)
[17:37:16] addshore: Hi!
[17:37:27] addshore: yes there is definitely a way to check
[17:37:40] * madhuvishy looks up the udf
[17:38:59] :)
[17:41:40] Analytics: Test cassandra compactions on new AQS nodes - https://phabricator.wikimedia.org/T135145#2290040 (Nuria) Puppet is almost done, tasked assuming that metrics show up on grafana, otherwise we need more work
[17:42:01] addshore: this is the regex that is being matched https://github.com/wikimedia/analytics-refinery-source/blob/0203ffc79f9ba967d26a73ba1012a97383199296/refinery-core/src/main/java/org/wikimedia/analytics/refinery/core/Webrequest.java#L58
[17:42:08] in your case
[17:42:22] it looks like Yahoo will definitely get it marked
[17:42:25] so that should match
[17:42:28] interesting
[17:43:32] basically Wikidata got 87k hits to Special:RecentChangesLinked and I am trying to track down why!
[17:43:43] yesterday and the day before at least!
[17:46:52] and a quick query on hive showed the yahoo UA as top of the list (but my query could be bad) ;)
[17:47:43] Analytics-Kanban: Test cassandra compactions on new AQS nodes - https://phabricator.wikimedia.org/T135145#2290074 (JAllemandou)
[17:47:52] Analytics-Kanban: Spike - Slowly Changing Dimensions on Druid - https://phabricator.wikimedia.org/T134792#2290077 (JAllemandou)
[17:48:02] you can also verify any user agent with the UDF -
[17:48:06] Analytics-Kanban: Create edit data hadoop/druid schemas for anaylitcs - https://phabricator.wikimedia.org/T134793#2290078 (JAllemandou)
[17:48:29] https://www.irccloud.com/pastebin/5miglWgd/
[17:48:42] addshore: aah
[17:48:58] addshore: and it wasn't marked spider?
[17:48:59] madhuvishy: Hiiii :)
[17:49:04] joal: hello :D
[17:49:44] madhuvishy: Quick phab question for you https://phabricator.wikimedia.org/T130123, is currently in "To Task", but I think it's the one you almost finished, no ?
[17:50:03] joal: I just discovered that ADD JAR doesn't seem to work on beeline - need to look into it at some point
[17:50:10] joal: oh that's different
[17:50:59] this is for jenkins to actually upload the jars to our stat box at /srv/deployment/analytics/refinery
[17:51:01] i think
[17:51:12] madhuvishy: wooooow, the ADD JAR thing makes me feel bad !
[17:51:20] oh, ok madhu !
[17:51:32] Leaving it in "To Task" then :)
[17:51:36] Thanks
[17:51:51] madhuvishy: how do I see that? ;)
[17:52:03] joal: i think there are two parts of the jenkins project - make it deploy to archiva, then make it actually upload the jars to where we use it
[17:52:10] oh wait, now I see the paste...
[17:52:36] addshore: uhhh, how do you see if tagged as spider? also select the agent-type column
[17:52:52] agent-type is in webrequest? :D
[17:52:57] addshore: yes yes
[17:53:01] cool ;)
[17:53:12] I should have gone and found the schema again I guess and taken a look!
[17:53:17] https://wikitech.wikimedia.org/wiki/Analytics/Data/Webrequest
[17:53:20] agent_type
[17:53:56] madhuvishy: My guess about the ADD JAR thing is that you need to add an HDFS jar, or that the jar should be accessible on the hive server <-- ottomata
[17:53:57] running again, let's see
[17:54:29] use the udf if you want to find if any arbitrary UA string will be marked as spider or not. otherwise, the agent_type column should be enough
[17:55:02] joal: ah yeah maybe
[17:56:35] hmm madhuvishy yeh it says spider
[17:57:24] does 'desktop' include spiders in the top pageviews api endpoint?
[17:57:57] right now I'm guessing that is the case
[17:59:14] addshore: I'd think so. joal can confirm. I guess there is no agent-type split for that endpoint
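(Editor's sketch: the check addshore is after, in miniature. The authoritative pattern is the Java regex in Webrequest.java linked above, applied by a refinery Hive UDF and materialized as the agent_type column; the regex below is a deliberately simplified stand-in for illustration, NOT the production one.)

```python
import re

# Simplified stand-in for the refinery spider pattern (not the real regex)
SPIDER_RE = re.compile(r"(bot|spider|crawler|slurp|wordpress)", re.IGNORECASE)

def agent_type(user_agent: str) -> str:
    return "spider" if SPIDER_RE.search(user_agent) else "user"

ua = "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"
print(agent_type(ua))  # -> spider, matching what the webrequest data shows
```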
[17:59:55] addshore, madhuvishy : top endpoint of API doesn't contain spiders
[18:00:03] oh interesting
[18:00:11] No agent split, but user only data
[18:00:26] then I'm very confused as to where these 87k views have come from ;)
[18:00:29] thanks joal
[18:00:52] addshore: when looking into webrequest you can have access to referer
[18:00:52] I guess I should dive into the page view tables!
[18:01:06] but that's a huge scan
[18:01:12] addshore: maybe query for top UA's that are not agent_type spider from webrequest?
[18:01:30] addshore: My way of doing it: pinpoint an hour when it occurs, then only query that hour
[18:01:35] madhuvishy: well in 1 hour the top user agent is yahoo with 4.5k, the next down is 675 then 165 and then 7
[18:01:44] actually, top UA's that are not spider won't give you anything
[18:01:51] right
[18:02:05] referrer makes more sense
[18:02:49] ottomata: Do we have the researcher meeting ?
[18:04:18] coming!
[18:04:48] Analytics: Pageview API: Limit (and document) size of data you can request - https://phabricator.wikimedia.org/T134524#2290184 (Nuria) >Introducing fixed windows (single day, single month) could eliminate this fragmentation. But wait, at the cost of reducing functionality right? as it is not the same to ask...
[18:06:17] going offline, byyeee!
[18:06:22] Bye
[18:06:38] bye elukey!
[18:10:46] reports in #-operations that /api/rest_v1/metrics/pageviews/ is down
[18:10:59] bd808: ah!
[18:11:19] a-team: ^^
[18:11:32] milimetric: ^
[18:11:48] ottomata: my internet is somehow not working with Hangout.
[18:11:49] it may not be your fault, but help debugging would be useful
[18:11:52] Are y'all in the meeting?
[18:11:55] hmmm
[18:11:58] ya we here
[18:12:12] (CR) Nuria: Fix unique devices bugs (1 comment) [analytics/dashiki] - https://gerrit.wikimedia.org/r/288104 (https://phabricator.wikimedia.org/T122533) (owner: Mforns)
[18:14:11] ottomata: milimetric it looks like pageview api is down
[18:17:30] madhuvishy: why?
[18:18:08] joal: They are talking about it in #ops
[18:20:20] so joal madhuvishy I just went and looked at all of the data on one of the days and grouped by agent_type
[18:20:29] 299 users and 123k spiders
[18:20:44] something is up ;) but must dash to dinner now! back in a bit!
[18:28:02] a-team, I'm off for today!
[18:29:03] bye joal.
[18:32:56] back!
[18:34:47] madhuvishy: I'm guessing I should file a bug? ;)
[18:35:03] addshore: what's up
[18:35:26] see my messages just above! :)
[18:35:38] only 299 users loaded the page and 123k spiders
[18:35:59] addshore: 123k requests from spiders?
[18:36:01] oh
[18:36:03] yup
[18:36:18] and it shows up in the top endpoint with 73k views you say?
[18:36:39] uhhh 87k
[18:36:44] (CR) Mforns: Fix unique devices bugs (1 comment) [analytics/dashiki] - https://gerrit.wikimedia.org/r/288104 (https://phabricator.wikimedia.org/T122533) (owner: Mforns)
[18:36:53] https://wikimedia.org/api/rest_v1/metrics/pageviews/top/wikidata/all-access/2016/05/10
[18:37:23] yup, (my query was the 10th)
[18:38:20] so it lists 87k via the api, when looking at webrequest I get 299 users and 123k spiders, matching with /wiki/Special:RecentChangesLinked% on wikidata only
[18:38:34] addshore: interesting.
[18:38:51] addshore: is the per-article endpoint giving the right thing?
*looks*
[18:40:11] {"items":[{"project":"wikidata","article":"Special:RecentChangesLinked","granularity":"daily","timestamp":"2016051000","access":"all-access","agent":"all-agents","views":91400}]}
[18:40:26] https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/wikidata/all-access/spider/Special%3ARecentChangesLinked/daily/20160510/20160510
[18:40:33] {"items":[{"project":"wikidata","article":"Special:RecentChangesLinked","granularity":"daily","timestamp":"2016051000","access":"all-access","agent":"spider","views":72}]}
[18:40:37] spider only shows 72 :P
[18:41:07] {"items":[{"project":"wikidata","article":"Special:RecentChangesLinked","granularity":"daily","timestamp":"2016051000","access":"all-access","agent":"bot","views":0}]}
[18:41:16] bot is 0, thus the rest are 'users' apparently ;)
[18:41:47] addshore: so that seems wrong too
[18:42:02] yup
[18:42:10] I'll file a bug :)
[18:42:17] addshore: okay :) thanks!
[18:43:52] addshore: can you paste the query you ran against webrequest?
[18:44:35] yup
[18:46:29] Analytics, Pageviews-API, Wikidata: Pageview API not categorizing spiders correctly - https://phabricator.wikimedia.org/T135164#2290291 (Addshore)
[18:46:31] madhuvishy: https://phabricator.wikimedia.org/T135164
[18:46:35] heh, forgot about the bot :p
[18:47:39] Analytics, Pageviews-API, Wikidata: Pageview API not reporting spiders correctly - https://phabricator.wikimedia.org/T135164#2290291 (Addshore)
[18:47:59] elukey: did you say you looked at pykafka code and saw a place where an exception was uncaught that was causing it to fail during broker restart?
[18:57:35] mforns: teh bug we have now I do not think is in teh new code
[18:57:38] *the
[18:57:47] mforns: rather it was there before
[18:58:29] mforns: and the graph needs to clear plots before data comes back, needs to do it once new metricis selected
[19:03:32] mforns: "once new metric is selected"
[19:06:40] nuria_, I agree the bug was there before, but we should fix it now, because before we only had 1 breakdownable metric, now we have 3
[19:06:48] and the bug can be seen
[19:06:50] nuria_: Just had a quick look, and indeed there is a (bug | definition change) in pageview
[19:07:20] mforns: yes
[19:07:28] joal: i was looking too
[19:07:35] joal: we are missing all restbase pageviews
[19:07:58] but code on isApppageview seems correct unit test wise so chnage must be elsewhere
[19:07:59] mforns: but I think what happens is: the data for the new metric gets requested before the breakdown is reset
[19:08:00] *change
[19:08:32] mforns: mmm, i do not think so but i could be totally off
[19:08:51] nuria_: problem comes from pageview_definition
[19:08:56] nope, app_pageview
[19:09:09] PROBLEM - Check status of defined EventLogging jobs on eventlog1001 is CRITICAL: CRITICAL: Stopped EventLogging jobs: processor/client-side-11 processor/client-side-10 processor/client-side-09 processor/client-side-08 processor/client-side-07 processor/client-side-06 processor/client-side-05 processor/client-side-04 processor/client-side-03 processor/client-side-02 processor/client-side-01 processor/client-side-00 forwarder/legacy-zmq
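(Editor's sketch: the per-article endpoint spot-checks above, scripted for repeatability. The URL is exactly the one pasted in the log; the use of the requests library and the 'user' agent value — inferred from "the rest are 'users'" — are the only assumptions.)

```python
import requests

URL = ("https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/"
       "wikidata/all-access/{agent}/Special%3ARecentChangesLinked/daily/20160510/20160510")

for agent in ("all-agents", "user", "spider", "bot"):
    items = requests.get(URL.format(agent=agent)).json().get("items", [])
    views = items[0]["views"] if items else 0
    print(agent, views)  # values in the log: all-agents=91400, spider=72, bot=0
```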
[19:09:27] logging off for dinner, will be back after
[19:10:27] joal, bye!
[19:11:54] see that
[19:12:38] hm, yeah i must have restarted eventlogging after that broker restart at the wrong time. starting it back up seems to work fine
[19:13:19] RECOVERY - Check status of defined EventLogging jobs on eventlog1001 is OK: OK: All defined EventLogging jobs are running.
[19:16:34] ok
[19:17:11] milimetric: around?
[19:18:40] yes, but in a meeting, sorry
[19:19:30] milimetric: np - i was just wondering how to connect to cassandra to query it from aqs. feel free to reply whenever you are free
[19:21:19] PROBLEM - Difference between raw and validated EventLogging overall message rates on graphite1001 is CRITICAL: CRITICAL: 53.33% of data above the critical threshold [30.0]
[19:21:42] ^ s'ok should go away in a sec
[19:22:52] joal: submitting unit test and bugfix
[19:38:19] RECOVERY - Difference between raw and validated EventLogging overall message rates on graphite1001 is OK: OK: Less than 20.00% above the threshold [20.0]
[19:40:07] (PS1) Nuria: Match paths on request only if it is a web pageview [analytics/refinery/source] - https://gerrit.wikimedia.org/r/288458
[19:41:29] Analytics-Kanban: Pageview definition bug for apps pageviews on rest endpoint - https://phabricator.wikimedia.org/T135168#2290476 (Nuria)
[19:41:55] (PS2) Nuria: Match paths on request only if it is a web pageview [analytics/refinery/source] - https://gerrit.wikimedia.org/r/288458 (https://phabricator.wikimedia.org/T135168)
[19:43:00] (PS3) Nuria: Match paths on request only if it is a web pageview [analytics/refinery/source] - https://gerrit.wikimedia.org/r/288458 (https://phabricator.wikimedia.org/T135168)
[19:45:47] dr0ptp4kt: re https://phabricator.wikimedia.org/T131315#2290387 , what is the default sampling rate?
[19:54:05] nuria_, I found what the problem is
[19:54:13] nuria_, but not sure how to fix it
[19:54:18] mforns: aha
[19:54:25] mforns: we can talk in batcave
[19:54:31] nuria_, was going to ask that
[19:54:33] omw
[19:58:21] nuria_: I investigated a bit more on pageviews
[19:58:27] will join you in the batcave
[19:58:34] joal: did you see my patch?
[19:58:39] noper
[19:59:28] nuria_: joining batcave
[20:15:32] Analytics, Pageviews-API, Wikidata: Pageview API not reporting spiders correctly - https://phabricator.wikimedia.org/T135164#2290582 (JAllemandou) Results look correct to me with that query: ``` SELECT agent_type, count(1) as count FROM webrequest WHERE year = 2016 AND month = 5 AND d...
[20:16:17] Analytics-Kanban, Operations, ops-eqiad, Patch-For-Review: rack/setup/deploy aqs100[456] - https://phabricator.wikimedia.org/T133785#2290595 (RobH) Issue: Post jessie install, system states booting off C, and then fails to boot anything. Troubleshooting done so far: * compared all bios settings...
[20:29:16] (PS4) Nuria: Match paths on request only if it is a web pageview [analytics/refinery/source] - https://gerrit.wikimedia.org/r/288458 (https://phabricator.wikimedia.org/T135168)
[20:33:44] madhuvishy: I'm back and I got an update that everything's more or less ok, if you still wanna poke around AQS we can do it together?
[20:35:22] HaeB: we found the bug in the pageview definition, small detail is that we also need code that parses restbase urls if we want to have page titles on pageview_hourly, as those urls: "/api/rest_v1/page/mobile-sections-lead/Last_of_the_Summer_Wine"
[20:35:32] are different from the php api ones
[20:35:47] HaeB: joal and I would work on this tomorrow
[20:36:30] nuria: awesome
[20:37:02] is there a way to test if such bug fixes have the intended effect of restoring pageviews to plausible numbers?
[20:45:06] ottomata or someone else: I get an error in labs when I try to log to a new schema (it looks like it has trouble making the kafka topic maybe? I'll paste)
[20:45:21] Analytics-Kanban, Operations, ops-eqiad, Patch-For-Review: rack/setup/deploy aqs100[456] - https://phabricator.wikimedia.org/T133785#2290664 (RobH) full log for the aqs1006 install: P3061 full log for the aqs1005 install: P3062
[20:45:43] https://www.irccloud.com/pastebin/zPMU2wb3/
[20:52:44] Analytics-Wikistats: Cross-link stats.wikimedia.org and ee-dashboard.wmflabs.org - https://phabricator.wikimedia.org/T67994#2290684 (Jdforrester-WMF) Open>declined We're killing ee-dashboard (and have been for three years); let's not add more cruft and links to it.
[20:52:46] Analytics-Wikistats, Tracking: Increase WikiStats reports discoverability - https://phabricator.wikimedia.org/T67991#2290686 (Jdforrester-WMF)
[20:53:50] Analytics, Editing-Analysis: Move contents of ee-dashboards to edit-analysis.wmflabs.org - https://phabricator.wikimedia.org/T135174#2290688 (Jdforrester-WMF)
[20:54:05] milimetric: hey
[20:54:10] hey
[20:54:22] uhhh i think it's okay - i wanted to query some stuff for this bug addshore filed
[20:54:31] https://phabricator.wikimedia.org/T135164
[20:54:48] unrelated to pageview api giving 500s
[20:54:48] Hey, I put T135174 in #Analytics because I didn't know if there was a more appropriate place to put it; please move it if there's a better one. :-)
[20:54:48] T135174: Move contents of ee-dashboards to edit-analysis.wmflabs.org - https://phabricator.wikimedia.org/T135174
[20:55:52] James_F: that's good, that's just to ping us, we can file it and triage and prioritize and all that once it's in #Analytics
[20:56:15] madhuvishy: ok, so did you log in to aqs and if not you wanna?
[20:56:27] milimetric: Thanks.
[20:56:43] ok, cool, any problems querying cassandra?
[20:57:36] milimetric: no I just do not know how to get the cqlsh shell - it looks like it expects a connection string - and i don't know what that is :)
[20:57:49] oh! sure, one sec
[20:59:22] (CR) Mforns: [C: 2 V: 2] "LGTM!" [analytics/reportupdater] - https://gerrit.wikimedia.org/r/288133 (https://phabricator.wikimedia.org/T134950) (owner: Neil P. Quinn-WMF)
[20:59:42] madhuvishy: when you're done... any idea why beta EL is failing with that error message I pasted above ^^? :)
[21:00:35] milimetric: uhhhh i think my irccloud didn't pick up your paste, could you paste again?
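(Editor's sketch: one way to answer the cqlsh question from [20:57:36] without hunting down a connection string — point the Python cassandra-driver at a node. Hedged: the contact host, the credentials, and the system table queried (the Cassandra 2.x location) are all assumptions.)

```python
from cassandra.cluster import Cluster
from cassandra.auth import PlainTextAuthProvider

cluster = Cluster(
    contact_points=["aqs1001.eqiad.wmnet"],  # placeholder AQS node
    auth_provider=PlainTextAuthProvider(username="cassandra", password="..."),  # placeholder
)
session = cluster.connect()
# list keyspaces (system.schema_keyspaces is where Cassandra 2.x keeps them)
for row in session.execute("SELECT keyspace_name FROM system.schema_keyspaces"):
    print(row.keyspace_name)
```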
[21:01:59] https://www.irccloud.com/pastebin/zPMU2wb3/
[21:02:55] Analytics, DBA, Editing-Analysis, Patch-For-Review: Reportupdater does not commit changes after each query - https://phabricator.wikimedia.org/T134950#2290755 (mforns) @Neil_P._Quinn_WMF @jcrespo I did some testing against the DB and looks good! I merged the patch. In a while puppet will automati...
[21:07:31] madhuvishy: did you get that last one? is irccloud just failing :)
[21:07:40] milimetric: yes! looking
[21:08:15] oh good, I can give more details if you need
[21:08:20] milimetric: hmmm topic creation timed out?
[21:08:42] looks like a new schema
[21:08:44] interesting
[21:08:52] may be something for kafka 0.9
[21:09:16] ehh?
[21:09:49] ottomata: i don't know - does it look like auto creation of topic failed for some reason?
[21:09:59] do you have a ping for kafka 0.9 setup ;)
[21:10:10] for kafka ya :)
[21:10:20] ha ha
[21:10:31] it's just in beta?
[21:10:48] yes
[21:11:02] well, I donno, could be in prod too, it'd be hard to know until someone made a new schema
[21:14:30] i can reproduce
[21:14:32] on it...
[21:15:25] ottomata: I was doing this on labs if you want an easy way to repro:
[21:15:26] curl -k https://bits.beta.wmflabs.org/beacon/event?%7B%22schema%22%3A%22DiacriticsVisibility%22%2C%22revision%22%3A15594725%2C%22wiki%22%3A%22metawiki%22%2C%22event%22%3A%7B%22issues%22%3A1%7D%7D
[21:19:54] can verify, our version of kafka-python's ensure_topic_exists does not work with kafka 0.9
[21:20:58] aah
[21:21:24] never tested new schema! ah!
[21:21:26] :)
[21:21:35] HMM
[21:22:20] ottomata: so it's just the ensure call
[21:22:30] not the actual topic creation
[21:22:52] can we be like try: create topic; except: go have fun!!
[21:25:17] madhuvishy: i haven't been able to create a topic using our version of kafka-python yet
[21:25:26] ottomata: oh that too
[21:25:27] i think we might need to upgrade kafka-python
[21:25:28] hmmm
[21:25:30] hmm
[21:25:33] oh, we do have pykafka already
[21:25:36] oh
[21:25:39] i bet i can ensure it exists with that instead..hmm
[21:25:50] but then i'd have to instantiate a client with both libs in the kafka writer handler
[21:25:50] hm
[21:25:57] upgrading kafka-python might be better
[21:26:01] but could be a little unsafe to do right now.
[21:26:02] hm
[21:26:03] ottomata: ha ha - ensure and create with pykafka - produce with kafka-python?
[21:26:10] Analytics, Pageviews-API: API incorrectly complains about missing data instead of wrong wiki name - https://phabricator.wikimedia.org/T134926#2290856 (Ijon) (or indeed, if the API was case insensitive re wiki names.)
[21:26:15] that sounds foolproof
[21:26:16] :D
[21:26:40] haha
[21:26:42] oof
[21:26:43] we should revisit producing with pykafka maybe
[21:26:43] hm
[21:26:52] naw, they still don't have dynamic topic support
[21:26:54] i've been tracking that
[21:26:56] ohh
[21:26:59] hmmm
[21:27:01] but, kafka-python has improved a LOT
[21:27:05] someone new took it on
[21:27:07] and it is active again
[21:27:11] oh cool
[21:27:41] HMM
[21:28:03] ok madhuvishy, milimetric, i was about to quit for the day. also i think it would be dangerous to do anything big atm
[21:28:20] new schemas are not created often...how about I manually create this topic in beta for now
[21:28:23] send an email to analytics list
[21:28:39] that new schemas are broken until we fix
[21:28:41] but we will fix asap
[21:28:44] e.g. tomorrow :)
[21:28:49] ottomata: yeah - me too - if not i can manually check out the latest kafka-python version and test it
[21:29:02] madhuvishy: if you have some hours today and want to do that, that would be greatly appreciated
[21:29:34] ottomata: makes sense. Okay i'll try then
[21:29:39] if all goes well i could then fix asap tomorrow morn...you could even install it in beta to have it running
[21:29:39] ?
[21:29:42] "mozilla [en] egranary digital library system" 91297
[21:29:43] Works for me, the beta thing is not urgent don't worry
[21:29:53] looks like that's the UA that's thrown this number off
[21:29:58] ok, thanks madhuvishy, send me an email with how it goes so I know where to pick up in the morning
[21:30:03] i'm going to write an email to analytics public list now
[21:30:06] ottomata: cool
[21:30:11] addshore: oh
[21:30:25] and that was counted as user?
[21:30:31] yup
[21:30:46] * madhuvishy adds library to ua parser regex
[21:31:01] but my initial query to webrequest didn't cover those requests apparently!
[21:31:29] addshore: what was the change?
[21:31:56] See https://phabricator.wikimedia.org/T135164 AND is_pageview AND uri_host LIKE "%wikidata.org"
[21:32:05] addshore: aah
[21:32:21] I guess all of the things I found don't even show as page views by spiders then!
[21:32:34] addshore: hmmm i queried pageview_hourly with project='wikidata'
[21:32:47] numbers were similar but i didn't check uas
[21:33:16] https://www.irccloud.com/pastebin/0saGiosf/
[21:33:35] ok, email sent, laters yall!
[21:33:43] ottomata: byeee
[21:33:48] Analytics, Pageviews-API, Wikidata: Pageview API not reporting spiders correctly - https://phabricator.wikimedia.org/T135164#2290869 (Addshore) Okay, so none of the spiders in my first request are actually evaluated as pageviews? It looks like the UA throwing the data off here is not included in the...
[21:34:38] I'm now slightly confused where spiders are counted as page views and where they are not
[21:34:50] Analytics-Tech-community-metrics, Developer-Relations, Community-Tech-Sprint: Investigation: Can we find a new search API for CorenSearchBot and Copyvio Detector tool? - https://phabricator.wikimedia.org/T125459#2290870 (kaldari) @Earwig: It looks like everyone thinks the Bing workaround is a good id...
[21:35:21] addshore: pageview tagging is independent of agent type
[21:35:30] yeh madhuvishy your query returns 376 users and 130398 spiders
[21:36:00] addshore: ya so i'm confused now - is my project wrong?
[21:36:09] no, I think that's right
[21:36:18] hmmm
[21:36:28] but the other request in the ticket did AND pageview_info['page_title'] = "Special:RecentChangesLinked"
[21:36:38] I did AND uri_path LIKE "/wiki/Special:RecentChangesLinked%"
[21:36:43] and you did and page_title like '%Special:RecentChangesLinked%'
[21:36:58] hmmmm
[21:37:20] Analytics-Kanban, Operations, ops-eqiad, Patch-For-Review: rack/setup/deploy aqs100[456] - https://phabricator.wikimedia.org/T133785#2290875 (RobH) @papaul states he installed aqs1004 without that error, but it has the cannot boot issue: Attempting Boot From Hard Drive (C:) after post and then...
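(Editor's sketch: the "try: create topic; except: go have fun" idea from the discussion above, written with the KafkaAdminClient that later kafka-python releases grew — it did not exist in the 2016 client whose ensure_topic_exists broke against Kafka 0.9. The broker address and the eventlogging_<Schema> topic naming are assumptions.)

```python
from kafka.admin import KafkaAdminClient, NewTopic
from kafka.errors import TopicAlreadyExistsError

admin = KafkaAdminClient(bootstrap_servers="deployment-kafka01:9092")  # placeholder
try:
    admin.create_topics([NewTopic(name="eventlogging_DiacriticsVisibility",
                                  num_partitions=1, replication_factor=1)])
except TopicAlreadyExistsError:
    pass  # already there (or auto-created): nothing to do
```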
[21:38:27] switching your query on pageview_hourly to and page_title = "Special:RecentChangesLinked" then returns 52 users and 11 spiders :P
[21:39:01] addshore: uhhh my brain isn't working now
[21:39:35] My brain is thinking the same ;)
[21:39:43] just doing 1 last query ;)
[21:40:53] ahh so our queries matched things like Special:RecentChangesLinked/Q13215 which is tracked as a page view on a different page
[21:41:14] also hence the thousands of extra spiders there then
[21:42:07] addshore: aah
[21:42:34] updated my comment and I'll retask the ticket to list that UA as a spider
[21:43:35] addshore: cool thanks
[21:43:37] Analytics, Pageviews-API, Wikidata: "egranary digital library system" UA should be listed as a spider - https://phabricator.wikimedia.org/T135164#2290887 (Addshore)
[21:44:02] whatever that script UA thing is doing is crazy :P
[21:44:42] it feels like it is polling the page every second....
[21:45:12] and the page doesn't even show anything... https://www.wikidata.org/wiki/Special:RecentChangesLinked
[21:45:25] heh
[22:45:32] Quarry, Labs, Labs-Infrastructure: Long-running Quarry query (querry?) produces strangely incorrect results - https://phabricator.wikimedia.org/T135087#2287871 (Krenair) I looked through the history of that query in the database, and have no good explanation for this.
[22:47:17] Analytics-Kanban, Operations, ops-eqiad, Patch-For-Review: rack/setup/deploy aqs100[456] - https://phabricator.wikimedia.org/T133785#2291168 (RobH) So @dzahn was able to work around the dependency issue, I've asked him to put an update, but I'll attempt to paraphrase from irc: > when the insta...
[22:47:36] Analytics-Kanban, Operations, ops-eqiad, Patch-For-Review: rack/setup/deploy aqs100[456] - https://phabricator.wikimedia.org/T133785#2291170 (RobH) Additionally, they still have the error of: Attempting Boot From Hard Drive (C:) When they should boot up the OS.
[22:52:34] bye team, c u tomorrow!
[23:00:52] Quarry, Easy: Display time taken to execute a query - https://phabricator.wikimedia.org/T135189#2291243 (Matthewrbowker)
[23:03:00] Analytics-Kanban: Respawn the schema/field white-list for EL auto-purging {tick} - https://phabricator.wikimedia.org/T135190#2291256 (mforns)
[23:04:45] Analytics-Kanban: Notify all schema owners that the auto-purging is about to start {tick} - https://phabricator.wikimedia.org/T135191#2291279 (mforns)
[23:08:23] Analytics-Kanban, DBA: Set up auto-purging after 90 days {tick} - https://phabricator.wikimedia.org/T108850#2291305 (mforns) a:mforns>None I removed myself from the task, because I created another task in our kanban for that. Nevertheless, I am working right now in putting together that white-lis...
[23:08:40] Analytics-Kanban: Respawn the schema/field white-list for EL auto-purging {tick} - https://phabricator.wikimedia.org/T135190#2291312 (mforns)
[23:08:42] Analytics-Kanban, DBA: Set up auto-purging after 90 days {tick} - https://phabricator.wikimedia.org/T108850#2291311 (mforns)
[23:09:25] Analytics, DBA: Set up bucketization of editCount fields {tick} - https://phabricator.wikimedia.org/T108856#2291315 (mforns) @Nuria @jcrespo Shouldn't this be marked as done?
[23:10:36] Analytics, DBA: Set up auto-purging after 90 days {tick} - https://phabricator.wikimedia.org/T108850#1532166 (mforns)
[23:25:18] Analytics: Pageview API: Limit (and document) size of data you can request - https://phabricator.wikimedia.org/T134524#2291391 (GWicke) @MusikAnimal, the trade-off is between performance & client convenience. In either scheme, you can load the full data for any time frame. In the per-month scheme you'll have...
[23:40:57] Analytics-Kanban, Operations, ops-eqiad, Patch-For-Review: rack/setup/deploy aqs100[456] - https://phabricator.wikimedia.org/T133785#2291414 (Dzahn) Yep, so the "install software" / tasksel step of the installer failed and there were the "packages have unmet dependencies:" errors Rob pasted above...