[00:51:34] (PS6) MaxSem: Count pages with geo tags [analytics/discovery-stats] - https://gerrit.wikimedia.org/r/319260 (https://phabricator.wikimedia.org/T149722)
[04:42:47] (CR) Yurik: [C: 1] [analytics/discovery-stats] - https://gerrit.wikimedia.org/r/319260 (https://phabricator.wikimedia.org/T149722) (owner: MaxSem)
[04:48:18] (CR) Yurik: [C: 1] [analytics/discovery-stats] - https://gerrit.wikimedia.org/r/319262 (https://phabricator.wikimedia.org/T149722) (owner: MaxSem)
[05:24:38] (CR) Yurik: @nuria, I agree that it is better to use one language to generate these stats. Moreover, I think it is by far better to have one *system* to generate these stats. If we ever have such a system, I'm all for getting rid of this code, and simply keeping the SQL. But until we have it, we might as well use whatever the specific developers are more comfortable with. Learning a new language when we should have just one system i
[05:24:39] that efficient. [analytics/discovery-stats] - https://gerrit.wikimedia.org/r/319260 (https://phabricator.wikimedia.org/T149722) (owner: MaxSem)
[08:42:11] o/
[08:42:36] oozie is back complaining, I can see that it is again the crawler trying to get another page over and over...
[08:43:29] ok restarting jobs first
[08:45:08] !log started 0018549-161020124223818-oozie-oozi-C to re-run wf-upload-2016-11-4-2 -> wf-upload-2016-11-4-4
[08:45:09] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[08:45:58] !log started 0018557-161020124223818-oozie-oozi-C to re-run wf-upload-2016-11-4-6
[08:45:59] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[08:59:53] unrelated to this issue, now I finally found why the analytics chan has no color
[08:59:56] like the other ones
[09:00:01] we have the +c in the channel
[09:00:11] not really sure why
[09:06:52] and I have no idea how to become op to do a -c
[09:06:58] anybody knows?
[09:12:08] Analytics-Tech-community-metrics: OwlBot recently not updating user accounts DB - https://phabricator.wikimedia.org/T149984#2771013 (Aklapper)
[09:12:27] Analytics-Tech-community-metrics: OwlBot recently not updating user accounts DB - https://phabricator.wikimedia.org/T149984#2771013 (Aklapper) p:Triage>High
[09:14:42] Hi elukey
[09:14:49] Thanks for the oozie restarts
[09:15:15] I have no idea on how to have -c on the chan
[09:17:27] I'd need to know how to become an op in here
[09:17:31] super ignorant about IRC
[09:17:32] sigh
[09:17:34] :D
[09:17:59] elukey: I prefer you being an op in varnish + kafka + cassandra + oozie than IRC ;)
[09:18:19] elukey: And I probably forget some other things
[09:18:54] :D
[09:18:59] but I want colooorsss
[09:19:05] huhu
[09:19:09] the other channels are more joyful
[09:19:16] anyhow
[09:19:34] looking in beeline it seems that the oozie mess was in ulsfo
[09:20:43] all the times, the requests do not log return code, dt, etc., because the logs get truncated by Varnish
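For context on the !log entries above: the failed upload webrequest hours were re-run here by starting fresh coordinators (the 0018549-.../0018557-... IDs). Below is a minimal sketch of the more direct `oozie job -rerun` route, assuming the `oozie` CLI is on PATH and `OOZIE_URL` points at the cluster's Oozie server; the action range is illustrative only and is not what was actually run.

```python
#!/usr/bin/env python
"""Hypothetical helper: re-run failed Oozie coordinator actions.

Assumes the `oozie` CLI is installed and OOZIE_URL is exported.
The coordinator ID below is the one mentioned in the !log entry;
the action range is a guess for illustration only.
"""
import subprocess


def rerun_coordinator_actions(coord_id, actions):
    """Invoke `oozie job -rerun` for the given coordinator actions.

    `actions` is a string like "2-4" or "6" (Oozie accepts ranges and
    comma-separated lists of coordinator action numbers).
    """
    cmd = [
        "oozie", "job",
        "-rerun", coord_id,
        "-action", actions,
        "-refresh",  # re-resolve input dependencies before re-running
    ]
    subprocess.check_call(cmd)


if __name__ == "__main__":
    # e.g. re-run actions 2 through 4 of the upload webrequest coordinator
    rerun_coordinator_actions("0018549-161020124223818-oozie-oozi-C", "2-4")
```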
[09:21:14] the request seems to be almost the same from a crawler, and it generates a 404 if I try it manually
[09:21:27] BUT
[09:21:31] and this is the interesting part
[09:21:54] most of the time Varnish correctly logs it, like now that I am using varnishlog on cp4007
[09:22:02] (err 4006 sorry)
[09:22:14] but $sometimes it logs only a few VSL tags
[09:22:22] like I wrote in the ticket
[09:22:35] and I have really no idea about how to find a repro
[09:23:13] elukey: it feels like a race condition thing, but i don't know enough about vk to have an opinion
[09:24:00] I think that the problem is in Varnish this time
[09:24:15] because it doesn't log tags for a request when the $condition happens
[09:24:22] and we get the incomplete logs
[09:24:54] I tried to follow up with upstream and they told me to make a test, namely use the grouping "raw" with varnishlog to capture traffic
[09:25:05] elukey: What would be good would be to understand $condition, correct?
[09:25:10] but this means that I have to log ALL the tags for ALL the requests until the error occurs
[09:25:16] mwarf
[09:25:34] yeah if we got a reliable way to repro it would be awesome
[09:25:43] hm
[09:25:46] upstream's theory is that some request VSL Tags are misplaced
[09:25:50] due to a bug
[09:26:13] so say instead of ending up in the $request1 grouping they go somewhere else
[09:26:30] ending up in $request1 with missing information
[11:04:59] so I was able to dump a faulty request on cp4006 with grouping raw
[11:05:10] but... it doesn't give me a lot of data
[11:05:11] sigh
[11:50:28] Analytics, Operations: sync bohrium and apt.wikimedia.org piwik versions - https://phabricator.wikimedia.org/T149993#2771280 (akosiaris)
[12:04:16] mmmm one super weird thing that I found is that only on the faulty requests the "ReqURL" tag is logged two times
[12:04:28] I am going to eat something otherwise I'll go crazy
[12:04:33] and then I'll restart
[12:04:37] * elukey lunch
[12:59:42] Analytics, Operations: sync bohrium and apt.wikimedia.org piwik versions - https://phabricator.wikimedia.org/T149993#2771355 (MoritzMuehlenhoff) p:Triage>Normal
[13:03:30] hey joal
[13:03:37] Hi milimetric
[13:04:28] ready to crunch some numbers?
[13:04:37] milimetric: Yay
[13:04:43] uh... woah, that's a lot of oozie failures
[13:04:45] milimetric: I have started already ;)
[13:05:45] :( I gotta look into these oozie things, joal, it's my ops week
[13:06:02] milimetric: elukey solved that all even before I looked at my emails ;)
[13:06:27] oh... what was wrong, the varnish 4 upgrade?
[13:06:49] nope, some crawler thing (same as previous case)
[13:07:32] milimetric: elukey is looking into those with varnishkafka, but it seems very difficult to reproduce, and therefore even more to understand and fix
[13:07:43] ok, then, do you wanna hang out in the batcave?
[13:07:49] Yes !
[13:07:52] k, omw
[13:08:58] Analytics, Operations: sync bohrium and apt.wikimedia.org piwik versions - https://phabricator.wikimedia.org/T149993#2771360 (elukey) piwik_2.16.0-1_all.deb seems to be in @ori's home :P http://debian.piwik.org/ contains the latest debs, maybe we could upload it to third-party?
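The capture elukey describes above uses varnishlog with "raw" grouping, which writes every VSL record for every transaction. Below is a rough sketch of how such a dump could be sifted for the T148412 symptom (a Start timestamp with no Resp one, or ReqURL logged twice). The assumed record layout, `<vxid> <Tag> <marker> <payload>`, is an approximation of varnishlog's text output and may need adjusting; this is not the script actually used for the investigation.

```python
#!/usr/bin/env python
"""Rough sketch: scan a text dump from `varnishlog -g raw` for the
incomplete transactions discussed above.

Assumptions (not verified against the real capture):
  * each record line looks roughly like "<vxid> <Tag> <c|b|-> <payload>"
  * a healthy client transaction logs both "Timestamp ... Start:" and
    "Timestamp ... Resp:" and exactly one ReqURL
"""
import sys
from collections import defaultdict


def scan(lines):
    records_by_vxid = defaultdict(list)  # vxid -> list of (tag, payload)
    for line in lines:
        parts = line.split(None, 3)
        if len(parts) < 3 or not parts[0].isdigit():
            continue  # skip blank or unexpected lines
        vxid, tag = parts[0], parts[1]
        payload = parts[3] if len(parts) > 3 else ""
        records_by_vxid[vxid].append((tag, payload))

    for vxid, records in records_by_vxid.items():
        timestamps = [p for t, p in records if t == "Timestamp"]
        req_urls = [p for t, p in records if t == "ReqURL"]
        has_start = any(p.startswith("Start:") for p in timestamps)
        has_resp = any(p.startswith("Resp:") for p in timestamps)
        if has_start and not has_resp:
            print("vxid %s: Start timestamp but no Resp" % vxid)
        if len(req_urls) > 1:
            print("vxid %s: ReqURL logged %d times" % (vxid, len(req_urls)))


if __name__ == "__main__":
    with open(sys.argv[1]) as dump:
        scan(dump)
```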
[13:10:04] milimetric: still related to upload, I can't figure out what the bug is :(
[13:10:50] elukey: I didn't pay super good attention the first time, but I get how it all works from your explanation, if you want to brainbounce that's what the cave is here for
[13:11:58] I think it is something related to Varnish's internals, going to put some data in the task in a bit and then if you want you can give me your opinion.. I think it is a sneaky problem :/
[13:24:37] Analytics-Kanban, Operations, Traffic: Varnishlog with Start timestamp but no Resp one causing data consistency check alarms - https://phabricator.wikimedia.org/T148412#2771447 (elukey) The issue re-appeared again in upload today (early UTC morning time), all concentrated in `ulsfo`. I managed to cap...
[13:28:23] milimetric: --^
[13:28:25] this is the new part
[13:46:40] Analytics-Kanban, Operations, Traffic: Varnishlog with Start timestamp but no Resp one causing data consistency check alarms - https://phabricator.wikimedia.org/T148412#2771535 (elukey) The following request takes ages to complete on `cp400[67]` but it completes very quickly on `cp1099`: ``` curl "h...
[14:08:30] hi a-team!
[14:08:35] Hey mforns
[14:08:36] joal, milimetric, I'm here
[14:08:40] hey :]
[14:08:43] in da cave mforns :)
[14:08:45] ok
[15:01:16] joal, milimetric : holaaa, stadddupp
[15:30:03] (CR) Nuria: Bear in mind we have report updater for this very purpose, SQL queries run frequently. Perhaps this use case can be accommodated to use it. Did you look into that? [analytics/discovery-stats] - https://gerrit.wikimedia.org/r/319260 (https://phabricator.wikimedia.org/T149722) (owner: MaxSem)
[15:30:43] Analytics-Kanban: Run Standard metrics on denormalized history and compare with wikistats - https://phabricator.wikimedia.org/T150023#2771909 (Milimetric)
[15:31:14] joal: one thing that I wanted to ask was if webrequest_stats could be available via pivot
[15:31:23] for me it would be awesome :)
[15:31:27] Analytics-Kanban: Replacing standard edit metrics in dashiki with data from new edit data depot - https://phabricator.wikimedia.org/T143924#2771923 (Milimetric)
[15:31:28] Analytics-Kanban: Run Standard metrics on denormalized history and compare with wikistats - https://phabricator.wikimedia.org/T150023#2771922 (Milimetric)
[15:31:35] but it is very small data so not sure if it makes sense
[15:32:16] a-team: i just got an e-mail on victoria's behalf asking to come to 1 of out staff meetings
[15:32:35] a-team: *our staff meetings
[15:32:56] nuria: :)
[15:37:33] Analytics-Kanban: (subtask) Marcel's standard metrics - https://phabricator.wikimedia.org/T150024#2771937 (Milimetric)
[15:37:36] Analytics-Kanban: (subtask) Dan's standard metrics - https://phabricator.wikimedia.org/T150025#2771951 (Milimetric)
[15:39:18] Analytics-Kanban: (subtask) Dan's standard metrics - https://phabricator.wikimedia.org/T150025#2771951 (Milimetric)
[15:39:20] Analytics-Kanban: Compare early results of Wikistats 2.0 with Wikistats 1.0 - https://phabricator.wikimedia.org/T141536#2771983 (Milimetric)
[15:53:16] Analytics-Kanban: Totals doesn't show up on first page load in Vital Signs - https://phabricator.wikimedia.org/T150027#2772019 (Milimetric)
[15:55:24] Analytics-Kanban: (subtask) Dan's standard metrics - https://phabricator.wikimedia.org/T150025#2772034 (Milimetric)
[15:55:31] Analytics-Kanban: (subtask) Marcel's standard metrics - https://phabricator.wikimedia.org/T150024#2772036 (Milimetric)
[16:12:40] Analytics, Discovery, Discovery-Analysis, RfC: RFC: Requirements for analytics stats processor - https://phabricator.wikimedia.org/T150028#2772066 (Yurik)
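The T148412 comment above compares the same request against different cache hosts (slow on cp400[67], very fast on cp1099). The curl command is truncated in the log, so the sketch below uses placeholder values for the request path and Host header; it only illustrates timing the same request against each host directly, and is not the actual diagnostic that was run.

```python
#!/usr/bin/env python
"""Sketch: compare how long the same request takes on two cache hosts,
as in the T148412 comment above.

The real URL is truncated in the comment, so REQUEST_PATH and the Host
header below are placeholders; the idea is just to time the same
request against each backend host directly.
"""
import time
try:
    from urllib.request import Request, urlopen  # Python 3
except ImportError:
    from urllib2 import Request, urlopen         # Python 2

HOSTS = ["cp4006.ulsfo.wmnet", "cp4007.ulsfo.wmnet", "cp1099.eqiad.wmnet"]
HOST_HEADER = "upload.wikimedia.org"  # placeholder: the affected site
REQUEST_PATH = "/example/path"        # placeholder: the slow request's path


def time_request(host):
    req = Request("http://%s%s" % (host, REQUEST_PATH),
                  headers={"Host": HOST_HEADER})
    start = time.time()
    try:
        urlopen(req, timeout=60).read()
    except Exception as exc:  # 404s and timeouts are still timed
        print("%s raised %r" % (host, exc))
    return time.time() - start


if __name__ == "__main__":
    for host in HOSTS:
        print("%-22s %.2fs" % (host, time_request(host)))
```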
[16:15:21] Analytics, Discovery, Discovery-Analysis, RfC: RFC: Requirements for analytics stats processor - https://phabricator.wikimedia.org/T150028#2772080 (Yurik)
[16:20:45] Analytics, Discovery, Discovery-Analysis, RfC: RFC: Requirements for analytics stats processor - https://phabricator.wikimedia.org/T150028#2772082 (Nuria) As we talked about at this time we have two pipelines that can be of use: Data Source: webrequest We can use oozie+ spark+ scala to send dat...
[16:22:42] nuria, i put down some of my thoughts for what is needed, without any internal technology names we already have. Now we can fill in the blank of what system can do what, and how we can unify them together :)
[16:23:16] yurik: i already updated that ticket with the two systems we have to tackle the two data sources you mentioned
[16:25:15] yurik: let's please explore those options 1st
[16:29:28] nuria, to clarify - i'm not suggesting we create a new system :)
[16:29:45] only that we have a checklist against which to check the existing one, and see what tech is missing
[16:29:58] s/tech/functionality
[16:30:47] nuria, could you look through the checklist to see if all of it is there, or needs improvements?
[16:31:56] yurik: well, the list is pretty high level but that is fine, we have more detailed tickets for the areas we are working on
[16:32:11] yurik: please do take time to explore sending data to grafana via oozie/spark
[16:32:27] About TSV files - they are not very stable/versatile. They don't have a good way to aggregate history. I would much rather have a dedicated storage solution like what Grafana supports.
[16:32:35] yurik: one thing we would not be able to do is to accommodate every language/tool under the sun to process data
[16:32:44] of course not :)
[16:33:19] yurik: ya, we are already testing clickhouse and druid but that is ongoing work that will not be done immediately
[16:33:23] i think SQL/HIVE queries are a good minimal list. I know some have used R, but I don't know how critical that is
[16:34:03] of course, again - this is a plan for the future, purposefully high level so that if anyone starts looking for a solution, they know what techs to use from that list
[16:35:53] yurik: well, that ticket will work for us to keep track of thsi request. what tech to use should be documented in wikitech, and I think for the most part it is
[16:36:02] *this request
[16:36:19] (PS2) Nuria: Update PageviewDefinition fixing iOS bug [analytics/refinery/source] - https://gerrit.wikimedia.org/r/319374 (https://phabricator.wikimedia.org/T148663) (owner: Joal)
[16:36:27] Analytics, Discovery, Discovery-Analysis, RfC: RFC: Requirements for analytics stats processor - https://phabricator.wikimedia.org/T150028#2772101 (Yurik)
[16:37:05] Sure, the ticket is more for discussion, not for final documentation
[16:45:52] (PS3) Nuria: Update PageviewDefinition to better identify iOS Pageviews [analytics/refinery/source] - https://gerrit.wikimedia.org/r/319374 (https://phabricator.wikimedia.org/T148663) (owner: Joal)
[16:46:11] joal: i updated the iOS patch a bit, please take a look
[17:13:59] nuria: works for me !
[17:14:03] nuria: should I merge?
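On the idea discussed above of sending stats to Grafana (Graphite being its usual backend here), the simplest possible path is a query whose rows are written to Graphite's plaintext protocol, one "metric value timestamp" line per datapoint on port 2003. The sketch below is only a hedged illustration of that idea: the graphite host, metric prefix, and query are placeholders, and the pipelines being discussed (reportupdater, oozie/spark) do not work this way verbatim.

```python
#!/usr/bin/env python
"""Sketch: run a Hive query and push the result to Graphite.

Placeholders/assumptions:
  * the `hive` CLI is available and the query returns rows of
    "<metric_suffix>\t<value>" with no extra output in silent mode
  * GRAPHITE_HOST accepts the plaintext protocol on port 2003
"""
import socket
import subprocess
import time

GRAPHITE_HOST = "graphite.example.org"  # placeholder
GRAPHITE_PORT = 2003
METRIC_PREFIX = "discovery.stats"       # placeholder namespace

QUERY = """
SELECT uri_host, COUNT(*)
FROM wmf.webrequest
WHERE year = 2016 AND month = 11 AND day = 4
GROUP BY uri_host
LIMIT 10
"""  # placeholder query: any two-column (name, value) result works


def run_hive(query):
    out = subprocess.check_output(["hive", "-S", "-e", query])
    for line in out.decode("utf-8").splitlines():
        name, value = line.split("\t")
        yield name, float(value)


def send_to_graphite(metrics):
    now = int(time.time())
    payload = "".join(
        "%s.%s %f %d\n" % (METRIC_PREFIX, name.replace(".", "_"), value, now)
        for name, value in metrics
    )
    sock = socket.create_connection((GRAPHITE_HOST, GRAPHITE_PORT))
    try:
        sock.sendall(payload.encode("utf-8"))
    finally:
        sock.close()


if __name__ == "__main__":
    send_to_graphite(run_hive(QUERY))
```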
[17:35:20] joal: please
[17:35:30] (CR) Joal: [C: 2] Update PageviewDefinition to better identify iOS Pageviews [analytics/refinery/source] - https://gerrit.wikimedia.org/r/319374 (https://phabricator.wikimedia.org/T148663) (owner: Joal)
[17:35:31] Analytics-Kanban, Operations, hardware-requests: stat1001 replacement box in eqiad - https://phabricator.wikimedia.org/T149911#2772370 (RobH) a:mark I'd like to allocate spare pool system WMF4726 for this request. It has the following specs: * Dual Intel® Xeon® Processor E5-2623 V3 (3.0GHz/4C...
[17:36:02] (CR) Joal: [V: 2] Update PageviewDefinition to better identify iOS Pageviews [analytics/refinery/source] - https://gerrit.wikimedia.org/r/319374 (https://phabricator.wikimedia.org/T148663) (owner: Joal)
[17:36:13] (CR) MaxSem: "Does it support outputting to Graphite?" [analytics/discovery-stats] - https://gerrit.wikimedia.org/r/319260 (https://phabricator.wikimedia.org/T149722) (owner: MaxSem)
[18:10:27] Hey milimetric
[18:10:32] milimetric: still around?
[18:10:52] hey yea
[18:11:13] Was looking into the project_namespace_map
[18:11:43] The content_namespace value is the last column, being 0 or 1 (0 = non-content, 1 = content) ?
[18:11:51] milimetric: --^
[18:13:48] yes
[18:13:52] Great :)
[18:13:54] Thanks :)
[18:14:05] lemme double check I didn't screw it up
[18:20:37] joal: sorry, that column is always null
[18:20:43] something must've gone wrong in my import maybe
[18:20:57] hang on, gotta finish a different conversation and i'll fix this
[18:28:36] joal: ok, sorry, the data's fine, but mapping a boolean to a 0/1 field is not working in hive
[18:29:02] so joal: yes, 1 for "is content" and 0 for "not labeled content in the sitematrix", but if you read with hive you have to make that column an int
[18:29:32] (PS29) Milimetric: Script sqooping mediawiki tables into hdfs [analytics/refinery] - https://gerrit.wikimedia.org/r/306292 (https://phabricator.wikimedia.org/T141476)
[18:31:19] nuria: if you have a sec, do you know where the response to this is served from? "curl https://vital-signs.wmflabs.org/"
[18:32:06] there's a little bug in dashiki I'd like to track down and fix around that
[18:32:20] milimetric: the dashiki instance on labs
[18:32:24] milimetric: machine is :
[18:32:46] dashiki-01.dashiki.eqiad.wmflabs
[18:32:48] there's just a static page there? So it's not in puppet or anything
[18:33:27] milimetric: yes, it is just a redirect that i was going to kill once the new domain had been up for a while
[18:34:15] gotcha, thx
[18:35:26] milimetric: high tech
[18:35:29] https://www.irccloud.com/pastebin/RkVpsLh1/
[18:35:58] nuria: i know, i just fixed it
[18:36:11] we can remove it whenever you want, it was just bugging me :)
[18:36:23] milimetric: I think we can delete those two redirects pretty soon
[18:36:24] but I think there's a bug when the url disagrees with the config
[18:36:41] they seem to be racing
[18:37:45] nuria: sorry for the spam, I didn't get the EURO AND DOLLARS part
[18:37:46] :(
[18:37:54] elukey: np at all
[18:38:02] elukey: really
[18:55:49] nuria: thanks - just sent the last version to you, hope that it works this time
[18:56:19] * elukey thinks about "How many engineers do you need to compile an expense report?" jokes
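milimetric's point about the project_namespace_map above is that the content flag is stored as 0/1 in the underlying file, so reading it as a Hive BOOLEAN does not map cleanly and the column has to be declared as an INT. A minimal sketch of what that looks like follows; the table name, location, and column list are simplified guesses, not the actual refinery DDL.

```python
#!/usr/bin/env python
"""Sketch: declare the content flag column as INT, not BOOLEAN.

The table name, location and column list below are simplified guesses,
not the real refinery schema; the point is only the last column's type.
"""
import subprocess

DDL = """
CREATE EXTERNAL TABLE IF NOT EXISTS project_namespace_map_sketch (
    hostname          STRING,
    namespace         INT,
    namespace_name    STRING,
    content_namespace INT  -- 1 = content, 0 = non-content; a BOOLEAN here won't map the 0/1 file values
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\\t'
LOCATION '/tmp/project_namespace_map'  -- placeholder path
"""

QUERY = """
SELECT hostname, COUNT(*) AS content_namespaces
FROM project_namespace_map_sketch
WHERE content_namespace = 1  -- reads cleanly once the column is an INT
GROUP BY hostname
"""

if __name__ == "__main__":
    subprocess.check_call(["hive", "-e", DDL])
    subprocess.check_call(["hive", "-e", QUERY])
```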
[19:01:35] depends if you want to do it just in time
[19:12:13] lol
[19:20:44] ori: lol
[20:03:30] Analytics, Research-and-Data: Upgrade R on stat* machines to latest (3.3.2) - https://phabricator.wikimedia.org/T149959#2772756 (mpopov) I put together an R script for re-installing packages after upgrading R: https://gist.github.com/bearloga/7c9078b493e7afb0ca46d5d16bd1aba4 yo @chelsyx, have you update...
[20:23:42] Analytics, Research-and-Data: Upgrade R on stat* machines to latest (3.3.2) - https://phabricator.wikimedia.org/T149959#2772797 (chelsyx) @mpopov Sorry, I've upgraded to 3.3.2 on my laptop...
[22:40:17] Analytics-Kanban, EventBus, Patch-For-Review, Services (watching), and 3 others: Empty body in EventBus request - https://phabricator.wikimedia.org/T148251#2773163 (Pchelolo) Some new log entries appeared in the logs. That's indeed related to `json_encode` errors when serialising page_properties...
[23:13:22] Analytics-Kanban, EventBus, Patch-For-Review, Services (watching), and 3 others: Empty body in EventBus request - https://phabricator.wikimedia.org/T148251#2773279 (Pchelolo) So, ya, here's a minimal example that illustrates what's happening: ``` $data = base64_decode( 'H4sIAAAAAAAAA6uuBQBDv6ajAg...
[23:29:29] Analytics-Kanban, EventBus, Patch-For-Review, Services (watching), and 3 others: Empty body in EventBus request - https://phabricator.wikimedia.org/T148251#2773307 (Pchelolo) I think the best option to proceed is to set the `JSON_PARTIAL_OUTPUT_ON_ERROR` flag to simply ignore the binary propertie...
[23:41:08] (PS1) Milimetric: Clean up small style things [analytics/dashiki] - https://gerrit.wikimedia.org/r/319965 (https://phabricator.wikimedia.org/T150027)
[23:42:18] (PS1) Milimetric: Fix race condition in sitematrix [analytics/dashiki] - https://gerrit.wikimedia.org/r/319966 (https://phabricator.wikimedia.org/T150027)
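The EventBus task above concerns PHP's json_encode failing on binary page_properties values, with JSON_PARTIAL_OUTPUT_ON_ERROR proposed so the rest of the event still serialises. Python has no direct equivalent flag, but as a loose analogue of the same idea, here is a sketch that drops undecodable binary values instead of failing the whole event; it is an illustration only, not the EventBus code.

```python
#!/usr/bin/env python
"""Analogue (not the EventBus fix): serialise an event whose properties
may contain raw binary, dropping values that can't be represented as
JSON rather than failing the whole request body.
"""
import json


def encode_event_lenient(event):
    """Like json.dumps, but undecodable bytes values become None."""
    def lenient(value):
        if isinstance(value, bytes):
            try:
                return value.decode("utf-8")
            except UnicodeDecodeError:
                return None  # comparable to PARTIAL_OUTPUT: drop the value
        raise TypeError("unserialisable: %r" % (value,))
    return json.dumps(event, default=lenient)


if __name__ == "__main__":
    event = {
        "page_title": "Example",
        "page_properties": {
            "wikibase_item": "Q42",
            "page_image": b"\x1f\x8b\x08\x00binary-blob",  # e.g. gzipped data
        },
    }
    print(encode_event_lenient(event))
```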