[03:41:38] Can anyone confirm which timezone is used in the naming of pagecounts stats files? I see https://dumps.wikimedia.org/other/pagecounts-all-sites/2015/2015-06/pagecounts-20150623-000000.gz as the last file available, so it sounds like it's UTC and there is a lag of 3h in the feed.
[03:42:05] Thanks
[05:26:31] Analytics-Engineering, operations: Varnish caching around datasets.wikimedia.org is causing breakages - https://phabricator.wikimedia.org/T103423#1390932 (Ironholds) A note that this has now been fixed manually by (I think) Roan for one file but not for both, so the dashboards are still broken. I'd really...
[05:33:08] Analytics-Tech-community-metrics: Legend for "review time for reviewers" - https://phabricator.wikimedia.org/T103469#1390947 (Nemo_bis) NEW
[05:53:41] Analytics-Tech-community-metrics: Mysterious repository breakdown(s)/sorting order - https://phabricator.wikimedia.org/T103474#1391004 (Nemo_bis) NEW
[06:57:28] Analytics-Engineering, operations: Varnish caching around datasets.wikimedia.org is causing breakages - https://phabricator.wikimedia.org/T103423#1391121 (Catrope) >>! In T103423#1390932, @Ironholds wrote: > A note that this has now been fixed manually by (I think) Roan for one file but not for both, so th...
[09:05:33] Analytics, MediaWiki-Vagrant: fuse fails on vagrant - https://phabricator.wikimedia.org/T103484#1391484 (Physikerwelt) NEW a:Physikerwelt
[09:36:26] (PS2) Joal: Correct pageview and projectview bugs [analytics/refinery] - https://gerrit.wikimedia.org/r/219862
[09:37:21] (CR) Joal: [C: 2 V: 2] "Self merging bugs to deploy." [analytics/refinery] - https://gerrit.wikimedia.org/r/219862 (owner: Joal)
[09:41:51] Analytics-Tech-community-metrics, ECT-June-2015: Active changeset *authors* and changeset *reviewers* per month - https://phabricator.wikimedia.org/T97717#1391686 (Dicortazar) Based on this data, some raw numbers that will appear in the scr.html page. For authors, we're counting all of the unique people...
[09:45:44] Analytics-Tech-community-metrics, ECT-June-2015: Gerrit changes reviewed per month (on scr.html) - https://phabricator.wikimedia.org/T97716#1391721 (Dicortazar) Some raw numbers: Patchsets submitted: * "sent_patchsets": [7164, 7513, 7893, 6683, 6870, 9501, 9175, 10518, 10335, 9285, 11491, 11937, 10241,...
[10:44:44] Analytics-Tech-community-metrics, ECT-June-2015: Maniphest backend for Metrics Grimoire - https://phabricator.wikimedia.org/T96238#1391831 (Dicortazar) Hi, This panel contains updated information and this will be updated when the general korma dashboard is updated. This means that this is in production....
[11:05:56] Analytics, MediaWiki-Vagrant: fuse fails on vagrant - https://phabricator.wikimedia.org/T103484#1391856 (Physikerwelt) I figured out that it's no problem on labs-vagrant. So it might be a problem on Windows only.
[13:03:22] halfak: Hullo :)
[13:03:50] forgot to tell I was awake :)
[13:12:34] no halfak today :)
[13:24:10] ottomata, you know the 'hdfs all the things' principle?
[13:24:20] looks like we're gonna start pushing to get cirrus analytics logs in there \o/
[13:24:27] * Ironholds writes up phab ticket
[13:35:02] Analytics-Engineering, operations: Varnish caching around datasets.wikimedia.org is causing breakages - https://phabricator.wikimedia.org/T103423#1392193 (Ironholds) Ack. James, you lie!
[13:46:05] Analytics-Tech-community-metrics: Mysterious repository breakdown(s)/sorting order - https://phabricator.wikimedia.org/T103474#1392296 (Dicortazar) The issue seems to be related to a mismatch between the original list of trackers and how that list is visualized. Basically, that list is retrieved ordering by...
[13:53:36] Yay Ironholds!
[13:53:45] Ironholds: cirrus logs are all server side?
[13:54:03] ottomata, yep!
[13:54:12] so they should be pretty trivial to get in. I mean, comparatively.
[13:54:19] Ironholds: we should integrate a php kafka producer!
[13:54:20] just what we do with the varnish logs; turn them into JSON, suck them in
[13:54:23] exactly!
[13:54:44] Ironholds: ideally, we would use this: https://phabricator.wikimedia.org/T102082
[13:54:47] but, that is a longer term project :/
[13:54:48] :)
[13:54:58] well there will be in your inbox an email from a phab ticket :)
[13:55:25] we have hired a search, relevance and data nerd who just so happens to be an engineer who needs to get up to speed on how our infrastructure works, and I had a lightbulb moment for how we could kill three birds with one bird-killing device
[13:56:18] Ironholds: if you happen to meet the guy, please intro him to me (or vice versa) :)
[14:02:23] joal, sure! I'll recommend he join -analytics and you two can meet :)
[14:02:28] seems like a smart cookie thus far
[14:02:39] Me like cookie :)
[14:03:12] I have done a few bits and pieces in search / relevance field, and would love to know his views / plans etc :)
[14:06:24] yay!
[14:11:13] milimetric, our home folders in wikimetrics1 are empty :[, no restart script
[14:11:30] side effect of NFS removal
[14:11:39] I can recover them for you if you wish
[14:11:47] we can also give you /data/project back now if you want
[14:12:01] YuviPanda, are you omnipresent? :P
[14:12:10] mforns: milimetric will make you think I am :)
[14:12:28] YuviPanda is NFS-goddess :)
[14:12:33] do file a bug? I'll do it in an hour or so
[14:12:33] Yuvi is of course omnipresent. He was just born with more dimensions
[14:12:35] YuviPanda, would that take a lot of work?
[14:12:38] nope
[14:12:44] 5mins or less
[14:12:58] YuviPanda, then I think it would be good
[14:13:01] do file a bug though so I can track it :)
[14:13:06] ok
[14:13:42] The only tricky part of the restart script was kill -9 ing the queue ghost children
[14:14:41] milimetric, oh, yes I remembered there was something
[14:15:00] ok
[14:17:46] YuviPanda, created this task https://phabricator.wikimedia.org/T103530
[14:21:00] Analytics-Backlog: Stand up piwik in a permanent and privacy-sensitive way - https://phabricator.wikimedia.org/T98058#1392491 (Fjalapeno) @yuvipanda - for reading - we are just intending to pilot this for the mobile apps - which is a much smaller user base than mobile web. I don't think we should have scale...
[14:56:39] Analytics-Backlog: Stand up piwik in a permanent and privacy-sensitive way - https://phabricator.wikimedia.org/T98058#1392701 (yuvipanda) Not currently, sorry.
[15:37:21] Analytics-Kanban, operations: Varnish caching around datasets.wikimedia.org is causing breakages - https://phabricator.wikimedia.org/T103423#1392899 (kevinator)
[15:39:02] Analytics-Kanban, operations: Varnish caching around datasets.wikimedia.org is causing breakages - https://phabricator.wikimedia.org/T103423#1392903 (Milimetric) a:Milimetric
[15:39:20] Analytics-Kanban, operations: Varnish caching around datasets.wikimedia.org is causing breakages - https://phabricator.wikimedia.org/T103423#1389620 (Milimetric) I'm claiming this task and closing the original task that added caching.
[15:40:40] Analytics-Kanban, Patch-For-Review: Add cache headers to the datasets.wikimedia.org/limn-public-data/metrics folder {lion} [5 pts] - https://phabricator.wikimedia.org/T101125#1392911 (Milimetric) Open>Resolved This is resolved, as we need 24 hour cache on most of the files served here. We have to fi...
[15:41:08] Analytics-Kanban: Android debug builds event logging not recorded {oryx} [3 pts] - https://phabricator.wikimedia.org/T102881#1392913 (Milimetric) @Niedzielski: I'm having a hard time generating proper events that then don't go through the pipeline. Do you think you could set up a meeting with me and we can t...
[15:41:17] Analytics-Backlog, Analytics-EventLogging, Analytics-Kanban: Write a new Camus consumer and store the data for Event Logging {stag} [21 pts] - https://phabricator.wikimedia.org/T98784#1392916 (Ottomata)
[15:42:30] Analytics-Cluster, Analytics-Kanban, Patch-For-Review, Performance: Implement Last-Access cookie [34 pts] {bear} - https://phabricator.wikimedia.org/T88813#1392926 (kevinator)
[15:42:32] Analytics-Cluster, Analytics-Kanban, Epic: {bear} Last Access Counts - https://phabricator.wikimedia.org/T88647#1392927 (kevinator)
[15:42:34] Analytics-Cluster, Analytics-Kanban, Patch-For-Review, Performance: Implement Unique Clients report on cluster using x-analytics header & last access date {bear} [13 pts] - https://phabricator.wikimedia.org/T92977#1392924 (kevinator) Open>Resolved we have some preliminary data, which is inconc...
[15:42:36] Analytics-Backlog, Analytics-EventLogging, Analytics-Kanban: Write a new Camus consumer and store the data for Event Logging {stag} [21 pts] - https://phabricator.wikimedia.org/T98784#1392929 (Ottomata) Open>Resolved a:Ottomata Done! It doesn't make sense to import events by invalid/valid....
[15:52:45] milimetric, thank you for the varnish task taking :). What sorta fix time do you expect?
[15:56:14] mforns: do you want to talk after this?
[15:56:20] madhuvishy, sure!
[15:56:25] I'll stick around then :)
[15:56:27] mforns: madhuvishy: can we talk about EL in another channel?
[15:56:37] I can create a meeting
[15:56:42] kevinator, madhuvishy, yes
[15:56:44] ok
[15:56:45] kevinator: oh yeah sure
[15:57:27] I just sent an invite
[16:01:48] halfak: Will you be there with Altiscale today ?
[16:06:33] Analytics-Kanban, operations: Varnish caching around datasets.wikimedia.org is causing breakages - https://phabricator.wikimedia.org/T103423#1393049 (Milimetric) This may seem harsh, but the resolution is to use a cache buster on the end of your URL. Most of our dashboarding was built with that in mind....
[16:06:47] Analytics-Kanban, operations: Varnish caching around datasets.wikimedia.org is causing breakages - https://phabricator.wikimedia.org/T103423#1393050 (Milimetric) Open>Resolved
[16:07:29] ottomata1: it's been a short meeting !
[16:07:33] Let's talk if you wish
[16:08:01] Batcave ?
[16:09:01] joal, I think halfak is out of town
[16:09:15] thanks Ironholds :)
[16:09:43] Analytics-Kanban, operations: Varnish caching around datasets.wikimedia.org is causing breakages - https://phabricator.wikimedia.org/T103423#1393078 (Ironholds) It's nothing to do with my quarterly presentations. Okay, cache busting it is - should be trivial to work out.
[16:13:03] Ironholds: are you still thanking me now that you read my note? :/
[16:13:15] milimetric, that note being "please change yo software"?
[16:13:23] well, yes
[16:13:26] I mean, It's more of a PITA but I have no idea what else runs off the misc varnishes
[16:13:39] so it's not like I'm all "HOW DARE YOU MY CODE IS IMPORTANT CODE" "So is the blog" "oh"
[16:13:45] it's most of the dashboards and whoever else knows about it...
[16:14:25] but that's less important than the correctness of the change. If you think I'm wrong, let me know. It just makes sense to cache at the coarsest level and then bust at whatever level you need
[16:19:00] milimetric, no, it makes sense
[16:19:08] I'm filing it in my mental "stuff to tell the next data schlubs"
[16:19:23] so I'll just GET file.tsv?date=[current timestamp, stripped for illegal chars]
[16:19:30] that way it'll cache miss every time
[16:20:07] Ironholds: you can add literally anything after the question mark, it's just used to string match on the server for cache hit / miss
[16:20:11] so ?timestamp is fine too
[16:20:20] and you can truncate the timestamp at the granularity you want
[16:20:45] so like ?201506221234 would bust at a minute level
[16:21:01] I recommend some caching, not getting it new every time
[16:23:35] milimetric, yeah, I am
[16:23:41] I handle that client side at the moment
[16:23:53] it stores a dt and every new connection, checks against the existing dt
[16:24:04] if it's >=1hour offset, goes and rebuilds and resets that value
[16:24:29] yo joal
[16:24:33] heya
[16:24:40] batcave?
[16:24:43] yup
[16:26:26] Ironholds: I feel like I've failed as the unofficial person to talk to about dashboards. Sorry it came to this, but it sounds like you have it under control. Lemme know if I can help
[16:30:47] milimetric, it's okay! And fwiw I'll be also serving as that person, so we can double-team it :)
[16:31:01] I've been told to train 3-4 people via cerebral osmosis
[16:31:36] cool :)
[17:41:02] milimetric: i am trying to backfill but having problems.
[17:41:21] what seems to be the problem sir
[17:42:14] welp, i don't seem to be able to publish to a 0mq socket on which there is already a process publishing to
[17:42:26] zmq.error.ZMQError: Address already in use
[17:42:30] how do you do it?
[17:42:50] ottomata: told you :-P
[17:42:52] publish... hm... I think most people just spin up another consumer and consume from a file
[17:42:59] yes
[17:43:03] haha, joal told me!
[17:43:05] yes, but then what
[17:43:06] what kind of events you got, validated already?
[17:43:09] consume from file and write to
[17:43:22] milimetric: the easiest for me to get were already parsed, but not validated. from the upstart logs
[17:43:27] but, i noticed that these don't have the uuid set
[17:43:32] because that is set by the processor
[17:43:38] they are unencoded
[17:43:47] i am trying to get the raw ones
[17:43:50] oh ok, so you need to run them through a processor first, then through a consumer
[17:43:51] but those are harder to isolate
[17:44:05] well, processor won't work because they aren't raw.
[17:44:10] so i guess I'd just spin up a processor, input - file, output - file
[17:44:14] oh huh?
[17:44:23] processor expects them to be encoded
[17:44:27] oh... parsed but not validated...
[17:44:45] ja so, i guess i should find the raw ones out of the raw file then
[17:44:46] weird... so how did that happen? Can't you just get the raw events?
[17:44:55] i got them out of the upstart error logs
[17:44:57] cause that was easier
[17:45:20] but let me try to get them raw...
[17:45:26] yeah, this is the reason I wanted these pre-validated logs in HDFS, exactly this kind of mess
[17:45:43] much easier to grab them out with a query than using split trying to find the right chunk
[17:46:11] not sure how you would grab them with a query
[17:46:27] well, they'd have timestamps so I'd just query a timestamp
[17:46:46] but you can use split and guess about which chunk was lost
[17:47:06] then double check the timestamp by converting it from seconds to a date
[17:50:09] AGH THIS internet
[17:50:12] i gotta move soon
[17:55:15] ok, milimetric i have them raw now, but they don't have seqIds prepended to them like server side ones usually do
[17:55:28] do I need to prepend? hmmm
[17:55:36] maybe i can use processor with a different format, without the seqId
[17:55:42] i don't think the seqId is used, is it?
[17:56:39] ottomata1: sorry - gotta set up for the lightning talks
[17:56:40] https://bluejeans.com/362967934/
[17:57:28] np
[17:58:04] milimetric, link please asap
[17:58:09] linux crashed
[18:59:32] milimetric: whenever you are back and working, i am failing miserably at this backfilling thing! :/
[19:01:42] ottomata: I gotta work with the android app folks who can't see their events on beta
[19:01:48] k no worries
[19:01:52] if you wanna jump in and help us, we can work on backfilling next
[19:01:54] but that's up to you
[19:01:54] ok
[19:01:57] can help!
[19:02:14] ottomata: invited you: https://plus.google.com/hangouts/_/wikimedia.org/dandreescu-snie
[19:02:20] guys, I'm off for today !
[19:02:25] laters joal!
[19:02:27] nice presentation!
[19:02:31] Have a good evening :)
[19:02:36] Thx ottomata
[19:04:06] PROBLEM - Difference between raw and validated EventLogging overall message rates on graphite1001 is CRITICAL 40.00% of data above the critical threshold [30.0]
[19:05:05] PROBLEM - Throughput of event logging events on graphite1001 is CRITICAL 15.38% of data above the critical threshold [600.0]
[19:06:55] RECOVERY - Throughput of event logging events on graphite1001 is OK Less than 15.00% above the threshold [500.0]
[19:08:55] those alarms are my fault, i was trying to backfill but what i backfilled was not valid :/
[19:13:38] mforns: did you make progress on the draft?
[19:16:48] milimetric ottomata: good to go!
[19:16:59] milimetric ottomata: thanks for the help!
[19:24:06] RECOVERY - Difference between raw and validated EventLogging overall message rates on graphite1001 is OK Less than 15.00% above the threshold [20.0]
[19:35:55] Analytics-Kanban: Android debug builds event logging not recorded {oryx} [3 pts] - https://phabricator.wikimedia.org/T102881#1393888 (Milimetric) Open>Resolved Thanks for the debug session, it turned out events were invalid but I was looking in the wrong logs. FYI in case someone's learning from this:...
[19:46:22] madhuvishy, kevinator, I wrote the draft of the email, I still have a couple of questions, but we can discuss that if you want
[19:47:34] madhuvishy, sorry didn't see your message until now, no headphones
[20:09:39] mforns: aah - is it on the etherpad
[20:09:57] madhuvishy, yes
[20:11:19] mforns: hmmm i'm just looking at the policy
[20:11:26] what does that mean for us
[20:19:43] madhuvishy, maybe the fact that the phrase I pasted is within the "Non-personal information associated with a user account*" section is relevant
[20:21:05] madhuvishy, never mind, I think I understood it incorrectly, I think the discussion we had before and the assumptions we made were OK
[20:21:17] so, will remove the comment from the email draft
[20:22:45] madhuvishy, so what do you think about the draft? what should we add/ remove?
[20:23:45] mforns: hmmm - alright. I have a one-on-one with Kevin at 2. I think it looks good. Will ask him to review it too
[20:24:02] madhuvishy, ok
[20:25:06] madhuvishy, I mean the text has to be sweetened and adapted to each situation, those are just guidelines
[20:25:23] mforns: of course
[20:42:45] Analytics-EventLogging: Replace EventLogging with Confluent Platform - https://phabricator.wikimedia.org/T102082#1394161 (Ottomata) https://github.com/confluentinc/kafka-rest/issues/79 Iiiinteresting!
[20:45:48] Quarry: Seeing all of created queries in one page - https://phabricator.wikimedia.org/T98688#1394177 (matej_suchanek)
[20:45:52] Quarry: List of users all published queries - https://phabricator.wikimedia.org/T88920#1394178 (matej_suchanek)
[20:45:56] Quarry: Provide the ability to view the queries submitted by a particular user - https://phabricator.wikimedia.org/T71174#1394179 (matej_suchanek)
[20:46:00] Quarry: Show all published queries in profile - https://phabricator.wikimedia.org/T77948#1394180 (matej_suchanek)
[20:47:00] Quarry: Show all published queries in profile - https://phabricator.wikimedia.org/T77948#832247 (matej_suchanek)
[21:11:58] nice end of the day, everyone!
[21:14:05] we get to go home?
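
The cache-busting approach discussed at 16:19-16:21 (append anything after the question mark, and truncate a timestamp to the granularity at which the cache should be bypassed) can be sketched roughly as below. This is only an illustration under assumptions: the helper name cache_busted_url, the granularity table, and the example.org URL are hypothetical and do not come from the log, and the sketch assumes the base URL has no query string yet.

```python
from datetime import datetime, timezone

# Hypothetical mapping (not from the log): how far to truncate the timestamp.
# A coarser format means the URL changes less often, so more cache hits.
GRANULARITY_FORMATS = {
    "day": "%Y%m%d",
    "hour": "%Y%m%d%H",
    "minute": "%Y%m%d%H%M",
}


def cache_busted_url(base_url, when, granularity="hour"):
    """Append a truncated UTC timestamp as the query string.

    Assumes base_url has no existing query string. The URL changes once per
    chosen interval, so the cache is bypassed at most once per interval.
    """
    stamp = when.astimezone(timezone.utc).strftime(GRANULARITY_FORMATS[granularity])
    return f"{base_url}?{stamp}"


# Placeholder URL for illustration only.
when = datetime(2015, 6, 22, 12, 34, 56, tzinfo=timezone.utc)
print(cache_busted_url("https://example.org/file.tsv", when, "minute"))
# prints https://example.org/file.tsv?201506221234
```

Picking "hour" or "day" instead of "minute" keeps some caching in place, in line with the recommendation at 16:21 not to fetch a fresh copy every time.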