[00:09:21] Does anyone know about this report card? http://mobile-reportcard.wmflabs.org/ I'm trying to figure out what domain the numbers are coming from (en.wiki? all projects?), and what "main", "other", and "successful" mean.
[00:12:16] Never mind, just found https://wikitech.wikimedia.org/wiki/Mobile_Reportcard.
[00:59:48] Analytics, Analytics-Backlog, Performance-Team, Patch-For-Review: Collect HTTP statistics about load.php requests - https://phabricator.wikimedia.org/T104277#1432643 (ori) >>! In T104277#1432304, @Krinkle wrote: > We ran into some issues with the metrics. Re-opening as reminder to investigate and ad...
[01:08:51] Analytics, Pywikibot-compat-to-core: Measure current usage of pywikibot-compat vs pywikibot-core - https://phabricator.wikimedia.org/T99373#1432689 (jayvdb) p:Unbreak!>High >>! In T99373#1432298, @Aklapper wrote: >> With the July 1 API breakage about to happen, > > This has been [[ https://www.med...
[01:15:17] Analytics, Pywikibot-compat-to-core: Measure current usage of pywikibot-compat vs pywikibot-core - https://phabricator.wikimedia.org/T99373#1432700 (Ricordisamoa) >>! In T99373#1432689, @jayvdb wrote: > Perhaps we need to add our own analytics to compat & core ; i.e. pinging a tool labs web-app during st...
[01:24:57] Analytics, Pywikibot-compat-to-core: Measure current usage of pywikibot-compat vs pywikibot-core - https://phabricator.wikimedia.org/T99373#1432729 (jayvdb) >>! In T99373#1432700, @Ricordisamoa wrote: >>>! In T99373#1432689, @jayvdb wrote: >> Perhaps we need to add our own analytics to compat & core ; i....
[05:20:55] Analytics, Analytics-Backlog, Performance-Team, Patch-For-Review: Collect HTTP statistics about load.php requests - https://phabricator.wikimedia.org/T104277#1432979 (Krinkle) >>! In T104277#1432643, @ori wrote: >>>! In T104277#1432304, @Krinkle wrote: >> We ran into some issues with the metrics. Re...
[09:40:15] (CR) Joal: [C: -1] "Comments in the code -- -1 because the parameter to switch on/off is not present, not allowing to replicate old behavior." (3 comments) [analytics/aggregator] - https://gerrit.wikimedia.org/r/223031 (https://phabricator.wikimedia.org/T95339) (owner: Mforns)
[11:17:10] Analytics-Tech-community-metrics, Engineering-Community, ECT-July-2015: Automated generation of repositories for Korma - https://phabricator.wikimedia.org/T104845#1433609 (Qgil) p:Triage>Normal
[13:03:56] halfak / joal: sorry, I can't make it to this morning's meeting :(
[13:04:08] Hi milimetric
[13:04:08] Sorry to miss you milimetric
[13:04:13] np
[13:04:19] halfak: joining :)
[13:04:19] We'll mostly be talking about ongoing hadoop jobs.
[14:12:30] Analytics-Backlog, Wikimania-Hackathon-2015: Dockerize Hadoop Cluster, Druid, and Samza + Load Test - https://phabricator.wikimedia.org/T102980#1434036 (Qgil) Who is the owner of this #Wikimania-Hackathon-2015 project?
[15:39:33] halfak: do you need me to come to the altiscale syncup meeting ?
[15:40:09] I have no news, so I feel I could miss that one
[15:41:09] halfak: --^
[15:56:39] halfak: ping ?
[15:57:27] Analytics-Backlog, Wikimania-Hackathon-2015: Dockerize Hadoop Cluster, Druid, and Samza + Load Test - https://phabricator.wikimedia.org/T102980#1434726 (Milimetric) @Qgil the whole analytics team is the owner. We'll all be there, we'll all work on this, and we'll be in sync: * @Kevinator * @JAllemandou...
[15:58:21] joal, nope.
[15:58:26] Not unless you have something to talk about.
[15:58:31] Sorry to miss the pings.
[15:59:37] halfak: np, thanks for the answer
[15:59:45] halfak: I'll miss it
[16:00:02] OK
[16:39:09] joal, altiscale folks claim that snappy is not splittable for processing.
[16:52:09] halfak: That would explain ! But it's weird not to find doc on that matter ...
[16:52:36] halfak: I'll provide a compression parameter for the json conversion job
[16:54:40] joal, they suggested that we try compressing with bz2 next time.
[16:55:00] I plan to run some tests with a subset of files by recompressing them and running the diff job.
[17:18:15] Analytics-Backlog, Analytics-Dashiki: Improve the edit analysis dashboard {lion} - https://phabricator.wikimedia.org/T104261#1435162 (Milimetric) Next week we're all at Wikimania. If you're coming, we can catch up quickly there and schedule something for afterwards. So anything after July 20th works for...
[17:19:46] Analytics-Kanban: Bug: puppet not running on wikimetrics1 instance, Vital Signs stale {musk} [5 pts] - https://phabricator.wikimedia.org/T105047#1435170 (Milimetric) NEW a:Milimetric
[17:26:23] Analytics-Backlog, Analytics-Dashiki: Improve the edit analysis dashboard {lion} - https://phabricator.wikimedia.org/T104261#1435194 (violetto) Sounds good to me, I'll be at Wikimania.
[17:28:42] joal, yt?
[17:33:57] (PS2) Mforns: Add aggregation across projects [analytics/aggregator] - https://gerrit.wikimedia.org/r/223031 (https://phabricator.wikimedia.org/T95339)
[17:39:23] (CR) Mforns: Add aggregation across projects (3 comments) [analytics/aggregator] - https://gerrit.wikimedia.org/r/223031 (https://phabricator.wikimedia.org/T95339) (owner: Mforns)
[17:40:07] (CR) Mforns: "Before merging, I'd like to test this with a real sample of hourly projectcount data." (1 comment) [analytics/aggregator] - https://gerrit.wikimedia.org/r/223031 (https://phabricator.wikimedia.org/T95339) (owner: Mforns)
[17:42:49] hi milimetric. would a 6pm EST meeting be late for you? not sure how long you will be around. (for the opt-out discussion)
[17:45:00] Analytics-Backlog, Analytics-Dashiki: Improve the edit analysis dashboard {lion} - https://phabricator.wikimedia.org/T104261#1435244 (Neil_P._Quinn_WMF) I'll be at Wikimania as well. I'll definitely catch up with you guys there!
[17:50:24] Analytics-Tech-community-metrics, Engineering-Community, ECT-July-2015: Automated generation of repositories for Korma - https://phabricator.wikimedia.org/T104845#1435268 (Aklapper) a:Dicortazar (Assigning to Daniel because he'll work on this at some point)
[17:50:57] Analytics, Engineering-Community, Research-and-Data, ECT-July-2015: Metrics about the use of the Wikimedia web APIs - https://phabricator.wikimedia.org/T102079#1435271 (Qgil) a:Qgil Taking this task for now, but I need to find a real assignee that can fix it.
[17:53:43] PROBLEM - Check status of defined EventLogging jobs on eventlog1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[17:55:24] RECOVERY - Check status of defined EventLogging jobs on eventlog1001 is OK: All defined EventLogging jobs are running.
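A note on the 16:39-16:55 exchange above (altiscale reporting that snappy output is not splittable, with bz2 suggested instead): below is a minimal sketch of the kind of local recompression test halfak mentions. The file paths are hypothetical, and it assumes the snappy-compressed output has already been decompressed to plain JSON by other means; on the Hadoop side the usual equivalent is setting mapreduce.output.fileoutputformat.compress.codec to org.apache.hadoop.io.compress.BZip2Codec, though how the json conversion job exposes that parameter is not shown in this log.

    import bz2
    import shutil

    # Hypothetical paths: a locally decompressed JSON sample and its bz2 target.
    src = "sample/diffs-part-00000.json"
    dst = src + ".bz2"

    # Recompress with bz2, which Hadoop can split across mappers,
    # unlike whole files compressed with snappy (per the discussion above).
    with open(src, "rb") as fin, bz2.BZ2File(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout, length=1024 * 1024)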
[17:56:53] Analytics-Tech-community-metrics, ECT-July-2015: Exclude third-party / pulled upstream code repositories from metrics - https://phabricator.wikimedia.org/T103984#1435294 (Aklapper) a:Dicortazar
[17:57:36] Analytics-Tech-community-metrics, ECT-July-2015: Remove deprecated repositories from korma.wmflabs.org code review metrics - https://phabricator.wikimedia.org/T101777#1435300 (Aklapper) a:Dicortazar
[17:58:01] Analytics-Tech-community-metrics, ECT-July-2015: Mysterious repository breakdown(s)/sorting order - https://phabricator.wikimedia.org/T103474#1435301 (Qgil) a:Dicortazar
[18:27:21] hey mforns
[18:27:25] sorry for the delay
[18:27:29] joal, hi!
[18:27:33] np at all
[18:28:23] joal, I wanted to ask you how I can download a sample of the hourly projectcounts. Although I pushed the changes for review, I'd like to test it with real data before merging
[18:29:08] mforns: that's a good idea :)
[18:29:40] projectcounts / projectview files can be found on stat10032
[18:29:45] stat1002 sorry
[18:30:05] joal, but that's the output of the aggregator, right?
[18:30:41] I've seen those already, but I'd like the input hourly files
[18:31:37] projectview (new ones) are in /mnt/hdfs/wmf/data/archive/projectview/webstatcollector/hourly/2015/2015-0[567]
[18:31:46] those are hourly files yes
[18:32:45] for the old ones (projectcounts), they are in /mnt/hdfs/wmf/data/archive/pagecounts-all-sites/2015/2015-0[1234567]
[18:33:05] If you take the old ones, don't mix up projectcounts and pagecounts :)
[18:33:13] Not the same size ;)
[18:33:18] mforns: --^
[18:36:02] mforns: have you found them ?
[18:36:29] joal, sorry, telephone rang
[18:36:41] np mforns
[18:36:41] not yet
[18:38:24] joal, found them, thanks! does scp work with hdfs?
[18:38:35] mforns: works if you use the mount folder
[18:38:58] mforns: shall I go and review the last version of your code or do you prefer me to wait for your tests ?
[18:39:44] joal, I think you can wait :]
[18:39:55] even do it tomorrow
[18:39:58] Ok sounds good mforns :)
[18:40:04] cool, thanks joal
[18:40:08] no prob !
[18:40:16] Will call it a day for today :)
[18:40:23] Tomorrow lads !
[18:40:59] good night!
[19:06:43] milimetric: I think I have something preliminary to start testing - can you tell me how..
[19:16:17] madhuvishy: cool, batcave?
[19:16:31] joining
[19:37:47] jgage: got a sec to help us out installing something?
[19:37:51] we don't have permissions
[19:38:17] leila: no that's fine, but I'll jump out for 10 minutes to pick up Steph
[19:38:30] np. thanks, milimetric.
[19:46:30] Analytics-Backlog, user-notice: Delete all data from EventLogging:PersonalBar schema - https://phabricator.wikimedia.org/T105065#1435706 (mforns) NEW a:mforns
[19:50:54] Analytics-Backlog, Compact-Personal-Bar-(Beta), user-notice: Delete all data from EventLogging:PersonalBar schema - https://phabricator.wikimedia.org/T105065#1435718 (Quiddity)
[19:52:21] milimetric: sure, what's up?
[19:52:42] jgage: so we need pykafka on analytics1004
[19:52:51] but we don't have pip on there, so we can't install it
[19:53:00] not sure how andrew installed the other python dependencies
[19:53:03] like kafka-python
[19:53:15] but he's on vacation for a few days and we wanted to run some load tests on there
[19:53:41] ok, i'll take a look. i don't see a debian package for pykafka, i'll see what he did on other stats boxes
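Back on the 18:28-18:40 projectview/projectcounts question above: because HDFS is fuse-mounted under /mnt/hdfs, the hourly files can be read (or scp'd) like ordinary files. A minimal sketch for picking a small test sample, assuming it runs on stat1002, that the directory layout is exactly the one joal quoted, and that 2015-06 and the sample size are arbitrary choices.

    import glob

    # HDFS fuse-mount path quoted above (new projectview hourly files); 2015-06 picked arbitrarily.
    hourly_dir = "/mnt/hdfs/wmf/data/archive/projectview/webstatcollector/hourly/2015/2015-06"

    # Grab a handful of hourly files to test the cross-project aggregation change against real data.
    sample_files = sorted(glob.glob(hourly_dir + "/*"))[:5]
    for path in sample_files:
        print(path)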
[19:54:20] pykafka is a new thing we're installing, sorry I don't know more about how he did the other ones
[19:54:53] there's no pip on that machine, so I'm guessing he either debianized it himself or... yeah not sure
[19:55:07] milimetric: can you run this command on stat1003 and tell me if these are the files you want on stat1004: 'dpkg -L python-kafka'
[19:55:27] jgage: it's analytics1004 btw, not stat1004
[19:55:33] ach
[19:56:46] milimetric, I have some Q's re. RCstream. Would you ping me when you have time to think about that for a minute? (The only relevant google result for the error was a chat between you and ottomata)
[19:57:01] jgage: no that's a different thing. That's kafka-python and we need this new hotness, pykafka: https://github.com/Parsely/pykafka
[19:57:17] halfak: hahaha, funny. sure, what's up
[19:57:32] milimetric: ok. i'm unclear, is pykafka already installed on any of our other hosts?
[19:57:41] jgage: no
[19:57:44] On March 27th, ottomata pasted this error into the chat:
[19:57:44] WARNING:socketIO_client:stream.wikimedia.org:80/socket.io/1: [packet error] unhandled namespace path ()
[19:57:51] Then you guys hopped into the batcave.
[19:57:55] right, yes
[19:57:57] I'm getting that error and don't know what to do
[19:58:09] I believe that was due to using the new socket.io
[19:58:18] it was a version problem
[19:58:25] I switched back to 0.5.6 which is recommended for socket.io 0.9
[19:58:25] halfak: are you using node or python?
[19:58:29] Python
[19:58:40] hm, I never got it to work with that, node is much more straightforward
[19:58:44] Before I made the switch, I was getting 404s
[19:59:06] Maybe I'll downgrade more.
[19:59:08] let me try and get it to work with python too though
[19:59:16] hm, you should be ok version-wise
[19:59:20] I found another bug in the 0.5.6 code that I had to work around.
[20:00:13] Oh wait! It looks like that is just a warning and I can run without it.
[20:02:35] milimetric, false alarm. I'm going to ignore that warning. Thanks for taking a look.
[20:02:56] halfak: i'm gonna try and set it up with python too just to see how that works
[20:03:07] OK. I'd be interested to know if you get the same warning.
[20:07:18] milimetric: it seems like the proper way to handle this is to make a debian package for pykafka. how urgent is your need?
[20:07:59] jgage: it can wait until next week if a package is needed
[20:08:07] I was hoping there was some easy way - thanks!
[20:08:15] cool ok. i'll take a stab at it tonight.
[20:08:55] if the need was dire i would clone the repo, tar it up, move the tarball to the host, untar and run 'setup.py install' manually, but that's.. gross :)
[20:09:16] jgage: setup.py install would need pip, no?
[20:09:32] and I think Andrew had some good reasons for not putting pip on there
[20:09:34] i don't know, i found that via this: https://stackoverflow.com/questions/13270877/how-to-manually-install-a-pypi-module-without-pip-easy-install
[20:09:49] any package manager that isn't debian is bad IMO :)
[20:10:00] i hate it when languages make their own package managers
[20:10:45] yeah, I think there's a lot of animosity there both ways. Like, I can see how python and node don't want to wait forever until their packages get debian certified or whatever
[20:10:52] yeah, same
[20:10:56] it's one of the worst things about modern programming
[20:11:14] i'll take a crack at this tonight and check in with you tomorrow. hopefully it will be simple.
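For the 19:56-20:03 RCStream exchange above: a minimal sketch of consuming RCStream from Python with socketIO_client 0.5.6, the version halfak pairs with the socket.io 0.9 server. The '/rc' namespace, the 'subscribe' event, and the 'en.wikipedia.org' wiki name follow the RCStream documentation on wikitech rather than anything shown in this log, and the 'unhandled namespace path' warning discussed above can simply be ignored.

    from socketIO_client import SocketIO, BaseNamespace

    class RCNamespace(BaseNamespace):
        def on_connect(self):
            # Subscribe to one wiki's recent-changes feed.
            self.emit('subscribe', 'en.wikipedia.org')

        def on_change(self, change):
            # Each 'change' event is a dict describing one recent change.
            print(change.get('type'), change.get('title'))

    socketIO = SocketIO('stream.wikimedia.org', 80)
    socketIO.define(RCNamespace, '/rc')
    socketIO.wait()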
[20:11:31] jgage: no worries, this isn't super high priority
[20:11:41] ok, groovy
[20:11:42] you don't have to look at this if there's no quick simple fix
[20:12:08] eh, at the least i'll take a look and make a phab task
[20:14:18] halfak: fail :( my python environment is completely messed up. I install packages and then it tells me they don't exist
[20:14:34] parallel universe of dist_packages somewhere... but where... and why...
[20:14:38] milimetric, bummer. no worries. This is working for me.
[20:14:42] k, good
[20:14:45] What does 'pip freeze' say?
[20:15:21] pip freeze is SUPER weird. It stopped listing almost all my packages a while ago
[20:15:26] it just lists a few
[20:15:32] Start from scratch!
[20:15:40] I know - I gotta reformat this, it's insane
[20:22:54] (PS9) Milimetric: Add global default report fields [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/217857 (https://phabricator.wikimedia.org/T74117)
[20:23:04] (PS10) Milimetric: [WIP] Add global default report fields [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/217857 (https://phabricator.wikimedia.org/T74117)
[21:54:04] (CR) Mforns: "Tested it, seems to work! Ready to merge on my side." [analytics/aggregator] - https://gerrit.wikimedia.org/r/223031 (https://phabricator.wikimedia.org/T95339) (owner: Mforns)
[21:55:11] nice end of day team, see you!
[22:01:12] leila: was that meeting today or...
[22:01:25] milimetric: July 27 or something
[22:01:31] it was about a hold, milimetric. ;-)
[22:01:43] oh :) sorry, I got confused, thx
[22:01:48] I was actually surprised that you knew your plans so well ahead of time. ;-)
[22:01:54] my bad, milimetric.
[22:02:18] oh, I never have plans around that hour, I'm usually finishing up my day reading email and such
[22:15:56] milimetric: Where can I find the datafiles for https://edit-analysis.wmflabs.org/compare/?
[22:16:33] Maybe I'm blind but I can't find anything under datasets.wikimedia.org
[22:36:50] Analytics-Backlog, Analytics-Cluster: Report pageviews to the annual report - https://phabricator.wikimedia.org/T95573#1436217 (Heather) As another 90 days is about to pass, can we revisit this request? Thank you!
[22:55:36] neilpquinn: no that server has the worst naming :) the files you seek are here: http://datasets.wikimedia.org/limn-public-data/metrics/
[22:56:31] The name of each section should point you to the right first-level folder there. Then it's in the format of editor/wiki.csv
[23:04:08] Okay, thanks! I felt like I had looked through every file in the tree but apparently not :)
[23:39:11] Analytics-Kanban, Research-and-Data: Validate Uniques using Last Access cookie {bear} - https://phabricator.wikimedia.org/T101465#1436374 (madhuvishy) @leila and I were looking at the possibility of comparing mobile apps' UUID based uniques and the last access cookie based uniques only for apps - to see if...
[23:49:53] analytics1021: kafka.server.BrokerTopicMetrics.AllTopicsMessagesInPerSec.FifteenMinuteRate CRITICAL: 1.0199990467e-20
[23:50:01] analytics1018: kafka.server.BrokerTopicMetrics.AllTopicsMessagesInPerSec.FifteenMinuteRate CRITICAL: 0.0
[23:50:07] (it's even CRIT when it's 0.0 ?)
[23:51:12] yeah, 0 is bad. i'll look.
[23:52:52] thanks
[23:55:57] as mentioned in the ops channel, i triggered a kafka election (all replicas were already in sync) and we're back to a healthy cluster. it was operating in a degraded mode but no data was lost. yay redundancy.
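Once pykafka is available on analytics1004 (the packaging discussion at 19:52-20:12 above), a minimal produce/consume round trip like the one below is a reasonable starting point for the planned load tests, and doubles as a quick check that a broker is accepting traffic. The broker address and topic name are placeholders, not values taken from this log.

    from pykafka import KafkaClient

    # Placeholder broker and topic; substitute the real broker list and a test topic.
    client = KafkaClient(hosts="kafka-broker.example.org:9092")
    topic = client.topics[b"pykafka-load-test"]

    # Produce one message synchronously...
    with topic.get_sync_producer() as producer:
        producer.produce(b"hello from the load test sketch")

    # ...and read it back, giving up after 5 seconds if nothing arrives.
    consumer = topic.get_simple_consumer(consumer_timeout_ms=5000)
    for message in consumer:
        print(message.offset, message.value)
        break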