[00:02:12] Analytics-Wikistats: figure out process for adding new wikis to wikistats - https://phabricator.wikimedia.org/T64739#1416146 (Dzahn) Maybe subscribe to https://lists.wikimedia.org/mailman/listinfo/newprojects with a special user and then parse incoming email? This was really about stats.wikimedia, not wikis...
[00:04:16] Analytics-Wikistats: wikistats: Insecure content warnings (images from upload.wikimedia.org) - https://phabricator.wikimedia.org/T57443#1416157 (Dzahn) This is blocked by https://phabricator.wikimedia.org/T93702#1146743
[00:05:08] Analytics-Wikistats: Wikistats portal, fix html errors - https://phabricator.wikimedia.org/T48192#1416160 (Dzahn) This is blocked by https://phabricator.wikimedia.org/T93702#1146743
[00:06:39] Analytics-Wikistats: stats.wikimedia.org needs options to see exact counts and dates - https://phabricator.wikimedia.org/T37150#1416167 (Dzahn) submitting patches of any kind is blocked by https://phabricator.wikimedia.org/T93702#1146743
[01:12:50] Analytics-Backlog: EventLogging cleanup debrief {tick} - https://phabricator.wikimedia.org/T104351#1416297 (kevinator)
[04:37:56] Analytics-Engineering, operations, network: networking: adjust ACLs to allow analytics clusters to talk to new ganglia aggregator - https://phabricator.wikimedia.org/T104036#1416460 (BBlack)
[04:57:16] Analytics-Engineering, operations, network: networking: adjust ACLs to allow analytics clusters to talk to new ganglia aggregator - https://phabricator.wikimedia.org/T104036#1416472 (BBlack) Open>Resolved a:BBlack logstash1*.eqiad.wmnet appear to be in the normal private vlans rather than the a...
[11:36:04] Analytics-Kanban, Reading-Web: Cron on stat1003 for mobile data is causing an avalanche of queries on dbstore1002 - https://phabricator.wikimedia.org/T103798#1416981 (Jhernandez) Thanks a lot @milimetric
[13:36:26] Analytics-Kanban: Vet data in intermediate aggregate {wren} [8 pts] - https://phabricator.wikimedia.org/T102161#1417163 (JAllemandou) Datasets compared: pageview_hourly and pagecounts-all-site. Same hour of data as previous comment: 2015-06-24T00:00:00 (for pagecounts-all-site, use 2015-06-24T01:00:00 because...
[13:49:11] Hi mforns :)
[13:49:27] oh, forgot to tell you I was awake !
[14:07:56] Analytics, Engineering-Community, Research-and-Data, ECT-July-2015: Metrics about the use of the Wikimedia web APIs - https://phabricator.wikimedia.org/T102079#1417255 (Halfak) We talked about setting up some basic logging with EventLogging. Here's our notes: http://etherpad.wikimedia.org/p/measuri...
[14:25:48] Analytics, Engineering-Community, Research-and-Data, ECT-July-2015: Metrics about the use of the Wikimedia web APIs - https://phabricator.wikimedia.org/T102079#1417310 (Qgil) p:Triage>Normal @Halfak, can this task be assigned to you, at least in this initial stage?
[14:31:37] joal, hi!
[14:42:06] hey mforns, don't know if you have received my previous message, got disconnected
[14:42:11] 16:37:19 < joal> Do you have feedback for me on how to aggregate all values ?
[14:43:10] joal_, I did not receive this message, did you write it yesterday?
[14:43:27] nono, just a few minutes ago :)
[14:43:37] oh
[14:47:14] milimetric and I talked about changing aggregation in python vs hive, I explained your thoughts to him, he is more for doing it in python, and I think also intuitively will prefer python
[14:48:15] joal, ^ however, if I find difficulties as you said with the aggregator code, I'll consider hive again
[14:48:28] mforns: no problem :)
[14:48:45] ok
[14:49:09] thx mforns for taking this one :)
[14:49:51] np, it was the next in next up :]
[14:54:08] hey ottomata, are you around ?
[14:54:15] yup!
[14:54:23] HeyaaaAAAaaa :)
[14:54:31] Aggregation worked fine :)
[14:54:48] Do you mind merging wikimetrics puppet ?
[14:54:58] can do!
[14:55:09] milimetric: are you around as well ?
[14:55:19] Awesome ottomata :)
[14:56:16] joal: yes
[14:56:43] when ottomata has merged the change on wikimetrics folders, can you update the config for limn/dashiki ?
[14:56:47] milimetric: --^
[14:57:02] merged.
[14:57:06] I have looked for the config files, but couldn't find them :)
[14:57:10] thx ottomata !
[14:57:12] well, I have to wait for it to take effect really
[14:57:19] Will run in something like 15mins right ?
[14:58:23] oh wait... no, it will never run, I have to do it myself 'cause it's a self-hosted
[14:58:28] btw, the config is here: https://meta.wikimedia.org/wiki/Dashiki:CategorizedMetrics
[14:58:37] i'll run it now and then change the config if everything looks good
[14:59:05] milimetric: awesome ^ 2 :)
[14:59:48] milimetric: would love to bounce about EL stuff
[14:59:56] whenever you got a min
[15:00:09] ottomata: do you mind if I join, trying to keep up ?
[15:00:23] please! this would be a good one to
[15:00:33] stayed up late and did this last night: https://gerrit.wikimedia.org/r/#/c/222064/1/server/bin/eventlogging-processor
[15:00:34] Thx :)
[15:00:40] needs work, but might make things faster!
[15:00:57] ottomata: ok, lemme merge this thing and take care of configs, then standup, then bounce?
[15:01:01] k
[15:07:06] milimetric: reading the config, seems to be changes in the pageviewApi though, no ?
[15:07:14] I can't find the way you retrieve data :)
[15:07:47] joal: convention, the name is the only important part
[15:07:57] since the rest of the path stayed the same
[15:08:15] (puppet is still running btw)
[15:08:19] k
[15:08:26] I think I get it :)
[15:08:38] #paulbunyanwasfasterthanpuppet
[15:09:09] milimetric: so the new names will be "LegacyPageviews (webstatcollector)" and "Pageviews" ?
[15:09:30] no, just LegacyPageviews and Pageviews right?
[15:09:38] ok for me
[15:09:50] I might be missing something, but it's ok
[15:10:02] Since there was the webstatcollector in the previous name, I just thought to keep it, but I don't mind
[15:10:12] this is the philosophy of the JS developer joal. No certainty, just change stuff until it works
[15:10:54] milimetric: Is it really different from any other dev philosophy ?
[15:11:26] :)
[15:11:41] so, joal the link to the new Pageviews (not legacy) didn't work for some reason
[15:11:44] the repo didn't clone
[15:11:48] hm
[15:11:58] weird
[15:12:28] oh, no, it worked but it was set incorrectly
[15:12:28] probably an access config
[15:12:35] ?
[15:12:38] the symlink is to /srv/aggregator-projectview-data/projectcounts/daily
[15:12:44] but the clone is to ...
[15:13:01] to this: /srv/aggregator-projectview-data/projectview/daily
[15:13:09] -projectcounts +projectview
[15:13:15] OOOOHHHHH !
[15:13:19] My bad :(
[15:13:23] i'll fix it manually and you can fix it in puppet
[15:13:23] Will correct that now
[15:13:27] which way you gonna do it?
[15:14:40] change the path to /srv//srv/aggregator-projectview-data/projectview/daily
[15:14:43] /srv/aggregator-projectview-data/projectview/daily
[15:14:45] sorry
[15:14:55] the projectview part of it is already in git
[15:15:04] ok, i'll change the symlink then
[15:17:38] ottomata if you don't mind, I messed up : https://gerrit.wikimedia.org/r/#/c/222127/
[15:17:57] thx milimetric for correcting manually
[15:19:07] thx ottomata
[15:19:10] yup, done
[15:22:22] joal: you were right, something else is missing in dashiki
[15:22:29] this'll take another few minutes to get right
[15:47:01] (PS1) Milimetric: Let pageview api work with different metrics [analytics/dashiki] - https://gerrit.wikimedia.org/r/222134 (https://phabricator.wikimedia.org/T104003)
[16:12:29] (PS1) Milimetric: Fix incorrect query [analytics/limn-extdist-data] - https://gerrit.wikimedia.org/r/222137
[16:12:40] (CR) Milimetric: [C: 2 V: 2] Fix incorrect query [analytics/limn-extdist-data] - https://gerrit.wikimedia.org/r/222137 (owner: Milimetric)
[16:13:39] ottomata: batcave ?
[16:13:45] yes! you wanted a few minutes though?
[16:13:47] oh
[16:13:49] to chat. :p
[16:13:58] huhu :)
[16:13:59] yes batcave
[16:15:06] (CR) Milimetric: [C: 2 V: 2] "eh, self-merging to unbreak vital signs" [analytics/dashiki] - https://gerrit.wikimedia.org/r/222134 (https://phabricator.wikimedia.org/T104003) (owner: Milimetric)
[16:16:06] milimetric, sorry, I was going to merge it right now
[16:16:32] mforns: it's ok. weiird... I tried toping you and IRC said you weren't on
[16:16:39] *to ping
[16:16:53] yeeea... I was here, don't know what happened
[16:17:06] my internet is crazy today
[16:17:13] it's ok, feel free to -2 if anything was wrong though
[16:17:45] milimetric, I tested it and could not find any problem
[16:17:56] milimetric, what is the config page for Vital Signs?
[16:17:58] oh joal: pat yourself on the back man, NICE: https://vital-signs.wmflabs.org
[16:18:08] https://meta.wikimedia.org/wiki/Config:VitalSigns
[16:18:25] and this is the default that the local dev version uses mforns: v
[16:18:28] oops :)
[16:18:28] https://meta.wikimedia.org/wiki/Dashiki:DefaultDashboard
[16:18:39] and this is where I defined the new metric: https://meta.wikimedia.org/wiki/Dashiki:CategorizedMetrics
[16:19:00] so now any of the metrics-by-project dashboards can use either "LegacyPageviews" or "Pageviews"
[16:19:08] milimetric, cool
[16:19:16] but, I love that new graph, so smooth
[16:19:42] hehe true
[16:26:32] joal: over 4 million daily hits to zero?! wow!! https://vital-signs.wmflabs.org/#projects=enwiki/metrics=Pageviews
[16:31:28] Analytics, Engineering-Community, Research-and-Data, ECT-July-2015: Metrics about the use of the Wikimedia web APIs - https://phabricator.wikimedia.org/T102079#1417886 (bd808) >>! In T102079#1417776, @Halfak wrote: > Indeed, it may be that implementing this in EL would be redundant in that you are a...
[16:32:23] Analytics-Cluster, hardware-requests, operations: rack new hadoop worker nodes - https://phabricator.wikimedia.org/T104463#1417892 (RobH) NEW a:RobH
[16:32:36] Analytics-Cluster, operations, ops-eqiad: rack new hadoop worker nodes - https://phabricator.wikimedia.org/T104463#1417892 (RobH)
[16:33:27] Analytics-Cluster, operations, ops-eqiad: rack new hadoop worker nodes - https://phabricator.wikimedia.org/T104463#1417892 (RobH) Last instruction from @ottomata is to place 4 in rack d2-eqiad for now, the location of the remainder still need to be determined.
[16:35:10] Analytics-Cluster, Analytics-Kanban: {mule} Cluster Expansion - https://phabricator.wikimedia.org/T99952#1417924 (RobH)
[16:35:13] Analytics-Cluster, hardware-requests, operations: Hadoop worker node procurement - 2015 - https://phabricator.wikimedia.org/T100442#1417920 (RobH) Open>Resolved I've created T104463 for the racking of these systems. As this hardware request has been completed, I'm resolving this task.
[16:43:47] madhuvishy: ! come to batcave!
[16:43:52] we are talking EL and parallelization
[16:48:26] bah - crash - gonna reboot
[16:55:47] ottomata: oh! I'm on the commute :/
[17:14:45] ok, madhuvishy, let me know when you are around and I will catch you up
[17:18:54] Analytics, Engineering-Community, Research-and-Data, ECT-July-2015: Metrics about the use of the Wikimedia web APIs - https://phabricator.wikimedia.org/T102079#1418062 (Halfak) Both massive capacity increases and stream processing/aggregation are planned for EL. In the meantime, I agree that 5000-5...
[17:19:09] Guys, I'm good for today :)
[17:19:12] Baby time ;)
[17:19:17] Laaaaters
[17:20:09] Analytics, Engineering-Community, Research-and-Data, ECT-July-2015: Metrics about the use of the Wikimedia web APIs - https://phabricator.wikimedia.org/T102079#1418065 (Halfak) @Qgil, re. assigning this task, I don't think that I can promise much more than discussion here in the short term. I've go...
[17:22:27] halfak: are you around ?
[17:22:36] o/ joal|night
[17:22:55] do you have a minute for me before I forget and get to my son ?
[17:23:01] Sure
[17:23:07] I have 8 minutes :)
[17:23:09] batcave ?
[17:23:22] * halfak joins
[17:29:16] Analytics-Kanban, MediaWiki-extensions-ExtensionDistributor, Patch-For-Review: Set up graphs and dumps for ExtensionDistributor download statistics {frog} [3 pts] - https://phabricator.wikimedia.org/T101194#1418119 (Milimetric) I double checked and your dashboard is updating as of last night: http://ex...
[17:33:06] (PS1) Milimetric: Restore overall monthly downloads, without a graph [analytics/limn-extdist-data] - https://gerrit.wikimedia.org/r/222144
[17:33:26] (CR) Milimetric: [C: 2 V: 2] Restore overall monthly downloads, without a graph [analytics/limn-extdist-data] - https://gerrit.wikimedia.org/r/222144 (owner: Milimetric)
[17:33:57] Analytics-Tech-community-metrics, Engineering-Community, ECT-July-2015: Check whether it is true that we have lost 40% of code contributors in the past 12 months - https://phabricator.wikimedia.org/T103292#1418127 (Dicortazar) > I think I'm still not seeing the refreshed data (last complete month is st...
[17:34:55] ottomata: I'm here
[17:36:29] madhuvishy: batcave?
[17:37:03] joining
[17:58:34] team, internet problems again... I'm going to the internet provider shop, will be here later on today
[17:58:57] no worries mforns_internet
[17:59:17] :[
[18:08:45] Analytics, Engineering-Community, Research-and-Data, ECT-July-2015: Metrics about the use of the Wikimedia web APIs - https://phabricator.wikimedia.org/T102079#1418217 (Anomie) >>! In T102079#1418062, @Halfak wrote: > In the meantime, I agree that 5000-5500 requests per second is far too much. That...
[18:50:22] EVERYONE: apparently that weird Wikimania registration question that none of us knew how to answer, that was backwards, so they think we all want single rooms
[18:50:32] ottomata, joal, mforns, madhuvishy ^
[18:50:43] madhuvishy has confirmed single. so she's fine.
[18:50:46] milimetric, ^
[18:50:50] i think we are getting that :)
[18:50:54] halfak, ^
[18:51:05] and apparently there's a 40 minute deadline to say yes / no to sharing
[18:51:56] ottomata: you want single or share? joal|night, can you answer the same question? how about you halfak?
[18:52:13] kevinator too, but he's not on
[18:52:24] and I don't see him at his desk milimetric.
[18:52:28] I'm for sharing
[18:52:38] me, too, milimetric.
[19:00:02] ottomata, joal|night, halfak: Dan and I emailed Karen.
if you want to email Karen in the next 30 min and ask to share a room with each other, please do so. That's the deadline (not sure how much we can change things later). I know Bob didn't mind sharing with one of you either, so pull him in if you need another person, please. :-)
[19:02:51] leila: i don't care so much, but i think there was some confusion around this so I went with just a single room. SOooo that's fine!
[19:03:04] ok milimetric
[19:03:09] ready for some benchmarking!
[19:03:11] i need to eat something
[19:03:16] want me to get you set up and then i go eat?
[19:03:20] or should I eat first?
[19:03:23] yeah, ottomata. np. we just wanted to make sure you're getting what you're fine with. :-)
[19:03:25] ottomata: go eat
[19:03:27] k
[19:28:27] milimetric: pingggggnngngngng
[19:28:35] hey
[19:28:38] batcave?
[19:28:41] ja
[19:44:10] ah, arrived too late :(
[19:46:19] Single will be fine, but I could have shared
[19:46:25] Anyway, it is it
[19:46:28] Bye all P
[19:50:13] milimetric: ut?
[19:51:59] hey kevinator
[19:52:21] hey I got dashiki / vital signs issues...
[19:52:24] I messed up the config
[19:52:29] can we talk in the batcave?
[19:52:36] I'm in the batcave
[20:55:46] wow milimetric, you are right about the console logging really slowing things down
[20:56:04] i'm testing some things, and turned on debug logging, which causes more than 1 log to fire for every read message
[20:56:10] cuts performance about in half
[20:56:46] yep :)
[20:56:56] try sticking it in a log file
[20:57:02] it should make no difference then
[20:58:00] like piping to a file?
[20:58:02] like /dev/null?
[20:59:05] yes
[20:59:17] or an actual file if you're actually interested in the output
[21:00:06] hm, it did make a difference
[21:00:15] hm, well
[21:00:17] i did one run
[21:00:23] piping to /dev/null with loglevel debug
[21:00:32] and another piping to /dev/null with log level info
[21:00:49] info was about 30% faster
[21:01:33] weird...
the thing that's slowing it should be just the rendering of the text on the console
[21:01:47] when I pipe it still renders the log text
[21:37:37] milimetric:
[21:37:39] it is on stderr
[21:37:43] 2>&1 | ...
[21:37:44] actually
[21:37:47] just stderr to pipe to file
[21:37:49] 2> /dev/null
[21:37:50] will do it
[21:55:33] so did that make it faster ottomata ?
[21:55:49] piping?
[21:55:54] not looking at that at the moment
[21:56:06] but, i think i have to change something
[21:56:12] i think i have to init writers in the workers
[21:56:27] doing that now
[21:56:30] trying it
[21:56:35] producing to kafka wasn't working
[21:56:37] oh interesting... won't that add overhead?
[21:56:49] it should, but hopefully not much
[22:26:11] ok, milimetric, the answer is yes, it does make it faster
[22:26:26] with 1 process, and default kafka producer settings (no async, we might want to try that)
[22:26:37] i can process and produce about 170 msgs / sec
[22:27:22] with 8 or 12 procs, i can keep up with the event stream, which is only around 220+ / sec.
[22:27:27] so at least i've overcome existing lag
[22:36:14] oooweee can go way faster with async
[22:37:05] NIIIIce, milimetric, I am reading from kafka the last week of data, and able to produce about 7800 / sec to kafka
[22:37:22] that's in batches of 100
[22:37:22] hmmm
[22:53:51] Analytics-EventLogging, MediaWiki-General-or-Unknown, Patch-For-Review, Performance: Add event tracking queue to MediaWiki core for loose coupling with EventLogging or other interested consumers - https://phabricator.wikimedia.org/T95356#1419137 (bd808) After discussion on the initial patch to add...
[23:25:08] Analytics-EventLogging, MediaWiki-General-or-Unknown, Patch-For-Review, Performance: Add event tracking queue to MediaWiki core for loose coupling with EventLogging or other interested consumers - https://phabricator.wikimedia.org/T95356#1419246 (Tgr) I wonder where event schemas should be document...
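A note on the logging exchange above: Python's `logging.StreamHandler` writes to `sys.stderr` by default, which would explain why redirecting stdout alone still showed log text and why `2> /dev/null` was needed. The sketch below is only an illustration, not the eventlogging-processor code: the function name `count_emitted`, the logger names, and the two-debug-lines-per-message pattern are assumptions chosen to mirror "more than 1 log to fire for every read message". It counts how many lines actually reach the handler at each level, using an in-memory stream as a stand-in for stderr.

```python
import io
import logging


def count_emitted(level, n_messages=1000):
    """Log two DEBUG lines per message plus one INFO summary.

    Returns the number of lines that reached the handler. The
    StringIO stream is a stand-in for stderr (hypothetical setup).
    """
    stream = io.StringIO()
    logger = logging.getLogger("processor_sketch_%s" % level)
    logger.setLevel(level)
    logger.propagate = False  # keep output out of the root logger
    handler = logging.StreamHandler(stream)
    logger.addHandler(handler)
    try:
        for i in range(n_messages):
            # At INFO level these calls return early, before any
            # string formatting or I/O takes place.
            logger.debug("read message %d", i)
            logger.debug("parsed message %d", i)
        logger.info("processed %d messages", n_messages)
    finally:
        logger.removeHandler(handler)
    return len(stream.getvalue().splitlines())


if __name__ == "__main__":
    print(count_emitted(logging.DEBUG))  # every debug line is formatted and written
    print(count_emitted(logging.INFO))   # only the summary line is written
```

Even when the output is discarded, debug-level records are still built and formatted before being written, so a lower level costing throughput regardless of where stderr points is at least plausible; the 30% figure in the log is the participants' measurement, not something this sketch reproduces.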
[23:27:53] bye team, see ya tomorrow :]
[23:35:42] 7800 per second was about what we got with 5 workers right
[23:39:34] Analytics-EventLogging, MediaWiki-General-or-Unknown, Patch-For-Review, Performance: Add event tracking queue to MediaWiki core for loose coupling with EventLogging or other interested consumers - https://phabricator.wikimedia.org/T95356#1419316 (bd808) >>! In T95356#1419246, @Tgr wrote: > I wonder...
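The jump from about 170 msgs / sec to about 7800 / sec is attributed in the log to async producing "in batches of 100". As a rough sketch of the batching half of that idea only, the code below buffers messages and hands them to a send function 100 at a time. Everything here is hypothetical: `BatchingWriter`, `produce_all`, and the `send` callable are illustrative names, not the Kafka client API or the EventLogging writer, and the sketch is synchronous, so it shows the reduction in send calls but not background sending.

```python
class BatchingWriter:
    """Buffer messages and pass them to `send` in fixed-size batches.

    `send` is any callable that accepts a list of messages; in a real
    setup it would wrap a producer call (assumption, not shown here).
    """

    def __init__(self, send, batch_size=100):
        self.send = send
        self.batch_size = batch_size
        self.buffer = []

    def write(self, message):
        self.buffer.append(message)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        # Send whatever is buffered, including a final partial batch.
        if self.buffer:
            self.send(list(self.buffer))
            self.buffer.clear()


def produce_all(messages, batch_size=100):
    """Write every message through a BatchingWriter; return the batches sent."""
    sent_batches = []
    writer = BatchingWriter(sent_batches.append, batch_size)
    for message in messages:
        writer.write(message)
    writer.flush()
    return sent_batches


if __name__ == "__main__":
    batches = produce_all(range(250))
    print([len(batch) for batch in batches])
```

With a batch size of 100, 250 messages turn into three send calls instead of 250, which is the kind of per-message overhead saving the async setting was presumably providing.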