[00:01:08] Analytics-Backlog, Analytics-Dashiki, VisualEditor: Improve the edit analysis dashboard {lion} - https://phabricator.wikimedia.org/T104261#1486963 (Neil_P._Quinn_WMF) a:Neil_P._Quinn_WMF
[00:21:31] Ironholds: hey, around?
[02:47:05] (PS1) Catrope: Fix unique-users query [analytics/limn-flow-data] - https://gerrit.wikimedia.org/r/227398 (https://phabricator.wikimedia.org/T106564)
[02:47:07] (PS1) Catrope: Clean up the messages-posted and moderation-actions queries [analytics/limn-flow-data] - https://gerrit.wikimedia.org/r/227399
[02:47:09] (PS1) Catrope: Correct weekstart computation in the remaining queries [analytics/limn-flow-data] - https://gerrit.wikimedia.org/r/227400
[02:47:11] (PS1) Catrope: Put unique-users on top, and add top-reply-link back in [analytics/limn-flow-data] - https://gerrit.wikimedia.org/r/227401
[04:33:57] Analytics-Backlog: Delete all data from EventLogging:PersonalBar schema {tick} - https://phabricator.wikimedia.org/T105065#1487351 (kevinator)
[04:33:58] Analytics-Backlog: Enforce policy for each schema: Sanitize {tick} [8 pts] - https://phabricator.wikimedia.org/T104877#1487352 (kevinator)
[04:34:00] Analytics-Backlog: Host a debrief of EventLogging cleanup {tick} - https://phabricator.wikimedia.org/T104351#1487353 (kevinator)
[04:34:02] Analytics-Backlog, Analytics-EventLogging: Update Schema Talk pages {tick} - https://phabricator.wikimedia.org/T103133#1487356 (kevinator)
[04:39:53] Analytics-Kanban: Bug: puppet not running on wikimetrics1 instance, Vital Signs stale {musk} [5 pts] - https://phabricator.wikimedia.org/T105047#1487363 (kevinator)
[04:39:54] Analytics-Backlog, Analytics-Cluster: Python Aggregator: Solve inconsistencies in data ranges when using --all-projects flag {musk} - https://phabricator.wikimedia.org/T106554#1487362 (kevinator)
[04:39:56] Analytics-Backlog: Sanitize aggregated data presented in VitalSign using K-Anonymity {musk} [8 pts] - https://phabricator.wikimedia.org/T104485#1487364 (kevinator)
[04:39:58] Analytics-Kanban, Analytics-Visualization, Patch-For-Review: Integrate Dygraphs into Vital Signs {musk} [13 pts] - https://phabricator.wikimedia.org/T96339#1487366 (kevinator)
[04:40:00] Analytics-Kanban, Analytics-Visualization, Patch-For-Review: Update Vital Signs UX for aggregations {musk} [13 pts] - https://phabricator.wikimedia.org/T95340#1487367 (kevinator)
[04:40:02] Analytics-Kanban, Analytics-Visualization, Patch-For-Review: Set up vital-signs.wmflabs.org {musk} [8 pts] - https://phabricator.wikimedia.org/T95338#1487368 (kevinator)
[04:40:04] Analytics-Kanban, Patch-For-Review: Link to new projectcounts data and serve via wikimetrics {Musk} [5 pts] - https://phabricator.wikimedia.org/T104003#1487365 (kevinator)
[04:40:06] Analytics-Cluster, Analytics-Kanban: {musk} Pageviews in Vital Signs - https://phabricator.wikimedia.org/T101120#1487361 (kevinator)
[09:15:33] Analytics-Tech-community-metrics: Add "Ticket Openers" to Korma's "Activity by contributors" - https://phabricator.wikimedia.org/T105634#1487642 (Qgil) p:Lowest>Low
[11:25:58] Analytics-Wikistats: statistics for Wikidata API usage - https://phabricator.wikimedia.org/T64873#1487882 (Addshore) Well, I recently started adding some apifeature logging to certain areas of the wikibase api. The question is should I bother doing this? Hoo mentioned to me that all api calls are logged any...
[11:26:11] Analytics-Wikistats: statistics for Wikidata API usage - https://phabricator.wikimedia.org/T64873#1487884 (Addshore) a:Addshore
[12:56:26] (CR) Matthias Mullie: [C: 2] Fix unique-users query [analytics/limn-flow-data] - https://gerrit.wikimedia.org/r/227398 (https://phabricator.wikimedia.org/T106564) (owner: Catrope)
[12:59:31] (CR) Matthias Mullie: [C: 2] Clean up the messages-posted and moderation-actions queries [analytics/limn-flow-data] - https://gerrit.wikimedia.org/r/227399 (owner: Catrope)
[12:59:48] (CR) Matthias Mullie: [C: 2] Correct weekstart computation in the remaining queries [analytics/limn-flow-data] - https://gerrit.wikimedia.org/r/227400 (owner: Catrope)
[13:04:20] (CR) Matthias Mullie: [C: -1] "I don't think we want top-reply-link back, or at least not in its current form." [analytics/limn-flow-data] - https://gerrit.wikimedia.org/r/227401 (owner: Catrope)
[13:04:50] o/ joal|weekend & milimetric
[13:15:05] No live systems meeting I guess.
[13:15:28] TL;DR: of my updates is that I want to run some tests with doing the persistence computation on stat1003.
[13:22:12] We've been seeing some substantial performance differences in the map/reduce context that work fine in single-uber-machine.
[13:22:56] I'd like to get a sense for the performance of one machine before we go back to the cluster with persistence computation work.
[15:32:14] ottomata: around?
[15:32:18] coming to standup?
[15:32:38] OO
[15:32:39] yes
[15:32:42] thanks
[15:38:08] Analytics-Kanban: Build a deb package for kafka 0.8.2 - https://phabricator.wikimedia.org/T107157#1488288 (Milimetric) NEW a:Ottomata
[15:39:17] Analytics-Kanban: Build a deb package for kafka 0.8.2 - https://phabricator.wikimedia.org/T107157#1488288 (Milimetric)
[15:39:26] Analytics-Cluster, Analytics-Kanban, operations: Build new latest stable (0.8.2.1?) Kafka package and upgrade Kafka brokers - https://phabricator.wikimedia.org/T106581#1488300 (Milimetric)
[15:57:11] btw, I'll deploy wikimetrics now
[15:57:27] i'll be in the batcave
[16:01:41] o/ milimetric, is joal away this week?
[16:07:57] halfak: he's gone for 3 days
[16:08:05] Gotcha. Thank you :)
[16:08:15] I think he'll be back Thursday
[16:10:37] mforns: any more time to review https://gerrit.wikimedia.org/r/#/c/223789/ ?
[16:18:14] Analytics-Kanban, MediaWiki-extensions-ExtensionDistributor, Patch-For-Review: Set up graphs and dumps for ExtensionDistributor download statistics {frog} [3 pts] - https://phabricator.wikimedia.org/T101194#1488460 (Milimetric) Resolved>Open something went really wrong with this... I'm seeing Jap...
[16:23:02] (PS1) Milimetric: Fixing repo - forgot to add downloads file [analytics/limn-extdist-data] - https://gerrit.wikimedia.org/r/227470
[16:23:43] (CR) Milimetric: [C: 2 V: 2] Fixing repo - forgot to add downloads file [analytics/limn-extdist-data] - https://gerrit.wikimedia.org/r/227470 (owner: Milimetric)
[16:54:10] PROBLEM - Difference between raw and validated EventLogging overall message rates on graphite1001 is CRITICAL 20.00% of data above the critical threshold [30.0]
[16:56:19] RECOVERY - Difference between raw and validated EventLogging overall message rates on graphite1001 is OK Less than 15.00% above the threshold [20.0]
[16:59:38] Analytics-Engineering, Wikimedia-Logstash, operations: Convert Hadoop-Logstash logging to use Redis to address failures - https://phabricator.wikimedia.org/T85015#1488627 (bd808) Note: the redis connector has been removed from the logstash servers after its use caused problems with MediaWiki latency in...
[17:03:54] Analytics-Backlog: Restart Pentaho - https://phabricator.wikimedia.org/T105107#1488630 (Milimetric) FYI we talked to Tilman and showed him around the data, and we're available for any help he needs to dig through it.
[17:32:19] milimetric: did you finish deploying?
[17:32:32] yes
[17:32:45] oh cool :)
[17:32:51] I sent a note to wikimetrics-l when i finished
[17:33:03] aaah, i think i'm not in that list
[17:33:27] this was an easy one, just needed a restart to apply the cache buster based on the new commit SHA
[17:34:08] ah okay
[17:34:14] milimetric: we should talk about the puppet stuff for wikimetrics sometime. I started working on it at the hackathon but probably need your help for some things
[17:34:55] (CR) Matthias Mullie: [V: 2] Fix unique-users query [analytics/limn-flow-data] - https://gerrit.wikimedia.org/r/227398 (https://phabricator.wikimedia.org/T106564) (owner: Catrope)
[17:35:03] (CR) Matthias Mullie: [V: 2] Clean up the messages-posted and moderation-actions queries [analytics/limn-flow-data] - https://gerrit.wikimedia.org/r/227399 (owner: Catrope)
[17:35:14] madhuvishy: sure, what's up
[17:35:16] (CR) Matthias Mullie: [V: 2] Correct weekstart computation in the remaining queries [analytics/limn-flow-data] - https://gerrit.wikimedia.org/r/227400 (owner: Catrope)
[17:35:39] milimetric: https://phabricator.wikimedia.org/T101763 is the ticket
[17:36:05] brb, door
[17:38:09] k, back
[17:38:35] madhuvishy: ok, yep, so you're modeling this after ORES?
[17:38:43] the puppet code here - https://github.com/wikimedia/operations-puppet-wikimetrics/tree/master/manifests
[17:38:49] milimetric: yeah, i'd like to
[17:38:59] and separate the deployment things to fabric
[17:39:36] but i was hoping to use the puppet celery and uwsgi modules. and the way wikimetrics services work is confusing to me
[17:40:20] ok, well, Andrew did the puppet work, and I found that confusing too, but the services themselves are fairly simple
[17:40:27] what part's confusing?
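Note on the 17:33:27 message: the log does not show how wikimetrics builds its cache buster, so the following is only a minimal sketch of the general technique, assuming static asset URLs get a query parameter derived from the deployed commit SHA. The function name `add_cache_buster`, the parameter name `v`, and the 7-character abbreviation are all hypothetical choices for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_cache_buster(url, commit_sha, param="v"):
    """Return `url` with a cache-busting query parameter set from `commit_sha`.

    Hypothetical helper: the parameter name and the abbreviated SHA length
    are assumptions, not taken from the wikimetrics code base.
    """
    parts = urlsplit(url)
    # Drop any previous value of the cache-buster parameter, keep the rest.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != param
    ]
    query.append((param, commit_sha[:7]))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Under these assumptions a new deploy changes the SHA, so every asset URL changes and browsers fetch fresh copies. That would fit the remark that a restart was enough, if the SHA is read once at startup.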
[17:41:37] milimetric: for example, https://github.com/wikimedia/operations-puppet/blob/production/modules/ores/manifests/worker.pp - the celery module expects an application
[17:42:22] https://github.com/wikimedia/analytics-wikimetrics/blob/master/wikimetrics/run.py where the run queue stuff is
[17:42:51] is modeled differently, or i don't understand how it works - or how i can get it to use the celery module
[17:43:44] right, so running celery was done by run.py but is configured externally. So we can just change the configuration and not run it ourselves
[17:43:55] that should just work
[17:44:17] milimetric: ah, where is the config?
[17:45:13] this is the default config: https://github.com/wikimedia/analytics-wikimetrics/blob/master/wikimetrics/config/queue_config.yaml
[17:45:20] but in prod it pulls it from: /etc/wikimetrics/queue_config.yaml
[17:45:43] right
[17:46:12] I never tried to configure it with a remote celery, so there may be some code changes required (it has to serialize the tasks to the remote server)
[17:47:29] yeah, that's what it feels like to me. and not sure if we should invest in doing everything, or just move to fab for deployment, and reuse the services the same way - and not use the modules for celery etc
[17:48:15] the fab deployment part seems like the part we want. Because all we really want is to stop it from being a self-hosted puppet master
[17:49:13] I fully agree that wikimetrics puppet work is not anywhere near our top list of priorities right now
[17:52:50] milimetric: okay
[17:53:48] i'll look at it more when i get a chance, and poke you with more specific questions
[17:53:49] (PS2) Catrope: Put unique-users on top [analytics/limn-flow-data] - https://gerrit.wikimedia.org/r/227401
[17:59:42] Analytics-Kanban: {lama} Wikistats 2.0 - https://phabricator.wikimedia.org/T107175#1488807 (Milimetric) NEW
[18:01:14] kevinator / madhuvishy / mforns: I made this ^ epic
[18:01:28] lookin
[18:01:31] we can change the animal name, but I needed somewhere to log some discussions that have already started
[18:02:22] cool
[18:07:55] milimetric: nice!
[18:30:37] (CR) Matthias Mullie: [C: 2 V: 2] Put unique-users on top [analytics/limn-flow-data] - https://gerrit.wikimedia.org/r/227401 (owner: Catrope)
[19:18:34] (CR) Mforns: [C: 2 V: 2] "LGTM" [analytics/dashiki] - https://gerrit.wikimedia.org/r/223789 (https://phabricator.wikimedia.org/T95340) (owner: Milimetric)
[19:18:45] sorry for taking so long
[19:20:15] mforns: I'll try deploying it now
[19:22:13] Analytics-EventLogging, Analytics-Kanban: Can Search up sampling to 5%? {oryx} - https://phabricator.wikimedia.org/T103186#1489349 (Milimetric) a:Milimetric
[19:30:06] mforns: looking good, you can search for "totals" now
[19:30:27] milimetric, did you substitute available-projects.json by the stub?
[19:30:30] or "all"
[19:30:32] https://vital-signs.wmflabs.org/#projects=ruwiki,itwiki,dewiki,frwiki,enwiki,eswiki,jawiki,all/metrics=Pageviews
[19:30:41] not the stub, the new one generated by wikimetrics
[19:30:46] (which I deployed this morning)
[19:31:18] oh! I see
[19:32:03] milimetric, it works fine, the only thing I noticed, which was happening before already, is that the second choice does not autocomplete
[19:32:20] yeah, i just noticed that too
[19:32:30] or maybe I just ignored it until now :)
[19:32:36] no big deal
[19:32:50] we can add "total" to the defaults, or replace the defaults with total
[19:32:55] ok, we'll do that at some point
[19:33:04] aha
[19:34:12] I like that when sharing the graph with the main languages, the base of the y-axis is 0
[19:34:51] with the totals alone, the base is not 0 and the drop in pageviews seems a lot more steep than it actually is
[19:41:54] yeah, there's a force 0 option in dygraphs, maybe it'd be a good idea. Along with a toggle log scale maybe
[20:13:20] milimetric, I'm looking at cleaning the mobile reportcard now, there's one line in the description I don't understand
[20:13:37] : duplicate from above, but keep only last 3 months: "http://datasets.wikimedia.org/limn-public-data/mobile/datafiles/mobile-options.csv",
[20:13:54] https://phabricator.wikimedia.org/T104379
[20:16:28] Trying to remember what i meant mforns :)
[20:17:00] milimetric, maybe it is creating another graph with the last 3 months only?
[20:18:53] Yes, that's it. But this one was optional, we would have to make a custom graph or change limn code or change the reportupdater
[20:21:02] milimetric, is it implied to change all graphs to reportupdater?
[20:21:11] ok
[20:21:48] that's up to us but I was thinking it would be a good idea. Do you want to split this up mforns?
[20:22:28] I can't start on the endpoints until jo is back
[20:22:50] milimetric, no, that's fine! I was thinking, if we keep this report as non-timeboxed (in generate.py) it could be restricted to the last 3 months
[20:23:10] but this only could be done if the schema is non-sensitive...
[20:23:52] oh, ok, yeah, if that works then i'm fine with it
[20:24:07] or... we can migrate all of them to reportupdater, and leave this special case for later
[20:24:24] ok
[21:15:28] Analytics-Kanban: Clean up mobile-reportcard dashboards {frog} [13 pts] - https://phabricator.wikimedia.org/T104379#1489852 (mforns) http://datasets.wikimedia.org/limn-public-data/mobile/datafiles/ui-daily-historic.csv uses MobileWebClickTracking schema. This schema has seen a dramatic drop in volume of event...
[22:13:37] Analytics-Kanban: Clean up mobile-reportcard dashboards {frog} [13 pts] - https://phabricator.wikimedia.org/T104379#1489960 (mforns) Same for http://datasets.wikimedia.org/limn-public-data/mobile/datafiles/main-menu-daily.csv, it uses MobileWebMainMenuClickTracking and this schema suffered two drops in the re...
[22:36:20] Analytics-Cluster, Ops-Access-Requests, operations: Sudo permissions for hdfs user madhuvishy on analytics-hadoop - https://phabricator.wikimedia.org/T104020#1490053 (RobH) Open>Resolved @madhuvishy, Your access has been granted and is now live.
[22:36:32] Analytics-Cluster, Ops-Access-Requests, operations: Sudo permissions for hdfs user madhuvishy on analytics-hadoop - https://phabricator.wikimedia.org/T104020#1490056 (RobH) a:Ottomata>None
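Note on the 20:13-20:24 discussion of keeping only the last 3 months of a datafile: the log does not show how generate.py or reportupdater would do this, so the sketch below is only one possible approach. It assumes the CSV has an ISO-formatted `date` column and approximates "3 months" as 92 days. The function name `keep_recent_rows` and both of those assumptions are hypothetical.

```python
import csv
import io
from datetime import date, timedelta


def keep_recent_rows(csv_text, today, days=92, date_column="date"):
    """Return CSV text containing only rows dated within `days` days of `today`.

    Hypothetical helper: the column name and the 92-day window are
    assumptions, not taken from the limn or reportupdater code.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    cutoff = today - timedelta(days=days)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Keep the row only if its date is on or after the cutoff.
        if date.fromisoformat(row[date_column]) >= cutoff:
            writer.writerow(row)
    return out.getvalue()


sample = "date,clicks\n2015-01-01,10\n2015-06-15,20\n2015-07-27,30\n"
recent = keep_recent_rows(sample, today=date(2015, 7, 28))
```

With the sample above, the January row falls before the cutoff and is dropped, while the June and July rows are kept. Passing `today` explicitly keeps the function deterministic and easy to test.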