[08:17:44] hello oozie!
[08:18:21] oh really? Did you have a bad weekend? I know I know, but there is no need to complain each time with us..
[08:18:30] just saying
[08:18:53] :P
[08:22:09] are those nuria's ?
[09:00:25] Analytics-Tech-community-metrics: Deployment of IRC panel - https://phabricator.wikimedia.org/T138004#3057081 (Lcanasdiaz) Open>Resolved @Aklapper this is finally done! https://wikimedia.biterg.io/app/kibana#/dashboard/IRC
[09:00:27] Analytics-Tech-community-metrics, Developer-Relations, Epic: Complete migration to new Bitergia's development dashboard (and then kill korma.wmflabs.org) - https://phabricator.wikimedia.org/T137997#3057083 (Lcanasdiaz)
[09:17:55] Hi elukey, I think they are
[09:19:57] elukey:
[09:19:58] https://hue.wikimedia.org/oozie/list_oozie_coordinator/0024526-160420145651441-oozie-oozi-C/
[09:20:49] yeah I saw it, plus Nuria's email seems to indicate that she is doing some experiments?
[09:22:49] elukey: I think so to
[09:22:55] +o
[09:24:36] (CR) Joal: [C: 1] "LGTM again :)" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/337593 (https://phabricator.wikimedia.org/T156388) (owner: Mforns)
[09:40:36] I've been craving lasagna all morning
[09:44:38] (PS1) Fdans: Use v2 table in Cassandra, switch to padded day timestamp [analytics/refinery] - https://gerrit.wikimedia.org/r/340093 (https://phabricator.wikimedia.org/T156312)
[09:44:43] fdans: you should come to Bologna and enjoy them :)
[09:45:00] damn I know right?
[09:45:16] ahahah
[09:45:58] damn I could just drink bolognese sauce right now
[09:46:32] * fdans apologises for the overuse of "damn"
[09:50:41] (PS2) Fdans: Use v2 table in Cassandra, switch to padded day timestamp [analytics/refinery] - https://gerrit.wikimedia.org/r/340093 (https://phabricator.wikimedia.org/T156312)
[10:06:51] (CR) Joal: [C: -1] "Comments inline" (4 comments) [analytics/aqs] - https://gerrit.wikimedia.org/r/339419 (https://phabricator.wikimedia.org/T156312) (owner: Fdans)
[10:08:34] merci beaucoup joal!!
[10:14:25] You're welcome fdans :)
[10:51:52] (PS2) Fdans: Format timestamps in per-project aggregation so that comparison in Cassandra returns the correct months [analytics/aqs] - https://gerrit.wikimedia.org/r/339419 (https://phabricator.wikimedia.org/T156312)
[11:00:12] (CR) Fdans: Format timestamps in per-project aggregation so that comparison in Cassandra returns the correct months (4 comments) [analytics/aqs] - https://gerrit.wikimedia.org/r/339419 (https://phabricator.wikimedia.org/T156312) (owner: Fdans)
[11:25:37] Analytics-Kanban, User-Elukey: Clean up datasets.wikimedia.org - https://phabricator.wikimedia.org/T125854#3057512 (elukey) @Milimetric: please check /home/milimetric/apache_logs_datasets on thorium, it should be good :) (you are the only one with read permissions for those files)
[11:45:37] team: I've restarted zookeeper on druid nodes a while ago to apply the Xmx settings
[11:45:40] all good for the moment
[11:45:54] I am going to proceed after lunch with main-eqiad
[11:46:05] that also holds stuff for Kafka and Hadoop
[11:46:15] I don't expect many issues
[11:46:25] but I said the same thing last week
[11:46:29] and Cassandra exploded
[12:03:29] * elukey lunch!
[13:17:16] Analytics-Tech-community-metrics, Developer-Relations (Jan-Mar-2017): Deployment of IRC panel - https://phabricator.wikimedia.org/T138004#3057769 (Aklapper) Thanks!
[13:18:17] Analytics, User-Elukey: Piwik puppet configuration refactoring and updates - https://phabricator.wikimedia.org/T159136#3057772 (elukey)
[13:19:41] milimetric: --^
[14:29:58] Taking a break a-team
[14:34:58] Analytics, EventBus, Patch-For-Review, Services (next): Page properties-change event is rejected if page was deleted - https://phabricator.wikimedia.org/T158702#3057929 (Ottomata) > We are not really sure what properties might've been changed and whether it's 100% safe to discard it ... > Hm, it...
[14:36:49] Analytics, Analytics-Cluster, EventBus, MediaWiki-Vagrant, Services (watching): Kafka logs are not pruned on vagrant - https://phabricator.wikimedia.org/T158451#3057930 (Ottomata) a:Ottomata
[14:43:22] Anyone here knows a thing or 2 about reportupdater?
[14:43:24] I’m curious about 2 things:
[14:43:32] - can I configure it to (re)create instead of append, every time it runs?
[14:43:41] - can I write multiple lines per time it runs?
[14:43:54] My scenario: we want to keep track of errors/exceptions (logged with eventlogging)
[14:44:03] These could change any time the query is run (as new issues happen and old ones get fixed)
[14:44:10] So there are no fixed rows...
[14:44:30] I don’t necessarily care for old data anyway (though ideally I’d have data for at least a few days old, to help pinpoint when new issues started happening, which is why I’d like to be able to write multiple lines at once)
[14:50:22] milimetric: ^ maybe?
[14:51:29] hey matthiasmullie, yes, but on my ipad at jury duty so bear with typing slowness
[14:52:16] :) no rush!
[14:55:34] recreate vs. append: it used to be able to do this but we turned it off because that's kind of what cron is for. We could reconsider that based on your use case, if you just want to keep everything under reportupdater.
[14:56:16] As for multiple lines, yes, you can output multiple lines per date by specifying the "funnel" option, which is a little oddly named, we know
[14:57:59] but matthiasmullie wait how are exceptions being thrown for past dates?
[14:59:29] they're not, but I could query for all errors in the past X days and then group them by date & (re)write them all at once (then I'd have all the same errors as row/column)
[14:59:56] as opposed to running the query daily and letting it append: that'd probably be a different order of most common queries quite often, with unreliable columns
[15:13:49] matthiasmullie: I should take a look at your schema, can you link me? Something's not adding up :)
[15:24:44] sure: https://meta.wikimedia.org/wiki/Schema:UploadWizardExceptionFlowEvent
[15:26:21] the existing quick and dirty queries are here: https://phabricator.wikimedia.org/T156694 (basically something like SELECT message, COUNT(*) as total FROM table GROUP BY message ORDER BY total DESC LIMIT 10, and then split up by day)
[15:27:36] (forgot to also add a "WHERE timestamp > -30 days" in that example query)
[15:39:01] Analytics-Kanban, User-Elukey: Bump replication factor of system.auth table in cassandra when new nodes have finished bootstrap - https://phabricator.wikimedia.org/T157354#3058137 (elukey) Some notes from the investigation made so far: 1) The procedure used by me has been reviewed with Eric and we didn'...
[16:01:49] milimetric: standddupppp
[16:06:50] milimetric: stadduppp?
[16:19:01] (for the record, I was running back from jury duty :))
[16:22:44] matthiasmullie: I had to run home, back now, I've got meetings but I'm looking at your query in-between
[16:23:05] matthiasmullie: do you want feedback here on IRC async over the next couple of hours or in the task?
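(Editor's sketch, not part of the channel log.) The "quick and dirty" query quoted at 15:26:21, once "split up by day" and given a date cutoff, could look roughly like the following. This is a minimal illustration only: it uses an in-memory SQLite table as a stand-in, and the table name `exceptions`, the columns `day` and `message`, and the sample rows are all assumptions, not the real EventLogging schema.

```python
import sqlite3

# Hypothetical stand-in for the EventLogging table; names and rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE exceptions (day TEXT, message TEXT)")
conn.executemany(
    "INSERT INTO exceptions VALUES (?, ?)",
    [
        ("2017-02-26", "some-exception"),
        ("2017-02-26", "some-exception"),
        ("2017-02-26", "other-exception"),
        ("2017-02-27", "some-exception"),
        ("2017-02-27", "yet-another"),
        ("2017-02-27", "yet-another"),
        ("2017-02-27", "yet-another"),
    ],
)


def daily_counts(conn, since_day):
    """Return (day, message, total) rows for every day on or after since_day."""
    query = """
        SELECT day, message, COUNT(*) AS total
        FROM exceptions
        WHERE day >= ?
        GROUP BY day, message
        ORDER BY day, total DESC, message
    """
    return conn.execute(query, (since_day,)).fetchall()


result = daily_counts(conn, "2017-02-26")
for row in result:
    print(row)
```

Grouping by both day and message yields one row per (day, message) pair, which is the multi-line-per-date shape discussed in the channel.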
[16:28:21] milimetric irc, task, doesn't matter :)
[16:28:26] take your time ;)
[16:37:58] matthiasmullie: ok, I think I have a good solution but let me first make sure I understand
[16:38:40] so your schema is instrumented to log errors anywhere in steps 1, 2, 3, ..., N of different flows through UploadWizard
[16:38:54] are the steps always the same or are there different flows?
[16:39:28] like is there flow 1: step1, step2, step3 and flow 2: step1, step2, where flow 1/step1 is a totally different thing from flow 2/step1?
[16:42:24] and essentially you want to see the top 10 most common errors / exceptions you get every day, and track back a few days to see how they trend or where they start happening, right?
[16:43:34] do you mean the "flowPosition" thingy?
[16:43:58] yes
[16:44:52] that one is incremented for every entry that is logged, so it could vary depending on user input (theoretically, at least - I'd have to doublecheck exactly what is logged, but likely enough that it's not guaranteed to always be the same number for the same event)
[16:46:31] ok, so that's more used when you're looking at an individual session to see what the sequence was then, ok
[16:46:42] indeed
[16:47:48] ok, just for future reference, if you identify and log an ordered set of steps that would make up a funnel, you can visualize what that funnel looks like with this kind of graph (the top one): https://edit-analysis.wmflabs.org/compare/
[16:48:23] Analytics-Tech-community-metrics, Developer-Relations (Jan-Mar-2017), Documentation: Create basic Kibana (dashboard) documentation for admins - https://phabricator.wikimedia.org/T145929#3058472 (Aklapper) Open>Resolved Done in https://www.mediawiki.org/w/index.php?title=Community_metrics&oldi...
[16:48:25] that shows how all user's experience breaks down at each step in the process
[16:48:33] *users'
[16:49:29] milimetric: when we talked about a cluster on labs we were thinking hadoop (cc joal)? or more like "clickhouse"
[16:49:57] matthiasmullie: ok, so just for the top 10 you need, you can query (date, exception, count) and just pivot on exception in the dashboard
[16:50:16] and then you can show line graph, area graph, percentage, whatever you want
[16:50:44] are you using dashiki for dashboarding? I forget if we ever implemented pivoting properly but I'll do that for you quickly if not
[16:51:25] nuria: that would be cool if we think we can work a hadoop cluster there, but I think druid/clickhouse would bring more value per ops work
[16:51:40] milimetric: k, let's ask druid/clickhouse then
[16:51:41] nuria: because people can get computing resources, but they can't get our data
[16:54:16] milimetric yeah, plan to use dashiki
[16:54:52] so reportupdater would let me append something like this for every run:
[16:55:08] 2017-02-27, some-exception, 10
[16:55:17] 2017-02-27, other-exception, 5
[16:55:26] 2017-02-27, yet-another, 3
[16:55:28] ...
[16:55:40] and dashiki would let me make sense of input like that?
[16:55:50] or am I misreading your suggestion? :)
[16:56:16] matthiasmullie: exactly
[16:56:33] the feature of reportupdater you need is to set "funnel: true" (https://wikitech.wikimedia.org/wiki/Analytics/Reportupdater#The_reports_section)
[16:57:15] and you can also set a window to only keep a certain amount of data, that's "max_data_points: 30"
[16:57:56] matthiasmullie: if you do that, I'll make sure you have a nice visualization of that in dashiki that you can use. Do you have an existing dashboard or you were thinking of a new one?
[16:58:03] milimetric: ok, added details here, please take a look: https://docs.google.com/spreadsheets/d/1123OTmek4eRriBkZrAjbp06aH0RMmR0e69TMUlVF84s/edit#gid=1863261805
[16:58:17] matthiasmullie: and is this going to be measured on a bunch of different projects or just one?
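(Editor's sketch, not part of the channel log.) To illustrate what "pivot on exception" means for the (date, exception, count) rows shown at 16:55, here is a small stand-alone example. It is not Dashiki's or reportupdater's code; the function names `pivot` and `keep_last` are hypothetical, the 2017-02-26 row is invented for illustration, and `keep_last` only loosely mimics the windowing idea behind "max_data_points".

```python
from collections import defaultdict

# Long-format rows as in the chat example; the 2017-02-26 row is made up.
rows = [
    ("2017-02-26", "some-exception", 4),
    ("2017-02-27", "some-exception", 10),
    ("2017-02-27", "other-exception", 5),
    ("2017-02-27", "yet-another", 3),
]


def pivot(rows):
    """Turn (date, exception, count) rows into one row per date, one column per exception."""
    columns = sorted({exc for _, exc, _ in rows})
    # Every date starts with a zero for each known exception, so gaps are explicit.
    table = defaultdict(lambda: dict.fromkeys(columns, 0))
    for date, exc, count in rows:
        table[date][exc] += count
    return columns, dict(sorted(table.items()))


def keep_last(table, max_points):
    """Keep only the most recent max_points dates (a rough stand-in for a data window)."""
    return dict(list(table.items())[-max_points:])


columns, wide = pivot(rows)
recent = keep_last(wide, 1)
print(columns)
for date, counts in wide.items():
    print(date, counts)
```

Because the columns are derived from whichever exceptions appear in the data, the set of series can change from run to run, which is the "no fixed rows" concern raised earlier; storing the long format and pivoting at display time sidesteps that.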
[16:59:50] joal: spreadsheet updated with labs cluster ask: https://docs.google.com/spreadsheets/d/1123OTmek4eRriBkZrAjbp06aH0RMmR0e69TMUlVF84s/edit#gid=1863261805
[16:59:54] nuria / joal: do you want to think about making the public cluster beefier by moving one or two machines from prod and keeping the production druid only for ad-hoc analysis of small amounts of private data?
[17:00:05] milimetric: nah, i think we will need both
[17:00:29] besides any serious setup needs more than 1 machine
[17:00:39] nuria: but if we do the sanitization, we can put that on labs, and follow our "public first" paradigm
[17:00:57] * milimetric realizes he said paradigm and ducks as tomatoes are thrown
[17:01:54] also nuria: I think we need to carefully think of druid/clickhouse if we go for a cluster in labs
[17:01:58] milimetric: once we have mastered the sanitization such that we know it is possible to have a dataset that is useful and sanitized we can think of doing that, it is a bit far off
[17:02:06] nuria, milimetric: It will really depend on the use case we go for
[17:02:22] joal: since hw is the same we can take decision later
[17:02:25] yeah, budgeting is such sweet sorrow
[17:02:32] nuria: Works for me, awesome
[17:02:32] milimetric that's only going to be for uploadwizard
[17:02:34] milimetric: i think we need to focus first on making sure our edit dataset is on labs
[17:02:38] Thanks nuria :)
[17:02:43] nuria: agreed
[17:02:44] milimetric: once that is done, let's think of next steps
[17:02:52] I believe there is some existing dashboard already, but not sure where, and if we'll also want to have these in there...
[17:02:52] k
[17:03:21] matthiasmullie: cool, so is it measured across multiple projects or just a few?
[17:03:58] (I'm just trying to think if you have like 3 or 4 projects, then we can make a separate tab for each one in a layout like this, where it's more flexible: https://analytics.wikimedia.org/dashboards/browsers/)
[17:04:01] matthiasmullie: ^
[17:04:29] commons only
[17:04:39] ok, cool
[17:04:48] then, yeah, get me that data and I'll get you a dashboard :)
[17:07:13] matthiasmullie: ^ when do you need this by?
[17:09:35] I'll start working on the reportupdater things tomorrow
[17:10:01] but it's not crazy urgent
[17:10:10] it's mostly for internal use
[17:10:20] and I can run some manual queries for the time being
[17:10:29] I have to run now, I'll try to catch up with backlog tonight!
[17:20:42] Analytics, Recommendation-API: productionize recommendation vectors - https://phabricator.wikimedia.org/T158973#3058562 (schana) @JAllemandou As I understand it, there is still some [[https://etherpad.wikimedia.org/p/recommendation-api-productisation|ongoing discussion]] about how/what parts of the [[htt...
[17:22:55] milimetric: did you add the piwik revamp in the budget doc?
[17:23:06] oh, no that was elukey?
[17:23:20] not me
[17:23:26] didn't know we were doing something like that
[17:23:41] yep i proposed it
[17:23:53] something a bit more powerful than bohrium
[17:23:57] but only if we want
[17:24:00] nothing urgent
[17:25:05] ok cool
[17:25:11] piwik is not clusterable, eh?
[17:26:13] I don't think so but I am super ignorant
[17:27:00] Analytics, Recommendation-API: productionize recommendation vectors - https://phabricator.wikimedia.org/T158973#3053346 (leila) @Fjalapeno I'm adding you to this thread, to keep an eye on it. As schana says above, there is no decision to productionize recommendation vectors atm, so this is fyi.
[17:29:16] ok, i'll just make something up on the budget :)
[17:32:56] ottomata: thanks :)
[17:57:13] Analytics, ChangeProp, EventBus, Revision-Scoring-As-A-Service, and 3 others: Create generalized "precache" endpoint for ORES - https://phabricator.wikimedia.org/T148714#3058705 (Halfak) a:Halfak>Ladsgroup
[18:17:43] Analytics, ChangeProp, EventBus, Revision-Scoring-As-A-Service, and 3 others: Create generalized "precache" endpoint for ORES - https://phabricator.wikimedia.org/T148714#3058764 (Ottomata) If there's enough need, we could set up a Kafka cluster in labs that is mirrored from Prod. And/or we can e...
[18:18:15] * elukey afk!
[18:18:16] o/
[18:48:06] Analytics, Discovery-Analysis: Get 'sparklyr' working on stats1002 - https://phabricator.wikimedia.org/T139487#3058871 (mpopov) a:mpopov>None
[18:50:59] Analytics, RESTBase, Services: REST API entry point web request statistics at the Varnish level - https://phabricator.wikimedia.org/T122245#3058879 (Nuria) Can @GWicke or @Ottomata update this ticket with their conversation from collab jam? Sounds like an agreement was reached about services consuming...
[18:58:57] Analytics-Kanban, Operations, Traffic, Wikipedia-iOS-App-Backlog, and 2 others: Periodic 500s from piwik.wikimedia.org - https://phabricator.wikimedia.org/T154558#2915651 (ema) >>! In T154558#3058745, @Stashbot wrote: > {nav icon=file, name=Mentioned in SAL (#wikimedia-operations), href=https://t...
[19:24:18] Analytics, Analytics-EventLogging: Move eventlogging backend to hadoop - https://phabricator.wikimedia.org/T159170#3058941 (Nuria)
[19:28:48] Analytics, Analytics-EventLogging: Move eventlogging backend to hadoop - https://phabricator.wikimedia.org/T159170#3058941 (Ottomata) I wonder if we should checkout [[ https://docs.citusdata.com/en/v6.1/tutorials/tut-hash-distribution.html#tut-hash | Citus DB ]] for this type of stuff. Might be worth exp...
[19:30:32] Analytics: Investigate adding user-friendly testing functionality to Reportupdater - https://phabricator.wikimedia.org/T156523#3058960 (mpopov)
[19:36:02] Analytics, RESTBase, Services: REST API entry point web request statistics at the Varnish level - https://phabricator.wikimedia.org/T122245#3058977 (Ottomata) Or at least, working with Services so they can get these metrics themselves. Gabriel and I talked about them making workers to consume from t...
[19:52:28] Gone for tonight, bye a-team
[19:54:38] nitey
[20:35:40] milimetric nuria fdans sent you some updated mocks
[20:35:45] let's try to sync up soon
[20:35:56] nuria can you schedule something? (assuming you're back?)
[20:37:23] thanks! I think nuria was going to book something for later this week
[21:28:57] Analytics-Kanban, EventBus, Wikimedia-Stream, Services (watching), User-mobrovac: EventStreams - https://phabricator.wikimedia.org/T130651#3059258 (Ottomata)
[21:28:59] Analytics-Kanban, Patch-For-Review: Create EventStreams swagger spec docs endpoint - https://phabricator.wikimedia.org/T158066#3059257 (Ottomata)
[21:30:46] Analytics, EventBus, Wikimedia-Stream: Create /schema/:schema endpoint in eventbus service to serve schemas by schema_uri - https://phabricator.wikimedia.org/T159179#3059259 (Ottomata)
[21:58:51] Analytics, Analytics-EventLogging: Move eventlogging backend to hadoop - https://phabricator.wikimedia.org/T159170#3058941 (Tbayer) As discussed previously by email, it will be great to be able to query EL data in full in Hive. But removing the existing MySQL setup completely will impose huge switching c...
[22:32:23] Analytics, RESTBase, Services: REST API entry point web request statistics at the Varnish level - https://phabricator.wikimedia.org/T122245#3059479 (GWicke) Yeah, we touched on a few options, including using kafkacat to efficiently narrow down events to those that match the string /api/rest_v1/, and...
[22:34:31] Analytics, EventBus, Services (watching): EventBus logs don't show up in logstash - https://phabricator.wikimedia.org/T153029#3059504 (Pchelolo) Ok, apparently @bd808 temporarily disabled logging in https://gerrit.wikimedia.org/r/#/c/320016/2 Can that be reverted now?
[22:46:53] Analytics-Kanban, Patch-For-Review: Create EventStreams swagger spec docs endpoint - https://phabricator.wikimedia.org/T158066#3059523 (Ottomata) Woot, it's working! https://stream.wikimedia.org/?doc Wikitech docs updated here: https://wikitech.wikimedia.org/wiki/EventStreams#API We should only pipe di...
[23:04:21] Analytics: Create robots.txt policy for datasets - https://phabricator.wikimedia.org/T159189#3059618 (Milimetric)
[23:09:03] Analytics, EventBus, Services (watching): EventBus logs don't show up in logstash - https://phabricator.wikimedia.org/T153029#3059635 (bd808) Have you done anything to change the code that was causing the failures as noted in T150106#2774165 and T150106#2777178? We did not come up with a generic solu...
[23:12:43] Analytics-Kanban, User-Elukey: Clean up datasets.wikimedia.org - https://phabricator.wikimedia.org/T125854#3059648 (Milimetric) Thank you very much. Ok, so searching through this month of data, I found that the following are basically *never* used by anything other than a crawler (which made me think we...
[23:13:30] Analytics-Kanban, Wikimedia-Extension-setup, Patch-For-Review: Deploy mediawiki-Dashiki extension to meta.wikimedia.org - https://phabricator.wikimedia.org/T156971#3059650 (Milimetric) @demon: got time this week?
[23:14:24] Analytics-Kanban, Wikimedia-Extension-setup, Patch-For-Review: Deploy mediawiki-Dashiki extension to meta.wikimedia.org - https://phabricator.wikimedia.org/T156971#3059651 (demon) Yes, I did all the prep work for this, it should be already sitting on the clusters and just need the config merged
[23:46:59] Analytics-Kanban, EventBus, Wikimedia-Stream, Services (watching), User-mobrovac: Bikeshed what events should be exposed in public EventStreams API - https://phabricator.wikimedia.org/T149736#3059768 (Halfak) I'd use this for precaching scores in our experimental deployment of #ORES. We curr...
[23:52:51] Analytics-Kanban, Wikimedia-Extension-setup, Patch-For-Review: Deploy mediawiki-Dashiki extension to meta.wikimedia.org - https://phabricator.wikimedia.org/T156971#2991507 (Milimetric) a:demon
[23:53:45] Analytics-Dashiki, Analytics-Kanban, Patch-For-Review: Add extension and category (ala Eventlogging) for DashikiConfigs - https://phabricator.wikimedia.org/T125403#1986718 (Milimetric) This has been deployed, and works as expected: https://meta.wikimedia.org/wiki/Config:Dashiki:Sample/tabs I will s...
[23:56:37] Analytics-Kanban, Wikimedia-Extension-setup, Patch-For-Review: Deploy mediawiki-Dashiki extension to meta.wikimedia.org - https://phabricator.wikimedia.org/T156971#3059797 (demon) Open>Resolved