[02:01:48] Analytics, MediaWiki-Authentication-and-authorization, Reading-Infrastructure-Team, MW-1.26-release, and 2 others: Create dashboard to track key authentication metrics before, during and after AuthManager rollout - https://phabricator.wikimedia.org/T91701#1513464 (Tgr) [04:10:17] (PS1) Ori.livneh: Lower the log level of stats output to stderr [analytics/statsv] - https://gerrit.wikimedia.org/r/229623 [04:10:47] (CR) Ori.livneh: [C: 2 V: 2] Lower the log level of stats output to stderr [analytics/statsv] - https://gerrit.wikimedia.org/r/229623 (owner: Ori.livneh) [07:40:23] Analytics-Tech-community-metrics, ECT-August-2015: Tech community KPIs for the WMF metrics meeting - https://phabricator.wikimedia.org/T107562#1513750 (Qgil) The casual presentation yesterday at the weekly WMF Engineering management meeting was interesting. They didn't dig much into the numbers because th... [07:47:18] Analytics-Tech-community-metrics, Engineering-Community: Automated generation of repositories for Korma - https://phabricator.wikimedia.org/T104845#1513769 (Qgil) [07:47:20] Analytics-Tech-community-metrics, ECT-August-2015: Jenkins-mwext-sync appears in "Who contributes code" - https://phabricator.wikimedia.org/T105983#1513768 (Qgil) [07:47:23] Analytics-Tech-community-metrics, Engineering-Community: Check whether it is true that we have lost 40% of code contributors in the past 12 months - https://phabricator.wikimedia.org/T103292#1513771 (Qgil) [08:00:52] Analytics-Tech-community-metrics: Present most basic community metrics from T94578 on one page - https://phabricator.wikimedia.org/T100978#1513777 (Qgil) I'm linking this task to T107562 because both are related: we need a good place to highlight and track our KPIs. Options: a special page in Korma or https://... 
[08:02:19] Analytics-Tech-community-metrics, ECT-August-2015: Jenkins-mwext-sync appears in "Who contributes code" - https://phabricator.wikimedia.org/T105983#1513780 (Qgil) This is probably affecting one of our KPIs to be presented at the next Metrics meeting, see T107562. [08:11:33] Analytics-Tech-community-metrics, ECT-August-2015: "Median time to review for Gerrit Changesets, per month": External vs. WMF/WMDE/etc patch authors - https://phabricator.wikimedia.org/T100189#1513789 (Qgil) At least from the point of view of the WMF Engineering management, having reliable metrics by affi... [08:12:14] Analytics-Tech-community-metrics: Remove deprecated repositories from korma.wmflabs.org code review metrics - https://phabricator.wikimedia.org/T101777#1513792 (Qgil) [08:13:33] Analytics-Tech-community-metrics: Remove deprecated repositories from korma.wmflabs.org code review metrics - https://phabricator.wikimedia.org/T101777#1347475 (Qgil) Not only clearly deprecated repositories (as in they cannot be found in Gerrit anymore). I think unmaintained repositories should be also remo... [08:21:42] Analytics-Tech-community-metrics, ECT-August-2015: Tech community KPIs for the WMF metrics meeting - https://phabricator.wikimedia.org/T107562#1513808 (Qgil) I also wonder whether we should highlight the metrics specific to the MediaWiki Core repository. For several reasons: * It is easier to compare a s... 
[11:36:33] Analytics-Kanban: Create Hadoop Job to load data into cassandra [?pts] {slug} - https://phabricator.wikimedia.org/T108174#1514229 (JAllemandou) NEW a:JAllemandou [12:30:27] PROBLEM - Difference between raw and validated EventLogging overall message rates on graphite1001 is CRITICAL 20.00% of data above the critical threshold [30.0] [12:32:36] RECOVERY - Difference between raw and validated EventLogging overall message rates on graphite1001 is OK Less than 15.00% above the threshold [20.0] [14:02:16] joal: I'm confused about this routing [14:02:25] Hi milimetric :) [14:02:29] cave ? [14:02:32] It looks like I'm doing the same thing as page_revisions but it's not working [14:02:32] k [14:26:03] Analytics-Tech-community-metrics: Remove deprecated repositories from korma.wmflabs.org code review metrics - https://phabricator.wikimedia.org/T101777#1514563 (Aklapper) >>! In T101777#1513792, @Qgil wrote: > Not only clearly deprecated repositories (as in they cannot be found in Gerrit anymore). "depreca... [14:30:23] Analytics-Kanban, Reading-Admin, Research-and-Data, Research consulting: Request for data: sites traffic by topics/ subject areas and geographies - https://phabricator.wikimedia.org/T107613#1514569 (DarTar) a:ezachte [14:30:42] Analytics-Kanban, Reading-Admin, Research-and-Data, Research consulting: Request for data: sites traffic by topics/ subject areas and geographies - https://phabricator.wikimedia.org/T107613#1499257 (DarTar) p:Triage>Low [15:12:48] Analytics-Tech-community-metrics, Epic, Google-Summer-of-Code-2015: Allow contributors to update their own details in tech metrics directly - https://phabricator.wikimedia.org/T60585#1514652 (NiharikaKohli) Hello! End of GSoC is fast approaching. 17 August is "Suggested pencils down" deadline and 21 A... 
[15:20:07] PROBLEM - Difference between raw and validated EventLogging overall message rates on graphite1001 is CRITICAL 20.00% of data above the critical threshold [30.0] [15:20:10] Analytics-Kanban, Patch-For-Review: Check and potentially timebox limn-extdist-data reports [3 pts] {tick} - https://phabricator.wikimedia.org/T107506#1514672 (kevinator) Open>Resolved [15:22:15] RECOVERY - Difference between raw and validated EventLogging overall message rates on graphite1001 is OK Less than 15.00% above the threshold [20.0] [15:33:23] milimetric: managed to log :) [15:33:29] will let you know how [15:33:31] YES! [15:33:31] :) [15:42:58] Analytics-Kanban, RESTBase-API: create second RESTBase endpoint [8 pts] {slug} - https://phabricator.wikimedia.org/T107054#1485473 (Milimetric) [15:43:20] Analytics-Kanban, RESTBase-API: create third RESTBase endpoint [8 pts] {slug} - https://phabricator.wikimedia.org/T107055#1485480 (Milimetric) [15:46:14] ottomata: I don't understand why I can't run the aggregator manually :( [15:46:21] oh? [15:46:22] wassup? [15:46:23] Well, in fact [15:46:32] I can run the thing (as stats) [15:46:39] But can't push using git [15:46:45] any idea on that? [15:47:25] I cheated a bit: I sudoed a bash [15:47:40] then ran my things --> no chance [15:47:45] ottomata: --^ [15:51:19] yeah, i think there are issues with sudo and git defaults being read from homedirs [15:51:28] hm [15:51:29] i can do it if I sudo to root and then su stats [15:51:39] but sudo -u stats doesn't seem to work i think [15:51:41] ok weird [15:52:42] hm, maybe we need something like [15:52:45] Can you run a thing for me ? [15:52:55] /a/aggregator/scripts/bin/git-pusher [15:52:57] that does [15:53:07] import aggregate_projectcounts [15:53:16] or [15:53:21] import aggregate_projectcounts.run_git [15:53:41] oh, or, joal, does sudo -u aggregate_projectcounts [15:53:43] fail on push for you? 
[15:53:57] sudo -u stats fails on commit for me [15:54:17] telling me I have no user [15:54:21] ottomata: --^ [15:58:54] joal: sudo -u stats git ... [15:59:02] or sudo -u stats ../aggregate_projectcounts [15:59:09] (or whatever path) [16:12:16] PROBLEM - Difference between raw and validated EventLogging overall message rates on graphite1001 is CRITICAL 20.00% of data above the critical threshold [30.0] [16:12:56] James_F, so if there's an SUL bug, what's the appropriate Phabricator project, do you know? [16:13:03] ottomata: sorry, got to disappear for a minute [16:13:13] So when running git: warning: unable to access '/home/joal/.config/git/attributes': Permission denied [16:13:14] ah sorry joal, i keep getting distracted [16:13:16] Ironholds: "An SUL bug"? As in a user not global? Or the concept is a mistake? Or what? [16:13:29] yes, makes sense. but, you can run the aggregator script itself, right? [16:13:37] and git commands that it runs work? [16:13:38] or not? [16:13:40] I can aggregate, but not commit [16:13:50] as in, the git commands that the script runs fail? [16:13:55] git inside agg doesn't work [16:14:08] James_F, "SUL stopped working in a non-deterministic fashion" [16:14:34] ottomata: this is what I'd like to run as stats: https://gist.github.com/jobar/44f4e795e484464d8581 [16:15:38] I guess https://phabricator.wikimedia.org/project/sprint/profile/167/ ? [16:18:17] RECOVERY - Difference between raw and validated EventLogging overall message rates on graphite1001 is OK Less than 15.00% above the threshold [20.0] [16:18:56] yes right [16:18:57] hm. [16:19:12] joal: i wonder if we can make it possible for you to sudo su stats [16:19:25] ottomata: I can sudo stats [16:19:26] ! [16:19:38] But even with that, git no good [16:20:08] right [16:20:09] but not su [16:20:14] you need to become stats user [16:20:18] not just execute a command as [16:20:22] correct [16:20:33] I cheated like sudo -u stats bash [16:20:38] But nope [16:21:27] doesn't work, right? 
[16:21:33] oh! [16:21:34] it does! [16:21:38] hm [16:21:43] joal, try doing [16:21:45] sudo -u stats bash [16:21:54] I did that [16:21:57] export HOME=/var/lib/stats [16:22:00] then run [16:22:04] your command [16:22:44] Will tell you in 5 minutes [16:24:14] ottomata: WORKED ! [16:24:19] Awesome :) [16:24:40] Ironholds: Yeah, not good. Thanks for filing. [16:25:08] ottomata: Thanks a lot for that ! [16:25:31] cool! [16:26:13] James_F, np! [16:44:00] ggellerman______: I'm commuting and it looks like I will be super late to retro. Feel free to start without me. I had to leave now else would not be able to make it to metrics meeting [16:45:12] madhuvishy: k [17:42:05] joal: https://github.com/cervisiarius/wikimedia/blob/master/navigation_trees/src/main/java/org/wikimedia/west1/traces/streaming/GroupAndFilterMapper.java [17:44:46] joal: https://github.com/cervisiarius/wikimedia/blob/master/navigation_trees/src/main/java/org/wikimedia/west1/traces/GroupAndFilterMapper.java [17:48:01] Analytics-Backlog: Write script to track cycle time of tasked tickets - https://phabricator.wikimedia.org/T108209#1515290 (ggellerman) NEW [17:49:11] Analytics-Backlog: Write script to track cycle time of tasked tickets - https://phabricator.wikimedia.org/T108209#1515315 (ggellerman) Alternatively, the end of the cycle could be when the task is marked as Resolved from Done column in the Analytics-Kanban board [17:54:19] Analytics-Backlog: Write script to calculate total point value of cards marked as resolved in a regular 1wk or 2 wk window - https://phabricator.wikimedia.org/T108211#1515332 (ggellerman) NEW [18:09:15] milimetric: have you managed to join the meeting ? [18:09:48] yes, joal, https://www.youtube.com/watch?v=aMQ5XVF0zm0 [18:09:56] Thx milimetric [18:10:09] where have you found that link ? [18:11:26] Staff channel [18:12:01] Oops, i mean office [18:12:24] K ! [18:12:27] Thx :) [18:51:37] back in a bit... 
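[Editor's note] The workaround that finally worked above — becoming the stats user with `sudo -u stats bash` and then `export HOME=/var/lib/stats` so git stops reading the invoking user's unreadable config — can be mimicked when scripting the same kind of privilege switch. A sketch (the `/var/lib/stats` path comes from the conversation; the child command is a portable stand-in for the git call):

```python
import os
import subprocess
import sys

# git under `sudo -u stats` failed because $HOME still pointed at the
# invoking user's home dir, whose config files git couldn't read.
# Overriding HOME for the child process sidesteps that.
env = dict(os.environ, HOME='/var/lib/stats')
out = subprocess.run(
    [sys.executable, '-c', 'import os; print(os.environ["HOME"])'],
    env=env, capture_output=True, text=True,
)
print(out.stdout.strip())  # /var/lib/stats
```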
[19:17:17] Analytics-Kanban, Team-Practices-This-Week: Get regular traffic reports on TPG pages - https://phabricator.wikimedia.org/T99815#1515554 (JAufrecht) Kevin, is this done (as in automated)/should I check in in a month? [19:43:59] anyone remember brandon's IRC handle? [19:45:24] Ironholds: 'jorm'? [19:45:30] Black ;p [19:45:38] Ironholds: Oh, 'bblack'. [19:46:05] aha, thanks! [19:47:15] PROBLEM - Difference between raw and validated EventLogging overall message rates on graphite1001 is CRITICAL 20.00% of data above the critical threshold [30.0] [19:51:16] PROBLEM - Difference between raw and validated EventLogging overall message rates on graphite1001 is CRITICAL 20.00% of data above the critical threshold [30.0] [19:53:16] RECOVERY - Difference between raw and validated EventLogging overall message rates on graphite1001 is OK Less than 15.00% above the threshold [20.0] [20:17:05] ottomata: around? [20:19:54] yup hiya [20:19:56] madhuvishy: [20:20:15] hey, so i made a tiny patch for the statsd stuff yesterday [20:20:22] https://gerrit.wikimedia.org/r/#/c/229631/1/server/eventlogging/handlers.py [20:21:03] ottomata: but i have 2 problems. 1. I'm not sure where to count actual inserted. It seems like the write place to do it is inside store_sql_events [20:21:12] rigt* [20:21:17] gah i cant type [20:23:06] 2. for some reason, if I have a url like - mysql://root:root@127.0.0.1/log?charset=utf8&statsd_host=statsd.eqiad.wmnet [20:23:22] looking... [20:24:02] argparse seems to ignore the stuff from after mysql://root:root@127.0.0.1/log?charset=utf8 [20:24:49] if i pass the host as first uri param it picks it up. the second one on, it strips - and I'm not sure why [20:25:56] madhuvishy: are you enclosing in single quotes? [20:26:02] the whole uri? 
[20:26:08] ottomata: no [20:26:08] & is a special shell char [20:26:15] will background your command [20:26:16] ottomata: aah no wonder [20:26:41] alright so that's not a problem then [20:26:50] for your first question, i'm looking at jrm.py [20:27:00] um, do you know what insert.execute returns? [20:27:39] hm, i am assuming _insert_multi is what we mostly use, yes? [20:28:01] and, from that, based on the log message that is there [20:28:05] insert(table, events, replace) [20:28:06] logger.info('Data inserted %d', len(events)) [20:28:16] that it is either all or none, right? [20:28:20] if you had a batch of 100 events [20:28:22] and 1 failed [20:28:26] does the whole batch fail? [20:28:59] ottomata: ah i don't actually know [20:29:45] but yeah I considered putting it before after the log line. but I'm not sure if I should pass the statsd_host url to this [20:30:49] PROBLEM - Difference between raw and validated EventLogging overall message rates on graphite1001 is CRITICAL 20.00% of data above the critical threshold [30.0] [20:31:28] ottomata: ^ [20:32:33] yeah i see the problem [20:32:35] cause that's in a thread [20:32:39] HMM [20:33:16] madhuvishy: i think it has to be there, because the handler inserts asynchronously [20:33:53] you could maybe make store_sql_events take an on_success callback of some kind [20:33:57] that you pass in [20:34:09] something like [20:34:21] ottomata: hmmm ya that could work [20:34:31] hm, not sure what a good form of that would be [20:34:37] hard to know how general to try to make it [20:34:43] yup [20:35:09] i'd also need it to return how many events it successfully inserted [20:35:20] right, and that i don't know how you know [20:35:25] i assume insert somehow tells you [20:35:32] but that is sqlalchemy details maybe? 
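[Editor's note] On the truncated-URI problem above: an unquoted `&` tells the shell to background the command, so everything after `charset=utf8` never reaches argparse. Single-quoting the whole URI fixes it; once the full string arrives, the extra options can be read out of the query string. A sketch using the URI quoted in the conversation (how EventLogging actually parses its connect strings may differ):

```python
from urllib.parse import urlparse, parse_qs

# Quoted on the command line ('...'), the full URI survives the shell.
uri = "mysql://root:root@127.0.0.1/log?charset=utf8&statsd_host=statsd.eqiad.wmnet"
parsed = urlparse(uri)
# parse_qs returns lists of values; flatten to single values here.
params = {k: v[0] for k, v in parse_qs(parsed.query).items()}
print(params)  # both charset and statsd_host are present
```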
[20:35:53] ottomata: ah i was thinking just len(events) would do [20:36:11] PROBLEM - Difference between raw and validated EventLogging overall message rates on graphite1001 is CRITICAL 20.00% of data above the critical threshold [30.0] [20:36:56] i don't think so, because events is the batch [20:36:58] that is attempted [20:37:02] madhuvishy: yeah, if it attempts to insert X, it will either insert X or fail. There's no middle state, it's in a transaction [20:37:03] i mean, it does look like that [20:37:08] ok [20:37:15] so if that is the case, then you can just do len(events) madhuvishy [20:37:19] if you can tell if insert succeeded [20:37:26] these functions don't look like they really return anything [20:37:31] yep, even though it's in a batch, SQL Alchemy wraps that whole thing in a transaction [20:37:55] you can get the number of inserted records, but I don't remember how that code looks, maybe not the way we're doing it now [20:38:05] it's moot though [20:38:19] RECOVERY - Difference between raw and validated EventLogging overall message rates on graphite1001 is OK Less than 15.00% above the threshold [20.0] [20:39:23] ottomata: milimetric so the silliest thing to do is pass statsd_host as a param to store_sql_events - and just do stats.incr('inserted', len(events)) [20:39:48] it doesn't feel nice though [20:39:51] madhuvishy: i think you should make it a callback, so as not to bring statsd dependency into the mix there [20:39:57] store_sql_events shouldn't care [20:39:58] yeah [20:40:03] agree [20:40:12] not sure what to call it [20:40:18] on_insert_callback=None [20:40:57] can you do partials in python ?? [20:41:03] is that what is happening all over jrm.py? 
[20:41:41] ah with functools [20:41:50] hehe [20:41:50] https://docs.python.org/2/library/functools.html#functools.partial [20:41:51] ottomata: haa [20:41:53] could do [20:44:37] Analytics, Analytics-EventLogging: ServerSideAccountCreation logging captures mobile app accounts creations as "not self made" - https://phabricator.wikimedia.org/T108243#1515995 (Neil_P._Quinn_WMF) NEW [20:44:37] increment_inserted_event = functools.partial(stats.incr, ('insertSuccessful', )) [20:45:00] Analytics, Analytics-EventLogging: ServerSideAccountCreation logging captures mobile app accounts creations as "not self made" - https://phabricator.wikimedia.org/T108243#1516002 (Neil_P._Quinn_WMF) [20:45:13] PeriodicThread(target=store_sql_events, ..., args=(..., inserted_callback=increment_inserted_event) [20:45:15] i dunno, something like that [20:45:16] :) [20:45:48] or maybe that's too much, not sure what the callback signature should be [20:46:32] ottomata: there is also another call to store_sql_events in the finally block [20:48:58] oh hm. [20:49:13] yeah i guess you just have to pass the callback there too, if that's how you do it [20:49:30] wait a minute [20:49:39] madhuvishy: what's that stuff about @sqlalchemy.event.listens_for [20:49:40] ? [20:50:01] could you potentially register some event handler with sqlalchemy to do statsd stuff in??? [20:50:07] milimetric: ^ [20:50:07] ? [20:50:43] but now i need to tell the callback what the statsd url is [20:51:04] um... [20:51:15] lemme search a bit [20:51:50] yes, that's why i was talking about partials! [20:52:10] you could bind the host arg to a new partial of stats.incr :) [20:52:21] oh sorry [20:52:36] uhh, no, you should be able to just use stats.incr? oh, not if that is in a thread....OR can you? [20:52:37] i dunno. 
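[Editor's note] The `functools.partial` idea being floated here: pre-bind the statsd client and metric name into a one-argument callback, so `store_sql_events` never needs to know about statsd. A runnable sketch with a stand-in for `statsd.StatsClient` (the real client and the `insertSuccessful` metric name are from the conversation; the fake class is only for illustration):

```python
import functools

class FakeStatsClient:
    """Stand-in for statsd.StatsClient that just records increments."""
    def __init__(self):
        self.counts = {}
    def incr(self, name, count=1):
        self.counts[name] = self.counts.get(name, 0) + count

stats = FakeStatsClient()
# Bind the metric name; callers only supply the count.
increment_inserted = functools.partial(stats.incr, 'insertSuccessful')

increment_inserted(100)  # e.g. after one successful batch insert
increment_inserted(42)   # and another
print(stats.counts['insertSuccessful'])  # 142
```

Note that in the snippet pasted in-channel the name is wrapped in a tuple, `(​'insertSuccessful', )`, which would pass the tuple itself as the stat name; binding the bare string is what is intended.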
[20:53:46] ottomata: ha ha [20:54:02] i feel like i've forgotten all python [20:54:54] ottomata: the host is for creating a StatsClient instance [20:55:36] yes, but if you have the object, it has a function, and you could make a partial with that [20:55:41] yes [20:55:52] so, you could catch the insert event, but I'm not sure you'd want to [20:55:53] http://docs.sqlalchemy.org/en/rel_1_0/core/events.html#sqlalchemy.events.ConnectionEvents.after_cursor_execute [20:56:03] or higher level: http://docs.sqlalchemy.org/en/rel_1_0/core/events.html#sqlalchemy.events.ConnectionEvents.after_execute [20:56:09] but that would be registered on the engine [20:56:23] so if the code ever changes and we do more things with that engine, it'd be complicated [20:56:30] milimetric: okay [20:56:32] result! [20:56:33] kind of the same reason people don't love triggers at the db level [20:56:34] but result! [20:56:39] yeah, you get the result [20:57:15] hm [20:57:42] doesn't MySQL have a query or maybe api method to return the last statement's number of inserts or updates [21:00:19] it's not an api, it returns it as a response to the batch insert [21:00:24] it tells you how many rows were inserted [21:00:52] why is that a problem we're solving though? If there's no exception, it'll insert the events we passed it [21:01:10] the result proxy we get in the listener doesn't have the insert count btw, http://docs.sqlalchemy.org/en/rel_1_0/core/connections.html#sqlalchemy.engine.ResultProxy [21:02:02] milimetric: http://dev.mysql.com/doc/refman/5.0/en/information-functions.html#function_row-count [21:02:43] yeah, sure, but you'd have to execute that *right* after, which can't be guaranteed with concurrent threads inserting [21:03:19] hm, they don't share the same db instance though, do they? 
[21:03:21] oh i see the problem, you guys don't want to make store_sql_events smarter [21:03:21] so probably you could [21:03:33] well, no we will have to make it sorta smarter [21:03:40] whatever is happening has to happen in that thread [21:04:32] ottomata: i don't understand what exactly [21:04:37] eh? [21:04:53] milimetric: not following the exception thing [21:04:55] where would you know? [21:05:10] store_sql_events just calls insert [21:05:22] right, and if it throws an exception, it means it inserted 0 [21:05:28] if it doesn't, it means it inserted all of them [21:05:44] howso? [21:05:50] what catches that exception? [21:05:57] the thread doesn't stop, i don't see anything catching it [21:06:50] i don't see an exception thrown or caught in store_sql_events, it looks like it logs logger.info('Data inserted %d', len(events)) every time, even if the insert() call failed [21:07:40] this line: https://github.com/wikimedia/mediawiki-extensions-EventLogging/blob/master/server/eventlogging/jrm.py#L233 [21:07:47] would never be reached, the insert above would throw [21:07:56] so at the point it logs that, it's safe to emit to statsd [21:07:57] wouldn't the thread stop? [21:08:06] yeah, it should die, yes [21:08:08] ah [21:08:18] really? the thread dies if it encounters an error? [21:08:48] yeah, nothing would catch the error [21:09:20] so, wouldn't that mean that all events are inserted unless there is an error that causes the consumer to die? 
[21:09:20] well, so yeah, basically if the call to store_sql_events works, then the events are inserted [21:09:48] right, so we could emit to statsd right after this line: [21:09:48] https://github.com/wikimedia/mediawiki-extensions-EventLogging/blob/master/server/eventlogging/handlers.py#L279 [21:09:57] and be sure that we're telling the truth [21:10:06] milimetric: oh [21:10:19] milimetric: naw, you'd still do it in store_sql_events, because the thread is executing that [21:10:20] it's worth testing, but I think it's as easy as that [21:10:30] that finally only happens when the consumer dies [21:10:33] ummm, i'm not sure [21:10:40] oh duh, right [21:10:43] that's just the cleanup call [21:10:44] that is just the finally [21:10:45] but there too [21:10:48] yup [21:10:52] ja ok, then yeah [21:10:55] so inside store_sql_events yea [21:10:56] just a callback to store_sql_events [21:11:00] so if store_sql_events takes a callback [21:11:02] that it will call after it calls insert [21:11:05] after the log line [21:11:06] right [21:11:09] it will call it [21:11:11] cool [21:11:15] well, put the log line in the callback too [21:11:22] meh? whatever :) [21:11:24] hehe [21:11:31] madhuvishy: add some comments about this in there when you do this [21:11:45] how, the thread will die and the whole process will finish if anything goes wrong [21:11:57] so counting len(events) is correct [21:13:03] ottomata: okay will do that [21:18:49] Analytics-Backlog: Write script to track cycle time of tasked tickets - https://phabricator.wikimedia.org/T108209#1516093 (ggellerman) Thanks to @ksmith, here is info about @Robla's script: code: https://github.com/robla/phab-wbstatus Running instance, pointed at some arbitrary week that contains data: http... 
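[Editor's note] The design the discussion converges on: `store_sql_events` takes a callback, invoked with `len(events)` only after `insert()` returns, relying on the transaction's all-or-nothing behaviour. A simplified runnable sketch (the real function in eventlogging/jrm.py has a different signature; `insert` here is a stand-in):

```python
def store_sql_events(insert, events, on_insert=None):
    """Sketch of the callback design discussed above.

    insert() is assumed to be transactional: it commits the whole batch
    or raises, so len(events) is an accurate count when it returns.
    """
    insert(events)              # raises on failure; the worker thread would die
    if on_insert is not None:
        on_insert(len(events))  # only reached if the insert succeeded

# Demo with a fake transactional insert and a counting callback.
inserted_total = 0

def fake_insert(events):
    pass  # pretend the whole batch was committed

def count_inserted(n):
    global inserted_total
    inserted_total += n

store_sql_events(fake_insert, [{'a': 1}, {'b': 2}], on_insert=count_inserted)
print(inserted_total)  # 2
```

In the real code the callback would be a partial over `StatsClient.incr`, and it must also be passed to the second `store_sql_events` call in the consumer's `finally` block.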
[22:30:23] Analytics-Backlog, Research-and-Data, Fundraising research: What's our projected ability to fundraise in the coming years - https://phabricator.wikimedia.org/T107606#1516444 (DarTar) a:ellery [22:44:00] randomly...does anyone have a bit of mysql code that will convert a mediawiki timestamp into a timestamp mysql understands and can do operations (like subtract two timestamps) on? [22:44:13] the mw format is '20150806142212' or 'YYYYMMDDHHIISS' [22:49:21] heh, so according to this data, the longest someone spent on a page is 3.5 days [23:54:03] Is there a WikiMetrics web API? [23:56:25] Let's imagine I have a web app that wants to analyse a cohort of people. Is there a way I can do that? [23:57:03] Can I mark https://phabricator.wikimedia.org/T99014 as resolved? It was…
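[Editor's note] On the MySQL timestamp question: `STR_TO_DATE(ts, '%Y%m%d%H%i%s')` converts the MediaWiki `YYYYMMDDHHIISS` string into a DATETIME, after which `TIMESTAMPDIFF(SECOND, a, b)` can subtract two of them. The same format logic, checked in Python:

```python
from datetime import datetime

# MediaWiki timestamps are 'YYYYMMDDHHMMSS' strings, e.g. '20150806142212'.
MW_FORMAT = '%Y%m%d%H%M%S'

def parse_mw_ts(ts):
    """Parse a MediaWiki timestamp string into a datetime."""
    return datetime.strptime(ts, MW_FORMAT)

# Subtracting two parsed timestamps gives a timedelta:
delta = parse_mw_ts('20150806142212') - parse_mw_ts('20150806140000')
print(int(delta.total_seconds()))  # 1332, i.e. 22 minutes 12 seconds
```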