[00:31:44] Analytics-EventLogging, MediaWiki-ContentHandler: EventLogging schemas on Meta no longer display properly - https://phabricator.wikimedia.org/T86706#974939 (Legoktm)
[00:33:37] Analytics-EventLogging, MediaWiki-ContentHandler: [Regression 1.25wmf14] EventLogging schemas on Meta and other forms of JSON no longer display properly - https://phabricator.wikimedia.org/T86706#974941 (Jdforrester-WMF)
[00:33:52] Analytics-EventLogging, MediaWiki-ContentHandler: [Regression 1.25wmf14] EventLogging schemas on Meta and other forms of JSON no longer display properly - https://phabricator.wikimedia.org/T86706#974942 (Krinkle)
[01:14:17] got it milimetric, just note the questions on our doc and we shall talk to sean about it, sooner rather than later
[01:34:35] how do i see wikigrok in action?
[07:52:34] Hello, I am looking at Kannada stats; every time, there is a difference of around 2000 articles between stats.wikimedia.org and the dumps. Why is it so?
[07:53:12] Also, where can I get the latest stats? stats.wikimedia.org always has stats that are a month old
[07:55:39] are there any API calls for total editors, new editors, active editors, very active editors, article count, new articles per day, edits per month and page views for Indic languages like Kannada, Telugu, Odia, Marathi, etc
[08:12:24] Nemo_bis: do you know?
[08:49:49] tuxnani: why are month-old stats not good enough?
[08:50:29] Nemo_bis: This is January, and when I access Telugu data, http://stats.wikimedia.org/EN/TablesWikipediaTE.htm, I cannot see December data
[08:50:38] tuxnani: yes, so?
[08:52:11] We have links to more resources in https://meta.wikimedia.org/wiki/Statistics . None of them offers *all* the metrics (for which see https://meta.wikimedia.org/wiki/Research:Metrics_standardization and https://www.mediawiki.org/wiki/Analytics/Metric_definitions )
[08:52:39] Some select metrics, re-defined for live update, are available at https://metrics.wmflabs.org/static/public/dash/
[08:56:33] And on Wikimedia projects there is this API https://www.mediawiki.org/wiki/Extension:UserDailyContribs
[08:58:57] Thank you Nemo_bis, that was of some help.
[09:02:52] (thanks Nemo_bis, you're awesome.)
[09:03:08] (PS2) Gergő Tisza: Generate pageview stats [analytics/multimedia] - https://gerrit.wikimedia.org/r/179872 (https://phabricator.wikimedia.org/T78189)
[09:04:15] (CR) Gergő Tisza: "I came up with this horrible thing. Still takes 40 min to run the global query, but whatever." [analytics/multimedia] - https://gerrit.wikimedia.org/r/179872 (https://phabricator.wikimedia.org/T78189) (owner: Gergő Tisza)
[09:43:06] Analytics-EventLogging: Eventlogging file logging code split weirdly between role and base class - https://phabricator.wikimedia.org/T86745#975424 (yuvipanda) NEW
[09:46:14] Analytics-EventLogging: Eventlogging file logging code split weirdly between role and base class - https://phabricator.wikimedia.org/T86745#975435 (yuvipanda) p:Triage>Low
[10:33:27] Analytics-EventLogging: HTML of Schema pages appears garbled - https://phabricator.wikimedia.org/T86748#975510 (ori) NEW a:Nuria
[14:11:50] (CR) Ananthrk: [WIP] UDF to get country code from IP address UDF to determine client IP address given values from source IP address and XFF headers Added I (7 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/183551 (owner: Ananthrk)
[14:15:13] (PS3) Ananthrk: [WIP] UDF to get country code from IP address UDF to determine client IP address given values from source IP address and XFF headers Added IntelliJ related files to .gitignore Split existing Geo UDF into two - GeoCodedCountryUDF and GeoCodedDataUDF Both U [analytics/refinery/source] - https://gerrit.wikimedia.org/r/183551
[14:18:13] Analytics: Puppetized SSH setup of stats user from stat1003 - https://phabricator.wikimedia.org/T86763#975834 (QChris) NEW a:Ottomata
[14:55:17] milimetric: Today, I was in the batcave like 3 minutes ago. Everything seems to have worked. Then it all of a sudden went "The party is over"
[14:55:27] And now I cannot connect to the hangout any longer.
[14:55:35] :-(
[14:55:37] qchris: let's do another experiment :)
[14:55:43] let's wait until :59 after
[14:55:48] try it then
[14:55:54] Oh. You think I am too early?
[14:55:57] Ok. Let's try.
[14:56:02] and if that doesn't work, then i'll remove / add
[14:56:06] but yeah, it might be a timing issue
[14:57:11] I just tried to join and it said "the call is already over"
[14:57:35] hm
[14:57:38] it's clearly confused
[14:57:49] ooh :) let me start the meeting 5 minutes earlier, see if that fixes it too
[14:58:18] * qchris is relieved :-) He is no longer the only one having issues connecting.
[14:58:25] :)
[14:58:32] ok, I moved it to 9:55, try now?
[14:58:37] the both of youse
[14:58:40] * qchris tries
[14:58:56] * qchris is in :-)
[14:59:08] Thanks milimetric!
[14:59:23] is the url same?
[14:59:48] ananthrk: Yes.
[15:00:06] got it..thanks
[15:00:49] Wayne's world^W^W ottomata. meeting time. excellent. ottomata. meeting time. excellent.
[15:06:07] haha
[15:13:58] ah nuria, i think i said this in IRC but you may not have been around. I think you missed my inline comments on patchset 16
[15:15:26] Analytics, Multimedia, MediaWiki-extensions-MultimediaViewer: Find a robust way of filtering local cache hits out of performance figures - https://phabricator.wikimedia.org/T86672#975909 (Gilles) I remember seeing Firefox on my own machine report values > 0 sometimes, so I'm a little suspicious of assumptions...
[15:15:38] ottomata: I see, this one right: https://gerrit.wikimedia.org/r/#/c/181017/16/oozie/mobile_apps/daily_uniques/coordinator.properties
[15:15:49] yup and workflow.xml
[15:16:01] https://gerrit.wikimedia.org/r/#/c/181017/16/oozie/mobile_apps/daily_uniques/workflow.xml
[15:16:39] ottomata: yes, thank you, will have those two done this morning
[15:17:13] cool, I think you can also remove the comment about hardcoding filename on line 109 too
[15:19:39] sorry I am not hearing this clearly
[15:19:46] what is it that needs to be picked up?
[16:14:35] (CR) OliverKeyes: [WIP] UDF to get country code from IP address UDF to determine client IP address given values from source IP address and XFF headers Added I (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/183551 (owner: Ananthrk)
[16:22:08] (CR) Milimetric: [C: 2 V: 2] Manage warehouse migrations with alembic [analytics/data-warehouse] - https://gerrit.wikimedia.org/r/177739 (https://phabricator.wikimedia.org/T76829) (owner: Nuria)
[16:25:39] (Abandoned) Milimetric: [WIP] Add schema for edit fact table [analytics/data-warehouse] - https://gerrit.wikimedia.org/r/167839 (owner: QChris)
[16:26:32] (CR) OliverKeyes: [WIP] UDF to get country code from IP address UDF to determine client IP address given values from source IP address and XFF headers Added I (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/183551 (owner: Ananthrk)
[16:27:10] Analytics, Multimedia, MediaWiki-extensions-MultimediaViewer: Investigate if pre-rendering images is having an impact on performance - https://phabricator.wikimedia.org/T76035#976147 (Gilles) Open>Resolved See the multimedia mailing list for continued discussion on this topic. The answer to the question...
[16:28:57] (PS7) Milimetric: Manage warehouse migrations with alembic [analytics/data-warehouse] - https://gerrit.wikimedia.org/r/177739 (https://phabricator.wikimedia.org/T76829) (owner: Nuria)
[16:29:10] (CR) Milimetric: [C: 2 V: 2] Manage warehouse migrations with alembic [analytics/data-warehouse] - https://gerrit.wikimedia.org/r/177739 (https://phabricator.wikimedia.org/T76829) (owner: Nuria)
[16:31:00] (PS1) QChris: DO NOT SUBMIT. Test commit for http authentification [analytics/refinery] - https://gerrit.wikimedia.org/r/184905
[16:31:47] Analytics, Multimedia, MediaWiki-extensions-MultimediaViewer: Collect more data in MediaViewer network performance logging - https://phabricator.wikimedia.org/T86609#976189 (Gilles) a:Gilles
[16:31:56] Analytics, Multimedia, MediaWiki-extensions-MultimediaViewer: Collect more data in MediaViewer network performance logging - https://phabricator.wikimedia.org/T86609#972548 (Gilles) p:Triage>Normal
[16:32:07] (Abandoned) QChris: DO NOT SUBMIT. Test commit for http authentification [analytics/refinery] - https://gerrit.wikimedia.org/r/184905 (owner: QChris)
[16:32:44] Analytics, Multimedia, MediaWiki-extensions-MultimediaViewer: Fix filter differentiating between varnish hits and misses in performance queries - https://phabricator.wikimedia.org/T86675#976192 (Gilles) Open>Resolved
[16:41:40] (PS18) Nuria: Mobile apps oozie jobs [analytics/refinery] - https://gerrit.wikimedia.org/r/181017
[16:42:11] (CR) Gilles: [C: -1] "Why generate this only for 2014?" [analytics/multimedia] - https://gerrit.wikimedia.org/r/179872 (https://phabricator.wikimedia.org/T78189) (owner: Gergő Tisza)
[16:42:30] (CR) Nuria: Mobile apps oozie jobs (4 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/181017 (owner: Nuria)
[16:42:35] (PS19) Nuria: Mobile apps oozie jobs [analytics/refinery] - https://gerrit.wikimedia.org/r/181017
[17:06:33] (CR) Ottomata: [WIP] UDF to get country code from IP address UDF to determine client IP address given values from source IP address and XFF headers Added I (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/183551 (owner: Ananthrk)
[17:07:18] Analytics: Puppetized SSH setup of stats user from stat1003 - https://phabricator.wikimedia.org/T86763#976338 (Ottomata) Open>Resolved Should be fixed here! https://gerrit.wikimedia.org/r/#/c/184909/
[17:10:23] ottomata, have you set up with Altiscale yet?
[17:10:44] I'm trying to figure out the best way to transfer the enwiki dataset.
[17:13:07] It looks like they'll want us to copy to Amazon S3 first :(
[17:13:52] halfak: no
[17:14:11] you don't have shell access anywhere?
[17:14:40] I do.
[17:14:50] I have shell access to a workbench that has about 10GB free.
[17:15:09] *19GB
[17:15:30] But it looks like I can write directly to HDFS from the mount on the filesystem
[17:15:49] So, presumably, I could scp the data if I could get my ssh key on stat2
[17:15:54] hm, if that's true, then yeah.
[17:16:20] hm
[17:36:30] milimetric, I just updated http://etherpad.wikimedia.org/p/diff_test with the new run
[17:36:50] TL;DR: The run without turning off speculation worked just fine. 31 diffs/second.
[17:37:09] So, I don't think we know what's going on yet. :/
[17:38:46] But if this diff generation is representative of the rest of the wiki, we should expect that processing 500M revisions should take 3.7 days.
[17:38:58] (over 50 reducers)
[17:39:07] halfak: ok, bad, but also - maybe not horrible because we know speculation still has the potential to cause problems.
[17:39:16] indeed.
[17:39:22] ok, so you want to start it without speculation?
[17:39:37] So, we have run both with and without speculation now.
[17:39:43] i mean start enwiki
[17:40:13] Yes. I'll give it a try. With speculate=false.
[17:40:35] or - you wanna do simplewiki to see if we run into the error again in a larger test?
[17:40:50] simplewiki without speculate=false i mean
[17:41:01] *argh **with** speculate=false
[17:42:01] Sure. Simplewiki should take less than an hour.
[17:42:14] (assuming that we can distribute work perfectly)
[17:42:40] or maybe something else a little bigger?
[17:42:58] I've been meaning to do ptwiki, but I'll need to convert to json first.
[17:43:07] I could start both things in parallel.
[17:43:13] ok, that sounds good
[17:43:26] we can comb the logs for simplewiki for that directory not found error and retry
[17:43:27] * halfak gets to work on that.
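Note on the 3.7-day estimate quoted at 17:38: the figure is reproduced if the observed 31 diffs/second is read as a per-reducer rate and the 50 reducers share the work evenly. Both readings are assumptions made here for illustration (the log only gives the three numbers), and the function name below is hypothetical. A minimal sketch of the arithmetic:

```python
def estimated_days(revisions: int, diffs_per_second: float, reducers: int) -> float:
    """Wall-clock days needed to diff `revisions` revisions.

    Assumes every reducer sustains `diffs_per_second` and the work
    divides evenly across reducers (no stragglers, no retries).
    """
    seconds = revisions / (diffs_per_second * reducers)
    return seconds / 86_400  # 86,400 seconds per day


# Numbers taken from the conversation: 500M revisions, 31 diffs/s, 50 reducers.
days = estimated_days(500_000_000, 31, 50)
print(f"{days:.1f} days")  # prints "3.7 days"
```

Under the same assumptions a single reducer would need roughly 187 days, which is why the even-distribution caveat raised at 17:42 matters.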
[17:43:43] if that doesn't show up, do ptwiki, and if we still run clean, try enwiki
[17:43:47] and pray really hard
[17:43:51] :)
[17:44:26] we can pray in person too, I'll buy some candles
[17:44:26] i think that what i'd like to do is have EventLogging validation errors logged as EventLogging events
[17:44:49] ori: that'd be ok with me except we'd have to have a way to ground them if they're spiking
[17:44:54] (which has happened)
[17:45:10] ori, what if an error fails validation 0_o
[17:45:28] ValidationErrorValidationError
[17:45:29] we could make a simple schema like "event url"
[17:45:41] halfak:
[17:45:43] 115-00:42 ori: if you can't stand to wait, do -- i'm going to make the file-based logging way less verbose
[17:45:43] 116:00:42 ori: by logging validation errors back into the event stream as ValidationError events or something
[17:45:43] 117:00:42 ori: (don't ask me what happens if a ValidationError event fails to validate :P)
[17:45:45] 118:00:42 YuviPanda: ValidationErrorValidationError?
[17:45:47] 119-00:42 YuviPanda: :D
[17:45:49] 120-00:42 ori: turtles all the way down
[17:45:54] :)
[17:46:00] so just one field and it can be empty, it would never fail validation
[17:46:03] I got to say that twice today. not bad
[17:46:10] heh. I'm not as clever as I think
[17:46:15] * YuviPanda follows ori around saying ‘ValidationErrorValidationError'
[17:46:35] lol
[17:46:36] Or we're all very clever :)
[17:47:17] ok but really - just make the schema super simple so it doesn't fail... {attempted_schema, payload} or something, with both required: false
[17:47:20] milimetric: seems like a separate problem. if there's no more than one validation error emitted per event, then the overall volume of events is not greater than what it would be if all events were perfectly valid
[17:47:37] (re: spikes)
[17:47:49] ori: it's a separate problem but what happened in the past is people submitted a ton of invalid events and didn't even know they were doing it
[17:48:14] if those had turned into valid events, we'd have had a problem
[17:48:50] maybe it'd be good to have a buffering kind of thing that writes similar events together with a count
[17:49:03] right, so someone could up the logging rate for some data collection script because they don't realize that the volume of data they're generating is actually much larger than what they're seeing in the database
[17:49:03] just md5 the payload or something
[17:49:15] right
[17:49:31] increased visibility for validation errors would have to come with increased responsibility to fix them
[17:49:33] that's another possibility, though also a bit different
[17:50:10] what i'm talking about is completely being unaware of the logging and how it works, and just by chance generating invalid events
[17:50:18] right now we sort of let people off the hook because it's not always easy to detect mistakes
[17:50:49] ori: what about piping the invalid event stream via a kafka producer
[17:51:09] it'd be easier to analyze in the cluster if it grows big anyway
[17:51:23] what are the bottlenecks when the volume of data goes up?
[17:51:35] unknown right now, it used to be bad before your fix in the fall
[17:51:37] keep in mind that eventlogging is still on a throwaway machine mark gave me in 2012 to shut me up :P
[17:51:44] but now we don't know at what point it would choke
[17:51:51] so we have credit with ops for getting more hardware
[17:52:02] vanadium is dinky even by the standards of single-server machines currently on the cluster
[17:52:09] it has been repurposed twice, i think it's out of warranty
[17:52:18] no, totally, we have it on our list to do a load test and see where it breaks
[17:52:21] :)
[17:53:15] * YuviPanda mentions the now-out-of-use lsearchd boxes, but they are out of warranty too
[17:54:14] before we look for hardware, though, we need to know what kind of hardware we need
[17:54:21] so we gotta find the bottleneck
[17:54:35] anyway, an invalid event stream sounds like a good idea to me
[18:03:28] Analytics-Engineering: Request for current data about mobile editing in he.wikipedia - https://phabricator.wikimedia.org/T86793#976577 (ggellerman) NEW
[18:03:46] Analytics-Engineering: Request for current data about mobile editing in he.wikipedia - https://phabricator.wikimedia.org/T86793#976584 (ggellerman) response from Dario: "the only revert data we publish for hewiki is here: http://ee-dashboard.wmflabs.org/dashboards/hewiki-metrics#reverts-graphs-tab breakdow...
[18:06:11] YuviPanda: so shinken sometimes goes "puppet CRITICAL" and then 30 minutes later "puppet OK"
[18:06:14] what's up with that?
[18:06:29] (it's doing it for dan-pentaho, which has no custom puppet work at all)
[18:06:30] milimetric: that’s… things being flaky, mostly.
[18:06:38] milimetric: until last week it was DNS failures
[18:06:47] milimetric: now there was a wikitech outage for a few minutes
[18:06:49] causing puppet to fail
[18:06:57] ok, gotcha
[18:07:08] is there a way to turn off puppet monitoring?
[18:07:13] for specific instances?
[18:13:31] milimetric: sadly not yet.
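Note on the invalid-event stream discussed between 17:44 and 17:54: two ideas were floated, a deliberately loose wrapper schema ({attempted_schema, payload}, both optional) and a buffer that writes similar events together with a count, keyed by an md5 of the payload. The sketch below shows one way those could fit together. It is not EventLogging code; the class name, the `count` field and the flush behaviour are assumptions made here for illustration.

```python
import hashlib
import json
from collections import OrderedDict


def wrap_invalid(attempted_schema, payload):
    """Wrap a rejected event in a loose record (hypothetical shape).

    Both fields are optional, so the wrapper itself should never be
    rejected for a missing field.
    """
    return {"attempted_schema": attempted_schema, "payload": payload}


class InvalidEventBuffer:
    """Collapse identical invalid events into one record plus a count."""

    def __init__(self):
        self._entries = OrderedDict()  # md5 digest -> [record, count]

    def add(self, attempted_schema, payload):
        record = wrap_invalid(attempted_schema, payload)
        # Sort keys so that equal payloads always hash to the same digest.
        canonical = json.dumps(record, sort_keys=True, default=str)
        digest = hashlib.md5(canonical.encode("utf-8")).hexdigest()
        entry = self._entries.setdefault(digest, [record, 0])
        entry[1] += 1

    def flush(self):
        """Return one summary record per distinct event and empty the buffer."""
        out = [dict(record, count=count) for record, count in self._entries.values()]
        self._entries.clear()
        return out


buf = InvalidEventBuffer()
for _ in range(3):
    buf.add("SomeSchema", {"clicks": "not-a-number"})
buf.add("SomeSchema", {"clicks": None})
summary = buf.flush()
print([r["count"] for r in summary])  # prints [3, 1]
```

Aggregating before writing keeps a burst of identical invalid events from turning into an equally large burst of stored records, which is the spike concern raised at 17:44 and 17:47.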
[18:22:03] ori: the unvalidated events (consumer logs) are so small that hadoop seems a little overkill
[18:22:29] ori: it'll be the needle in the haystack, unless we have a fast way to get to that data
[18:31:46] nuria, milimetric: after vagrant destroy; vagrant up, I get the same error as you :/
[18:32:23] mforns: aham, proxy working not so good, do you have apache logs?
[18:32:53] the log file we discussed has no errors...
[18:33:12] nuria, don't bother, I'll continue here
[18:33:21] if I need help, I'll call
[18:34:34] mforns: ok, the fact that requests to http://localhost:8080/event.gif are 500 and not 404 suggests misconfiguration
[18:34:56] sure
[19:08:51] wikimedia/mediawiki-extensions-EventLogging#329 (wmf/1.25wmf15 - c39d563 : Reedy): The build passed.
[19:08:51] Change view : https://github.com/wikimedia/mediawiki-extensions-EventLogging/commit/c39d56313f6e
[19:08:51] Build details : http://travis-ci.org/wikimedia/mediawiki-extensions-EventLogging/builds/47023770
[19:08:57] (PS4) OliverKeyes: Legacy pageviews definition UDF [analytics/refinery/source] - https://gerrit.wikimedia.org/r/182971
[19:09:36] hey ottomata, who has two thumbs and just submitted a patch that uses the patternIsFound method and doesn't have whitespace problems?
[19:09:40] THIS GUY
[19:10:24] and milimetric, what permissions do I need to get into the Pentaho instance?
[19:11:01] checking Ironholds
[19:11:07] kk :)
[19:12:06] Ironholds: you just needed to be added to the "analytics" labs project
[19:12:11] gotcha
[19:12:14] and I do that how?
[19:12:15] can you log into labs instances in general?
[19:12:19] oh I just did it for you
[19:12:36] (note I used "needed" as in past tense :))
[19:14:19] thanks!
[19:14:23] and, I don't know. I'll try ;p
[19:14:35] Ironholds: my ssh config (pasting):
[19:14:56] https://www.irccloud.com/pastebin/DexhvA0e
[19:15:21] and you need to make sure you have a key registered (let me know if stuff don't work)
[19:15:51] yup; thanks! :)
[19:16:16] mforns: qchris, heads up. i just merged another udp2log related firewall change. I don't expect anything to be affected
[19:16:18] but just in case.
[19:16:32] ottomata, ok
[19:16:36] ottomata: k. Thanks for the heads up
[19:17:15] fun fact: "key" and "keys" are invalid aliases for mysql queries
[19:17:27] fact which it helpfully makes you aware of by saying: "PROBLEM"
[19:18:01] note to self: bite thumb at person who maintains mysql
[19:19:09] someone maintains MySQL?
[19:20:10] :)
[19:20:15] ottomata, mforns: I briefly checked on erbium, and oxygen. Both are still getting good data. So gadolinium is fine too. It looks good for now. I'll double check in half an hour, to make sure puppet does not get in the way.
[19:20:39] thanks qchris
[19:25:28] (CR) Ottomata: [WIP] UDF to get country code from IP address UDF to determine client IP address given values from source IP address and XFF headers Added I (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/183551 (owner: Ananthrk)
[19:29:38] (CR) Ottomata: Mobile apps oozie jobs (3 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/181017 (owner: Nuria)
[19:35:27] (PS5) OliverKeyes: Legacy pageviews definition UDF [analytics/refinery/source] - https://gerrit.wikimedia.org/r/182971
[19:36:21] (PS3) Gergő Tisza: Generate pageview stats [analytics/multimedia] - https://gerrit.wikimedia.org/r/179872 (https://phabricator.wikimedia.org/T78189)
[19:38:16] (CR) Gergő Tisza: "For the record, the old version ran for about 80 minutes, so doing the index-intensive operations before the union sped things up but not " [analytics/multimedia] - https://gerrit.wikimedia.org/r/179872 (https://phabricator.wikimedia.org/T78189) (owner: Gergő Tisza)
[19:39:52] (PS6) Ottomata: Legacy pageviews definition UDF [analytics/refinery/source] - https://gerrit.wikimedia.org/r/182971 (owner: OliverKeyes)
[19:40:02] (CR) Ottomata: [C: 2 V: 2] Legacy pageviews definition UDF [analytics/refinery/source] - https://gerrit.wikimedia.org/r/182971 (owner: OliverKeyes)
[19:58:05] er... there was no information about what IRC channel to go to for the EL office hours
[19:58:16] I just updated the engineering event and pointed it here
[19:58:23] milimetric, ok
[19:58:25] thanks
[19:59:26] argh, I also see it has the wrong goog hangout in there :-(
[19:59:51] I’ll hang out in that channel just in case peeps join there.
[20:00:00] The conference room will use the batcave
[20:00:37] i pasted that in there because it was set to TBD
[20:00:40] but feel free to edit
[20:01:01] Welcome Everyone!!
[20:01:13] This marks the beginning of The Analytics Team Office Hours on Event Logging
[20:02:01] please excuse bots / random chat, but feel free to ask any questions you have about Event Logging, Dashboarding, our plans for the near and far future
[20:10:45] S is here in the conference room talking about EventLogging documentation
[20:11:23] EL documentation: https://www.mediawiki.org/wiki/Extension:EventLogging
[20:15:37] more documentation for folks internal to WMF to use EventLogging: https://wikitech.wikimedia.org/wiki/EventLogging
[20:17:32] is this the office hours?
[20:18:32] https://meta.wikimedia.org/wiki/Schema:EventCapsule
[20:19:04] spagewmf: https://phabricator.wikimedia.org/T86748
[20:19:25] the good news (if you can call it that..) is that the bug appears to be presentational-only
[20:20:48] (CR) Nuria: "Please see comments in GeocodedDataUDF.java regarding testing, initialization of db and static initialization that is being un-done by tes" (14 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/183551 (owner: Ananthrk)
[20:27:05] milimetric: what's the schedule for retiring limn? I would like to figure out if it makes sense to hold back non-urgent dashboard creation tasks to avoid having to do them twice
[20:28:06] tgr: yay! a question!
[20:28:19] tgr: don't hold back is the short answer
[20:28:24] (CR) Ottomata: [WIP] UDF to get country code from IP address UDF to determine client IP address given values from source IP address and XFF headers Added I (2 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/183551 (owner: Ananthrk)
[20:28:39] reason is, we are going to migrate all the existing limn dashboards without bothering you guys about them
[20:28:57] and we won't retire limn until we've accomplished that
[20:29:26] does that answer your question tgr?
[20:30:30] mforns, yt?
[20:30:31] partly, but it would still be nice to have a rough estimate (a month? a quarter? a year?)
[20:30:40] Ironholds, yep
[20:30:54] hello :]
[20:31:16] to be more specific about my use case, the multimedia dashboards are a bit of a pain and I was thinking about migrating them to some sort of templating system
[20:31:28] and I am wondering if that's worth the effort
[20:31:33] mforns, who has two thumbs and the new data? :D
[20:31:41] xD
[20:31:45] a templating system to generate limn config files I mean
[20:32:29] :] so, Ironholds, do you want to import the data together?
[20:32:48] totally! I'll get my keys set up now and then we can walk through it?
[20:33:16] ok, until what time are you working today? I have a meeting with Christian at 22h CET...
[20:33:36] tgr: ah, ok, can you explain a bit more about what is a pain?
[20:33:45] nuria, milimetric: if there's a way you could prioritize T86748, I'd appreciate it
[20:34:13] as far as a timeline, we want to retire it in Q3 (so by end of March)
[20:34:34] (I'm going to ask Timo to help you guys out)
[20:34:48] Ironholds, ^ (sorry for not poking you inline)
[20:35:29] ori: I think we have our plate full now until next week.
[20:36:00] mforns, I'm working until 8pm EST but I have to go out for a haircut
[20:36:02] yeah, but we can't just leave things in that state.
[20:36:06] what time is 22h CET in relation to now?
[20:36:14] hopefully timo can figure it out quickly
[20:36:27] but try to at least be prepared for a quick code review if you can manage
[20:36:30] ori: ok
[20:36:45] thanks, sorry to pile on
[20:36:55] milimetric: limn config files for graphs with lots of similar metrics are large and hard to read, and if I understand https://phabricator.wikimedia.org/T75611 correctly making them better-structured causes a performance hit
[20:37:26] ori: ok, will keep an eye for CRs today, be back in 40 mins
[20:37:34] will the Limn migration affect team's current dashboards? Will https://github.com/wikimedia/analytics-limn-flow-data/blob/master/dashboards/reportcard.json just move to a wiki page?
[20:37:40] so for example doing the refactoring needed for that task without a template system is not something I'm looking forward to :)
[20:37:40] Ironholds, i'll work until +- 00h CET -> 18h EST
[20:37:50] tgr: so I know you guys are doing some fancier stuff, but had you looked at the "magic" approach to adding graphs to limn? as in, where you just reference the datafile you want graphed?
[20:37:59] Ironholds, is 17h EST OK for you?
[20:38:25] spagewmf: yes, the plan is to basically migrate all the dashboard configuration to equivalent wiki pages
[20:38:31] then to organize those wiki pages using categories
[20:38:48] milimetric: I'm not sure I am aware of that, do you have an example for it?
[20:38:53] then to show all the dashboards in a central location, so people can discover this data if they don't know the address
[20:38:57] tgr: one sec
[20:39:09] tgr: http://mobile-reportcard.wmflabs.org/dashboards/reportcard.json?pretty
[20:39:23] mforns, that sounds perfect!
[20:39:26] I can get a haircut beforehand :)
[20:39:27] so, in that dashboard's case, nothing else is needed to render it. and you can see it here of course: http://mobile-reportcard.wmflabs.org/dashboards/reportcard.json
[20:39:31] oops
[20:39:32] http://mobile-reportcard.wmflabs.org/dashboards/reportcard
[20:39:36] Ironholds, OK done
[20:39:53] Ironholds, see you then
[20:40:26] can Extension:Graph render the same datasets?
[20:40:30] tgr: notice that you can't go directly to a graph and that it's limited to timeseries, etc. But all of these are things we can fix (either in limn or dashiki)
[20:40:47] spagewmf: not directly, but yes
[20:41:18] milimetric: that looks cool! is that what you were referring to in T75611? I misunderstood you then
[20:41:32] mforns, awesome!
[20:42:17] tgr: yes, i was not very explicit, sorry. I always feel like I'm being overbearing :(
[20:42:53] milimetric: https://www.mediawiki.org/wiki/Extension:Graph doesn't give an example of graphing a remote dataset like http://datasets.wikimedia.org/limn-public-data/flow/datafiles/active-boards.csv , the examples are local JSON
[20:43:02] tgr: so that approach and the current way you're doing it can live side by side. So if you have some custom more complicated graphs, just keep their graph_id in there. If not, you can replace the graph id with the datafile url and remove both the datasource and graph metadata
[20:43:28] I thought it would be more like including the current content of the datasource definition files in the dashboard file
[20:43:34] but this looks really simple
[20:43:37] spagewmf: yes, remote data is a problem, but the data we use in limn comes from datasets.wikimedia.org and has CORS set up to work
[20:43:57] spagewmf: it should just work, but I wouldn't eat my underwear if it doesn't
[20:44:25] tgr: yeah, it could be expanded to add a few overrides and stuff if you find that the current solution is *too* simplistic
[20:44:44] and I'm happy to do that if it makes your life easier, because then I'll know what we need to work on for the next tool (dashiki)
[20:45:08] dashiki, btw, is "dashboards configurable by wikis" and is our next tool after limn
[20:45:54] milimetric: btw we are preparing a retrospective for mediaviewer and it includes some reflections on the analytics toolset
[20:46:24] I'll compile a mail from that eventually but you can see the raw list of observations here: http://etherpad.wikimedia.org/p/multimedia-mmv-technical-retrospective
[20:47:40] tgr: awesome, thank you very much (looking now)
[20:50:08] yes -- thanks tgr
[20:53:27] this is great - the raw retro notes. tgr, you can feel free to send that as is, I think I'll use it as a template to tell others: this is the kind of feedback we need to make this pipeline better for y'all
[20:54:10] milimetric: did someone (John Katz?) write a better guide to setting up a reportcard dashboard than the guidance you gave Flow, https://www.mediawiki.org/wiki/Talk:Flow/Analytics ? I want EventLogging/Guide to link to that as the next step in visualizing EventLogging
[20:54:49] spagewmf: that's more of a technical guide and I have just been doing the whole thing myself anyway
[20:54:57] basically the guide is: "ping milimetric"
[20:55:09] with this as the fallback landing page: https://wikitech.wikimedia.org/wiki/Analytics/Dashboards
[20:55:23] and the Flow talk page as perhaps even more technical detail
[20:55:57] btw, hangouts 0, irc 4 in a blow-out victory of text over video
[20:55:58] that's it, thanks I'll add it to ===Visualizing EventLogging data===
[20:56:17] milimetric: I'm sorry I didn't see what you typed because I had to mute you
[20:56:21] :)
[20:56:21] btw, someone flow-enabled my talk page on office, thank you! :)
[20:56:24] I wish I chatted
[20:56:29] lol
[20:59:34] I think all of office wiki is now flow enabled
[21:00:30] nuria: there is an icinga alert. do you know if this is related to WikiGrok?
[21:01:37] kevinator: yes, every talk page on officewiki is Flow (but not subpages)
[21:01:55] milimetric: FYI I linked reportcards from https://www.mediawiki.org/wiki/Extension:EventLogging/Guide#Visualizing_EventLogging_data
[21:02:32] thanks spagewmf
[21:02:45] i've gotta run everyone - car stuff
[21:02:49] i'll be back laterish
[21:11:22] someone should update the "Funnel analysis" and "Defining a cohort..." sections of https://www.mediawiki.org/wiki/Extension:EventLogging/Guide , they aren't really part of EventLogging and there are other pages for them
[21:17:39] leila: icinga?
[21:17:44] leila for EL?
[21:18:19] leila: yes, we are wayyyyy over throughput
[21:18:24] leila: that is bad
[21:18:44] leila: is kaldari around?
[21:19:25] nuria: let me walk to kaldari's desk. I'll update in a few min
[21:19:47] (they may have to turn off a few other schemas to let EL breathe. we prefer not to stop WikiGrok data collection)
[21:20:59] leila: i think throughput might have gone down a bit, let me look in detail
[21:24:06] nuria: I'm with kaldari. we're checking the data to see if it's enough and we can turn off the test or we should stop some other schema
[21:24:16] leila: it's this one:
[21:24:18] do you have a few minutes to get on a Hangout nuria
[21:24:19] https://www.irccloud.com/pastebin/X7Kx6OB1
[21:24:32] MobileWebUIClickTracking
[21:24:56] lzia: yes
[21:25:01] lzia: of course
[21:25:22] bat cave?
[21:25:27] can you send the link
[21:25:36] https://plus.google.com/hangouts/_/wikimedia.org/a-batcave
[21:39:26] Analytics, Multimedia, MediaWiki-extensions-MultimediaViewer: Find a robust way of filtering local cache hits out of performance figures - https://phabricator.wikimedia.org/T86672#977353 (Tgr) >>! In T86672#975909, @Gilles wrote: > I went through the spec and unfortunately there doesn't seem to be a way to te...
[21:42:00] ottomata, got a few minutes to think about transferring files to altiscale with me?
[21:43:07] The gist of my current problem is that I can't ssh to bastion from stat1002 (public key denied), so I can't proxy a connection to altiscale.
[21:43:09] $ ssh bast1001.wikimedia.org
[21:43:09] Permission denied (publickey).
[21:43:58] The only way I can imagine to get around this is to start bringing a bunch of ssh keys to the server and that makes me feel uneasy. I'm hoping you know a better strategy
[21:47:29] oh HM
[21:47:30] right
[21:47:56] halfak: if you ssh -A into stat1002, does that work?
[21:48:06] Good Q
[21:48:51] Same issue.
[21:49:07] The WMF machines have been refusing AgentForwarding for a few months.
[21:49:11] Or so it seems [21:50:46] Analytics, Multimedia, MediaWiki-extensions-MultimediaViewer: Find a robust way of filtering local cache hits out of performance figures - https://phabricator.wikimedia.org/T86672#977386 (Tgr) Btw while searching I found [[ http://lists.w3.org/Archives/Public/public-web-perf/2014Nov/0034.html | this long thre... [21:51:00] Analytics: Look at proposed naming conventions for Diffusion - https://phabricator.wikimedia.org/T76057#977389 (Nemo_bis) AFAICS it's too late for any change, this looks declined/invalid. [21:53:25] wikimedia/mediawiki-extensions-EventLogging#330 (master - d4e34cf : Translation updater bot): The build has errored. [21:53:25] Change view : https://github.com/wikimedia/mediawiki-extensions-EventLogging/compare/b68364723fb2...d4e34cf5e7a2 [21:53:25] Build details : http://travis-ci.org/wikimedia/mediawiki-extensions-EventLogging/builds/47045969 [21:56:38] halfak: , hm, it works for me [21:56:43] at least, i can ssh into bastion from stat1002 [21:57:07] with agent forwarding [21:57:14] hmmMM [21:57:20] lemme try something [21:58:26] ottomata, gotcha. Must be something with me then. [22:00:02] hm, halfak it worked for me, i just logged into my personal server using my private ssh key from stat1002. you should be able to do it if you have an ssh key for altiscale on stat1002 [22:00:33] Hmm... I do, but not an ssh key for bastion. [22:00:43] I'll keep playing with my SSH Agent until I figure it out. [22:00:46] * halfak runs to meeting [22:00:50] thanks ottomata [22:00:56] ? [22:00:58] no, i mean [22:01:22] i did pretty much what you have setup [22:01:23] hm [22:01:29] let's work together on this when you get back, or tomorrow [22:01:34] kk [22:14:02] is wikitech down? ;p [22:14:53] ottomata, will review webrequests tomorrow morning, if that's okay? [22:16:23] cool [22:20:00] yay! [22:20:03] argh, labs. [22:29:11] ori: any documentation about how does the pretty printing on "schemas" work? 
[22:53:55] tnegrin, WE HAVE PAGEVIEWS [22:54:00] they don't appear totally batshit [22:56:56] yaaaaa [23:02:53] looks good Ironholds -- roughly checks out with WSC -- thanks much! [23:03:02] np! [23:03:02] (PS20) Nuria: Mobile apps oozie jobs [analytics/refinery] - https://gerrit.wikimedia.org/r/181017 [23:16:54] Analytics: Evaluate ABLincoln A/B testing framework - https://phabricator.wikimedia.org/T86853#977678 (ori) NEW [23:18:46] Analytics: Evaluate ABLincoln A/B testing framework - https://phabricator.wikimedia.org/T86853#977686 (kevinator) [23:20:46] (PS21) Nuria: Mobile apps oozie jobs [analytics/refinery] - https://gerrit.wikimedia.org/r/181017 [23:24:29] (CR) Nuria: "Please see patch#21" (3 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/181017 (owner: Nuria) [23:26:52] Ironholds: i owe you a review on your last udf code, sorry [23:27:03] Ironholds: been trying to get to it for forever [23:28:23] nuria, that's okay; I have to fix a load of minor stuff for otto's review anyway, and I won't get to that until tomorrow [23:29:08] Ironholds: k [23:46:28] Analytics-EventLogging: HTML of Schema pages appears garbled - https://phabricator.wikimedia.org/T86748#977804 (Krinkle) I can't reproduce this issue. In MediaWiki core, on e.g. User:Root/foo.json (with https://gerrit.wikimedia.org/r/177172 applied), this works fine: {F28655} In EventLogging, on e.g. Schema... [23:50:44] Analytics-EventLogging: HTML of Schema pages appears garbled - https://phabricator.wikimedia.org/T86748#977806 (Krinkle) Ah, so those were pre-existing pages that I edited to invalidate any caches, but the page itself already existed. And that includes the content-model property of a page. When creating a ne... [23:51:04] Analytics-EventLogging, MediaWiki-ContentHandler: HTML of Schema pages appears garbled - https://phabricator.wikimedia.org/T86748#977807 (Krinkle) a:Nuria>Krinkle