[11:52:29] hey *, there is interest in the number of "Active Wikimedia Editors for Wikidata" but the report card is still in november. is this the normal lag? [11:53:58] i assume the data is computed with some SQL, is there some way i can reuse your query to run it and a few variations myself? where would i find it? [12:51:31] (PS1) Zhaofeng Li: base.html: Add title and navbar toggle button [analytics/quarry/web] - https://gerrit.wikimedia.org/r/186345 [12:53:44] (PS2) Zhaofeng Li: base.html: Add title and navbar toggle button [analytics/quarry/web] - https://gerrit.wikimedia.org/r/186345 [13:15:01] Analytics-Visualization: Wikidata is the number five in number of active editors - https://phabricator.wikimedia.org/T58539#990968 (JanZerebecki) Open>Invalid [13:16:07] Analytics-Visualization: Wikidata is the number five in number of active editors - https://phabricator.wikimedia.org/T58539#633821 (JanZerebecki) As far as I can see this is exactly what the report card shows. [17:15:40] Analytics-EventLogging: Add Composer support - https://phabricator.wikimedia.org/T60459#991128 (kevinator) [17:18:02] Analytics-EventLogging: Check that schema name matches revid - https://phabricator.wikimedia.org/T48174#991130 (kevinator) [17:19:45] Analytics-Engineering, Analytics-EventLogging: EventLogging calling deprecated SyntaxHighlight_GeSHi::buildHeadItem - https://phabricator.wikimedia.org/T71328#991132 (kevinator) [17:21:16] Analytics-Engineering, Analytics-EventLogging: EventLogging calling deprecated SyntaxHighlight_GeSHi::buildHeadItem - https://phabricator.wikimedia.org/T71328#728100 (kevinator) This is probably fixed, we just need to verify that it is so [17:24:34] millimetric: ping [17:24:38] Analytics-Dashiki: make Dashiki JSON pages nicely - https://phabricator.wikimedia.org/T87441#991149 (kevinator) NEW [17:27:08] Analytics-EventLogging: Provide a robust way of logging events without blocking until network request completes; use sendBeacon with client-side storage 
fallback - https://phabricator.wikimedia.org/T44815#991166 (kevinator) a:Nuria [17:37:29] Analytics-EventLogging: Story: User clicks on link to event capsule schema while viewing a schema - https://phabricator.wikimedia.org/T74745#991176 (kevinator) [17:42:34] Analytics-EventLogging: Multiple user_ids per username in account creation events from ServerSideAccountCreation log - https://phabricator.wikimedia.org/T68101#991181 (kevinator) @csteipp can you give us an update on this? [17:44:50] Analytics-Engineering, Analytics-EventLogging: Epic: Engineer has simpler way to deploy dashboard from EL data - https://phabricator.wikimedia.org/T75836#991185 (kevinator) [17:44:53] Analytics-Engineering, Analytics-EventLogging: WMF reads announcement on simpler process to get a dashboard from EL data - https://phabricator.wikimedia.org/T76058#991182 (kevinator) Open>Resolved a:kevinator Email we sent out to wikitech and engineering mailinglists at the end of 2014. [17:46:51] Analytics-EventLogging: Generate alerts if theoretically impossible or unwanted logging occurs - https://phabricator.wikimedia.org/T49591#991187 (kevinator) [17:47:49] Analytics-EventLogging: Generate alerts if theoretically impossible or unwanted logging occurs - https://phabricator.wikimedia.org/T49591#991190 (kevinator) p:Normal>Low This should happen on beta labs while product testing. [17:50:46] Analytics-Engineering, Analytics-EventLogging: Epic: WMF Engineer reads documentation to set up a dashboard from EL data - https://phabricator.wikimedia.org/T76362#991195 (kevinator) [17:50:49] Analytics-Engineering, Analytics-EventLogging: WMF engineer follows steps to collect EL data - https://phabricator.wikimedia.org/T76679#991192 (kevinator) Open>Resolved a:kevinator This is a work in progress. 
We are talking to engineers (office hours and MWDS) [17:51:41] Analytics-EventLogging: Provide a robust way of logging events without blocking until network request completes; use sendBeacon with client-side storage fallback - https://phabricator.wikimedia.org/T44815#991196 (Gilles) sendBeacon is done now and the local storage fallback is what remains to be implemented,... [17:52:28] Analytics-EventLogging: Multiple user_ids per username in account creation events from ServerSideAccountCreation log - https://phabricator.wikimedia.org/T68101#991197 (csteipp) >>! In T68101#991181, @kevinator wrote: > @csteipp can you give us an update on this? @kevinator SUL Finalization is a ways off. We'... [17:52:44] Analytics-Engineering, Analytics-EventLogging: Automate pruning of sampled logs after 90 days [0 pts] - https://phabricator.wikimedia.org/T74743#991198 (kevinator) p:Low>Normal [17:54:24] Analytics-EventLogging: Empty objects can pass schemas with required fields - https://phabricator.wikimedia.org/T67607#991199 (kevinator) [17:55:58] Analytics-Engineering, Analytics-EventLogging: EventLogging calling deprecated SyntaxHighlight_GeSHi::buildHeadItem - https://phabricator.wikimedia.org/T71328#991205 (Legoktm) >>! In T71328#991132, @kevinator wrote: > This is probably fixed, we just need to verify that it is so Do you have a gerrit patch or... [17:56:33] Analytics-EventLogging: Story: Product groups have EventLogging validation - https://phabricator.wikimedia.org/T69126#991206 (kevinator) Open>declined a:kevinator closing because we are going to treat events that fail validation differently: pipe them somewhere people can watch for them. [17:58:24] Analytics-Engineering, Analytics-EventLogging: Automate purge of rows older than 90 days for select tables/schemas [0 pts] - https://phabricator.wikimedia.org/T74744#991209 (kevinator) Open>Resolved a:kevinator The process is sending an email to analytics list cc'ing Sean for tables that can be purged. 
[18:10:52] Analytics-EventLogging: Provide a robust way of logging events without blocking until network request completes; use sendBeacon - https://phabricator.wikimedia.org/T44815#991238 (Nuria) [18:12:56] Analytics-EventLogging: Provide a robust way of logging events without blocking until network request completes; use sendBeacon - https://phabricator.wikimedia.org/T44815#991242 (Nuria) @Gilles: we have no plans to implement the local storage fallback at this time. Send beacon fallback (when supported) is a... [18:14:01] Analytics, MediaWiki-General-or-Unknown: the ability to sort MostTranscludedPages by tranclusion in a namespaces is needed - https://phabricator.wikimedia.org/T87443#991245 (Amire80) NEW [18:19:59] Analytics-Dashiki: make Dashiki JSON pages display nicely - https://phabricator.wikimedia.org/T87441#991256 (kevinator) [18:21:26] Analytics-Dashiki: PM relabels metric in Vital Signs - https://phabricator.wikimedia.org/T86963#991269 (kevinator) [18:21:29] Analytics-Dashiki: Update label in Vital Signs - https://phabricator.wikimedia.org/T86600#991270 (kevinator) [18:21:41] +Ironholds hey [18:22:53] heyo [18:24:31] ggellerman, ^ [18:30:04] are these the 2 blockers I volunteered to raise with Kevin: prioritizing session metrics [Oliver] [18:30:04] Task review- Kevin to prio session solution [Oliver] [18:30:30] What does "task review" mean here? [18:32:04] could I get Phab ticket? [18:34:38] https://phabricator.wikimedia.org/T86535 [18:34:56] ggellerman, https://phabricator.wikimedia.org/T86535 [18:35:46] Analytics-Dashiki: Strange rendering glitches when removing lines - https://phabricator.wikimedia.org/T76745#991294 (kevinator) Open>Resolved a:kevinator we turned off animation so this doesn't occur anymore. 
[18:37:04] Analytics-EventLogging: Provide a robust way of logging events without blocking until network request completes; use sendBeacon - https://phabricator.wikimedia.org/T44815#991300 (Gilles) sendBeacon is the default now when available, the old method is the fallback [18:39:42] Analytics-Dashiki: Improve Dashiki's HTML template - https://phabricator.wikimedia.org/T73983#991320 (kevinator) Do we want this for SEO? need some clarification. [18:40:00] Analytics-Dashiki: Improve Dashiki's HTML template - https://phabricator.wikimedia.org/T73983#991321 (kevinator) [18:41:06] Analytics-Dashiki: weekly/monthly granularity in Dashiki - https://phabricator.wikimedia.org/T76092#991332 (kevinator) [18:43:43] Analytics-Dashiki: weekly/monthly granularity for pageviews in Dashiki - https://phabricator.wikimedia.org/T76092#991346 (kevinator) [18:50:50] Analytics-Dashiki: Story: EEVSUser selects ALL wikis - https://phabricator.wikimedia.org/T70478#991369 (kevinator) This is not very useful because the total always correlates with the biggest wikipedia (enwiki) {F31371} [18:56:06] hey where is this all being in the same place meeting? [19:19:02] hey *, there is interest in the number of "Active Wikimedia Editors for Wikidata" but the report card is still in november. is this the normal lag? [19:19:08] i assume the data is computed with some SQL, is there some way i can reuse your query to run it and a few variations myself? where would i find it? [20:30:31] jzerebecki: Yes, this lag is normal. November data has been added ~10 days ago. [20:33:59] If you need updates of this data, I guess it's best to capture milimetric, nuria, or ezachte when they are around. 
[20:36:10] (But IIRC, they are generated through parsing the dumps, not through some SQL command) [20:36:21] http://stats.wikimedia.org/wikispecial/EN/TablesWikipediaWIKIDATA.htm#editor_activity_levels [20:49:06] Analytics-Wikimetrics: Tag cohort functionality needs to check for ownership - https://phabricator.wikimedia.org/T68483#991732 (bmansurov) Open>Resolved [20:58:47] Analytics-EventLogging: Add sampling support in EventLogging - https://phabricator.wikimedia.org/T67500#991748 (Tgr) >>! In T67500#709963, @Nuria wrote: > Adding comments posted by ori on e-mail thread: > > "to do this {sampling] in the schema itself confuses the structure of the data with the mechanics of i... [21:06:37] +kevinator Pau and I are in Jaucourt [21:10:28] Analytics-EventLogging: Add A/B testing support in EventLoggin - https://phabricator.wikimedia.org/T87459#991759 (Tgr) NEW [22:00:57] !log Marked raw mobile webrequest partition for 2015-01-16T01/1H ok (The partition only needed deduping) [22:04:39] !log Marked raw text webrequest partition for 2015-01-15T15/1H ok (The partition only needed deduping) [22:08:03] qchris: thx [22:08:19] yw [22:11:20] !log Marked raw upload webrequest partition for 2015-01-15T17/1H ok (The partition only needed deduping) [22:23:27] !log Marked raw upload webrequest partition for 2015-01-16T01/1H ok (The partition only needed deduping) [22:32:07] Analytics-EventLogging: Add A/B testing support in EventLoggin - https://phabricator.wikimedia.org/T87459#991969 (Tgr) [22:34:54] Analytics-EventLogging: Use non-static calls in EventLogging - https://phabricator.wikimedia.org/T87468#991973 (Tgr) NEW [22:36:27] ottomata, did we deploy the LegacyPageviews class, or is that in the merged-but-not-deployed set? 
[22:37:29] i think that is merged but not deployed [22:37:40] many reviews on hold due to onsite stuff :/ [22:40:04] aha [22:46:05] !log Marked raw upload webrequest partition for 2015-01-16T12/1H ok (The partition only needed deduping) [22:47:31] HMM [22:47:43] qchris, quick idea that probably wouldn't work. [22:47:53] shoot [22:48:07] ok, so, camus fires up a mapper per kafka partition to import kafka data to hdfs [22:48:16] it already parses each message to look at the timestamp. [22:48:21] what if. [22:48:38] what if that same job kept a map of hostname -> last sequence number [22:49:01] and, for each message, it checks that the hostname,lastseq isn't already seen [22:49:11] since, within a given partition, the messages should be in order [22:49:29] they should be in order for a particular job [22:49:38] so, if it encounters a hostname,seq that it has already seen [22:49:41] we can assume it is a dup [22:49:45] and not write it... [22:49:59] hm, maybe that's the tricky part, not sure if i know how to tell camus to skip...probably possible [22:50:50] I am not sure that sequence numbers follow the scheme you have in mind. [22:50:57] But it should be easy to check... [22:51:03] whatcha mean? [22:51:06] per partition? [22:51:17] Yes. I am not convinced that they have to be increasing. [22:51:23] ? [22:51:28] But let me check. [22:51:30] for our use case they are, no? [22:51:33] unless we restart varnishkafka [22:52:00] I mean ... varnishkafka is sending them in an increasing manner. [22:52:12] But I am not sure they get put into the partition in the same manner. [22:52:29] you think varnishkafka could reorder them? [22:52:29] But if I check one of the files in hdfs's raw data dir. [22:52:34] aye, per file [22:52:48] And restrict to a single hostname there [22:52:51] aye, [22:52:55] Then you say they should be increasing. [22:53:05] Not sure ... gonna test. [22:54:11] ok! [22:54:11] :)( [23:13:45] ggellerman: what’s your ingress nick? 
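For reference, the dedup idea floated at 22:48 (one map of hostname -> last sequence number per partition) and the ordering check proposed at 22:52 could be sketched roughly as below. This is only an illustration in plain Python, not Camus code: the (hostname, sequence) message shape, the function names, and the sample hostnames are all assumptions made up for the example, and it presumes sequence numbers only increase per hostname within a partition, which is exactly the point still being tested in the discussion.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

# Assumed message shape for illustration: (hostname, sequence number).
Message = Tuple[str, int]


def drop_duplicates(messages: Iterable[Message]) -> Iterator[Message]:
    """Yield the messages of one partition, skipping suspected duplicates.

    A message counts as a duplicate when its sequence number is not
    higher than the last one written for the same hostname. This is
    only valid if messages are in order per hostname.
    """
    last_seq: Dict[str, int] = {}
    for hostname, seq in messages:
        if hostname in last_seq and seq <= last_seq[hostname]:
            continue  # already seen for this host: do not write it
        last_seq[hostname] = seq
        yield hostname, seq


def out_of_order(messages: Iterable[Message]) -> List[Message]:
    """Return messages whose sequence number is lower than one already
    seen for the same hostname (the check to run on one raw file)."""
    highest: Dict[str, int] = {}
    violations: List[Message] = []
    for hostname, seq in messages:
        if hostname in highest and seq < highest[hostname]:
            violations.append((hostname, seq))
        else:
            highest[hostname] = seq
    return violations


if __name__ == "__main__":
    # Hypothetical sample: host-a sends sequence number 2 twice.
    sample = [("host-a", 1), ("host-b", 1), ("host-a", 2),
              ("host-a", 2), ("host-b", 2), ("host-a", 3)]
    print(list(drop_duplicates(sample)))
    print(out_of_order(sample))
```

If out_of_order returns anything for a real file, the ordering assumption does not hold and drop_duplicates would throw away valid messages. A varnishkafka restart, which resets the sequence numbers, would need separate handling as well.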
[23:14:36] Analytics, MediaWiki-extensions-MultimediaViewer, Multimedia: Collect more data in MediaViewer network performance logging - https://phabricator.wikimedia.org/T86609#992060 (Tgr) [23:58:04] qchris: can you get me a list (or point me to) a list of python packages that geowiki scripts need? [23:58:12] i'm refactoring the statistics manifests for fadion [23:58:14] faidon [23:58:30] ottomata: not really. [23:58:34] hm [23:58:34] ok [23:58:50] But it includes pandas. [23:58:55] (In case that helps) [23:59:16] ok thanks [23:59:53] there are some at least in the setup.py file