[00:05:19] Analytics-Kanban, Easy, Patch-For-Review: Don't accept data from automated bots in Event Logging - https://phabricator.wikimedia.org/T67508#3321787 (Nuria) Closing, events are present on MediaWikiPingback from 20170606231658.
[00:05:27] Analytics-Kanban, Easy, Patch-For-Review: Don't accept data from automated bots in Event Logging - https://phabricator.wikimedia.org/T67508#3321788 (Nuria) Open>Resolved
[00:12:55] phuedx: just an FYI but FF (especially old versions) is well known for not canceling inflight requests
[00:13:19] phuedx: so it is much more likely to find duplicates on FF than chrome due to an error in the implementation
[00:13:32] phuedx: of your instrumentation
[04:57:04] nuria_: thanks for the heads up, we're seeing the duplicates from v53 and v52
[04:58:54] and – just in case i wasn't clear before – i wasn't trying to imply it was the fault of the server, i'm just trying to understand a part of the system which, until yesterday afternoon, was mostly a black box to me
[05:00:44] but i think it's time to enlist an external reviewer for the instrumentation
[06:50:46] !log restarted mediacounts-archive-wf-2017-06-06 in Hue (Java OOMs)
[06:50:47] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[07:29:17] Analytics-Kanban, Easy, Patch-For-Review: Don't accept data from automated bots in Event Logging - https://phabricator.wikimedia.org/T67508#3322263 (Tgr) Thanks!
[07:49:59] Analytics, LDAP-Access-Requests: Requesting access to the nda LDAP group for GoranSMilovanovic - https://phabricator.wikimedia.org/T167199#3322323 (elukey) Open>Resolved a:elukey It is probably partially my fault, I didn't explain myself correctly in the other task (that was closed by Goran)....
[10:05:49] nuria_: could you clarify your comment about inflight requests? under what circumstances should a browser cancel an inflight beacon request?
[10:31:58] * elukey lunch + dentist, will be afk a little longer
[10:36:18] Analytics, LDAP-Access-Requests: Requesting access to the nda LDAP group for GoranSMilovanovic - https://phabricator.wikimedia.org/T167199#3322768 (GoranSMilovanovic) @elukey Thanks, I can access https://pivot.wikimedia.org now.
[13:25:59] Analytics: Sqoop wbc_entity_usage from all wikis into hadoop (HDFS) - https://phabricator.wikimedia.org/T167290#3323476 (Halfak)
[13:27:07] Analytics-Cluster, Analytics-Kanban, Operations, ops-eqiad, User-Elukey: Analytics hosts showed high temperature alarms - https://phabricator.wikimedia.org/T132256#3323492 (elukey) @Cmjohnson do you have time during the next days to do a couple of hosts?
[13:29:43] * elukey afk a bit for coffee
[13:48:59] Analytics, MediaWiki-extensions-WikimediaEvents, The-Wikipedia-Library, Wikimedia-General-or-Unknown, Patch-For-Review: Implement Schema:ExternalLinksChange - https://phabricator.wikimedia.org/T115119#3323616 (Samwalton9)
[14:16:23] Analytics, Analytics-EventLogging, Editing-Analysis, Wikimedia-Hackathon-2017, and 2 others: Record an EventLogging event every time a new content namespace page is created - https://phabricator.wikimedia.org/T150369#3323770 (Ottomata) @kaldari, I just added https://wikitech.wikimedia.org/wiki/Ev...
[14:30:47] hello team
[14:30:53] helllooo
[14:30:58] :]
[14:40:52] Analytics, Analytics-Cluster, Operations, Traffic, User-Elukey: Encrypt Kafka traffic, and restrict access via ACLs - https://phabricator.wikimedia.org/T121561#3323871 (Ottomata) We should do some work to understand how ACLs work and what ACLs for what topics we should set in production.
[14:42:12] phuedx: it is not just beacon requests, FF versions used to re-issue requests that were inflight over and over, whereas other browsers will notice the request was already issued and cancel it; you can do some digging on Firefox bugs filed to that effect. I have run into this before but it has been a while. I would first check your instrumentation and filing of events, as duplicates most frequently point to instrumentation
[14:42:12] failures. Do check the version of browser where you see issues and focus on troubleshooting just that one.
[14:42:18] phuedx: see my comment here: I think there are several issues here, the main one is likely a browser issue, almost all duplicated events on pop up schema are coming from FF (which unlike chrome doesn't cancel outgoing requests even if your finger gets stuck on F5). It is our experience that when this happens the client is at fault, that is, the instrumentation is sending two events in close succession when it
[14:42:19] should only send one. In this scenario chrome will cancel the 2nd request, FF would not (at least older FF versions did not). FF 47, 48 and 49 (to a lesser extent) seem to have this issue.
[14:42:23] Analytics-Cluster, Analytics-Kanban: Provision new Kafka cluster(s) with security features - https://phabricator.wikimedia.org/T152015#3323912 (Ottomata)
[14:42:25] Analytics, Analytics-Cluster: Understand Kafka ACLs and figure out what ACLs we want for production topics - https://phabricator.wikimedia.org/T167304#3323900 (Ottomata)
[14:42:35] phuedx: https://phabricator.wikimedia.org/T142667
[14:47:01] Analytics-Kanban: Reinstate a subset of reports removed from the reportcard until WikiStats 2.0 is back - https://phabricator.wikimedia.org/T166679#3323941 (mforns) a:mforns
[14:50:18] Analytics, Analytics-EventLogging, Editing-Analysis, Wikimedia-Hackathon-2017, and 2 others: Record an EventLogging event every time a new content namespace page is created - https://phabricator.wikimedia.org/T150369#3323970 (Nuria) @ottomata: I think we need a short intro of that page about the...
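The advice above (check which browser versions the duplicates come from and focus on those) can be sketched roughly as follows. This is only an illustration: the field names `token`, `browser` and `version` are hypothetical stand-ins, not the actual schema fields, and a "duplicate" is taken here to mean a repeated `token`.

```python
from collections import Counter


def duplicates_by_browser(events):
    """Count repeated events per (browser, version).

    Each event is a dict with the hypothetical keys "token"
    (an identifier that should be unique per event), "browser"
    and "version". The first occurrence of a token is treated as
    the legitimate event; every later occurrence is counted as a
    duplicate against the browser version that sent it.
    """
    seen = set()
    dupes = Counter()
    for ev in events:
        if ev["token"] in seen:
            dupes[(ev["browser"], ev["version"])] += 1
        else:
            seen.add(ev["token"])
    return dupes


if __name__ == "__main__":
    sample = [
        {"token": "a", "browser": "Firefox", "version": 53},
        {"token": "a", "browser": "Firefox", "version": 53},
        {"token": "b", "browser": "Chrome", "version": 58},
    ]
    for (browser, version), n in duplicates_by_browser(sample).most_common():
        print(browser, version, n)
```

If nearly all of the counts land on a handful of browser versions, that would be consistent with the browser-specific behaviour described above; the instrumentation would still need reviewing for events fired twice in close succession.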
[14:55:37] Analytics, MediaWiki-API, PageViewInfo, Reading-Infrastructure-Team-Backlog, Patch-For-Review: Add pageview stats to the action API - https://phabricator.wikimedia.org/T144865#3324008 (Fjalapeno)
[14:57:24] ottomata: ReadAhead seems a nice feature - it preloads data into cache (while reading) with the assumption that you'll read sequential data
[14:57:38] Analytics, Discovery, Reading-Infrastructure-Team-Backlog: Determine proper encoding for structured log data sent to Kafka by MediaWiki - https://phabricator.wikimedia.org/T114733#3324051 (Fjalapeno)
[14:58:51] Analytics, MediaWiki-API, Reading-Infrastructure-Team-Backlog: Load API request count and latency data from Hadoop to a dashboard - https://phabricator.wikimedia.org/T108414#3324088 (Fjalapeno)
[14:59:29] ottomata: Adaptive: turns on read-ahead when the two most recent disk accesses are for sequential sectors and turns off read-ahead (i.e., set to normal) when the disk accesses revert to random sectors
[14:59:46] (this one was a quote)
[15:00:02] Analytics, Discovery, Reading-Infrastructure-Team-Backlog: Determine proper encoding for structured log data sent to Kafka by MediaWiki - https://phabricator.wikimedia.org/T114733#3324131 (Ottomata) > Querying data with Hive from an EventLogging supplied data set requires a intermediate step of parsi...
[15:00:08] ping ottomata standup
[15:03:50] nuria_: thanks a lot for elucidating!
[15:04:11] Analytics, Developer-Relations, MediaWiki-API, Reading-Admin, and 7 others: Metrics about the use of the Wikimedia web APIs - https://phabricator.wikimedia.org/T102079#3324141 (Fjalapeno)
[15:06:25] Analytics, Developer-Relations, MediaWiki-API, Reading-Admin, and 3 others: Is User-Agent data PII when associated with Action API requests? - https://phabricator.wikimedia.org/T154912#3324147 (Fjalapeno)
[15:17:20] Analytics, Analytics-EventLogging, Editing-Analysis, Wikimedia-Hackathon-2017, and 2 others: Record an EventLogging event every time a new content namespace page is created - https://phabricator.wikimedia.org/T150369#3324213 (Ottomata) Ok, modified it a bit, and also added https://wikitech.wikime...
[15:19:41] Analytics, EventBus, ORES, Scoring-platform-team, Patch-For-Review: Emit revision-score event to EventBus and expose in EventStreams - https://phabricator.wikimedia.org/T167180#3324241 (Fjalapeno)
[15:19:45] Analytics, ChangeProp, Collaboration-Team-Triage, Edit-Review-Improvements-ReviewStream, and 4 others: Set up the foundation for the ReviewStream feed - https://phabricator.wikimedia.org/T143743#3324240 (Fjalapeno)
[15:20:14] Analytics, Analytics-EventLogging, Composer: EventLogging has Invalid composer.json - https://phabricator.wikimedia.org/T167309#3324243 (dbarratt)
[15:21:31] Analytics, Analytics-EventLogging, Editing-Analysis, Wikimedia-Hackathon-2017, and 2 others: Record an EventLogging event every time a new content namespace page is created - https://phabricator.wikimedia.org/T150369#3324261 (Ottomata) The only 3 things that are not yet on the [[ https://github.c...
[15:21:38] Analytics, Analytics-EventLogging, Composer, Easy: EventLogging has Invalid composer.json - https://phabricator.wikimedia.org/T167309#3324264 (dbarratt)
[15:25:17] Analytics, Analytics-EventLogging, Composer, Easy: EventLogging has Invalid composer.json - https://phabricator.wikimedia.org/T167309#3324275 (dbarratt)
[15:27:07] Analytics, EventBus, ORES, Scoring-platform-team, Patch-For-Review: Emit revision-score event to EventBus and expose in EventStreams - https://phabricator.wikimedia.org/T167180#3324280 (Fjalapeno) @Ottomata just wanted to say thanks for getting this work together! I linked it to the other tas...
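The "Adaptive" read-ahead policy quoted at 14:59:29 can be illustrated with a toy model. This is a sketch of the rule as worded in the quote only (read-ahead on when the two most recent accesses hit consecutive sectors, off otherwise); it is not how any particular RAID controller implements it, and the function name is made up for the example.

```python
def adaptive_read_ahead(sectors):
    """Return, for each access, whether read-ahead would be on.

    Toy model of the quoted rule: read-ahead is switched on when the
    current access is to the sector immediately after the previous
    one (the two most recent accesses are sequential), and switched
    off again as soon as an access lands on a non-sequential sector.
    """
    states = []
    prev = None
    for sector in sectors:
        sequential = prev is not None and sector == prev + 1
        states.append(sequential)
        prev = sector
    return states


if __name__ == "__main__":
    accesses = [10, 11, 12, 50, 51, 7]
    for sector, on in zip(accesses, adaptive_read_ahead(accesses)):
        print(sector, "read-ahead on" if on else "read-ahead off")
```

Under this model a long sequential scan keeps read-ahead on after its second access, while a random workload leaves it off almost all the time.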
[15:27:29] Analytics, Analytics-EventLogging, Composer, Easy: EventLogging has Invalid composer.json - https://phabricator.wikimedia.org/T167309#3324283 (dbarratt)
[15:27:32] Analytics, EventBus, ORES, Reading-Infrastructure-Team-Backlog, and 3 others: Emit revision-score event to EventBus and expose in EventStreams - https://phabricator.wikimedia.org/T167180#3324284 (Fjalapeno)
[15:40:55] Analytics, EventBus, ORES, Reading-Infrastructure-Team-Backlog, and 3 others: Emit revision-score event to EventBus and expose in EventStreams - https://phabricator.wikimedia.org/T167180#3324380 (Jdlrobson) I'm not sure how I'll be using these yet for trending. I'd need to see what they look like...
[15:41:32] Analytics, EventBus, ORES, Reading-Infrastructure-Team-Backlog, and 3 others: Emit revision-score event to EventBus and expose in EventStreams - https://phabricator.wikimedia.org/T167180#3324384 (Ottomata) Oh yaaahhh! When looking at revision fields, note that I removed a bunch of them that were...
[15:41:48] Analytics, EventBus, ORES, Reading-Infrastructure-Team-Backlog, and 3 others: Emit revision-score event to EventBus and expose in EventStreams - https://phabricator.wikimedia.org/T167180#3324387 (Ottomata) > I assume these will automatically be available in EventStreams events ? Ya!
[15:45:55] Analytics-Kanban, Discovery, Discovery-Analysis (Current work), Interactive-Sprint, Patch-For-Review: No maps tile requests in webrequests table as of 1 June 2017 - https://phabricator.wikimedia.org/T167083#3324397 (debt)
[15:46:57] Analytics, Analytics-EventLogging, Editing-Analysis, Wikimedia-Hackathon-2017, and 3 others: Record an EventLogging event every time a new content namespace page is created - https://phabricator.wikimedia.org/T150369#3324403 (Nuria) I think this is a lot better! Thank you @kaldari, please take a...
[15:54:27] Analytics, Analytics-EventLogging, Editing-Analysis, Wikimedia-Hackathon-2017, and 3 others: Record an EventLogging event every time a new content namespace page is created - https://phabricator.wikimedia.org/T150369#3324421 (Ottomata) There's a helper function in the EventBus extension to build...
[16:47:57] Analytics, Analytics-EventLogging, Editing-Analysis, Wikimedia-Hackathon-2017, and 3 others: Record an EventLogging event every time a new content namespace page is created - https://phabricator.wikimedia.org/T150369#3324678 (Ottomata) > Are you sure this is actually getable? :) Answering my own...
[16:49:02] Analytics, Analytics-EventLogging, Editing-Analysis, Wikimedia-Hackathon-2017, and 3 others: Record an EventLogging event every time a new content namespace page is created - https://phabricator.wikimedia.org/T150369#3324682 (Ottomata) Oh, another Q. I don't see an obvious way to get the user cr...
[16:50:30] Analytics, Analytics-EventLogging, Editing-Analysis, EventBus, and 4 others: Record an EventLogging event every time a new content namespace page is created - https://phabricator.wikimedia.org/T150369#3324687 (Ottomata)
[16:56:42] Analytics, Analytics-EventLogging, Editing-Analysis, EventBus, and 4 others: Record an EventLogging event every time a new content namespace page is created - https://phabricator.wikimedia.org/T150369#3324702 (dbarratt) a:dbarratt>Ottomata Feel free to assign back to me if there's anythin...
[16:56:44] Quarry: Explain command forces Quarry to keep running endlessly - https://phabricator.wikimedia.org/T155808#3324705 (yuvipanda) a:yuvipanda>None
[16:59:42] Analytics, EventBus, ORES, Reading-Infrastructure-Team-Backlog, and 3 others: Emit revision-score event to EventBus and expose in EventStreams - https://phabricator.wikimedia.org/T167180#3324743 (Fjalapeno) > Perhaps instead, it would be worth adding an optional scores field to the revision/creat...
[17:02:04] Analytics, EventBus, ORES, Reading-Infrastructure-Team-Backlog, and 3 others: Emit revision-score event to EventBus and expose in EventStreams - https://phabricator.wikimedia.org/T167180#3324764 (Ottomata) I think @Jdlrobson is consuming the old format recentchange events, for which we are not r...
[17:02:49] Analytics, Analytics-EventLogging, Editing-Analysis, EventBus, and 4 others: Record an EventLogging event every time a new content namespace page is created - https://phabricator.wikimedia.org/T150369#3324769 (Ottomata) Rats, I've been assigned. YARRRRrrrrrrr fine.
[17:11:24] Analytics, Analytics-EventLogging, Editing-Analysis, EventBus, and 4 others: Record an EventLogging event every time a new content namespace page is created - https://phabricator.wikimedia.org/T150369#3324826 (kaldari) >When we talked about this with the collaboration team, this sounded expensive...
[17:30:39] mforns: turns out that i can go to SoS
[17:30:56] nuria_, OK
[17:32:10] Analytics, Analytics-EventLogging, Editing-Analysis, EventBus, and 4 others: Record an EventLogging event every time a new content namespace page is created - https://phabricator.wikimedia.org/T150369#3324979 (kaldari) @Ottomata: The documentation improvements are awesome! One other things that w...
[17:58:12] Analytics, Analytics-EventLogging, Editing-Analysis, EventBus, and 4 others: Record an EventLogging event every time a new content namespace page is created - https://phabricator.wikimedia.org/T150369#3325078 (Ottomata) It does! https://wikitech.wikimedia.org/wiki/EventBus#Topic_Config
[18:01:38] Analytics, Analytics-EventLogging, Editing-Analysis, EventBus, and 4 others: Record an EventLogging event every time a new content namespace page is created - https://phabricator.wikimedia.org/T150369#3325098 (kaldari) @Ottomata: Yay! I didn't notice that had been added. Awesome!
[18:08:51] * elukey off!
[18:19:00] Analytics, EventBus, ORES, Reading-Infrastructure-Team-Backlog, and 3 others: Emit revision-score event to EventBus and expose in EventStreams - https://phabricator.wikimedia.org/T167180#3325185 (Jdlrobson) We are using revision-create See https://github.com/wikimedia/mediawiki-services-trending-...
[18:20:08] Analytics, Analytics-EventLogging, Editing-Analysis, EventBus, and 4 others: Record an EventLogging event every time a new content namespace page is created - https://phabricator.wikimedia.org/T150369#3325187 (kaldari) >Is autopatrol a 'user group', like bot, sysop, etc.? If so, this information...
[18:59:12] Quarry: Explain command forces Quarry to keep running endlessly - https://phabricator.wikimedia.org/T155808#3326586 (Soni) Just wanted to point out that 4 months later, the queries are still running.
[19:28:45] Analytics, EventBus, ORES, Reading-Infrastructure-Team-Backlog, and 3 others: Emit revision-score event to EventBus and expose in EventStreams - https://phabricator.wikimedia.org/T167180#3328860 (Ottomata) Ahhh right, great!
[19:43:19] Analytics-Kanban, Patch-For-Review: Make non-nullable columns in EL database nullable - https://phabricator.wikimedia.org/T167162#3330009 (mforns) a:mforns
[20:01:34] Quarry: Explain command forces Quarry to keep running endlessly - https://phabricator.wikimedia.org/T155808#3330162 (Nemo_bis) I think what the interface says isn't necessarily true.
[20:06:33] Analytics, Discovery-Analysis: Get 'sparklyr' working on stats1002 - https://phabricator.wikimedia.org/T139487#3330195 (mpopov) I downloaded **spark-1.6.3-bin-hadoop2.6** (http://spark.rstudio.com/#installation uses Spark 1.6.2) and put it in my homedir on stat1002 and have the following in my **.bashrc*...
[20:37:20] hey folks. I'
[20:37:28] m looking at pageview_hourly.
[20:37:39] And the docs for page_id say "For redirects this could be the page_id of the redirect or the page_id of the target. This may not always be set, even if the page is actually a pageview."
[20:37:54] How is it that the page_id might be the page_id of the redirecting page?
[20:46:31] Analytics, Discovery-Analysis: Get 'sparklyr' working on stats1002 - https://phabricator.wikimedia.org/T139487#3330364 (mpopov) ```lang=R Sys.setenv(HADOOP_CONF_DIR = "") Sys.setenv(HADOOP_HOME = "") Sys.setenv(HADOOP_PREFIX = "") Sys.setenv(YARN_CONF_DIR = "") ``` fixed it for me when using the latest...
[22:34:16] ottomata: hey! Is there any easy way to get a dump of latest revision.create events for a given page?
[22:34:33] to be more specific - i want all the revision.create events for https://trending.wmflabs.org/en.wikipedia/Special:History/Tetragrammaton within last 24 hrs
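The request at 22:34 (all revision-create events for one page within the last 24 hours) amounts to a filter on page title and event time. A minimal sketch, assuming the events have already been obtained somehow and parsed into dicts, and assuming each carries a `page_title` field and an ISO 8601 timestamp with an explicit UTC offset or `Z` suffix under `meta.dt`; both field names are assumptions for the example, and how to fetch the events in the first place is not covered.

```python
from datetime import datetime, timedelta, timezone


def recent_events_for_page(events, page_title, now=None,
                           window=timedelta(hours=24)):
    """Keep events for `page_title` no older than `window`.

    `events` is an iterable of dicts. The keys "page_title" and
    "meta" -> "dt" are assumed field names for this sketch.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    cutoff = now - window
    matches = []
    for ev in events:
        if ev.get("page_title") != page_title:
            continue
        # fromisoformat() on older Python versions rejects a trailing
        # "Z", so normalise it to an explicit UTC offset first.
        stamp = ev["meta"]["dt"].replace("Z", "+00:00")
        if datetime.fromisoformat(stamp) >= cutoff:
            matches.append(ev)
    return matches


if __name__ == "__main__":
    sample = [
        {"page_title": "Tetragrammaton",
         "meta": {"dt": "2017-06-07T10:00:00Z"}},
        {"page_title": "Tetragrammaton",
         "meta": {"dt": "2017-06-05T10:00:00Z"}},
    ]
    now = datetime(2017, 6, 7, 22, 34, tzinfo=timezone.utc)
    print(recent_events_for_page(sample, "Tetragrammaton", now=now))
```

Passing `now` explicitly keeps the function deterministic, which makes it easier to check against a saved batch of events.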