[06:22:17] Analytics: Check EventBus doesn't need broker sync to enqueue - https://phabricator.wikimedia.org/T200025 (elukey) Open>Resolved a:elukey This should be the case if I am reading the code correctly. Kafka Python by default works in this way (https://kafka-python.readthedocs.io/en/master/apidoc/Kaf...
[07:47:47] elukey: o/ Can you remind me this gerrit URL to switch the UI ?
[07:48:51] hello!
[07:48:56] should be ?polygerrit=1 IIRC
[07:49:38] \o/ ! Thanks :)
[08:15:01] I made some calculations in https://wikitech.wikimedia.org/w/index.php?title=Incident_documentation/20180711-kafka-eqiad&redirect=no#Kafka_considerations about the file descriptors opened
[08:22:05] as FYI, I am rolling restarting kafka to pick up the new max open files settings
[08:26:58] ack/
[09:59:11] Analytics, EventBus, Operations, Services (watching): Set a proper max open files limit for Kafka clusters - https://phabricator.wikimedia.org/T200177 (mobrovac)
[10:32:04] Analytics, EventBus, Operations, Services (watching): Set a proper max open files limit for Kafka clusters - https://phabricator.wikimedia.org/T200177 (elukey) Open>Resolved
[10:44:25] * elukey lunch + errand!
[11:16:08] Analytics, Tool-Pageviews: Add option to include redirects in Massviews - https://phabricator.wikimedia.org/T200256 (Fuzzy)
[11:16:35] hellooo team
[11:43:22] hi mforns :)
[11:43:29] heya
[11:46:46] !log Checked that oozie webrequest upload warning for hour 2018-07-24-07 contains only false positives
[11:46:47] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[13:04:00] (PS1) Joal: Correct mediawiki_history_reduced create script [analytics/refinery] - https://gerrit.wikimedia.org/r/447614
[13:47:21] o/
[13:54:26] \o
[13:54:58] Analytics-Kanban: Correct mediawiki_history_reduced schema - https://phabricator.wikimedia.org/T200270 (JAllemandou)
[13:56:20] Analytics-Kanban: Correct mediawiki_history_reduced schema - https://phabricator.wikimedia.org/T200270 (JAllemandou)
[13:57:04] (PS2) Joal: Correct mediawiki_history_reduced create script [analytics/refinery] - https://gerrit.wikimedia.org/r/447614 (https://phabricator.wikimedia.org/T200270)
[13:57:53] Analytics-Kanban, Patch-For-Review: Correct mediawiki_history_reduced schema - https://phabricator.wikimedia.org/T200270 (JAllemandou) a:JAllemandou
[14:16:48] joal: o/ - if you have time today can we batcave? I have an idea for a script and I'd like to know your inputs
[14:17:07] elukey: give me 5 minutes and I'm all yours
[14:19:16] Analytics-Kanban: Update AQS new-pages endpoint - https://phabricator.wikimedia.org/T200272 (JAllemandou)
[14:19:31] Analytics-Kanban: Update AQS new-pages endpoint - https://phabricator.wikimedia.org/T200272 (JAllemandou) a:JAllemandou
[14:19:58] milimetric: Just subscribed you to that task --^
[14:20:06] Input welcome :)
[14:22:26] (PS1) Joal: Update new_pages endpoint druid query [analytics/aqs] - https://gerrit.wikimedia.org/r/447624 (https://phabricator.wikimedia.org/T200272)
[14:22:33] elukey: ready !
[14:23:16] ack!
[14:23:39] batcave?
[14:30:56] (CR) Milimetric: Update new_pages endpoint druid query (1 comment) [analytics/aqs] - https://gerrit.wikimedia.org/r/447624 (https://phabricator.wikimedia.org/T200272) (owner: Joal)
[14:37:01] (CR) Mforns: Update pageview whitelist (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/447381 (owner: Joal)
[14:58:19] (CR) Milimetric: Update pageview whitelist (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/447381 (owner: Joal)
[15:12:32] (PS2) Joal: Update pageview whitelist [analytics/refinery] - https://gerrit.wikimedia.org/r/447381
[15:13:31] (CR) Joal: "Good catch guys! Thanks :)" (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/447381 (owner: Joal)
[15:25:48] (PS3) Joal: Update pageview whitelist [analytics/refinery] - https://gerrit.wikimedia.org/r/447381
[15:28:54] (CR) Joal: [C: -2] "Actually this patch is not needed since foundation.wikimedia is hosted by a third party and gets analytics through Matomo." [analytics/refinery] - https://gerrit.wikimedia.org/r/446399 (https://phabricator.wikimedia.org/T188776) (owner: Reedy)
[15:51:48] Analytics, Analytics-Kanban, Patch-For-Review: [EL sanitization] Productionize EventLoggingSanitization.scala - https://phabricator.wikimedia.org/T193176 (Milimetric) For the record, the documentation was updated and looks great: https://wikitech.wikimedia.org/w/index.php?title=Analytics%2FSystems%2F...
[15:53:28] mforns: I added you to the patch about dropping empty partition dirs, hope you don't mind :)
[15:54:08] not at all, in short I will need this script, or a very similar one, for dropping EL data
[15:54:11] fdans, ^
[15:54:33] graciaaaas
[16:37:06] milimetric, mforns: if you guys have a minute, can you proof-read https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Edits/Mediawiki_history_reduced and https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Mediawiki_history_reduced_algorithm ?
[16:37:10] Many thanks :)
[16:37:21] I'm gonna go for dinner, will be back after
[16:37:54] will do, just finishing up tech meeting
[16:41:43] (CR) Milimetric: [V: 2 C: 2] Correct mediawiki_history_reduced create script [analytics/refinery] - https://gerrit.wikimedia.org/r/447614 (https://phabricator.wikimedia.org/T200270) (owner: Joal)
[16:42:18] (CR) Milimetric: [V: 2 C: 2] Update pageview whitelist [analytics/refinery] - https://gerrit.wikimedia.org/r/447381 (owner: Joal)
[17:14:20] joal, page looks good! I fixed some typos
[17:17:02] (CR) Reedy: "No, no it isn't." [analytics/refinery] - https://gerrit.wikimedia.org/r/446399 (https://phabricator.wikimedia.org/T188776) (owner: Reedy)
[17:33:31] * elukey off!
[19:20:57] (PS1) Fdans: Filter out unwanted wikis from wmf.virtualpageview_hourly [analytics/refinery] - https://gerrit.wikimedia.org/r/447665 (https://phabricator.wikimedia.org/T197971)
[19:34:04] milimetric: see the last comment from Reedy here https://gerrit.wikimedia.org/r/c/analytics/refinery/+/446399
[19:34:49] milimetric: I don't know enough on hosting side, so I'm all for adding foundation.wikimedia to pageviews (even if so far we've not seen any alarm related)
[19:35:11] Thanks mforns for the read and corrections :)
[19:35:19] :]
[19:36:11] mforns: I plan to have another couple of pages tomorrow - Will ask for review again ;)
[19:37:17] ping Reedy - About the patch for pageview whitelist
[19:37:23] ohai
[19:37:27] hello :)
[19:37:38] Yeah.. The wiki is staying around
[19:37:42] It's probably a lot less traffic
[19:37:52] Reedy: I'm very sorry I thought that site was out
[19:38:03] It's alright, just wanted to make sure you were aware
[19:38:07] It's been pretty confusing
[19:38:15] But it is keeping some important stuff like Privacy policy and cookie policy stuff, AFAIK
[19:38:53] Reedy: the thing I'm not sure of is: so far, we have the whitelist as an alarming system --> projects not in the whitelist make an alarm ring - And we've never received any from foundation.wikimedia
[19:39:09] There is a redirect still in place
[19:39:19] Depending what munging you do... your system may still see them as the same site
[19:39:33] Ah - So for the moment we receive wikimediafoundation, but soon we'll have foundation.wikimedia?
[19:39:55] The database name hasn't changed underneath
[19:40:09] So it might just depend on how/where you get your stats and how it labels them
[19:40:10] Ok - anyway no harm having it in the list - I'll update the patch and merge it :)
[19:40:27] Reedy: we receive the host from varnish
[19:42:47] joal / Reedy I think there may be some miscommunication here. We should keep wikimediafoundation as a host, as it will continue to serve things like Terms of Service. We shouldn’t add foundation.wikimedia, as we don’t answer requests for that address
[19:43:04] milimetric: Try it :)
[19:43:12] https://foundation.wikimedia.org/wiki/Special:Version
[19:44:07] ok, then confusion was mine and it’s cleared up. Still weird we don’t see it in the varnish logs
[19:44:17] Sams-MBP:~ reedy$ curl -I https://www.wikimediafoundation.org
[19:44:17] HTTP/2 302
[19:44:17] date: Tue, 24 Jul 2018 19:44:11 GMT
[19:44:18] content-length: 0
[19:44:18] location: https://foundation.wikimedia.org/
[19:44:36] It's possible it's still special cased or something
[19:44:42] Might be worth checking with Brandon
[19:45:12] No big deal on whitelist though, but interesting to actually know the answer
[19:48:00] Analytics, Analytics-Kanban, Patch-For-Review: Virtual pageview refine should not refine data that does not come from wikimedia domains - https://phabricator.wikimedia.org/T197971 (JAllemandou) Sounds good to me, maybe with a `.org$` to match only end-of-string (and make the regexp parser life easier).
[19:53:37] (CR) Joal: "Comments inline - Also, let's add the whitelist_table property in the list of needed ones in coordinator.xml so that it fails before even " (3 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/447665 (https://phabricator.wikimedia.org/T197971) (owner: Fdans)
[19:55:50] joal: milimetric:
[19:55:51] [20:55:25] does "not getting anything" mean webrequest data at all, or just their calculated last-access uniques stuff?
[19:56:40] Reedy: no pageview, meaning no webrequest that match a pageview - Give me a minute to check for webrequest
[19:56:44] We didn’t see anything that got marked as a pageview
[19:57:14] yeah, could be something in the pageview definition is just filtering out these requests
[19:59:34] milimetric: I confirm - Pages are present in webrequest but not flagged as pageviews
[20:00:19] ok, so nothing that would need to worry bblack, Reedy
[20:00:46] milimetric, Reedy : https://github.com/wikimedia/analytics-refinery-source/blob/master/refinery-core/src/main/java/org/wikimedia/analytics/refinery/core/PageviewDefinition.java#L75
[20:01:08] ok - it's on our side - sorry for bothering, Reedy
[20:01:26] ok, np
[20:03:52] (CR) Joal: "Pageviews were not showing up in analytics, but this site should be whitelisted. Will provide a patch updating the date." [analytics/refinery] - https://gerrit.wikimedia.org/r/446399 (https://phabricator.wikimedia.org/T188776) (owner: Reedy)
[20:07:46] (PS3) Joal: Add foundation.wikimedia to whitelist [analytics/refinery] - https://gerrit.wikimedia.org/r/446399 (https://phabricator.wikimedia.org/T188776) (owner: Reedy)
[20:11:36] milimetric: if you have a minute --^
[20:11:56] git st
[20:11:59] oops
[20:13:45] (PS1) Joal: Add foundation.wikimedia to pageviews [analytics/refinery/source] - https://gerrit.wikimedia.org/r/447720 (https://phabricator.wikimedia.org/T188776)
[20:13:55] (CR) Milimetric: [V: 2 C: 2] Add foundation.wikimedia to whitelist [analytics/refinery] - https://gerrit.wikimedia.org/r/446399 (https://phabricator.wikimedia.org/T188776) (owner: Reedy)
[20:13:59] milimetric: this one as well ;)
[20:15:10] Analytics-Kanban, DNS, Operations, Release-Engineering-Team, and 5 others: Move Foundation Wiki to new URL when new Wikimedia Foundation website launches - https://phabricator.wikimedia.org/T188776 (JAllemandou)
[20:18:56] (CR) Joal: "Comment inline" (1 comment) [analytics/aqs] - https://gerrit.wikimedia.org/r/447624 (https://phabricator.wikimedia.org/T200272) (owner: Joal)