[00:05:21] awight: I see, ok, ya, if it is a low-volume stream that is a problem, cause as monitoring is now we will not find out. Do file a ticket, really.
[00:15:47] (PS1) Nuria: Correct overcounting of namespace zero editors [analytics/refinery] - https://gerrit.wikimedia.org/r/547689 (https://phabricator.wikimedia.org/T237072)
[00:26:57] awight: https://grafana.wikimedia.org/d/000000234/kafka-by-topic?refresh=5m&orgId=1&var-datasource=eqiad%20prometheus%2Fops&var-kafka_cluster=jumbo-eqiad&var-kafka_broker=All&var-topic=eventlogging_EventError
[01:02:14] (PS2) Nuria: Correct overcounting of namespace zero editors [analytics/refinery] - https://gerrit.wikimedia.org/r/547689 (https://phabricator.wikimedia.org/T237072)
[04:22:19] PROBLEM - Check the last execution of monitor_refine_sanitize_eventlogging_analytics_delayed on an-coord1001 is CRITICAL: CRITICAL: Status of the systemd unit monitor_refine_sanitize_eventlogging_analytics_delayed https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[04:27:17] PROBLEM - Check the last execution of monitor_refine_sanitize_eventlogging_analytics_immediate on an-coord1001 is CRITICAL: CRITICAL: Status of the systemd unit monitor_refine_sanitize_eventlogging_analytics_immediate https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[04:35:49] iflorez_: i think you want MM instead of mm
[05:38:43] (PS1) Milimetric: Fix monthly insert and publish query [analytics/refinery] - https://gerrit.wikimedia.org/r/547707 (https://phabricator.wikimedia.org/T237072)
[07:41:53] nuria: Great, thanks for the dashboard link!
[07:56:11] Just submitted a patch which should knock out 90% of the errors :D
[08:22:22] Correction to what I was saying about the eventlogging dash in logstash: I was looking at the wrong one. The server error logs were empty (good thing) and the EventErrors I was looking for were in the "eventlogging processor" dash.
[08:25:59] RECOVERY - Check the last execution of monitor_refine_sanitize_eventlogging_analytics_delayed on an-coord1001 is OK: OK: Status of the systemd unit monitor_refine_sanitize_eventlogging_analytics_delayed https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[08:30:51] RECOVERY - Check the last execution of monitor_refine_sanitize_eventlogging_analytics_immediate on an-coord1001 is OK: OK: Status of the systemd unit monitor_refine_sanitize_eventlogging_analytics_immediate https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[09:01:54] (PS1) Awight: Skip reports which start in the future [analytics/reportupdater] - https://gerrit.wikimedia.org/r/547713
[09:02:25] (CR) jerkins-bot: [V: -1] Skip reports which start in the future [analytics/reportupdater] - https://gerrit.wikimedia.org/r/547713 (owner: Awight)
[09:05:43] (PS2) Awight: Skip reports which start in the future [analytics/reportupdater] - https://gerrit.wikimedia.org/r/547713
[09:40:11] mforns: Something else I don't understand. Reportupdater jobs compile daily or weekly data, but they're all on an hourly cron, AFAICT.
[12:13:07] (CR) Joal: [C: +1] "Good catch @nuria!" [analytics/refinery] - https://gerrit.wikimedia.org/r/547689 (https://phabricator.wikimedia.org/T237072) (owner: Nuria)
[12:49:27] @joal: did you get a chance to look at what I wrote about the geoeditors?
[12:51:39] or mforns: I'd like to see if it makes sense, and then deploy and rerun just the bucketed job today
[13:17:04] Hi milimetric - I have read your email, and agree with the findings (the unintuitive row is indeed unintuitive)
[13:18:06] milimetric: I agree with nullifying data for 1-4 for older data, and recomputing for 2019-09 (only month we have left)
[13:18:29] haha, ok, I have the morning busy but if y'all merge the patch I’ll rerun the bucketed
[13:21:28] milimetric: let's wait for nuria before proceeding (most of Europe is off today)
[13:21:47] k
[13:22:30] awight, RU runs hourly but only schedules jobs when they are due; because of the delay param, this can be anytime
[13:23:53] milimetric, I can look into that! Is it OK if I do it later? I'm leaving now to eat outside. Will work 4 hours later today to compensate for Halloween
[13:29:11] mforns: I don't understand why the delay param would cause the job to run at any time rather than at a specific time. Like, I'd think that e.g. delay=3hr means the job is ready to run after 03:00 each day.
[13:30:27] It also makes sense to check hourly cos that guarantees a really robust system where missing outputs or start_date changes cause almost immediate backfilling.
[13:31:00] Not a complaint either way, I was just curious whether it was intentional or not.
[13:32:30] I guess the next rung up the complexity ladder would be a daemon which runs continuously, listening to config file changes, scheduling its own jobs rather than simply testing for ready jobs on each run...
[13:32:45] * awight tiptoes off into the weekend
[13:41:26] mforns: no problem, we’re waiting for nuria anyway
[13:42:14] Analytics, Event-Platform, Operations, CPT Initiatives (Modern Event Platform (TEC2)), and 2 others: Possibly expand Kafka main-{eqiad,codfw} clusters in Q4 2019. - https://phabricator.wikimedia.org/T217359 (herron)
[13:46:05] Analytics, Event-Platform, Operations, CPT Initiatives (Modern Event Platform (TEC2)), and 2 others: Possibly expand Kafka main-{eqiad,codfw} clusters in Q4 2019. - https://phabricator.wikimedia.org/T217359 (herron) Open→Resolved a:herron To circle back on this, we moved forward with...
[13:46:16] Analytics, Event-Platform, Operations, CPT Initiatives (Modern Event Platform (TEC2)), and 2 others: Possibly expand Kafka main-{eqiad,codfw} clusters in Q4 2019. - https://phabricator.wikimedia.org/T217359 (herron)
[13:46:18] Analytics, Operations, Core Platform Team Legacy (Watching / External), Patch-For-Review, and 2 others: Replace and expand kafka main hosts (kafka[12]00[123]) with kafka-main[12]00[12345] - https://phabricator.wikimedia.org/T225005 (herron)
[14:20:17] Heya ottomata - Is today a working day for ya?
[14:34:49] Analytics, WMDE-Analytics-Engineering, Wikidata, Wikidata-Campsite: Track WDQS updater UA in wikidata-special-entitydata grafana dashboard - https://phabricator.wikimedia.org/T218998 (Rosalie_WMDE) a:Rosalie_WMDE
[14:35:24] Analytics, WMDE-Analytics-Engineering, Wikidata, Wikidata-Campsite (Wikidata-Campsite-Iteration-∞): Track WDQS updater UA in wikidata-special-entitydata grafana dashboard - https://phabricator.wikimedia.org/T218998 (Rosalie_WMDE)
[14:57:37] Analytics, WMDE-Analytics-Engineering, Wikidata, Wikidata-Campsite (Wikidata-Campsite-Iteration-∞): Track WDQS updater UA in wikidata-special-entitydata grafana dashboard - https://phabricator.wikimedia.org/T218998 (Rosalie_WMDE)
[14:58:49] (CR) Nuria: "Abandoning in favor of https://gerrit.wikimedia.org/r/#/c/analytics/refinery/+/547707" [analytics/refinery] - https://gerrit.wikimedia.org/r/547689 (https://phabricator.wikimedia.org/T237072) (owner: Nuria)
[14:58:55] (Abandoned) Nuria: Correct overcounting of namespace zero editors [analytics/refinery] - https://gerrit.wikimedia.org/r/547689 (https://phabricator.wikimedia.org/T237072) (owner: Nuria)
[15:01:37] Analytics, Analytics-Kanban, Patch-For-Review: Correct namespace zero editor counts on geoeditors_monthly table on hive and druid - https://phabricator.wikimedia.org/T237072 (Nuria) Actually given that in druid we do not index the namespace zero field we do not need to reindex: https://bit.ly/2WzpqaP
[15:14:34] (CR) Nuria: [C: -1] Fix monthly insert and publish query (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/547707 (https://phabricator.wikimedia.org/T237072) (owner: Milimetric)
[15:17:54] Analytics, Analytics-Kanban, Operations, Traffic, observability: Publish tls related info to webrequest via varnish - https://phabricator.wikimedia.org/T233661 (Nuria) >Hmm maybe it could make sense to store some TLS data like the ciphersuite, version or elliptic curve as integers assuming th...
[15:20:47] Analytics, Better Use Of Data, Event-Platform, Product-Infrastructure-Team-Backlog, Epic: Event Platform Client Libraries - https://phabricator.wikimedia.org/T228175 (jlinehan)
[15:24:36] Analytics, Better Use Of Data, Event-Platform, Product-Infrastructure-Team-Backlog, Epic: Event Platform Client Libraries - https://phabricator.wikimedia.org/T228175 (jlinehan) I updated the description and moved the many tiny confusing subtasks to {T237106}.
[15:30:17] Analytics, WMDE-Analytics-Engineering, WMDE-FUN-Funban-2019, WMDE-FUN-Sprint-2019-10-14, WMDE-New-Editors-Banner-Campaigns (Banner Campaign Autumn 2019): Implement banner design for WMDEs autum new editor recruitment campaign - https://phabricator.wikimedia.org/T235845 (AndyRussG) >>! In T235...
[15:32:52] Analytics, Analytics-Kanban, Operations, Traffic, observability: Publish tls related info to webrequest via varnish - https://phabricator.wikimedia.org/T233661 (BBlack) @Nuria - what you're asking for is something like a combined TLS field with separators? e.g. we construct a 4-part semicolon-...
[15:43:37] Analytics, Analytics-Kanban, Operations, Traffic, observability: Publish tls related info to webrequest via varnish - https://phabricator.wikimedia.org/T233661 (Nuria) @BBlack Varnish would send items to varnishkafka similar to how it is done for X-Analytics: https://github.com/wikimedia/puppe...
[15:56:56] Analytics, Analytics-Kanban, Operations, Traffic, observability: Publish tls related info to webrequest via varnish - https://phabricator.wikimedia.org/T233661 (BBlack) We probably don't need to send the reused value (it's not that useful for analysis at this level, IMHO), and we don't need t...
[15:59:48] (PS2) Fdans: Escape double quotes in file urls [analytics/refinery] - https://gerrit.wikimedia.org/r/546882
[16:00:34] a-team am I the only one in today?
[16:00:42] fdans, coming!
[16:08:25] Analytics, Analytics-Kanban, Operations, Traffic, observability: Publish tls related info to webrequest via varnish - https://phabricator.wikimedia.org/T233661 (Nuria) >Also I'm assuming from how X-Analytics is set up that the format is k1=v1;k2=v2;.... (equal sign rather than colon). yes, co...
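Editor's sketch: the `k1=v1;k2=v2;...` format discussed in the T233661 comments above (semicolon-separated pairs, equal sign between key and value) can be split into a dictionary with a few lines of Python. This is only an illustration of the format as described in the chat; the function name and the TLS key names in the example are hypothetical, not taken from the ticket or from varnishkafka.

```python
from typing import Dict


def parse_kv_field(value: str) -> Dict[str, str]:
    """Parse a 'k1=v1;k2=v2;' style string into a dict.

    Hypothetical helper: the name and the tolerance rules below are
    assumptions made for illustration, not the production parser.
    """
    fields: Dict[str, str] = {}
    for pair in value.split(";"):
        pair = pair.strip()
        if not pair:
            continue  # tolerate a trailing or doubled semicolon
        key, sep, val = pair.partition("=")
        if not sep:
            continue  # skip fragments that have no '=' at all
        fields[key.strip()] = val.strip()
    return fields


# Hypothetical key names, chosen only to show the shape of the data.
example = "vers=TLSv1.3;ciph=AES-256-GCM;"
print(parse_kv_field(example))
```

Using `partition` rather than `split("=")` keeps any further equal signs inside the value intact, which seems the safer choice for a field whose values are not fully specified in the discussion.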
[16:09:57] PROBLEM - eventlogging Varnishkafka log producer on cp5011 is CRITICAL: PROCS CRITICAL: 0 processes with args /usr/bin/varnishkafka -S /etc/varnishkafka/eventlogging.conf https://wikitech.wikimedia.org/wiki/Analytics/Systems/Varnishkafka
[16:10:27] PROBLEM - statsv Varnishkafka log producer on cp5011 is CRITICAL: PROCS CRITICAL: 0 processes with args /usr/bin/varnishkafka -S /etc/varnishkafka/statsv.conf https://wikitech.wikimedia.org/wiki/Analytics/Systems/Varnishkafka
[16:11:09] PROBLEM - Webrequests Varnishkafka log producer on cp5011 is CRITICAL: PROCS CRITICAL: 0 processes with args /usr/bin/varnishkafka -S /etc/varnishkafka/webrequest.conf https://wikitech.wikimedia.org/wiki/Analytics/Systems/Varnishkafka
[16:11:48] Analytics, Analytics-Kanban, Operations, Traffic, observability: Update webrequest_128 dataset in turnilo to include TLS fields once available - https://phabricator.wikimedia.org/T237117 (Nuria)
[16:12:47] RECOVERY - Webrequests Varnishkafka log producer on cp5011 is OK: PROCS OK: 1 process with args /usr/bin/varnishkafka -S /etc/varnishkafka/webrequest.conf https://wikitech.wikimedia.org/wiki/Analytics/Systems/Varnishkafka
[16:13:13] RECOVERY - eventlogging Varnishkafka log producer on cp5011 is OK: PROCS OK: 1 process with args /usr/bin/varnishkafka -S /etc/varnishkafka/eventlogging.conf https://wikitech.wikimedia.org/wiki/Analytics/Systems/Varnishkafka
[16:13:41] RECOVERY - statsv Varnishkafka log producer on cp5011 is OK: PROCS OK: 1 process with args /usr/bin/varnishkafka -S /etc/varnishkafka/statsv.conf https://wikitech.wikimedia.org/wiki/Analytics/Systems/Varnishkafka
[16:14:30] the cp5011 alerts are due to maintenance, see #operations :)
[16:14:35] Cc: fdans --^
[16:15:02] ack elukey, thank you
[16:16:18] Analytics, Analytics-Kanban, Product-Analytics, Patch-For-Review: Create a reports directory under analytics.wikimedia.org - https://phabricator.wikimedia.org/T235494 (Nuria) Looks like dashboards are fixed: https://pingback.wmflabs.org/#unique-wiki-count
[16:20:54] (PS1) Fdans: Add python script to generate intervals for long backfilling [analytics/refinery] - https://gerrit.wikimedia.org/r/547750
[16:23:34] Analytics, Analytics-Kanban, Product-Analytics, Patch-For-Review: Create a reports directory under analytics.wikimedia.org - https://phabricator.wikimedia.org/T235494 (CCicalese_WMF) Thank you, they are indeed working again. It seems that https://gerrit.wikimedia.org/r/c/analytics/reportupdater-q...
[16:24:44] Analytics, Analytics-Kanban, Multimedia, Tool-Pageviews: Create script that returns oozie time intervals every time a coordinator is started from a cron job - https://phabricator.wikimedia.org/T237119 (fdans)
[16:24:52] Analytics, Analytics-Kanban, Product-Analytics, Patch-For-Review: Create a reports directory under analytics.wikimedia.org - https://phabricator.wikimedia.org/T235494 (Nuria) >Is there anything that needs to happen past merging that patch that will enable it for the dashboard? Yes, the reports ne...
[16:25:00] (PS2) Fdans: Add python script to generate intervals for long backfilling [analytics/refinery] - https://gerrit.wikimedia.org/r/547750 (https://phabricator.wikimedia.org/T237119)
[16:28:48] (CR) Nuria: Add python script to generate intervals for long backfilling (2 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/547750 (https://phabricator.wikimedia.org/T237119) (owner: Fdans)
[16:29:33] nuria: sorry! meant to add wip, I'm testing it still
[16:29:35] Analytics, Analytics-Kanban, Product-Analytics, Patch-For-Review: Create a reports directory under analytics.wikimedia.org - https://phabricator.wikimedia.org/T235494 (CCicalese_WMF) Cool, thanks. I couldn't remember how often they run. The patch was merged yesterday, but I guess since the dashbo...
[16:32:41] (CR) Nuria: "Please see an example of tests inline in file:" [analytics/refinery] - https://gerrit.wikimedia.org/r/547750 (https://phabricator.wikimedia.org/T237119) (owner: Fdans)
[17:49:50] nuria: It seems like the 2016 dataset does not include entries of content_type = "text/...". Am I mistaken/why is this?
[17:50:29] lexnasser: right, we only released images, will do the same now as it simplifies the data and makes it more pertinent
[17:50:55] how about "application/..." types? I found a few of those in 2016
[17:51:00] lexnasser: text pages are evicted from cache cause someone has, say, edited a page, so using images for cache modeling makes more sense
[18:33:38] Analytics, Growth-Team, Product-Analytics: Growth: implement wider data purge window - https://phabricator.wikimedia.org/T237124 (nettrom_WMF)
[18:35:38] Analytics, Growth-Team, GrowthExperiments, Product-Analytics: Homepage: ensure data retention is in line with the guideline exception - https://phabricator.wikimedia.org/T235577 (nettrom_WMF) What happens here is most likely related to T237124.
[18:50:47] Analytics, Growth-Team, GrowthExperiments, Product-Analytics: Homepage: ensure data retention is in line with the guideline exception - https://phabricator.wikimedia.org/T235577 (MMiller_WMF) Also related to {T230072}.
[19:25:22] Analytics, Analytics-Cluster, Operations, ops-eqiad: analytics1062 lost one of its power supplies - https://phabricator.wikimedia.org/T237133 (Dzahn)
[19:30:54] Analytics, Analytics-Kanban, Patch-For-Review: Correct namespace zero editor counts on geoeditors_monthly table on hive and druid - https://phabricator.wikimedia.org/T237072 (Milimetric) Plan to correct, with draft of scripts needed: * null out namespace zero distinct editors for the 1 to 4 activit...
[19:31:45] Analytics, Analytics-Kanban, Patch-For-Review: Correct namespace zero editor counts on geoeditors_monthly table on hive and druid - https://phabricator.wikimedia.org/T237072 (Nuria) Correcting what we need to do, we need to empty column for zero_namespace_editors but only for the 1-4 bucket, the oth...
[19:40:30] Analytics, Analytics-Kanban, Patch-For-Review: Correct namespace zero editor counts on geoeditors_monthly table on hive and druid - https://phabricator.wikimedia.org/T237072 (JAllemandou) Great plan @Milimetric! One precision: We only have geoeditors_daily up to 2019-09, so we can backfill from the...
[19:42:56] Analytics, Analytics-Kanban, Product-Analytics, Patch-For-Review: Correct namespace zero editor counts on geoeditors_monthly table on hive and druid - https://phabricator.wikimedia.org/T237072 (Nuria)
[19:55:30] Hey folks. I'm looking into some weird activity on ORES right now. It just started in the last hour.
[19:55:39] Would I find data from the last hour in the "webrequest" table?
[20:05:15] halfak: no, data is delayed by 2 hours or so
[20:07:19] halfak: last hour is hdfs://analytics-hadoop/wmf/data/wmf/webrequest/webrequest_source=text/year=2019/month=11/day=1/hour=18
[20:07:31] Gotcha. Thanks.
[20:07:34] halfak: so, ya, 2hrs
[20:15:35] Analytics, Analytics-Kanban, Product-Analytics, Patch-For-Review: Correct namespace zero editor counts on geoeditors_monthly table on hive and druid - https://phabricator.wikimedia.org/T237072 (Nuria) We do not need to delete the druid datasource, re-indexing should be sufficient. Let's amend pl...
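Editor's sketch: the webrequest exchange above (20:05 to 20:07) implies that the newest available hourly partition lags the clock by roughly two hours. The Python below rebuilds the path quoted in the chat from a timestamp. The path layout is copied from the 20:07:19 message; the function name, the fixed two-hour default, and the assumption that log timestamps and partition hours share a timezone are all illustrative guesses, since the chat only says "2 hours or so".

```python
from datetime import datetime, timedelta

# Base path as quoted in the chat; everything else here is an assumption.
WEBREQUEST_BASE = "hdfs://analytics-hadoop/wmf/data/wmf/webrequest"


def latest_webrequest_partition(now: datetime,
                                source: str = "text",
                                delay_hours: int = 2) -> str:
    """Guess the newest hourly partition, assuming a fixed delay.

    Hypothetical helper: the real delay varies, so treat the result
    as a starting point and check that the directory exists.
    """
    ts = now - timedelta(hours=delay_hours)
    # Partition values are not zero-padded in the quoted path (day=1).
    return (
        f"{WEBREQUEST_BASE}/webrequest_source={source}"
        f"/year={ts.year}/month={ts.month}/day={ts.day}/hour={ts.hour}"
    )


# The timestamp of the chat message that quoted the hour=18 path.
print(latest_webrequest_partition(datetime(2019, 11, 1, 20, 7)))
```

Subtracting a `timedelta` rather than decrementing the hour field handles the rollover across midnight and month boundaries without extra cases.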
[20:38:29] Analytics, Analytics-EventLogging, MediaWiki-ContentHandler: Allow YAML as an alternative for JSON on MediaWiki pages - https://phabricator.wikimedia.org/T237136 (Tgr)
[21:09:37] Analytics, Analytics-Kanban, Product-Analytics, Patch-For-Review: Create a reports directory under analytics.wikimedia.org - https://phabricator.wikimedia.org/T235494 (Nuria) @CCicalese_WMF mmm, looking at report times i think last time these were updated was October 6th, pinging @mforns to make...
[21:21:04] Analytics, Cloud-Services, Developer-Advocacy (Oct-Dec 2019): Setup Config:Dashiki:WMCSEdits on meta wiki - https://phabricator.wikimedia.org/T236223 (Nuria) > (wikis in columns and not in rows)? Is that right, or is there any workaround for this? That's correct, expectation is