[04:12:19] Analytics, Analytics-Kanban, Product-Analytics: Technical contributors metrics definition - https://phabricator.wikimedia.org/T247419 (jwang) @Nuria @bmueller, here is the trend of quarterly submitter and contribution. The trend is flat. The abnormal number of 2018Q3 is due to typo. I have made it co...
[04:37:07] (CR) Nuria: [C: -1] "i think this needs a bit of cleanup to respect basic java conventions, like the list/arraylist specification but also the definition of ex" (6 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/582131 (https://phabricator.wikimedia.org/T240985) (owner: Ottomata)
[04:38:30] Analytics, Analytics-Kanban, Event-Platform: Event schemas common schema should set additionalProperties: false - https://phabricator.wikimedia.org/T248173 (Nuria) Open→Resolved
[04:38:57] Analytics, Analytics-Kanban, Event-Platform, User-Elukey: Documentation improvements for Eventstreams - https://phabricator.wikimedia.org/T240181 (Nuria) Open→Resolved
[04:38:59] Analytics, User-Elukey: Redesign architecture of irc-recentchanges on top of Kafka - https://phabricator.wikimedia.org/T234234 (Nuria)
[04:39:14] Analytics, Analytics-Kanban, Patch-For-Review, User-Elukey: Add Prometheus Presto metrics and dashboards - https://phabricator.wikimedia.org/T247884 (Nuria) Open→Resolved
[04:54:18] Analytics, Analytics-Kanban, Product-Analytics, Inuka-Team (Kanban), Patch-For-Review: Set up pageview counting for KaiOS app - https://phabricator.wikimedia.org/T244547 (Nuria) >Do you see any problem with us always sending https://www.wikipedia.org in the refrerer header to ensure the refer...
[05:15:27] Analytics, Analytics-Kanban, Pageviews-API: Add wikimania.wikimedia.org to pageview definition - https://phabricator.wikimedia.org/T216525 (Nuria) Open→Resolved
[06:19:11] hello team
[06:48:30] good morning :
[06:48:31] :)
[07:01:43] (PS1) Fdans: Instrument event tracking in TopicExplorer [analytics/wikistats2] - https://gerrit.wikimedia.org/r/583521 (https://phabricator.wikimedia.org/T247106)
[07:12:57] Morning elukey
[07:13:03] :)
[07:14:10] elukey: you probably want the logs link to be http://wm-bot.wmflabs.org/browser/index.php?display=%23wikimedia-analytics
[07:14:16] In the topic
[07:18:53] nope wrong :)
[07:19:40] RhinosF1: done! should be better, thanks
[07:21:00] Np elukey
[07:58:59] Analytics, Analytics-Kanban, Product-Analytics, Inuka-Team (Kanban), Patch-For-Review: Set up pageview counting for KaiOS app - https://phabricator.wikimedia.org/T244547 (nshahquinn-wmf) >>! In T244547#6000890, @Nuria wrote: >>Do you see any problem with us always sending https://www.wikipedi...
[09:09:15] !log re-running manually webrequest-load upload 26/03/2020T08 - kerberos failures
[09:09:16] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[09:09:26] super weird
[09:12:26] Analytics, Analytics-Kanban, Patch-For-Review, User-Elukey: Investigate sporadic failures in oozie hive actions due to Kerberos auth - https://phabricator.wikimedia.org/T241650 (elukey) Resolved→Open p:High→Medium
[09:12:42] Analytics, Analytics-Kanban, Patch-For-Review, User-Elukey: Investigate sporadic failures in oozie hive actions due to Kerberos auth - https://phabricator.wikimedia.org/T241650 (elukey) Reopened, since today a webrequest load failed due to the same issue..
[09:12:59] sigh
[09:20:14] Analytics, Analytics-EventLogging, Product-Analytics: EventLogging does not properly classify KaiOS user agents - https://phabricator.wikimedia.org/T248560 (nshahquinn-wmf)
[10:10:55] Analytics, Operations, Product-Analytics, SRE-Access-Requests: Hive access - https://phabricator.wikimedia.org/T248097 (Volans) a:spatton @spatton: gentle reminder for the above request.
[11:47:54] awight: o/
[11:49:01] :-) hi
[11:50:23] Analytics: GPUs are not correctly handling multitasking - https://phabricator.wikimedia.org/T248574 (Miriam)
[11:50:25] awight: how are things? I bother you a second to ask if you saw my email about refine failure
[11:50:40] ooh! lemme look for that
[11:51:09] I see, thanks for the ping.
[11:51:16] np :)
[11:51:32] I think it should be just a matter of fixing the schema, then I can re-run the refine jobs
[11:52:15] Is there a way to hide details at a certain point, or do I need to create a 100% complete JSON-schema?
[11:52:34] ... I can pass JSON-encoded strings otherwise
[11:54:06] awight: I think that the schema needs to be precise, but I am not authoritative enough for json schema :D
[11:54:36] elukey: Do you think it's worthwhile to run a json-schema validator on these, before allowing a new version to be saved? Might avoid some production errors in the future...
[11:55:40] awight: this is what refine currently does, the "raw" data in this case fails to comply with the schema (this is a borderline use case since the schema itself is not well formed)
[11:55:47] we can re-run refine anytime on raw data
[11:56:00] the script fetches the content of meta.w.o
[11:56:06] and then tries to validate etc..
[11:56:15] (not sure if I got your question though)
[11:56:22] I mean, to validate the schema itself from Extension:EventLogging so that the Schema:X page cannot be broken
[11:56:38] There's already basic JSON validation, but we could do full json-schema validation at save time, in theory.
[11:56:49] ahh!
[11:56:56] you mean when you save the schema
[11:57:03] exactly
[11:57:18] got it, might be really good indeed
[11:57:57] cool, I'll make a task (or see if one exists already)
[11:58:05] ack :)
[11:58:21] * elukey interview :)
[12:01:58] elukey: np! Also, I just realized that I can make the fixes to the schema but will have to wait until next Monday to deploy the code to use the new schema version. Is that blocking anyone else, or just my own data?
[12:51:27] awight: only your data I think
[12:51:43] =)
[12:51:59] I mean we get some alarms but we can shut them off and/or wait
[12:53:32] We've merged a patch pointing to the fixed schema, so I guess this will go out with the train next week.
[13:05:57] awight: I am back
[13:06:04] one qs - did you update https://meta.wikimedia.org/w/index.php?title=Schema%3ATwoColConflictExit ?
[13:06:28] not seeing it from the history but the array seems good now, maybe I am missing something
[13:06:49] anyway, in theory if I just re-refine it should work now, since our tool will fetch the new schema
[13:07:08] ah but the event will specify a version ok
[13:07:13] right right now I got it
[13:09:04] Analytics: GPUs are not correctly handling multitasking - https://phabricator.wikimedia.org/T248574 (elukey) We are still not sure what the issue is, but we decided to upgrade stat1008 according to T247082 to have the last ROCm upstream version before contacting the devs.
[13:09:20] all right going to have lunch now, ttl!
[13:16:32] elukey: Exactly, next week's code will use a new schema which includes https://meta.wikimedia.org/w/index.php?oldid=19905420&diff=19928709
[13:27:58] I found an outstanding task to validate Schema JSON from a MediaWiki save hook: T76432
[13:27:58] T76432: Validate JsonSchemaContent using MediaWIki core's handling - https://phabricator.wikimedia.org/T76432
[13:33:25] Analytics, patch-welcome: Validate JsonSchemaContent using MediaWIki core's handling - https://phabricator.wikimedia.org/T76432 (awight) I ran across this today, I believe. [[ https://meta.wikimedia.org/w/index.php?title=Schema:TwoColConflictExit&oldid=19905420 | This revision ]] should have triggered a...
[13:34:09] (sorry for the spam!)
[13:45:14] Anyone: does the 'time_firstbyte' field in wmf.webrequest indicate "time to first byte of server response"? Thanks.
[13:46:37] GoranSM: that's correct!
[13:46:56] that's the time it took from when the req was first received by varnish until it started sending response bytes back to the client
[13:47:12] ottomata: Thank you!
[13:49:06] ottomata: edited the `time_firstbyte` field in the docs: https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Traffic/Webrequest#Current_Schema
[13:57:05] ty
[14:51:28] Analytics, Event-Platform, serviceops, Patch-For-Review, Wikimedia-production-error: Lots of "EventBus: Unable to deliver all events" - https://phabricator.wikimedia.org/T247484 (Joe) Open→Resolved >>! In T247484#5999552, @Ottomata wrote: > Checking in, how goes? We're now down to mu...
[14:53:19] isaacj: Heya - do you want to chat about the data doc? mforns and me are reviewing it :)
[14:54:51] joal: yeah -- i'm unfortunately about to hop in a meeting in six minutes that's probably at least 45 minutes. if your day is ending, totally understand. otherwise i'm free after thatt
[14:55:10] thanks for the feedback too - this is great!
[14:55:11] ping when you're back isaacj :)
[14:55:16] sounds good - thanks!
[15:19:58] Analytics, Analytics-Kanban, EventStreams, Operations, and 2 others: EventStreams drops the connection after 15 minutes, which makes it unreliable - https://phabricator.wikimedia.org/T242767 (stjn) Open→Resolved Frequent disconnects stopped after 25th March, 15:30 UTC, so yes. Thank you f...
[15:41:15] Analytics, Analytics-Kanban, Release Pipeline, Patch-For-Review, and 2 others: Migrate EventStreams to k8s deployment pipeline - https://phabricator.wikimedia.org/T238658 (akosiaris) >>! In T238658#5999287, @Ottomata wrote: > Ah we need to merge https://gerrit.wikimedia.org/r/c/operations/puppet/...
[15:55:32] PROBLEM - Check if active EventStreams endpoint is delivering messages. on icinga1001 is CRITICAL: CRITICAL: No EventStreams message was consumed from https://stream.wikimedia.org/v2/stream/recentchange within 10 seconds. https://wikitech.wikimedia.org/wiki/Event_Platform/EventStreams/Administration
[15:59:48] hmm ^ that is strange
[16:01:49] ping mforns
[16:11:40] ok joal, i'm back. if you see this message and are still around to meet, let me know
[16:12:17] isaacj: in meeting, ping after
[16:12:33] :thumbs up:
[16:25:12] Quarry: Quarry should warn users about space->underscore transformations - https://phabricator.wikimedia.org/T248599 (RoySmith)
[16:25:36] RECOVERY - Check if active EventStreams endpoint is delivering messages. on icinga1001 is OK: OK: An EventStreams message was consumed from https://stream.wikimedia.org/v2/stream/recentchange within 10 seconds. https://wikitech.wikimedia.org/wiki/Event_Platform/EventStreams/Administration
[16:27:08] I am running release:perform on the refinery/source repository
[16:27:27] trying to get it to complete while running inside a container for T210271
[16:27:28] T210271: Migrate analytics/refinery/source release jobs to Docker - https://phabricator.wikimedia.org/T210271
[16:27:30] it is progressing at https://integration.wikimedia.org/ci/job/analytics-refinery-maven-release-docker/31/console
[16:27:33] joal: ^ :)
[16:27:53] awesome hashar
[16:28:06] hashar: if you need repo being cleaned up let us know
[16:33:58] Analytics, Analytics-Kanban, Release-Engineering-Team-TODO, Continuous-Integration-Infrastructure (phase-out-jessie), and 2 others: Migrate analytics/refinery/source release jobs to Docker - https://phabricator.wikimedia.org/T210271 (hashar) Instead of using `ssh-agent` and doing Gerrit access ov...
[16:34:10] Failed to execute goal org.apache.maven.plugins:maven-release-plugin:2.5.1:prepare (default-cli) on project refinery: An error is occurred in the checkin process: Exception while executing SCM command. Detecting the current branch failed: fatal: ref HEAD is not a symbolic ref -> [Help 1]
[16:34:11] hehe
[16:34:34] Analytics, Research: covid19 data preservation - https://phabricator.wikimedia.org/T248600 (Nuria)
[16:35:10] Analytics, Research: covid19 data preservation - https://phabricator.wikimedia.org/T248600 (Nuria) https://docs.google.com/document/d/1WMx847RYGdigdlh4OerA7jRZXwsQbWSaV_xBHeP5Xzs/edit?ts=5e7ac26b#
[16:37:51] will dig into that
[16:38:18] weird :)
[16:46:28] yeah well it is in detached state
[16:46:32] I just need to checkout to a branch
[16:46:33] ;)
[16:54:46] Quarry: Quarry should warn users about space->underscore transformations - https://phabricator.wikimedia.org/T248599 (bd808) https://www.mediawiki.org/wiki/Manual:Title.php#Canonical_forms
[17:00:50] 2nd try https://integration.wikimedia.org/ci/job/analytics-refinery-maven-release-docker/32/console ;)
[17:00:57] isaacj: heya - we're waiting for you with mforns in da cave if you have time now
[17:01:08] https://meet.google.com/rxb-bjxn-nip?authuser=1
[17:01:10] sure thing
[17:39:57] * elukey afk for a bit!
[18:03:17] (PS1) Hashar: Attempt to allow developerConnection override [analytics/refinery/source] - https://gerrit.wikimedia.org/r/583714
[18:09:57] and trying yet another build
[18:11:18] (CR) Hashar: "That is merely for testing, I have triggered a release:perform build at https://integration.wikimedia.org/ci/job/analytics-refinery-maven-" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/583714 (owner: Hashar)
[18:12:29] Analytics, Analytics-Kanban, Release-Engineering-Team-TODO, Continuous-Integration-Infrastructure (phase-out-jessie), and 2 others: Migrate analytics/refinery/source release jobs to Docker - https://phabricator.wikimedia.org/T210271 (hashar) I have instructed Maven release plugin to use https wit...
[18:15:56] (CR) Hashar: "That seems to work now by passing:" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/583714 (owner: Hashar)
[18:16:12] still fails but I made some progress
[18:16:15] I am off for today
[18:18:13] Analytics, Analytics-Cluster, Analytics-Kanban, User-Elukey: Upgrade the Hadoop test cluster to BigTop - https://phabricator.wikimedia.org/T244499 (elukey) The first attempt of rollback was a disaster, I wasn't able to restore HDFS to its previous state. From the documentation it seemed possible...
[18:19:50] * elukey off!
[18:27:34] PROBLEM - Check the last execution of refinery-drop-webrequest-refined-partitions on an-coord1001 is CRITICAL: NRPE: Command check_check_refinery-drop-webrequest-refined-partitions_status not defined https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[18:42:32] Analytics, DC-Ops, Operations, ops-eqiad, Patch-For-Review: (Need by: TBD) rack/setup/install an-druid100[12] and druid100[78] - https://phabricator.wikimedia.org/T245569 (Cmjohnson) a:Jclark-ctr→Cmjohnson
[19:08:22] Analytics, Analytics-EventLogging, Event-Platform, Patch-For-Review, Services (watching): Switch all eventgate clients to use new TLS port - https://phabricator.wikimedia.org/T242224 (Ottomata) @Joe can we call this one done too now?
[19:19:49] Analytics, Technical-blog-posts: Techblog. Change url shape to better measure pageviews - https://phabricator.wikimedia.org/T248614 (Nuria)
[19:59:22] Analytics, Analytics-Kanban, Release-Engineering-Team-TODO, Continuous-Integration-Infrastructure (phase-out-jessie), and 2 others: Migrate analytics/refinery/source release jobs to Docker - https://phabricator.wikimedia.org/T210271 (hashar) Finally a successful build! https://integration.wikime...
[21:49:17] (PS6) Ottomata: Support multiple possible schema base URIs in EventSchemaLoader [analytics/refinery/source] - https://gerrit.wikimedia.org/r/582131 (https://phabricator.wikimedia.org/T240985)
[21:49:31] (CR) Ottomata: Support multiple possible schema base URIs in EventSchemaLoader (5 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/582131 (https://phabricator.wikimedia.org/T240985) (owner: Ottomata)
[21:50:39] Analytics, Event-Platform, Multimedia: Mediaviewer views should be reworked to be an eventlogging event - https://phabricator.wikimedia.org/T239630 (Milimetric) ::gasps:: Amazing. So much awesome will come from this incremental change.
[21:53:01] Analytics, Event-Platform, Multimedia: Mediaviewer views should be reworked to be an eventlogging event - https://phabricator.wikimedia.org/T239630 (Ottomata) We had a meeting on Wednesday where we decided we want to work on session length metrics via a session ping event first.
[21:56:21] Analytics: Spike: look at how old Pageview API data is accessed - https://phabricator.wikimedia.org/T247539 (Milimetric) Expected except for the weird 10% requesting daily data from 2015. That's the one thing that would keep us from reducing historical resolution. It would be nice to isolate that use case...