[00:28:00] 10Analytics, 10Pageviews-API: Yearly endpoint for the /pageviews/top API - https://phabricator.wikimedia.org/T154381 (10MusikAnimal)
[11:08:16] !log restart hue on analytics-tool1001 to pick up some new changes (should be a no-op)
[11:08:18] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[11:31:02] * elukey lunch!
[11:54:09] 10Analytics, 10Product-Analytics, 10Research, 10WMDE-Analytics-Engineering, and 2 others: Replace the current multisource analytics-store setup - https://phabricator.wikimedia.org/T172410 (10Marostegui) 05Open→03Resolved Closing as this is done
[12:43:17] Hi
[12:43:30] At the bottom of https://grafana.wikimedia.org/d/000000598/content-translation?orgId=1 there's the "Published translations with high amount of unreviewed MT" table. Can I get a JSON file with the values of this table for a particular date using some kind of a web API?
[12:46:51] a-team ^
[12:54:55] Hi team - Lino stays home today with me for fever - Only news from me is that sqoop succeeded in 18h this weekend
[13:03:52] elukey, joal ^
[13:03:57] any idea about Grafana?
[13:04:15] Someone here mentioned a web API for something like this, but I cannot find the details.
[13:24:28] 10Analytics, 10Product-Analytics, 10MW-1.33-notes (1.33.0-wmf.21; 2019-03-12), 10Patch-For-Review: Standardize datetimes/timestamps in the Data Lake - https://phabricator.wikimedia.org/T212529 (10Ottomata) Hm. But, also because the Hive/ANSI SQL (thanks for correction Neil, didn't realize that :) ) format...
[13:44:54] /away
[13:45:11] here I am :)
[13:45:17] aharoni: o/
[13:45:53] IIRC we already discussed the graphite API no?
[13:51:29] yep it was earlier in the month https://bots.wmflabs.org/browser/index.php?start=03%2F04%2F2019&end=03%2F04%2F2019&display=%23wikimedia-analytics
[13:53:55] so the graphite URL for that metric is https://graphite.wikimedia.org/S/2 (it renders as a regular graph, the table is done via grafana)
[13:54:13] full link is https://graphite.wikimedia.org/render/?width=586&height=308&target=MediaWiki.cx.publish.highmt.*.sum
[13:54:20] now we need to add the rendering in json
[13:54:31] appending &format=json
[13:55:15] that becomes (removing also width/height since it is not important)
[13:55:15] https://graphite.wikimedia.org/render/?target=MediaWiki.cx.publish.highmt.*.sum&format=json
[13:55:29] that should be the past 24h worth of datapoints
[13:57:01] elukey: OK, but: 1. this is a rendered chart, and I'd like the JSON data. 2. Can I get it for a particular date in the past?
[13:57:06] if you add "&from=-3hours" it shows only the past three hours
[13:57:25] aharoni: did you open my last link?
[13:57:35] Oh, missed it. Let me see...
[14:00:46] ah, it was a bit slow, but the data seems useful. trying with -3hours.
[14:00:54] can I also give it a specific date, instead of relative?
[14:01:12] something like 20190224?
[14:03:01] aharoni: from https://graphite-api.readthedocs.io/en/latest/api.html#the-render-api-render it seems possible (check RELATIVE vs ABSOLUTE time)
[14:06:29] aharoni: in theory from/until can be used with dates
[14:07:10] aharoni: please also be careful in querying data, don't pull too much at once without testing with lower time windows first :)
[14:17:55] elukey: ah, very good. I'll try.
[14:17:59] thank you
[14:21:20] 10Analytics, 10EventBus, 10Operations, 10Core Platform Team (Modern Event Platform (TEC2)), and 3 others: Possibly expand Kafka main-{eqiad,codfw} clusters in Q4 2019. - https://phabricator.wikimedia.org/T217359 (10herron) According to netbox support for hosts `kafka[12]00[123]` expired in Dec 2018. After...
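[editor's note] The Graphite render-API queries discussed above (appending &format=json, and relative vs absolute from/until) can be sketched as a small script. This is a hedged sketch only: the endpoint and metric name come from the conversation, while `render_url` and `totals_by_series` are hypothetical helpers, not part of any Wikimedia tooling.

```python
# Sketch of querying the Graphite render API for JSON, per the thread above.
# The JSON response is a list of {"target": ..., "datapoints": [[value, ts], ...]}
# per the Graphite render API docs; error handling is elided.
import json
from urllib.parse import urlencode
from urllib.request import urlopen

RENDER = "https://graphite.wikimedia.org/render/"

def render_url(target, frm=None, until=None):
    """Build a render URL returning JSON; from/until may be relative
    (e.g. '-3hours') or absolute (e.g. '20190201')."""
    params = [("target", target), ("format", "json")]
    if frm:
        params.append(("from", frm))
    if until:
        params.append(("until", until))
    return RENDER + "?" + urlencode(params)

def totals_by_series(series_list):
    """Sum the non-null datapoints of each series, keyed by target name."""
    return {
        s["target"]: sum(v for v, _ts in s["datapoints"] if v is not None)
        for s in series_list
    }

# Usage (network call left commented out; keep the time window small):
# data = json.load(urlopen(render_url("MediaWiki.cx.publish.highmt.*.sum",
#                                     frm="20190201", until="20190202")))
# print(totals_by_series(data))
```

Summing per series is one way to turn the per-interval datapoints into "how many times was this event logged per language" as asked later in the log, assuming the `.sum` metric counts events per interval.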
[14:25:10] nuria: can you give me publish permissions for https://www.npmjs.com/package/node-rdkafka-statsd ?
[14:25:23] 10Analytics, 10EventBus, 10Operations, 10Core Platform Team (Modern Event Platform (TEC2)), and 3 others: Possibly expand Kafka main-{eqiad,codfw} clusters in Q4 2019. - https://phabricator.wikimedia.org/T217359 (10elukey) In the SRE spreadsheet I can see that the suggested replacement FY is 20/21, not the...
[14:36:40] 10Analytics, 10Discovery, 10Operations, 10Research: Workflow to be able to move data files computed in jobs from analytics cluster to production - https://phabricator.wikimedia.org/T213976 (10fgiunchedi) >>! In T213976#5028623, @Nuria wrote: > Ping @fgiunchedi about putting this as a commong goal next quar...
[14:58:24] 10Analytics, 10Discovery, 10Operations, 10Research: Workflow to be able to move data files computed in jobs from analytics cluster to production - https://phabricator.wikimedia.org/T213976 (10Nuria) @fgiunchedi sounds good, will try to set up short 30 min meeting?
[14:59:22] joal: Hi, I was looking for a volunteer task and wondering if https://phabricator.wikimedia.org/project/view/11/ filtered by "good first bug" is the recommended approach?
[15:00:42] ping ottomata elukey joal
[15:00:45] standup
[15:01:43] ping ottomata standdduppp
[15:07:49] OGH NO
[15:18:46] 10Analytics, 10Analytics-Data-Quality, 10Product-Analytics: Some registered users have null values for event_user_text and event_user_text_historical in mediawiki_history - https://phabricator.wikimedia.org/T218463 (10Milimetric) p:05Triage→03High
[15:20:17] 10Analytics, 10Core Platform Team: Ingest api data (for posts) into druid - https://phabricator.wikimedia.org/T218348 (10Milimetric) p:05Triage→03Normal
[15:20:33] 10Analytics, 10Discovery: Ingest cirrusserachrequest data into druid - https://phabricator.wikimedia.org/T218347 (10Milimetric) p:05Triage→03Normal
[15:20:55] 10Analytics: Upgrade analytics cluster to cloudera distro CDH 5.16 - https://phabricator.wikimedia.org/T218343 (10Milimetric) p:05Triage→03High
[15:22:57] 10Analytics, 10Product-Analytics: automatic ingestion from annotations on schemas into druid - https://phabricator.wikimedia.org/T218319 (10Milimetric) p:05Triage→03Normal
[15:23:47] 10Analytics, 10EventBus, 10Patch-For-Review, 10Services (watching), 10cloud-services-team (Kanban): EventGate wikimedia implementation should emit rdkafka stats - https://phabricator.wikimedia.org/T218305 (10Milimetric) p:05Triage→03High
[15:33:53] 10Analytics, 10Product-Analytics: `rev_parent_id` and `rev_content_changed` are missing in event.mediawiki_revision_tags_change - https://phabricator.wikimedia.org/T218274 (10Milimetric) @Tgr and @Pchelolo can you look at why mediawiki is not emitting those fields? Backfilling is too expensive, but you can jo...
[15:35:03] 10Analytics, 10EventBus, 10Operations, 10Prod-Kubernetes, and 2 others: eventgate-analytics k8s pods occasionally can't produce to kafka - https://phabricator.wikimedia.org/T218268 (10Milimetric) p:05Triage→03High
[15:35:27] 10Analytics, 10Analytics-Kanban, 10EventBus, 10Operations, and 3 others: eventgate-analytics k8s pods occasionally can't produce to kafka - https://phabricator.wikimedia.org/T218268 (10Milimetric)
[15:35:41] 10Analytics, 10Analytics-Kanban, 10EventBus, 10Core Platform Team Kanban (Doing), and 2 others: Decrease timeout for EventBus extension for analytics events - https://phabricator.wikimedia.org/T218260 (10Milimetric)
[15:36:01] 10Analytics, 10Analytics-Kanban, 10EventBus, 10Core Platform Team (Security, stability, performance and scalability (TEC1)), and 4 others: EventBus extension should never log unserialized events - https://phabricator.wikimedia.org/T218254 (10Milimetric)
[15:36:07] 10Analytics, 10Analytics-Kanban, 10EventBus, 10Core Platform Team Kanban (Doing), and 2 others: Decrease timeout for EventBus extension for analytics events - https://phabricator.wikimedia.org/T218260 (10Milimetric) p:05Triage→03High
[15:36:16] 10Analytics, 10Analytics-Kanban, 10EventBus, 10Core Platform Team (Security, stability, performance and scalability (TEC1)), and 4 others: EventBus extension should never log unserialized events - https://phabricator.wikimedia.org/T218254 (10Milimetric) p:05Triage→03High
[15:41:02] 10Analytics, 10Product-Analytics: `rev_parent_id` and `rev_content_changed` are missing in event.mediawiki_revision_tags_change - https://phabricator.wikimedia.org/T218274 (10Pchelolo) I think that the schema is incorrect here. Changing the tags for the revision cannot modify the revision contents, so the exis...
[15:41:29] 10Analytics, 10Product-Analytics: Eventbus revisions are duplicated in event.mediawiki_revision_tags_change - https://phabricator.wikimedia.org/T218246 (10Milimetric) @Pchelolo we looked at this and the request id and id were different as well. Any idea where the duplication occurs? I'd be curious to help de...
[15:42:01] 10Analytics, 10Product-Analytics: Eventbus revisions are duplicated in event.mediawiki_revision_tags_change - https://phabricator.wikimedia.org/T218246 (10Milimetric) p:05Triage→03Normal
[15:43:16] 10Analytics, 10Analytics-Kanban, 10MediaWiki-Vagrant: Vagrant initial provision fails on NodeJS version mismatch - https://phabricator.wikimedia.org/T218238 (10Milimetric)
[15:43:23] 10Analytics, 10Analytics-Kanban, 10MediaWiki-Vagrant: Vagrant initial provision fails on NodeJS version mismatch - https://phabricator.wikimedia.org/T218238 (10Milimetric) p:05Triage→03High
[15:45:10] 10Analytics: Update mediawiki-history subgraph-partitioner so that it uses page_id for pages - https://phabricator.wikimedia.org/T218130 (10Milimetric) p:05Triage→03Normal
[15:47:05] 10Analytics, 10User-Elukey: Upgrade matomo1001 to latest upstream - https://phabricator.wikimedia.org/T218037 (10Milimetric) p:05Triage→03High
[15:48:07] 10Analytics, 10Discovery-Search (Current work), 10Patch-For-Review: Publish both shaded and unshaded artifacts from analytics refinery - https://phabricator.wikimedia.org/T217967 (10Milimetric) p:05Triage→03High
[15:49:34] 10Analytics, 10Discovery-Search (Current work), 10Patch-For-Review: Publish both shaded and unshaded artifacts from analytics refinery - https://phabricator.wikimedia.org/T217967 (10Milimetric) p:05High→03Normal
[15:50:30] 10Analytics: Wikistats: Change Mercator Projection to Eckert IV - https://phabricator.wikimedia.org/T218045 (10Milimetric)
[15:50:59] 10Analytics: Wikistats: Change Mercator Projection to Eckert IV - https://phabricator.wikimedia.org/T218045 (10Milimetric)
[15:51:00] 10Analytics, 10Analytics-Wikistats: Changes to map projection in wikistats - https://phabricator.wikimedia.org/T188927 (10Milimetric)
[15:51:17] 10Analytics, 10Analytics-EventLogging, 10MediaWiki-API, 10MW-1.33-notes (1.33.0-wmf.22; 2019-03-19), and 2 others: ApiJsonSchema implements ApiBase::getCustomPrinter for no good reason - https://phabricator.wikimedia.org/T91454 (10Anomie) >>! In T91454#5029931, @gerritbot wrote: > Change 264494 **merged**...
[16:07:48] awight: for a volunteer task (thanks!) i think looking for "good first bug" in backlog might return many tasks on "radar" that our team does not work on, let me see
[16:12:54] https://phabricator.wikimedia.org/T144100 looks reasonable, with a non-zero amount of challenge.
[16:13:37] nuria: node-rdkafka-statsd npm rights pllzlzzzz
[16:13:50] This could be hilarious as well, https://phabricator.wikimedia.org/T91454
[16:16:51] awight: the first one sounds good
[16:16:55] awight: thnak you
[16:16:57] *thank
[16:17:00] ottomata: coming
[16:17:35] nuria: Thanks for the note!
[16:18:25] ottomata: this one right? node-rdkafka-statsd
[16:24:25] ottomata: what is your npm user?
[16:28:17] ottomata, Pchelolo : gave you rights in https://www.npmjs.com/package/node-rdkafka-statsd/access
[16:29:17] ottomata: probably a good candidate package to move to github
[16:29:26] ottomata: 12 downloads a week, wowowow
[16:31:46] oh great, thanks!
[16:47:31] ottomata: How does this sound for a goal ? (it is kind of dry) "" Initial workflow to move files from analytics' computation infrastructure to production infrastructure""
[16:48:20] sounds good
[16:49:46] nuria: Pchelolo still can't publish to node-rdkafka-statsd
[16:51:18] ottomata: wait, you can push but not Pchelolo ?
[16:51:54] as FYI phab/gerrit are down for maintenance, see #operations
[16:52:43] sorry ottomata I was eating
[16:53:06] try enabling 2fa
[16:55:39] worked Pchelolo !
[16:59:16] ottomata, Pchelolo ok, so you both are set right?
[16:59:20] yup thank you!
[17:11:54] phab kaput?
[17:15:42] nuria: yeah maintenance, see my update above :)
[17:15:58] elukey: k
[18:34:29] ottomata: o/
[18:34:33] do you have a minute?
[18:34:38] ya for sure
[18:34:43] i'm blocked on most things now anyway :p
[18:34:51] (because gerrit/phab)
[18:35:09] have you ever had the need to raise logging for a mapred job to say DEBUG?
[18:35:20] I mean also logging for the MR2 framwork
[18:35:26] *framework
[18:35:36] I am trying but something seems not to be working
[18:35:58] I need to figure out why TLS doesn't work
[18:36:20] hm
[18:36:23] for a job?
[18:36:36] in theory yes
[18:36:47] I'd need to know what say the shuffler does etc..
[18:36:50] at debug level
[18:37:33] i don't think i've ever done that.
[18:37:38] for example get things like
[18:37:38] https://github.com/apache/hadoop/blob/branch-2.6.0/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Fetcher.java#L445
[18:37:53] that would be extremely useful to know if a plaintext conn is still used or not
[18:38:01] aye
[18:38:39] am googling
[18:38:56] elukey: maybe
[18:38:58] https://hadoop.apache.org/docs/r2.7.2/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml
[18:39:02] grep for log.level settings
[18:39:08] but, i dunno about shuffle
[18:39:21] i guess that is reduce code you linked
[18:39:32] yeah
[18:40:24] elukey: maybe https://stackoverflow.com/a/27862971/555565
[18:40:35] that's the app master though
[18:41:15] that thread also suggests trying
[18:41:15] -Dyarn.app.mapreduce.am.log.level=DEBUG,console -Dmapreduce.map.log.level=DEBUG,console -Dmapreduce.reduce.log.level=DEBUG,console
[18:42:00] didn't try those, might be a good suggestion
[18:42:12] do you think that I could pass them directly to oozie via -D ?
[18:42:24] or should I set it properly in mapred-site.xml?
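[editor's note] The per-job log-level properties being discussed can also be set inline in an Oozie workflow rather than cluster-wide in mapred-site.xml: the Oozie Hive action supports a <configuration> block of Hadoop job properties. A hedged sketch (property names are the ones quoted above from the StackOverflow thread; all surrounding workflow elements are elided, and whether these reach every launched MR task depends on the job):

```xml
<!-- Fragment of a hypothetical workflow.xml Hive action; only the
     <configuration> block is shown, everything else is elided. -->
<hive xmlns="uri:oozie:hive-action:0.2">
    <!-- job-tracker, name-node, script, params, etc. elided -->
    <configuration>
        <property>
            <name>yarn.app.mapreduce.am.log.level</name>
            <value>DEBUG</value>
        </property>
        <property>
            <name>mapreduce.map.log.level</name>
            <value>DEBUG</value>
        </property>
        <property>
            <name>mapreduce.reduce.log.level</name>
            <value>DEBUG</value>
        </property>
    </configuration>
</hive>
```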
[18:47:32] ya you should be able to pass
[18:47:33] hmm
[18:47:35] not sure if via oozie
[18:47:39] hmm
[18:47:51] you might need to get oozie .xml to pass somehow
[18:48:02] elukey: what job are you testing?
[18:48:32] ottomata: it is in the testing cluster, the webrequest-test-load
[18:48:54] I just launched a 1h coordinator
[18:49:04] passing parameters directly to the oozie launcher
[18:49:10] I doubt it will work as you said
[18:49:14] but I'll test
[18:49:15] :)
[18:49:25] it is crazy how difficult it is to turn logging to debug
[18:50:38] oooo you can pass them to the hive action in the workflow i think elukey
[18:50:39] if that doesn't work
[18:50:40] e.g.
[18:50:46] https://www.irccloud.com/pastebin/6lR7HH15/
[18:50:54] https://oozie.apache.org/docs/3.3.1/DG_HiveActionExtension.html
[18:51:12] actually, if that works in the workflow.xml
[18:51:24] i betcha it will work on the CLI with -D like you are trying
[18:53:32] I'll also try the hive suggestion thanks!
[18:55:48] basically I get stuff like
[18:55:48] 2019-03-18 18:50:56,193 ERROR org.apache.hadoop.mapred.ShuffleHandler: Shuffle error:
[18:55:52] org.jboss.netty.handler.ssl.NotSslRecordException: not an SSL/TLS record
[18:55:56] oo hm
[18:56:12] nasty
[18:56:23] I am sure that I am not using one setting correctly
[18:56:34] and at the end I'll flip my desk
[19:01:10] (the setting passed to oozie didn't work)
[19:01:37] :/
[19:02:17] all right, will restart testing tomorrow
[19:02:18] o/
[19:02:35] thanks for the help!
[19:02:39] ok sorry I couldn't be more
[19:02:40] laters elukey !
[20:16:34] ottomata, hi, I was wondering if you made deployment-eventgate-analytics-1 be jessie for any particular reason
[20:16:42] is this mirroring a prod jessie setup?
[20:17:43] just looking at the recent jessie instance creations in deployment-prep
[20:17:47] Hallo. Earlier today I asked how I can get JSON data for the chart at the bottom of https://grafana.wikimedia.org/d/000000598/content-translation?orgId=1 . elukey suggested something like this:
[20:17:54] https://graphite.wikimedia.org/render/?target=MediaWiki.cx.publish.highmt.*.sum&format=json&from=20190201&until=20190202
[20:18:12] This doesn't seem to produce something really useful.
[20:18:49] It gives a JSON in which for every language code there are 24 array members with "null" and a number, and the numbers are the same in all the languages.
[20:18:59] Am I doing something wrong?
[20:19:17] What I really want to know is how many times this event was logged in every language.
[20:27:01] If, for example, I choose a range of dates at the top of the screen at https://grafana.wikimedia.org/d/000000598/content-translation?orgId=1 , then I get sensible results, but I'd love to have it scripted with wget or something.
[20:29:04] Krenair: it uses the docker_service stuff
[20:29:08] so it runs docker machine
[20:29:18] which iiuc is not available in stretch
[20:29:21] ok
[20:29:30] are you likely to need to make another in the future?
[20:29:40] its possible ya,
[20:29:44] well
[20:29:46] urandom just made one too
[20:29:48] for another service
[20:29:58] I saw, hadn't got to that yet
[20:30:22] the thing is they're about to disable (may have already disabled?) jessie instance creation
[20:30:35] so new things should be stretch
[20:30:52] and existing things will need to be ported in order to have a future
[20:32:25] https://github.com/wikimedia/puppet/blob/73aabd204810f1ba9212431cf84b220dfedb7340/modules/profile/manifests/docker/engine.pp
[20:32:31] it looks like there is some stretch special casing in there
[20:32:39] i don't remember exactly, but it didn't work
[20:33:07] need to ping _joe_ on that one I think
[20:33:11] ok
[20:37:22] Krenair: ah i have a message from urandom a few days ago
[20:37:24] "for point of reference, the reason a docker_services using VM must be jessie is because it relies on the docker-engine package in jessie"
[21:45:55] (03CR) 10Ottomata: "Sorry I tried to leave a comment earlier but then gerrit went down!" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/496885 (https://phabricator.wikimedia.org/T210844) (owner: 10Bmansurov)
[22:00:46] (03PS11) 10Ottomata: Event(Logging) schema loader [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/492399 (https://phabricator.wikimedia.org/T215442)
[22:01:58] (03PS6) 10Ottomata: Add SparkSchemaLoader capabilities to Refine and RefineTarget [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/494831 (https://phabricator.wikimedia.org/T215442)
[22:07:31] (03CR) 10jerkins-bot: [V: 04-1] Add SparkSchemaLoader capabilities to Refine and RefineTarget [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/494831 (https://phabricator.wikimedia.org/T215442) (owner: 10Ottomata)
[22:19:30] (03CR) 10Nuria: [C: 03+1] "My comment as to tests on patch 10 stands (i do not think methods should be just public for testing alone) but other than that +2" [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/492399 (https://phabricator.wikimedia.org/T215442) (owner: 10Ottomata)
[22:28:32] 10Analytics, 10Analytics-EventLogging, 10MediaWiki-API, 10MW-1.33-notes (1.33.0-wmf.22; 2019-03-19), and 2 others: ApiJsonSchema implements ApiBase::getCustomPrinter for no good reason - https://phabricator.wikimedia.org/T91454 (10Anomie) >>! In T91454#5033443, @zeljkofilipin wrote: > Train blockers are UB...
[22:30:40] 10Analytics, 10Analytics-EventLogging, 10Analytics-Kanban, 10EventBus, and 3 others: Modern Event Platform: Stream Intake Service: Implementation: Deployment Pipeline - https://phabricator.wikimedia.org/T211247 (10Krenair) (Please see {T218609} regarding that deployment-prep instance)
[22:30:45] 10Analytics, 10Analytics-EventLogging, 10MediaWiki-API, 10MW-1.33-notes (1.33.0-wmf.22; 2019-03-19), and 2 others: ApiJsonSchema implements ApiBase::getCustomPrinter for no good reason - https://phabricator.wikimedia.org/T91454 (10Krinkle) To summarise for here: I've merged a change in the master branch. T...
[22:30:48] 10Analytics, 10Analytics-EventLogging, 10MediaWiki-API, 10MW-1.33-notes (1.33.0-wmf.22; 2019-03-19), and 2 others: ApiJsonSchema implements ApiBase::getCustomPrinter for no good reason - https://phabricator.wikimedia.org/T91454 (10Krinkle) a:03Krinkle
[22:30:51] 10Analytics, 10Analytics-EventLogging, 10MediaWiki-API, 10MW-1.33-notes (1.33.0-wmf.22; 2019-03-19), and 2 others: ApiJsonSchema implements ApiBase::getCustomPrinter for no good reason - https://phabricator.wikimedia.org/T91454 (10Krinkle) p:05Unbreak!→03High
[22:32:41] (03CR) 10Nuria: Add SparkSchemaLoader capabilities to Refine and RefineTarget (032 comments) [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/494831 (https://phabricator.wikimedia.org/T215442) (owner: 10Ottomata)
[22:32:52] 10Analytics, 10Analytics-EventLogging, 10MediaWiki-API, 10MW-1.33-notes (1.33.0-wmf.22; 2019-03-19), and 2 others: ApiJsonSchema implements ApiBase::getCustomPrinter for no good reason - https://phabricator.wikimedia.org/T91454 (10Jdforrester-WMF) Why not just back-port it and SWAT today?
[22:33:48] 10Analytics, 10Product-Analytics: MobileWikiAppiOSUserHistory schema uses array for items type - https://phabricator.wikimedia.org/T218617 (10Ottomata)
[22:34:52] 10Analytics, 10CirrusSearch, 10Discovery, 10Discovery-Search: Ingest cirrusserachrequest data into druid - https://phabricator.wikimedia.org/T218347 (10Nuria)
[22:42:31] (03CR) 10Nuria: WIP: Add workflow for article-recommender (031 comment) [analytics/refinery] - 10https://gerrit.wikimedia.org/r/496885 (https://phabricator.wikimedia.org/T210844) (owner: 10Bmansurov)
[22:43:13] nuria: hm, EventSparkSchemaLoader is an implementation of a trait
[22:43:17] which is kinda like an interface
[22:43:23] the trait is defined for RefineTarget
[22:43:37] the interface is a class that has a loadSchema method like
[22:43:41] def loadSchema(target: RefineTarget): Option[StructType]
[22:43:57] so, anything could implement a SparkSchemaLoader for a RefineTarget
[22:44:04] which might not use URIs at all
[22:44:31] the Refine job is a specific implementation that uses RefineTargets to load dataframes
[22:44:39] and it instantiates them with SparkSchemaLoaders,
[22:44:42] ottomata: aha
[22:44:51] it just happens that the only way we use loaders (yet) is with URIs
[22:45:20] the default is actually just this ExplicitSchemaLoader(None)
[22:45:41] but you could e.g. give the same explicit schema loader with the schema pre-defined to any refine target
[22:45:44] ottomata: i was thinking more of
[22:45:52] and it wouldn't use the java stuff at all
[22:45:57] https://www.irccloud.com/pastebin/jwSDfI4L/
[22:46:13] ya?
[22:47:00] https://www.irccloud.com/pastebin/CvVbP1iA/
[22:47:56] ottomata: so refine does not need to special case EL at all
[22:48:06] ottomata: right? or you think it is not needed
[22:48:11] the special casing will be done somewhere tho
[22:48:30] nuria
[22:48:31] heheh
[22:48:33] how about:
[22:48:42] i make an overloaded apply method in EventSparkSchemaLoader
[22:48:45] that takes string, string
[22:48:54] and will do as you say :)
[22:49:25] ungh
[22:49:29] i think that's uglier
[22:49:39] i'd rather the special casing be done in Refine
[22:49:42] it is more specific
[22:49:44] ottomata: ya uglier probably and also that does not push eventlogging issues out of refine, they will still be there
[22:49:47] for WMF use cases
[22:50:03] nuria they are still there no matter what we do!
[22:50:09] i'm running into issues with the draft 3 schemas
[22:50:18] and having to special case for them in SparkSchemaConverter :/
[22:50:37] ottomata: ah that i did not see, let me look
[22:50:47] oh i haven't pushed yet
[22:50:53] but e.g.
[22:51:02] required is not a list of required fields in draft 3
[22:51:05] it's a property of a field
[22:51:09] also
[22:51:18] https://phabricator.wikimedia.org/T218617
[22:51:19] 10Analytics, 10Product-Analytics: Fix EventLogging schemas that use array for items type - https://phabricator.wikimedia.org/T218617 (10Ottomata)
[22:53:44] ottomata: i see, you are going to have to do mapping from convention to hive table convention
[22:53:57] well, sort of.
[22:54:03] those issues are draft 3 vs draft 7
[22:54:10] joal just implemented the converter for draft 7
[22:54:12] which is good
[22:54:19] but if we want to support eventlogging, draft 3 needs to work too
[22:55:00] ottomata: ok, i concede
[22:55:04] ottomata: also https://www.kubeflow.org/
[22:55:11] cool!
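[editor's note] The draft-3 vs draft-7 difference in `required` mentioned above: in JSON Schema draft 3, `"required": true` sits on each property, while draft 4 and later (including draft 7) use a list of property names on the parent object. A minimal, hypothetical normalizer to illustrate (the schemas are illustrative only, not the actual EventLogging capsule, and this is not the SparkSchemaConverter code being discussed):

```python
# Lift draft-3 style per-property "required" booleans into a
# draft-4+ style "required" list on the parent object schema.
def normalize_required(schema):
    """Return a copy of an object schema with draft-3 'required'
    booleans converted to a draft-4+ 'required' name list."""
    props = {}
    required = []
    for name, sub in schema.get("properties", {}).items():
        sub = dict(sub)  # copy so the input schema is untouched
        if sub.pop("required", False) is True:
            required.append(name)
        props[name] = sub
    out = dict(schema, properties=props)
    if required:
        out["required"] = required
    return out
```

For example, `{"wiki": {"type": "string", "required": true}}` becomes `{"wiki": {"type": "string"}}` with `"required": ["wiki"]` on the parent.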
[22:55:39] ya nuria i was thinking that we should one day probably figure out how to migrate notebooks to k8s too
[22:55:56] that would be pretty amazing and solve contention problems on notebook hosts
[22:55:56] ottomata: writing ticket
[22:56:08] it'll be a long time before that happens though
[22:56:25] i'd expect a year or two before we could start on that, we'd need more k8s clusters, maybe an 'analytics' k8s cluster
[22:56:31] and our k8s tooling is really primitive right now.
[22:56:45] helm is great, but there are TONS of flags you have to set to do anything
[22:57:06] (03CR) 10Ottomata: Add SparkSchemaLoader capabilities to Refine and RefineTarget (031 comment) [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/494831 (https://phabricator.wikimedia.org/T215442) (owner: 10Ottomata)
[22:57:10] 10Analytics: Migrate jupyter notebooks to kubernetes - https://phabricator.wikimedia.org/T218621 (10Nuria)
[23:07:13] ottomata: is there a document about the outage caused by eventgate?
[23:08:03] nuria: https://wikitech.wikimedia.org/wiki/Incident_documentation/20190313-MW_API_App_Servers_and_EventGate
[23:14:24] ottomata: thank you, red and understood
[23:15:27] *read
[23:20:26] (03PS7) 10Ottomata: Add SparkSchemaLoader capabilities to Refine and RefineTarget [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/494831 (https://phabricator.wikimedia.org/T215442)
[23:21:11] nuria: a reason I want those methods to be public:
[23:21:17] i just used them to debug why my schemas were wrong
[23:21:30] without having to construct an actual JSON event
[23:21:52] i used the spark2-shell repl and was able to load and encapsulate schemas
[23:21:54] to figure out what was going on
[23:21:58] if they were private/protected
[23:21:59] i couldn't do that
[23:22:16] ottomata: you could if there was logging as to what is happening to the request no?
[23:22:38] ya there is logging, but i couldn't figure out where the problem was for a while
[23:23:07] it turned out to be in the hardcoded capsule schema
[23:23:12] i thought it was in my Spark stuff
[23:23:18] i had spaces in the capsule field names
[23:23:36] it is really handy to be able to do e.g. getEventLoggingSchema("NavigationTiming")
[23:24:08] instead of having to do getEventSchema("{\"schema\": \"NavigationTiming\", \"event\": {...}}")
[23:24:31] the getEventSchema from an event is more of a convenience method anyway, for use cases when you have an actual event you want the schema for
[23:25:36] ottomata: mmm.. i think if we had debug logging it would have the same effect and would work for other users w/o having to read code no? like getEventSchema("{\"schema\": \"NavigationTiming\", \"event\": { empty-event }}") would print a trace
[23:25:57] right, but then the next time you need something that isn't logged to debug
[23:26:07] you need to commit a patch and compile and deploy
[23:26:21] when something goes wrong, i like to fire up a repl and execute code to figure it out
[23:27:27] i can't see any method in this class that wouldn't be nice to have public
[23:27:53] it's a helper class for working with EventLogging events/schemas
[23:27:59] why not have helper methods?
[23:29:33] ottomata: cause it clouds the map as to what the class is really doing ?
[23:29:44] the map?
[23:31:00] ottomata: sorry, it clouds matters as to what the responsibilities of the class are; clients will use them directly (when they were not meant for that) and you cannot change your code w/o upgrading a bunch of client code
[23:31:34] our clients are us!
[23:31:34] :)
[23:31:38] so instead of having 1 touch up point you end up having several
[23:31:52] you are saying it expands the API unnecessarily
[23:31:56] i get that argument sometimes
[23:31:58] ottomata: yes
[23:32:35] but here i don't think it matters. refinery-source isn't really usable anywhere outside of our Hadoop cluster.
[23:32:49] and I'm not just putting these public methods here for fun, i want to use them!
[23:33:34] ottomata: true that in our case we are the main consumers of this code but even then i think having fewer touch points is better
[23:34:08] haha ok then nuria when refine breaks in prod i'll call you to troubleshoot :p
[23:36:40] ottomata: i think i will not be able to convince you, so well... it is what it is
[23:47:35] i gotta run, but if someone could ping me with whatever repo the mediawiki history runner is in that might be useful. Having an odd spark error related to inserting new partitions into an existing table, and I'm pretty sure joal mentioned he had solved a similar issue there
[23:48:22] ebernhardson: that should be in refinery-source somewhere
[23:49:06] ottomata: doh, i was going to look there but figured it was too big to be buried in that repo :) Thanks! will look