[01:17:48] a-team: when does the AQS new pages endpoint get updated for a new month of data?
[01:18:08] The latest month I can currently get is September
[01:18:34] Sorry, I mean, the latest month I can currently get is August
[01:18:39] And there's nothing for September
[01:19:21] e.g. https://wikimedia.org/api/rest_v1/metrics/edited-pages/new/commons.wikimedia.org/all-editor-types/content/monthly/20180701/20181001
[01:19:29] Doesn't include September
[02:28:14] neilpquinn: the job that needs to run is "mediawiki-history-reduced-coord", its current url is https://hue.wikimedia.org/oozie/list_oozie_coordinator/0000296-181009135629101-oozie-oozi-C/
[02:28:25] it looks like it's done for September
[02:29:14] and the druid coordinator seems to have finished too: https://hue.wikimedia.org/oozie/list_oozie_coordinator/0075866-180705103628398-oozie-oozi-C/
[02:29:44] oops, no nvm, that's not it
[02:30:14] I was right the first time, the reduced job has the druid loading in there: https://hue.wikimedia.org/oozie/list_oozie_workflow/0000297-181009135629101-oozie-oozi-W/?coordinator_job_id=0000296-181009135629101-oozie-oozi-C
[02:32:49] so theoretically they should show up but I remember Joseph saying something about having to restart something. Maybe they did something manual and it should show up I'm guessing tomorrow
[03:17:35] (CR) Zhuyifei1999: "One issue: this running time is global and not local to any single result set. Repeating for every result set does not make sense to me an" [analytics/quarry/web] - https://gerrit.wikimedia.org/r/465659 (https://phabricator.wikimedia.org/T126888) (owner: Framawiki)
[03:59:09] (CR) Milimetric: [C: 1] "Looks good, one suggestion, I'm fine leaving it as is too (you can +2 either way if you push a patch and I'm slow to get to it)." (1 comment) [analytics/wikistats2] - https://gerrit.wikimedia.org/r/465296 (https://phabricator.wikimedia.org/T206479) (owner: Nuria)
[04:30:29] (PS6) Nuria: Time dimension should be reseted to "1-Month" for top metrics [analytics/wikistats2] - https://gerrit.wikimedia.org/r/465296 (https://phabricator.wikimedia.org/T206479)
[07:10:15] Morning elukey - For when you're back from physio: https://gerrit.wikimedia.org/r/466018
[07:15:51] joal: Bonjour! aqs1004 depooled and ready for testing
[07:29:40] elukey: thorough testing done - Everything looks good - Green light for full deploy :)
[07:29:47] <3
[07:30:22] I am currently trying to install the os on stat1007 \o/
[07:30:47] \o/ indeed ! GPU to come :)
[07:31:28] I hope that it will be possible with the current state of drivers etc..
[07:31:38] it has been sitting there for a year?
[07:32:08] more than that, 2/3 years I think
[07:32:15] :(
[07:32:33] indeed
[07:38:18] joal: so the camus_checker alarm is kinda weird, in theory IIRC it should have been fixed after the first puppet run on an-coord1001 post mediawiki switchover
[07:38:30] ah no wait now I know wh
[07:38:32] why
[07:38:38] we didn't switch services yet
[07:38:39] elukey: I agree - I was about to say we should discuss this with Andrew
[07:38:42] only mediawiki
[07:38:46] Ah
[07:38:47] so the puppet variable changed
[07:38:49] ok
[07:38:52] but not the traffic of eventbus
[07:38:53] okok
[07:39:01] hm - not nice
[07:39:02] it will keep spamming until later on today
[07:39:06] k
[07:39:14] but false positive
[07:44:31] joal: aqs updated
[07:45:25] \o/ :) Thanks elukey - wikistat/v2 looks good :)
[07:47:54] \o/
[08:06:18] Analytics, Operations, ops-eqiad, Patch-For-Review, User-Elukey: rack/setup/install stat1007.eqiad.wmnet (stat1005 user replacement) - https://phabricator.wikimedia.org/T203852 (ops-monitoring-bot) Script wmf-auto-reimage was launched by elukey on neodymium.eqiad.wmnet for hosts: ``` stat1007...
[08:08:22] so manual install + boot went fine, I am properly re-doing it with wmf-auto-reimage so it will be added to puppet etc.. -^
[08:08:38] after this we could surely think about moving stuff from stat1005 to 1007
[08:08:43] (and eventually users)
[08:09:07] the only fear that I have is the time that will be required to get the GPU working..
[08:21:07] so manual install + boot went fine, I am properly re-doing it with wmf-auto-reimage so it will be added to puppet etc.. -^
[08:21:10] 10:08:38 <@elukey> after this we could surely think about moving stuff from stat1005 to 1007
[08:21:13] 10:08:44 <@elukey> (and eventually users)
[08:21:58] wrong copy paste? :D
[08:22:03] Maaaan - Sorry for the spam - wrong mouse movement
[08:22:07] ahahhah
[08:22:07] indeed
[08:22:09] :(
[08:22:20] * joal hides in blushing
[08:46:43] elukey@stat1007:~$
[08:46:45] yesss
[08:46:55] /dev/mapper/stat1007--vg-data 7.2T 89M 6.8T 1% /srv
[08:47:01] also disk partitions look good
[08:51:09] Analytics, Operations, ops-eqiad, Patch-For-Review, User-Elukey: rack/setup/install stat1007.eqiad.wmnet (stat1005 user replacement) - https://phabricator.wikimedia.org/T203852 (ops-monitoring-bot) Completed auto-reimage of hosts: ``` ['stat1007.eqiad.wmnet'] ``` and were **ALL** successful.
[08:54:56] Analytics, Operations, ops-eqiad, Patch-For-Review, User-Elukey: rack/setup/install stat1007.eqiad.wmnet (stat1005 user replacement) - https://phabricator.wikimedia.org/T203852 (elukey) a:RobH>elukey
[08:55:22] Analytics, Operations, ops-eqiad, Patch-For-Review, User-Elukey: rack/setup/install stat1007.eqiad.wmnet (stat1005 user replacement) - https://phabricator.wikimedia.org/T203852 (elukey) Open>Resolved Done! Will follow up in another task to replace stat1005 with this new host.
[08:56:53] Analytics, Analytics-Kanban, User-Elukey: Q1 2018/19 Analytics procurement - https://phabricator.wikimedia.org/T198694 (elukey)
[09:33:27] Analytics, Analytics-Kanban, Page-Issue-Warnings, Product-Analytics, and 3 others: Ingest data from PageIssues EventLogging schema into Druid - https://phabricator.wikimedia.org/T202751 (mforns) @Tbayer > Another question: It seems that the dimensions lack e.g. Ua Browser Major and other user a...
[09:43:06] Analytics, Analytics-Kanban, Page-Issue-Warnings, Product-Analytics, and 3 others: Ingest data from PageIssues EventLogging schema into Druid - https://phabricator.wikimedia.org/T202751 (mforns) @Tbayer > It occurred to me afterwards though that this might be because we were looking at the Coun...
[12:07:22] elukey: https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/466516/
[12:08:41] ema: thank you!
[12:10:34] pleasure!
[12:20:29] (PS2) Fdans: [wip] Add change_tag to mediawiki_history sqoop [analytics/refinery] - https://gerrit.wikimedia.org/r/465416
[12:25:46] elukey: could you in exchange sanity check this patch here? https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/466483/ :)
[12:25:59] (that's how I found the '-c 1' issue)
[12:27:07] ema: looks good!
[12:28:02] elukey: thanks :)
[12:42:36] (PS3) Fdans: [wip] Add change_tag to mediawiki_history sqoop [analytics/refinery] - https://gerrit.wikimedia.org/r/465416
[13:10:32] Analytics: Turnilo export and displayed data do not match - https://phabricator.wikimedia.org/T206760 (atgo)
[13:11:13] Analytics: Turnilo export and displayed data do not match - https://phabricator.wikimedia.org/T206760 (atgo)
[13:11:24] Analytics, New-Readers: Turnilo export and displayed data do not match - https://phabricator.wikimedia.org/T206760 (atgo)
[13:15:15] Analytics, New-Readers: Turnilo export and displayed data do not match - https://phabricator.wikimedia.org/T206760 (atgo)
[13:18:20] hmm, elukey o/
[13:18:28] so we are getting those camus partition checker emails
[13:18:35] mediawiki::state('primary_dc') is returning eqiad
[13:18:36] but
[13:18:40] the data is still in the codfw topics!
[13:18:50] are we in some intermediate stage atm?
[13:18:52] do you know?
[13:39:30] (PS1) Gilles: Keep recently added navtiming + survey fields [analytics/refinery] - https://gerrit.wikimedia.org/r/466607 (https://phabricator.wikimedia.org/T187299)
[13:47:21] ottomata: o/
[13:47:35] i got my answer in mw sec!
[13:47:40] makes sense
[13:47:55] ahhh ok sorry! I was afk :(
[13:48:06] np!
[13:48:43] btw i will be at stand up after all! not picking up baby til the later afternoon now
[14:11:19] ottomata: ok for me to stop eventlogging and reboot eventlog1002?
[14:12:58] elukey: +1
[14:20:08] Analytics, Analytics-Kanban: Reboot Analytics hosts for kernel security upgrades - https://phabricator.wikimedia.org/T203165 (elukey)
[14:20:58] !log reboot eventlog1002 for kernel upgrades
[14:20:59] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[14:28:53] * ottomata running home, back on in a little bit, def before stand up
[14:39:07] (CR) Fdans: [V: 2 C: 2] "This will all change when we rework the time range selector and we're able to pick other months in top metrics, but for now I think this i" [analytics/wikistats2] - https://gerrit.wikimedia.org/r/465296 (https://phabricator.wikimedia.org/T206479) (owner: Nuria)
[14:55:52] elukey, hey :] qq: how can I restart Turnilo? I'm in thorium
[14:56:16] mforns: you still cannot because the sre team didn't merge the sudo permissions
[14:56:22] elukey, ah ok
[14:56:28] now turnilo is on analytics-tool1002 though
[14:56:36] oh ok, will update docs
[14:56:43] if you want I can restart it now
[14:57:12] can you restart it please when it's good for you?
[14:57:22] oh, ok :]
[15:04:02] elukey, no need to restart turnilo, I see that dimensions have updated... O.o
[15:04:14] ack!
[15:04:18] sorry for noise
[15:17:13] (PS1) Fdans: Release 2.4.5 [analytics/wikistats2] - https://gerrit.wikimedia.org/r/466659
[15:20:05] (CR) Fdans: [V: 2 C: 2] Release 2.4.5 [analytics/wikistats2] - https://gerrit.wikimedia.org/r/466659 (owner: Fdans)
[15:28:55] (CR) Mforns: [C: 2] Refactor EventLoggingToDruid to use whitelists and ConfigHelper [analytics/refinery/source] - https://gerrit.wikimedia.org/r/465532 (https://phabricator.wikimedia.org/T206342) (owner: Mforns)
[15:29:44] (CR) Mforns: "This is tested and ready for CR and merge if appropriate :]" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/465532 (https://phabricator.wikimedia.org/T206342) (owner: Mforns)
[15:30:45] Analytics, New-Readers: Turnilo export and displayed data do not match - https://phabricator.wikimedia.org/T206760 (Nuria) @atgo: I think the "export to csv" does not work on the "totals" view. At least that option is completely disabled when you come to the tool on your first pageview. Try to do a ti...
[15:32:27] (CR) Ottomata: [C: 1] Refactor EventLoggingToDruid to use whitelists and ConfigHelper [analytics/refinery/source] - https://gerrit.wikimedia.org/r/465532 (https://phabricator.wikimedia.org/T206342) (owner: Mforns)
[15:33:20] Analytics, Analytics-Kanban, Operations, Traffic, and 2 others: Add Accept header to webrequest logs - https://phabricator.wikimedia.org/T170606 (Ottomata)
[15:34:10] Analytics, Analytics-Kanban, EventBus, ORES, and 4 others: Modify revision-score schema so that model probabilities won't conflict - https://phabricator.wikimedia.org/T197000 (Ottomata) @Pchelolo merging https://gerrit.wikimedia.org/r/#/c/mediawiki/event-schemas/+/439917/9 will require a coordina...
[15:37:30] Analytics, Analytics-Kanban, EventBus, ORES, and 4 others: Modify revision-score schema so that model probabilities won't conflict - https://phabricator.wikimedia.org/T197000 (Pchelolo) @Ottomata I think there should be 2 steps here - first we just stop emitting events completely - second we de...
[15:46:41] Hey team - sending an email about me missing meetings later on - I didn't manage to work, I'm taking it off
[16:00:32] Analytics, Analytics-Kanban, EventBus, ORES, and 4 others: Modify revision-score schema so that model probabilities won't conflict - https://phabricator.wikimedia.org/T197000 (Ottomata) +1 plan sounds good to me.
[16:00:46] ping milimetric
[16:29:38] Analytics, Analytics-Kanban, EventBus, ORES, and 4 others: Modify revision-score schema so that model probabilities won't conflict - https://phabricator.wikimedia.org/T197000 (Pchelolo) https://github.com/wikimedia/change-propagation/pull/295
[16:58:35] Analytics, New-Readers: Turnilo export and displayed data do not match - https://phabricator.wikimedia.org/T206760 (fdans) a:Nuria
[17:02:09] Analytics: Cleanup refinery artefact folder from old jars - https://phabricator.wikimedia.org/T206687 (fdans) p:Triage>Normal
[17:06:26] Analytics: AQS: category information per WikiProject (ex: wiki-project-medicine) available in API - https://phabricator.wikimedia.org/T206686 (fdans)
[17:07:38] Analytics, Pageviews-API: Adding top counts for wiki projects (ex: WikiProject:Medicine) to pageview API - https://phabricator.wikimedia.org/T141010 (Milimetric) p:Low>Normal
[17:07:40] Analytics: AQS: category information per WikiProject (ex: wiki-project-medicine) available in API - https://phabricator.wikimedia.org/T206686 (fdans) p:Triage>Normal
[17:07:57] Analytics, Analytics-Kanban, Patch-For-Review: is-yarn-app-running script should output the running application id - https://phabricator.wikimedia.org/T206555 (fdans) p:Triage>High
[17:08:08] Analytics, Analytics-Cluster, Operations, User-Elukey: Manage Hue via systemd unit - https://phabricator.wikimedia.org/T206484 (fdans) p:Triage>Normal
[17:08:11] Analytics, Analytics-Kanban, Patch-For-Review: Time dimension carried on url for top metrics - https://phabricator.wikimedia.org/T206479 (fdans) p:Triage>High
[17:09:58] Analytics, Analytics-Kanban, Patch-For-Review: eventlogging logs taking a huge amount of space on eventlog1002 and stat1005 - https://phabricator.wikimedia.org/T206542 (Nuria) Suggestion is: - make camu job that ingest raw client side kafka stream into HDFS - reduce retention of logs to 30 days on s...
[17:11:52] Analytics, Analytics-Cluster, Contributors-Analysis, Product-Analytics: Hive join fails when using a HiveServer2 client - https://phabricator.wikimedia.org/T206279 (fdans) a:fdans
[17:23:16] leila: hey we're having jitsi issues
[17:23:22] you cannot hear any of us
[17:23:23] :(
[17:23:33] jdlrobson: got you.
[17:23:38] jdlrobson: is it on my end?
[17:23:40] :D
[17:23:43] i think so
[17:25:44] Analytics, Services (watching): [Hackathon] Consider converting AQS to TypeScript - https://phabricator.wikimedia.org/T206269 (Nuria)
[17:31:24] Analytics, Analytics-Kanban, Contributors-Analysis, Product-Analytics: Decommision edit analysis dashboard - https://phabricator.wikimedia.org/T199340 (Nuria) a:Milimetric
[17:53:04] * elukey off!
[17:57:18] (CR) Mforns: [C: 2] "LGTM! Feel free to merge" [analytics/refinery] - https://gerrit.wikimedia.org/r/466607 (https://phabricator.wikimedia.org/T187299) (owner: Gilles)
[18:39:32] (CR) Gilles: [V: 2] Keep recently added navtiming + survey fields [analytics/refinery] - https://gerrit.wikimedia.org/r/466607 (https://phabricator.wikimedia.org/T187299) (owner: Gilles)
[18:39:34] neilpquinn: i see wikidata on wikistats for sep: https://stats.wikimedia.org/v2/#/wikidata.org
[18:39:47] neilpquinn: let me know of the queries in AQS that might have not worked
[18:42:01] Analytics, Analytics-EventLogging, EventBus, Services (watching): Modern Event Platform: Stream Intake Service: Implementation - https://phabricator.wikimedia.org/T206785 (Ottomata) p:Triage>Normal
[18:42:31] nuria: https://wikimedia.org/api/rest_v1/metrics/edited-pages/new/commons.wikimedia.org/all-editor-types/content/monthly/20180701/20181001
[18:42:42] nuria: that still doesn't have September data
[18:44:14] neilpquinn: it might be caching as i see it: 2: {timestamp: "2018-09-01T00:00:00.000Z", new_pages: 804334}
[18:44:47] neilpquinn: try https://wikimedia.org/api/rest_v1/metrics/edited-pages/new/commons.wikimedia.org/all-editor-types/content/monthly/20180701/20181002
[18:45:32] Analytics, Analytics-EventLogging, EventBus, Services (watching): Modern Event Platform: Stream Intake Service: Implementation - https://phabricator.wikimedia.org/T206785 (Ottomata) Some open questions I have (some of these don't need to be resolved now since we don't have a use case for them) -...
[18:47:14] Analytics, Analytics-EventLogging, EventBus, Services (watching): Modern Event Platform: Stream Intake Service: Implementation - https://phabricator.wikimedia.org/T206785 (Ottomata) I have a prototype of an eventbus rewrite in node-service-template here: https://github.com/ottomata/eventbus Shou...
[18:58:58] nuria: mm, yes, I see that too. Thanks, I didn't think of caching!
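Editor's note on the 18:44 exchange: nuria's workaround was to request the same range with the end date moved from 20181001 to 20181002, on the guess that the stale answer came from a cache. A minimal sketch of building both URLs is below. The path layout is copied from the URLs quoted in the log; the function and parameter names are hypothetical, and the idea that a different URL bypasses the cached response is the assumption made in the chat, not something confirmed here.

```python
from datetime import date

# Base path copied from the endpoint quoted in the log.
BASE = "https://wikimedia.org/api/rest_v1/metrics/edited-pages/new"


def new_pages_url(project, start, end,
                  editor_type="all-editor-types",
                  page_type="content",
                  granularity="monthly"):
    """Build an AQS 'new pages' URL (hypothetical helper, not an official client)."""
    return "/".join([
        BASE, project, editor_type, page_type, granularity,
        start.strftime("%Y%m%d"), end.strftime("%Y%m%d"),
    ])


# The request that appeared to be missing September.
original = new_pages_url("commons.wikimedia.org", date(2018, 7, 1), date(2018, 10, 1))

# Same query with the end date nudged by one day, as suggested at 18:44:47.
shifted = new_pages_url("commons.wikimedia.org", date(2018, 7, 1), date(2018, 10, 2))

print(original)
print(shifted)
```

The sketch only builds the two URLs and makes no request; fetching and comparing the responses is left to whatever HTTP client is at hand.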
[18:59:46] Analytics, Analytics-EventLogging, EventBus, Services (watching): Modern Event Platform: Event Schema Registry: Implementation - https://phabricator.wikimedia.org/T206789 (Ottomata) p:Triage>Normal
[19:01:29] Analytics, Analytics-EventLogging, EventBus, Services (watching): Modern Event Platform: Event Schema Registry: Implementation - https://phabricator.wikimedia.org/T206789 (Ottomata) Q: should we use the term 'repository' or 'registry' here. I'm considering retitling the tickets to 'repository'...
[19:05:29] Analytics, Analytics-EventLogging, EventBus, Services (watching): Modern Event Platform: Event Schema Registry: Implementation - https://phabricator.wikimedia.org/T206789 (Ottomata) Q: Analytics has a use case to add extra jsonschema features to be able to know more about the contextual 'types' o...
[19:07:38] Analytics, Analytics-EventLogging, EventBus, Services (watching): Modern Event Platform: Event Schema Registry: Implementation - https://phabricator.wikimedia.org/T206789 (Ottomata)
[19:12:23] heya nuria, milimetric. I just made two new tickets, lemme know what you think and if they make sense
[19:12:32] https://phabricator.wikimedia.org/T206785 and https://phabricator.wikimedia.org/T206789
[19:14:13] ottomata: regarding the acked response
[19:14:49] did you consider giuseppe's idea of a separate "status" request that a client could poll to get the result of the initial request?
[19:15:34] no, but i have no idea how we'd do that with this service
[19:16:12] it'd make it way more complex. we'd have to store the kafka offsets somewhere and then consume that offset from kafka until the message appears
[19:16:40] i mean, yes i considered it, but it seems way out of scope to me!
[19:16:43] i guess i should respond...
[19:17:14] well, this would theoretically be lower throughput? The sync requests?
[19:17:28] because if you get like 10k of those per second, I think that would take down even a bigger cluster
[19:17:33] waiting on all those responses, no?
[19:18:05] (not sure about the numbers, but the limit seems lower than for fire-and-forget)
[19:18:46] hm, i dunno, it's kind of the same? unless we change the ack setting for kafka
[19:18:56] so then maybe the implementation's not that different. You just store the status in REDIS or something similar and update it when you get the async callback from validation/kafka
[19:19:07] it's faster for the client, since they don't have to wait for the service to validate and produce
[19:19:12] but the service is still doing the same stuff
[19:19:27] except it doesn't have to keep a thread alive to respond on, right?
[19:19:32] Analytics, New-Readers: Turnilo export and displayed data do not match - https://phabricator.wikimedia.org/T206760 (atgo) Thanks @Nuria, that seems to work. Appreciate the pointer! While you're requesting things from Turnilo, it would be great to be able to get the more granular numbers without this exp...
[19:19:56] milimetric: in the way i'd build it, fire and forget would just return the http 204 before any validation/kafka produce is done
[19:20:03] but validation and kafka produce still happens
[19:20:41] right, I agree, but lower level, the 204 response would go immediately so those requests wouldn't need any more processing
[19:20:52] ?
[19:20:53] bc?
[19:20:56] the acked ones would need a thread to stick around that would process the callback
[19:20:57] k
[19:34:07] Analytics, Analytics-EventLogging, EventBus, Services (watching): Modern Event Platform: Stream Intake Service: Implementation - https://phabricator.wikimedia.org/T206785 (Ottomata)
[19:34:33] Analytics, Analytics-EventLogging, EventBus, Services (watching): Modern Event Platform: Event Schema Registry: Implementation - https://phabricator.wikimedia.org/T206789 (Ottomata)
[19:46:23] Analytics, Product-Analytics: Metrics request on portal namespace usage - https://phabricator.wikimedia.org/T205681 (Tbayer) @AfroThundr3007730 Thanks for the additional background! I should be able to get you some data for the first to groups of questions soon, based on the [[https://en.wikipedia.org/w...
[20:00:31] ottomata, do you have some time today to look at the --deploy-mode cluster issue with me? It seems there are no logs, at least I can not access them..
[20:11:03] OH
[20:11:04] there is not much chance I could find wiki user passwords in the data lake, I suppose?
[20:11:08] yes mforns now is good
[20:11:13] actually gimme 5 mins
[20:11:22] ottomata, sure :]
[20:11:55] tgr, no, there are no passwords..
[20:12:08] Analytics, Product-Analytics: Metrics request on portal namespace usage - https://phabricator.wikimedia.org/T205681 (Tbayer) a:Tbayer
[20:14:27] mforns: ok
[20:14:33] so hmm that is strange, can I try?
[20:14:35] gimme a command
[20:14:46] right
[20:15:38] ottomata, https://pastebin.com/2ZnFAnT0
[20:15:54] this I run from stat1004
[20:16:03] the property file is there
[20:17:50] mforns: did you do yarn logs on this?
[20:17:51] i see
[20:17:54] yes
[20:17:57] 18/10/11 20:16:29 INFO EventLoggingToDruid: Starting process for event_.
[20:17:58] 18/10/11 20:16:30 ERROR ApplicationMaster: User class threw exception: org.apache.spark.sql.catalyst.parser.ParseException:
[20:17:58] mismatched input 'FROM' expecting (line 3, pos 12)
[20:18:01] ==
[20:18:02] SELECT *
[20:18:02] FROM event.
[20:18:02] ------------^^^
[20:18:06] but it says there are no logs
[20:18:14] hmmmmm
[20:18:16] mforns: after your app is done
[20:18:17] right?
[20:18:33] this is because the properties file is not read properly
[20:18:59] deploy-mode cluster tries to read the file from the driver node no?
[20:19:04] so the file is not there!
[20:19:05] ya
[20:19:05] Config(
[20:19:05] config_file = "",
[20:19:05] start_date = 1970-01-01T00:00:00.000Z,
[20:19:05] end_date = 1970-01-01T00:00:00.000Z,
[20:19:05] database = "event",
[20:19:05] table = "",
[20:19:06] dimensions = List(),
[20:19:06] time_measures = List(),
[20:19:07] metrics = List(),
[20:19:07] segment_granularity = "hour",
[20:19:08] query_granularity = "minute",
[20:19:15] yea those are defaults
[20:19:23] but table is not defaulted, so event.
[20:19:29] mforns: you should probably make required things not have defaults in the config case class
[20:19:31] that way it will fail earlier
[20:19:40] aha
[20:19:43] but
[20:19:52] anyway, yeah, in cluster mode, you need to ship the file to the driver on the cluster
[20:19:55] makes sense
[20:19:57] which is what the spark_job thing is doing
[20:20:05] ah! the --files thing!
[20:20:15] right
[20:20:23] yup
[20:20:25] and the file name
[20:21:05] I see, so when using --files that file_path will be available to the job automatically?
[20:21:11] yes
[20:21:13] by its filename
[20:21:14] basename
[20:21:21] you can alias it to another by doing something like
[20:21:26] --files #aliased_file_name
[20:21:33] but you don't need that, the original file basename is good :)
[20:21:46] so --files should be the absolute path?
[20:21:52] yes
[20:22:11] ok
[20:22:11] and then --config_file just the basename
[20:22:19] oh...
[20:22:24] e.g. test_config_file.properties
[20:22:27] aha
[20:22:36] ok, checking the code
[20:22:37] it copies the --files to the local working temp dir of the remote driver
[20:22:47] i'm trying with your job
[20:22:50] it is RUNNING for much longer now :)
[20:23:37] with that change i get:
[20:23:37] Config(
[20:23:37] config_file = "",
[20:23:37] start_date = 2018-09-19T00:00:00.000Z,
[20:23:37] end_date = 2018-10-11T00:00:00.000Z,
[20:23:38] database = "event",
[20:24:01] yea
[20:24:09] oo just got kicked for too much paste!
[20:24:17] anyway
[20:24:17] Config(
[20:24:18]   config_file = "",
[20:24:18]   start_date = 2018-09-19T00:00:00.000Z,
[20:24:18]   end_date = 2018-10-11T00:00:00.000Z,
[20:24:18]   database = "event",
[20:24:18]   table = "PageIssues",
[20:24:18] etc.
[20:24:35] I think the puppet code is good, it uses the full path for --files and the basename for --config_file
[20:24:41] yea, looks good
[20:24:41] ya
[20:24:49] cool!
[20:25:18] ok i gotta go pick up a baby now finally!
[20:25:21] SEE YAALLLLLL
[20:25:27] riiiight! enjoy!
[20:25:28] i should be online tomorrow afternoon a little bit
[20:25:36] byee
[20:33:05] (PS1) Milimetric: [SPIKE] [Don't merge] [analytics/refinery] - https://gerrit.wikimedia.org/r/466730
[21:12:09] Analytics, Analytics-EventLogging, EventBus, Services (watching): Modern Event Platform: Event Schema Registry: Implementation - https://phabricator.wikimedia.org/T206789 (Pchelolo)
[21:20:20] Analytics, Analytics-EventLogging, EventBus, Core Platform Team (Modern Event Platform (TEC2)), and 2 others: Modern Event Platform: Event Schema Registry: Implementation - https://phabricator.wikimedia.org/T206789 (Pchelolo) > Do we need to use a custom meta JSONSchema for this, or can we just a...
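Editor's note on the 20:18-20:24 debugging session: the conclusion was that in `--deploy-mode cluster` the properties file has to be shipped with `--files` using its absolute path, while the job's own `--config_file` argument gets only the basename, because the file is copied into the remote driver's working directory. A minimal sketch of assembling such a command line follows. The jar path, main class and config path are made-up placeholders, and the helper name is hypothetical; only `--deploy-mode cluster`, `--files` and `--config_file` come from the conversation.

```python
import os
import shlex


def spark_submit_args(config_path, jar, main_class):
    """Assemble spark-submit arguments for cluster mode (illustrative helper).

    --files receives the absolute local path so the file is shipped to the
    driver; the job's --config_file receives only the basename, which is how
    the shipped copy is addressed inside the driver's working directory.
    """
    abs_config = os.path.abspath(config_path)
    return [
        "spark-submit",
        "--master", "yarn",
        "--deploy-mode", "cluster",
        "--files", abs_config,
        "--class", main_class,
        jar,
        "--config_file", os.path.basename(abs_config),
    ]


# Placeholder values, not taken from the log.
args = spark_submit_args(
    "/home/example/test_config_file.properties",
    "/path/to/example-job.jar",
    "org.example.EventLoggingToDruid",
)

# Print a shell-safe rendering of the command without running it.
print(" ".join(shlex.quote(a) for a in args))
```

Keeping the two paths derived from a single input avoids the mismatch seen in the log, where the job started with an empty `table` because the config file was not found on the driver.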
[21:44:04] Analytics, Analytics-EventLogging, EventBus, Core Platform Team (Modern Event Platform (TEC2)), and 2 others: Commit hook that adds a whole new file when a new version of schema is comitted - https://phabricator.wikimedia.org/T206812 (Nuria) p:Triage>Normal
[21:46:57] Analytics, Analytics-EventLogging, EventBus, Core Platform Team (Modern Event Platform (TEC2)), and 2 others: Git Commit hook that adds a whole new file when a new version of schema is committed - https://phabricator.wikimedia.org/T206812 (Nuria)
[21:51:14] Analytics, New-Readers: Turnilo export and displayed data do not match - https://phabricator.wikimedia.org/T206760 (Nuria) issue submitted: https://github.com/allegro/turnilo/issues/197
[21:51:31] Analytics, New-Readers: Turnilo export and displayed data do not match - https://phabricator.wikimedia.org/T206760 (Nuria) a:Nuria>None
[21:52:55] tgr: wiki user passwords? I do not think so, no, is this some config to access the db replicas?
[21:53:23] no, I just needed some stats on passwords
[21:54:43] but they are available in analytics-store so I'm just using that
[22:03:46] Analytics, Analytics-EventLogging, EventBus, Core Platform Team (Modern Event Platform (TEC2)), and 2 others: CI Support for Schema Registry - https://phabricator.wikimedia.org/T206814 (Nuria) p:Triage>Normal
[22:04:11] Analytics, Analytics-EventLogging, EventBus, Core Platform Team (Modern Event Platform (TEC2)), and 2 others: CI Support for Schema Registry - https://phabricator.wikimedia.org/T206814 (Nuria) CI for ensuring that schemas all have consistent meta field CI for ensuring schema backwards compatibili...
[22:08:40] Analytics, Analytics-EventLogging, EventBus, Services (watching): Prototype in node service intake - https://phabricator.wikimedia.org/T206815 (Nuria) p:Triage>Normal
[22:08:58] Analytics, Analytics-EventLogging, EventBus, Services (watching): Prototype in node intake service - https://phabricator.wikimedia.org/T206815 (Nuria)
[22:12:40] milimetric: how are you doing with those queries?
[22:17:27] (CR) Nuria: [V: 2 C: 2] Make is-yarn-application-running --verbose more informative [analytics/refinery] - https://gerrit.wikimedia.org/r/465471 (https://phabricator.wikimedia.org/T206555) (owner: Ottomata)
[22:20:16] (CR) Nuria: [C: 2] Update default value of Refine hive_server_url [analytics/refinery/source] - https://gerrit.wikimedia.org/r/465459 (https://phabricator.wikimedia.org/T205509) (owner: Ottomata)
[22:21:37] Analytics, Analytics-EventLogging, EventBus, Core Platform Team (Modern Event Platform (TEC2)), and 2 others: CI Support for Schema Registry - https://phabricator.wikimedia.org/T206814 (Pchelolo) We actually have [[ https://github.com/wikimedia/mediawiki-event-schemas/blob/master/test/jsonschema/...
[22:26:52] Analytics, Analytics-EventLogging, EventBus, Services (watching): Modern Event Platform: Stream Intake Service: Implementation - https://phabricator.wikimedia.org/T206785 (Nuria) @Ottomata added ticket for prototype explicitly: https://phabricator.wikimedia.org/T206815 If it is a full rewrite I...
[22:48:57] Analytics: Splits on top metrics when selections are present on url - https://phabricator.wikimedia.org/T206822 (Nuria)
[22:49:39] Analytics: Splits on top metrics when selections are present on url - https://phabricator.wikimedia.org/T206822 (Nuria) Moving to kanban as this is a prod issue we should fix.
[22:49:53] Analytics, Analytics-Kanban: Splits on top metrics when selections are present on url - https://phabricator.wikimedia.org/T206822 (Nuria)
[23:05:28] (CR) Nuria: Keep recently added navtiming + survey fields (2 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/466607 (https://phabricator.wikimedia.org/T187299) (owner: Gilles)
[23:12:44] Analytics, Analytics-EventLogging, EventBus, Core Platform Team (Modern Event Platform (TEC2)), and 2 others: Decide whether to use schema references in the schema registry - https://phabricator.wikimedia.org/T206824 (Pchelolo) p:Triage>Normal