[02:41:29] Analytics, Analytics-Kanban: [Bug] Type mismatch for a few other schemas - https://phabricator.wikimedia.org/T216771 (Milimetric)
[02:47:05] Analytics, Analytics-Kanban, Fundraising-Backlog: CentralNoticeImpression refined impressionEventSampleRate is int instead of double - https://phabricator.wikimedia.org/T217109 (Milimetric)
[02:49:57] Analytics, Analytics-Kanban, Product-Analytics: Popups schema has the wrong type for popupDelay - https://phabricator.wikimedia.org/T217110 (Milimetric)
[02:53:50] Analytics, Analytics-Kanban, Performance-Team: ServerTiming schema value for duration is 0 - https://phabricator.wikimedia.org/T217111 (Milimetric)
[06:50:36] Analytics, Operations, ops-eqiad, Patch-For-Review, User-Elukey: rack/setup/install labsdb1012.eqiad.wmnet - https://phabricator.wikimedia.org/T215231 (elukey) @Cmjohnson would it be possible to move the host in a new Rack?
[07:14:27] hello people
[07:14:40] going to apply the rmstore-on-hdfs setting on the testing cluster
[07:27:33] (PS6) Fdans: Refactor dashboard metric widget [analytics/wikistats2] - https://gerrit.wikimedia.org/r/492060 (https://phabricator.wikimedia.org/T187806)
[07:27:38] (CR) Fdans: Refactor dashboard metric widget (12 comments) [analytics/wikistats2] - https://gerrit.wikimedia.org/r/492060 (https://phabricator.wikimedia.org/T187806) (owner: Fdans)
[07:27:59] elukey: HELLO LUCA
[07:28:06] HELLO FRAN
[07:29:44] (CR) jerkins-bot: [V: -1] Refactor dashboard metric widget [analytics/wikistats2] - https://gerrit.wikimedia.org/r/492060 (https://phabricator.wikimedia.org/T187806) (owner: Fdans)
[07:33:51] elukey@analytics1028:~$ sudo -u hdfs hdfs dfs -ls /user/yarn/rmstore
[07:33:54] Found 1 items
[07:33:56] seems working
[07:33:59] drwxr-xr-x - yarn hadoop 0 2019-02-26 07:32 /user/yarn/rmstore/FSRMStateRoot
[07:34:02] ah!
[07:40:21] Analytics, Analytics-Kanban, Patch-For-Review, User-Elukey: Hadoop Yarn stores a ton of znodes related to running/old applications - https://phabricator.wikimedia.org/T216952 (elukey) Seems working fine! ` elukey@analytics1028:~$ sudo -u hdfs hdfs dfs -ls /user/yarn/rmstore Found 1 items drwxr-x...
[07:58:23] so only with dropping the test rmstore from zookeeper we went from ~51.4K znodes to 43.7k
[08:02:17] wow elukey
[08:03:49] joal: bonjour :)
[08:04:25] elukey: currently reading the ticket to be sure I have the correct background
[08:05:06] joal: yep yep, looking forward to know your thoughts.. I have some pros/cons about zk vs HDFS but so far I didn't find anything against this setting
[08:05:53] the main issue that I can think about is the fact that zk could enforce some ACLs/rules/etc.. to prevent two masters (split brain scenario) to update some rmstore znodes
[08:05:59] meanwhile this doesn't happen with HDFS
[08:06:15] but in theory HA should prevent this problem from happening
[08:06:50] since it properly uses znodes to avoid a split brain
[08:07:14] and at the end we are talking about the state of yarn apps
[08:07:48] the rmstore is needed to preserve them across failovers etc..
[08:08:52] elukey: Your concern seems legit, but also very low-probability - Another thing I wonder would be the performance related aspects: data coming from more places - Better or not?
[08:09:35] joal: I didn't get the last point about data coming from more places
[08:10:28] when getting data from zk, it all comes from the same place - When querying hdfs for various files, you need multiple round-trips (namenode + datanodes)
[08:12:02] ah yes yes
[08:12:05] makes sense now
[08:12:39] we might pay a bit more latency for sure, but we'd avoid hammering a delicate service like zk
[08:13:00] I am pretty sure that under /rmstore we have a lot more than 10k znodes
[08:13:01] elukey: HDFS might actually provide better perf based on parallel serving - but not sure :)
[08:13:33] sure elukey - resiliency at the cost of a few millis - I thank you for that :)
[08:14:09] joal: so green light from your side about the overall idea?
[08:14:18] not planning to do it now of course :D
[08:14:24] but maybe tomorrow?
[08:14:38] so we leave one day of testing just in case something explodes
[08:15:04] Yes elukey - It would be awesome to get an idea of perf changes from the test cluster, but I think it'll not be feasible
[08:15:15] yeah :(
[08:15:27] well we have metrics etc..
[08:15:33] but few jobs working atm
[08:16:32] Right - Another thing coming to mind is putting YARN at stake of HDFS overload
[08:17:10] If hdfs is overloaded from some jobs, how can we guarantee that yarn will have its resources in time?
[08:17:40] true
[08:18:49] for example an overload of the namenode
[08:19:10] in that case Yarn wouldn't be able to store its data
[08:19:26] right - Or a global overload (like 90% full space as we had)
[08:19:59] The Hadoop FileSystem client retry policy specification. Hadoop FileSystem client retry is always enabled. This is specified in pairs of sleep-time and number-of-retries i.e. (t0, n0), (t1, n1), ..., the first n0 retries sleep t0 milliseconds on average, the following n1 retries sleep t1 milliseconds on average, and so on. The default value is (2000, 500).
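To put a number on the retry policy quoted above (it reads like the description of the YARN setting `yarn.resourcemanager.fs.state-store.retry-policy-spec`, though the log does not name the property), here is a minimal Java sketch that parses a spec of (sleep-time, retries) pairs and adds up the average time a client would spend sleeping before giving up. The class `RetryPolicySpec` and its methods are hypothetical names made up for illustration; this is not Hadoop's own parser, and it assumes the spec is written as a flat comma-separated list such as `2000, 500`.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration only; not part of Hadoop. */
final class RetryPolicySpec {
    /** One (sleep-time, number-of-retries) pair from the spec. */
    static final class Pair {
        final long sleepMillis;
        final int retries;

        Pair(long sleepMillis, int retries) {
            this.sleepMillis = sleepMillis;
            this.retries = retries;
        }
    }

    private RetryPolicySpec() {}

    /** Parses "t0, n0, t1, n1, ..." into pairs (assumed flat, comma-separated format). */
    static List<Pair> parse(String spec) {
        String[] parts = spec.trim().split("\\s*,\\s*");
        if (parts.length % 2 != 0) {
            throw new IllegalArgumentException("Expected pairs of sleep-time and retries: " + spec);
        }
        List<Pair> pairs = new ArrayList<>();
        for (int i = 0; i < parts.length; i += 2) {
            pairs.add(new Pair(Long.parseLong(parts[i]), Integer.parseInt(parts[i + 1])));
        }
        return pairs;
    }

    /** Sum of retries * average sleep over all pairs, in milliseconds. */
    static long totalSleepMillis(String spec) {
        long total = 0;
        for (Pair p : parse(spec)) {
            total += p.sleepMillis * p.retries;
        }
        return total;
    }
}
```

With the quoted default of (2000, 500), `RetryPolicySpec.totalSleepMillis("2000, 500")` gives 1,000,000 ms, i.e. roughly 16 to 17 minutes of retrying on average before the client gives up.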
[08:20:44] so in theory Yarn will retry before giving up
[08:20:49] like a regular HDFS client
[08:21:10] elukey: I wouldn't have trusted you if you had said otherwise :)
[08:21:45] I don't think that the rmstore data is so heavily requested that it would ever be a problem, but it might be a corner case of yarn availability
[08:21:58] buuut always better than breaking zk and kafka :P
[08:22:01] that's my point
[08:22:14] sure, better than breaking other stuff for sure
[08:22:19] Let's do it :)
[08:22:27] ack :)
[08:23:09] thanks elukey - I'm assuming you keep notes on the stuff we discuss and you'll put it on a wiki page - So plenty thanks more :)
[08:23:17] joal: tomorrow probably not fine for you, maybe thursday?
[08:23:39] Thursday will be good :)
[08:23:44] super
[08:23:58] Many more thanks elukey for remembering my schedule <3
[08:26:50] joal: I am trying to fix my brain's LRU algorithm
[08:26:57] :D
[08:27:54] elukey: I like the idea that brains might actually have some kind of bloomfilter incorporated - The idea of a bloom in my head feels so good :)
[08:29:49] ahahhahah
[08:29:55] so peaceful yes
[08:44:16] (CR) Joal: Use db_mapping to find the hostname (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/492209 (https://phabricator.wikimedia.org/T215290) (owner: Milimetric)
[08:46:12] Oooook - fdans - I'm ready for some vetting :)
[08:46:47] * joal breathes deeply and warms its joints
[08:47:53] joal: coming back from a quick errand!
[08:51:45] fdans: ping me when you're good :)
[08:57:24] joal: I'm here! omw bc
[09:05:00] Analytics, Analytics-Kanban, Patch-For-Review, User-Elukey: Hadoop Yarn stores a ton of znodes related to running/old applications - https://phabricator.wikimedia.org/T216952 (elukey) I had a chat with Joseph about pros and cons of this solution: * zookeeper might give more guarantees in split br...
[10:39:11] elukey: Hey, may I ask something about the layout of our new multi-instance MySQL?
[10:40:01] elukey: Namely, I can't find one database with the analytics-mysql tool (as documented here: https://wikitech.wikimedia.org/wiki/Analytics/Data_access#MariaDB_replicas)
[10:40:12] (PS1) Ladsgroup: Fixes for nwe multisource db setup [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/492988 (https://phabricator.wikimedia.org/T213894)
[10:40:55] elukey: and that database would be: cognate_wiktionary - one of my dashboards (http://wmdeanalytics.wmflabs.org/Wiktionary_CognateDashboard/) depends critically upon it.
[10:40:57] (PS2) Ladsgroup: Fixes for new multisource db setup [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/492988 (https://phabricator.wikimedia.org/T213894)
[10:41:23] hey elukey. Can you review https://gerrit.wikimedia.org/r/c/operations/puppet/+/492986 ?
[10:42:33] Thanks!
[10:43:50] Amir1: merged, running puppet on stat1007
[10:44:03] GoranSM: hi!
[10:44:07] elukey: Hi!
[10:44:26] interesting, let me check
[10:44:34] elukey: Thank you!
[10:45:34] GoranSM: ah so this is a special case, the db is not contained in any mw-config .dblist file
[10:46:05] ah seems to be in x1
[10:46:07] elukey: That is the database of the MW Cognate extension: https://www.mediawiki.org/wiki/Extension:Cognate
[10:46:26] GoranSM: you need to use --use-x1
[10:46:33] elukey: let me try, please
[10:46:58] elukey: YES! Thank you very much!
[10:47:11] GoranSM: nice! Glad also that you are using analytics-mysql :)
[10:47:27] elukey: It's nice and simple, why not use it
[10:47:39] elukey: Thanks to whoever developed the tool!
[10:49:45] \o/
[11:03:45] Analytics, Analytics-EventLogging, Analytics-Kanban, Patch-For-Review: eventlogging fails flake8 due to new upstream version, breaking CI - https://phabricator.wikimedia.org/T212396 (hashar) Open→Resolved
[11:04:13] hey elukey :] can you please CR this micropatch?
https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/492995/
[11:05:32] oh, elukey, should have explained in the CR description... this is to prevent problems if we deploy refinery-source tomorrow, because the refinery code that accompanies those source changes is not ready yet
[11:05:43] this way we can deploy source tomorrow, w/o problems
[11:06:31] (CR) Mforns: [C: -1] "Still needs changes!" [analytics/refinery] - https://gerrit.wikimedia.org/r/484250 (https://phabricator.wikimedia.org/T212014) (owner: Mforns)
[11:10:40] elukey: Thank you so much, sorry I was at meetings
[11:12:30] mforns: o/
[11:12:45] elukey, heya
[11:12:57] I am fine to deploy it, does it need to be done now or tomorrow before the deploy?
[11:13:18] elukey, as long as it's before the deploy it's ok!
[11:13:47] mforns: I am asking because if 0.85 is not there some jobs might fail today no?
[11:14:17] elukey, no no, 85 is already there
[11:14:24] tomorrow will be 86
[11:14:41] ahhh
[11:15:09] that's why, the code that was merged and will be deployed tomorrow should not run for ELSanitization
[11:15:15] yet
[11:15:31] ok so I am merging now
[11:15:36] thanks! :D
[11:17:44] mforns: done!
[11:17:51] cooool
[11:18:13] will check the next run at the hour
[11:18:50] (PS3) Ladsgroup: Fixes for new multisource db setup [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/492988 (https://phabricator.wikimedia.org/T213894)
[11:19:51] (PS4) Ladsgroup: Fixes for new multisource db setup [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/492988 (https://phabricator.wikimedia.org/T213894)
[11:20:55] Amir1: thank you for this work! --^
[11:27:16] No worries.
I hope I get it out of the door soon
[11:27:43] my big problem is that I have only one day per week for wikidata stuff, otherwise this would have been done way sooner
[11:35:07] ah :(
[11:37:10] (PS5) Ladsgroup: Fixes for new multisource db setup [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/492988 (https://phabricator.wikimedia.org/T213894)
[12:09:56] (CR) Joal: "answers inline, code following." (3 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/492756 (https://phabricator.wikimedia.org/T215442) (owner: Joal)
[12:10:07] (PS2) Joal: Add JsonSchemaConverter to spark package [analytics/refinery/source] - https://gerrit.wikimedia.org/r/492756 (https://phabricator.wikimedia.org/T215442)
[12:42:45] (PS2) Joal: Add change_tags to mediawiki_history [analytics/refinery/source] - https://gerrit.wikimedia.org/r/492320
[12:42:49] (PS1) Joal: Update mediawiki-reconstruction with log info [analytics/refinery/source] - https://gerrit.wikimedia.org/r/493012
[12:48:07] * elukey lunch!
[12:48:10] (PS1) Ladsgroup: Fix connecting to the right port of multisource db setup [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/493013 (https://phabricator.wikimedia.org/T213894)
[12:51:16] (PS2) Joal: Update mediawiki-reconstruction with log info [analytics/refinery/source] - https://gerrit.wikimedia.org/r/493012
[14:51:31] (PS7) Fdans: Refactor dashboard metric widget [analytics/wikistats2] - https://gerrit.wikimedia.org/r/492060 (https://phabricator.wikimedia.org/T187806)
[14:57:58] (PS8) Fdans: Refactor dashboard metric widget [analytics/wikistats2] - https://gerrit.wikimedia.org/r/492060 (https://phabricator.wikimedia.org/T187806)
[14:58:39] Analytics, Analytics-Kanban, Product-Analytics: Superset's rolling average feature results in error message - https://phabricator.wikimedia.org/T213488 (elukey) Open→Stalled p:High→Normal
[14:58:46] (PS9) Fdans: Refactor dashboard metric widget [analytics/wikistats2] - https://gerrit.wikimedia.org/r/492060 (https://phabricator.wikimedia.org/T187806)
[16:01:03] ping joal
[16:04:29] (CR) Milimetric: [C: +2] Refactor dashboard metric widget [analytics/wikistats2] - https://gerrit.wikimedia.org/r/492060 (https://phabricator.wikimedia.org/T187806) (owner: Fdans)
[16:06:07] Analytics, Analytics-Kanban, Product-Analytics: Superset's rolling average feature results in error message - https://phabricator.wikimedia.org/T213488 (jlinehan) @elukey Sure, but the reason for the patch being offered was due to the uncertainty about how long an upgrade (i.e. returning to tracking...
[16:24:39] Analytics, EventBus, Train Deployments: Catchable fatal error: Argument 1 passed to EventBusHooks::sendResourceChangedEvent() must be an instance of LinkTarget, Title given in /srv/mediawiki/php-1.33.0-wmf.19/extensions/EventBus/includes/EventBusHooks.php on li...
- https://phabricator.wikimedia.org/T217145
[16:24:55] Analytics, EventBus, Train Deployments: Catchable fatal error: Argument 1 passed to EventBusHooks::sendResourceChangedEvent() must be an instance of LinkTarget, Title given in /srv/mediawiki/php-1.33.0-wmf.19/extensions/EventBus/includes/EventBusHooks.php on li... - https://phabricator.wikimedia.org/T217145
[16:25:24] Analytics, EventBus, Train Deployments: Catchable fatal error: Argument 1 passed to EventBusHooks::sendResourceChangedEvent() must be an instance of LinkTarget, Title given in /srv/mediawiki/php-1.33.0-wmf.19/extensions/EventBus/includes/EventBusHooks.php on li... - https://phabricator.wikimedia.org/T217145
[16:26:43] Analytics, EventBus, Wikimedia-production-error: Catchable fatal error: Argument 1 passed to EventBusHooks::sendResourceChangedEvent() must be an instance of LinkTarget, Title given in /srv/mediawiki/php-1.33.0-wmf.19/extensions/EventBus/includes/EventBusHooks.... - https://phabricator.wikimedia.org/T217145
[16:42:31] Analytics, EventBus, Wikimedia-production-error: Catchable fatal error: Argument 1 passed to EventBusHooks::sendResourceChangedEvent() must be an instance of LinkTarget, Title given in /srv/mediawiki/php-1.33.0-wmf.19/extensions/EventBus/includes/EventBusHooks.... - https://phabricator.wikimedia.org/T217145
[16:44:13] ottomata: refactor deployed, no-op, all good!
[16:45:08] great!
[16:58:52] Analytics, Analytics-Kanban, Product-Analytics, Readers-Web-Backlog: Popups schema has the wrong type for popupDelay - https://phabricator.wikimedia.org/T217110 (Tbayer)
[17:04:13] Analytics, EventBus, Wikimedia-production-error: Catchable fatal error: Argument 1 passed to EventBusHooks::sendResourceChangedEvent() must be an instance of LinkTarget, Title given in /srv/mediawiki/php-1.33.0-wmf.19/extensions/EventBus/includes/EventBusHooks.... - https://phabricator.wikimedia.org/T217145
[17:06:54] ottomata: hi!
quick question, is it normal that the vagrant eventbus role wanted node version 10?
[17:07:33] It seems to be provisioning fine now, but I'd like to be able to recommend the correct dev setup procedure for others who will work on the thing I'm working on
[17:09:17] AndyRussG: it's new
[17:09:29] we are in the process of moving to new eventgate service
[17:09:36] so eventlogging-service-eventbus will be deprecated eventually
[17:09:41] hopefully by end of next quarter
[17:09:49] ah yeah that was the thing that was complaining I think
[17:10:05] we'd like to upgrade all mw vagrant node services to node 10
[17:10:15] but that is a trickier upgrade than just this one
[17:10:25] AndyRussG: what's the thing you're working on again?
[17:12:19] I just added the unstable Debian repo to the VM and upgraded only nodejs to 10, then added npm::node_version: 10 to puppet/hieradata/common.yaml, as per instructions in puppet/modules/eventgate/manifests/init.pp
[17:12:34] It seems to have provisioned correctly after that
[17:12:39] great
[17:12:57] if that's the right thing to do, I'll just recommend it to others on the team for when they review my code
[17:13:14] the actual thing I'm working on is a script to monitor changes in CN banners
[17:13:41] and so to get it to know which banners to watch, it needs to follow changes to CN campaign configurations
[17:13:50] Analytics, EventBus, Patch-For-Review, Wikimedia-production-error: Catchable fatal error: Argument 1 passed to EventBusHooks::sendResourceChangedEvent() must be an instance of LinkTarget, Title given in /srv/mediawiki/php-1.33.0-wmf.19/extensions/EventBus/i... - https://phabricator.wikimedia.org/T217145
[17:13:52] hence the plan to get those on Kafka
[17:14:18] right, cool
[17:14:36] AndyRussG: you plan to emit those events from mediawiki?
[17:14:50] AndyRussG: as for changes, i wouldn't expect that you have to manually add an unstable repo
[17:14:58] i would have thought that adding npm::node_version: 10 would be enough
[17:14:59] it wasn't?
[17:15:27] ottomata: it might have been enough, but I did it before adding that line, so I can't confirm 8p
[17:15:49] ottomata: yes, emitting from mw. Last week you sent me a bunch of links on how to do that, seems like it'll be pretty smooth sailing
[17:16:05] ok great right.
[17:16:06] Just adding a hook to CN for EventBus to handle
[17:16:27] AndyRussG: i wonder if we could design your event schema for eventgate instead of eventlogging-service-eventbus
[17:16:30] it is mostly the same.
[17:16:42] just a couple of schema differences
[17:16:46] ottomata: how soon would that go live to prod?
[17:17:04] AndyRussG: the 'analytics' endpoint will be live today/tomorrow
[17:17:18] and we will start emitting real mw events to it probably next week
[17:17:25] we will also have a 'main' endpoint
[17:17:33] that will function a bit more like the existing eventbus endpoint
[17:17:51] will your events be consumed by frack stuff, or just in hive/druid etc.?
[17:19:18] ottomata: hive/druid actually not needed at all
[17:19:26] I imagine the script will run on the FR cluster
[17:19:51] ok
[17:19:55] So I was imagining a kafka consumer
[17:20:01] in that case stick to eventbus for now
[17:20:04] and we'll migrate with everything else later
[17:20:18] ah ok, happy to try the new thing if it's useful to you btw!
[17:20:26] i guess it depends on when you deploy!
[17:20:34] if it is next quarter, then let's use new thing
[17:20:39] if this, probably better to just wait
[17:20:46] hmmm
[17:20:54] we will have to do a migration for all the eventbus style events anyway, so +1 for your new one won't really matter
[17:21:06] yeah hopefully this quarter, not certain though
[17:21:16] k, i say proceed with eventbus style
[17:21:18] would the emitting code in MW be basically the same?
[17:21:21] yes
[17:21:28] there'd be one or two fields you'd have to set differently
[17:21:28] ok cool
[17:21:40] and a different 'endpoint' when doing EventBus::getInstance
[17:21:48] k sounds pretty smooth
[17:21:54] e.g. EventBus::getInstance('eventgate-main') instead of EventBus::getInstance('eventbus')
[17:22:04] ah right
[17:22:04] aside from that it's the same
[17:22:29] I'll hopefully be working on documentation over the next few weeks too, so that will make it easier to do later
[17:22:30] I'll keep you posted and pls don't hesitate to reach out any time you'd like us to guinea pig eh!
[17:22:35] k great, thanks!
[17:22:41] likewise!!! :)
[17:24:38] Analytics, Analytics-Kanban, Product-Analytics: Superset's rolling average feature results in error message - https://phabricator.wikimedia.org/T213488 (elukey) >>! In T213488#4984851, @jlinehan wrote: > @elukey Sure, but the reason for the patch being offered in the first place was due to the uncert...
[17:30:14] * elukey ofF!
[17:37:01] (PS1) GoranSMilovanovic: ETL procedures [analytics/wmde/WDCM] - https://gerrit.wikimedia.org/r/493082
[17:37:18] (CR) GoranSMilovanovic: [V: +2 C: +2] ETL procedures [analytics/wmde/WDCM] - https://gerrit.wikimedia.org/r/493082 (owner: GoranSMilovanovic)
[17:37:24] (Merged) jenkins-bot: ETL procedures [analytics/wmde/WDCM] - https://gerrit.wikimedia.org/r/493082 (owner: GoranSMilovanovic)
[17:38:55] Analytics, WMDE-Analytics-Engineering: Pyspark2 fails to read.csv when run with spark2-submit - https://phabricator.wikimedia.org/T217156 (GoranSMilovanovic)
[17:52:05] (PS1) WMDE-Fisch: Update userprop scripts to new db setup [analytics/wmde/scripts] - https://gerrit.wikimedia.org/r/493086 (https://phabricator.wikimedia.org/T216613)
[17:55:17] So I'm able to tail eventbus.log fine, it shows events.
However, run from within the VM, kafkacat -C -b localhost:9092 -t mediawiki.revision-create (for example) says, "% ERROR: Topic mediawiki.revision-create error: Broker: Leader not available"
[17:55:24] any ideas?
[17:55:39] AndyRussG: the topic is prefixed
[17:55:40] try
[17:55:46] -t datacenter1.mediawiki.revision-create
[17:56:17] ottomata: yeee successs! thx!!!!
[17:58:36] Analytics, WMDE-Analytics-Engineering: Pyspark2 fails to read.csv when run with spark2-submit - https://phabricator.wikimedia.org/T217156 (GoranSMilovanovic) @JAllemandou Tell me that it's me not escaping something necessary to escape when deploying a pyspark2 script with `spark2-submit`?
[18:10:27] Analytics, WMDE-Analytics-Engineering: Pyspark2 fails to read.csv when run with spark2-submit - https://phabricator.wikimedia.org/T217156 (Ottomata) You are deploying in yarn mode, which runs on the Hadoop cluster, not locally, so your local files will not exist on the remote executor. You can provide t...
[18:11:25] Analytics, EventBus, Wikimedia-production-error: ConfigException from line 339 of /srv/mediawiki/php-1.33.0-wmf.19/extensions/EventBus/includes/EventBus.php: EventBus::getInstance requires a configured $eventServiceName - https://phabricator.wikimedia.org/T217146 (Aklapper)
[18:16:27] Analytics, EventBus, Wikimedia-production-error: ConfigException from line 339 of /srv/mediawiki/php-1.33.0-wmf.19/extensions/EventBus/includes/EventBus.php: EventBus::getInstance requires a configured $eventServiceName - https://phabricator.wikimedia.org/T217146 (Jdlrobson)
[18:17:01] Analytics, EventBus, Wikimedia-production-error: ConfigException from line 339 of /srv/mediawiki/php-1.33.0-wmf.19/extensions/EventBus/includes/EventBus.php: EventBus::getInstance requires a configured $eventServiceName - https://phabricator.wikimedia.org/T217146 (Jdlrobson)
[18:17:18] Analytics, EventBus, Wikimedia-production-error: ConfigException from line 339 of
/srv/mediawiki/php-1.33.0-wmf.19/extensions/EventBus/includes/EventBus.php: EventBus::getInstance requires a configured $eventServiceName - https://phabricator.wikimedia.org/T217146 (Jdlrobson) duplicate→Resolved...
[18:49:59] Hey team - back online
[19:00:21] heya joal :]
[19:33:22] Analytics, EventBus, Wikimedia-production-error: Catchable fatal error: Argument 1 passed to EventBusHooks::sendResourceChangedEvent() must be an instance of LinkTarget, Title given in /srv/mediawiki/php-1.33.0-wmf.19/extensions/EventBus/includes/EventBusHooks.... - https://phabricator.wikimedia.org/T217145
[19:35:02] Analytics, WMDE-Analytics-Engineering: Pyspark2 fails to read.csv when run with spark2-submit - https://phabricator.wikimedia.org/T217156 (GoranSMilovanovic) @Ottomata Thank you, Andrew. One thing: does this have anything to do with any of the recent changes in our Spark (if I remember correctly, Spark 1....
[19:38:32] Analytics, WMDE-Analytics-Engineering: Pyspark2 fails to read.csv when run with spark2-submit - https://phabricator.wikimedia.org/T217156 (Ottomata) No don't think so, I would expect you'd always have to do this. Were you for sure running with `--master yarn` before?
[19:46:33] Analytics, MediaWiki-Database, Research, Wikidata: Improve interlingual links across wikis through Wikidata IDs - https://phabricator.wikimedia.org/T215616 (JAllemandou) Hi @Isaac Sorry for the issue. I corrected the query above (last query, join criteria: `AND ws.sitelink.title = title_namespace...
[20:26:19] Analytics, EventBus, Wikimedia-production-error: Catchable fatal error: Argument 1 passed to EventBusHooks::sendResourceChangedEvent() must be an instance of LinkTarget, Title given in /srv/mediawiki/php-1.33.0-wmf.19/extensions/EventBus/includes/EventBusHooks....
- https://phabricator.wikimedia.org/T217145
[20:28:45] Analytics, EventBus, Wikimedia-production-error: Catchable fatal error: Argument 1 passed to EventBusHooks::sendResourceChangedEvent() must be an instance of LinkTarget, Title given in /srv/mediawiki/php-1.33.0-wmf.19/extensions/EventBus/includes/EventBusHooks.... - https://phabricator.wikimedia.org/T217145
[20:29:59] Analytics, EventBus, Wikimedia-production-error: ConfigException from line 339 of /srv/mediawiki/php-1.33.0-wmf.19/extensions/EventBus/includes/EventBus.php: EventBus::getInstance requires a configured $eventServiceName - https://phabricator.wikimedia.org/T217146 (hashar) a:Pchelolo Thank you :-)
[20:39:36] (PS3) Ottomata: Event(Logging) schema loader [analytics/refinery/source] - https://gerrit.wikimedia.org/r/492399 (https://phabricator.wikimedia.org/T215442)
[21:18:41] (PS4) Ottomata: Event(Logging) schema loader [analytics/refinery/source] - https://gerrit.wikimedia.org/r/492399 (https://phabricator.wikimedia.org/T215442)
[21:25:47] (CR) Ottomata: [C: +1] "One comment on types but looks great!" (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/492756 (https://phabricator.wikimedia.org/T215442) (owner: Joal)
[21:27:34] Analytics, MediaWiki-Database, Research, Wikidata: Improve interlingual links across wikis through Wikidata IDs - https://phabricator.wikimedia.org/T215616 (Isaac) Hey @JAllemandou - this is great! thanks for catching that - looks all good to me now too.
[21:39:48] I suggested to someone to contact you all on the analytics public list, and I hear from them that http://news.gmane.org/gmane.org.wikimedia.analytics/ is broken. Should this page link to one of your existing pages?
[21:39:51] (I'm confused)
[21:51:01] (CR) Nuria: Event(Logging) schema loader (4 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/492399 (https://phabricator.wikimedia.org/T215442) (owner: Ottomata)
[21:51:22] ottomata: yt?
[21:51:48] leila: ?
[21:52:01] nuria: ya
[21:52:18] nuria: ?
[21:52:22] ottomata: i .. ahem.. posted an opinionated review to your eL schemas patch
[21:52:33] oh boy!
[21:52:44] leila: you have two messages above here that me no comporedou
[21:52:57] nuria retro? :)
[21:53:06] ottomata: for real?
[21:53:28] nuria: re the first one, now I understand myself what I was saying.
[21:54:00] nuria: basically, https://lists.wikimedia.org/mailman/listinfo/analytics is linking people to a page on news.gmane.org that people can post from to the list, but that link doesn't work
[21:54:44] ottomata: i think we should not do static classes for this functionality but also I think we need a class that is concerned with specifics of EL
[21:54:52] leila: ah, ya, no idea who manages that
[21:54:55] nuria: is http://news.gmane.org/gmane.org.wikimedia.analytics/ still active? If not, you should consider updating the https://lists.wikimedia.org/mailman/listinfo/analytics page to reflect it.
[21:55:07] nuria: then remove it, please. :)
[21:55:08] nuria: ya we might need specific class, really i am just trying to keep it simple in functions only
[21:56:09] (CR) Ottomata: Event(Logging) schema loader (2 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/492399 (https://phabricator.wikimedia.org/T215442) (owner: Ottomata)
[21:57:20] ottomata: I think static classes make things harder rather than easier because there are no functions in java, just classes. And static classes are very different from functions
[21:57:38] very different or just different?
:p
[21:58:16] ottomata: very different, they require initialization from class loader, cannot implement interfaces and are not easily testable
[21:58:38] that's different than classes
[21:58:43] not different than functions :p
[21:58:51] explain not easily testable?
[21:58:53] ottomata: https://martinfowler.com/bliki/StaticSubstitution.html
[21:59:57] (btw, i'm gonna argue for fun, I'm not opposed to singleton)
[22:00:02] yaya, but i'm not doing that.
[22:00:09] i'm not trying to make a static class that reads any configs
[22:00:22] ALL functionality here is providable via static method params.
[22:00:34] which makes this class really just a function namespace
[22:01:41] ottomata: mmm, no it makes this class hardly modifiable. I could say ELFetcher("http://test.domain") or ELFetcher("/my/machine/") if i had access to instantiation
[22:01:59] you can still do that, no?
[22:02:19] getEventLoggingSchemaUri(name, revision, "http://test.domain")
[22:02:50] ottomata: in the methods yes, but i am specifying a "state" for the static class disguised as a parameter to a static method (not a function)
[22:03:29] ottomata: substitution for testing is much more easily done in singletons than statics
[22:03:47] ottomata: and the link i pasted explains it better than i just did
[22:04:36] ottomata: also singletons allow for better private/public split of methods
[22:04:52] nuria i read link, i still don't buy it.
[22:04:58] maybe MAYBE because I have this static cache.
[22:05:15] ottomata: which plays well with lifecycle (the cache)
[22:05:15] but, for the sake of this argument lets pretend I don't (I just added it in the last patch or two)
[22:05:25] if i didn't have this cache
[22:05:38] these static methods are equivalent to immutable functions
[22:05:51] which are the easiest thing to test and reuse
[22:06:42] ottomata: mmm no, you have a static initializer which can only exist in a lifecycle
[22:06:56] oh you mean the code block?
[22:06:59] ottomata: supported by the classloader loading the calss [22:07:02] *class [22:07:40] ottomata: yes, it is stateful but in a less obvious way than if it was a singleton [22:07:58] ottomata: the static initializer is used as a "constructor" [22:08:48] ottomata: and if i wanted to use that class with a different capsule i could not cause there is no way to do all the same but leave capsule to be different, makes sense? [22:10:41] nuria: I very much hope there will be no different capsule [22:10:51] burt ya, i agree that perhaps that stuff should be in a different EL specific class [22:10:59] ottomata: let's say "absence" of acapsule [22:11:01] but ok, for the sake of this argument then [22:11:06] nuria absence? [22:11:14] if you don't want capsule [22:11:15] ottomata: schemas without capsule [22:11:17] use getJsonSchema() [22:11:28] instead of getEventLoggingSchema() [22:11:40] ottomata:then you should not need to have loaded all that capsule code in teh same class, righttttt???? [22:11:56] ah i see, sure! [22:12:50] ottomata: that is where having a singleton with one EL specific implementation that deals with capsules helps (it also fits well with cache initialilization cause you need to plugin that into the lifecycle) [22:12:56] hmm nuria isn't that going to have to happen anyway, even if there was a different EL class with a singleton? [22:13:03] private static AddressBook soleInstance = new AddressBook(); [22:13:07] e.g. [22:13:17] private static ELLoader soleInstance = new ELLoader(); [22:13:34] but the ELLoader constructor is just going to do the same stuff to build the EVENTLOGGING_CAPSULE_SCHEMA [22:13:41] so its going to be called by class loader anyway, no? [22:13:45] in the link you sent me: [22:13:55] In particular with this code we have to create an instance of the actual service even if we never use it - because the sole instance is initialized in a static initializer. 
[22:13:59] "In particular with this code we have to create an instance of the actual service even if we never use it - because the sole instance is initialized in a static initializer." [22:14:40] ottomata: when it gets instantiated yes [22:14:59] ottomata: but if iam retrieving another schema (no EL) it does not get instatiated [22:15:07] *instantiated [22:15:49] hmmm, i think it does [22:15:50] doesn't it? [22:15:57] if the class has private static final MaxmindDatabaseReaderFactory instance = new MaxmindDatabaseReaderFactory(); [22:16:00] in the class body [22:16:04] it gets instantiated by loader, no? [22:16:15] ottomata: also functions are self contained and in this case all the methods are accessing variables external to them [22:16:53] ottomata: if you set up instatiation like that yes it would [22:16:53] 10Analytics, 10Scoring-platform-team: [Discuss] ORES model development and deployment processes - https://phabricator.wikimedia.org/T216246 (10Halfak) Meeting scheduled for Thursday, Feb 28th @ 1630UTC. I've preemptively made an a notes document here: https://etherpad.wikimedia.org/p/ores_usecases_for_ml_infr... [22:17:49] nuria: that's the example you sent me! [22:17:53] what's the other way of doing it? [22:17:59] ottomata: let me try to summarize, if we want code to be truly functional functions do not access variables that exist outside their scope. Java is not well suited for that because there is class scope but not function scope (traditional no lambdas, etc) [22:18:59] 10Analytics, 10Scoring-platform-team (Current): [Discuss] ORES model development and deployment processes - https://phabricator.wikimedia.org/T216246 (10Halfak) p:05Triage→03Normal a:03Halfak [22:19:15] nuria aye, i see that, but i think thats a bit beside your point. Ok sure, the functions are accessing some constants. but those constants themselves are immutable. 
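The question above (does a `private static final ... instance = new ...()` field get built even if the instance is never used?) can be checked with a small experiment. This is a minimal sketch, not code from the patch under review: every class name here is hypothetical. It relies on the Java rule that static field initializers and `static {}` blocks both run when a class is initialized, which happens on first active use such as calling any of its static methods. The third class shows the holder idiom, swapped in here as one common way to defer construction until `getInstance()` is actually called.

```java
import java.util.ArrayList;
import java.util.List;

public class InitOrderSketch {
    // Records which initialization code has run, in order.
    static final List<String> LOG = new ArrayList<>();

    /** Static style: state is built in a static block at class initialization. */
    static final class StaticBlockStyle {
        static final String CAPSULE;
        static {
            LOG.add("static block");
            CAPSULE = "capsule";
        }
        static String capsule() { return CAPSULE; }
    }

    /** Eager singleton: the constructor also runs at class initialization. */
    static final class EagerSingleton {
        private static final EagerSingleton INSTANCE = new EagerSingleton();
        private EagerSingleton() { LOG.add("eager constructor"); }
        static EagerSingleton getInstance() { return INSTANCE; }
        // Calling this initializes the class, so the constructor runs anyway.
        static String unrelated() { return "no instance needed"; }
    }

    /** Lazy singleton (holder idiom): constructor runs on first getInstance(). */
    static final class LazySingleton {
        private LazySingleton() { LOG.add("lazy constructor"); }
        private static final class Holder {
            static final LazySingleton INSTANCE = new LazySingleton();
        }
        static LazySingleton getInstance() { return Holder.INSTANCE; }
        // Calling this does not touch Holder, so nothing is constructed.
        static String unrelated() { return "no instance needed"; }
    }

    public static void main(String[] args) {
        StaticBlockStyle.capsule();
        EagerSingleton.unrelated();
        LazySingleton.unrelated();
        System.out.println("before getInstance: " + LOG);
        LazySingleton.getInstance();
        System.out.println("after getInstance:  " + LOG);
    }
}
```

Under these assumptions the eager singleton and the static block behave the same way with respect to timing; only the holder variant delays the work.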
[22:19:24] ottomata: static classes make testing difficult because they do not allow for easy substitution for testing (because static methods are accessing static variables) [22:19:32] you are trying to convince me that the static stuff doesn't get loaded by classloader if you use a singleton [22:19:35] and I think it does! [22:19:48] ottomata: no, sorry. it loads [22:20:07] so the static initialiation block is equiavalent to static loading of a singleton [22:20:13] ottomata: what i am saying is that this static code is using the class loader setting of the static block [22:20:22] as a way to set up state [22:20:45] but its the same thing as doing static singleton = new Singleton [22:20:51] if the Singleton constructor builds the same stuff [22:20:56] as in the static {} block [22:21:33] ottomata: not the same lifecycle wise cause a singleton is an object (and a such mean to carry state) [22:21:50] what do you mean lifecyclewise? [22:21:58] ottomata: a static class is just a "collection of methods" that access some constants [22:23:26] ottomata: lifecycle cause it uses a constructor which is made for this purpose [22:23:30] all i can see the singleton does is move the constants from static scope to instance scope. 
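The substitution argument in this exchange can be made concrete with a minimal sketch. All names below (`SchemaUriResolver`, `StaticSchemaUris`, `EventLoggingResolver`) and the `base/name/revision` URI layout are hypothetical illustrations, not the classes from the patch. It shows the two styles side by side: a static method that takes everything as parameters, and a singleton whose state lives in an instance and which implements an interface, so a caller can be handed a stub in tests.

```java
public class StaticVsSingletonSketch {

    /** Hypothetical interface; a static-only class cannot implement one. */
    interface SchemaUriResolver {
        String schemaUri(String name, int revision);
    }

    /** Static style: a namespace of functions, all inputs passed as parameters. */
    static final class StaticSchemaUris {
        private StaticSchemaUris() {}

        static String schemaUri(String name, int revision, String baseUri) {
            return baseUri + "/" + name + "/" + revision;
        }
    }

    /** Singleton style: the base URI moves from a parameter to instance state. */
    static final class EventLoggingResolver implements SchemaUriResolver {
        private static final EventLoggingResolver INSTANCE =
            new EventLoggingResolver("http://test.domain");

        private final String baseUri;

        EventLoggingResolver(String baseUri) { this.baseUri = baseUri; }

        static EventLoggingResolver getInstance() { return INSTANCE; }

        @Override
        public String schemaUri(String name, int revision) {
            return baseUri + "/" + name + "/" + revision;
        }
    }

    /** A caller that depends only on the interface, so any resolver fits. */
    static String describe(SchemaUriResolver resolver, String name, int revision) {
        return "schema at " + resolver.schemaUri(name, revision);
    }

    public static void main(String[] args) {
        // Static call: the "state" travels as an argument.
        System.out.println(StaticSchemaUris.schemaUri("Popups", 1, "http://test.domain"));
        // Singleton behind an interface.
        System.out.println(describe(EventLoggingResolver.getInstance(), "Popups", 1));
        // Test substitution: a lambda stands in for the real resolver.
        System.out.println(describe((n, r) -> "stub://" + n + "/" + r, "Popups", 1));
    }
}
```

Both sides of the argument show up here: the static method is trivially testable as long as it stays a pure function of its parameters, while the interface is what lets a caller swap implementations without knowing which one it got.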
[22:23:46] yeah, but the constructor is called at the same time for the singleton as the static constant initialization [22:23:51] ottomata: and has ability to implement an interface [22:24:16] heheh nuria, maybe i'll move this to a Scala object and then you won't care but it will be exactly the same thing :p [22:25:14] ottomata: in scala there are functions but not in java, if you build this in scala without using any "var" ya, i buy it [22:25:42] ya, everything here is with static final, which is kinda like val [22:26:02] scala val [22:26:07] ottomata: mmm, not at all [22:27:26] hehm, actually nuria in scala an object is kind of sugar for a singleton :p [22:28:55] ottomata: it is not that i love singletons (really, i do not) but I like less static classes with a bunch of methods that access static variables with static initializers that cannot implement interfaces [22:29:18] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Move AQS to nodejs 10 - https://phabricator.wikimedia.org/T210706 (10Milimetric) Ok, @elukey I loaded fake data in the deployment cluster and verified AQS is working well. Base case done. You can do the `profile::aqs::use_nodejs10: true` magic. And I'll... [22:30:10] :p [22:30:11] https://stackoverflow.com/a/1757691 [22:32:21] anyway nuria will consider, probably a good idea especially since my cache is a little extra funky in static context [22:32:30] def should have separate class for EL stuff [22:32:40] i'll put WIP back on the commit message [22:32:52] ottomata: i can do changes if you want later on today [22:33:15] nuria: sure have at it! :) [22:33:33] nuria: btw not sure if you saw [22:33:35] this will be used with https://gerrit.wikimedia.org/r/#/c/analytics/refinery/source/+/492756/ [22:33:57] next we just need a glue between the two things that gets the schemaURI from an event.
[22:34:04] that's gonna be a little funky, but we'll find the right place [22:34:38] ottomata: ooohhhh nice [22:36:30] nuria: we might want to add a function to these class(es) that takes either an event JSON String (or ObjectNode) and the schemaURIField and returns the schema [22:36:50] we'll have to figure out how to load just a single event from some hourly data, but if we get that then we can use it to load the schema [22:37:38] thanks nuria! i'm off for the day ttyt [22:39:37] ottomata: super thanks
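The glue function proposed above (event plus schemaURIField in, schema location out) might look roughly like the sketch below. Several things here are assumptions made for illustration: the event is represented as nested `Map`s, a dependency-free stand-in for a parsed JSON `ObjectNode`; the field is addressed with a dot-separated path; and the field name `meta.schema_uri` and the value `Popups/1` are made up. The sketch only extracts the URI string; loading the schema itself is left to whatever loader the classes end up providing.

```java
import java.util.Map;
import java.util.Optional;

public class SchemaUriFromEventSketch {

    /**
     * Follows a dot-separated path (hypothetical convention) through nested
     * maps and returns the string found there, or empty if any step of the
     * path is missing or the final value is not a string.
     */
    static Optional<String> schemaUri(Map<String, Object> event, String schemaUriField) {
        Object current = event;
        for (String key : schemaUriField.split("\\.")) {
            if (!(current instanceof Map)) {
                return Optional.empty();
            }
            current = ((Map<?, ?>) current).get(key);
        }
        return current instanceof String
            ? Optional.of((String) current)
            : Optional.empty();
    }

    public static void main(String[] args) {
        // Made-up event shape, standing in for one event read from hourly data.
        Map<String, Object> event = Map.of(
            "meta", Map.of("schema_uri", "Popups/1"),
            "popupDelay", 500
        );
        System.out.println(schemaUri(event, "meta.schema_uri"));
        System.out.println(schemaUri(event, "missing.field"));
    }
}
```

Returning `Optional` rather than throwing keeps the decision about events without a schema URI with the caller, which seems to fit the open question in the chat about where this glue should live.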