[01:24:10] :)
[06:59:53] (PS1) Matthias Mullie: Fix report name [analytics/limn-multimedia-data] - https://gerrit.wikimedia.org/r/349379
[08:20:07] hello folks :)
[08:20:27] I am going to restart aqs (nodejs) on our aqs* cluster for security upgrades
[08:20:33] if you see anything weird please let me know
[08:29:39] joal: found something amazing this morning - sudo cumin 'R:Class = role::aqs and not aqs1004*' 'restart-aqs' -b 1 -s 120
[08:29:49] (1004 was restarted manually)
[08:30:14] we have a script (restart-aqs) that depools the host from pybal before restarting, and then repools it
[08:30:41] and cumin executes the script on one host at a time every 120 seconds
[08:51:44] ooo hello elukey it was lonely here :)
[08:53:02] o/
[09:01:45] (CR) DCausse: "thanks!" (3 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/345863 (owner: DCausse)
[09:14:01] elukey: From n
[09:14:17] elukey: (again) From now on I'll call SpiceMan
[09:14:22] elukey do you have a moment to talk in the batcave about EL?
[09:19:12] fdans: I surely have a moment but EL is not among the things that I know a lot about :(
[09:20:07] elukey: don't worry then, I'll ask someone else a bit later, I can continue on my other stuff :)
[09:21:55] fdans: anything in particular that I can check?
[09:25:54] elukey: it's related to how event errors are persisted as opposed to event capsules
[09:26:29] I'm adding a new schema for automated requests and I'm wondering if I need a specific writer for it
[09:28:54] no idea sorry :(
[09:38:29] Monsieur fdans-san, I've just quickly checked the dataset generated with your new field: it seems very correct :D
[09:38:35] This is a WIN !
[09:39:48] fdans: --^
[09:40:06] fdans: I'm assuming it doesn't ping you when I name you fdans-san :D
[09:44:00] joal YESSSSSSSS
[09:44:06] awesome
[10:05:53] * elukey early lunch!
[10:35:48] !log restart pivot for nodejs security upgrades
[10:35:50] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[10:49:13] Analytics-Kanban: Investigate rise in IE views from Pakistan since 2015 - https://phabricator.wikimedia.org/T157404#3201066 (fdans) A few more pointers: The UA string is used across a few hundred IP addresses in the lapse of an hour. It would be interesting to expand the query to the whole day and see if th...
[10:54:05] (PS2) Joal: Update restbase oozie spark job [analytics/refinery] - https://gerrit.wikimedia.org/r/349266 (https://phabricator.wikimedia.org/T163479)
[11:27:47] hellooo team :]
[11:45:36] mforns: o/
[11:45:55] konichiwa!
[11:46:59] restarting for updates...
[13:30:35] taking a break a-team
[13:30:42] o/
[13:46:28] ooh, joal the job finished
[13:51:53] !log set innodb_flush_log_at_trx_commit = 0 and sync_binlog = 300 on bohrium's mysql
[13:51:54] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[14:30:44] (PS11) Ottomata: [WIP] Spark + JSON -> Hive [analytics/refinery/source] - https://gerrit.wikimedia.org/r/346291 (https://phabricator.wikimedia.org/T161924) (owner: Joal)
[14:31:02] elukey: do you know what cache-control params you think we should try off the top of your head or have any docs to point me to? I'm gonna submit a patch to https://github.com/wikimedia/puppet/blob/fe9a43d4098901f00219a92efa700022565b7fbd/modules/statistics/templates/analytics.wikimedia.org.erb
[14:37:35] hm, another thought: what if we put a little "purge caches" toggle button on dashiki and if you flip it on it adds a "no-cache" header.
Checking to see what that would do
[14:41:57] Analytics-Kanban: Getting different versions of the same file - https://phabricator.wikimedia.org/T163338#3201548 (elukey) We could simply lower down Cache-Control's max-age to one hour, in order to force Varnish to check after that time (upon receiving a request) if the content has changed or not (If-Modifi...
[14:42:25] doh you can't do that from the client
[14:46:29] milimetric: o/ - I was writing in the task and didn't see your messages :)
[14:47:21] joal hiiiiiii if you have a few mins before standup, would love to talk some implicits :)
[14:48:10] Analytics-Kanban: Getting different versions of the same file - https://phabricator.wikimedia.org/T163338#3201563 (Milimetric) yeah, I mean 1 hour caches are fine, accomplish some of what we want. It's really weird though, with Cache-Control everything seems possible except for the one way that I figure eve...
[14:48:26] elukey: better in the task, replied there
[14:48:33] I'll submit patch with 1 hour change
[14:49:20] milimetric: or maybe more, I lost track of the use case(s) that required a fresh cache
[14:50:28] something that is usually done is to add a version to the static contents, like thisisafilethatwillnotchange_1.txt, and then cache it for say a year. Then you change it, and you call it thisisafilethatwillnotchange_2.txt
[14:50:41] I mean ideally we'd always want fresh data, because if the update happens right after varnish caches, we have stale data for two days
[14:50:41] but it is not really flexible for our use case :)
[14:51:04] milimetric: one day maximum no?
[14:51:07] yeah, seems unnecessarily complicated though, like, cache-control is just busted
[14:51:07] why two?
[14:51:23] say something computes on 1/1
[14:51:26] then again on 1/2
[14:51:44] varnish caches the 1/1 version on 1/1 and 1 second
[14:52:10] serves that, caches again on 1/2 and 1 second, but the 1/2 computation takes a little longer so it's done on 1/2 and 3 seconds
[14:52:31] so then users see the 1/1 version until 1/3 when they see the 1/2 version, and so on always behind 1 day
[14:52:52] depending how you count, I was thinking 1/1 was already stale on 1/1 and 1 hour
[14:52:54] but yeah, 1 day
[14:54:04] "cache for 300 days, but if a request comes in and at least 1 hour has passed since a cached response, check if the content changed."
[14:54:18] this is == to setting max-age to 1 hour and re-validating
[14:54:27] or maybe I am not getting your point
[14:54:48] (with ETag et al., if the file has not changed it will not be retrieved)
[14:55:44] talk in bc?
[14:55:56] sure 1 sec :)
[14:58:03] elukey: all our static content is versioned, that is not an issue
[14:58:14] elukey: the data however is not
[15:02:03] ottomata: after standup for implicits?
[15:03:20] ya
[15:08:13] ping fdans
[15:30:23] joal: i think i figured out my implicits
[15:30:34] i thought i had some weird scoping stuff, but i had just used them wrong
[16:02:03] joal: I manually applied the change to Pivot.w.o, seems good and the syntax checks are fine
[16:02:08] can you double check? After that I'll merge
[16:02:12] great elukey !
[16:02:32] Looks good on UI elukey :)
[16:03:20] merged :)
[16:15:50] milimetric: change merged :)
[16:15:59] I think that FileETag is by default enabled
[16:16:05] this is why we have ETag :)
[16:16:23] let's see how it goes now
[16:17:42] just verified that the new cache-control header is returned
[16:17:43] all fine
[16:17:44] :)
[16:47:08] elukey: do we know what resources the etag is calculated on?
[16:47:19] elukey: otherwise those two might not match ...right?
[16:48:07] nuria: if I got it correctly should be mtime by default
[16:49:02] should be FileETag https://httpd.apache.org/docs/2.4/mod/core.html#fileetag
[16:49:13] Default: FileETag MTime Size
[16:52:00] theoretically Varnish should save it rather than re-calculate, and then send it back to thorium for validation
[16:52:10] but I could be very wrong
[16:52:26] (but it would make everything easier)
[16:55:38] going afk for the weekend people :)
[16:55:39] o/
[16:56:10] nuria: let me know later on what is your doubt, I might have missed your point sorry :(
[16:56:17] bye team!
[16:56:20] elukey: on meeting, can talk later
[16:56:32] elukey: no worries, will add to ticket
[17:44:46] Analytics-Kanban: Pre-generate mysql ORM code for sqoop - https://phabricator.wikimedia.org/T143119#3202074 (Nuria) Ping @Milimetric lower priority than our design but if you feel you need to grab an item you could do this one
[17:45:41] Analytics-Kanban, Patch-For-Review: Add daily unique devices dataset to pivot - https://phabricator.wikimedia.org/T159471#3202076 (Nuria) Friendly reminder that before we can close this item the pivot splash screen needs to have a link to this dataset
[17:55:28] Analytics-Kanban, Patch-For-Review: Getting different versions of the same file - https://phabricator.wikimedia.org/T163338#3202091 (Nuria) Reading this: https://httpd.apache.org/docs/2.4/mod/core.html#fileetag looks like ETag is calculated per resource served. Since statics for dashiki are versioned we...
[18:16:59] ottomata: just sent the email update! Thanks :)
[18:17:13] danke!
[18:19:20] bye team, have a nice weekend!
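The caching behaviour discussed above (a one-hour max-age, then revalidation with an ETag derived from mtime and size) can be sketched roughly as follows. This is a toy model under stated assumptions, not how Varnish or Apache are implemented: `Origin`, `Cache`, `make_etag` and the validator format are all hypothetical names invented for the illustration, and time is passed in explicitly as seconds.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


def make_etag(mtime: int, size: int) -> str:
    """Build a validator from modification time and size (illustrative format only)."""
    return f'"{mtime:x}-{size:x}"'


@dataclass
class Origin:
    """Hypothetical stand-in for the web server holding the data file."""
    body: bytes
    mtime: int
    full_responses: int = 0

    def update(self, body: bytes, now: int) -> None:
        self.body, self.mtime = body, now

    def get(self, if_none_match: Optional[str] = None) -> Tuple[int, Optional[bytes], str]:
        etag = make_etag(self.mtime, len(self.body))
        if if_none_match == etag:
            return 304, None, etag  # unchanged: no body is transferred
        self.full_responses += 1
        return 200, self.body, etag


class Cache:
    """Hypothetical stand-in for the caching layer in front of the origin."""

    def __init__(self, origin: Origin, max_age: int) -> None:
        self.origin, self.max_age = origin, max_age
        self.body: Optional[bytes] = None
        self.etag: Optional[str] = None
        self.stored_at = 0

    def get(self, now: int) -> Optional[bytes]:
        fresh = self.body is not None and now - self.stored_at < self.max_age
        if not fresh:
            # Expired (or empty): ask the origin, sending the stored validator.
            status, body, etag = self.origin.get(self.etag)
            if status == 200:
                self.body, self.etag = body, etag
            self.stored_at = now  # a 304 also restarts the freshness window
        return self.body


HOUR = 3600
origin = Origin(body=b"data v1", mtime=0)
cache = Cache(origin, max_age=HOUR)

first = cache.get(now=1)                 # empty cache: full fetch
origin.update(b"data v2!", now=3)        # data recomputed right after being cached
stale = cache.get(now=10)                # still within max-age: old copy is served
updated = cache.get(now=1 + HOUR)        # max-age expired: validator differs, new body
unchanged = cache.get(now=1 + 2 * HOUR)  # expired again: validator matches, body reused
```

With a one-hour max-age the stale window in this model is bounded by an hour rather than a day, and the last request costs only a validator comparison instead of a second full transfer.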
[18:46:10] Analytics-EventLogging, Analytics-Kanban: Research Spike: Better support for Eventlogging data on hive - https://phabricator.wikimedia.org/T153328#3202201 (Nuria)
[18:46:12] Analytics-EventLogging, Analytics-Kanban, Patch-For-Review: Write Spark schema differ / Hive DDL generator - https://phabricator.wikimedia.org/T161924#3202200 (Nuria) Open>Resolved
[18:51:13] byye!
[19:10:16] Analytics-Kanban, Patch-For-Review: Add zero carrier to pageview_hourly data on druid - https://phabricator.wikimedia.org/T161824#3144399 (Nuria) I do not see the zero carrier yet, maybe we need to restart the job?
[19:12:24] joal: quick scala q:
[19:12:30] i want to do a negative case match
[19:12:33] like
[19:12:43] fields.find where dataType != StructType
[19:13:10] i can do this like
[19:13:15] field.dataType.typeName != "struct"
[19:13:21] but that kinda sucks
[19:13:30] i feel like I should be able to do it cooler with a case pattern match somehow
[19:13:51] fields.find {
[19:13:51] case !StructType(_) => true
[19:13:51] }
[19:13:56] but that doesn't work
[19:20:09] ottomata: and you only have 1 option
[19:20:32] as is " x not typeof StructType"
[19:20:54] yes
[19:20:56] well, for now, but yet
[19:22:31] yes
[19:23:25] I've actually done all I need to do, but I do it two different ways in different places, and neither way I do it feels very scala-y
[19:23:33] it's ok, nuria, i'll just keep cleaning and work on this in review
[19:34:10] Hey ottomata
[19:34:16] still here ?
[19:35:00] ya, I'm getting something simpler than what I was trying but still don't really like the fact that I'm comparing typeName string
[19:35:07] joal: say I have a Seq[StructField]
[19:35:26] and I want to return true if any of those StructFields' dataType is not a StructType
[19:35:49] fields.exists(f => dataType != StructType)
[19:36:04] now i do dataType.typeName != "struct"
[19:36:08] which i guess is fine
[19:36:46] hm
[19:37:28] Analytics-Tech-community-metrics, Developer-Relations, Epic: Visualization/data regressions after moving from korma.wmflabs.org to wikimedia.biterg.io - https://phabricator.wikimedia.org/T137997#3202358 (Aklapper) a:Lcanasdiaz>None
[19:40:28] jo al: if you are not working, don't worry about it, i'm just touching things up
[19:40:38] i'm sure there will be tons of stuff to help me clean up in review
[19:41:45] ottomata: thinking about it now
[19:42:25] ottomata: Something like that: https://gist.github.com/jobar/9f621173d5ddcbd1ad28756d97bbc081 ?
[19:44:02] this is so simple, i feel like i tried this.............
[19:44:05] trying...
[19:44:05] haha
[19:46:54] ottomata: works?
[19:47:52] i think so! at least the IDE is happy! :)
[19:47:57] huhu
[19:47:58] :)
[19:47:58] moving things around will run tests in a sec
[19:48:02] sure
[19:54:09] joal: looking good, getting cleaner all the time!
[19:54:14] yay :)
[19:54:25] * joal loves clean functional code
[21:13:36] Analytics, Fundraising-Backlog, MediaWiki-extensions-CentralNotice: Make banner impression counts available somewhere public - https://phabricator.wikimedia.org/T115042#3202804 (DStrine) @Nuria I'm wondering if we can meet and talk about options. Community members need to know a few base things like:...
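For the 19:35 question (return true if any field's dataType is not a StructType), the underlying idea is to test the type of the value rather than compare its typeName string. The contents of the linked gist are not shown in the log, so the following is only an illustrative sketch of that idea, written in Python instead of Scala. The classes are small hypothetical stand-ins that borrow Spark's names to stay self-contained; they are not Spark's real types, and `has_non_struct_field` is an invented helper name.

```python
from dataclasses import dataclass, field
from typing import List


class DataType:
    """Hypothetical stand-in base class; not Spark's real DataType."""


class StringType(DataType):
    """Hypothetical stand-in for a simple (non-struct) type."""


@dataclass
class StructField:
    name: str
    dataType: DataType


@dataclass
class StructType(DataType):
    fields: List[StructField] = field(default_factory=list)


def has_non_struct_field(fields: List[StructField]) -> bool:
    """Return True if at least one field's type is not a StructType."""
    # Type test on the value itself, instead of comparing a type-name string.
    return any(not isinstance(f.dataType, StructType) for f in fields)


nested = StructType([StructField("id", StringType())])
all_structs = [StructField("a", nested), StructField("b", StructType())]
mixed = [StructField("a", nested), StructField("name", StringType())]

only_structs_result = has_non_struct_field(all_structs)
mixed_result = has_non_struct_field(mixed)
empty_result = has_non_struct_field([])
```

A type test like this keeps working if a type's display name changes, which is the usual argument against comparing against the "struct" string.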
[21:29:16] (PS12) Ottomata: [WIP] Spark + JSON -> Hive [analytics/refinery/source] - https://gerrit.wikimedia.org/r/346291 (https://phabricator.wikimedia.org/T161924) (owner: Joal)
[21:36:31] (PS13) Ottomata: [WIP] Spark + JSON -> Hive [analytics/refinery/source] - https://gerrit.wikimedia.org/r/346291 (https://phabricator.wikimedia.org/T161924) (owner: Joal)
[21:38:38] (CR) Ottomata: "StructExtensions and TestStructExtensions is ready for review joal! Still gotta work on cleaning up the job part." [analytics/refinery/source] - https://gerrit.wikimedia.org/r/346291 (https://phabricator.wikimedia.org/T161924) (owner: Joal)
[21:52:47] (PS14) Ottomata: [WIP] Spark + JSON -> Hive [analytics/refinery/source] - https://gerrit.wikimedia.org/r/346291 (https://phabricator.wikimedia.org/T161924) (owner: Joal)
[22:17:53] (PS7) Nuria: Changes internal aqs api to accept a project or array of same [analytics/dashiki] - https://gerrit.wikimedia.org/r/347305 (owner: Fdans)
[22:20:03] (PS8) Nuria: Changes api glue code to accept a project or array of same [analytics/dashiki] - https://gerrit.wikimedia.org/r/347305 (owner: Fdans)