[05:59:53] Analytics-Kanban, DBA, Operations, ops-eqiad, User-Elukey: db1046 BBU looks faulty - https://phabricator.wikimedia.org/T166141#3334659 (Marostegui) Please, have a plan B just in case this host doesn't come back up, it is a very old server and we know that sometimes, old servers once powered o...
[07:09:09] Analytics-Kanban, DBA, Operations, ops-eqiad, User-Elukey: db1046 BBU looks faulty - https://phabricator.wikimedia.org/T166141#3334712 (elukey) Thanks @Marostegui, I didn't think the situation was so desperate :D If there could be the risk of a bigger failure I'd change idea about the BBU an...
[07:11:00] Analytics-Kanban, DBA, Operations, ops-eqiad, User-Elukey: db1046 BBU looks faulty - https://phabricator.wikimedia.org/T166141#3334717 (Marostegui) I don't want to be pessimistic, but I have had issues with old servers in the past, so just wanted to give a heads up to make sure you guys have...
[07:27:24] Hello people
[07:27:48] I am slowly applying the new raid configs to analytics[1058-1068] as last step to unify them
[07:27:54] let me know if you see any issue
[07:35:11] Analytics: Some fields in Pivot should be numbers - https://phabricator.wikimedia.org/T167494#3334762 (Gilles)
[07:41:20] Hi elukey, thanks for doing that :)
[07:46:56] joal: I added a couple of graphs to https://grafana.wikimedia.org/dashboard/db/analytics-hadoop (iops and iowait), didn't see anything chaning
[07:46:59] *changing
[07:47:07] morning :)
[08:22:42] Analytics-Kanban, Operations, ops-eqiad, Patch-For-Review, User-Elukey: Review Megacli Analytics Hadoop workers settings - https://phabricator.wikimedia.org/T166140#3334853 (elukey) Finally the same setting across all analytics workers: ``` elukey@neodymium:~$ sudo cumin 'R:class = role::ana...
[08:22:57] all analytics workers with Current Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache if Bad BBU
[08:54:24] elukey: Looking at 90 days back of hadoop metrics: the version bump really changed things :)
[08:58:39] joal: any metric in particular?
[09:01:52] Particularly the GC and memory ones
[09:06:06] we have also added the -Xms settings that completely changed how GC was performed
[09:06:14] yup
[09:06:25] does it match with CDH upgrade?
[09:06:54] hm, I'd guess so, but better to double check
[09:09:59] hm elukey - Can't find anything in logs related
[09:10:18] seems to have happened around 2017-04-12
[09:11:09] need to leave now for 1h
[09:11:10] later
[09:11:21] o/
[09:11:33] (I started to work on Xmx in April, might be me)
[09:11:40] *Xms
[11:10:35] Analytics: Some fields in Pivot should be numbers - https://phabricator.wikimedia.org/T167494#3334762 (JAllemandou) Hi @Gilles, thanks for the feedback, it would indeed be better to handle numeric fields as numeric dimensions in Pivot. However the version we have doesn't allow that, and we won't upgrade Pivo...
[11:11:13] * fdans lunch!
[11:25:17] really?
[11:25:32] so soon?? :D :D :D
[11:35:08] * elukey lunch as well!
[11:58:19] (PS2) Joal: Update cassandra loading for unique devices [analytics/refinery] - https://gerrit.wikimedia.org/r/357871 (https://phabricator.wikimedia.org/T167043)
[12:53:14] A-team - I need to care Lino this evening, so I'll miss standup.
I'll send an e-scrum and will be avail
[12:53:37] available when Melissa comes back home (not late after standup)
[12:57:56] joal: hello Linoooo o/
[12:58:02] :D
[12:58:17] elukey: I can come to standup with him if you want to say hello :)
[13:00:01] ahahah no no just say hello to him on my behalf, is enough :)
[13:00:06] okey :)
[13:00:44] hey yall
[13:03:51] Analytics-Kanban: Rename last_access_uniques to per-domain uniques - https://phabricator.wikimedia.org/T167043#3335190 (JAllemandou)
[13:04:17] Analytics-Kanban: Rename last_access_uniques to per-domain uniques - https://phabricator.wikimedia.org/T167043#3315501 (JAllemandou)
[13:04:24] Hi milimetric
[13:06:32] Analytics, Analytics-EventLogging, DBA: Add index on event_action, event_isAnon and event_namespaceId to NavigationTiming tables - https://phabricator.wikimedia.org/T70396#3335198 (Marostegui) Is this still needed?
[13:06:44] Analytics, Analytics-EventLogging, DBA: Add index on event_action, event_isAnon and event_namespaceId to NavigationTiming tables - https://phabricator.wikimedia.org/T70396#3335199 (Marostegui) a:Springle>None
[13:10:08] Analytics-Kanban, Patch-For-Review: Count project-wide unique devices (like *.wikipedia.org) - https://phabricator.wikimedia.org/T143928#3335202 (JAllemandou)
[13:11:24] Analytics-Kanban, Patch-For-Review: Update per-domain uniques fresh-sessions computation - https://phabricator.wikimedia.org/T167005#3335203 (JAllemandou)
[13:36:19] (PS1) Joal: Correct typo bug of unique-devices-per-domain [analytics/refinery] - https://gerrit.wikimedia.org/r/358017 (https://phabricator.wikimedia.org/T167043)
[13:36:57] helloooo milimetric wanna sync up at 4pm CEST?
[13:37:00] in 24min
[13:37:06] hey fdans sure
[13:37:53] btw taylor swift is back on spotify after 2 years, it's the best day ever
[13:42:54] I've never used spotify, I used to love music before rhapsody broke my heart, now I just wander around in silence
[13:50:44] I use google music and thought it would be terrible but it's...great for me
[13:51:18] (CR) Ottomata: [V: 2 C: 2] Correct typo bug of unique-devices-per-domain [analytics/refinery] - https://gerrit.wikimedia.org/r/358017 (https://phabricator.wikimedia.org/T167043) (owner: Joal)
[13:52:20] (CR) Ottomata: [C: 1] Update cassandra loading for unique devices [analytics/refinery] - https://gerrit.wikimedia.org/r/357871 (https://phabricator.wikimedia.org/T167043) (owner: Joal)
[13:59:24] milimetric: omg rhapsody
[13:59:50] I think i've been in spotify for close to 10 years now
[14:00:01] batcave?
[14:01:00] milimetric: ^
[14:01:07] omw
[14:50:49] Analytics-Kanban: Preserve userAgent field in apps schemas - https://phabricator.wikimedia.org/T164125#3335485 (mforns) Hi @Tbayer and @Dbrant I modified the white-list to reflect the discussion we had in this task. Please review the gerrit change here: https://gerrit.wikimedia.org/r/#/c/298721/8..9/files/...
[15:12:53] Hey folks. I'm trying to host a public data file from stat1002
[15:13:02] I see we have a new folder called /a/published-datasets/
[15:13:23] I'm planning to just generate this file once, but I figure that many people could download it over the next couple of years.
[15:13:34] Is this the right place to put it?
[15:25:04] (CR) Ottomata: [C: 2] Specifying test dependency on snappy [analytics/refinery/source] - https://gerrit.wikimedia.org/r/357324 (owner: Nuria)
[15:25:19] (CR) Ottomata: [C: 2] "I had the same snappy test problem, and this change fixed it for me!"
[analytics/refinery/source] - https://gerrit.wikimedia.org/r/357324 (owner: Nuria)
[15:25:22] (CR) Ottomata: [V: 2 C: 2] Specifying test dependency on snappy [analytics/refinery/source] - https://gerrit.wikimedia.org/r/357324 (owner: Nuria)
[16:04:42] halfak: public data is sourced from the machine that used to be called 1001 so it will get rsynced there
[16:05:12] nuria_, of course. What folder should I put it in on stat1002
[16:25:07] * elukey off!
[16:28:55] Hey nuria_
[16:29:36] joal: on meeting can talk in a bit
[16:29:39] k
[17:04:28] * halfak is just going to put this file wherever
[17:23:46] I boldly created a "one-off" directory and used what I hope will be a useful structure. I even added a README to the folder.
[17:33:55] elukey, you still there?
[18:15:50] joal: back, sorry halfak
[18:15:57] hey Niharika
[18:16:06] oops sorry Niharika
[18:16:10] wrong ping
[18:16:13] Hey nuria_
[18:16:36] yessir
[18:17:12] nuria_: https://gerrit.wikimedia.org/r/#/c/358062 :)
[18:17:37] ottomata: super speedy man
[18:17:49] ottomata: let me talk to joal and will look
[18:18:24] nuria_: ottomata +1ed my patch for cassandra loading, I was wondering if you were ok with me deploying and moving forward with unique changes monday next week
[18:19:48] joal: yes, did you correct the commit message and add ticket?
[18:20:33] nuria_: I added the task in commit message, and put all the productionisation scheme in those tasks (not in commit message, too big)
[18:21:58] joal: ah i see it, but i do not understand why do we need to drop our druid data though
[18:22:12] nuria_: field name change: host --> domain
[18:23:07] joal: i see, we will need to send a note about it as people's bookmarks will break
[18:24:09] nuria_: good point
[18:24:51] joal: we can probably do it in two steps.
stop druid loading (send note) + do oozie + cassandra work
[18:25:59] joal: once that is done send another note and restart druid jobs
[18:26:18] joal: we should not announce new data though until it has baked a while
[18:26:30] nuria_: We can start druid job and not announce data?
[18:26:34] Like that it computes
[18:26:42] new project-wide domains
[18:26:49] data though
[18:26:56] joal: not "renamed" older data
[18:27:29] joal: we should not expose on druid data that is not final and i think we want to do final sanituy checking on project wide uniques, there coudl be bugs on our code
[18:27:43] *sanity checking
[18:27:48] *there could be
[18:28:11] nuria_: While I agree, I think we can just leave the pivot endpoint without title
[18:28:48] or not start druid jobs and start them later, pretty much the same to me
[18:29:02] joal: ya, that is fine too
[18:29:12] Data will be present in files so I wonder if druid is that important ...
[18:29:27] keeping it anonymous and advertising should be enough I guess
[18:29:44] ok, so moving forward on monday :)
[18:29:49] Thanks nuria_ !
[18:34:51] joal: also i do not think we should have project-wide files : https://dumps.wikimedia.org/other/unique_devices/project_wide/2017/
[18:35:00] joal: w/o having totally vetted data
[18:35:23] nuria_: same idea for me: not advertising the data should be enough
[18:35:25] joal: we want to look at plots for a while with final refactor to rule out bugs on our end
[18:36:00] nuria_: I can archive onto a place which is not synchronised with dumps
[18:36:37] joal: ya, let's have data on hive (and maybe on druid without tags) before we publish it externally
[18:38:04] ok, will document
[18:38:58] joal: rather why don't we file a task with what to do after we call data good?
(after our new code has been baking for a while)
[18:39:34] joal: we need to: update tags on pivot dataset, set up jobs to publish data to files externally, document those files in wikitech and announce that they are available
[18:39:42] joal: seems worthy of a task
[18:39:50] nuria_: will do to keep track, but it's a pain to regenerate only archived data - Better to generate it in a place not synchronised, and only copy /relaunch
[18:40:42] joal: we can do that if you feel is easier. If there are no bugs we can relaunch
[18:41:20] joal: let's file a task however to keep track of all this work
[18:43:58] Analytics: Finalise unique_device_project_wide - https://phabricator.wikimedia.org/T167539#3336133 (JAllemandou)
[18:44:00] nuria_: --^
[18:45:15] Analytics: Final steps to expose project wide unique devices data - https://phabricator.wikimedia.org/T167539#3336146 (Nuria)
[18:45:23] Analytics-Kanban, Patch-For-Review: Count project-wide unique devices (like *.wikipedia.org) - https://phabricator.wikimedia.org/T143928#3336147 (JAllemandou)
[18:45:29] Analytics-Kanban: Final steps to expose project wide unique devices data - https://phabricator.wikimedia.org/T167539#3336133 (Nuria)
[18:45:47] joal: ok, thank you, i edited it, before we do all that i would like to ask tilman to look at data
[18:45:49] nuria_: Also documented the thing about launching oozie job to archive in my folder
[18:46:00] no prob - sounds good :)
[18:46:20] nuria_: It's production without external visibility - looks great :)
[18:46:27] Going for dinner now :)
[18:46:34] Have a great weekend a-team
[18:46:41] Thanks for the answers nuria :)
[18:53:02] laters joal!
[18:53:58] Analytics-Kanban: Document that old deleted pages have empty fields in Analytics Cluster edit data - https://phabricator.wikimedia.org/T165201#3336156 (mforns) Wrote this, I hope it's enough to have it in the Data_Lake/Edits page, as opposed to on each of the data sets' pages (would have been duplicated). ht...
[19:00:19] halfak: if you put files on stat1003 on /a/published-datasets they will appear here: /a/published-datasets
[19:00:26] halfak: https://analytics.wikimedia.org/datasets/
[19:00:59] halfak: we are working on making this more obvious, see https://phabricator.wikimedia.org/T159409 cc milimetric
[19:02:22] milimetric: do we think the dataset move is something we can do next week or should we move it into paused?
[19:02:58] I'm doing it now in about an hour
[19:03:05] I just put up a patch for the jsonconfig problem
[19:03:14] nuria_: ^
[19:03:29] milimetric: oohhhh
[19:03:40] milimetric: did you get some help with that?
[19:25:17] nuria_: yes, both Gergo and Max helped me, but they were more like, "dude, read the error message"
[19:25:22] which I did, like thirty times
[19:25:32] and then I think I know what they mean now :)
[19:31:35] Analytics-Kanban: Document that old deleted pages have empty fields in Analytics Cluster edit data - https://phabricator.wikimedia.org/T165201#3336243 (Nuria) Ping @Neil_P._Quinn_WMF to confirm this ticket can be closed
[19:36:25] mforns: edited docs a bit
[19:36:34] mforns: for data lake
[19:36:38] nuria_, ok
[19:38:24] nuria_, makes sense, thx!
[19:42:58] Analytics, Analytics-EventLogging, DBA: Add index on event_action, event_isAnon and event_namespaceId to NavigationTiming tables - https://phabricator.wikimedia.org/T70396#3336282 (Nuria) Resolving, EL and dashboards have much changed, data from navigation timing can be found in Grafana: https://gr...
[19:43:12] Analytics, Analytics-EventLogging, DBA: Add index on event_action, event_isAnon and event_namespaceId to NavigationTiming tables - https://phabricator.wikimedia.org/T70396#3336285 (Nuria) Open>declined