[00:23:49] Analytics-Kanban, Analytics-Wikistats: Wikistats 2 UI feedback From Erik Z - https://phabricator.wikimedia.org/T178084#3681760 (Milimetric) Erik, I want to fix most of these things for the first launch, and most of them are easy to fix. I've already submitted a couple of patches. The only question I ha...
[00:24:16] Analytics-Kanban, Analytics-Wikistats: Alpha release: Wikistats 2 UI feedback From Erik Z - https://phabricator.wikimedia.org/T178084#3681763 (Milimetric) p:Triage>High a:Nuria>Milimetric
[00:44:44] (CR) Milimetric: [V: 2 C: 2] "Looking good, I left a comment on the injection comment :)" (1 comment) [analytics/aqs] - https://gerrit.wikimedia.org/r/379227 (https://phabricator.wikimedia.org/T175805) (owner: Joal)
[00:53:22] (CR) Milimetric: [C: -1] Add central notice component and detect adblock (4 comments) [analytics/wikistats2] - https://gerrit.wikimedia.org/r/383798 (https://phabricator.wikimedia.org/T177491) (owner: Fdans)
[00:58:10] (CR) Milimetric: "if you disagree with my suggestion, just go ahead and self-merge" (1 comment) [analytics/wikistats2] - https://gerrit.wikimedia.org/r/383589 (https://phabricator.wikimedia.org/T177646) (owner: Fdans)
[02:48:09] (PS1) MaxSem: Update comment [analytics/refinery] - https://gerrit.wikimedia.org/r/383967
[07:07:41] (CR) Joal: [V: 2 C: 2] "LGTM ! Approved" [analytics/refinery] - https://gerrit.wikimedia.org/r/383967 (owner: MaxSem)
[07:08:00] (CR) Joal: [V: 2 C: 2] "A little nit though: Commit message should be more explicit about what comment is changed" [analytics/refinery] - https://gerrit.wikimedia.org/r/383967 (owner: MaxSem)
[09:18:02] (PS4) Joal: Fix druid datasources for uniques jobs [analytics/refinery] - https://gerrit.wikimedia.org/r/383172 (https://phabricator.wikimedia.org/T175162)
[09:18:11] (CR) Joal: [V: 2] Fix druid datasources for uniques jobs [analytics/refinery] - https://gerrit.wikimedia.org/r/383172 (https://phabricator.wikimedia.org/T175162) (owner: Joal)
[09:27:08] (CR) Fdans: [C: 2] Exclude bots from pageviews [analytics/wikistats2] - https://gerrit.wikimedia.org/r/383725 (owner: Milimetric)
[09:27:23] (CR) Fdans: [V: 2 C: 2] Exclude bots from pageviews [analytics/wikistats2] - https://gerrit.wikimedia.org/r/383725 (owner: Milimetric)
[09:29:15] elukey: do you have a phab ticket for this kafka problem I could link?
[09:30:54] dcausse: didn't open one but the original issue was tracked in https://phabricator.wikimedia.org/T172681
[09:31:14] elukey: thanks
[09:51:15] elukey: o/
[09:51:22] joal: o/
[09:51:25] elukey: do you want us to discuss indexation?
[09:52:14] joal: currently in the middle of the LVS code reviews for druid-public-broker.svc.eqiad.wmnet, I'd skip it for the moment if it is not an issue :(
[09:52:26] elukey: no problem at all
[09:52:36] elukey: would you make an exception about deploys on friday for AQS ?
[09:52:52] I was wrong and we are not going to use druid-public.svc.eqiad.wmnet, but druid-public-broker.svc.eqiad.wmnet
[09:52:55] * joal whistles looking up to the sky
[09:53:24] joal: if it is a minor change we might do it, I'll not be here until tuesday though (from tomorrow onwards :P)
[09:54:00] elukey: no change on currently existing endpoints - The deploy would for wks2 backend
[09:58:33] elukey: I tried to fix the version for all kafka clients we use but there is somewhere I'm not sure how to do... when using Spark KafkaUtils.createRDD (https://spark.apache.org/docs/2.0.0/api/java/index.html?org/apache/spark/streaming/kafka/KafkaUtils.html)
[10:00:10] dcausse: let's see if the issue will still pop up in the logs, if so we'll try to fix spark too ok?
[10:00:20] elukey: sounds fine
[10:01:54] joal: about the deploy - we can do it if it is only some code that does not affect the current production one
[10:04:27] elukey: This is what it is :)
[10:04:35] green light from me then
[10:04:40] Yay :)
[10:05:10] elukey: there'll be 2 deploys: 1 for AQS code, 1 aqs config (adding Druid info in setup)
[10:05:59] dcausse: if you could ping me in here when you run the next say two kafka jobs it would be great so I'll double check logs
[10:06:20] elukey: sure, it's not ready yet
[10:06:27] sure sure, whenever you are :)
[10:06:30] :)
[10:06:33] joal: can I see the aqs config?
[10:06:39] elukey: sure :)
[10:06:57] elukey: maybe we should wait for LVS to be ready?
[10:07:03] or we'll change in the config?
[10:07:25] elukey: the patch is basically this https://gerrit.wikimedia.org/r/#/c/379730/1, without auth
[10:09:43] joal: druid hosts mean that aqs will try to contact one of them?
[10:10:07] elukey: there'll actually be only one host
[10:10:28] but yes this is the point - To have AQS query druid and test our endpoints internally
[10:12:35] sure sure, the lvs can wait, we can deploy next week with the new target when it will be ready
[10:12:39] it might be today but I am not sure
[10:12:53] Makes sense elukey
[10:13:02] elukey: for now we can pik one host and that'll be it
[10:13:18] * elukey nods
[10:17:59] ok elukey - Way to go: first deploy conf as it doens't impact other endpoints, then deploy AQS
[10:22:38] (CR) Fdans: [C: 2] Tweak order and alignment of lists for readability [analytics/wikistats2] - https://gerrit.wikimedia.org/r/383728 (owner: Milimetric)
[10:22:50] (CR) Fdans: [V: 2 C: 2] Tweak order and alignment of lists for readability [analytics/wikistats2] - https://gerrit.wikimedia.org/r/383728 (owner: Milimetric)
[10:24:41] ack
[10:27:28] elukey: another point: Should we use http or https for querying druid? I'm assuming http is fine, but let me know otherwise
[10:28:00] http is fine, all public data
[10:28:05] awesome
[10:35:21] elukey: starting the clients, (relforge1001 will connect right now and hadoop machines will connect in around 1hour)
[10:35:50] elukey: CR updated for conf update
[10:40:45] dcausse: last exception was at [2017-10-13 09:05:02,543] UTC, looking good!
[10:41:48] Analytics-Kanban, Analytics-Wikistats, Patch-For-Review: Create Druid public cluster such AQS can query druid public data - https://phabricator.wikimedia.org/T176223#3682533 (JAllemandou) a:JAllemandou>elukey
[10:50:29] joal: one question - I am trying to refactor the code in a way that the druid parameters are optional, namely if set the yaml config changes, otherwise it doesn't
[10:51:02] is it fine right? I am thinking about labs for example, in which we might not want to run very soon a druid cluster
[10:53:28] hm ... elukey: I don't know how to do that easily
[10:55:37] elukey: in AQS project spec, we specify {{options.druid}} - I don't know how aqs behaves if druid doesn't exist in options objet
[10:55:41] I'm gonna chek that
[10:57:27] elukey: no providing a druid option doesn't crash the system - However it fails miserably if you try to access an endpoint using druid as backend
[11:01:59] ok so it will need a druid cluster from now on
[11:02:10] elukey: I'm trying to solve that
[11:06:48] joal: sorry for the trouble, just trying to make a puppet patch that doesn't break labs immediately :D
[11:07:11] makes sense elukey !!!
[11:16:17] I am also doing a quick refactoring to the aqs code
[11:16:34] so I'll add profile parameters for druid that will become class parameters
[11:16:45] 10 mins hopefully :)
[11:28:12] elukey: I have a path throwing 500 when tring to access druid-based endpoints without druid config - Is that good for you?
[11:31:44] joal: I didn't mean to force you to introduce a hack if not needed, do you like the idea?
[11:31:57] I mean, if not we can skip it
[11:33:47] it might not now that I think about it
[11:37:09] elukey: My hack is ready, and I prefer to fail in a known way if Druid conf is not set (as for labs for instance)
[11:37:49] so no bother - I had forgotten about labs not having druid :(
[11:39:09] joal: so this is my first draft https://gerrit.wikimedia.org/r/#/c/379730/3
[11:40:11] so basically leaving druid config optional in the druid class
[11:40:23] and then configure everything via profile/role
[11:40:38] buuut the druid options in hiera needs to be set with this proposal change in puppet anyway
[11:40:54] it is good though to have a known failure scenario if druid is not configured
[11:41:33] elukey: most aggreed :)
[11:43:32] (PS15) Joal: Add mediawiki-history-metrics endpoints [analytics/aqs] - https://gerrit.wikimedia.org/r/379227 (https://phabricator.wikimedia.org/T175805)
[11:43:58] elukey: https://gerrit.wikimedia.org/r/#/c/379227/15/sys/mediawiki-history-metrics.js (line 51)
[11:45:01] elukey: You have forgotten druid scheme in your patch :)
[11:45:16] also elukey: Why not parameterizing druid port in hiera?
[11:45:22] many things, since apparently I forget also to do git add newfile
[11:45:24] * joal feels dumb
[11:45:52] the port is unlikely to change so I didn't bother adding it, but I can do it if you want
[11:46:10] elukey: I wonder if LVS changes could lead to port change for instance
[11:46:13] probably not
[11:46:51] nono
[11:55:32] joal: I like https://gerrit.wikimedia.org/r/#/c/379227/15/sys/mediawiki-history-metrics.js:51
[11:56:38] elukey: no easy way to test this in regular tests, I did test it owever by removing druid config and validating behavior
[12:03:55] elukey: That patch about druid conf means we can acutally deploy AQS and do conf after :D
[12:04:41] nice :)
[12:13:07] still trying to make puppet work, stupid hiera bug
[12:22:01] elukey: taking a small break, will be back after, and hopefully deploy :)
[12:23:42] joal: sure.. today I'll log off in two hours more or less, need to run some errands.. (as FYI)
[12:35:59] * elukey lunch!
[13:44:29] fdans_onholiday: you still on holiday?
[13:45:59] milimetric: yeah I moved it to today, worked a few hours this morning to compensate yesterday’s
[13:46:08] gotcha, enjoy!
[14:02:41] * elukey logging off earlier today, be back on wed! o/
[14:13:01] bye elukey !
[14:14:29] Hey milimetric - SHould we push for deploy and test of aqs or should we wait for monday?
[14:15:19] joal: I think just AQS with no cassandra deploy is really safe, because we can even do just one of the servers first
[14:15:45] and I'm happy to hang out with you to provide a second pair of eyes
[14:15:51] milimetric: I agree - We'll need some puppet with that though (druid config) - ottomata, are you with us?
[14:16:03] milimetric: I'd love to have it live tonight as you may guess :)
[14:16:06] joal: I think luca already merged that, no?
[14:16:21] milimetric: I think it's still in CR
[14:17:16] milimetric, ottomata: https://gerrit.wikimedia.org/r/#/c/379730/
[14:17:24] aha, I see
[14:17:53] I saw the gerrit pings this morning and I thought it was merged
[14:17:54] Also milimetric, I updated reading druid config to fail with 500 in case druid conf is not set (at least behavior is predictable and message clear)
[14:17:55] but it's just "ready"
[14:18:09] yop
[14:18:35] And I did a last pass yesterday over english language of endpoints ( many small mistakes)
[14:18:57] ok, cool, so then if ottomata merges this thing, we're good to go (he's not on irc right now)
[14:19:09] yes I think so
[14:19:28] joal: yeah, I should have helped more with that, sorry
[14:19:33] milimetric: don't worry :)
[14:19:59] milimetric: We can actually merge now, and wait for config to come (we should just 500 in asnwers)
[14:20:16] Arf, no - there still is something I'd like to test - values for monitoring of endpoints
[14:21:12] values?
[14:21:30] Yes - batcave for a minute?
[14:21:34] omw
[14:36:18] (PS16) Joal: Add mediawiki-history-metrics endpoints [analytics/aqs] - https://gerrit.wikimedia.org/r/379227 (https://phabricator.wikimedia.org/T175805)
[14:36:44] ok, this looks deployable now
[14:37:57] However milimetric, I think it would be safer to depool the node on which we deploy
[14:38:12] yeah, that also needs ops though
[14:38:55] joal: yt? got a min for bc?
[14:39:00] sure ottomata
[15:20:53] mforns: your el test refine cron is merged and in place
[15:21:04] watch it at the next hour ? make sure it works?
[15:21:27] ottomata, ok, what about the wmf-style errors?
[15:21:51] wmf-style errors?
[15:22:04] oh jenkins complaints?
[15:22:08] i commented in ticket
[15:22:14] those are about new style guidelines
[15:22:21] we'd need a bigger refactor to fix those
[15:25:18] Analytics-Kanban: Remove AppInstallIId from EventLogging purging white-list - https://phabricator.wikimedia.org/T178174#3683358 (mforns)
[15:26:37] ottomata, ok
[15:39:18] Analytics, cloud-services-team (Kanban): Remove logging from labs for schema https://meta.wikimedia.org/wiki/Schema:CommandInvocation - https://phabricator.wikimedia.org/T166712#3683389 (Nuria) ping @elukey Can we drop the CommandInvocation table? Thank you
[15:55:53] milimetric: need to leave for some errand and food
[15:55:56] joal: can I finish cooking another 20 minutes or so?
[15:55:58] oh good :)
[15:56:03] k, see you after
[15:56:13] milimetric: Do you mind if we merge-deploy in 2/3 hours?
[15:56:21] not at all
[15:56:29] awesome :)
[15:56:31] later
[16:11:10] Analytics, Performance-Team: Explore NavigationTiming by faceted properties - EventLogging refine - https://phabricator.wikimedia.org/T166414#3683473 (Nuria) a:mforns
[16:30:02] Analytics-Kanban, Analytics-Wikistats: Alpha release: Wikistats 2 UI feedback From Erik Z - https://phabricator.wikimedia.org/T178084#3683520 (Erik_Zachte) @Milimetric thanks for caring! Yes I saw Marcels scroll solution and it works for me. So it's just the default content on first visit that my rema...
[16:43:24] (PS1) Ottomata: Use rand() instead of hostname,sequence for sampling webrequest data [analytics/refinery] - https://gerrit.wikimedia.org/r/384074
[16:48:36] (CR) Ottomata: [V: 2 C: 2] Use rand() instead of hostname,sequence for sampling webrequest data [analytics/refinery] - https://gerrit.wikimedia.org/r/384074 (owner: Ottomata)
[16:49:52] !log deployed refinery to use rand() for webrequest sampling
[16:49:53] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[17:08:56] Analytics, Research: geowiki data for Global Innovation Index - 2017 - https://phabricator.wikimedia.org/T178183#3683651 (leila)
[17:09:32] Analytics, Research: geowiki data for Global Innovation Index - 2017 - https://phabricator.wikimedia.org/T178183#3683651 (leila)
[17:12:15] Analytics, Research: geowiki data for Global Innovation Index - 2017 - https://phabricator.wikimedia.org/T178183#3683699 (leila) @Milimetric this is the repeat of the last year's request. Are you aware of any changes in geowiki data collection and storage that may become a blocker for sharing this data w...
[17:15:09] (PS1) Ottomata: Add webrequest/sample/coordinator.propertes and make coord names consistent [analytics/refinery] - https://gerrit.wikimedia.org/r/384076
[17:18:29] (CR) Ottomata: [V: 2 C: 2] Add webrequest/sample/coordinator.propertes and make coord names consistent [analytics/refinery] - https://gerrit.wikimedia.org/r/384076 (owner: Ottomata)
[18:13:30] ok milimetric
[18:13:34] Are you ready?
[18:14:47] yes joal but for some reason my other keyboard broke, one minute
[18:15:27] ottomata, when you're ready: https://gerrit.wikimedia.org/r/384084 :)
[18:15:27] milimetric: take your time, I'm triple cheking stuff
[18:15:30] xfer is all done
[18:16:50] thanks halfak
[18:17:27] joal: gotta restart, brb
[18:17:35] sure
[18:18:12] (PS11) Joal: Update mediawiki-history-reduced oozie job [analytics/refinery] - https://gerrit.wikimedia.org/r/379000 (https://phabricator.wikimedia.org/T174174)
[18:19:09] (PS17) Joal: Add mediawiki-history-metrics endpoints [analytics/aqs] - https://gerrit.wikimedia.org/r/379227 (https://phabricator.wikimedia.org/T175805)
[18:20:04] joal: ok, working, sheesh
[18:20:06] omw cave
[18:20:11] OMW too !
[18:24:37] (CR) Milimetric: [V: 2 C: 2] Add mediawiki-history-metrics endpoints [analytics/aqs] - https://gerrit.wikimedia.org/r/379227 (https://phabricator.wikimedia.org/T175805) (owner: Joal)
[18:34:31] (PS1) Joal: Update aqs to 202606b [analytics/aqs/deploy] - https://gerrit.wikimedia.org/r/384088
[18:36:00] (CR) Milimetric: [V: 2 C: 2] Update aqs to 202606b [analytics/aqs/deploy] - https://gerrit.wikimedia.org/r/384088 (owner: Joal)
[19:09:30] Analytics-Kanban: Remove AppInstallIId from EventLogging purging white-list - https://phabricator.wikimedia.org/T178174#3684016 (Nuria) a:mforns>Nuria
[19:23:24] Pchelolo: hey we're trying to deploy and having a weird problem with versioning of restbase-mod-table-cassandra
[19:23:54] Pchelolo: so we had 0.8.15 in our AQS (I know, super old!) and so it pulled in 0.8.18 which gives this error when running tests:
[19:24:16] Error: Cannot find module './test/utils/test_utils.js'\n at Function.Module._resolveFilename (module.js:469:15)\n at Function.Module._load (module.js:417:25)\n at Module.require (module.js:497:17)\n at require (internal/module.js:20:19)\n at Object.<anonymous> (/home/dan/projects/aqs/node_modules/restbase-mod-table-spec/index.js:17
[19:25:07] and if you look at that line, yeah, there's a bug there, but upgrading restbase-mod-table-cassandra to any of the more recent versions doesn't solve the problem
[20:00:05] milimetric: I continued testing from home: tests fail but code works (except for a bug, I'll push a new patch)
[20:01:45] (PS1) Joal: Correct bug in druid URI building [analytics/aqs] - https://gerrit.wikimedia.org/r/384103
[20:05:32] milimetric: But except from that, works like a charm
[20:05:54] milimetric: Ah, no - I removed cassandra related endpoints :)
[20:06:30] joal: uh... yeah, we need those :)
[20:06:37] :D
[20:07:08] heh, it's ok, Pchelolo is probably getting back from lunch and I'll see what he thinks
[20:07:17] if he wants to upgrade, I can take a look and maybe start a patch
[20:07:25] but I have to take off at a normal time today too
[20:07:34] No worries
[20:08:04] All endpoint give correct results except new-registered-users
[20:13:02] This --^ is due to a bug at indexation time - Correction coming
[20:14:17] (PS12) Joal: Update mediawiki-history-reduced oozie job [analytics/refinery] - https://gerrit.wikimedia.org/r/379000 (https://phabricator.wikimedia.org/T174174)
[20:21:17] mforns: question if you are there
[20:21:34] joal: if you want we can deploy your history-metrics-only version to stat1005 or something like that, and play with it?
[20:21:55] oh, can we hit the public cluster from inside? no, probably
[20:21:58] milimetric: feasible, it's actually easier to do from home :)
[20:22:07] I think we can, not sure though
[20:22:25] yeah, if all you want to see are metrics loading up in the front end, we can hangout and try to make it work :)
[20:22:42] hehe
[20:22:54] milimetric: I'll go to bed soon, thanks for offering
[20:22:57] Analytics-Kanban, Patch-For-Review: Remove AppInstallIId from EventLogging purging white-list - https://phabricator.wikimedia.org/T178174#3684227 (Nuria) Also, what about raw userAgents?
[20:23:09] ok :) good! sleep well, talk to you Monday
[20:23:18] milimetric: I'm happy already to have found the 2 bugs I corrected this evening
[20:23:42] \o/ :)
[20:24:13] milimetric: Looks like public cluster can be queried from inside analytics vlan :)
[20:24:41] oh interesting
[20:25:55] (PS2) Joal: Correct bug in druid URI building [analytics/aqs] - https://gerrit.wikimedia.org/r/384103
[20:27:42] nuria_, yes?
[20:27:55] mforns: how about userAgent on whitelist?
[20:28:15] nuria_, well... that one was negotiated with reading
[20:28:32] (PS1) Joal: [FUN] AQS for druid only [analytics/aqs] - https://gerrit.wikimedia.org/r/384113
[20:28:37] mforns: aham, so what was the status of negotiation?
[20:28:40] milimetric: --^ if you want to have fun :)
[20:28:44] it's only in mobile schemas
[20:28:53] and the user agent is parsed
[20:29:24] (PS2) Joal: [FUN] AQS for druid only [analytics/aqs] - https://gerrit.wikimedia.org/r/384113
[20:29:38] the agreement was: we keep the whole parsed user agent now, until we manage to purge it partially, meaning: some user-agent fields are kept and some are purged
[20:30:30] mforns: i see, do we have a ticket about that?
[20:30:36] nuria_, remember Luca and I tried to partially purge map fields in Prague, but it made the algorithm less performant in mysql
[20:30:43] mforns: yes
[20:30:53] nuria_, yes, that conversation is in a ticket for sure
[20:31:02] let me look for it
[20:31:35] (PS13) Joal: Update mediawiki-history-reduced oozie job [analytics/refinery] - https://gerrit.wikimedia.org/r/379000 (https://phabricator.wikimedia.org/T174174)
[20:32:29] nuria_, https://phabricator.wikimedia.org/T164125
[20:33:40] when we do the purging in hive, we'll be able to apply this partial-map-purging quite easily with spark-scala, so I'm not super worried
[20:34:35] mforns: i think task is marked done but shoudl be open right/
[20:34:56] Analytics-Kanban: Preserve userAgent field in apps schemas - https://phabricator.wikimedia.org/T164125#3684250 (Nuria) Resolved>Open
[20:35:10] nuria_, we can create another one to apply partial map purging when we are able to
[20:35:22] mforns: ok, let's do that
[20:35:36] Analytics-Kanban: Preserve userAgent field in apps schemas - https://phabricator.wikimedia.org/T164125#3222702 (Nuria) Open>Resolved
[20:35:48] Signing off for tonight team, see you on monday :)
[20:37:45] bye joal!
[20:37:50] nuria_, creating it
[20:39:27] Analytics: Partially purge user-agent map for EventLogging mobile schemas - https://phabricator.wikimedia.org/T178198#3684282 (mforns)
[20:39:39] (CR) Milimetric: [V: 2 C: 2] Correct bug in druid URI building [analytics/aqs] - https://gerrit.wikimedia.org/r/384103 (owner: Joal)
[20:39:48] bye team, me too signing off, have a nice weekend!
[20:40:39] Analytics-Kanban, Patch-For-Review: Remove AppInstallIId from EventLogging purging white-list - https://phabricator.wikimedia.org/T178174#3684298 (Nuria) See work to be done for UAS: https://phabricator.wikimedia.org/T178198#3684282
[21:35:01] Analytics-Kanban, Patch-For-Review: Remove AppInstallIId from EventLogging purging white-list - https://phabricator.wikimedia.org/T178174#3684426 (Tbayer) Folks, we spent quite a bit of time just a few months ago on a comprehensive review of the purging settings for all apps schemas, which included discu...
[21:36:35] Analytics-Kanban, Patch-For-Review: Remove AppInstallIId from EventLogging purging white-list - https://phabricator.wikimedia.org/T178174#3684430 (Tbayer) >>! In T178174#3684227, @Nuria wrote: > Also, what about raw userAgents? See T164125#3284763 , it seems we lack a bit of institutional memory here. I...
[22:14:51] Analytics-Kanban, Patch-For-Review: Remove AppInstallIId from EventLogging purging white-list - https://phabricator.wikimedia.org/T178174#3684537 (Nuria) The appinstallid was wrongly included on whitelist when it should not have been, this is a mistake on our end when putting whitelist together so this i...