[01:11:53] 10Analytics, 10Analytics-Cluster, 10Wikimedia-General-or-Unknown: Browser and platform stats for logged-in vs. anon users for security and product support decisions - https://phabricator.wikimedia.org/T58575#2960075 (10Liuxinyu970226) [01:13:08] 10Analytics, 10Analytics-Cluster, 10Wikimedia-General-or-Unknown: Browser and platform stats for logged-in vs. anon users for security and product support decisions - https://phabricator.wikimedia.org/T58575#2960077 (10Liuxinyu970226) [01:13:24] 10Analytics, 10Analytics-EventLogging, 07Need-volunteer: Add sanitized User-Agent to default fields logged by EventLogging - https://phabricator.wikimedia.org/T54295#2960086 (10Liuxinyu970226) [06:17:10] 10Analytics: Add statistics and error logging to mediawiki history reconstruction job - https://phabricator.wikimedia.org/T155971#2960420 (10JAllemandou) [08:39:27] (03CR) 10Joal: "Quick comments for improvement inline." (033 comments) [analytics/refinery] - 10https://gerrit.wikimedia.org/r/331794 (https://phabricator.wikimedia.org/T155141) (owner: 10Mforns) [11:55:38] joal: just added the ACLs for reach dbproxy1010:3306 from the analytics vlan [11:55:52] Wouhou elukey ! [11:55:56] tested? [11:56:16] there is only one thing left that needs Jaime's comment, namely if we also need to whitelist dbproxy1011 (that might be a backup or something similar) [11:56:29] yep telnet an1044 -> dbproxy1010 works :) [11:56:43] awesome elukey :) [11:56:47] Thanks a mil mate [11:56:53] but we need to use the CNAME [11:57:05] Doesn't bother me too much [11:57:45] labsdb-analytics.eqiad.wmnet [11:57:59] because afaicu this one might point to dbproxy1010 or 1011 [11:58:33] k elukey [11:58:43] elukey: just tried with telnet, seems working :) [11:58:59] elukey: do you know if Jaime works today? 
[11:59:47] not sure, but I pinged also Manuel (in a meeting now) [11:59:53] k elukey [12:00:21] elukey: let me know when you have Manuel around, I'd like to ask him what process I should follow to ask for a user on that db [12:01:46] sure [12:01:52] also aqs1007 is alive! \o/ [12:01:58] no cassandra on top of course [12:02:09] awesome elukey :) [12:02:23] let me know if I can help [12:04:42] ah Eric answered! [12:05:06] so say we bootstrap aqs1007 with a new logical rack for cassandra [12:05:15] the first instance will stream from the whole cluster [12:05:24] the second one from the rack only [12:05:27] :/ [12:06:04] but the main issue is reasoning about a cluster with physical rack vs logical racks not in sync [12:06:38] from what I gathered, Restbases uses the Row number as rack (so groups of racks rather than single ones) [12:15:39] elukey: I now understand as well the lading (second instance for 1007 will only use same rack, so same machine ... meh) [12:15:57] elukey: however I don't understand the thing about physical / logical racks :S [12:17:01] me too [12:17:35] Eric says that theoretically having replication == racks helps when you have to reason about what happens in the cluster [12:18:46] elukey: I understand it makes things easier in the brain - He advise is that we use 3 racks ? (since we want rep factor = 3) [12:21:24] yeah something like that [12:22:18] elukey: if machines are physically in the same rack, I'm not opposed - if they are not, it'll get even messier in my opinion [12:46:16] joal: I have the same opinion but I feel that I am missing something [12:46:30] I'll try today to check restbase's config [12:46:48] in the meantime, I am installing the os on all the new aqs nodes :) [12:46:58] 40 cores and 128GB of ram !!! 
[12:47:13] Maaaaaan :) [12:47:19] huhuhu [12:47:35] * joal want to test clickhouse on those beasts :) [12:49:25] hahahah [13:07:29] 06Analytics-Kanban: Run a 1-off sqoop over the new labsdb servers - https://phabricator.wikimedia.org/T155658#2961083 (10elukey) >>! In T155658#2956313, @elukey wrote: > Are there any outbound rules for the Analytics VLAN by any chance? I can see the following on cr1: > > ``` > show configuration firewall famil... [13:56:02] elukey: heya, any news from Manuel? [14:00:22] ottomata: HelllLLLLOOOoooo :) Just in time :) [14:00:35] joal: he just finished the meeting and went to lunch :) [14:00:45] hiiiii [14:00:58] wasssuuup? [14:01:28] ottomata: kafkacat -C -b kafka1012.eqiad.wmnet:9092 -t test_banner_impressions_joal [14:01:46] OOoOOo! [14:02:13] nice! I know what to do... :) [14:03:17] ottomata: please set 'minute' query granularity - I aggregate and change timestamp for that [14:03:30] segment? [14:03:37] daily? [14:03:48] segment daily, query minute? [14:03:53] yessir please [14:03:59] k [14:04:12] optimal segment time is month, but we'll do that batch [14:04:17] thanks :) [14:04:55] oh him, but the streaming job only emits once per minute? [14:05:14] ottomata: once every 10 seconds, but timestamp is updated to be at minute [14:05:21] :) [14:05:28] oh ok [14:05:33] hm , did it stop then? 
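Annotation: the exchange above says the streaming job emits every 10 seconds but rewrites each timestamp to the start of its minute, so the data lines up with a 'minute' query granularity. A minimal sketch of that flooring step, assuming ISO 8601 UTC timestamps and a simple count field (the function names and sample batches are hypothetical, not taken from the actual Spark job):

```python
from collections import Counter
from datetime import datetime, timezone

ISO_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def floor_to_minute(dt_string):
    """Parse an ISO 8601 UTC timestamp and truncate it to the start of its minute."""
    parsed = datetime.strptime(dt_string, ISO_FORMAT).replace(tzinfo=timezone.utc)
    return parsed.replace(second=0, microsecond=0)


def aggregate_by_minute(events):
    """Sum counts of (timestamp, count) pairs after flooring each timestamp."""
    counts = Counter()
    for dt_string, request_count in events:
        minute_key = floor_to_minute(dt_string).strftime(ISO_FORMAT)
        counts[minute_key] += request_count
    return dict(counts)


# Hypothetical 10-second micro-batches, for illustration only.
batches = [
    ("2017-01-23T14:05:00Z", 3),
    ("2017-01-23T14:05:10Z", 5),
    ("2017-01-23T14:05:50Z", 2),
    ("2017-01-23T14:06:00Z", 7),
]
per_minute = aggregate_by_minute(batches)
```

Here the three batches inside 14:05 collapse into a single row keyed on the minute, which is the shape a minute-granularity datasource would expect.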
[14:05:42] Yes, I stopped it [14:05:45] oh [14:05:45] k [14:06:00] Need to be a bit constructed than in a shell ;) [14:08:51] ottomata: I solved the inbound/outbound mistery for the analytics VLAN [14:09:01] also, o/ [14:09:11] as always it was me confusing things [14:09:29] the network ACLs are for each router's port [14:10:15] so from the router's point of view, *inbound traffic* is whatever flows *to* the interface from the host/switch attached [14:10:37] from the Analytics VLAN perspective, is traffic from others networks [14:11:10] sorry, traffic *to* other networks [14:16:06] 06Analytics-Kanban, 10Fundraising-Backlog, 13Patch-For-Review: Productionize banner impressions druid/pivot dataset - https://phabricator.wikimedia.org/T155141#2961220 (10mforns) @DStrine @AndyRussG Yes, I'm already on it :] [14:16:11] ahhh [14:16:17] k [14:16:21] right [14:16:43] so that is allowing traffic from analytics into the router's physical port destined to those IPs/ports? [14:16:44] elukey: ? [14:17:28] yes! [14:21:03] aye cool [14:21:04] good to konw [14:21:05] know [14:22:18] ottomata: do you like https://gerrit.wikimedia.org/r/#/c/332106/9/modules/role/manifests/analytics_cluster/client.pp ? The rest is only one liners [14:22:35] (I am also running pcc just in case) [14:23:13] mornin [14:23:57] o/ [14:24:18] (https://puppet-compiler.wmflabs.org/5181/ looks good) [14:25:08] 10Analytics, 10Analytics-EventLogging, 06Performance-Team, 07Performance, 07Regression: EventLogging schema modules take >1s to build (max: 22s) - https://phabricator.wikimedia.org/T150269#2961246 (10Ottomata) Wow. Gilles asked in an email if someone from Analytics could help. I can! I don't know much... [14:25:21] 06Analytics-Kanban, 13Patch-For-Review: Productionize loading of edit data into Druid (contingent on success of research spike) - https://phabricator.wikimedia.org/T141473#2961247 (10Milimetric) Awesome, thanks @JAllemandou, can mark this done then. 
[14:25:33] elukey: sure [14:25:35] that's better? [14:25:47] oh this is the :: thing [14:25:48] sure [14:25:50] i don't miwd [14:25:51] midn [14:25:52] mind [14:25:55] super, merging [14:27:29] (03CR) 10Mforns: [WIP] Add banner impressions jobs (031 comment) [analytics/refinery] - 10https://gerrit.wikimedia.org/r/331794 (https://phabricator.wikimedia.org/T155141) (owner: 10Mforns) [14:29:34] hey hmm, nuria, milimetric, am reading quarterly goals [14:29:40] i don't think the frontend stuff should be called 'data lake' [14:29:46] eh? [14:29:54] I should read that, doing so now [14:29:56] sorry quarterly review [14:32:02] also, nuria, should the public eventstreams slide read 'productionize/deploy public eventstreams?' instead of 'launch'? [14:32:04] 10Analytics, 10Analytics-EventLogging, 06Performance-Team, 07Performance, 07Regression: EventLogging schema modules take >1s to build (max: 22s) - https://phabricator.wikimedia.org/T150269#2780126 (10Gilles) No, it's time it takes to generate the JS on the backend, which happens more frequently than one... [14:32:10] 'launch' is what we were doing this quarter [14:32:13] (ohh too early for nuria) [14:33:53] milimetric: slides say https://phabricator.wikimedia.org/T125854 is done? [14:33:56] its not, is it? [14:34:14] no [14:35:05] ok, commented [14:35:49] I'm kind of ok with calling it the Data Lake *front-end* [14:36:48] but yeah, we should talk about it [14:40:23] i thought data lake was more about the backend, and our name for data warehouse, etc. [14:41:34] https://en.wikipedia.org/wiki/Data_lake [14:43:55] 10Analytics, 10Analytics-EventLogging, 06Performance-Team, 07Performance, 07Regression: EventLogging schema modules take >1s to build (max: 22s) - https://phabricator.wikimedia.org/T150269#2961273 (10Ottomata) Ok, I looked in the extension to see where it might reference those schemas, but I don't see an... 
[14:45:51] ott [14:45:55] doh [14:46:07] ottomata: yeah, it's what the place where all the data is called [14:46:15] but she's saying this is the data lake _front-end_ [14:46:39] 10Analytics, 10Analytics-EventLogging, 06Performance-Team, 07Performance, 07Regression: EventLogging schema modules take >1s to build (max: 22s) - https://phabricator.wikimedia.org/T150269#2961277 (10Gilles) Well, since EventLogging users can create any number of schemas on the fly, I'm not surprised tha... [14:46:53] it's good because it explains why we're building a data lake (to make different front-ends to it, and get data published) [14:46:57] 10Analytics, 10EventBus: log-events topic emitted in EventBus - https://phabricator.wikimedia.org/T155804#2955138 (10Ottomata) Hm, interesting. It'd be really helpful to design a schema for log events then. Can extensions emit custom log events in any format, or do they just set some 'log type' field to a cu... [14:47:18] it's bad because it's really just called "wikistats" and is just one of the many ways to access data in the data lake [14:52:44] 10Analytics, 10Analytics-EventLogging, 06Performance-Team, 07Performance, 07Regression: EventLogging schema modules take >1s to build (max: 22s) - https://phabricator.wikimedia.org/T150269#2961293 (10Gilles) Here's an example ResourceLoader request for the NavigationTiming schema's JS: https://en.wikipe... [14:52:47] 10Analytics, 10Analytics-EventLogging, 06Performance-Team, 07Performance, 07Regression: EventLogging schema modules take >1s to build (max: 22s) - https://phabricator.wikimedia.org/T150269#2961294 (10Ottomata) Oh, ok. I think I'm getting it. So a user requests a page (say maybe after a recent deploy) t... 
[14:54:13] ottomata: back on ;) [14:54:30] joal: aye, just finished email checking :) [14:54:44] np ottomata, just letting you know ;) [14:55:33] 10Analytics, 10Analytics-EventLogging, 06Performance-Team, 07Performance, 07Regression: EventLogging schema modules take >1s to build (max: 22s) - https://phabricator.wikimedia.org/T150269#2961311 (10Gilles) Yes, that's the chain of events. Anything's possible, the issue probably is in the mechanism used... [15:00:24] (03PS1) 10Joal: Add spark streaming job for banner impressions [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/333655 [15:00:28] ottomata: --^ [15:02:28] 10Analytics, 10Analytics-EventLogging, 06Performance-Team, 07Performance, 07Regression: EventLogging schema modules take >1s to build (max: 22s) - https://phabricator.wikimedia.org/T150269#2961344 (10Gilles) There is some very suspicious locking code with a 20s timeout in RemoteSchema.php My hunch would... [15:02:50] ottomata: I'm sorry you're my main go-to person this afternoon, I have another question for you when you'll have a minute (on mediawiki history scala CR) [15:03:02] joal: hit me! [15:04:19] (03CR) 10Ottomata: "Cool, I'd rather not merge this though, as for now we are just in PoC mode and trying stuff out." [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/333655 (owner: 10Joal) [15:04:28] ottomata: About the page and user data extractor files / objects - changing their names: I have picked [User|Page]EventFromLoggingBuilder [15:04:38] ottomata: is that okey? [15:04:47] have to remember my comment... [15:04:48] :) [15:05:19] I'm not found of "From", but "EventLogging" without From means too much in any of my contexts :) [15:05:54] ottomata: link : https://gerrit.wikimedia.org/r/#/c/325312/12/refinery-job/src/main/scala/org/wikimedia/analytics/refinery/job/mediawikihistory/page/PageHistoryDataExtractors.scala [15:07:45] joal: you are talking about renaming the classes? [15:07:51] objects [15:07:51] ? 
[15:07:52] classes + files [15:07:55] yes [15:07:57] so [15:08:14] PageHIstoryDataExtractor -> PageEventFromLoggingBuilder? [15:08:18] I have already renamed functions inside (made a lot of sense) [15:08:23] correct [15:08:25] hm [15:08:40] Or even just: PageEventBuilder ? [15:08:50] joal: Manuel told me that the accounts are probably managed by Chase.. [15:08:53] hmm, that sounds a little better [15:08:57] did you have anything specific to ask? [15:09:04] PageEventBuilder [15:09:04] PageEventFromLog(Builder) [15:09:26] joal: are all of these Event Extractor/Builders using log rows? [15:09:37] elukey: nope, I'll talk to chase tomorrow, meeting planned :) [15:10:10] gooood [15:10:21] I am going to add the last network ACL [15:10:22] ottomata: builders use log row yes, but some functions are used in other places [15:10:30] Thanks a lot elukey :) [15:10:43] I'll be all set up for tomorrow elukey, that's really awesome :) [15:11:00] aye that's fine. Ja joal i think PageEventBuilder is good. PageEvent____ [15:11:13] PageEventParser, PageLogEventParser [15:11:13] ottomata: Great :) [15:11:22] PageLogEventBuilder [15:11:28] naw [15:11:30] PageEventBuilder [15:11:50] i think its fine not to explicitly refer to 'log' in the obj name, especially if all of these behave about the same way [15:11:50] I used: buildMovePageEvent [15:11:56] ok [15:11:58] PageLogEventBuilder [15:12:00] sounds consistent [15:12:05] sorry [15:12:05] and buildSimplePageEvent [15:12:07] PageEventBuilder [15:12:11] cool [15:12:13] as you suggested :) [15:12:13] +1 [15:12:16] great [15:12:50] another one on naming: you suggested MediawikiEvent instead of HistoryEvent - makes sense but involves a lot of change, so triple checking going :) [15:12:56] ottomata: --^ [15:13:47] yeah, joal, i think that makes more sense to me. 
i'm not sure what a history event is...everything is history [15:13:51] except the future [15:14:00] 10Analytics, 10Pageviews-API: Yearly endpoint for the /pageviews/top API - https://phabricator.wikimedia.org/T154381#2909045 (10Milimetric) This is different from the per-article monthly endpoint. The storage needs would indeed be very modest, but with the current pipeline the job to compute the top article f... [15:14:22] ottomata: MediawikiEvent for the global object is cool - now for sub-objects: MediawikiEventPageInfo ? MediawikiEventPagePortion ? MediawikiEventPagePart [15:14:38] vs what? [15:14:41] ottomata: I can't make them subclasses, so it kinda need to be in the name [15:14:56] vs https://gerrit.wikimedia.org/r/#/c/325312/12/refinery-job/src/main/scala/org/wikimedia/analytics/refinery/job/mediawikihistory/page/PageEvent.scala [15:14:56] ? [15:14:58] oops [15:15:01] vs PageEVent? [15:15:15] nope: https://gerrit.wikimedia.org/r/#/c/325312/12/refinery-job/src/main/scala/org/wikimedia/analytics/refinery/job/mediawikihistory/denormalized/HistoryEvent.scala [15:15:22] sorry [15:15:22] eyah [15:15:22] i mean [15:15:26] you have these sub parts [15:15:29] but what are they sub parts of? [15:15:30] e.g. [15:15:33] MediawikiEventPageInfo [15:15:36] is a subpart of? [15:15:39] PageEvent? [15:15:44] MediawikiEvent [15:15:47] only [15:15:56] AH [15:15:57] ah [15:15:57] right [15:15:58] got it [15:16:03] In the same file [15:16:05] MediawikiPageEvent [15:16:11] but yeah, explicit is better [15:16:38] hmmm .. knowing we have PageEvent as well, I kinda don't like it :( [15:16:54] MediawikiEventPageDetails [15:16:56] k [15:17:00] yes ! [15:17:02] detials works [15:17:10] i dunno [15:17:12] info is fine too [15:17:28] fields? [15:17:42] these seem to be simple case classes with params/fields only? [15:17:44] Could do [15:17:50] correct [15:17:54] MediawikiEventPageFields [15:17:54] ? [15:17:59] To overcome the 22-limit parameter [15:18:04] works for me ! 
[15:18:12] OH that's the only reason they exist? [15:18:12] hahah [15:18:17] man, scala. [15:18:30] i kinda like MediawikiEventPageFields [15:18:34] MediawikiEventPageFields, MediawikiEventUserFields, MediawikiEventRevisionFields and MediawikiEvent [15:18:35] hmmm [15:18:36] although [15:18:39] you could have an instance of MediawikiEventPageFields [15:18:43] does that makes sense? [15:18:52] o: MediawikiEventPageFields = ... [15:19:03] pageFields.pageTitle [15:19:04] hmmm [15:20:05] for usage, I kinda prefer pageInfo.pageTitle, but it's almost never used [15:20:11] yeah.... [15:20:18] i'm leaning towards Info or Detail now [15:20:24] uhuh [15:20:33] which do you prefer? [15:20:47] I think Details is clearer [15:20:50] ok [15:20:53] i'm into it [15:20:54] info is very far too braod :) [15:21:00] cool awesome [15:21:00] Details it is. [15:21:16] I'll stil confirm with mforns, but it's on the way :) [15:21:17] I think Detail (not plural is better) [15:21:18] ja? [15:21:25] MediawikiEventPageDetail [15:21:26] hmmm [15:21:27] yessir [15:21:35] what what? [15:21:36] or Details? [15:21:37] hmmm [15:21:38] hahah [15:21:48] mforns: you can backlog the last 10 minutes ;) [15:21:55] hehe ok, reading [15:23:45] ottomata: that eventlogging resource loader issue might be really tricky [15:23:53] yeah [15:24:01] I remember ori, timo, and others wracking their brains about it [15:24:20] never made much sense to me, but caution is needed [15:25:16] joal, ottomata, names look good to me :] [15:25:34] 06Analytics-Kanban: Run a 1-off sqoop over the new labsdb servers - https://phabricator.wikimedia.org/T155658#2961478 (10elukey) >>! In T155658#2961083, @elukey wrote: > Last step is to verify with Jaime or Manuel if we need to whitelist dbproxy1011 too, since it might be dbproxy1010's backup if it fails (rememb... [15:25:38] mforns: so, MediawikiEventPageDetail, or MediawikiEventPageDetailS ? 
[15:25:45] ottomata too --^ [15:25:50] milimetric: this is def somethign i don't know much about [15:25:59] timo is involved in this ticket, i guess [15:26:04] joal, maybe plural, but both are fine [15:26:31] would we ever want to distinguish a single param from that case class as a 'detail'? [15:26:33] probably not... [15:27:18] joal: network work done, all set [15:27:31] * joal hugs elukey :) [15:27:58] hm ottomata - this argument is favor of singular, right? [15:28:03] i think so? [15:28:03] haha [15:28:07] :) [15:28:13] yeah [15:28:15] no [15:28:18] i'm into plural now. [15:28:21] thinking abou tit [15:28:24] if we called this [15:28:27] 'params' [15:28:29] we would never call it [15:28:33] MediawikiEventPageParam [15:28:38] agreed [15:28:40] we would do MediawikiEventPageParams [15:28:43] so, let's do details [15:28:46] Ok ! [15:28:50] ok! [15:28:52] :) [15:29:02] It's a wrap, now back to updating the code :) [15:29:07] Thanks a lot ottomata and mforns :) [15:29:08] thanks for name bouncing! [15:29:09] :) [15:29:51] haha [15:31:28] joal: qq [15:31:33] can you call "ts" "dt" [15:31:37] in banner impressions? [15:31:44] we usually use ts for integer timestamps [15:32:02] also, what are metricsSpec? [15:32:11] request_count, normalized_impressions [15:32:11] ? [15:32:12] mforns, ottomata: when talking about a MediawikiEvent in comments, I shall use mw_event - OK ? [15:32:15] and count? [15:32:32] joal: if in comment and you want to abbrev, i thikn don't make it look like code [15:32:35] MW Event is fine [15:32:47] ottomata: ok ! [15:32:58] MWEvent even, without space [15:33:01] either i guess :) [15:33:26] ottomata: back to druid: no count, see https://gerrit.wikimedia.org/r/#/c/331794/4/oozie/banner_impressions/druid/daily/load_banner_impressions.json [15:33:40] ottomata: the reason for ts is for druid convention :( [15:33:41] ah! [15:33:44] oh? [15:33:46] druid convention? 
[15:34:25] well, so far we've named druid timestamps ts without thinking of the ts/dt convention (sorry, my bad, didn't realise) [15:35:15] ottomata: Because druid examples name them ts :( [15:35:52] * joal cries in sadness for not having a brain big enough for conventions :( [15:35:55] haha [15:36:13] ja, dt are string ISO 8601s, and ts are integer unix epochs (either seconds or milliseconds) [15:36:26] joal: can we fix? how many places are we using ts incorrectly? [15:36:31] ottomata: Let's use dt for banners then [15:36:38] ok [15:37:09] k [15:38:03] joal, qq as well :] is pageview_data_directory used at all? here: https://github.com/wikimedia/analytics-refinery/blob/master/oozie/pageview/druid/daily/coordinator.xml#L19 [15:38:31] I havent found any use of it in the pageview druid load workflow [15:39:05] mforns: needs to defined for pageview/datasets.xml ;) [15:39:14] aaaaaah [15:39:51] joal: https://gist.github.com/ottomata/20cd0d9292f0be45c41edb80b41cf8fb [15:39:53] look ok? [15:39:58] ok ok got it [15:41:04] ottomata: we have pageviews in druid, that's all, the rest is not prod [15:41:10] ok - reading [15:41:16] ottomata: reading [15:46:54] ottomata: https://gist.github.com/jobar/8ff98301298cf1b74795cf8eccad0615 [15:47:00] ottomata: 2 minot changes [15:47:38] doubleSum and? [15:47:49] oh, status_code? [15:47:54] correct [15:47:59] rest seems very much ok [15:48:05] i don't see status_code in your data [15:48:10] hm [15:48:24] possibly cause it doesn't appear on url [15:48:38] now my wonder is: will druid accept null values? [15:48:53] ottomata: currently updating job [15:49:18] git st [15:49:21] oops [15:55:15] ottomata: back up [15:58:07] k [15:58:38] joal: still don't see status_code [16:00:33] ottomata: Aaaaah ! [16:01:34] joal: standup! 
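Annotation: the naming convention stated at 15:36 above is that `dt` fields hold ISO 8601 strings while `ts` fields hold integer Unix epochs (seconds or milliseconds). A small sketch of converting between the two, assuming UTC and second-level precision in the string form (the helper names are hypothetical):

```python
from datetime import datetime, timezone

ISO_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def ts_to_dt(ts, milliseconds=False):
    """Convert an integer Unix epoch ('ts') into an ISO 8601 UTC string ('dt')."""
    seconds = ts / 1000 if milliseconds else ts
    return datetime.fromtimestamp(seconds, tz=timezone.utc).strftime(ISO_FORMAT)


def dt_to_ts(dt):
    """Convert an ISO 8601 UTC string ('dt') into integer epoch seconds ('ts')."""
    parsed = datetime.strptime(dt, ISO_FORMAT).replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())


epoch_as_dt = ts_to_dt(0)
one_minute_ts = dt_to_ts("1970-01-01T00:01:00Z")
```

Keeping the suffix tied to the type means a reader can tell from the field name alone whether to parse a string or do integer arithmetic.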
[16:01:34] :) [16:01:37] joal: standdupppp [16:01:48] get up standup, standup for your rights [16:05:27] (PS1) Joal: Update oozie job loading pageview in druid [analytics/refinery] - https://gerrit.wikimedia.org/r/333668 [16:11:02] joal: status_code":"2.2" ? [16:11:08] yep :) [16:11:13] what is status code? [16:11:18] trying again new stuff (off and on) [16:11:21] a banner impression thing? [16:11:24] I actually don't know ottomata [16:11:26] yes [16:11:26] haha ok [16:11:31] cool [16:36:06] (Abandoned) Milimetric: [WIP] Script loading of edit history [analytics/refinery] - https://gerrit.wikimedia.org/r/301293 (owner: Milimetric) [16:36:33] Analytics-EventLogging, Analytics-Kanban, Performance-Team, Performance, Regression: EventLogging schema modules take >1s to build (max: 22s) - https://phabricator.wikimedia.org/T150269#2961773 (Nuria) [16:47:15] Analytics-Kanban, Pageviews-API, Reading-analysis: Skewed pageviews for Azerbaijani and Bulgarian Wikipedias, September, October and November 2016 - https://phabricator.wikimedia.org/T153699#2961818 (Nuria) [16:52:04] Analytics-Kanban, MediaWiki-API: Copy cached API requests from raw webrequests table to ApiAction - https://phabricator.wikimedia.org/T155478#2944303 (Nuria) [16:52:21] Analytics-Kanban, MediaWiki-API: Copy cached API requests from raw webrequests table to ApiAction - https://phabricator.wikimedia.org/T155478#2944303 (Nuria) Need to spend some calendar time talking to research [16:52:33] Analytics-Kanban, MediaWiki-API, Research-and-Data: Copy cached API requests from raw webrequests table to ApiAction - https://phabricator.wikimedia.org/T155478#2961858 (Nuria) [16:52:39] joal: ok, tranquility running [16:52:43] feel free to keep your spark job going [16:53:02] Analytics-Kanban, Research-and-Data: Coordinate with research to vet metrics calculated from edit data lake - https://phabricator.wikimedia.org/T153923#2961859 (Nuria) [16:53:18] 
Analytics, RESTBase, Services, User-mobrovac: configure RESTBase pageview proxy to Analytics' cluster on wiki-specific domains - https://phabricator.wikimedia.org/T119094#2961862 (Milimetric) [16:53:21] Analytics, Labs, Pageviews-API, wikitech.wikimedia.org: wikitech.wikimedia.org missing from pageviews API - https://phabricator.wikimedia.org/T153821#2961864 (Milimetric) [16:57:06] Analytics-Kanban, Operations: Periodic 500s from piwik.wikimedia.org - https://phabricator.wikimedia.org/T154558#2961899 (Nuria) [17:01:20] Analytics-EventLogging, Analytics-Kanban, Patch-For-Review: Add user_agent_map field to EventCapsule - https://phabricator.wikimedia.org/T153207#2961906 (Nuria) [17:01:23] Analytics, Analytics-EventLogging, Need-volunteer: Add sanitized User-Agent to default fields logged by EventLogging - https://phabricator.wikimedia.org/T54295#2961908 (Nuria) [17:03:13] Analytics, Analytics-Cluster, Wikimedia-General-or-Unknown: Browser and platform stats for logged-in vs. anon users for security and product support decisions - https://phabricator.wikimedia.org/T58575#2961911 (Nuria) General Browser stats available at http://analytics.wikimedia.org Resolving. [17:03:29] Analytics, Analytics-Cluster, Wikimedia-General-or-Unknown: Browser and platform stats for logged-in vs. 
anon users for security and product support decisions - https://phabricator.wikimedia.org/T58575#2961912 (Nuria) Open>Resolved a:Nuria [17:04:16] Analytics, Analytics-EventLogging, MediaWiki-Vagrant: EventLogging vagrant role fails to provision - https://phabricator.wikimedia.org/T131085#2961914 (Nuria) Open>Resolved [17:09:23] Analytics, Operations, netops, Patch-For-Review: Open temporary access from analytics vlan to new-labsdb one - https://phabricator.wikimedia.org/T155487#2944637 (Nuria) ping @elukey is this a duplicate? [17:09:34] Analytics-Kanban, Operations, netops, Patch-For-Review: Open temporary access from analytics vlan to new-labsdb one - https://phabricator.wikimedia.org/T155487#2961934 (Nuria) [17:18:27] Analytics, Wikimedia-General-or-Unknown: Disable queries for recent data on stats.grok.se - https://phabricator.wikimedia.org/T155785#2954564 (Nuria) We are planning on backfilling old pageview data (likely not in pageview API but other endpoint) WMF Analytics team does not have access to stats.grok.se [17:23:04] Analytics: Meta-statistics on MediaWiki history reconstruction process - https://phabricator.wikimedia.org/T155507#2961974 (Nuria) [17:23:07] Analytics: Add statistics and error logging to mediawiki history reconstruction job - https://phabricator.wikimedia.org/T155971#2961976 (Nuria) [17:24:19] Analytics-Kanban, Operations, netops, Patch-For-Review: Open temporary access from analytics vlan to new-labsdb one - https://phabricator.wikimedia.org/T155487#2961977 (elukey) @Nuria no sorry this was probably the good one, I commented in https://phabricator.wikimedia.org/T155658... Sorry @JAlle... [17:40:32] milimetric: can you ask for aggregates of views of a page across all projects on pageview api? 
[17:43:45] 10Analytics, 10CirrusSearch, 06Discovery, 06Discovery-Search: Load cirrussearch data into druid - https://phabricator.wikimedia.org/T156037#2962176 (10EBernhardson) [17:44:07] 10Analytics, 10CirrusSearch, 06Discovery, 06Discovery-Search: Load cirrussearch data into druid - https://phabricator.wikimedia.org/T156037#2962189 (10EBernhardson) Once we figure out what data we need I can workout the pipeline for getting the data in there. But first we should figure out what we want. [17:45:16] 10Analytics, 10CirrusSearch, 06Discovery, 06Discovery-Search: Load cirrussearch data into druid - https://phabricator.wikimedia.org/T156037#2962193 (10EBernhardson) [17:46:17] milimetric: looks like monthly parameter is working werll, will try to deploy to prod after sync up with doiscovery cca-team [17:46:21] milimetric: looks like monthly parameter is working werll, will try to deploy to prod after sync up with doiscovery cc a-team [17:47:39] k, cool, I'll upgrade node on beta when you're done so you have a way to rollback and test hotfixes [17:48:32] nuria: to your earlier question, no, not yet, but we could build something using wikidata inter-language links [17:49:04] milimetric: ok, let's do deployment to prod right after meeting [17:54:30] I'll skip this one, work on the slides [17:55:14] milimetric: do you know if there is an existing ticket to add page view data to cirrus? [17:56:13] coreyfloyd: I'm not super familiar, but I know ebernhardson has been doing work on similar stuff [17:56:21] 10Analytics, 10Analytics-Cluster, 10Wikimedia-General-or-Unknown: Browser and platform stats for logged-in vs. anon users for security and product support decisions - https://phabricator.wikimedia.org/T58575#2962248 (10GWicke) 05Resolved>03Open @Nuria, as far as I can tell https://analytics.wikimedia.org... [17:56:34] milimetric: thanks [17:57:35] ebernhardson: ^ do you know the answer to that? 
Reading Infrastructure exposed to the action API, but we can’t make queries based on page view data yet. https://phabricator.wikimedia.org/T144865 [18:02:50] aqs100[789] are all up and running \o/ [18:03:11] woohoo [18:04:09] coreyfloyd: hmm, looking [18:04:33] milimetric: what do you mean by add page view data to cirrus? [18:04:42] s/milimetric/coreyfloyd/ [18:05:05] being able to sort cirrus results by page views [18:05:11] coreyfloyd: perhaps interesting would be the discovery.query_clicks table [18:05:20] ebernhardson: ^ specifically [18:05:38] coreyfloyd: this joins the cirrus logs against webrequest, and basically each row is about a clicked link [18:06:05] coreyfloyd: page views being # of searches, or # of click throughs? [18:06:21] as in, do you want data about /wiki/Foobar, or about searches on /wiki/Special:Search? [18:06:44] ebernhardson: pages views being the data from these endpoints: https://wikimedia.org/api/rest_v1/#!/Pageviews_data/get_metrics_pageviews_top_project_access_year_month_day [18:07:39] coreyfloyd: we have an hidden to control the weight of pageviews in the ranking function [18:07:46] hidden param I mean [18:07:50] coreyfloyd: oh, other way around, you want to do searches based on page view data? [18:08:22] but this can be messy unless you do simple filter queries (e.g. 
hastemplate) [18:09:13] coreyfloyd: https://en.wikipedia.org/w/index.php?search=~test&title=Special:Search&go=Go&cirrusPageViewsW=1000 will get results mostly sorted by pageviews [18:09:33] ebernhardson: so the specific use case is using the “nearcoord” feature in cirrus and then sorting by page views [18:10:11] basically yielding the most popular articles in a given locaiton [18:10:58] coreyfloyd: hmm, using a plain nearcoord with nothing else should sort by a combination of incoming links and page views, we could probably expose a profile that is strictly page views based [18:12:01] ^ this is the right solution, this would be exposed as a 'qiprofile' param in the search api [18:12:19] if it helps to understand, you can see the scoring in: https://en.wikipedia.org/w/index.php?title=Special:Search&fulltext=1&search=neartitle%3ASan_Francisco&cirrusDumpResult&cirrusExplain=pretty [18:12:55] basically the full text portion returns 1 for all results, and the second part is a combination of popularity score and incoming links. Adding a popularity-only profile is pretty easy and can be done with a mediawiki-config deploy [18:13:22] (neartitle and nearcoord do the same thing, difference is only where the coordinate is sourced from) [18:13:48] ebernhardson: dcausse ok - i think popularity score and links is ok [18:14:13] ebernhardson: dcausse - so we dont need a popularity only profile if that one already exists [18:14:33] coreyfloyd: you need one if don't want to use hidden cirrus param [18:14:49] today pageviews is very low compared to inclinks [18:15:26] dcausse: sorry not sure I know what you mean by hidden parameter [18:15:51] i see you joal ! :) [18:15:52] Flushed {test_banner_impressions_joal={receivedCount=468, sentCount=468, droppedCount=0, unparseableCount=0}} [18:15:55] you can tune the weights with hidden params (&cirrusPageViewsW and &cirrusIncLinksW) [18:16:45] dcausse: got it, and is that just not recommended? 
[18:16:55] dcausse: or only supposed to be used for debugging? [18:17:04] coreyfloyd: any query argument that starts with cirrusSearch is for debugging, or sometimes unit tests. [18:17:17] for prod usage, like apps that will be deployed long time, we can setup explit profiles [18:17:18] coreyfloyd: only supposed for debugging and experimenting, the api will display a watrning if you use them [18:17:31] * elukey logging off team! o/ [18:18:28] s/unit tests/ab tests/ [18:18:35] ebernhardson: got it, so then I just need to file a ticket for exposing the profile? [18:18:37] coreyfloyd: if you have a nearcoord query you want to tune on enwiki you can add &cirrusPageViewsW=1000 to the api url to see how it'd look like [18:18:56] 10Analytics-Dashiki, 06Analytics-Kanban, 13Patch-For-Review: Add extension and category (ala Eventlogging) for DashikiConfigs - https://phabricator.wikimedia.org/T125403#1986718 (10demon) Nothing is too simple for security review, but it's simple enough that I reviewed it in about 2 minutes. {{approved}} an... [18:19:49] https://en.wikipedia.org/w/api.php?action=query&format=json&maxlag=200&generator=search&gsrsearch=museum+nearcoord%3A37.77666667%2C-122.39&gsrlimit=10 [18:20:41] https://en.wikipedia.org/w/api.php?action=query&format=json&maxlag=200&generator=search&gsrsearch=museum+nearcoord%3A37.77666667%2C-122.39&cirrusPageViewsW=1000&gsrlimit=10 [18:20:50] dcausse: am I doing that right? [18:21:10] coreyfloyd: yes I think so, change to -1000 to see it changes something [18:21:17] nice elukey wowowow [18:21:28] hmm… getting unrecognized paramaeter [18:21:31] dcausse: ^ [18:21:41] coreyfloyd: it will always show unrecognized, it's a hack that happen inside cirrus it's not part of the api [18:21:55] hey ottomata ! [18:22:06] coreyfloyd: that's (part of) why we have to deploy a real profile for any kind of prod usage [18:22:16] a-team: deploying aqs... [18:22:22] ebernhardson: aha! [18:22:25] heyyy! 
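A minimal sketch of how the two API URLs pasted above can be built, assuming only the parameters that appear in them. The function name `build_nearcoord_search_url` is hypothetical. As noted in the discussion, `cirrusPageViewsW` is a debugging/experimentation parameter that the API reports as unrecognized, so this is only useful for trying out a sort order, not for production use.

```python
from typing import Optional
from urllib.parse import urlencode

API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_nearcoord_search_url(
    term: str,
    lat: float,
    lon: float,
    page_views_weight: Optional[int] = None,
    limit: int = 10,
) -> str:
    """Build a generator=search URL combining a term with a nearcoord filter.

    page_views_weight maps to the hidden cirrusPageViewsW parameter from the
    log (debug only); leave it as None to get the default ranking.
    """
    params = {
        "action": "query",
        "format": "json",
        "maxlag": 200,
        "generator": "search",
        "gsrsearch": f"{term} nearcoord:{lat},{lon}",
    }
    if page_views_weight is not None:
        # Hidden CirrusSearch weight; the API will still warn about it.
        params["cirrusPageViewsW"] = page_views_weight
    params["gsrlimit"] = limit
    return f"{API_ENDPOINT}?{urlencode(params)}"


default_url = build_nearcoord_search_url("museum", 37.77666667, -122.39)
tuned_url = build_nearcoord_search_url("museum", 37.77666667, -122.39, page_views_weight=1000)
print(default_url)
print(tuned_url)
```

Comparing the results of the two URLs (or using a negative weight, as suggested above) shows whether the page view weight changes the ordering.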
[18:22:32] ebernhardson: got it and it does change the sort [18:23:00] ottomata: seems that the data is flowing, but I don't anything result in pivot yet :( [18:23:02] coreyfloyd: the real question is, is this sort better? :) We could perhaps setup an AB test first comparing a few options, and see what works best for users [18:23:54] ebernhardson: dcausse thats what i was going to go find out - is there documentation on tuning the weights? [18:23:57] joal: we need a datasource configured there, yes? [18:24:05] i can play with this for a bit and see what looks good [18:24:14] OH [18:24:25] joal: the datasource i emit is probably differen than what you guys had configured before? [18:24:26] not sure. [18:24:26] coreyfloyd: not really, cirrusIncLinkssW can be used to tune weights for inclinks [18:24:43] ottomata: That's exactly the question :) [18:24:49] i named it differently? [18:24:55] do these show up automaticallY? or do we need to create them? [18:24:57] in pivot? [18:25:11] found it in pivot, not the same, no problem :) [18:25:38] ottomata, elukey : wait ... do we depool when we deploy everytimne (seems like we should as scap cfg is going to bounce service) [18:25:51] hopefully the deploy process will do that individually, lemme see [18:25:53] nuria: IIRC we don't [18:26:19] ottomata: stopping the thing, trying to debug something [18:26:23] joal: that means all inflight requests are going to get a 500 upon deploy right? [18:26:27] cc ottomata [18:26:36] coreyfloyd: these things are hard to tune, best is A/B testing, or some labelled data we could use to tune this automatically but there's no right answers :/ [18:26:39] which likely will trigger an alarm [18:26:43] nuria: only the currently served request (q very small subset [18:27:10] dcausse: ebernhardson makes sense. 
thanks for the help - I’ll see what we con come up with, maybe do some beta a/b testing and check back in [18:27:17] IIRC elukey used to depool when we were not sure about deploy process [18:27:26] But we happened to do it without depool I think [18:28:28] i think mobrovac added some scap configs to eventbus to depool while deploying, but i'm looking and I don't see what does it [18:28:34] so it should be possilbe, and probably aqs should do it [18:28:39] but i'm not sure [18:29:07] joal: Right, we are returning 500 to the 40 reqs per sec we have for as many secs as teh restart takes [18:29:55] ottomata: there is no depool by default [18:29:59] https://www.irccloud.com/pastebin/CQE6GCUA/ [18:30:11] it's a restart [18:30:20] nuria: I don't think - scap deploys 1 after the other, and once stopped, the LB stops sending requests [18:31:08] if it is configured to deploy one after the other, do we know that it is? [18:31:22] joal: do we have a health check for the LB so it knows to stop sending requests? [18:31:23] eventbus scap has [18:31:23] group_size: 2 [18:31:25] which i think it would do that [18:32:09] joal: right, i do not think that scap cfg has the progressive rollout [18:32:31] sounds like we need a ticket to make this better :) [18:33:02] ottomata: filing now [18:33:23] nuria: I'm pretty sure scap deploys one after the other :) [18:33:51] mobrovac: yt? [18:34:17] (03PS2) 10Joal: Add spark streaming job for banner impressions [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/333655 [18:34:59] (03CR) 10Joal: [C: 04-1] "Not to be merged, only POC for now." [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/333655 (owner: 10Joal) [18:39:27] joal: mmmm... 
not so sure [18:39:54] hm nuria, I think in any case the is way to manually force it if you want [18:40:03] to do one server at a time [18:42:19] joal: but i do not have permits to depool so it would not matter very much [18:42:45] nuria: well, you can be sure you restart one server at a time via scap is what I mean :) [18:47:45] hey nuria and team. [18:48:41] gone for diner a-team, later [18:48:42] milimetric: deploy to prod done but monthly parameter doesn't seem to work (it does work in beta) [18:48:48] leila: hello [18:49:15] nuria finally does scap deploy one after the other? [18:49:17] nuria: do you have 10-min to talk about edit data? I talked to GII folks this morning, and I understand their problem better now. I think there is an opportunity for us to provide data, if we have it, to them and this will be beneficial for Wikipedia and the work the community does on that front. If we don't have the specific data, I'd like to figure out how/if we should have it for next year, and I want to tell you the plan B I think we can embark [18:49:18] hm..., I'll take a look in a bit nuria [18:49:39] joal: i do not think so, no. [18:49:56] leila: I'd like to listen in on that if you don't mind [18:49:56] k nuria, thanks for feedback [18:50:22] actually, you may know the answer milimetric, so I don't need to take the time from both of you. :D [18:50:36] K, batcave? [18:50:44] sure, milimetric. moving to a room now. [18:51:29] 10Analytics: Improve AQS deployment - https://phabricator.wikimedia.org/T156049#2962466 (10Nuria) [18:51:45] nuria: I can't ssh into aqs1001 :( [18:51:56] leila, milimetric : give me a sec to make sure it is all good qith deployment [18:53:09] ok, nuria. :) [18:53:18] milimetric: nah, it is not ok cause it did not deploy code [18:53:31] nuria: I'm trying to take a look too, what are the servers? 
[18:53:40] milimetric: servers have old code [18:53:51] aqs 100[4,5,6] [18:54:03] milimetric: [18:54:08] https://www.irccloud.com/pastebin/b6qHdv30/ [18:55:19] milimetric: retrying cc ottomata [18:58:06] ottomata: i cannot deploy cc milimetric [18:58:09] error is: [18:58:12] https://www.irccloud.com/pastebin/JaLDr3JH/ [18:59:53] leila: we can talk for a sec cc milimetric [19:00:02] nuria: with you shortly [19:00:05] nuria: we're in the batcave, I will try to deploy after [19:00:27] milimetric: it is not working, you will get teh same error so no need to retry [19:00:39] nuria: i meant after andrew gets back to us [19:00:48] milimetric: ah, sorry [19:00:50] but come chat about edit data [19:07:05] 10Analytics, 06Labs, 10Pageviews-API, 10wikitech.wikimedia.org: wikitech.wikimedia.org missing from pageviews API - https://phabricator.wikimedia.org/T153821#2962511 (10Krenair) @milimetric, you sure that's a duplicate? [19:19:07] Hey ottomata. There's excitement building around https://phabricator.wikimedia.org/T148843 [19:19:20] Would you be willing to post any estimates and ideas there? [19:20:11] halfak: sure, they are super guesstimate esitmates [19:20:13] 10Analytics, 10Analytics-Cluster, 06Operations, 06Research-and-Data, and 2 others: GPU upgrade for stats machine - https://phabricator.wikimedia.org/T148843#2962587 (10DarTar) @Ottomata from a budget perspective I think we can move forward with this immediately, we should be able to fund the expense but I'... [19:20:31] ottomata, make sure to put them in tags ;) [19:20:35] k [19:28:18] nuria: here, what's up? [19:30:51] mobrovac: nuria is trying to deploy aqs, i think we need to enable auto-depooling [19:30:55] we can do that, right? [19:31:07] group_size: 2 [19:31:09] anything else? [19:31:13] auto-deploying? [19:31:18] what is that? :P [19:31:20] depooling [19:31:22] during deploy [19:31:28] oh depooling [19:31:31] sorry, misread [19:31:31] hahah [19:31:33] :) [19:31:44] deploy in prod? 
[19:31:46] ja [19:31:55] beta doesn't have etcd, so can't do it there [19:31:56] ok [19:32:12] lemme check something [19:32:20] https://phabricator.wikimedia.org/T156049 [19:32:35] sorta related ^ [19:34:54] ah snap https://github.com/wikimedia/analytics-aqs-deploy/blob/master/scap/aqs-prod has not been updated [19:35:33] :) [19:35:53] mobrovac: i'm running out for a sec, nuria should be back shortly, she said she was running to a library and would be back soon [19:36:31] going to check back in ~ one hour, let me know if you guys need me before via hangouts :) [19:36:39] ottomata: nuria: for depooling and repooling nodes, you need to add these lines to your deployment checks.yaml: https://github.com/wikimedia/mediawiki-services-citoid-deploy/blob/master/scap/checks.yaml#L6-L13 [19:36:55] (replacing citoid with aqs, ofc) [19:37:25] 10Analytics, 06Labs, 10Pageviews-API, 10wikitech.wikimedia.org: wikitech.wikimedia.org missing from pageviews API - https://phabricator.wikimedia.org/T153821#2962736 (10Milimetric) Oh, no, my fault, I got confused by the assumption around https://wikitech.wikimedia.org/api/rest_v1/ not working. That part... [19:37:31] 10Analytics, 06Labs, 10Pageviews-API, 10wikitech.wikimedia.org: wikitech.wikimedia.org missing from pageviews API - https://phabricator.wikimedia.org/T153821#2962738 (10Milimetric) 05duplicate>03Open [19:40:04] mobrovac: thank you, will look into that in few. much appreciated [19:40:13] np [19:44:55] 06Analytics-Kanban, 15User-Elukey: Ongoing: Give me permissions in LDAP - https://phabricator.wikimedia.org/T150790#2962759 (10Milimetric) @spatton: could you also specify whether each user is a WMF employee or a contractor that has signed the NDA? This will help add them to the right LDAP group. [19:46:41] 06Analytics-Kanban, 15User-Elukey: Ongoing: Give me permissions in LDAP - https://phabricator.wikimedia.org/T150790#2962780 (10spatton) Absolutely @Milimetric - everyone listed above is an WMF employee. Thanks! 
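The depool/deploy/repool idea discussed above (rolling through hosts in groups so the load balancer never sends traffic to a restarting node) can be sketched as follows. This is not scap's implementation: the callbacks `depool`, `deploy`, `is_healthy` and `repool` are hypothetical stand-ins for whatever the real checks do, and the demo uses an in-memory set instead of a real pool.

```python
from typing import Callable, Iterator, List, Sequence


def batches(hosts: Sequence[str], group_size: int) -> Iterator[List[str]]:
    """Split hosts into consecutive groups of at most group_size."""
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    for i in range(0, len(hosts), group_size):
        yield list(hosts[i:i + group_size])


def rolling_deploy(
    hosts: Sequence[str],
    group_size: int,
    depool: Callable[[str], None],
    deploy: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    repool: Callable[[str], None],
) -> List[str]:
    """Deploy group by group; a group is repooled only if every host passes."""
    deployed: List[str] = []
    for group in batches(hosts, group_size):
        for host in group:
            depool(host)
        for host in group:
            deploy(host)
        for host in group:
            if not is_healthy(host):
                # Stop here: the whole group stays depooled for inspection.
                raise RuntimeError(f"{host} failed its check; group left depooled")
        for host in group:
            repool(host)
        deployed.extend(group)
    return deployed


# Demo with an in-memory pool (host names taken from the log).
pooled = {"aqs1004", "aqs1005", "aqs1006"}
min_pooled = len(pooled)


def fake_deploy(host: str) -> None:
    """Record how many hosts are still serving while this one restarts."""
    global min_pooled
    assert host not in pooled, "host should be depooled before restart"
    min_pooled = min(min_pooled, len(pooled))


result = rolling_deploy(
    sorted(pooled), 1, pooled.discard, fake_deploy, lambda host: True, pooled.add
)
print(result, "minimum pooled during deploy:", min_pooled)
```

With three hosts, a group size of 1 keeps two hosts serving at all times, while a group size of 2 would leave only one host pooled during the first batch.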
[19:50:23] 10Analytics-Dashiki, 06Analytics-Kanban, 13Patch-For-Review: Add extension and category (ala Eventlogging) for DashikiConfigs - https://phabricator.wikimedia.org/T125403#2962802 (10Milimetric) Thank you for the merge. I've created https://deployment.wikimedia.beta.wmflabs.org/wiki/Dashiki:TestConfig to tes... [19:52:00] 10Analytics, 06Labs, 10Pageviews-API, 10wikitech.wikimedia.org: wikitech.wikimedia.org missing from pageviews API - https://phabricator.wikimedia.org/T153821#2962803 (10Krenair) Are they even being collected given that wikitech is not behind varnish? [19:54:29] 10Analytics: Improve AQS deployment - https://phabricator.wikimedia.org/T156049#2962819 (10Nuria) We need to add this to deployment checks for deppoling pooling: https://github.com/wikimedia/mediawiki-services-citoid-deploy/blob/master/scap/checks.yaml#L6-L13 [19:56:27] 10Analytics, 06Labs, 10Pageviews-API, 10wikitech.wikimedia.org: wikitech.wikimedia.org missing from pageviews API - https://phabricator.wikimedia.org/T153821#2892095 (10Nuria) @Krenair if wikitech is not behing varnish pageviews cannot be collected. Correct. Seems that we can close ticket? [20:06:41] elukey: , nuria, moving aqs deploy chat back here [20:06:50] k [20:07:00] super [20:07:10] 10Analytics: geowiki data for Global Innovation Index - https://phabricator.wikimedia.org/T131889#2962842 (10Milimetric) Ok, some details about what data is available. Wiki projects store IPs in a table called recentchanges. A project called geowiki [1] mines this table and computes edits per country per proje... [20:07:45] Error: Cannot find module '/srv/deployment/analytics/aqs/deploy-cache/revs/025ef23c156d4de38fe3a0a4ee282ecb61f5b4a0/src/server.js' [20:08:34] src is empty [20:09:31] yep git log shows something weird [20:09:36] not totally sure what happened to this deploy, going to deploy to aqs1004 again [20:09:45] ah! [20:09:47] i think it worked that time [20:10:27] hm, src submodule is different commit? 
[20:11:02] nope, no good [20:11:17] ottomata: on tin the branch is the one expected? [20:11:24] (just trying to think) [20:11:25] branch? [20:11:36] yeah we had two branches a while ago [20:11:38] src matches what the comit log says it should be [20:11:46] usually submodules don't have the branch names, just the comit shas [20:11:55] (detached from a7eb80d) a7eb80d Monthly request stats per article title [20:12:00] is that the correct commit? [20:12:02] that's what is on tin [20:12:14] ah ok I was referring to the main repo, but it should be ok if the submodule is ok [20:12:24] elukey: i updated tin to master [20:12:36] elukey: [20:12:40] https://www.irccloud.com/pastebin/voY1Ii9M/ [20:12:55] elukey: so the sha is the one on master [20:13:04] oh main repo, ok [20:13:08] wait [20:13:10] main repo is? [20:13:10] haha [20:13:12] deploy or src? [20:13:17] i guess you mean deploy [20:13:21] yeah [20:13:32] hm, i don't really know the status of what you all are trying to deploy, but 1004 is def not right [20:13:39] aqs not running there [20:13:52] it says [20:13:53] +4d9f5160687f8dc3df3401453d2da5e861c19db7 src (remotes/origin/master-18-g4d9f516) [20:14:18] trying things... [20:14:27] ottomata: changeset is running on beta w/o troubles so it must be something else [20:14:29] sudo -u deploy-service git submodule update [20:14:29] fatal: reference is not a tree: a7eb80d239e36b0e43d46653c2145812a86c24f7 [20:14:33] yeah, its not the change [20:14:40] its scap/submodule bustedness [20:14:41] dunno why yet [20:14:45] jajaja [20:15:10] deploy repo is at [20:15:12] commit 025ef23c156d4de38fe3a0a4ee282ecb61f5b4a0 [20:15:12] Author: nuria [20:15:12] Date: Thu Jan 19 12:49:13 2017 -0800 [20:15:13] Update aqs to a7eb80d [20:15:17] that seems good [20:15:30] ya [20:15:32] deploy repo looks fine [20:15:44] i'm goign to move aqs/deploy out of the way, and try deploying fresh to 1004 [20:16:09] ottomata: removing everything? 
[20:16:14] move :) [20:16:18] mv aqs ./aqs.otto.1 [20:16:21] might have to run puppet, not sure [20:16:40] i'm using a big fat hammer [20:16:47] ah yes :D [20:17:04] this is what i've done in beta when it breaks [20:17:05] submodules man [20:17:08] and scap [20:17:11] sometimes it doesn't go right [20:17:14] i don't know why [20:17:28] now its good! [20:17:46] looks good now! [20:17:49] ahahahh [20:17:51] weird [20:17:54] hmm, not fully good [20:18:45] ok, yes good [20:18:50] i was deploying from a weird dir [20:18:57] /srv/deployment/analytics/aqs/deploy-cache/cache/test/test_local_aqs_urls.sh works fine [20:19:03] yeah, ok [20:19:05] aqs is up and running [20:19:07] yeah [20:19:09] weiiird [20:19:12] ok, i'm going to repool [20:19:25] checking reqs with httpry [20:19:30] elukey: ok, should we fix aqs-prod hosts, and then try to do full deploy? [20:19:49] ottomata: I'd proceed one host at the time, depooling first [20:20:17] it is scap-deploy --limit aqs1005.eqiad.wmnet or similar? [20:20:25] 10Analytics: Improve AQS deployment - https://phabricator.wikimedia.org/T156049#2962466 (10Ottomata) To add depooling and group based rolling deploys, we should adapt the same settings from citoid and apply them to aqs scap.cfg: https://github.com/wikimedia/mediawiki-services-citoid-deploy/blob/master/scap/che... [20:20:42] elukey: scap deploy --limit SERVER [20:20:51] elukey: but you have to deppol by hand [20:21:00] yep yep [20:21:07] what do you think ottomata ? [20:21:13] just for aqs1005, to be sure [20:21:13] 10Analytics: Improve AQS deployment - https://phabricator.wikimedia.org/T156049#2962883 (10Ottomata) I think we'll also want `group_size: 2` [20:21:24] I'll follow up on --^ tomorrow [20:21:43] elukey: sure [20:21:52] did you want to deploy to the new hosts too? 
[20:22:01] they aren't in lvs yet [20:22:40] ah nono still need to do some work, will do it during the next days [20:22:42] thanks :) [20:23:23] ottomata, elukey : thank you, will check monthly parameter now [20:23:45] ottomata: checking aqs1004 [20:24:37] joal: pretty coOOl! https://yarn.wikimedia.org/proxy/application_1480065021448_201730/streaming/ [20:24:39] ottomata: monthly is working now [20:24:50] \o/ [20:24:51] ok [20:24:52] cool [20:24:52] so [20:25:00] next one 1005 [20:25:04] yall got this? [20:25:14] not a clue why it broke [20:25:19] but my big fat hammer seems to have helped [20:25:20] :) [20:25:22] ottomata: that is unsettling [20:25:46] ottomata: I can picture you with a cigar saying "I love it when a plan comes together" [20:25:57] ottomata: but yes, i got it, let me check new code is working everywhere [20:26:01] k [20:26:49] aqs1005 next? [20:26:54] ottomata: no, aqs1005 no work [20:27:05] ottomata: ah wait, you did not deploy [20:27:09] yeah :) [20:27:09] ottomata: right? [20:27:14] ottomata: i need to do it [20:27:20] ottomata: ooohhh [20:27:27] elukey: ok, ahem... got it NOW [20:28:18] :) [20:30:18] elukey, ottomata : checked all hosts all working well with new monthly code, thanks agaian [20:30:20] *again [20:30:35] so we went all in with 1005/1006 without depooling ? :D [20:30:43] :D [20:30:54] ¯\_(ツ)_/¯ [20:32:12] all right checked on aqs100[56] /srv/deployment/analytics/aqs/deploy-cache/cache/test/test_local_aqs_urls.sh and all the repos [20:32:19] code looks good, aqs is running fine [20:32:32] https://config-master.wikimedia.org/conftool/eqiad/aqs - all hosts pooled [20:32:51] good, I guess I can logoff [20:33:00] ottomata, nuria - anything left? [20:33:42] elukey: not for deploy so see ya to morrow [20:34:08] super [20:34:14] thanks ottomata for the hammer! 
[20:34:18] :D [20:34:22] byyyeeeee o/ [20:36:11] milimetric: new monthly parameter now works from inside restbase but doesn't if you ask for it from the outside: https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/de.wikipedia/all-access/all-agents/Barack_Obama/monthly/2016010100/2016123100 [20:36:23] milimetric: is there an outstanding change we need to do? [20:36:50] nuria: yeah, I guess we were wrong and there's a config we need to change [20:37:00] mobrovac: yt? [20:37:03] we initially looked at it in puppet and didn't see anything [20:37:08] let's look again [20:37:30] milimetric: it should not be in puppet if it assets on value of application parameters, at least i hope it isn't [20:37:45] let's see if mobrovac knows [20:38:29] 10Analytics, 10Analytics-Cluster, 06Operations, 06Research-and-Data, and 2 others: GPU upgrade for stats machine - https://phabricator.wikimedia.org/T148843#2962931 (10Ottomata) We guesstimated that a stat1002 like replacement would cost around $10K (these machines have a lot of storage...we may reevaluate... [20:39:23] 10Analytics, 10Analytics-Cluster, 06Operations, 06Research-and-Data, and 2 others: GPU upgrade for stats machine - https://phabricator.wikimedia.org/T148843#2962934 (10Ottomata) That is, if you all are ok with waiting until sometime in Q4 for this. If not, we'd have to get a smaller form factor GPU and pu... [20:39:47] nuria: I can't find it because puppet has been re-organized a *lot* [20:40:18] I don't want to waste hours looking through it, but it would be nice if we got a walk-through from our opsy teammates, to help reset our brains [20:41:38] milimetric: let's make sure this setting is on puppet , it might not be [20:41:43] Pchelolo: yt? 
[20:43:25] 10Analytics-Dashiki, 06Analytics-Kanban, 13Patch-For-Review: Add extension and category (ala Eventlogging) for DashikiConfigs - https://phabricator.wikimedia.org/T125403#2962978 (10demon) According to https://deployment.wikimedia.beta.wmflabs.org/wiki/Special:Version it is installed. [20:46:32] running home real quick... [20:49:15] or gwicke yt? [21:04:29] nuria: here now [21:05:08] Pchelolo: ahem question of the day: we have added a new parameter ("monthly") to pageview api, so queries like: [21:05:20] Pchelolo: "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/de.wikipedia/all-access/all-agents/Barack_Obama/monthly/2016010100/2016013100" [21:05:26] Pchelolo: are "valid" ones [21:06:07] Pchelolo: this works if we hit restbase from aqs hosts [21:06:09] nuria@aqs1006:~$ curl http://localhost:7232/analytics.wikimedia.org/v1/pageviews/per-article/de.wikipedia/all-access/all-agents/Barack_Obama/monthly/2016010100/2016120200 [21:06:48] Pchelolo: but doesn't work from the "outside", so I imagine we have to whitelist this "monthly" parameter somewhere, but where? [21:07:01] Pchelolo: does this make sense? I can elaborate more if not. [21:07:01] nuria: PR coming [21:08:20] Pchelolo: okeis [21:08:38] Pchelolo: is that code we forgot? [21:08:51] nuria: https://github.com/wikimedia/restbase/pull/746 [21:11:02] Pchelolo: i see, we forgot that bit [21:11:43] this config duplication is not ideal to be honest.. [21:12:54] Pchelolo: agreed, one should be "built" from the other [21:13:06] Pchelolo: i do not have permits to merge, please do the honors [21:13:43] kk, will do after the tests pass [21:15:09] 06Analytics-Kanban, 15User-Elukey: Ongoing: Give me permissions in LDAP - https://phabricator.wikimedia.org/T150790#2963102 (10Tnegrin) omg -- it works and it's wonderful. Thanks all!
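For reference, a small sketch of how the per-article pageview URL quoted above is assembled, with the granularity segment ("monthly" being the newly added value) made explicit. The helper name `per_article_url` and the choice of "daily" as the default are assumptions for illustration; the path layout and the YYYYMMDDHH timestamps are taken from the URL in the log.

```python
from urllib.parse import quote

BASE = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"

# "monthly" is the granularity added in this deploy; "daily" is assumed here
# to be the previously supported value.
GRANULARITIES = ("daily", "monthly")


def per_article_url(
    project: str,
    article: str,
    start: str,
    end: str,
    granularity: str = "daily",
    access: str = "all-access",
    agent: str = "all-agents",
) -> str:
    """Assemble a per-article pageviews URL; start/end are YYYYMMDDHH strings."""
    if granularity not in GRANULARITIES:
        raise ValueError(f"unsupported granularity: {granularity!r}")
    # Titles use underscores for spaces and must be percent-encoded.
    title = quote(article.replace(" ", "_"), safe="")
    return "/".join([BASE, project, access, agent, title, granularity, start, end])


url = per_article_url(
    "de.wikipedia", "Barack Obama", "2016010100", "2016013100", granularity="monthly"
)
print(url)
```

The point raised in the chat is that this path segment is validated in two places (AQS and the public RESTBase spec), so a value accepted on localhost:7232 can still be rejected from the outside until both agree.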
[21:17:51] 10Analytics: geowiki data for Global Innovation Index - https://phabricator.wikimedia.org/T131889#2963112 (10leila) a:03leila [21:18:48] thanks ottomata :] [21:19:34] thank you! [21:19:46] 10Analytics: geowiki data for Global Innovation Index - https://phabricator.wikimedia.org/T131889#2963123 (10leila) thanks @Milimetric . Based on my discussion today with you, Nuria, Rafael and Jordan, I will check the data you linked above and work on possible ways we may be able to release a scoring of countri... [21:20:06] 10Analytics, 06Research-and-Data: geowiki data for Global Innovation Index - https://phabricator.wikimedia.org/T131889#2963124 (10leila) [21:41:24] 10Analytics, 10EventBus, 13Patch-For-Review, 06Services (done), 05WMF-deploy-2017-01-24_(1.29.0-wmf.9): EventBus produces non-canonical page urls - https://phabricator.wikimedia.org/T155066#2963164 (10Pchelolo) Moving to `done` pending deployment. [21:45:26] Pchelolo: are you also deploying the PR? [21:45:34] Pchelolo: (fine to say no) [21:51:56] nuria: we didn't have a lot to deploy today, I'm in a meeting. be back in a moment [21:52:08] Pchelolo: np, no urgency at all [22:02:25] ok nuria I will make a small patch and deploy afterwards [22:49:56] nuria: I've done a restbase deploy [22:51:39] thanks Pchelolo [22:52:00] I'm sorry I confused Nuria, the other config used to be in puppet a long time ago, I remembered wrong [23:56:29] 10Analytics, 10Analytics-Cluster, 06Operations, 06Research-and-Data, and 2 others: GPU upgrade for stats machine - https://phabricator.wikimedia.org/T148843#2963557 (10ellery) I'm in no rush, especially if I can get some budget to rent GPUs on AWS in the meantime.