[00:52:56] Analytics, Developer-Relations, MediaWiki-API, Reading-Admin, and 4 others: Metrics about the use of the Wikimedia web APIs - https://phabricator.wikimedia.org/T102079#2221517 (bd808) [00:53:00] Analytics, Community-Tech-Tool-Labs, Developer-Relations, MediaWiki-API, and 4 others: Determine which Action API parameters to whitelist/blacklist for action_param_hourly aggregate table - https://phabricator.wikimedia.org/T132283#2221515 (bd808) Open>Resolved | Action | Parameter | | -... [07:32:30] Analytics-Kanban: analyse AQS queries over the previous month or weeks to have a better understanding of how compaction should behave - https://phabricator.wikimedia.org/T133016#2221835 (JAllemandou) Done over march data: https://docs.google.com/a/wikimedia.org/spreadsheets/d/1Jm6s25e0T1npXhfM5fVtrvuC-4LM8D... [07:51:24] Analytics, RESTBase, Services, User-mobrovac: configure RESTBase pageview proxy to Analytics' cluster on wiki-specific domains - https://phabricator.wikimedia.org/T119094#2221857 (mobrovac) In order to put a strawman forward and progress the discussion, I'm reposting my proposal from T114830: |... [08:35:09] https://puppet-compiler.wmflabs.org/2509/analytics1027.eqiad.wmnet/ shows no relevant difference for the new Hue config [08:35:16] \o/ [08:35:55] elukey: Yay !P [08:37:12] and it seems working with git submodules properly, so double \o/ [08:37:29] joal: morning! Let me know when you want to stop our dear friend Camus [08:37:38] elukey: morning as well :) [08:37:57] elukey: given Andrew email, we can wait :) [08:41:13] sure sure, it was a "ping me whenever you prefer" [08:41:24] elukey: in order to have the jobs finished (or almost) without putting too late, I think we should stop camus after jobs for hour 13:00 UTC starts [08:41:57] all right! [08:41:59] Which means stopping camus at about 15:30 our time [08:42:27] elukey: we could even wait 15:55, for camus to have less to catch up :) [08:45:20] yep makes sense! 
[08:57:46] Analytics-EventLogging, Operations, RESTBase, Services, and 2 others: RESTBase should handle the X-Analytics header - https://phabricator.wikimedia.org/T133139#2221908 (mobrovac) [08:58:29] elukey: joal: mind taking a look at ^ and tell me if that makes sense? [08:59:31] Hi mobrovac [08:59:50] bonjour joal [08:59:53] makes sense to me for most, I don't know about the echoing (nuria or ottomata should know better) [09:00:09] gr8, thnx! [09:00:21] Bonjour! I didn't know you speak French :) [09:00:41] lived in france for too long not to speak it :P [09:00:47] huhuhu [09:01:09] 4y in rennes and another in marseille [09:01:12] elukey: did you see? puppet is failing on analytics1003 since 13hrs [09:01:26] moritzm: goooood morning! [09:01:36] something is broken with the mysql setup, but didn't look closer [09:01:41] and good morning as well :-) [09:01:48] I didn't see it but I blame ottomata, that node is super new one :P [09:01:50] right mobrovac, I think you told me last year at Wikimania, but my brain hadn't written [09:01:58] haha [09:01:58] he was working on it yesterday, checking [09:03:16] mobrovac also speaks Italian very well, a man of a thousand talents :D [09:03:30] lol [09:04:10] elukey: before going to school, as a kid, i was able to speak only a kind of veneto [09:04:19] :) [09:05:00] living in a bilingual env helps you to pick up langs easily [09:06:07] * joal is jealous :) [09:10:36] mobrovac: hahahahahah [09:13:21] Analytics, RESTBase, Services, User-mobrovac: configure RESTBase pageview proxy to Analytics' cluster on wiki-specific domains - https://phabricator.wikimedia.org/T119094#1817629 (JAllemandou) Bike-shediiiiiiiing ! To mirror a bit more the current format, I'd suggest: | Public API endpoint | AQ... 
[09:15:05] a-team I'm AFK for a while [10:35:26] Analytics-Cluster, Analytics-Kanban, Patch-For-Review: Use MySQL as Hue data backend store - https://phabricator.wikimedia.org/T127990#2222563 (elukey) We decided to implement only the database support for Hue without all the initialization part (not worth the effort). This task will be done in two s... [10:41:25] !log started rsync of /srv from stat1001 to stat1004 (/srv/stat1001) [10:41:27] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log, Master [11:19:16] Analytics-EventLogging, Operations, Performance-Team, Patch-For-Review: "Throughput of EventLogging NavigationTiming events" UNKNOWN - https://phabricator.wikimedia.org/T132770#2222719 (akosiaris) p:Triage>Normal [11:30:15] * elukey lunch! [12:46:06] Analytics-Kanban, Operations, Patch-For-Review: Upgrade stat1001 to Debian Jessie - https://phabricator.wikimedia.org/T76348#2223054 (elukey) started the rsync for /srv with (thanks @Dzahn!): ``` rsync -avp /srv rsync://stat1004.eqiad.wmnet:/srv ``` that is still ongoing. After that I'll also backup... [13:42:33] Analytics-Kanban, Operations, Patch-For-Review: Upgrade stat1001 to Debian Jessie - https://phabricator.wikimedia.org/T76348#2223436 (elukey) All data backupped in /srv/stat1001 on stat1004, we should be ready to proceed. [13:43:36] Analytics-Kanban, Operations, Patch-For-Review: Upgrade stat1001 to Debian Jessie - https://phabricator.wikimedia.org/T76348#2223448 (elukey) [13:50:05] yoohooo good morning [13:51:35] Good morning ottomata :) [13:51:42] You're earlier than expected :) [13:51:48] We are about to stop camus [13:51:51] elukey: here? [13:52:19] joal: yep :) [13:52:32] camus kicking time? 
[13:52:34] elukey: --^ [13:54:11] !log puppet stopped on analytics1027 together with Camus (via crontab -e) [13:54:13] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log, Master [13:54:22] joal: was doing that : [13:54:23] :) [13:54:27] elukey: every camus instance, right ;) [13:54:48] cool! [13:54:49] ok [13:55:01] ja, thing was over early, and it was right next door to a cafe [13:55:09] ottomata: we should be good to go in about half an hour [13:55:37] * joal hopes everything will work as expected :) [13:56:00] * elukey grabs a coffee [13:58:23] k cool [13:58:48] will check on hadoop jobs in a bit [13:59:06] ottomata: is it really 10:45 am for you ? [13:59:22] cause I was thinking it would be at 4:45pm my time [13:59:25] naw, its 9:45 [13:59:28] well [13:59:30] 9:00 [13:59:32] now [13:59:34] sorry [13:59:35] 10:00 now [13:59:38] is the time for me [13:59:40] ahhhhh, you cheated ;) [13:59:44] i cheated? [14:00:09] I thought you said 10:45 your time in your email [14:00:12] i did! [14:00:20] So I wasn't expecting you before :) [14:00:26] i wasn't either! [14:00:29] i'm here early [14:00:30] :) [14:00:43] Ah ok :) This is what cheating meat to me :) [14:09:19] ottomata: o/ [14:09:25] what do you think about merging https://gerrit.wikimedia.org/r/#/c/284204/ [14:09:28] ? [14:11:08] elukey: was reading it now! :) [14:16:58] (PS7) Amire80: Add sorted errors [analytics/limn-language-data] - https://gerrit.wikimedia.org/r/282228 [14:18:10] ottomata: mmmm do you mean all the comments or only the ones explaining the fields? [14:18:40] I would discard the ones with ## [14:19:01] like ## host= [14:21:17] elukey: the ones explaining :) [14:21:31] alll right! 
Code review updated :) [14:21:37] I am also running the puppet compiler [14:21:39] what i sometimes (but not always) do [14:21:43] if a value is unset [14:21:56] and the upstream config file has the config commetned [14:21:59] is leave the comment in there [14:22:16] https://puppet-compiler.wmflabs.org/2512/analytics1027.eqiad.wmnet/ [14:22:22] that way, if you are casually reading the conf file (instead of looking at the puppet template), you can see what the possible confs are [14:22:24] immediately [14:22:35] yep yep makes sense :) [14:22:44] and, if you happen to be hacking/deving something in labs or whatever, you can just uncomment and provide a value manually if you want [14:23:07] cool, that looks good [14:23:19] LGTM elukey! [14:23:28] goooood! merging :) [14:23:29] merged [14:23:31] haha [14:23:34] beat ya! [14:23:47] brb [14:26:15] b [14:34:56] 0 jobs running! [14:35:25] ottomata: I was five minutes late :) [14:36:06] ottomata, elukey: You're good to go :) [14:36:10] cool [14:36:16] still a little early, should we start though? [14:36:21] seems fine? [14:36:25] no one is using it [14:36:41] good for me [14:36:41] stopping puppet [14:37:24] its pretty loud in this cafe, but shall we batcave? :) [14:37:26] for fun!? [14:37:32] elukey: ? [14:37:48] joal: you don't have to come, but elukey and i were gonna do together [14:37:56] but you can come and hang if you want! [14:38:11] ottomata: sounds good! Let me check if I can grab the conf room [14:38:16] m [14:38:17] k [14:38:18] if you don't mind, I'll be there, listen (sometimes) and other stuff :) [14:39:36] etherpad here https://etherpad.wikimedia.org/p/analytics-meta [14:51:19] Analytics: reportupdater to support multiple output folders per config - https://phabricator.wikimedia.org/T126549#2223698 (mforns) Open>Invalid I think this task is actually done, because RU puppet setup now supports that a query repository has more than one query folder, and we can setup a RU job f... 
[15:06:27] Analytics, DNS, Operations, Traffic: Create analytics.wikimedia.org - https://phabricator.wikimedia.org/T132407#2223733 (Nuria) @BBlack: let us know if you think we can proceed with this and whether fab is an acceptable way to deploy [15:18:12] Analytics, DNS, Operations, Traffic: Create analytics.wikimedia.org - https://phabricator.wikimedia.org/T132407#2223774 (BBlack) I really have no idea about the fab deployment method (whether it's ok, how we automate it and grant access, where it's fetching data from, etc), or how/when we're goin... [15:26:13] Analytics-EventLogging, Operations, RESTBase, Services, and 2 others: RESTBase should handle the X-Analytics header - https://phabricator.wikimedia.org/T133139#2221908 (Nuria) The premise of this ticket is .. ahem.. pretty incorrect, let's catch up on IRC. [15:26:22] joal: are you all planning to setup the disks in your new hosts in the usual way (a single raid0)? [15:27:16] nuria_: enlighten me, please! :P [15:27:32] or elukey, or ottomata, or $whoever [15:28:12] * urandom liberally highlights nicks [15:29:27] urandom: $whoever will unlikely respond :P [15:29:34] * mobrovac trolling [15:30:14] mobrovac: $whoever just did! [15:30:19] boom. [15:30:20] Analytics-Kanban: Build a javascript client for the unique devices API - https://phabricator.wikimedia.org/T133159#2223841 (mforns) [15:30:24] urandom: we didn't discuss it but I guess that we should keep the current config no? [15:30:29] any reason why we should change? [15:30:32] elukey: maybe. [15:30:58] elukey: consider the blast radius of an array failure [15:31:28] elukey: is it safe to assume that the new nodes are also 10T in size? [15:31:58] mobrovac: so 99% of pageviews go through varnish as they are cached [15:32:32] elukey: are you planning on using multiple instances? [15:32:38] mobrovac: the x-analytics header is populated fine for all those pageviews [15:32:53] nuria_: the procedure you linked from the VCL happens on the way out (i.e. 
varnish checks the response's x-analytics header) [15:32:54] * ebernhardson is going to guess the oozie job failure mail is nothing to worry about [15:32:58] mobrovac: it can be use one of two ways [15:33:21] * mobrovac listening :) [15:33:35] urandom: I was planning to have a meeting with you and joal next week to discuss the best way forward :) [15:33:56] Analytics-Kanban: Build a javascript client for the unique devices API - https://phabricator.wikimedia.org/T133159#2223841 (mforns) [15:33:58] Analytics, DNS, Operations, Traffic: Create analytics.wikimedia.org - https://phabricator.wikimedia.org/T132407#2223885 (akosiaris) >>! In T132407#2211054, @Nuria wrote: > > @BBlack > This would not be a full-fledged service. What we would be deploying either via puppet of fab is just html/js... [15:34:05] mobrovac: the x-analytics header is used in two different ways: [15:34:21] urandom: would you be free more or less at this time one day of next week? [15:34:41] mobrovac: clients can set it to use it as a bad of analytics values. See for example x-analytics extension: [15:34:46] oozie is yelling at me via email -- "Fatal Error - Oozie Job load_mediawiki-wmf_raw.ApiAction,2016,4,20,13-wf" [15:35:05] mobrovac: https://github.com/wikimedia/mediawiki-extensions-XAnalytics [15:35:13] looks like it mad about partitioning the last 2 hours of data? [15:35:19] bd808: we are doing maintenance, moving oozie from one server to the other :) [15:35:39] elukey: monday? [15:35:49] mobrovac: and we use it in VCL as an agreggator of values that we are going to pass to varnishkafka [15:35:51] urandom: looks good! Will send you the meeting invite [15:35:54] mobrovac: and that is this code: https://github.com/wikimedia/operations-puppet/blob/production/templates/varnish/analytics.inc.vcl.erb#L162 [15:35:57] elukey: kk! [15:36:02] ll nuria_ [15:36:10] nuria_: thnx for the pointers! 
[15:36:12] will study them [15:36:38] mobrovac: it basically says: 'add to x-analytics cookies and headers that have analytics value' [15:37:44] mobrovac: it will look into the header just in case was set by client and also will append to it whatever we think is of interest thus the analytics name of varnish template [15:38:06] nuria_: ok, so from what i can see, the header is actually generated server-side [15:38:12] mobrovac: both [15:38:13] in the extension, that is [15:39:06] mobrovac: the extension uses php yes. [15:40:15] mobrovac: let me run some queries once hive is back up and we can disprove the unlikely theory that is a case issue [15:40:52] nuria_: bblack said earlier: it doesn't make sense for a client to send us X-Analytics, unless that was to influence our code that would look at that field and set an outbound X-Analytics flag in the response [15:40:56] urandom: currently solving some cluster issues, will come back to you soon [15:41:19] nuria_: and: the normal flow of things is that X-Analytics is only set on the outbound side. An appserver can set X-Analytics fields in its response to varnish, and varnish will almost certainly set more (appending). [15:41:29] joal: no worries; or we can talk about it on monday [15:45:01] mobrovac: I think that is the likely problem yes. But i disagree that the client should not set those, i do not see why not, otherwise we would have a proliferation of headers. 
[15:45:40] mobrovac: any out of bound item the client wants to communicate will become its own header [15:46:18] nuria_: the way i see it is that "the problem" here is the fact that varnish takes no action on the incoming x-analytics header, but expects the BE to do that (just found some code in MobileFrontend that does just that) [15:46:47] nuria_: so, if you hit the cache, varnish will react to whatever is there (in the cached response) regardless of what the client sent [15:47:03] at least that's my current understanding [15:49:26] nuria_: https://github.com/wikimedia/mediawiki-extensions-MobileFrontend/blob/master/includes/MobileFrontend.hooks.php#L109-L122 [15:51:32] mobrovac: ya, that code was the "seed" i think for x-analytics extension: https://github.com/wikimedia/mediawiki-extensions-XAnalytics/blob/master/XAnalytics.class.php#L55 [15:53:03] mobrovac: it's looking (see last e-mail) that we are just going to have to have different headers per key, value pair on X-analytics [15:53:19] nuria_: hive is useable [15:53:24] we are just fixing some oozie job stuff [15:53:38] nuria_: yup, reading it right now [15:54:13] mobrovac: and we will need to apologize to apps team as we misslead them to think this might work , i would certainly prefer 1 header for all out of bound analytics communication. [15:55:16] nuria_: conceptually i think your suggestion to them (about them sending the header) makes sense - the client knows if this is really a page view or not, but the problem comes once the response is cached [15:55:42] once x-analytics: pageview=1 is cached, all reqs for that URI will have that [15:55:54] which will then render the results incorrect [15:56:10] (PS1) BryanDavis: Remove bjorsch from ApiAction email list [analytics/refinery] - https://gerrit.wikimedia.org/r/284472 [15:58:26] nuria_: mobrovac haven't read all of your backscroll but am curious [15:58:33] when is a cached pageview=1 not a pageview? 
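[editor's note] The exchange above about a cached `x-analytics: pageview=1` can be illustrated with a toy model. This is only a sketch of the behaviour mobrovac describes as his "current understanding", not Varnish or VCL: `ToyCache`, `backend`, the URI and the user-agent strings are all hypothetical, and the backend is simply assumed to echo the client's header into its response.

```python
from dataclasses import dataclass


@dataclass
class Response:
    body: str
    headers: dict


def backend(uri, request_headers):
    # Hypothetical backend: copies the client's X-Analytics value
    # into the response it hands back to the cache.
    headers = {}
    if "X-Analytics" in request_headers:
        headers["X-Analytics"] = request_headers["X-Analytics"]
    return Response(body=f"content of {uri}", headers=headers)


class ToyCache:
    """Stores the whole backend response (headers included), keyed by URI only."""

    def __init__(self, backend_fn):
        self.backend_fn = backend_fn
        self.store = {}
        self.backend_hits = 0

    def get(self, uri, request_headers):
        if uri not in self.store:
            # Cache miss: only this first request's headers reach the backend.
            self.backend_hits += 1
            self.store[uri] = self.backend_fn(uri, request_headers)
        # Cache hit: the stored response is returned as-is,
        # whatever the current client sent.
        return self.store[uri]


cache = ToyCache(backend)
uri = "/example/page"  # placeholder URI

# First request: an app that tags its own request as a pageview.
app = cache.get(uri, {"User-Agent": "example-app", "X-Analytics": "pageview=1"})

# Second request: another client that sends no X-Analytics at all.
other = cache.get(uri, {"User-Agent": "curl"})

print(other.headers.get("X-Analytics"))
print(cache.backend_hits)
```

In this model the second client receives the cached `pageview=1` even though it never sent the header, which is the concern raised at 15:55. Whether that matters in practice depends on what else (for example the user agent) is required before a request is counted, which is where the discussion goes next.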
[15:59:08] ottomata: ya, that is the part i do not get either [15:59:16] bd808: aye, sorry for that spam, we missed a step in our migration plan that means we have to resubmit all hive oozie jobs [15:59:20] i've already done yours [15:59:31] and, this was much easier with the combined mediawiki avro bundle [15:59:32] so thanks for that! [15:59:42] ottomata: awesome [16:00:00] I figured it was related to the migration when I poked here [16:00:13] mobrovac: what could be the problem with caching those? [16:00:14] I mostly wanted to hear what I did "we're on it" [16:00:14] ottomata: nuria_: that was more of a question than a statement [16:00:41] ebernhardson: hiya, yt? [16:01:01] mobrovac: ah, cause that is what i do not get , i do not see a problem there but maybe i am missing something out [16:01:02] ottomata: nuria_: is it possible that a retrieval of a URI does not constitute a pageview when the first (cached) response was? [16:01:22] mobrovac: for the apps? no i do not think so [16:01:37] ottomata: yup [16:01:39] mobrovac: teh tagging is a way for us to have to inspect (via regex) a url [16:02:11] ebernhardson: popularity_score-coord needs to be killed and resubmitted [16:02:11] ottomata: do i need to re-kick off our oozie jobs? [16:02:18] nuria_: like, if i manually ask for the URI from the browser or curl or something, that shouldn't be counted as a mobileapp pageview [16:02:35] ebernhardson: i think i've done the other ones [16:02:42] its just that one, since it is outside of refinery [16:02:46] as that's nor a mobile view or an android app view [16:03:06] since i'm lurking: that's a good point; there is currently no difference. 
A request from the app for /api/rest_v1/page/mobile-sections-lead/ is a pageview 100% of the time [16:03:24] <ottomata> ebernhardson: this one: [16:03:24] <ottomata> https://hue.wikimedia.org/oozie/list_oozie_coordinator/0010115-160223202501439-oozie-oozi-C/ [16:03:34] <nuria_> mobrovac: such is teh nature of http [16:03:35] <ottomata> you need to do a discovery repo hdfs deploy [16:03:37] <mdholloway> though mobrovac makes a good point that requests from other clients shouldn't be counted and would be if it were cached as a pageview [16:03:39] <ottomata> and resubmit the oozie job [16:03:46] <ottomata> we are going to revisit this problem [16:03:50] <ottomata> i can explain if you want more info... :) [16:03:59] <nuria_> mobrovac: as mdholloway said, if you request api.php with an app user agent and you get a 200 response [16:04:07] <nuria_> mobrovac: that will be counted as an app pageview [16:04:13] <ebernhardson> ok already killed, easy enough to deploy and resubmit [16:04:17] <ottomata> yup perfect [16:04:20] <nuria_> mobrovac: but maybe there is a problem beyond this one that i am missing [16:04:26] <ottomata> just gotta do hdfs deploy (to pick up new hdfs-site.xml) [16:04:29] <mdholloway> i suppose using headers preserves flexibility in the event we want to change what constitutes a pageview within the client [16:04:30] <ottomata> and then resubmit [16:04:42] <mobrovac> nuria_: yes, correctly, so it's gotta be the right UA and that makes sense [16:04:58] <nuria_> mobrovac: ya, that might be it actually, the UA [16:05:06] <ebernhardson> ottomata: more just curious (already running it), why does it need to be re-deployed to hdfs? [16:05:13] <nuria_> mobrovac: is required for it to be counted in one case but not the other [16:05:20] <elukey> nuria_: standup?? 
[16:05:27] <nuria_> elukey: yes, sorry [16:05:40] <ottomata> ebernhardson: ja [16:05:42] <ottomata> so [16:05:51] <ottomata> the way we do this in refinery, and the way yall copied from us [16:06:07] <ottomata> was to make the hdfs deployment copy the hive-site.xml into the deployed dir in hdfs [16:06:17] <ebernhardson> ahh, ok that makes sense [16:06:18] <ottomata> hive-site.xml contains connection information for hive clients [16:06:21] <ottomata> that info has changed [16:06:23] <elukey> !log camus re-enabled on analytics1027 [16:06:26] <analytics-logbot> Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log, Master [16:06:28] <ottomata> so, preivously deploy oozie jobs have stale configs [16:06:48] <ottomata> i think w want to change this process though, it makes sense to statically link oozie jobs to hard deploy paths/versions, instead of always using latest [16:06:55] <ottomata> but for this hive-site config, i think we want to always use latest [16:07:13] <ottomata> so, we will probably figure out a better way to link it from oozie jobs in hdfs, and then show you what we do [16:07:14] <ottomata> and you can adapt too [16:07:21] <ebernhardson> i'll kill and restart the transfer_to_es job then too [16:08:13] <wikibugs> Analytics-EventLogging, Operations, RESTBase, Services, and 2 others: RESTBase should handle the X-Analytics header - https://phabricator.wikimedia.org/T133139#2224008 (mobrovac) Open>Invalid Nope, that's not what we should do. [16:08:35] <ottomata> ebernhardson: i don't think that one needs it [16:08:38] <ottomata> won't hurt [16:08:43] <ottomata> only ones that use hive need restarted [16:09:08] <ebernhardson> i suppose it reads from hdfs directly instead of using hive so probably not [16:09:28] <ebernhardson> but both do actually. 
they just happen to load HiveContext so they can use the UDF's [16:09:45] <ottomata> hm, i think as long as there is not a hive oozie action [16:09:47] <ottomata> in your job [16:09:50] <ottomata> it won't need restarted [16:10:02] <ottomata> if those have a hive action, then ja restart them [16:10:57] <ebernhardson> both restarted, was easy enough to just copy/paste the commands out of wikitech anyways. Thanks for the reminder! [16:21:55] <wikibugs> Analytics, DNS, Operations, Traffic: Create analytics.wikimedia.org - https://phabricator.wikimedia.org/T132407#2224049 (Dzahn) Everything Alex already said :) i setup bromine and most of those microsites and yea, it's meant for small static sites. In addition to one of those small puppet roles... [16:24:16] <wikibugs> Analytics, DNS, Operations, Traffic: Create analytics.wikimedia.org - https://phabricator.wikimedia.org/T132407#2224056 (BBlack) Well, the impedance mismatch here on the standard static bromine setup and what analytics is asking for then may be all about the static-ness and deployment process. I... [16:29:10] <wikibugs> Analytics, DNS, Operations, Traffic: Create analytics.wikimedia.org - https://phabricator.wikimedia.org/T132407#2224067 (Dzahn) It would mostly just be about who has +2 on the gerrit repo that holds the actual site content. If the puppet role on our site git clones with "ensure latest" then there... [16:48:03] <wikibugs> Analytics, DNS, Operations, Traffic: Create analytics.wikimedia.org - https://phabricator.wikimedia.org/T132407#2224174 (Nuria) >It would mostly just be about who has +2 on the gerrit repo that holds the actual site content. If the puppet role on our site git clones with "ensure latest" >then th... 
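[editor's note] ottomata's explanation above (16:05 to 16:07) is that the deployment copies hive-site.xml into the deployed directory, so jobs submitted earlier keep pointing at a stale copy after the connection info changes. A minimal sketch of that effect, using a local temporary directory instead of HDFS; the `deploy` helper, the directory layout and the file contents are placeholders, not the refinery's actual deployment code.

```python
import shutil
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())

# Stand-in for the cluster-wide hive-site.xml (contents are placeholders).
shared_conf = root / "hive-site.xml"
shared_conf.write_text("metastore=old-host")


def deploy(version):
    """Hypothetical deploy step: snapshot the config into a versioned directory."""
    target = root / "deploy" / version
    target.mkdir(parents=True)
    shutil.copy(shared_conf, target / "hive-site.xml")
    return target


# A job is submitted against the v1 deployment and keeps using that path.
job_dir = deploy("v1")

# Later the Hive connection info changes (e.g. after a server move).
shared_conf.write_text("metastore=new-host")

pinned = (job_dir / "hive-site.xml").read_text()   # what the old job still sees
latest = shared_conf.read_text()                   # what a fresh deploy would copy

print(pinned)
print(latest)
```

The pinned copy still names the old host, which is why jobs with a hive action had to be redeployed and resubmitted. Pinning code to a fixed version while reading this one config from a shared "latest" location is the direction ottomata says the team will probably take.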
[16:53:55] <nuria_> mobrovac: updated e-mail thread again, let me know if it doesn't sound good [16:56:10] <mobrovac> kk will take a look [16:57:12] <wikibugs> Analytics: Get http level ops stats for AQS from varnish - https://phabricator.wikimedia.org/T133171#2224202 (Nuria) [16:57:22] <wikibugs> Analytics: Get http level ops stats for AQS from varnish - https://phabricator.wikimedia.org/T133171#2224214 (Nuria) p:Triage>High [17:10:52] <wikibugs> Analytics, Commons, Multimedia, Wikidata, and 3 others: Allow tabular datasets on Commons (or some similar central repository) (CSV, TSV, JSON, XML) - https://phabricator.wikimedia.org/T120452#1854102 (ThurnerRupert) just to add another example which might help the zillion of sports results on wi... [17:11:36] <wikibugs> Analytics, DNS, Operations, Traffic: Create analytics.wikimedia.org - https://phabricator.wikimedia.org/T132407#2224265 (BBlack) When you say "build our code" do you mean building client-side javascript code that's ultimately static content from the server's perspective, or do you mean building s... [17:13:17] <wikibugs> Analytics, Pageviews-API: Allow for arbitrary ranges in /top endpoint of pageviews API - https://phabricator.wikimedia.org/T133176#2224280 (MusikAnimal) [17:13:36] <wikibugs> Analytics, DNS, Operations, Traffic: Create analytics.wikimedia.org - https://phabricator.wikimedia.org/T132407#2224292 (Dzahn) "build code" and "static site" are confusing me a bit. the kind of static site we host on bromine means HTML and CSS and some images. [17:15:28] <wikibugs> Analytics, DNS, Operations, Traffic: Create analytics.wikimedia.org - https://phabricator.wikimedia.org/T132407#2197243 (ori) >>! In T132407#2224265, @BBlack wrote: > When you say "build our code" do you mean building client-side javascript code that's ultimately static content from the server's... 
[17:15:48] <wikibugs> Analytics, Pageviews-API, RESTBase: RESTBase for wikimedia.org should be on www.wikimedia.org - https://phabricator.wikimedia.org/T133178#2224310 (Krinkle) [17:18:47] <joal> elukey, ottomata: Thanks for the machine shift today :) [17:20:23] <joal> a-team, logging off, will double check jobs after diner :) [17:20:31] <grrrit-wm> (CR) Anomie: "Not so much "doesn't want to know when" as "has absolutely no idea what to do if", to the point of even being able to tell if anything is " [analytics/refinery] - https://gerrit.wikimedia.org/r/284472 (owner: BryanDavis) [17:20:33] <mforns> bye joal! see ya [17:21:58] <mforns> a-team I'm also leaving for a couple hours, need to run an errand and then gym, will be back after dinner. [17:30:25] <elukey> byeee [17:35:43] <wikibugs> Analytics, Pageviews-API: Allow for arbitrary ranges in /top endpoint of pageviews API - https://phabricator.wikimedia.org/T133176#2224357 (MusikAnimal) [17:39:02] <wikibugs> Analytics, Pageviews-API: Allow for arbitrary date ranges in /top endpoint of pageviews API - https://phabricator.wikimedia.org/T133176#2224368 (MusikAnimal) [17:39:42] <elukey> going offline team! byyyeee o/ [17:40:37] <nuria_> ebernhardson: a question about elasticserach [17:40:41] <wikibugs> Analytics, DNS, Operations, Traffic: Create analytics.wikimedia.org - https://phabricator.wikimedia.org/T132407#2224373 (Ottomata) Hm, I had assumed we would just host analytics.wikimedia.org on stat1001. I think we'd like analytics.wikimedia.org to eventually supercede stats.wikimedia.org, an... [17:40:52] <ebernhardson> nuria_: shoot [17:41:18] <nuria_> ebernhardson: somebody was asking me the indexing throughput we handle [17:41:32] <nuria_> ebernhardson: do we have those numbers anywhere? graphana? 
[17:42:37] <grrrit-wm> (CR) Ottomata: [C: 2 V: 2] Remove bjorsch from ApiAction email list [analytics/refinery] - https://gerrit.wikimedia.org/r/284472 (owner: BryanDavis) [17:43:18] <ebernhardson> nuria_: it's a little hard to make out from here, because the daily rebuild of the completion suggester dwarfs the normal load: https://grafana.wikimedia.org/dashboard/db/elasticsearch?panelId=12&fullscreen [17:43:31] <ebernhardson> nuria_: but 500 index updates per second is normal [17:43:50] <nuria_> ebernhardson: ajam , thank you [17:44:22] <ebernhardson> maybe i should just throw that on a log scale... [17:45:15] <ebernhardson> done, shows a little more info now :) [17:46:20] <ebernhardson> the numbers for 18th through 20th though a bit deceiving ... those are the popularity score updates we push from hadoop into ES, but many of them are noop's. Sadly es still counts them in the indexing stats even though no write occured [17:46:45] <ebernhardson> also there was a reindex of 10 days due to a bug going on in the same time period [17:52:45] <wikibugs> Analytics, Pageviews-API, RESTBase: RESTBase for wikimedia.org should be on www.wikimedia.org - https://phabricator.wikimedia.org/T133178#2224310 (mobrovac) In the case of RESTBase, these are actually two distinct domains for us. We are using `wikimedia.org` as a sort-of //global domain// which expos... [17:54:09] <nuria_> ebernhardson: thank you [17:56:48] <wikibugs> Analytics, DNS, Operations, Traffic: Create analytics.wikimedia.org - https://phabricator.wikimedia.org/T132407#2224433 (Nuria) >In general, analytics.wikimedia.org will host static files (html, js, tsvs, etc.), but not just for one service (dashiki / reportcard). Correct, some dashiki plots will... 
[18:04:56] <wikibugs> Analytics-Cluster, Analytics-Kanban: Add icinga process alerts for hive, oozie and mysql analytics-meta instance - https://phabricator.wikimedia.org/T133182#2224456 (Ottomata) [18:12:07] <wikibugs> Analytics-Kanban: Unique Devices javaascript node module - https://phabricator.wikimedia.org/T133184#2224506 (Nuria) [18:18:24] <wikibugs> Analytics-Kanban: Unique Devices javascript node module - https://phabricator.wikimedia.org/T133184#2224541 (madhuvishy) [18:39:21] <wikibugs> Analytics, DNS, Operations, Traffic: Create analytics.wikimedia.org - https://phabricator.wikimedia.org/T132407#2197243 (madhuvishy) I want to explain the current setup in labs a little bit and point out that it won't work as is in prod - and needs some re-working. The idea is we have a single in... [18:47:13] <wikibugs> Analytics, Commons, Multimedia, Wikidata, and 3 others: Allow tabular datasets on Commons (or some similar central repository) (CSV, TSV, JSON, XML) - https://phabricator.wikimedia.org/T120452#1854102 (brion) Couple quick notes: * pretty cool. :) * I worry about efficiency of storage and queries... [18:50:55] <wikibugs> Analytics-Cluster, Analytics-Kanban: Fix hive-metastore vs libmysql-jar race condition when provisioning new hive metastore server - https://phabricator.wikimedia.org/T133198#2224787 (Ottomata) [19:03:48] <wikibugs> Analytics, DNS, Operations, Traffic: Create analytics.wikimedia.org - https://phabricator.wikimedia.org/T132407#2224904 (Ottomata) > these dashboards would have to be served at analytics.wikimedia.org/browser-reports, analytics.wikimedia.org/edit-reports etc. And the apache setup will have to cha... [19:04:19] <wikibugs> Analytics, DNS, Operations, Traffic: Create analytics.wikimedia.org - https://phabricator.wikimedia.org/T132407#2224906 (Ottomata) > and use scap3 Actually, I don't think we can use scap3 if they aren't in git, since scap3 deploys via git. 
[19:06:51] <nuria_> ottomata: in hive i am getting a new warning: [19:06:54] <nuria_> https://www.irccloud.com/pastebin/VT6T3dNq/ [19:07:13] <ottomata> nuria_: hm das weiird [19:07:13] <ottomata> hm [19:07:16] <ottomata> that's via hive? [19:07:29] <nuria_> running hive -f select.hql [19:08:37] <ottomata> huh, on stat1002? i just ran that with your select_pageview.hql file no prob [19:09:02] <nuria_> ottomata: yes, [19:09:16] <nuria_> ottomata: try : [19:09:45] <nuria_> ottomata: nuria@stat1002:~/tmp$ hive -f select_decimal.hql [19:10:44] <ottomata> trying it (i'm sudoed as you) [19:10:54] <nuria_> k [19:12:46] <ottomata> huh! [19:13:38] <ottomata> i see it, but the query finishes, ja? [19:14:06] <ottomata> weird that that happens later in the query [19:14:31] <nuria_> ottomata: ya, it works fine [19:14:49] <nuria_> ottomata: but funny cause that query is super simple other than a decimal cast [19:18:27] <ottomata> nuria_: weird [19:18:29] <ottomata> i dunno [19:18:37] <ottomata> i doubt this is related to the server move we did today [19:18:58] <ottomata> possible it was related to cdh 5 upgrade we did a few months ago [19:19:01] <ottomata> and just have never run into it before [19:23:00] <nuria_> ottomata: k [19:23:27] <nuria_> ottomata: no need to follow through now [19:31:39] <ottomata> o [19:31:40] <ottomata> k [19:32:06] <wikibugs> Analytics, MediaWiki-extensions-WikimediaEvents, The-Wikipedia-Library, Wikimedia-General-or-Unknown, Patch-For-Review: Implement Schema:ExternalLinkChange - https://phabricator.wikimedia.org/T115119#2224963 (Samwalton9) What's the status of this? Do we need to do anything? :) [19:37:52] <HaeB> what's the best way to mark a schema page on meta as invalid/not implemented/merged to another schema? 
[19:38:21] <HaeB> normally i would just redirect the page to the other schema, but that does not seem to be possible in the Schema: namespace (which enforces JSON syntax) [19:38:46] <HaeB> (case in point: https://meta.wikimedia.org/wiki/Schema:PageLinkInteraction was merged into https://meta.wikimedia.org/wiki/Schema:Popups ) [19:40:22] <wikibugs> Analytics-Kanban: Client values inbound in X-analytics header are reflected in outbound X-Analytics on varnish - https://phabricator.wikimedia.org/T133204#2225002 (Nuria) [19:40:39] <wikibugs> Analytics-Kanban: Client values inbound in X-analytics header are reflected in outbound X-Analytics on varnish - https://phabricator.wikimedia.org/T133204#2225015 (Nuria) p:Triage>High [19:41:41] <nuria_> HaeB: I would note it on talk page. [19:41:59] <HaeB> ok, did that already ;) [19:45:08] <wikibugs> Analytics-Cluster, Analytics-Kanban, Patch-For-Review: Add icinga process alerts for hive, oozie and mysql analytics-meta instance - https://phabricator.wikimedia.org/T133182#2225040 (Ottomata) https://icinga.wikimedia.org/cgi-bin/icinga/status.cgi?search_string=analytics1003 [19:50:25] <wikibugs> Analytics, Analytics-Cluster: Deploy hive-site.xml separately from refinery - https://phabricator.wikimedia.org/T133208#2225074 (Ottomata) [19:52:36] <wikibugs> Analytics, Wikipedia-Android-App-Backlog: Investigate recent decline in views and daily users - https://phabricator.wikimedia.org/T132965#2225093 (Tbayer) Update, just so that people reading along here know: There has been extensive internal discussion about this (the accompanying thread with people from... [20:01:52] <wikibugs> Analytics, Analytics-Cluster: Deploy hive-site.xml separately from refinery - https://phabricator.wikimedia.org/T133208#2225157 (Ottomata) @EBernhardson you should track this task so once done you can adapt your hive oozie jobs that are outside of refinery. 
[20:02:11] <wikibugs> Analytics, Analytics-Cluster: Deploy hive-site.xml to HDFS separately from refinery - https://phabricator.wikimedia.org/T133208#2225159 (Ottomata) [20:20:43] <ottomata> nuria_: i am ready ping me if i am not in room [20:20:46] <ottomata> in 10 mins [20:21:23] <nuria_> k [22:38:05] <wikibugs> Analytics-Cluster, Analytics-Kanban, Operations, netops, Patch-For-Review: setup/deploy server analytics1003/WMF4541 - https://phabricator.wikimedia.org/T130840#2225679 (RobH)