[00:10:47] Analytics, Analytics-Kanban: Back-fill pageviews data for dumps.wikimedia.org to May 2015 - https://phabricator.wikimedia.org/T126464#2078277 (Milimetric) Looks good, @elukey. I saw the output files and they're what I'd expect. I think you can go ahead and start the backfill. [00:36:58] Analytics-Kanban: Restore MobileWebSectionUsage_14321266 and MobileWebSectionUsage_15038458 - https://phabricator.wikimedia.org/T123595#2078345 (Tbayer) PS (just to record this here, mostly for myself): I did a quick check of how the number total events per day developed for both tables (as a sanity check be... [07:58:59] good morning, anyone around that can update me on mwstats.py ? [07:59:51] o/ but I haven't heard of mwstats.py ever, sorry :( [08:00:18] my bad ... it's mwviews ;-) [08:00:25] https://github.com/mediawiki-utilities/python-mwviews [09:04:00] edoderoo: you can post your query in here, milimetric will answer back in a few hours for sure! [09:04:57] a-team: hellooooo! Today I am going to work on other Debian re-image tasks (Redis Job Queue for MW) and also on varnish-kafka with Emanuele.. I'll be available for any ping of course! [09:25:31] Hi elukey :) [09:32:10] Analytics, Analytics-Kanban: Back-fill pageviews data for dumps.wikimedia.org to May 2015 - https://phabricator.wikimedia.org/T126464#2015119 (JAllemandou) @elukey @Milimetric : Sounds good, but let's wait for encoding-issue-backfilling to be finished :) [09:35:33] ---^ joal I was about to say the same thing :) [09:35:39] ;) [09:40:53] milimetric: mwviews is giving an error on march 1st data, my python call returns the error "no data received" [09:48:11] edoderoo: I can answer this :) Are you subscribed to the analytics mailing list? We had a problem with data encoding in hadoop and we are recomputing pageview data from 2016-02-23 onwards [09:48:30] so this might explain your problem [09:49:08] https://lists.wikimedia.org/pipermail/analytics/2016-March/004985.html [10:00:18] ah, great... [10:00:33] maybe I should look for the mailinglist and subscribe ... [10:02:28] subscribed... [10:02:48] and i see the page-title-issue for special ëö-etc characters is also worked on ... nice! [10:18:09] :) [10:39:45] Analytics-Tech-community-metrics, Education-Program-Dashboard, MediaWiki-extension-requests, Possible-Tech-Projects: A new events/meet-ups extension - https://phabricator.wikimedia.org/T99809#2079357 (Qgil) >>! In T99809#1686297, @egalvezwmf wrote: > I think education might be a specific case. Th... [10:55:43] Analytics-Tech-community-metrics: Data Analytics toolset for MediaWikiAnalysis - https://phabricator.wikimedia.org/T116509#2079401 (Qgil) >>! In T116509#2075864, @Aklapper wrote: > Removing #Possible-Tech-Projects due to T89135#2073997 Based on that comment I would close as Declined that tasks and this one... [11:47:46] does anyone know why puppet is disabled on stat1002, there's nothing in SAL and no reason has been given with "puppet agent --disable" [11:48:15] moritzm: I do ! [11:48:40] moritzm: Yesterday we had an issue with camus, and ottomata disabled puppet to stop a cron job from being reinstated [11:49:21] The problem got solved, ottomata uncommented the cron job, but probably forgot to restart the puppet agent [11:49:35] ok, but according to icinga it's been disabled for six days, not yesterday? [11:50:00] Wow ...
Ok, not aware of that one then :S Sorry for the noise [11:50:06] moritzm: --^ [11:50:16] thanks, I'll check with Otto before re-enabling just to be sure [11:51:15] sounds good, thanks moritzm [12:37:31] Analytics-Tech-community-metrics: Misc. improvements to MediaWikiAnalysis (which is part of the MetricsGrimoire toolset) - https://phabricator.wikimedia.org/T89135#2079620 (Aklapper) [12:37:33] Analytics-Tech-community-metrics: Data Analytics toolset for MediaWikiAnalysis - https://phabricator.wikimedia.org/T116509#2079619 (Aklapper) Open>declined [12:37:40] Analytics-Tech-community-metrics: Misc. improvements to MediaWikiAnalysis (which is part of the MetricsGrimoire toolset) - https://phabricator.wikimedia.org/T89135#1027978 (Aklapper) Open>declined [13:21:34] Analytics, Operations, Traffic: http://dumps.wikimedia.org should redirect to https:// - https://phabricator.wikimedia.org/T128587#2079753 (elukey) [13:57:15] Analytics-Tech-community-metrics, Possible-Tech-Projects, Epic: Allow contributors to update their own details in tech metrics directly - https://phabricator.wikimedia.org/T60585#2079810 (Sumit) @jgbarah , @Dicortazar , @Acs, are you ready to push this project in this round of GSoC '16/Outreachy-12 ? [14:09:22] Analytics-Tech-community-metrics: Provide feedback about prototype "Git and Gerrit statistics" dashboard to Bitergia - https://phabricator.wikimedia.org/T124930#2079834 (Aklapper) [14:09:24] Analytics-Tech-community-metrics, Developer-Relations, DevRel-March-2016: Play with Bitergia's Kabana UI (which might potential replace our current UI on korma.wmflabs.org) - https://phabricator.wikimedia.org/T127078#2079835 (Aklapper) [14:37:58] Analytics-Tech-community-metrics, Education-Program-Dashboard, MediaWiki-extension-requests, Possible-Tech-Projects: A new events/meet-ups extension - https://phabricator.wikimedia.org/T99809#2079875 (Elitre) Yeah, the "[[ https://www.mediawiki.org/wiki/Wikipedia_Education_Program/Dashboard/FAQ |... [14:48:36] Analytics, Operations, Traffic: http://dumps.wikimedia.org should redirect to https:// - https://phabricator.wikimedia.org/T128587#2079921 (Dzahn) Has once been declared "won't fix" on https://wikitech.wikimedia.org/wiki/Httpsless_domains in the past. Adding @ArielGlenn. Remember that discussion? [15:01:52] ottomata: can puppet be re-enabled on stat1002? [15:02:06] hmmmm i think so, how long has it been disabled? [15:02:13] did I do it? [15:02:29] oh! [15:02:30] i think i did [15:02:46] yes moritzm i think i never re-enabled because i re-enabled with salt "analytics*" during the cluster upgrade [15:02:51] according to icinga six days ago, there's no SAL entry or reason given to "puppet agent --disable", so hard to tell who it was :-) [15:02:54] just enabled [15:03:01] ok, thanks [15:05:38] one trap I've had is if you 'puppet agent --disable' and immediately think crap and then "puppet agent --disable 'message'", it never takes it fyi [15:05:45] you have to re-enable and then re-disable again
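A minimal sketch of the trap described at [15:05:38]; the disable message here is illustrative:

    # disabling with no message first means a later --disable 'message' is
    # silently ignored: the agent is already disabled, so the reason never sticks
    sudo puppet agent --disable
    sudo puppet agent --disable 'debugging camus cron'   # ignored
    sudo puppet agent --enable
    sudo puppet agent --disable 'debugging camus cron'   # now the reason is recorded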
[15:14:25] a-team, Lino is still sick, I need to grab him from the creche and take him to the doctor :S I'll probably miss standup [15:14:37] ok joal (FEEL BETTER LINOOOOO) [15:14:48] Today's job: Moar monitoring, moar uniques [15:14:56] Thanks ottomata ;) [15:14:59] Later [15:18:08] (later) [15:18:19] jo al, elukey did we somehow lose the analytics ops sync up meeting? [15:18:21] did we only do it once? [15:18:25] i just remembered about it [15:18:34] did I schedule it poorly and then forget??!? [15:27:10] ottomata: helloooooo [15:27:14] hiii [15:27:32] sorry I was banging my head against the wall for varnish-kafka with emanuele [15:27:47] I think we did it only once now that I think about it... [15:27:50] :O [15:28:02] i must have scheduled poorly [15:28:03] i will fix! [15:28:06] and schedule weekly [15:30:16] \o/ [15:36:58] hmm, i wonder if joal is going to that altiscale meeting [15:56:10] Analytics-Tech-community-metrics, Possible-Tech-Projects, Epic: Allow contributors to update their own details in tech metrics directly - https://phabricator.wikimedia.org/T60585#2080067 (01tonythomas) (IMPORTANT) **This task do not have any confirmed mentors for GSoC'16/Outreachy'12 yet ** : The admi... [15:56:42] Analytics-Tech-community-metrics, Possible-Tech-Projects, Epic: Allow contributors to update their own details in tech metrics directly - https://phabricator.wikimedia.org/T60585#2080068 (01tonythomas) [16:01:59] milimetric: regarding the new dumps mentioned here: https://phabricator.wikimedia.org/T120497 [16:02:49] milimetric: which are the ones we have on the new definition that we have not announced? [16:13:46] Analytics-Tech-community-metrics, DevRel-March-2016: Eliminate duplicated «"source": "wikimedia:its"» identities in korma identities DB - https://phabricator.wikimedia.org/T124475#2080156 (Aklapper) a:Aklapper > There are 523 `identities` with `"source": "wikimedia:its"` in the korma identities DB whic... [16:16:36] back ! [16:16:46] ottomata: what altiscale ? [16:16:50] Not invited [16:19:26] ah ok, it says it's on your calendar [16:19:31] weekly altiscale sync [16:20:00] on tuesdays yes (thinking you were talking of today) [16:20:06] I usually don't go, no [16:21:28] ok cool [16:21:43] just (re) scheduled the weekly analytics ops sync up for then [16:21:52] ottomata: I have discovered an unsuspected oozie behavior today [16:22:01] Or at least, unsuspected by me [16:22:07] oh? [16:23:17] I'll try to explain: when you launch a coordinator and files are ready in advance (the classical backfilling case), oozie overwrites the output-events folder when the job gets scheduled (not launched; well before that) [16:26:30] or at least that's what I suspect [16:27:13] maybe I'm wrong though [16:31:03] a-team: standdduppp [16:31:29] HMMm, joal that is weird [16:31:30] really? [16:31:35] that doesn't seem right.... [16:31:36] hm [16:31:37] well [16:31:39] hm maybe [16:31:44] ottomata: post-standup [16:31:48] aye [16:49:57] Analytics, HyperSwitch, Pageviews-API, Services: Better error messages on pageview API - https://phabricator.wikimedia.org/T126929#2080314 (Pchelolo) Open>Resolved The change is deployed, now enum values are reported in the error message. [16:51:31] Analytics-Kanban, Patch-For-Review: Pageview API not dealing with url quoting very well {melc} - https://phabricator.wikimedia.org/T126669#2080324 (Pchelolo) [16:51:33] Analytics-Kanban, Patch-For-Review: Caching on pageview API should be for 1 day - https://phabricator.wikimedia.org/T127214#2080322 (Pchelolo) Open>Resolved The change is deployed, now AQS is completely responsible for its response headers. #restbase doesn't change them any more.
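The T127214 change just announced above can be checked from the command line; a quick sketch, with an arbitrary article and date range:

    # AQS now sets its own Cache-Control and RESTBase passes it through
    curl -sI 'https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/en.wikipedia/all-access/all-agents/Selfie/daily/2016020100/2016022800' | grep -i '^cache-control'
    # expected, per the discussion later in this log:
    # cache-control: s-maxage=86400, max-age=86400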
[16:53:29] Analytics, Analytics-Cluster: Ensure file.encoding is UTF-8 for all JVMs in the Analytics Cluster - https://phabricator.wikimedia.org/T128607#2080328 (Ottomata) [17:01:32] Analytics, MediaWiki-API, Pageviews-API, RESTBase: RFC: Update profile URLs in content types to point to format documentation - https://phabricator.wikimedia.org/T128609#2080382 (GWicke) [17:02:22] Analytics, MediaWiki-API, Pageviews-API, RESTBase: RFC: Update profile URLs in content types to point to format documentation - https://phabricator.wikimedia.org/T128609#2080396 (GWicke) [17:03:55] joal: in camus logs [17:04:00] lots of task error: Error: java.io.IOException: Failed to move from hdfs://analytics-hadoop/wmf/camus/webrequest/2016-03-01-21-20-08/_temporary/1/_temporary/attempt_1456242175556_23629_m_000012_0/data.webrequest_upload.22.3.1456848000000-m-00012 to hdfs://analytics-hadoop/wmf/data/raw/webrequest/webrequest_upload/hourly/2016/03/01/16/webrequest_upload.22.3.3336193.77204847218.1456848000000 [17:04:16] what time ottomata ? [17:04:35] ottomata: that's the classical stuff I have [17:04:40] https://github.com/wikimedia/analytics-camus/blob/wmf/camus-etl-kafka/src/main/java/com/linkedin/camus/etl/kafka/mapred/EtlMultiOutputCommitter.java#L151 [17:04:52] Analytics, Analytics-Cluster: Ensure file.encoding is UTF-8 for all JVMs in the Analytics Cluster - https://phabricator.wikimedia.org/T128607#2080328 (Milimetric) According to: http://javarevisited.blogspot.com/2012/01/get-set-default-character-encoding.html it seems it's possible to set an environment v... [17:04:53] joal: from the offset file timestamp [17:04:55] in that message [17:04:55] 1456848000000 [17:04:59] 2016-03-01T16:00:00Z [17:05:07] 2016/03/01/16 [17:05:26] nuria: the plan is here - https://etherpad.wikimedia.org/p/el-clientips-drop (It might evolve) [17:05:35] https://hadoop.apache.org/docs/r2.6.1/api/org/apache/hadoop/fs/FileSystem.html#rename(org.apache.hadoop.fs.Path, org.apache.hadoop.fs.Path) [17:07:30] sounds like we should move to FileContext rename with the OVERWRITE option [17:07:40] making task [17:09:58] Analytics, MediaWiki-API, Pageviews-API, RESTBase: RFC: Update profile URLs in content types to point to format documentation - https://phabricator.wikimedia.org/T128609#2080420 (mobrovac) We've already discussed this in another task, but I can't seem to find it now :/ [17:11:46] Analytics, Analytics-Cluster: Modify Camus file commits to use OVERWRITE option when renaming into final destination - https://phabricator.wikimedia.org/T128611#2080422 (Ottomata) [17:12:17] Analytics-Kanban: Count requests to RESTBase from the Android app - https://phabricator.wikimedia.org/T128612#2080435 (Milimetric) [17:12:44] ottomata: I have the explanation (or at least I think) [17:12:48] ottomata: batcave ? [17:13:44] Analytics, MediaWiki-API, Pageviews-API, RESTBase: RFC: Update profile URLs in content types to point to format documentation - https://phabricator.wikimedia.org/T128609#2080448 (GWicke) Yeah, I unsuccessfully looked for it as well. Perhaps we only discussed this on IRC? [17:13:45] ottomata: That's what we thought: the camus job is successful, and an error occurs after the actual hadoop job, when writing the history file (for our example: hdfs.BlockReaderFactory: I/O error constructing remote block reader.
java.io.IOException: Connection reset by peer [17:14:24] That means data is imported fine etc, but the history file doesn't get the updated offsets --> FAIL to move files [17:14:43] Do you think modifying camus code is the way to go here ? [17:14:47] I suspect it could [17:20:54] joal: i think that allowing overwrite is probably good in any case, but sounds like it wouldn't solve the problem [17:21:04] if the history files are the ones that aren't getting written [17:21:09] hm, I think it would in that specific case [17:21:16] i mean [17:21:17] yes [17:21:19] it would solve the problem [17:21:21] The no-history case means: camus repeats the previous run [17:21:23] but, not the source of it [17:21:26] right [17:21:29] correct [17:21:33] so, the previous run would be repeated, and able to succeed [17:21:34] Analytics-Kanban: Count requests to RESTBase from the Android app - https://phabricator.wikimedia.org/T128612#2080435 (Nuria) Please use x-analytics header to tag whether a request is a pageview. See for example how is this handled for preview requests on the app: Value should be pageview=1 https://github... [17:21:36] hmmmm [17:21:41] But I don't think there's anything we can do about that, is there? [17:21:44] which, i guess is what should happen anyway [17:21:44] right [17:21:49] unless we can fix the history file writing thing [17:21:55] but ja, i guess there's always the possibility of that failing [17:22:25] So re-running is not that bad with overwriting [17:22:42] Ok, let's do that ASAP, I'm fed up with fixing that :) [17:23:15] ottomata: also, I see a lot fewer errors from HDFS than yesterday [17:23:19] Have you changed anything? [17:23:32] joal: no [17:23:38] yesterday you saw a bunch? or monday? [17:23:55] yesterday night: camus failed because of one of those ... [17:24:00] OOF [17:25:20] at least 14 errors on camus yesterday, 2 today [17:25:58] weird [17:37:16] Analytics-Kanban, Wikipedia-Android-App-Backlog: Count requests to RESTBase from the Android app - https://phabricator.wikimedia.org/T128612#2080522 (bearND) [17:39:55] Analytics-Kanban, Wikipedia-Android-App-Backlog: Count requests to RESTBase from the Android app - https://phabricator.wikimedia.org/T128612#2080435 (bearND) I think we should do this for both MW API and RESTBase page content requests; either on the lead section or remaining sections requests. [17:40:31] Analytics, Analytics-Kanban: Back-fill pageviews data for dumps.wikimedia.org to May 2015 - https://phabricator.wikimedia.org/T126464#2080534 (Milimetric) agreed [17:40:42] Analytics-Kanban, Patch-For-Review: Caching on pageview API should be for 1 day - https://phabricator.wikimedia.org/T127214#2080536 (Milimetric) Thank you! [17:42:24] Analytics-Kanban, Wikipedia-Android-App-Backlog: Count requests to RESTBase from the Android app - https://phabricator.wikimedia.org/T128612#2080547 (Milimetric) this would certainly simplify the pageview definition if all clients adopted it, so let's do it. [17:44:58] ottomata: wanna talk about the EL puppet change? [17:45:31] :q [17:47:10] ottomata: Found it in the docs! Coordinator rerun: If -nocleanup is given, coordinator directories will not be removed; otherwise the 'output-event' will be deleted. [17:47:14] makes sense
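Given the coordinator-rerun behavior quoted from the Oozie docs at [17:47:10], a sketch of the rerun invocation; the action range is a placeholder, and the job id is the uniques coordinator mentioned later in this log:

    # rerun coordinator actions 5-8; -nocleanup keeps the existing
    # output-event directories instead of deleting them before the rerun
    oozie job -rerun 0000112-160223202501439-oozie-oozi-C -action 5-8 -nocleanup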
[17:47:56] ottomata: gonna get some food, wanna talk about the puppet stuff after scrum of scrums? [17:55:08] Analytics-Cluster, Analytics-Kanban: Modify Camus file commits to use OVERWRITE option when renaming into final destination - https://phabricator.wikimedia.org/T128611#2080613 (JAllemandou) p:Triage>High [17:55:18] I'm off caring for Lino, will be back later [17:55:47] madhuvishy: ja am following up on eventbus DC kafka stuff [17:55:51] AHHH timing [17:55:53] i need food hmmmm [17:55:57] i have an appt. at 2:30 [17:56:01] in 1.5 hrs [17:56:06] yes madhuvishy i think there is time for us before then [17:56:13] then i can work with milimetric after my appt. [17:56:17] ottomata: okay let me know whenever :) [18:03:34] Analytics-Kanban, Wikipedia-Android-App-Backlog: Count requests to RESTBase from the Android app - https://phabricator.wikimedia.org/T128612#2080435 (GWicke) All RB requests pass through the text varnishes, so all accesses (including X-Analytics) are already available as part of the regular log stream. If... [18:03:53] getting some lunch [18:08:44] a-team: going offline! talk with you tomorrow :) [18:22:38] madhuuuu [18:23:45] ottomata: yess [18:23:56] ok whaat the hecky ja? why that change? [18:23:57] i just sent announcement to lists [18:24:02] i dont know [18:24:22] puppet wants to variable substitute %{...}? [18:25:08] ja but it shouldn't in this case [18:25:13] i know [18:25:18] its just a string [18:25:21] and there's no $ [18:25:21] madhuvishy: maybe its just an artifact of the compiler being funky [18:25:25] i'm inclined to just try it [18:25:27] aah [18:25:34] hmmm [18:25:37] OO [18:25:42] let's cherry pick it in beta and try... [18:26:19] oh you can do that? try puppet changes in beta alone? i thought it was not self hosted [18:26:32] ja beta has its own puppetmaster [18:26:34] deployment-puppetmaster [18:26:35] ohh [18:26:37] all hosts in beta use it [18:26:47] then there's no need to even make this intermediate change [18:26:52] it is synced from upstream via cron [18:26:59] oh? [18:27:01] ottomata: unless we want it to be configured via hiera [18:27:01] oh i see [18:27:02] hm [18:27:06] hmm [18:27:08] i just wanted to test it safely [18:27:15] the format specifier change [18:27:16] ja true, i mean, we need to make a commit, we aren't allowed to edit puppetmaster [18:27:27] but we could cherry pick the commit and then un cherry pick [18:27:31] puppet is the only thing you can test w/o merging on beta I think [18:27:46] mostly, yes [18:27:46] ottomata: it's documented somewhere [18:27:58] hmmm ottomata can you show me? batcave? [18:27:59] madhuvishy: it doesn't hurt to do the hiera thing [18:28:00] let's just do it [18:28:01] sure [18:28:41] https://wikitech.wikimedia.org/wiki/Nova_Resource:Deployment-prep/How_code_is_updated#Cherry-picking_a_patch_from_gerrit [18:29:17] bd808: thanks ! [18:33:18] joal: yt? [18:33:31] joal: the oozie job for monthly uniques for last access hasn't run for february... daily data is working fine [18:44:18] Analytics-EventLogging, Analytics-Kanban, Patch-For-Review: Add IP field only to schemas that need it. Remove it from EL capsule and do not collect it by default {mole} - https://phabricator.wikimedia.org/T126366#2080766 (Tbayer) >>! In T126366#2073624, @Nuria wrote: > @asherman, @Tbayer : IPs as an o... [18:45:52] madhuvishy, ottomata : do you have a sec about changes to capsule on EL?
[18:46:06] " do you have a sec to talk about about changes to capsule on EL?"\ [18:46:10] nuria: join us on batcave [18:47:07] btw, yall - cache-control:s-maxage=86400, max-age=86400 is coming from AQS, so we're in control of our own headers now :) [18:47:08] Analytics-EventLogging, Analytics-Kanban, Patch-For-Review: Add IP field only to schemas that need it. Remove it from EL capsule and do not collect it by default {mole} - https://phabricator.wikimedia.org/T126366#2080774 (Nuria) I am closing this ticket but if you wish to use piwik for blog please ope... [18:47:24] milimetric: cache headers make me happy [18:55:19] madhuvishy: how do I set a schema to auto-purge everything after 90 days, what steps need to be taken? [18:55:31] I need to do this on request from the https://meta.wikimedia.org/wiki/Schema_talk:Echo owners [18:55:52] milimetric: yeah, we need to ask the DBAs [18:56:07] ok, so update that talk page and make a task for Jaime? [18:56:32] yup! [18:56:47] milimetric: purging was stopped until disk size issues were resolved [18:56:56] milimetric: i do not think is enabled yet [18:57:01] nuria: this is a separate request, I'll write and cc the internal list [18:57:19] milimetric: yeah i think on jaime's side he can just club the two changes [18:57:27] k [18:57:35] milimetric: ok, roan's e-mails about echo is to delete data entirely , correct? [18:58:17] nuria: yes, they want all data gone [18:58:34] milimetric: we can do that ourselves no? [18:58:34] so I was thinking the easiest way is to just set purging [18:58:39] yeah [18:58:43] we can't drop the table [18:58:53] first, events will still come in for a while [18:58:57] right [18:58:58] until all the clients are updated [18:59:01] ya [18:59:05] purging is best [18:59:10] its currently set to Auto-purge recipientUserId, recipientEditCount, eventId + eventCapsule PII after 90 days [18:59:22] although I don't think even these rules are enabled yet [19:00:07] mforns would know status better [19:00:30] milimetric, madhuvishy : no purging is enabled [19:00:45] Analytics, DBA: Purge all Schema:Echo data after 90 days - https://phabricator.wikimedia.org/T128623#2080811 (Milimetric) [19:00:49] milimetric, madhuvishy : jaime didn't want to do it until all disk issues were fixed [19:01:04] nuria: okay i guess now we can? [19:01:15] may be we should bump that ticket [19:01:17] milimetric: last of which was dropping the majority of data in the huge edit table, that just happened [19:01:52] nuria: I know, he'll get to it whenever, but this is not urgent, just cleanup [19:02:08] https://phabricator.wikimedia.org/T108850 [19:02:30] Analytics, DBA: Set up auto-purging after 90 days {tick} - https://phabricator.wikimedia.org/T108850#2080841 (Nuria) [19:02:55] milimetric, madhuvishy : I think we probably want to do the autoincrement +replication before purging [19:03:11] milimetric: so we have several tickets opened [19:03:49] yeah, that's ok, I'll prioritize this low just to be safe [19:04:00] Analytics, DBA: Purge all Schema:Echo data after 90 days - https://phabricator.wikimedia.org/T128623#2080843 (Milimetric) p:Triage>Low [19:04:56] ottomata (or madhuvishy ) how can i see the reason why the last access monthly job did not run.. hue... ahem... never works for me but i have heard other people have ACTUALLY SEEN that ui working [19:05:06] a-team: the one big news from scrum of scrums that I picked up was that Multi-Content is coming to Content Handler. So that's very exciting for a potential Dashiki extension, Graph extension, etc. 
[19:04:56] ottomata (or madhuvishy ) how can i see the reason why the last access monthly job did not run.. hue... ahem... never works for me but i have heard other people have ACTUALLY SEEN that ui working [19:05:06] a-team: the one big news from scrum of scrums that I picked up was that Multi-Content is coming to Content Handler. So that's very exciting for a potential Dashiki extension, Graph extension, etc. [19:05:32] * milimetric goes to finish his sandwich [19:07:45] nuria: ? [19:07:52] ottomata: yes [19:08:01] ottomata: was wondering... [19:08:12] ottomata: there is no data for last access monthly for feb [19:08:15] https://hue.wikimedia.org/oozie/list_oozie_coordinator/0016578-160223202501439-oozie-oozi-C/ [19:08:24] oozie job -info 0016578-160223202501439-oozie-oozi-C [19:08:28] ottomata: the hue ui does not work [19:08:43] ottomata: never has for me [19:08:49] that doesn't work? [19:08:54] what doesn't it do? [19:09:13] it looks like it is running [19:09:15] started 03/02/16 17:03:14 [19:09:19] ottomata: never loads [19:09:27] ottomata: that job is january though [19:09:29] nuria: even now? [19:09:29] you can run oozie commands from stat1002 and use the cli [19:09:56] ottomata: that is what i did but the only job i found was that one, which is January's [19:10:01] ottomata: not beb [19:10:04] *not Feb [19:10:14] ottomata: i used: oozie jobs -filter status=RUNNING | grep last_access [19:10:20] nuria: you're right - maybe joseph is rerunning them? [19:10:31] madhuvishy: start date is [19:10:39] 2016-01-01 00:00 GMT [19:10:55] yes [19:11:00] that is correct [19:11:10] but that's not the same as job start time [19:11:11] Wed, 02 Mar 2016 17:02:28 [19:11:16] madhuvishy: ok, i see, but created today [19:11:21] is when the job claims to have been created [19:11:22] yes [19:11:40] madhuvishy: k, i guess he must be rerunning them.. but not sure why [19:11:43] it says NEXT MATERIALIZED TIME Mon, 01 Feb 2016 00:00:00 [19:11:56] so i think feb will run after [19:11:59] ya not sure why [19:12:10] joal: if you see this ^
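A sketch of the stat1002 CLI route just mentioned, following nuria's usage above; the action number is illustrative:

    # list running coordinators, then inspect one and a single action of it
    oozie jobs -jobtype coordinator -filter status=RUNNING | grep last_access
    oozie job -info 0016578-160223202501439-oozie-oozi-C
    oozie job -info 0016578-160223202501439-oozie-oozi-C@1   # one materialized action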
[19:15:49] ok i gotta go to my appt. see yall in a bit [19:19:15] Analytics, Collaboration-Team-Backlog, DBA, Notifications: Purge all Schema:Echo data after 90 days - https://phabricator.wikimedia.org/T128623#2080913 (Legoktm) [19:28:35] Analytics-Kanban, Wikipedia-Android-App-Backlog: Count requests to RESTBase from the Android app - https://phabricator.wikimedia.org/T128612#2080953 (Nuria) >All RB requests pass through the text varnishes, so all accesses (including X-Analytics) are already available as part of the regular log stream. th... [19:59:07] back! [20:00:01] WELCOME [20:07:43] OOok milimetric, hiya [20:07:54] oh actually, gimme 5ish mins... [20:09:36] ok [20:09:41] i guess 2 mins :) [20:11:13] Analytics-Kanban, Patch-For-Review: Puppetize reportupdater to be executed in stat1002 and run the browser reports {lama} - https://phabricator.wikimedia.org/T127327#2081201 (Ottomata) [20:11:44] milimetric: ja want to talk whenever you got a sec [20:16:32] Analytics, Analytics-EventLogging, Privacy: Allow opting out from logging some of the default EventLogging fields on a schema-by-schema basis - https://phabricator.wikimedia.org/T108757#2081219 (Tgr) IP has been dropped unconditionally in T126366/T128407. [20:43:39] hey ottomata, ok ready [20:43:49] had to respond to some folks, but all quiet now [20:44:35] Analytics-EventLogging, Analytics-Kanban, Patch-For-Review: Add IP field only to schemas that need it. Remove it from EL capsule and do not collect it by default {mole} - https://phabricator.wikimedia.org/T126366#2081362 (madhuvishy) Open>Invalid We'll remove IPs for all schemas - Follow here {{... [20:45:15] milimetric: cool [20:45:19] soo ja [20:45:37] https://phabricator.wikimedia.org/T127327 [20:46:23] let's batcave! [20:46:37] ok! [20:57:31] Analytics-Kanban, Wikipedia-Android-App-Backlog, Mobile-App-Android-Sprint-78-Platinum: Count requests to RESTBase from the Android app - https://phabricator.wikimedia.org/T128612#2081520 (MBinder_WMF) [21:11:15] (PS8) Nuria: Fetch Pageview Data from Pageview API [analytics/dashiki] - https://gerrit.wikimedia.org/r/270867 (https://phabricator.wikimedia.org/T124063) [21:12:50] (CR) Nuria: ""all" project fixed plus dashing scheme for breakdowns. I think there is one more thing as square color on left navbar doesn't match graph" [analytics/dashiki] - https://gerrit.wikimedia.org/r/270867 (https://phabricator.wikimedia.org/T124063) (owner: Nuria) [21:12:58] (CR) Nuria: Fetch Pageview Data from Pageview API (1 comment) [analytics/dashiki] - https://gerrit.wikimedia.org/r/270867 (https://phabricator.wikimedia.org/T124063) (owner: Nuria) [21:14:08] milimetric: you guys batcaving still? [21:14:50] nuria: yes, but andrew's hacking and I'm testing your patch [21:15:10] nuria: I'm going to rebase your thing to make sure that's clean [21:15:29] nuria: no, it can't rebase - I'll rebase manually [21:15:42] nuria, I'm back and saw the messages about uniques [21:15:45] milimetric: k [21:16:12] milimetric: see my last comment, there is one last thing i need to look into [21:16:14] joal: k [21:16:29] joal: question, did you restart the last access jobs? [21:16:29] yeah, i saw the legend on the left doesn't match [21:16:37] nuria: My guess is that the job got caught in the middle of backfilling, with data being deleted, moving etc, and failed [21:16:37] I can take a look at that nuria, since I'm rebasing anyway [21:16:42] joal: looks like the monthly one is rerunning January [21:16:54] joal: it's the one from Jan not Feb though [21:16:54] nuria: sorry - on the hangout now [21:17:16] milimetric: k, let me know when you rebase, will check in later today [21:17:18] nuria: The one I see is feb [21:17:34] joal: ah ok, my mistake then [21:17:42] hm, let me triple check [21:18:05] joal: will talk in a bit, on meeting now [21:18:51] nuria: there is data for January [21:19:03] and the job that failed is feb [21:19:17] I'll wait for the backfilling to have finished before restarting it [21:19:41] Now I get the one job for january you have seen: it's my test for data archiving [21:32:49] nuria: sorry, got dropped off earlier than i thought ;) [21:33:58] ottomata: are you still around? [21:37:46] yes [21:39:41] ottomata: https://integration.wikimedia.org/ci/job/tox-jessie/5307/console [21:39:53] this EL service test claims to be failing [21:39:57] is that normal? [21:42:03] nuria: going off, ok on uniques ? [21:42:54] joal: https://hue.wikimedia.org/oozie/list_oozie_coordinator/0016578-160223202501439-oozie-oozi-C/ is this the coordinator? [21:43:13] this one claims to be running Jan [21:43:25] madhuvishy: this coordinator is my test for archiving data [21:43:31] joal: aah [21:43:36] that's why the confusion [21:43:50] This is the one you're after: 0000112-160223202501439-oozie-oozi-C [21:44:03] :) [21:44:06] joal: [21:44:08] that one [21:44:12] Feb workflow [21:44:16] claims to be killed [21:44:27] it died because of backfilling issues I think [21:44:35] joal: aah [21:44:40] need to restart then?
[21:45:01] I'll restart after backfilling is done: we are currently missing 4 days of data [21:45:13] nuria: you're right, the pattern thing is trickier, I'll try to fix it [21:45:33] joal: ok cool [21:45:37] :) [21:45:39] so we are not running feb yet [21:45:45] makes sense [21:45:47] thanks :) [21:46:03] Np :) [21:46:12] hm, madhuvishy no [21:46:33] ottomata: all tests pass on my local - and I don't think I did anything to make that test fail [21:46:36] so dunno [21:46:47] m [21:46:53] ok yeah i'd push through it [21:46:56] will have to look later [21:47:10] ottomata: ok, asking jenkins to recheck [21:49:07] ottomata: on https://gerrit.wikimedia.org/r/#/c/273557/ you suggested using oozie/webrequest/load as an example for the bundle pattern. I'm not 100% sure I understand how that would work for the mediawiki import jobs. [21:49:30] Would there be one set of xml and different properties files for each type? [21:50:21] bd808: no, bundles wrap coordinators [21:50:33] so, in the same way that you parameterize workflows with coordinators [21:50:39] you parameterize coordinators with bundles [21:50:47] so, there'd be a new bundle.xml file [21:50:50] and a bundle.properties file [21:50:58] the bundle would declare instances of the coordinators to create [21:51:10] in this case, ApiAction and CirrusSearchRequestSet [21:51:34] https://github.com/wikimedia/analytics-refinery/blob/master/oozie/webrequest/load/bundle.xml#L27 [21:51:58] any properties that are specific to each of those datasets (there aren't many.. mainly just names) would be parameterized [21:52:09] hmm.. I guess oozie/cassandra was more what I was seeing in my head [21:53:02] that is similar... just a lot more parameters [21:53:13] I'm glad to do the coding for this but my grasp of the domain is tenuous. I'm pretty much at copy-n-paste level of comprehension [21:53:36] bd808: am happy to guide, do you know how to test oozie stuff? [21:54:14] ottomata: nope! Which page should I start reading for that? [21:57:56] ok gone for today :) [21:58:02] See you folks tomorrow ! [22:11:04] laters! [22:11:10] bd808: ah! [22:11:28] https://wikitech.wikimedia.org/wiki/Analytics/Cluster/Oozie [22:11:32] it might be a little out of date [22:11:35] and doesn't have bundle docs [22:11:51] so everything I need to know except not :) [22:11:54] must be a wiki [22:13:00] haha [22:13:04] i mean, the how-to-test is there [22:13:13] a job is a job to oozie [22:13:19] whether or not it is a bundle or a coordinator or a workflow [22:13:23] jobs composed of other jobs :) [22:14:02] bd808: you also might want to read some oozie docs on those things [22:14:03] if you haven't yet [22:14:30] https://oozie.apache.org/docs/3.3.0/BundleFunctionalSpec.html [22:15:06] so do we want to end up with one bundle that runs both the cirrus and api jobs or two separate bundles that use common coordinator.xml and workflow.xml files?
[22:16:42] one bundle [22:17:06] bd808: it'll look very much like webrequest/load [22:17:12] just look at the oozie files there [22:17:20] bundle.properties and the .xml files [22:17:36] only 4 files [22:17:40] oh and a datasets file [22:17:44] you've got one for api action [22:17:47] and there is one for cirrus [22:17:50] those will be merged into one file [22:18:17] bd808: i know you just copy pasted to make this patch [22:18:29] any of the places where you edited to do your copy paste [22:18:33] you'll turn into parameters [22:18:38] *nod* [22:19:07] I'm still a bit confused about how one properties file will fill things in though [22:19:55] the specific params will not be in the properties file [22:19:58] they'll be in bundle.xml [22:20:15] https://github.com/wikimedia/analytics-refinery/blob/master/oozie/webrequest/load/bundle.xml#L27 [22:20:30] oh! *light bulb* [22:20:43] you can see here that each of the coordinators launched as part of the webrequest load bundle has configurations [22:20:51] and, since your dataset and the cirrus dataset are so similar [22:20:55] likely it'll be just like this [22:21:00] just the name of the dataset to use [22:21:21] which you can parameterize [22:21:23] and fill in here [22:21:23] https://github.com/wikimedia/analytics-refinery/blob/master/oozie/mediawiki/cirrus-searchrequest-set/load/coordinator.xml#L72 [22:21:32] instead of dataset= "hardcoded name" [22:21:34] it'll be [22:21:37] something like [22:21:48] dataset=${dataset_name} [22:21:49] or something [22:22:25] ha, the name of the coordinator has already been parameterized :p https://github.com/wikimedia/analytics-refinery/blob/master/oozie/mediawiki/cirrus-searchrequest-set/load/coordinator.xml#L3 [22:24:08] I think I'm getting the idea. one workflow.xml, one coordinator.xml and then the bundle.xml runs the two sets of jobs [22:24:18] bundle.xml launches coordinators [22:24:24] which run the workflows [22:24:28] it's parameters all the way down :) [22:24:29] so [22:24:43] a coordinator launches a series of time based workflows [22:24:50] (not always time based, but usually) [22:24:58] with the time parameters filled in for the workflow [22:25:15] like paths (like we have the hourly paths in hdfs), table partition information, etc. [22:25:26] then the workflow just takes that and does whatever it has to do [22:25:27] like [22:25:33] alter table add partition TIME at PATH [22:25:48] you could take one of these parameterized workflow.xml files [22:26:09] fill in any parameters yourself, either via a .properties file, or by passing -Dkey=value on the CLI [22:26:15] and a single workflow would run [22:26:25] the coordinator is basically just doing that for you on a regularly defined schedule [22:26:45] sometimes (like now) you have datasets that have the same frequency and the same workflow [22:26:58] in that case, you can abstract things like names of the datasets, basepaths where the time buckets are, etc. [22:27:05] and launch multiple coordinators with THOSE params filled in [22:27:10] in this case [22:27:14] a single bundle will have 2 coordinators [22:27:17] one for ActionApi [22:27:21] and one for CirrusSearchRequestSet [22:27:40] but, still only a single coordinator.xml file [22:27:57] in OO terms, i guess its kinda like the coordinator is the class, and the bundle instantiates it [22:28:16] so, if we do this now [22:28:27] as more people do MW avro+monolog -> kafka stuff [22:28:35] to get it into hive [22:28:47] all they'd have to do is add a new entry in the datasets.xml file, and add a new coordinator instance referencing that dataset [22:28:55] then we can relaunch the bundle with the new configs
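A sketch of how that launch/relaunch looks from the CLI, mirroring how the webrequest load bundle is run; the HDFS path in the comment is illustrative:

    # bundle.properties names the bundle definition, e.g.:
    #   oozie.bundle.application.path=hdfs://analytics-hadoop/wmf/refinery/current/oozie/mediawiki/load/bundle.xml
    oozie job -config bundle.properties -run
    # returns a bundle id (...-oozie-oozi-B) that launches one coordinator
    # per <coordinator> entry in bundle.xml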
[22:29:11] bd808: have you logged into hue before? [22:29:28] I think I tried one day but it didn't work then [22:29:32] https://hue.wikimedia.org/oozie/list_oozie_bundle/0000012-160223202501439-oozie-oozi-B [22:29:35] its a little better now [22:29:37] since last week [22:29:53] ldap credentials right? [22:29:58] shell username, ldap pw [22:30:34] ah. that's probably what I did wrong before (wikitech user vs shell user) [22:30:53] you in? [22:30:56] yup [22:31:07] k so i linked you to the running webrequest load bundle [22:31:11] you can see it has 4 coordinators [22:31:15] click on the text one [22:31:32] that'll show you the details for the webrequest_text coordinator instance [22:31:45] (IF IT EVER LOADS, GEEZ CMON HUE) [22:32:19] so much thinking [22:32:22] when you find yourself in times of trouble, faster hue comes to you [22:32:25] haha [22:32:35] Boom! [22:32:42] oof [22:32:44] big django stack trace [22:33:02] "OperationalError: database is locked" [22:33:24] ok, better now [22:33:25] i think [22:33:26] dunno [22:33:29] it was busy i guess :/ [22:33:37] got a to-do for some hue db work... [22:33:44] want to put it in mysql [22:33:48] currently it's a local derby db on disk :/ [22:33:59] does it work now? [22:34:00] https://hue.wikimedia.org/oozie/list_oozie_coordinator/0000015-160223202501439-oozie-oozi-C/?bundle_job_id=0000012-160223202501439-oozie-oozi-B [22:34:03] ^ that's the text coord [22:34:13] there you can see all of the individual workflow instantiations [22:34:15] one per hour [22:34:43] bah, ha, maybe hue doesn't support multiple users :p [22:35:16] nah [22:35:16] hue is on an sqlite database i think? [22:35:17] bah [22:35:21] naw, derby [22:35:22] at least, i got an error that included sqlite once [22:35:24] OR [22:35:25] hm [22:35:27] no you are right [22:35:29] ebernhardson: [22:35:37] its some other service that defaults to derby [22:36:03] NEway [22:36:04] whatever [22:36:07] hue is great when it works [22:36:08] GAH [22:36:20] sqlite is not a database [22:36:26] anyway, bd808 i just wanted to show you the viz of the bundles and coords and workflows [22:37:04] bd808: my favorite part is inserting text into integer fields. sqlite says 'why not?' [22:37:45] bd808: ok, i'm gonna run out in a sec, any Qs for me before I go? [22:37:50] you can always email or ping me or jo al anytime [22:37:54] I think I get it now. The bundle.xml will have a <coordinator> for cirrus and another for api. Each will override properties that differ for the coordinator.xml downstream [22:38:03] luca is learning a bunch of this too, so he can probably help [22:38:07] yup, exactly [22:38:11] I'll try to make a patch and then you can review [22:38:15] bd808: i'm around to help too if you need [22:38:17] k cool, sounds good [22:38:19] yeah madhu can help!
[22:38:29] bd808: you can test your stuff from stat1002 too [22:38:41] just put the xml files in your user dir in hdfs [22:38:53] and make a property file that outputs to different places [22:38:56] I don't always test, but when I do I do it live in prod ;) [22:39:11] heheh, just make it write elsewhere, use a different hive table, etc. [22:39:13] in your own hive db [22:39:44] i did have the analytics cluster all running in beta.....but it looks like maybe logs filled up the disks and now the nodes are borked :( :( [22:39:49] haven't really investigated [22:40:33] bd808: not live! :) [22:41:00] just like you'd log in somewhere and play with stuff in your home dir, you can do that in hdfs too! [22:41:30] ok then, i'm out for a bit laters! [23:02:49] (PS9) Milimetric: Fetch Pageview Data from Pageview API [analytics/dashiki] - https://gerrit.wikimedia.org/r/270867 (https://phabricator.wikimedia.org/T124063) (owner: Nuria) [23:02:58] ok, nuria, I think I figured out all the problems ^ [23:03:27] the Project Selector and Visualizer weren't matching up because the labels were different (project Url in one case and project dbname in another) [23:04:17] the labels got flipped, which I thought was weird for a bit, (it said all-access: enwiki) instead of (enwiki: all-access), but I just flipped them back in the visualizer to be {color}: {pattern} because I think that makes more sense anyway [23:04:55] and there were a couple of tests failing, one was a weird issue that existed in the past with a race condition, and the other just needed an update since your last patch [23:06:23] (CR) Milimetric: [C: 1] Fetch Pageview Data from Pageview API (1 comment) [analytics/dashiki] - https://gerrit.wikimedia.org/r/270867 (https://phabricator.wikimedia.org/T124063) (owner: Nuria) [23:06:43] good night everyone :) [23:17:03] milimetric: I'm excited for this change :D
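A sketch of the test loop ottomata describes at [22:38:29]-[22:39:13] and [22:41:00]: stage the oozie files under your own HDFS user directory and run against scratch output, overriding parameters on the CLI. It assumes the oozie client is configured (as on stat1002); the paths, property names, and database are illustrative:

    # stage a working copy of the oozie files under your HDFS home dir
    hdfs dfs -mkdir -p /user/$USER/oozie-test
    hdfs dfs -put -f oozie/mediawiki /user/$USER/oozie-test/

    # run one coordinator against it, writing to your own hive db
    oozie job -config coordinator.properties \
      -Doozie.coord.application.path=hdfs://analytics-hadoop/user/$USER/oozie-test/mediawiki/load/coordinator.xml \
      -Dhive_database=$USER -run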