[00:14:27] SMalyshev: I can try
[00:14:33] sup? :)
[00:14:38] madhuvishy: cool
[00:15:59] madhuvishy: so the question is as follows: suppose we have a dataset and I want to group it into batches (e.g. 1000 rows per batch) and process it in batches (e.g. send it out to another service)
[00:17:02] madhuvishy: what would be the best way to do such a thing? e.g. I can do data.foreach() and manually count rows etc. but I'm not sure it's the best way, and also the last batch may be less than 1000...
[00:17:43] I know there's take() but it only takes from the start... is there something like an iterated take()?
[00:17:45] SMalyshev: aah, I don't think I've done that in the past, all our spark jobs load entire partitions
[00:18:06] interesting, let me think
[00:19:28] I could also probably do count() and then partition by (count/1000), but that's probably too expensive
[00:21:17] SMalyshev: yeah, things like count are not lazy, so it will take a long time
[00:22:06] yeah that's why I wonder if I can do better
[00:26:38] SMalyshev: where does this 1000 number come from?
[00:27:10] as in, what is the motivation? do you just want to process smaller batches at a time, or do you need it to be a specific size every time?
[00:28:06] madhuvishy: it's not exactly 1000, just some big number. Basically, we'd be sending data about documents to Elasticsearch, and for each document it's 1-2 numbers, so we don't want to do an HTTP request for each document. Instead, we want to batch a number of updates (e.g. 1000) and use the ES bulk API to update all of them in one request
[00:28:39] is this the pageviews stuff?
[00:28:44] yes
[00:28:47] it's for both pageviews and page rank
[00:29:01] Partitions may be the right approach, although they are not evenly sized :(
[00:29:05] yeah
[00:29:09] that's what i was thinking
[00:29:15] but if you're looking specifically at pageviews
[00:29:18] right, both. And maybe more in the future, I guess. I want the script to be a generic "get tons of data from analytics and ship it to ES"
[00:29:21] which is why I asked if it needs to be an exact division
[00:29:27] the refined, ETL'd pageview table is a lot smaller and more structured than the raw webrequest data
[00:29:27] sooo
[00:29:34] like, the aggregates
[00:29:37] Ironholds: how big is the partition?
[00:29:59] SMalyshev, much like post-war europe, the size of the partition varies wildly
[00:30:00] Ironholds: this is much later in the pipeline. Something else will read in data and generate scores for page rank and page views. Those final scores are written out to something in hdfs
[00:30:07] ebernhardson, ahhh
[00:30:11] Ironholds: this script just reads the pre-computed data, and sends it to ES
[00:30:28] madhuvishy: no, it doesn't need to be exact. but if it's 100G of data, there's a chance it will choke memory or ES
[00:30:37] SMalyshev, oh it'll never be that much
[00:30:42] (unless that's hyperbolic)
[00:30:51] SMalyshev: Spark dataframes have a repartition function
[00:30:54] yeah I have no idea :)
[00:31:10] i think you can split it into an arbitrary number of smaller partitions
[00:31:16] madhuvishy: yeah I've seen it, but it takes how many partitions, right, not how many per partition?
[00:31:21] or did I miss an option?
[00:31:34] SMalyshev: yeah you don't know how many per partition
[00:31:39] because I have no idea how many pages there will be...
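A minimal PySpark sketch of the count-then-repartition idea being discussed here, assuming df is the DataFrame of (page_id, score) rows; send_bulk, the index name, and the ES URL are illustrative, not the actual script:

    import json
    import urllib2  # Python 2, matching the era; requests would also work

    def send_bulk(rows):
        # Build one Elasticsearch bulk request body for a list of (page_id, score) rows.
        lines = []
        for page_id, score in rows:
            lines.append(json.dumps({"update": {"_id": page_id}}))
            lines.append(json.dumps({"doc": {"score": score}}))
        body = "\n".join(lines) + "\n"
        urllib2.urlopen(urllib2.Request("http://localhost:9200/pages/page/_bulk", body)).read()

    def send_partition(rows):
        rows = list(rows)  # one partition becomes one bulk request
        if rows:
            send_bulk(rows)

    total = df.count()                   # not lazy: costs a full pass over the data
    num_parts = max(1, total // 1000)    # aim for roughly 1000 rows per partition
    df.rdd.map(tuple).repartition(num_parts).foreachPartition(send_partition)

The downsides, as noted in the chat, are the extra count() pass and the fact that repartition only evens partitions out approximately.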
[00:31:42] you'd have to experiment a bit
[00:31:45] on the question of existing partition size
[00:31:50] we can do a bit of experimenting as madhuvishy says
[00:31:58] like, SELECT COUNT(*) GROUP BY [all the partitions]
[00:32:07] over a day of data, work out the divide
[00:32:31] ok, I guess that can be a way to do it... how expensive is repartitioning?
[00:32:38] if we're looking at the ETL'd webrequests to start with you will probably be fine but might not be. If we're looking at the aggregated pageviews data you will definitely be fine.
[00:32:52] but I can just run a hive query and get you some baseline numbers if you throw a phab ticket into the analytics sprint ;p
[00:33:15] SMalyshev: it wouldn't be as even, but maybe foreachPartition? that allows you to run a process per partition, and within a partition you would manually batch 1000 and then send
[00:33:48] but the last request in each partition (however many) will be smaller
[00:34:01] ebernhardson: yeah that's what I'm looking for but I don't know how big the partition is... I guess yeah, if I say I want 10k partitions then they'd probably be around 1000
[00:34:21] Ironholds: is the data in hadoop partitioned by hour or sth?
[00:34:38] madhuvishy, hour and source
[00:34:42] well, ymdh and source
[00:34:55] that script won't be working on raw analytics data though. It would be working on the output of the aggregator script.
[00:35:09] It may also work to convert the data frame into an rdd and group by something like that
[00:35:15] and map across each group
[00:35:19] which would probably be just page_id:score pairs for all pages within the timeframe
[00:35:45] SMalyshev: aah grouping by page id won't work?
[00:36:01] I am not sure how many rows you'd have per page
[00:36:04] 1 :)
[00:36:10] aah
[00:36:11] madhuvishy: hm... That's an interesting idea... if I try to group by pageId%1000 it may work
[00:36:46] why %1000?
[00:36:49] though I'm concerned it's much more work than it should be. repartitioning may be cheaper
[00:37:14] SMalyshev: maybe - you can try them in the spark-shell and see how long they take
[00:37:45] yeah I'll try that probably and see. ok, I've got some good ideas, thanks!
[00:38:23] np! let me know how it goes
[03:43:03] (PS11) Nuria: Add pageview quality check to pageview_hourly [analytics/refinery] - https://gerrit.wikimedia.org/r/240099 (https://phabricator.wikimedia.org/T109739) (owner: Joal)
[05:36:50] Analytics, Discovery, EventBus, MediaWiki-General-or-Unknown, and 7 others: EventBus MVP - https://phabricator.wikimedia.org/T114443#1744752 (Smalyshev) @GWicke I would be interested to participate. I'll be in the office, could you add me to the invite?
[07:33:55] Analytics-Tech-community-metrics, DevRel-October-2015, Patch-For-Review: Present most basic community metrics from T94578 on one page - https://phabricator.wikimedia.org/T100978#1744824 (Luiscanasdiaz) @aklapper your changes were merged. I do think it is a good approach.
[07:36:28] Analytics-Tech-community-metrics, DevRel-October-2015, Patch-For-Review: Fine tune "Code Review overview" metrics page in Korma - https://phabricator.wikimedia.org/T97118#1744836 (Luiscanasdiaz) @aklapper not sure about the impact of these changes, I mean, everything works but makes a bit difficult the m...
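Back on the Spark thread: a sketch of ebernhardson's foreachPartition suggestion, which avoids the count() entirely by batching manually inside each partition (send_bulk is the same illustrative helper as in the earlier sketch; only the final batch per partition comes up short):

    from itertools import islice

    BATCH_SIZE = 1000

    def send_partition(rows):
        # Pull at most BATCH_SIZE rows at a time off the partition's iterator,
        # so only one batch is ever materialized in memory.
        while True:
            batch = list(islice(rows, BATCH_SIZE))
            if not batch:
                break
            send_bulk(batch)  # the last batch in each partition may be < BATCH_SIZE

    df.rdd.map(tuple).foreachPartition(send_partition)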
[08:01:34] Analytics-Wikistats: Discrepancies in historical total active editor numbers - https://phabricator.wikimedia.org/T87738#1744887 (Nemo_bis) Independent debugging could also be performed by checking whether any individual wiki exhibits such a fluctuation in the same month and whether there are significant fluc...
[08:06:09] Analytics, Discovery, EventBus, MediaWiki-General-or-Unknown, and 7 others: EventBus MVP - https://phabricator.wikimedia.org/T114443#1744891 (mobrovac) >>! In T114443#1744752, @Smalyshev wrote: > @GWicke I would be interested to participate. I'll be in the office, could you add me to the invite?...
[09:08:14] Analytics-Backlog: Sanitize pageview_hourly - https://phabricator.wikimedia.org/T114675#1744996 (JAllemandou) Task breakdown: # Write an oozie coordinator to backfill sanitization from existing pageview_hourly to pageview_hourly_new (path TBD) # Backfill by hand: create pageview_hourly_new, launch and moni...
[09:50:37] Analytics-Kanban, RESTBase, Services, Patch-For-Review, RESTBase-API: configure RESTBase pageview proxy to Analytics' cluster {slug} [3 pts] - https://phabricator.wikimedia.org/T114830#1745043 (akosiaris)
[09:50:39] Analytics, Services, operations, Patch-For-Review: Set up LVS for AQS - https://phabricator.wikimedia.org/T116245#1745040 (akosiaris) Open>Resolved a:akosiaris LVS for AQS is up and running. We had to migrate restbase on AQS to port 7232 to avoid conflicting with the services restbase inst...
[09:50:58] !log restart cassandra on aqs1003
[09:51:01] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log, Master
[10:03:31] Analytics-Tech-community-metrics, DevRel-October-2015: Correct affiliation for code review contributors of the past 30 days - https://phabricator.wikimedia.org/T112527#1745058 (Aklapper) I pushed my first update. [[ http://korma.wmflabs.org/browser/scr-contributors.html | Data on korma ]] should get updat...
[10:11:05] morning a-team
[10:13:57] Hi mforns
[10:14:07] hello!
[10:16:57] mforns_: about backfilling EL, did you do it yesterday with Dan?
[10:17:51] (PS2) Joal: Correct camus-partition-checker to use hdfs conf [analytics/refinery/source] - https://gerrit.wikimedia.org/r/247847
[10:31:00] joal, sorry didn't see your message. We tried, we deployed the new patch and created a separate eventlogging instance in dan's home folder
[10:31:23] ok
[10:31:27] np mforns :)
[10:31:38] joal, we created a file containing all events that were created during the outage
[10:31:43] right
[10:31:46] and piped it to the mysql consumer
[10:31:57] however, it didn't go as expected
[10:32:22] ouch :(
[10:32:26] we were slowly progressing but it was too late for me.. so I left
[10:32:37] k
[10:32:49] Thanks a lot for having taken care of that
[10:33:06] we were managing to get the events inserted, with some timezone issues...
[10:33:24] but we got an error created by a badly formatted event
[10:33:47] that crashed the process
[10:33:55] hm, how is that even possible, given we validated everything?
[10:34:01] we should probably fix that error first, and then try to backfill again
[10:34:20] mmm, it seemed like some parsing issue
[10:34:55] k
[10:34:57] the event text contained several single quotes, double quotes and slashes
[10:35:07] pfff
[10:35:14] hehehe
[10:35:18] classical, but uncool
[10:36:15] joal, I'm going to work on the browser report changes now, and in the afternoon, I'll take that up again with Dan
[10:36:59] (PS3) Joal: Update camus-partition-checker [analytics/refinery/source] - https://gerrit.wikimedia.org/r/247847
[10:37:09] ok mforns_
[10:37:15] Thanks a lot
[10:37:22] np
[10:44:01] (CR) Joal: "Tested with:" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/247847 (owner: Joal)
[10:54:48] * joal is gone chopping some wood
[10:54:55] * joal will be back in a few hours
[10:56:51] Analytics, Services, operations: Set up LVS for AQS - https://phabricator.wikimedia.org/T116245#1745136 (mobrovac)
[11:44:42] (PS1) Christopher Johnson (WMDE): adds graphite module [wikidata/analytics/dashboard] - https://gerrit.wikimedia.org/r/248022
[11:44:56] (PS4) Mforns: Add oozie job to compute browser usage reports [analytics/refinery] - https://gerrit.wikimedia.org/r/246851 (https://phabricator.wikimedia.org/T88504)
[11:45:28] (CR) Mforns: Add oozie job to compute browser usage reports (5 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/246851 (https://phabricator.wikimedia.org/T88504) (owner: Mforns)
[11:47:40] (CR) Christopher Johnson (WMDE): [C: 2 V: 2] adds graphite module [wikidata/analytics/dashboard] - https://gerrit.wikimedia.org/r/248022 (owner: Christopher Johnson (WMDE))
[13:25:32] * joal is back !
[13:29:14] hayiIiii
[13:31:52] Hey ottomata :)
[14:03:57] ottomata: I have tested the CamusPartitionChecker: works fine :)
[14:05:09] I'll let you review / merge, I'll add puppet code and we can discuss how to deploy
[14:06:36] awesoome
[14:07:14] (PS1) Christopher Johnson (WMDE): adds bulk sparql query and output scripts removes total_views [analytics/limn-wikidata-data] - https://gerrit.wikimedia.org/r/248033
[14:09:50] ottomata: reading the camus puppet module
[14:10:11] ottomata: shouldn't the camus init.pp also depend on refinery being deployed?
[14:10:25] Maybe we can't enforce that since refinery is manually deployed?
[14:10:59] Like ensuring that /srv/deployment/analytics/refinery exists?
[14:11:16] joal: that would be nice, but i didn't want to introduce that dependency.
[14:11:21] kinda weird to depend from module to role
[14:11:26] so, i made the script a parameter instead
[14:11:43] with a default value, makes sense
[14:11:55] ah but
[14:12:01] And refinery is a role instead of a module because of being manually deployed?
[14:12:01] we could force the dependency here
[14:12:04] https://github.com/wikimedia/operations-puppet/blob/production/manifests/role/analytics/refinery.pp#L62
[14:12:09] well, actually, it is already being forced
[14:12:11] via the
[14:12:13] require role::analytics::refinery
[14:12:28] so, use of the camus module does not explicitly depend on refinery
[14:12:35] but our use of it does via the role::analytics::refinery::camus class
[14:12:49] joal: refinery is a role, but could be a module. maybe.
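On the crashed backfill above: whatever the root cause of the badly formatted event turns out to be, the consumer loop can be hardened so a single bad event is skipped and logged instead of killing the whole process. A minimal sketch of that idea only — the loop shape and insert_event are illustrative, not eventlogging's actual code:

    import logging

    def consume(events, insert_event):
        # Insert events one at a time; quarantine failures instead of dying.
        failed = []
        for event in events:
            try:
                insert_event(event)  # may raise on quoting/nesting it cannot handle
            except Exception:
                logging.exception("Could not insert event, skipping: %r", event)
                failed.append(event)  # keep bad events around for a later retry
        return failed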
[14:13:09] i generally like to keep very WMF specific things out of modules, unless it makes a lot of sense
[14:13:19] ottomata: I am just trying to get my head around the stuff, not change it :)
[14:13:25] aye
[14:13:33] in the case of a refinery module, the only thing i would put in the module would be the main role::analytics::refinery class
[14:13:41] the role dep, is it in the camus role?
[14:13:43] anything else in the refinery.pp role file, i think i would keep in a role
[14:13:49] there is no camus role.
[14:14:07] joal: generally, but not always, i think of modules as very pluggable libraries
[14:14:13] right
[14:14:14] and roles as usages of those libraries
[14:14:19] sometimes, modules must use other modules
[14:14:26] so the line is kinda blurry
[14:14:35] (PS2) Christopher Johnson (WMDE): adds bulk sparql query and output scripts removes total_views [analytics/limn-wikidata-data] - https://gerrit.wikimedia.org/r/248033
[14:15:39] ottomata: so the dependency on refinery is made in the role that also uses camus
[14:15:42] right?
[14:17:28] yes, um
[14:17:32] role ... refinery::camus
[14:17:33] does
[14:17:38] require role::analytics::refinery
[14:17:47] which ensures that that class is realized before it
[14:17:58] and that class includes
[14:17:59] package { 'analytics/refinery':
[14:17:59] provider => 'trebuchet',
[14:17:59] }
[14:23:39] ottomata: shall I go ahead and add a camus::checker class and use it in the refinery::camus role?
[14:23:53] camus::checker would be in the camus submodule
[14:24:19] Analytics, Discovery, EventBus, MediaWiki-General-or-Unknown, and 6 others: Define edit related events for change propagation - https://phabricator.wikimedia.org/T116247#1745395 (Ottomata) COOL. As part of this discussion, I'd like us to think about not only fields that are relevant to edit event...
[14:27:28] HMMM joal, hm.
[14:27:34] i see the reason for your questions now.
[14:27:35] hm.
[14:27:47] so, camus checker is 100% in refinery, right?
[14:27:52] hm.
[14:28:25] camus checker needs refinery-job, a camus.properties file, hadoop conf (for HDFS), and hadoop + spark libs
[14:29:24] Oh, and java, obviously
[14:29:30] ottomata: --^
[14:30:53] joal: hm
[14:31:00] maybe add a wrapper in refinery/bin for your thing
[14:31:15] hmm
[14:31:15] no
[14:31:20] is this going to be launched by oozie?
[14:31:43] As we prefer, ottomata
[14:31:51] whatcha think?
[14:31:57] Since it's java, it is launchable via oozie
[14:32:09] right, but the reason to launch by oozie would be data based
[14:32:10] hm.
[14:32:12] but since it's very camus related, I would have bundled it with camus
[14:32:16] yeah
[14:32:16] hm
[14:32:18] holaaa, let me know if you need help backfilling eventlogging cc milimetric
[14:32:30] joal, maybe you should just incorporate this command into the existing bin/camus wrapper?
[14:32:34] as an option
[14:32:35] like
[14:32:56] morning nuria. we've gotta fix that bug. I just had some blood drawn, so I'm having a late start, gotta get some food :)
[14:33:08] hmmm
[14:33:11] but you don't have the date in there.
[14:33:23] ok, milimetric let's talk later, cause when backfilling from a file some escaping is needed
[14:33:24] that's going to be the hard part, how do you know which date to check?
[14:33:35] ottomata: by default the thing uses the last camus run in history
[14:33:41] hm
[14:33:42] ok
[14:33:48] so, yeah, then that does make sense
[14:33:56] you can pass it a date if you prefer, but with no date, it uses the last one
[14:34:00] all you need is the properties file then
[14:34:04] yessir
[14:34:06] so you can add it to the bin/camus script
[14:34:11] since that is also being passed the properties file
[14:34:48] k ottomata, will look into that
[14:35:07] --check-flag
[14:35:08] or something
[14:35:12] or just --check
[14:35:20] which then makes this job be launched after the camus one is done.
[14:36:07] hm, question: when you say done, you mean launched right, not finished?
[14:36:21] uh
[14:36:27] i mean finished, after the camus mr job is finished
[14:36:30] right?
[14:36:50] so, bin/camus script does
[14:36:53] sys.exit(os.system(command))
[14:36:58] instead, don't exit
[14:36:59] will look into that camus script
[14:37:03] just check the return val
[14:37:11] (maybe? not sure what the ret val of the hadoop job will be)
[14:37:16] but, just don't exit
[14:37:21] then if --check was given
[14:37:28] run your job after os.system returns
[14:37:45] if you think it's worthwhile you can augment the script to use something other than os.system
[14:37:52] subprocess or whatever
[14:37:57] up to you
[14:38:21] ottomata: got it
[14:39:41] thx ottomata
[14:43:14] nuria: ok, I'm about to try to fix that bug
[14:43:31] first I'll try to repro on an04 and I'll fiddle with the python there until I figure it out
[14:44:24] milimetric: are you sure there is a bug?
[14:44:34] cause backfilling from a file is not the same
[14:44:37] as a stream
[14:44:39] oops, sorry, I forgot I didn't email, it's on phab: https://phabricator.wikimedia.org/T116241
[14:45:08] it's that weird schema with the array in it
[14:45:17] I thought it was just failing but it actually kills the consumer completely
[14:45:25] so we can't insert the other events
[14:45:52] the other bug is that the change somehow didn't work, it wasn't sleeping and the memory use grew to like 15GB before we killed it
[14:45:56] gotta look into that too, but the bug first
[14:46:03] nothin's ever easy :)
[14:46:05] milimetric: every event or just a particular one of that schema?
[14:46:25] it looks like just that particular one, but we have to protect the consumer from whatever's happening there
[14:46:34] it's getting past validation somehow, that's the weird part
[14:47:14] milimetric: you know, we should be able to test with a unit test with the event in question
[14:52:57] Analytics-Backlog, Database: Set up bucketization of editCount fields {tick} - https://phabricator.wikimedia.org/T108856#1745447 (jcrespo) I am rolling the schema change right now: ``` MariaDB EVENTLOGGING m4 localhost log > SHOW CREATE TABLE MobileWebWatching_11761466\G *************************** 1. ro...
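Putting ottomata's bin/camus suggestion together, the change would be shaped roughly like this (a sketch: camus_command, checker_command and the option handling are illustrative, only the overall flow comes from the discussion above):

    import os
    import sys

    # Run the Camus MapReduce job and wait for it, instead of
    # sys.exit(os.system(command)) right away.
    ret = os.system(camus_command)

    if ret == 0 and options.check:
        # Camus finished OK, so run the CamusPartitionChecker with the same
        # properties file; with no explicit date it checks the last camus run.
        ret = os.system(checker_command)

    sys.exit(ret)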
[14:53:59] PROBLEM - Difference between raw and validated EventLogging overall message rates on graphite1001 is CRITICAL: CRITICAL: 26.67% of data above the critical threshold [30.0]
[14:55:50] RECOVERY - Difference between raw and validated EventLogging overall message rates on graphite1001 is OK: OK: Less than 25.00% above the threshold [20.0]
[15:03:35] Analytics-Kanban: Backfill mobile and upload for oct 20th [3 pts] {hawk} - https://phabricator.wikimedia.org/T116283#1745474 (JAllemandou) NEW a:JAllemandou
[15:07:33] Analytics-Tech-community-metrics: Empty "subject" and "creator" fields for mailing list thread on mls.html - https://phabricator.wikimedia.org/T116284#1745489 (Aklapper) NEW
[15:07:48] (PS1) Joal: Update oozie diagram to reflect current status [analytics/refinery] - https://gerrit.wikimedia.org/r/248041
[15:08:12] Analytics-Tech-community-metrics: Review/update mailing list repositories in korma - https://phabricator.wikimedia.org/T116285#1745496 (Aklapper) NEW
[15:12:35] Analytics-Engineering, Wikidata: Dashboard repository for limn-wikidata-data - https://phabricator.wikimedia.org/T112506#1745505 (JanZerebecki)
[15:13:28] ottomata: could you merge this: https://gerrit.wikimedia.org/r/#/c/248045/
[15:13:33] it'll let us keep going with backfilling
[15:17:39] milimetric: cool
[15:17:42] edit the comment there
[15:17:49] sorry actually, no rush, I can just grep -v for now to backfill
[15:17:51] the commit is good, but it would be nice to have the same info in the comment
[15:18:00] oh good point
[15:18:12] Analytics-Tech-community-metrics, DevRel-November-2015: Legend for "review time for reviewers" and other strings on repository.html - https://phabricator.wikimedia.org/T103469#1745512 (Aklapper)
[15:18:42] (PS2) Joal: Update oozie diagram to reflect current status [analytics/refinery] - https://gerrit.wikimedia.org/r/248041 (https://phabricator.wikimedia.org/T115993)
[15:19:00] milimetric, "Exclude CentralNoticeBannerHistory from mysql" :]
[15:19:22] are you planning on retrying backfilling?
[15:19:53] mforns: yep, nuria and I are in the cave, we're gonna start to try now
[15:19:58] (PS1) Joal: Update bin/camus to include CamusPartitionChecker [analytics/refinery] - https://gerrit.wikimedia.org/r/248048 (https://phabricator.wikimedia.org/T113252)
[15:20:01] oh ok
[15:20:07] may I join?
[15:20:10] no!
[15:20:13] of course
[15:20:14] xD
[15:20:14] mforns: those events were not making it into the db
[15:20:15] :)
[15:20:16] yessir :D
[15:21:34] Analytics-Kanban, Patch-For-Review: EventLogging mysql consumer can be killed by a bad event with a nested array - https://phabricator.wikimedia.org/T116241#1745517 (Nuria)
[15:22:07] Analytics-Backlog, Database: Set up bucketization of editCount fields {tick} - https://phabricator.wikimedia.org/T108856#1745519 (jcrespo) It will take a day to convert MobileWebClickTracking_5929948 (>300GB). If it fails, we will try with regular tokudb online table. Writes can continue without problem w...
[15:23:21] Analytics-Kanban, Patch-For-Review: EventLogging mysql consumer cannot insert events that have a nested json schema that includes a plain "array" - https://phabricator.wikimedia.org/T116241#1744214 (Nuria)
[15:37:28] Analytics-Tech-community-metrics, DevRel-November-2015: Legend for "review time for reviewers" and other strings on repository.html - https://phabricator.wikimedia.org/T103469#1745559 (Aklapper) `review_time_pending_ReviewsWaitingForReviewer_days_median` provides "Review Time for reviewers (days, median)"...
[15:38:22] PROBLEM - Difference between raw and validated EventLogging overall message rates on graphite1001 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [30.0]
[15:40:12] RECOVERY - Difference between raw and validated EventLogging overall message rates on graphite1001 is OK: OK: Less than 25.00% above the threshold [20.0]
[15:52:34] Analytics-Tech-community-metrics, DevRel-October-2015, Patch-For-Review: Fine tune "Code Review overview" metrics page in Korma - https://phabricator.wikimedia.org/T97118#1745607 (Aklapper) Merged. Thanks! I'm going to close this task once it's live on korma.
[16:01:52] ottomata, madhuvishy standuppp?
[16:03:11] PROBLEM - Difference between raw and validated EventLogging overall message rates on graphite1001 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [30.0]
[16:03:21] Analytics-Cluster, Analytics-Kanban: Use Burrow for Kafka Consumer offset lag monitoring - https://phabricator.wikimedia.org/T115669#1745630 (Ottomata) I was able to successfully create a Burrow .deb for Jessie today. It is a little hacky and needs some work, but the idea should be fine.
[16:05:45] Analytics-EventLogging, Analytics-Kanban, Patch-For-Review, WMF-deploy-2015-10-27_(1.27.0-wmf.4): Add the schema name to the EL EventError topic [8 pts] - https://phabricator.wikimedia.org/T115121#1745634 (Nuria) Open>Resolved
[16:06:05] Analytics-Cluster, Analytics-Kanban, Patch-For-Review: Decommission remaining old Hadoop Workers {hawk} - https://phabricator.wikimedia.org/T112113#1745637 (Nuria) Open>Resolved
[16:06:07] Analytics-Cluster, Analytics-Kanban: {mule} Hadoop Cluster Expansion - https://phabricator.wikimedia.org/T99952#1745638 (Nuria)
[16:07:00] RECOVERY - Difference between raw and validated EventLogging overall message rates on graphite1001 is OK: OK: Less than 25.00% above the threshold [20.0]
[16:07:42] Analytics-Backlog, Analytics-Cluster: Implement better Webrequest load monitoring {hawk} - https://phabricator.wikimedia.org/T109192#1745641 (Nuria)
[16:07:43] Analytics-Cluster, Analytics-Kanban, Patch-For-Review: Improve daily webrequest partition report {hawk} [5 pts] - https://phabricator.wikimedia.org/T113255#1745640 (Nuria) Open>Resolved
[16:12:21] Analytics-Kanban: Create deb package for Burrow - https://phabricator.wikimedia.org/T116084#1745657 (madhuvishy)
[16:57:22] Analytics-Backlog, Analytics-Cluster, Analytics-Kanban: Procure hardware for future druid cluster - https://phabricator.wikimedia.org/T116293#1745790 (Nuria) NEW
[17:02:47] nuria: \o/
[17:07:54] Analytics-Tech-community-metrics, DevRel-October-2015: Automated generation of (Git) repositories for Korma - https://phabricator.wikimedia.org/T110678#1745829 (Aklapper) >>! In T110678#1734742, @Dicortazar wrote: > @Aklapper, how can we automatically retrieve the list of Git repositories available from s...
[17:09:52] Analytics-Backlog, Analytics-Cluster, Analytics-Kanban: Procure hardware for future druid cluster - https://phabricator.wikimedia.org/T116293#1745833 (kevinator) This will be similar to T100442
[17:15:54] joal, you want to go to https://plus.google.com/hangouts/_/wikimedia.org/a-batcave-2?
[17:15:58] for oozie changes?
[17:16:01] sure mforns !
[17:18:10] nuria: https://github.com/wikimedia/operations-puppet/blob/production/manifests/role/eventlogging.pp#L227
[17:18:45] madhuvishy: right, but what machine is m4?
[17:18:50] madhuvishy: isn't that an alias?
[17:19:00] madhuvishy: is it db1046.eqiad.wmnet?
[17:19:09] oh - hmmm not sure
[17:20:52] sorry a-team, got disconnected faster than expected :)
[17:27:34] (PS12) Joal: Add pageview quality check to pageview_hourly [analytics/refinery] - https://gerrit.wikimedia.org/r/240099 (https://phabricator.wikimedia.org/T109739)
[17:38:17] nuria: I pushed the patch for the whitelist, I'll let you confirm and merge :)
[17:38:34] Joal: looking now
[17:38:47] if you look at patch 11, you'll see the diff :)
[17:38:51] nuria: --^
[17:39:01] joal: what did you do
[17:39:10] git-wise that is
[17:41:34] joal: but your change does not have dan's changes, does it?
[17:41:38] oh, sorry nuria
[17:41:50] it does actually: it is onto it :)
[17:42:15] if you look at the entire workflow file, it contains the transform part
[17:42:38] So what I did: picked the commit using the command given by gerrit
[17:42:57] checked out a new branch to make that safer than master
[17:43:17] joal: i see, yes, it has dan's workflow
[17:43:19] actually before picking the commit, I did fetch all and pull
[17:43:36] on master, to make sure I had the latest master
[17:43:45] Then picked up the commit, then rebased on master
[17:43:56] ottomata: i made an initial patch for the module
[17:44:18] 3 conflicts to resolve, so updated the 3 files, then added them, then rebase --continue
[17:44:24] finally review, done :)
[17:44:28] nuria: --^
[17:44:31] joal: k, merging, my apologies again
[17:44:32] makes sense?
[17:44:40] nuria: not to worry :)
[17:44:56] (CR) Nuria: [C: 2 V: 2] Add pageview quality check to pageview_hourly [analytics/refinery] - https://gerrit.wikimedia.org/r/240099 (https://phabricator.wikimedia.org/T109739) (owner: Joal)
[17:45:28] nuria: before I leave, talk about the cassandra java code?
[17:45:48] joal: sure
[17:46:00] cave?
[17:46:06] or is it full?
[17:46:27] nuria: https://gerrit.wikimedia.org/r/#/c/247758/3
[17:46:31] looks like it is already on by default
[17:46:46] madhuvishy: cool!
[17:47:11] ottomata: just saw that, great
[17:47:18] ottomata: still have to add the role
[17:47:25] ottomata: abandoning patch
[17:52:52] nuria: i'm gonna eat lunch and take a moment to breathe. my computer crashed but it's ok now
[17:53:14] milimetric: mystery solved with database
[17:55:09] milimetric: let's sync when you are back
[18:27:34] ok, nuria, so the db spike is explained?
[18:27:50] (we can talk in the cave for a minute, but I have a meeting in 5
[18:27:53] milimetric: yes, mforns do you want to explain?
[18:28:10] milimetric, jaime crespo is working on the bucketization of the editCount fields
[18:28:12] milimetric: the dba is bucketing EL editing data
[18:28:22] so executing scripts
[18:28:24] o!
[18:28:30] milimetric: so no backfilling until that is finished
[18:28:32] i know
[18:28:33] makes sense
[18:28:36] will he ping us?
[18:28:36] and this will take at least until tomorrow
[18:28:42] i think our stuff is good
[18:28:51] it was just slow for this reason
[18:29:10] I spoke with him about this this morning, but I didn't think about that when we were discussing, sorry!
[18:29:18] 'sok
[18:29:45] so the last thing to check will be what queue size to use to keep memory usage reasonable during backfilling
[18:29:54] that'll probably take a few tries, but we'll do it when jaime's done
[18:30:05] thx all, i'll put the related tasks on pause for now then.
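For the record, joal's gerrit recipe above maps onto roughly these commands (the ref is whatever gerrit's download box shows for the change and patchset; the branch name here is a placeholder):

    git fetch --all && git pull                  # make sure master is current first
    git fetch origin refs/changes/99/240099/11   # the "download" ref gerrit shows
    git checkout -b quality-check FETCH_HEAD     # work on a branch, safer than master
    git rebase master                            # 3 conflicted files in this case
    # fix each conflicted file, then:
    git add <files>
    git rebase --continue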
[18:30:58] ok, makes sense
[19:01:37] Analytics-Backlog, Wikipedia-iOS-App-Product-Backlog, iOS-5-app-production: Puppetize Piwik to prepare for production deployment - https://phabricator.wikimedia.org/T103577#1746119 (JMinor)
[19:04:08] ottomata: are you still in the meeting?
[19:04:32] Analytics, Discovery, EventBus, MediaWiki-General-or-Unknown, and 7 others: EventBus MVP - https://phabricator.wikimedia.org/T114443#1746131 (Ottomata)
[19:06:27] Analytics, Discovery, EventBus, MediaWiki-General-or-Unknown, and 6 others: Define edit related events for change propagation - https://phabricator.wikimedia.org/T116247#1746147 (Ottomata) etherpad from today's meeting: https://etherpad.wikimedia.org/p/eventbus-events
[19:06:35] madhuvishy: no, done.
[19:06:50] ottomata: coolll. i pushed changes, and added a role class
[19:07:16] need to define what consumer groups to listen to, and not sure if i should hardcode them
[19:07:28] can you make it just do all?
[19:07:34] i guess that'd be annoying :)
[19:07:42] madhuvishy: hiera :)
[19:07:52] ottomata: oh yeah, hiera
[19:08:16] a-team: did we decide what we're gonna do for a place to stay in January?
[19:08:36] mmmm
[19:08:49] ottomata: okay, do i have to do something different for labs?
[19:09:11] the other configs i'm importing from kafka::config and i assume it'll already do the right thing
[19:09:25] Analytics-Backlog, Wikipedia-iOS-App-Product-Backlog, iOS-5-app-production: Support Pywick in production - https://phabricator.wikimedia.org/T116308#1746155 (JMinor) NEW
[19:10:03] Analytics-Backlog, Wikipedia-iOS-App-Product-Backlog: Stand up piwik in a permanent and privacy-sensitive way - https://phabricator.wikimedia.org/T98058#1746166 (JMinor)
[19:10:05] Analytics-Backlog, Wikipedia-iOS-App-Product-Backlog, iOS-5-app-production: Support Pywick in production - https://phabricator.wikimedia.org/T116308#1746165 (JMinor)
[19:10:41] milimetric: i thought just hotel
[19:10:43] i thought we had to
[19:10:57] madhuvishy: ja that is the right thing
[19:12:32] joal: nice oozie chart! :)
[19:13:18] ottomata: ya okay, and if i do hiera('role::analytics::burrow::blah::consumer_groups') it'll get it from the right place - so i don't have to define anything separate for labs?
[19:14:20] no, the proper way (i think) is to just set the variable name on the module.
[19:14:28] that's actually how the kafka stuff should work too
[19:14:33] buuuuut, kafka predates hiera :)
[19:14:48] module params are automatically set from hiera
[19:15:06] Analytics-Backlog, Wikipedia-iOS-App-Product-Backlog, hardware-requests, operations, iOS-5-app-production: Production achine to suport pywick analytics - https://phabricator.wikimedia.org/T116312#1746196 (JMinor) NEW
[19:15:10] ottomata: uhhh
[19:15:28] looking...
[19:15:39] so madhuvishy
[19:15:41] in the role
[19:15:44] if you do
[19:15:44] include burrow
[19:15:49] (or class { 'burrow': })
[19:15:56] and then in hiera, do
[19:16:05] Analytics-Backlog, Wikipedia-iOS-App-Product-Backlog, hardware-requests, operations, iOS-5-app-production: Request one server to suport pywick analytics - https://phabricator.wikimedia.org/T116312#1746205 (JMinor)
[19:16:05] burrow::consumer_groups = ...
[19:16:07] or
[19:16:08] sorry, it's yaml
[19:16:09] so
[19:16:16] "burrow::consumer_groups": ...
[19:16:18] in hiera
[19:16:30] and where does the yaml file go?
[19:16:33] then puppet will automatically pass the hiera value to the module
[19:16:37] Analytics-Backlog, Wikipedia-iOS-App-Product-Backlog, hardware-requests, operations, iOS-5-app-production: Request one server to suport pywick analytics - https://phabricator.wikimedia.org/T116312#1746196 (JMinor)
[19:16:44] ottomata: aah
[19:16:53] so i don't have to pass it from the role?
[19:17:24] well, hm. i think not. I *think* it will auto fill them but i'm not sure what happens if you have other params set too... i think it will do the right thing
[19:17:26] yes
[19:17:34] i think yes is the answer to your question :)
[19:17:54] so in both the module and role, i don't have to say consumer_groups = hiera('something')
[19:18:02] if i just define the right thing
[19:18:07] it'll automatically show up
[19:18:18] right.
[19:18:23] mm hmm
[19:18:52] ottomata: and where does the yaml go? role/common/analytics/burrow.yaml?
[19:18:59] or just role/common/burrow.yaml
[19:19:25] ha, UM
[19:19:31] i *think* the former
[19:19:49] since your role class is named role::analytics::burrow
[19:20:00] ottomata: hmmmm
[19:20:16] okay i trust you :)
[19:20:20] haha, i mean
[19:20:28] i'd say i'm 80% sure about that :)
[19:20:35] :D okay
[19:20:52] do you have a package i can test with?
[19:21:19] no hurry, just asking if you already made it
[19:24:59] Analytics-Backlog, Analytics-Kanban: Projections of cost and scaling for pageview API. {hawk} [8 pts] - https://phabricator.wikimedia.org/T116097#1746226 (Nuria) p:Normal>High
[19:28:08] ottomata: ^^
[19:29:17] what hiera?
[19:29:21] oh
[19:29:23] burrow
[19:29:27] uhhhh, i have one, but it doesn't have systemd yet
[19:29:29] i'm working on it now
[19:29:36] ottomata: cool!
[19:29:38] also also
[19:29:41] the one i have just installs stuff and the binary
[19:29:42] you want?
[19:30:03] ottomata: nah i'll wait - i have to eat lunch anyway
[19:30:12] i see that in eventlogging
[19:30:29] the consumer groups themselves are defined in hiera
[19:31:00] https://www.irccloud.com/pastebin/rBmfvEBC/
[19:31:14] should i just put eventlogging-00 in burrow.yaml?
[19:31:41] i do not know how to make hiera lookup from hiera
[19:31:52] HMMM
[19:32:03] well, that's just the default if there is none specified
[19:32:07] yeah
[19:32:17] so maybe we need to put that into hiera more explicitly as an object and look it up that way
[19:32:28] not sure.
[19:32:35] might just need hardcoding
[19:32:35] hm.
[19:32:42] ottomata: ya but that'd mean making burrow look up each of them
[19:32:47] yeah
[19:32:50] ideally it shouldn't care about EL, no
[19:33:00] or webrequest or anything we add
[19:33:01] ya, would be nice if you could just look up all consumer groups
[19:33:03] hm
[19:33:21] madhuvishy: this might be possible to DRY, but it might be kinda hard and messy
[19:33:23] I'd be okay if it was like hiera('el-consumer-groups')
[19:33:26] i'd put them into burrow.yaml for now
[19:33:51] yeah
[19:33:52] well
[19:34:05] hm
[19:34:11] we'd have to be able to look them up by name for eventlogging
[19:34:14] so, either it's unDRY
[19:34:16] or
[19:34:18] it'd be something like
[19:34:38] values(hiera('el_consumer_groups'))
[19:34:40] and
[19:35:07] el_consumer_groups:
[19:35:07]   eventlogging_processor: eventlogging-00
[19:35:07]   eventlogging_mysql_consumer: eventlogging-mysql-00
[19:35:08]   etc.
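So the un-DRY version ottomata lands on would look something like this in hiera (a sketch; the file path is his "80% sure" guess from above, and the group names come from the defaults just pasted):

    # hieradata/role/common/analytics/burrow.yaml
    burrow::consumer_groups:
      - eventlogging-00
      - eventlogging-mysql-00

With the role doing include burrow, puppet binds this array to the module's consumer_groups parameter automatically, so no explicit hiera() call is needed.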
[19:35:16] right
[19:35:23] and will have to change el puppet too
[19:35:27] yeah
[19:35:30] if you can figure that out, i'm all for it, but it will be tricky
[19:35:33] hmmmm
[19:35:40] so, for now i'd say just put them in an array in burrow.yaml
[19:35:45] i wanna start with that
[19:35:46] manually list the consumer groups you want to monitor
[19:35:46] ya
[19:35:48] maybe we can refactor later
[19:35:54] okay
[19:36:00] where is the el yaml file
[19:36:48] ottomata: i can't find it in puppet
[19:36:52] the values are just in eqiad.yaml
[19:36:57] oh
[19:37:03] but, the consumer names aren't specified there
[19:37:11] since they aren't overridden from the default values
[19:38:12] ottomata: so even though we say hiera('....'), it's not really doing anything atm?
[19:39:57] right, it allows you to change the consumer group via hiera if you want to
[19:40:00] but provides a default
[19:40:12] that way you don't have to set that value in labs hiera or other environments every time
[19:41:03] ottomata: yeah alright
[19:41:41] (PS5) Mforns: Add oozie job to compute browser usage reports [analytics/refinery] - https://gerrit.wikimedia.org/r/246851 (https://phabricator.wikimedia.org/T88504)
[19:42:45] ottomata: does camus not belong to a consumer group? at least it looks like we don't define one explicitly - should we monitor this?
[19:42:54] sorry i have way too many questions
[19:43:20] it does
[19:43:51] https://github.com/wikimedia/operations-puppet/blob/production/modules/camus/templates/webrequest.erb#L76
[19:44:01] OH
[19:44:04] madhuvishy: it does not.
[19:44:05] i mean, it does.
[19:44:09] but, camus does not commit offsets to kafka
[19:44:15] ottomata: aah
[19:44:17] it saves them in hdfs
[19:44:26] so can't monitor using burrow
[19:44:29] right
[19:44:30] PROBLEM - Difference between raw and validated EventLogging overall message rates on graphite1001 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [30.0]
[19:44:34] GAH
[19:44:37] WHY!?
[19:44:53] the false alarm?
[19:45:20] mforns' theory is that graphite updates the two metrics independently with a small delay
[19:45:32] and icinga picks it up during that gap and complains
[19:48:10] RECOVERY - Difference between raw and validated EventLogging overall message rates on graphite1001 is OK: OK: Less than 25.00% above the threshold [20.0]
[19:54:26] yeah but there is an averaged window for this thing
[19:54:39] ottomata: oh
[19:57:36] Analytics-Backlog, Wikipedia-iOS-App-Product-Backlog, hardware-requests, operations, iOS-5-app-production: Request one server to suport pywick analytics - https://phabricator.wikimedia.org/T116312#1746317 (Legoktm) Do you mean piwik? https://github.com/sachdevs/pyWick seems unrelated to analytic...
[19:58:42] madhuvishy: is there a reason for /etc/burrow/config/...
[19:58:45] and not just /etc/burrow/
[19:58:45] ?
[19:58:52] ottomata: no
[19:59:13] no reason
[19:59:15] i can change it
[20:00:46] k
[20:00:48] ja just /etc/burrow
[20:00:49] cool
[20:01:26] brb lunch
[20:02:17] Analytics, Discovery, EventBus, MediaWiki-General-or-Unknown, and 6 others: Define edit related events for change propagation - https://phabricator.wikimedia.org/T116247#1746338 (GWicke) Some notes from the meeting: ## Framing, for all events - **uri**: string; path or url. Example: /en.wikipedia...
[20:16:14] back
[20:19:59] ottomata: could you do the ansible deploy please?
[20:20:08] ansible-playbook --check -i production -e target=aqs roles/restbase/deploy.yml
[20:20:10] and then
[20:20:14] ansible-playbook -i production -e target=aqs roles/restbase/deploy.yml
[20:22:58] doing
[20:24:28] --check went ok
[20:24:33] doing it for real
[20:24:41] !log deploying aqs
[20:24:43] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log, Master
[20:28:26] Analytics-Backlog, Wikipedia-iOS-App-Product-Backlog, hardware-requests, operations, iOS-5-app-production: Request one server to suport piwick analytics - https://phabricator.wikimedia.org/T116312#1746422 (JMinor)
[20:29:02] Analytics-Backlog, Wikipedia-iOS-App-Product-Backlog, hardware-requests, operations, iOS-5-app-production: Request one server to suport piwik analytics - https://phabricator.wikimedia.org/T116312#1746426 (Milimetric)
[20:29:06] Analytics-Backlog, Wikipedia-iOS-App-Product-Backlog, hardware-requests, operations, iOS-5-app-production: Request one server to suport piwik analytics - https://phabricator.wikimedia.org/T116312#1746427 (JMinor) Yes, sorry http://piwik.org/ Corrected spelling...
[20:29:23] Analytics-Backlog, Wikipedia-iOS-App-Product-Backlog, iOS-5-app-production: Support Piwick in production - https://phabricator.wikimedia.org/T116308#1746428 (JMinor)
[20:29:40] Analytics-Backlog, Wikipedia-iOS-App-Product-Backlog, iOS-5-app-production: Support Piwik in production - https://phabricator.wikimedia.org/T116308#1746431 (BGerstle-WMF)
[20:30:31] Analytics-Backlog, Wikipedia-iOS-App-Product-Backlog, iOS-5-app-production: Support Piwik in production - https://phabricator.wikimedia.org/T116308#1746432 (JMinor)
[20:31:06] did it get stuck ottomata?
[20:31:12] I see the new code on 1001 but not 1002
[20:32:07] new table creation typically takes a little longer, but this is only one table
[20:34:36] Ironholds: will CR your stuff today
[20:35:27] milimetric: sorry, started looking at something else
[20:35:32] the table's created gwicke
[20:35:36] yes it got stuck
[20:35:45] 1001 failed the port check
[20:35:51] failed port check, hm...
[20:35:52] so it stopped
[20:36:02] TASK: [check port 7231] *******************************************************
[20:36:03] failed: [aqs1001.eqiad.wmnet] => {"elapsed": 180, "failed": true}
[20:36:03] msg: Timeout when waiting for aqs1001.eqiad.wmnet:7231
[20:36:03] FATAL: all hosts have already failed -- aborting
[20:36:06] should I try again?
[20:36:21] I'd restart 1001 first
[20:36:35] and make sure it's okay
[20:36:40] it's running
[20:36:51] kk
[20:37:06] try again?
[20:37:16] i think maybe it just took too long to come up for ansible?
[20:37:17] deploy will skip the restart if the code is up to date already; that's the reason why I normally restart explicitly before continuing
[20:37:29] if all is well, just continue
[20:37:36] it is a new process since I ran that
[20:37:38] so i think it restarted
[20:37:41] will continue
[20:39:23] (code got updated on 1002 now)
[20:39:35] (table's there no matter which cassandra I query)
[20:39:51] select * from "local_group_default_T_pageviews_per_article_flat"."data";
[20:40:08] yeah, schema changes are always cluster-wide
[20:40:26] which is why all the schema changes we support are backwards-compatible
[20:40:42] right, just making sure one of the nodes wasn't dropped out of the cluster or something weird
[20:43:28] ottomata: it got stuck again on 1002 I'm guessing?
[20:45:14] yes
[20:45:40] milimetric: Active: active (running) since Thu 2015-10-22 20:38:14 UTC; 7min ago
[20:45:45] i think the timeout isn't long enough maybe
[20:45:51] for the ansible check
[20:45:52] gwicke: ?
[20:45:53] hey a-team, going to sign off for today, see you tomorrow!
[20:45:59] doing it again
[20:46:01] see ya marcel
[20:46:05] ciao
[20:46:05] thx ottomata
[20:46:08] bye
[20:46:11] bye mforns!
[20:46:18] bye!
[20:50:35] same deal milimetric, failed check on 1003
[20:50:37] but looks good now
[20:51:21] thx ottomata, looks good to me
[20:51:37] joal: if you're still around, the new table's deployed and the new code is there if you want to mess with it
[20:51:41] thx all
[20:51:50] the timeout is two minutes IIRC, so something seems to be slow about startup
[20:53:26] hm
[20:53:41] ¯\_(ツ)_/¯
[20:53:52] hehe
[20:53:53] (madhuvishy i just used the alias!! :D)
[20:54:05] * gwicke copies new ascii art
[20:54:20] ottomata: what alias?
[20:54:46] shrug
[20:54:54] aaah
[20:54:56] didn't you give me that one?
[20:55:01] no
[20:55:04] oh
[20:55:07] thought it was you, noooO?
[20:55:08] oh well
[20:55:11] nooo
[20:55:46] Analytics-Tech-community-metrics, Developer-Relations, DevRel-October-2015: Check whether it is true that we have lost 40% of (Git) code contributors in the past 12 months - https://phabricator.wikimedia.org/T103292#1746532 (saper) Wouldn't whitelisting of commits to only those who are listed in LDAP a...
[20:55:53] ottomata: i added comments etc and pushed
[20:55:58] now gotta test
[21:03:29] cool, package getting close madhuvishy !
[21:03:29] :)
[21:03:37] yay
[21:07:10] I'm off for tonight, ping me if you need me
[21:07:33] bye Dan!
[21:20:54] madhuvishy: https://gerrit.wikimedia.org/r/#/c/248245/
[21:21:46] ottomata: cool! I don't know how to use this patch to install the package
[21:22:15] also, you can link this if you want - https://phabricator.wikimedia.org/T116084
[21:22:17] madhuvishy: aye
[21:22:19] oh yeah
[21:22:42] madhuvishy: where are you testing?
[21:22:52] i will put the .deb there for you
[21:22:58] labs - in kafka-jessie01
[21:23:03] perfect
[21:23:08] me too :)
[21:23:12] ha ha
[21:23:19] cool
[21:23:21] i installed it there already
[21:23:30] i've edited files in /etc/burrow/
[21:23:34] should work
[21:23:39] with the same stuff you were testing i think
[21:23:41] it is running
[21:23:45] sudo service burrow status
[21:24:07] feel free to stop and edit and change whatever you need
[21:24:11] and/or apply puppetization
[21:24:25] ottomata: it says inactive now
[21:24:30] oh i stopped it
[21:24:31] :p
[21:24:32] doh
[21:24:34] try it
[21:24:37] sudo service burrow start
[21:24:42] okay :)
[21:24:50] nice
[21:24:51] thanks
[21:25:08] ottomata: curl localhost:8000/v2/kafka/local/
[21:25:58] aye cool!
[21:26:10] pretty sweet
[21:26:13] :)
[21:26:26] it's too bad the kafka folks didn't use etcd instead of zk
[21:26:27] curl localhost:8000/v2/kafka/local/topic/test
[21:26:33] cause then we'd have all that http interface by default
[21:26:35] no consumer groups there so can't check those
[21:26:40] yeah
[21:26:45] aye
[21:26:49] OH COOL
[21:26:50] also have you decided where this is gonna run
[21:26:54] it shows partition offsets of the topic?
[21:26:58] yess
[21:27:00] nice
[21:27:06] like the head of the topic partitions
[21:27:07] cool
[21:27:11] need to check that 8000 is free
[21:27:17] wherever we wanna run it
[21:27:18] oh, that should be parameterized too
[21:27:22] in your puppet stuff ja?
[21:27:24] i didn't make it configurable
[21:27:28] ok i can do that
[21:27:28] you should! :)
[21:27:30] default 8000
[21:27:31] but ja
[21:27:51] cool, i'll make that change too
[21:29:18] thanks ottomata :)
[21:29:21] ok madhuvishy i'm out for the day!
[21:29:22] laterrrs
[21:29:24] byeee
[23:12:13] (CR) Nuria: Functions for identifying search engines as referers. (8 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/247601 (https://phabricator.wikimedia.org/T115919) (owner: OliverKeyes)
[23:42:25] (CR) Nuria: Add oozie job to compute browser usage reports (3 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/246851 (https://phabricator.wikimedia.org/T88504) (owner: Mforns)
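As a footnote to the Burrow testing above, the two endpoints exercised with curl are easy to poll from Python as well (stdlib only; the consumer-status path at the end follows the same /v2/kafka/<cluster>/... pattern but is an assumption here, not something shown in the session):

    import json
    import urllib2

    BURROW = "http://localhost:8000/v2/kafka/local"

    def get(path):
        # Burrow answers plain JSON over HTTP.
        return json.load(urllib2.urlopen(BURROW + path))

    print(get("/"))              # cluster info, as in the first curl
    print(get("/topic/test"))    # head offsets of the topic's partitions
    # assumed endpoint, same pattern: status/lag for a consumer group
    # print(get("/consumer/eventlogging-00/status"))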