[15:11:20] mwalker|away: ^^
[15:11:52] PPena: who's doing the store now?
[15:12:38] jeremyb it's Bryony Jones and Juli
[15:14:31] PPena: so, for a chapter bulk order that would be either or one of them in particular?
[15:14:43] bryony is familiar. don't think i've ever heard of juli
[15:16:04] jeremyb yep! Sorry I don't have Juli's email handy, but Bry will def get you to the right place :)
[15:16:13] Nemo_bis: fyi ^^
[15:16:17] k, danke!
[15:16:30] jeremyb: yes I had seen :)
[15:17:19] hmpf, neither of them is on bugzilla?
[15:17:57] PPena: do you really mean Juli? or Julie?
[15:21:24] Nemo_bis Juli Matthews jmatthews@wikimedia.org
[15:22:08] ah thanks
[15:23:31] oh, and there's also a jmatthewson(sp?) too!
[15:23:41] filed https://bugzilla.wikimedia.org/show_bug.cgi?id=53946
[15:24:09] hah
[15:24:38] i think james already had bugzilla experience before he was WMF. maybe these people have never touched it?
[15:24:43] We must have someone to pester about https://bugzilla.wikimedia.org/show_bug.cgi?id=37797
[15:24:53] probably
[15:24:58] i did have that bug in mind!
[15:25:09] of course, everyone should
[15:25:13] and also dream it at night
[18:18:25] K4-713: u wanna hear the sad part of my dedupe story?
[18:18:36] awight: There's a happy part? :p
[18:19:07] K4-713: seriously.
[18:19:25] But yeah: go ahead.
[18:21:32] Well, the only downside is that I'm currently planning to load all contacts in our db, paged of course, and make every comparison. So it's an MxN problem...
[18:22:32] ...
[18:22:36] Yow.
[18:22:49] I have some optimizations... but yeah
[18:22:56] Can you do it in chunks?
[18:23:02] yes "paged"
[18:23:11] Little... jobby chunks, I meant.
[18:23:21] ;) totally, that will have to happen
[18:23:49] The thing I don't like is that the problem never gets smaller...
[18:24:02] Yeah, I would think that should be a requirement.
[18:24:07] like, N (number of unreviewed contacts) fluctuates, but M only gets bigger
[18:24:30] N should get significantly smaller, though.
[18:24:36] Are you, in fact, tagging them?
[18:24:39] yep
[18:24:50] N can be set to a batch size
[18:25:00] well... hm.
[18:25:37] I guess what happened is, I realized the comparisons are not things we want to do in the database...
[18:25:58] Yeah, probably not.
[18:26:10] Well... not all of them, anyway.
[18:26:48] exact match on email seems like something we might want to keep there, but I'm sure you thought of that.
[18:27:22] K4-713: check this out: https://mingle.corp.wikimedia.org/projects/fundraiser_2012/cards/1054
[18:27:59] I guess, "Notes" is the section I want to draw yr attention to
[18:28:17] untruth table...
[18:29:52] Wow, this looks like it really should be a supertask, with the 3x identical being the easiest subtask.
[18:30:23] And, like, a shitload of additional discrete tasks to add to that.
[18:30:26] harr
[18:30:42] ...probably grouped on "action".
[18:30:50] Building the systems simultaneously is efficient, I think
[18:31:34] Well, at the very least, it sounds like it should be broken out into the frequent job, and the infrequent job.
[18:31:43] There is supertask #450
[18:32:07] yeah the two jobs was a late afterthought, I think the fast one is probably more important
[18:32:42] :(
[18:33:00] Meaning, I'm on a rapidly sinking island of onanism at the moment.
[18:33:18] I'm not sure the action is... hm.
[18:33:42] Those last two "autoreviewed" items look really iffy to me.
[18:34:08] Also, is your "reviewed" tag, like, trinary?
[18:34:20] I didn't notate that correctly... thanks, I do need to clarify
[18:34:34] the last two lines are "Autoreviewed as totally unrelated"
[18:34:40] aaaha.
[18:34:49] what I mean by "autoreviewed" there is, "do not add to the manual review queue"
[18:34:51] Wait. Not trinary. Quad... rinary.
[18:34:58] same with the first line
[18:35:00] unreviewed, reviewed (daily), reviewed (long job), reviewed (both).
[18:35:10] that is "autoreviewed as abso-fucking-lutely identical"
[18:35:15] um
[18:35:18] yeah i do need that.
[18:35:39] Selecting on !tag is a big annoying but fine
[18:35:43] bit
[18:35:55] You could do a 4-bit number...
[18:35:59] HAR
[18:36:02] :D
[18:36:17] * awight licks wounds
[18:36:24] Or, I suppose, two booleans.
[18:36:26] :/
[18:36:33] (I'm no fun anymore)
[18:36:35] The tag will work ;)
[18:37:31] Actually, if it's two booleans, it'll scale better if we decide we need a third, fourth, or... however many rounds.
[18:37:36] Likelihood of that?
[18:38:28] A tag is the same as a boolean, just with a more disgusting schema
[18:38:39] ah
[18:38:40] #1054: (AW) Description changed -- https://mingle.corp.wikimedia.org/projects/fundraiser_2012/cards/1054
[18:38:59] Well, I guess I'd deeply prefer to go with the thing that is more performant.
[18:39:20] I'm... kind of expecting this to get ugly for exactly that reason.
[18:39:26] Ugly is fast.
[18:40:05] I'm dedicated to making it slow and pretty
[18:40:11] * awight throws down gauntlet
[18:40:33] Why do I feel like we're writing a sunday morning cartoon show?
[18:40:48] * jvandavier gets out his Acme dynamite
[18:40:49] I think the intensive job is gonna take at least 24 hours per run...
[18:41:06] shit.
[18:41:09] And it's catching the 5% tail of dupes, where some fool typed their shit in wrong
[18:41:22] * K4-713 nods
[18:41:24] but the quick job will probably run hourly, taking a few minutes
[18:41:27] or less
[18:41:42] ALSO
[18:41:45] So, the slow one, is *definitely* a much lower priority than the fast one...
[18:41:59] * awight 's island sinks a bit
[18:42:10] ...but if you're sure it's going to be more efficient to tackle them simultaneously, I'll just... bite down on this popsicle stick and deal with it.
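Editorial sketch of the "two booleans" idea from the exchange above: one independent flag per review job, so a contact can be unreviewed, reviewed by the quick job, reviewed by the long job, or both. This is only an illustration under stated assumptions; the names `ReviewState`, `mark_reviewed`, and `needs_job` are hypothetical and do not come from the donor_review schema or from CiviCRM.

```python
from enum import IntFlag


class ReviewState(IntFlag):
    """Hypothetical per-contact review flags, one bit per dedupe job."""
    UNREVIEWED = 0
    QUICK_JOB = 1  # set once the frequent, cheap job has looked at the contact
    LONG_JOB = 2   # set once the infrequent, intensive job has looked at it


def mark_reviewed(state: ReviewState, job: ReviewState) -> ReviewState:
    """Return the state with the given job's bit set."""
    return state | job


def needs_job(state: ReviewState, job: ReviewState) -> bool:
    """True if the given job has not yet reviewed this contact."""
    return not (state & job)


if __name__ == "__main__":
    state = ReviewState.UNREVIEWED
    state = mark_reviewed(state, ReviewState.QUICK_JOB)
    # The quick job is done with this contact; the long job is not.
    print(needs_job(state, ReviewState.QUICK_JOB))
    print(needs_job(state, ReviewState.LONG_JOB))
```

Each job only ever tests and sets its own bit, so a later third or fourth round would be one more member (value 4, 8, ...) rather than a new combined state.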
[18:42:30] no it's been valuable to be steered away from that for the moment
[18:42:49] i was gonna say though, there is an even bigger factor re: performant.
[18:43:21] I was just about to ask: Have you talked to Jeff about... not running this on al?
[18:43:32] Because, this sounds like something we don't want on al.
[18:43:57] ...the long one, anyway.
[18:44:31] But even for the other one. Argh. I think I could spend a day just tuning the jenkins jobs to play nice with each other.
[18:44:34] I think Al is getting the axe either way
[18:44:43] Sorry RL attack
[18:44:43] * K4-713 weeps slightly
[18:44:58] I was trying to say, the big factor here is that we aren't actually doing anything in the autoreview
[18:45:01] just tagging
[18:45:11] only a few cases can be fully automatic
[18:45:19] so, the blocker here is manual queue review
[18:45:36] it should probably happen at most semi-weekly so nobody goes crazy
[18:45:58] When you tag them as manual review... how do you designate the possible match?
[18:46:09] I interpret that to mean that we don't care so much about performance, as long as we aren't doing anything geometrically stupid
[18:46:17] K4-713: yah excellent question
[18:46:25] I have to do something like Civi's dedupe queue
[18:46:38] currently, I have...
[18:46:39] And, ah... at least at first, we should expect not single possible matches, but intense hairballs of interconnected madness.
[18:46:52] As soon as you step away from the easy case.
[18:47:09] Yes. I've got a plan for how to deal with multi dupes, I should write that down so you can review it.
[18:47:12] It's got to be easy for the reviewer to, ah, parse.
[18:47:27] https://gerrit.wikimedia.org/r/#/c/82774/2/sites/all/modules/donor_review/donor_review.install,unified
[18:47:33] That is the schema so far.
[18:47:44] Probably get jvandavier in on this, too. I feel like he and his team are the likeliest people to actually have to deal with this.
[18:47:49] Also...
[18:47:51] Thought.
[18:48:11] What if we just leave them in hairball mode unless we need to detangle them for support reasons?
[18:48:18] err
[18:48:32] I think the prophylactic approach is better in the long run
[18:48:33] If the hairball view is clear enough, it might not... matter.
[18:48:48] But we want to be able to get clean stats and stuff...
[18:49:22] I guess we don't really know what we want in that area yet, because we've never had clean data before.
[18:49:30] ;)
[18:49:43] I... guess those lybunt/sybunt reports...
[18:49:53] blahrgh
[18:50:19] but also, like total number of donors, analyzing repeat donations, country stats...
[19:49:40] #1075: (AW) Tech Task #1075 Dedupe: quick autoreview job: O:AW|TS:B|P:SH|T:TT Description changed -- https://mingle.corp.wikimedia.org/projects/fundraiser_2012/cards/1075
[19:52:40] #1075: (AW) TS:DR -- https://mingle.corp.wikimedia.org/projects/fundraiser_2012/cards/1075
[19:52:41] #1075: (AW) AT:AW|TS:ID -- https://mingle.corp.wikimedia.org/projects/fundraiser_2012/cards/1075
[20:07:42] (PS1) Mwalker: Merge Custom banner classes from master (also GlobalAlloc) [extensions/CentralNotice] (wmf_deploy) - https://gerrit.wikimedia.org/r/83547
[20:07:57] (CR) Mwalker: [C: 2] Merge Custom banner classes from master (also GlobalAlloc) [extensions/CentralNotice] (wmf_deploy) - https://gerrit.wikimedia.org/r/83547 (owner: Mwalker)
[20:46:48] Romaine: jfyi -- I just deployed a new feature to centralnotice -- banners can now belong to custom groups which means they get custom hiding cookies
[20:46:57] I've changed your two WLM banners to be in the 'wlm2013' group
[20:47:16] this also incidentally means that you're getting first day traffic again as people will have to rehide with the new cookie
[20:52:49] ok
[21:30:40] #1054: (AW) Description changed -- https://mingle.corp.wikimedia.org/projects/fundraiser_2012/cards/1054
[21:32:39] #1054: (AW) Description changed -- https://mingle.corp.wikimedia.org/projects/fundraiser_2012/cards/1054
[21:32:40] #1057: (AW) N:a Description changed -- https://mingle.corp.wikimedia.org/projects/fundraiser_2012/cards/1057
[21:33:41] #1057: (AW) P:WNH -- https://mingle.corp.wikimedia.org/projects/fundraiser_2012/cards/1057
[21:33:41] #1057: (AW) P-TS:(s -- https://mingle.corp.wikimedia.org/projects/fundraiser_2012/cards/1057
[21:33:41] #1054: (AW) TS:DR|MtIDo:(s -- https://mingle.corp.wikimedia.org/projects/fundraiser_2012/cards/1054
[21:37:40] #1056: (AW) Description changed -- https://mingle.corp.wikimedia.org/projects/fundraiser_2012/cards/1056
[21:41:40] #1056: (AW) Description changed -- https://mingle.corp.wikimedia.org/projects/fundraiser_2012/cards/1056
[21:43:41] #1076: (AW) Tech Task #1076 Dedupe: audit log: O:AW|TS:B|P:MH|->Sprint 30|P-TS:#DCc|T:TT Description changed -- https://mingle.corp.wikimedia.org/projects/fundraiser_2012/cards/1076
[23:00:21] mwalker: So... how did you solve that longass email domain issue before? :/
[23:03:27] (PS1) Adamw: stifle warnings [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/83580
[23:07:31] (PS1) Adamw: damage email length inside Civi. [wikimedia/fundraising/crm/civicrm] - https://gerrit.wikimedia.org/r/83581
[23:08:03] mwalker: K4-713: please review ^^ and ^^-2
[23:14:09] (CR) Katie Horn: [C: 2 V: 2] stifle warnings [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/83580 (owner: Adamw)
[23:18:26] K4-713: 9f77e448cfce72576f728bb75bb90a4d9ed4bb17
[23:21:41] #1077: (AW) Tech Defect #1077 CentralNotice: fix banner logs: O:AW|TS:B|P:MH|T:TD Description changed -- https://mingle.corp.wikimedia.org/projects/fundraiser_2012/cards/1077
[23:21:41] #1077: (AW) ->Sprint 30 -- https://mingle.corp.wikimedia.org/projects/fundraiser_2012/cards/1077
[23:26:13] K4-713: http://en.wikipedia.org/wiki/Email_address
[23:28:06] (CR) Katie Horn: [C: 2 V: 2] damage email length inside Civi. [wikimedia/fundraising/crm/civicrm] - https://gerrit.wikimedia.org/r/83581 (owner: Adamw)
[23:34:41] #1077: (AW) TS:DR -- https://mingle.corp.wikimedia.org/projects/fundraiser_2012/cards/1077
[23:34:41] #1077: (AW) AT:AW -- https://mingle.corp.wikimedia.org/projects/fundraiser_2012/cards/1077