[02:07:40] (PS1) Eileen: Towards CRM-20155 clean up form code in order to consolidate function use. [wikimedia/fundraising/crm/civicrm] - https://gerrit.wikimedia.org/r/373173
[02:11:31] (PS2) Eileen: Add ability to find duplicates for selected contacts. [wikimedia/fundraising/crm/civicrm] - https://gerrit.wikimedia.org/r/373155 (https://phabricator.wikimedia.org/T151270)
[02:39:27] (PS3) Eileen: Add ability to find duplicates for selected contacts. [wikimedia/fundraising/crm/civicrm] - https://gerrit.wikimedia.org/r/373155 (https://phabricator.wikimedia.org/T151270)
[02:59:20] (CR) Eileen: "OK - this is working now on staging" [wikimedia/fundraising/crm/civicrm] - https://gerrit.wikimedia.org/r/373155 (https://phabricator.wikimedia.org/T151270) (owner: Eileen)
[03:02:20] Fundraising Sprint Prank Seatbelt, Fundraising-Backlog, Wikimedia-Fundraising-CiviCRM: civi dedupe: offer dedupe option in a regular search - https://phabricator.wikimedia.org/T151270#3543669 (Eileenmcnaughton) Ok - working on staging now - search for a contact where you know there are duplicate emai...
[03:34:23] (PS1) Eileen: CRM-20658: Fatal error on Dedupe rule for > 1 match [wikimedia/fundraising/crm/civicrm] - https://gerrit.wikimedia.org/r/373176 (https://phabricator.wikimedia.org/T160571)
[04:19:16] (PS2) Eileen: CRM-20658: Fatal error on Dedupe rule for > 1 match [wikimedia/fundraising/crm/civicrm] - https://gerrit.wikimedia.org/r/373176 (https://phabricator.wikimedia.org/T160571)
[04:24:45] Fundraising Sprint Far Beer, Fundraising Sprint Gondwanaland Reunification Engine, Fundraising Sprint Homebrew Hadron Collider, Fundraising Sprint Ivory Tower Defense Games, and 8 others: Errors in CiviCRM dedupe screen - https://phabricator.wikimedia.org/T160571#3543696 (Eileenmcnaughton)
[04:25:32] Fundraising Sprint Far Beer, Fundraising Sprint Gondwanaland Reunification Engine, Fundraising Sprint Homebrew Hadron Collider, Fundraising Sprint Ivory Tower Defense Games, and 8 others: Errors in CiviCRM dedupe screen - https://phabricator.wikimedia.org/T160571#3103876 (Eileenmcnaughton) I'm pu...
[14:46:40] cwd: heads up that we've got a Big English test launching in 15 minutes
[14:46:59] pcoombe: thanks!
[14:47:03] we'll be watching
[14:47:19] cool. It lasts 1 hour, let me know if you see any issues
[14:47:58] sounds good
[14:48:18] i expect to see replag spam in here, will probably fiddle some settings and see what happens
[14:52:10] let me know when you're around ejegg|away
[15:54:51] hi mepps!
[15:55:19] back in Bogota, just sorting out a couple of account things
[15:55:30] hi ejegg! are you working today?
[15:55:45] yep!
[15:56:49] great
[15:57:04] i'm still sick but i'd really like to close out the orphan rectifier stuff today if possible
[15:59:01] i have to take a 10 minute baby break then want to hangout?
[15:59:14] i have my checkin with katie at 12:30
[15:59:46] sure, sounds good!
[16:15:54] ejegg queenmary?
[16:16:07] mepps one sec!
[16:20:15] (PS9) Mepps: WIP Orphan Slayer Module [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/370225
[16:22:00] ejegg: does the donation consumer update c_t?
[16:27:10] cwd yep
[16:28:03] i bet that fights with the front end some
[16:28:05] over the write lock
[16:28:35] we'll do the guid thing some day
[16:30:14] PROBLEM - check_mysql on frdb2001 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 2817
[16:30:26] here it is
[16:30:34] damn
[16:30:47] * cwd silences phone
[16:32:10] (CR) jerkins-bot: [V: -1] WIP Orphan Slayer Module [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/370225 (owner: Mepps)
[16:35:14] ejegg: to disable a job you just comment out the schedule?
[16:35:14] PROBLEM - check_mysql on frdb2001 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 2217
[16:38:18] !log disabled all dedupe
[16:38:29] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[16:38:31] cwd yep, sorry
[16:38:41] np, just checked git
[16:38:46] we'll see if that does it
[16:40:15] PROBLEM - check_mysql on frdb2001 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1763
[16:43:20] there are still some dedupes running
[16:43:47] ejegg, Jeff_Green - opinions on killing em?
[16:44:16] multiple?
[16:44:42] ah nm there are just 2 lines for it
[16:44:49] but it's been running almost an hour
[16:44:50] aside: something heavy happens every 4h 30m
[16:44:55] oh really
[16:45:04] i think it's probably this
[16:45:14] PROBLEM - check_mysql on frdb2001 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1237
[16:46:14] which PID are you talking about? 32641?
[16:47:00] or its parent
[16:47:02] 32633
[16:47:38] is that the sudo wrapper?
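[Editor's note: the check_mysql alerts above fire on the Seconds Behind Master value. As an illustration only (this is not the actual icinga plugin), here is a minimal sketch of how such a check might parse `SHOW SLAVE STATUS\G` output and map lag to an alert state; the threshold values are hypothetical.]

```python
import re

def parse_lag(show_slave_status_text):
    """Extract Seconds_Behind_Master from `SHOW SLAVE STATUS\\G` output.

    Returns None when the value is NULL (replication stopped)."""
    m = re.search(r"Seconds_Behind_Master:\s*(\S+)", show_slave_status_text)
    if m is None or m.group(1) == "NULL":
        return None
    return int(m.group(1))

def classify(lag, warn=300, crit=1800):
    """Map a lag value to a Nagios-style state (thresholds are made up)."""
    if lag is None:
        return "CRITICAL"  # replication not running at all
    if lag >= crit:
        return "CRITICAL"
    if lag >= warn:
        return "WARNING"
    return "OK"

# Sample output shaped like the 16:30:14 alert above
sample = """\
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
        Seconds_Behind_Master: 2817
"""
print(classify(parse_lag(sample)))
```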
[16:48:17] i think 32633 is the wrapper
[16:48:31] who knows, at any rate let's look at the debug log
[16:49:55] the p-c logs don't say much
[16:50:14] RECOVERY - check_mysql on frdb2001 is OK: Uptime: 571814 Threads: 1 Questions: 12784911 Slow queries: 3383 Opens: 8085 Flush tables: 1 Open tables: 601 Queries per second avg: 22.358 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 0
[16:52:58] sorry i'm stupid it has not been running for almost an hour
[16:53:04] almost 20 minutes
[16:53:54] but clearly dedupe chokes when donations are hot
[16:54:35] and i can certainly imagine why that would cause replag
[16:54:51] it usually runs every 5 minutes
[16:55:02] and takes however much less than that so we don't see mails about overlap
[16:55:13] but it's taking 20+ minutes when banners are up
[16:58:21] something ran around these times: 2:35AM, 7:00AM, 11:30AM, 4:00PM that spiked load on the master db
[17:00:22] argh
[17:00:31] and it ran at the same time as these banners
[17:00:47] which messes up the results
[17:01:13] however the dedupe thing still applies in that we don't see that mail normally
[17:38:56] spatton: are banners still up?
[17:39:57] i think they are not
[17:45:14] PROBLEM - check_mysql on frdb2001 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 2779
[17:48:13] whyyyyyyyyy
[17:50:14] RECOVERY - check_mysql on frdb2001 is OK: Uptime: 575413 Threads: 1 Questions: 13221331 Slow queries: 3383 Opens: 8105 Flush tables: 1 Open tables: 601 Queries per second avg: 22.977 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 0
[18:00:49] ejegg, meet on the call in 5? i just need to grab some water
[18:01:27] sounds good mepps!
[18:07:16] ejegg meet in queenmary or in the meeting call?
[18:08:20] mepps oops, i'm in queenmary
[18:08:27] forgot there was a separate meeting call
[18:08:35] okay joining there!
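[Editor's note: the 16:55 exchange above describes a job scheduled every 5 minutes that overruns its interval under banner load, so runs pile up. A common guard for that is a non-blocking file lock: if a previous invocation still holds the lock, the new one skips instead of stacking. This sketch shows only the general `flock` pattern; the lock path and the way the job is wired in are hypothetical, not how process-control actually does it.]

```python
import fcntl
import os

def run_exclusively(job, lock_path="/tmp/dedupe-job.lock"):
    """Run `job()` only if no earlier invocation still holds the lock.

    Returns True if the job ran, False if it was skipped because a
    previous run (e.g. one slowed down by banner traffic) is still going."""
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR)
    try:
        try:
            # Non-blocking exclusive lock: fail fast instead of queueing up
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return False  # earlier run still in progress: skip this tick
        job()
        return True
    finally:
        # Closing the descriptor releases any lock this call acquired
        os.close(fd)
```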
[18:21:21] (PS10) Mepps: WIP Orphan Slayer Module [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/370225
[18:42:57] (PS11) Mepps: WIP Orphan Slayer Module [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/370225
[19:00:01] mepps sorry, rejoining
[19:05:10] (PS12) Mepps: WIP Orphan Slayer Module [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/370225
[19:10:54] (CR) jerkins-bot: [V: -1] WIP Orphan Slayer Module [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/370225 (owner: Mepps)
[19:19:56] (PS1) Ejegg: Return inserted IDs for pending and payments_init [wikimedia/fundraising/SmashPig] - https://gerrit.wikimedia.org/r/373347
[19:24:47] (PS13) Mepps: WIP Orphan Slayer Module, getting expected error message [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/370225
[19:31:10] (CR) jerkins-bot: [V: -1] WIP Orphan Slayer Module, getting expected error message [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/370225 (owner: Mepps)
[19:47:46] AndyRussG: meeting?
[19:52:48] Fundraising-Backlog, Wikimedia-Fundraising, MediaWiki-extensions-CentralNotice, Documentation: Banner and donatewiki style guide documentation needs updating - https://phabricator.wikimedia.org/T119821#3546874 (Pcoombe) Open>Resolved There is now some updated donatewiki documentation on c...
[20:11:38] ejegg do you have any reviews you want me to look at?
[20:11:57] also can you take a look at the orphan rectifier stuff in donationinterface?
[20:15:14] mepps there's the mastercard stuff for review
[20:15:44] and that little smashpig one if you want to get the pending db IDs back from storeMessage: https://gerrit.wikimedia.org/r/373347
[20:15:57] i'll definitely look at the donationinterface bits!
[20:16:06] (PS2) Mepps: Update Mastercard logo [extensions/DonationInterface] - https://gerrit.wikimedia.org/r/373109 (https://phabricator.wikimedia.org/T166795) (owner: Ejegg)
[20:16:11] (CR) Mepps: [C: 2] Update Mastercard logo [extensions/DonationInterface] - https://gerrit.wikimedia.org/r/373109 (https://phabricator.wikimedia.org/T166795) (owner: Ejegg)
[20:20:46] (Merged) jenkins-bot: Update Mastercard logo [extensions/DonationInterface] - https://gerrit.wikimedia.org/r/373109 (https://phabricator.wikimedia.org/T166795) (owner: Ejegg)
[20:36:21] eileen, ejegg - either of you have any idea what these load spikes might be? https://ganglia.wikimedia.org/latest/graph.php?r=day&z=xlarge&c=Fundraising+eqiad&h=frdb1001.frack.eqiad.wmnet&jr=&js=&v=1.09&m=load_one&vl=+&ti=One+Minute+Load+Average
[20:36:48] i do not see any p-c jobs with a 6 hour schedule unless i'm misreading
[20:36:59] wow, those are pretty chunky
[20:37:18] yep and one was right in the middle of the test today
[20:40:07] doesn't seem to be the audit parsers
[20:40:15] (PS2) Mepps: Update MasterCard -> Mastercard [extensions/DonationInterface] - https://gerrit.wikimedia.org/r/373110 (https://phabricator.wikimedia.org/T166795) (owner: Ejegg)
[20:40:20] (CR) Mepps: [C: 2] Update MasterCard -> Mastercard [extensions/DonationInterface] - https://gerrit.wikimedia.org/r/373110 (https://phabricator.wikimedia.org/T166795) (owner: Ejegg)
[20:41:09] cwd silverpop export starts at 06:00, but there's no spike for another hour
[20:42:17] (Merged) jenkins-bot: Update MasterCard -> Mastercard [extensions/DonationInterface] - https://gerrit.wikimedia.org/r/373110 (https://phabricator.wikimedia.org/T166795) (owner: Ejegg)
[20:42:19] and that should just be once a day right?
[20:42:25] yeah
[20:42:36] was wondering if this was multiple things
[20:42:45] but the spikes do look pretty regular
[20:43:06] yeah
[20:43:15] the last one is bigger presumably cause of the banners
[20:43:21] hmmmm
[20:43:44] do they start on the half hour?
[20:45:51] not exactly
[20:46:00] the distribution is a little funky
[20:47:52] hmm, do they last a half an hour each?
[20:48:12] what sort of spike is it?
[20:48:21] could it be some huge report?
[20:48:37] processor or ram or network or writes?
[20:49:10] (CR) Mepps: Support srcset for card logos (1 comment) [extensions/DonationInterface] - https://gerrit.wikimedia.org/r/373112 (https://phabricator.wikimedia.org/T166795) (owner: Ejegg)
[20:49:35] eileen: that graph is cpu
[20:49:56] on the db server I guess
[20:51:15] (CR) Ejegg: Support srcset for card logos (1 comment) [extensions/DonationInterface] - https://gerrit.wikimedia.org/r/373112 (https://phabricator.wikimedia.org/T166795) (owner: Ejegg)
[20:51:21] note the silverpop export has a bunch of queries within it & one or more of them might be more intensive. Also I wonder about cache flushing causing it?
[20:53:13] ejegg see comment above
[20:53:29] thinking those load spikes are related to ossec
[20:53:36] fs scans
[20:54:39] eileen: isn't silverpop running on staging anyway?
[20:54:43] i think this is that: https://ganglia.wikimedia.org/latest/graph.php?r=week&z=xlarge&c=Fundraising+eqiad&h=frdev1001.frack.eqiad.wmnet&jr=&js=&v=1.09&m=load_one&vl=+&ti=One+Minute+Load+Average
[21:07:37] mepps responded!
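[Editor's note: the "do they start on the half hour?" question above is about whether the spikes have a regular cadence. Using the spike times read off the master-db graph earlier (2:35AM, 7:00AM, 11:30AM, 4:00PM), the intervals can be checked directly; this is just an illustrative calculation over those four observations.]

```python
from datetime import datetime

# Spike times quoted in the discussion of the frdb1001 load graph
spikes = ["02:35", "07:00", "11:30", "16:00"]

times = [datetime.strptime(t, "%H:%M") for t in spikes]
gaps = [b - a for a, b in zip(times, times[1:])]
for gap in gaps:
    print(gap)
# Intervals clustered around 4h30m suggest one recurring job
# (e.g. a scheduled scan) rather than several unrelated things.
```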
[21:07:59] or rather, mepps: I responded
[21:08:22] hehe
[21:08:24] cwd yep, it's hitting staging
[21:08:45] cool
[21:09:05] but the actual process runs on civi1001
[21:11:16] (CR) Eileen: [C: 2] Update list of processors in Gateway Reconciliation report [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/370979 (owner: Ejegg)
[21:17:37] (Merged) jenkins-bot: Update list of processors in Gateway Reconciliation report [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/370979 (owner: Ejegg)
[21:36:25] cwd am I right in believing there were no notable issues with replag over the one hour test, or Caitlin's 60k email
[21:36:50] no, email not gone yet I guess
[21:37:52] eileen1: there was some lag, not as bad as last time
[21:41:25] (PS1) Ejegg: Merge branch 'master' into deployment [extensions/DonationInterface] (deployment) - https://gerrit.wikimedia.org/r/373385
[21:41:59] (CR) Ejegg: [C: 2] Merge branch 'master' into deployment [extensions/DonationInterface] (deployment) - https://gerrit.wikimedia.org/r/373385 (owner: Ejegg)
[21:42:51] (Merged) jenkins-bot: Merge branch 'master' into deployment [extensions/DonationInterface] (deployment) - https://gerrit.wikimedia.org/r/373385 (owner: Ejegg)
[21:47:00] cwd hmm lag on a one hour test doesn't bode well for BE
[21:47:15] you ain't kiddin
[21:47:57] i pulled some numbers yesterday and last week's test was around 1/10 the traffic of a busy big english day
[21:49:00] so what's the best way to figure out the source?
[21:50:03] we could turn off all the jobs, wait for a big chunk of donations to build up in the queue, then run just the queue consumer at full tilt, to see if that gets us lag
[21:51:47] yeah we should test donations without dedupe & silverpop.fetch on - because that will be the case for the first few days of BE
[21:53:27] that's not a bad idea
[21:53:51] we turned off a regular security scan, that may have had something to do with it
[21:54:05] ossec-syscheckd
[21:54:25] it's mostly doing duplicated work at this point
[23:48:49] eileen1: sorry I've been tied up with a bunch of other stuff. What is on staging to be reviewed?
[23:49:39] dstrine: if you do a contact search now you will have another action 'Find duplicate contacts' I think
[23:50:04] try for a contact you know there is an email dupe for
[23:50:42] (eg. pick a name from this link civicrm/contact/dedupefind?reset=1&rgid=13&gid=268&limit=500000000&action=update)
[23:51:00] can do hangout / screen share if easier
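[Editor's note: the experiment proposed at 21:50 — let the queue fill, then run only the consumer at full tilt and see whether replication lag appears — needs the lag recorded while the backlog drains. A minimal poller might look like this; `get_lag` is hypothetical glue for whatever returns Seconds Behind Master (e.g. a SHOW SLAVE STATUS query), and the poll counts are illustrative.]

```python
import time

def peak_lag(get_lag, polls, interval=10, sleep=time.sleep):
    """Poll replication lag while the consumer drains the backlog.

    `get_lag` returns the slave's current Seconds Behind Master.
    Returns the peak lag observed, so runs with and without
    dedupe/silverpop enabled can be compared with one number."""
    peak = 0
    for _ in range(polls):
        peak = max(peak, get_lag())
        sleep(interval)
    return peak
```

The `sleep` parameter is injected only so the loop can be exercised without real waiting; in use it defaults to `time.sleep`.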