[00:20:46] Damn, how are we still getting missing gateway messages? [01:56:57] Fundraising Sprint Navel Warfare, Fundraising Sprint Outie Inverter, Fundraising Sprint Prank Seatbelt, Fundraising Sprint Quill Pencil, and 4 others: Are PayPal refunds for recurring donations incorrectly being tagged as EC or vice versa? - https://phabricator.wikimedia.org/T171351#3462035 (XenoR... [02:08:11] argh, ipnpb.paypal.com is dead again [02:10:03] or... was, for a bit [06:04:25] AndyRussG: ARE YOU OK? [06:05:27] looks like quake is not directly in your area [06:05:29] ? [09:39:57] (Abandoned) Hashar: Jenkins job validation (DO NOT SUBMIT) [wikimedia/fundraising/SmashPig] - https://gerrit.wikimedia.org/r/376243 (owner: Hashar) [09:40:27] (Abandoned) Hashar: Jenkins job validation (DO NOT SUBMIT) [extensions/DonationInterface] (deployment) - https://gerrit.wikimedia.org/r/376225 (owner: Hashar) [09:40:29] (Abandoned) Hashar: Jenkins job validation (DO NOT SUBMIT) [extensions/DonationInterface] - https://gerrit.wikimedia.org/r/324066 (owner: Hashar) [10:00:07] PROBLEM - check_puppetrun on americium is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 6 minutes ago with 1 failures. Failed resources (up to 3 shown): Exec[generate-kafkatee.pyconf] [10:05:07] RECOVERY - check_puppetrun on americium is OK: OK: Puppet is currently enabled, last run 41 seconds ago with 0 failures [10:15:07] PROBLEM - check_puppetrun on americium is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 6 minutes ago with 1 failures. Failed resources (up to 3 shown): Exec[generate-kafkatee.pyconf] [10:20:07] PROBLEM - check_puppetrun on americium is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 11 minutes ago with 1 failures. 
Failed resources (up to 3 shown): Exec[generate-kafkatee.pyconf] [10:25:07] RECOVERY - check_puppetrun on americium is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:00:13] PROBLEM - check_puppetrun on americium is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 6 minutes ago with 1 failures. Failed resources (up to 3 shown): Exec[generate-kafkatee.pyconf] [11:05:13] RECOVERY - check_puppetrun on americium is OK: OK: Puppet is currently enabled, last run 41 seconds ago with 0 failures [12:00:14] PROBLEM - check_puppetrun on americium is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 6 minutes ago with 1 failures. Failed resources (up to 3 shown): Exec[generate-kafkatee.pyconf] [12:05:14] RECOVERY - check_puppetrun on americium is OK: OK: Puppet is currently enabled, last run 41 seconds ago with 0 failures [12:07:54] Fundraising-Backlog, MediaWiki-extensions-CentralNotice, Wikimedia-CentralNotice-Administration: CN Campaign Suppression prior to scheduled start time - https://phabricator.wikimedia.org/T175358#3591257 (Jseddon) [12:08:21] Fundraising-Backlog, MediaWiki-extensions-CentralNotice, Wikimedia-CentralNotice-Administration: CN Campaign Suppression prior to scheduled start time - https://phabricator.wikimedia.org/T175358#3591272 (Jseddon) [13:39:53] fundraising-tech-ops, Operations, procurement: cost estimate for two prometheus+grafana servers for fundraising - https://phabricator.wikimedia.org/T175364#3591415 (Jgreen) [13:42:15] (PS1) Ejegg: Use version 1.0.0 of PHP-Queue [wikimedia/fundraising/SmashPig] - https://gerrit.wikimedia.org/r/376716 [13:44:34] (CR) Ejegg: [C: 2] Use version 1.0.0 of PHP-Queue [wikimedia/fundraising/SmashPig] - https://gerrit.wikimedia.org/r/376716 (owner: Ejegg) [13:45:37] (Merged) jenkins-bot: Use version 1.0.0 of PHP-Queue [wikimedia/fundraising/SmashPig] - https://gerrit.wikimedia.org/r/376716 (owner: Ejegg) [13:56:51] (PS1) Ejegg: Update SmashPig, don't list php-queue as dep 
[extensions/DonationInterface] - https://gerrit.wikimedia.org/r/376717 [13:57:00] (CR) Ejegg: [C: 2] Update SmashPig, don't list php-queue as dep [extensions/DonationInterface] - https://gerrit.wikimedia.org/r/376717 (owner: Ejegg) [13:58:38] (CR) jerkins-bot: [V: -1] Update SmashPig, don't list php-queue as dep [extensions/DonationInterface] - https://gerrit.wikimedia.org/r/376717 (owner: Ejegg) [13:58:40] (CR) jerkins-bot: [V: -1] Update SmashPig, don't list php-queue as dep [extensions/DonationInterface] - https://gerrit.wikimedia.org/r/376717 (owner: Ejegg) [14:06:46] (PS2) Ejegg: Update SmashPig, don't list php-queue as dep [extensions/DonationInterface] - https://gerrit.wikimedia.org/r/376717 [14:11:21] (PS3) Ejegg: Update SmashPig, don't list php-queue as dep [extensions/DonationInterface] - https://gerrit.wikimedia.org/r/376717 [14:11:45] (PS7) Ejegg: WIP upgrade to new Minfraud Composer package [extensions/DonationInterface] - https://gerrit.wikimedia.org/r/375454 (https://phabricator.wikimedia.org/T128902) [14:13:11] (CR) jerkins-bot: [V: -1] WIP upgrade to new Minfraud Composer package [extensions/DonationInterface] - https://gerrit.wikimedia.org/r/375454 (https://phabricator.wikimedia.org/T128902) (owner: Ejegg) [14:15:46] (CR) Ejegg: "Needs composer clear-cache" [extensions/DonationInterface] - https://gerrit.wikimedia.org/r/375454 (https://phabricator.wikimedia.org/T128902) (owner: Ejegg) [14:20:10] (PS1) Ejegg: Update SmashPig, PHPQueue [extensions/DonationInterface/vendor] - https://gerrit.wikimedia.org/r/376722 [15:10:57] Fundraising-Backlog, MediaWiki-extensions-CentralNotice, Wikimedia-CentralNotice-Administration: CN Campaign Suppression prior to scheduled start time - https://phabricator.wikimedia.org/T175358#3591781 (AndyRussG) I tried reproducing this locally as described, but everything seemed to work normally. An... [15:30:12] PROBLEM - check_puppetrun on americium is CRITICAL: CRITICAL: Puppet has 1 failures. 
Last run 5 minutes ago with 1 failures. Failed resources (up to 3 shown): Exec[generate-kafkatee.pyconf] [15:30:59] Fundraising-Backlog, Wikimedia-Fundraising-Banners: Safari issue with Other amount option in banner: mask currency code letters - https://phabricator.wikimedia.org/T173431#3591859 (Pcoombe) Resolved>Open Re-opening, that change broke using backspace and arrow keys in Firefox so I've reverted it w... [15:35:12] RECOVERY - check_puppetrun on americium is OK: OK: Puppet is currently enabled, last run 41 seconds ago with 0 failures [15:47:33] !log increased cURL timeout for PayPal IPN confirmation to 14 sec [15:47:44] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:23:55] Fundraising-Backlog, Wikimedia-Fundraising-Banners: Safari issue with Other amount option in banner: mask currency code letters - https://phabricator.wikimedia.org/T173431#3591960 (Pcoombe) Open>Resolved Wow, for something so basic KeyboardEvents are a total mess of supposedly deprecated/unsuppor... 
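Note on the 15:47:33 !log entry (cURL timeout for PayPal IPN confirmation raised to 14 sec): the sketch below illustrates what an IPN confirmation with an explicit timeout looks like. It is written in Python for illustration only and is not the SmashPig code, which is PHP. The endpoint path and the function names are assumptions; the log itself only names the host ipnpb.paypal.com and the 14-second value.

```python
import urllib.request

# Assumption: PayPal's IPN postback endpoint on the host named in the log.
IPN_VERIFY_URL = "https://ipnpb.paypal.com/cgi-bin/webscr"
# Value taken from the 15:47:33 !log entry.
IPN_TIMEOUT_SECONDS = 14


def build_verification_body(raw_ipn_body: bytes) -> bytes:
    """Prefix the unmodified IPN payload with cmd=_notify-validate."""
    return b"cmd=_notify-validate&" + raw_ipn_body


def verify_ipn(raw_ipn_body: bytes, timeout: float = IPN_TIMEOUT_SECONDS) -> bool:
    """Post the message back to PayPal; True only if the reply is VERIFIED.

    Raises an exception (URLError or a timeout error) if PayPal does not
    answer within `timeout` seconds, so the caller can retry later.
    """
    request = urllib.request.Request(
        IPN_VERIFY_URL,
        data=build_verification_body(raw_ipn_body),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().strip() == b"VERIFIED"
```

A longer timeout trades slower failure detection for fewer messages dropped while the PayPal host is slow to respond, which may relate to the 02:08:11 remark that ipnpb.paypal.com was briefly unreachable.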
[17:03:33] (CR) Ejegg: "recheck" [extensions/DonationInterface] - https://gerrit.wikimedia.org/r/375454 (https://phabricator.wikimedia.org/T128902) (owner: Ejegg) [17:05:18] (CR) jerkins-bot: [V: -1] WIP upgrade to new Minfraud Composer package [extensions/DonationInterface] - https://gerrit.wikimedia.org/r/375454 (https://phabricator.wikimedia.org/T128902) (owner: Ejegg) [17:18:15] (PS8) Ejegg: WIP upgrade to new Minfraud Composer package [extensions/DonationInterface] - https://gerrit.wikimedia.org/r/375454 (https://phabricator.wikimedia.org/T128902) [17:19:36] (CR) jerkins-bot: [V: -1] WIP upgrade to new Minfraud Composer package [extensions/DonationInterface] - https://gerrit.wikimedia.org/r/375454 (https://phabricator.wikimedia.org/T128902) (owner: Ejegg) [17:21:05] Fundraising-Backlog, MediaWiki-extensions-CentralNotice, Wikimedia-CentralNotice-Administration: CN Campaign Suppression prior to scheduled start time - https://phabricator.wikimedia.org/T175358#3592101 (AndyRussG) Hi... Thanks for noticing this and for the careful description...!!! I'm not sure the... 
[17:21:52] Fundraising-Backlog, Wikimedia-Fundraising-CiviCRM, Fr-CiviCRM-dedupe-FY2017/18: Find source of unlimited dedupe queries, prevent them - https://phabricator.wikimedia.org/T175382#3592103 (Ejegg) [17:22:04] Fundraising-Backlog, Wikimedia-Fundraising-CiviCRM, Fr-CiviCRM-dedupe-FY2017/18: Find source of unlimited dedupe queries, prevent them - https://phabricator.wikimedia.org/T175382#3592115 (Ejegg) p:Triage>High [17:34:51] (PS1) Ejegg: Log a stack trace for all dedupe queries [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/376760 (https://phabricator.wikimedia.org/T175382) [17:44:36] (CR) jerkins-bot: [V: -1] Log a stack trace for all dedupe queries [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/376760 (https://phabricator.wikimedia.org/T175382) (owner: Ejegg) [17:46:34] Fundraising-Backlog, MediaWiki-extensions-CentralNotice, Wikimedia-CentralNotice-Administration: CN Campaign Suppression prior to scheduled start time - https://phabricator.wikimedia.org/T175358#3591257 (Pcoombe) There were some issues with site reachability in Europe on September 6 and I think at ar... [17:47:40] (PS2) Ejegg: Log a stack trace for all dedupe queries [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/376760 (https://phabricator.wikimedia.org/T175382) [17:51:34] Fundraising-Backlog, MediaWiki-extensions-CentralNotice, Wikimedia-CentralNotice-Administration: CN Campaign Suppression prior to scheduled start time - https://phabricator.wikimedia.org/T175358#3592270 (Jseddon) https://pivot.wikimedia.org/#banner_activity_minutely/line-chart/2/EQUQLgxg9AqgKgYWAGgN7... [17:59:07] ejegg: o/ [17:59:21] hi cwd! [17:59:29] how's things down south? [18:00:49] doing fine down here in the big city [18:00:55] how's the wild west treating you? [18:01:52] oh good, still hot, but starting to cool off at night [18:02:23] still spending a lot of time on the construction projects? 
[18:02:43] things have started to calm down on that front [18:02:52] must be a relief [18:02:57] still infinity stuff to do but the basic necessities are in place [18:03:20] of course i woke up to new damage from an ant infestation in the ceiling this morning [18:03:32] they are tunneling in the rigid insulation and raining toxic dust everywhere [18:05:24] (PS4) Ejegg: Update SmashPig, don't list php-queue as dep [extensions/DonationInterface] - https://gerrit.wikimedia.org/r/376717 (https://phabricator.wikimedia.org/T133556) [18:05:38] anyway we are still bashing our heads against the replag problem [18:05:41] turning up some interesting stuff [18:08:14] yeah? [18:08:30] for instance, the replag is nothing new, i found alerts about it literally every month since i started [18:08:32] I found a place to instrument the source of some of those email deadlocks... [18:08:45] cwd oh wow, why did we suddenly notice it? [18:09:03] i found some emails between me and jeff trying to figure it out in february too [18:09:13] but, there was a huge, sustained uptick sometime in july [18:09:37] i was thinking, you'd have a lot better of an idea of what changed on the application side during july [18:09:47] looking in the process-control settings log [18:10:07] examining the repos one by one feels hopeless [18:10:14] just too much data to parse [18:10:28] the silverpop mailing data imports started in July [18:10:42] i do remember that [18:10:44] like July 24th [18:10:48] but that's pretty much time boxed right? [18:11:12] they were running twice an hour most hours [18:11:26] oh really [18:11:32] is it running anymore? 
[18:11:53] yes, but we're mostly caught up [18:12:02] lemme see how long they're taking these days [18:12:48] it would also be good to know if that caused excessive bloat of tables [18:12:54] in a short time [18:13:14] they don't use temp tables much afaik [18:13:22] but they would do a whole lot of inserts [18:13:48] one thing about the replag being an ongoing issue is something could have lowered the ceiling at which we will lag [18:13:53] e.g. the size of some tables [18:14:40] OK, they're taking about a minute each [18:14:50] oh, but only 1 second of that is inserts [18:15:05] the rest is API and SFTP back and forth with Silverpop [18:15:12] so yeah, they wouldn't be the problem any more [18:16:02] but we added a lot of data really fast when it started right? [18:17:19] Fundraising-Backlog, Wikimedia-Fundraising-CiviCRM, FR-Email: Omnimail recipient load (silently) broken - https://phabricator.wikimedia.org/T175394#3592379 (Ejegg) [18:17:33] cwd yeah, I think so [18:18:58] The tables are civicrm_mailing_provider_data [18:18:59] and... [18:19:15] civicrm_mailing_stats [18:19:19] are those huge? [18:19:37] Well, I gotta get some lunch [18:20:28] cool [18:20:43] what i would be most interested in would be tables that have heavy activity during periods of high load [18:20:47] not sure if those qualify or not [18:49:08] cwd load on those tables would be totally orthogonal to load on the rest of the tables that get stressed during big banner tests [18:50:06] ok, i was wondering if they were selected or joined on for any of the consumers [18:50:16] if not we can probably rule them out [18:50:21] nah, they're pretty standalong [18:50:24] *alone [18:54:33] Well, it's about time to give SmashPig a version number [18:56:40] 0.5 sound right? 
[19:01:00] (PS1) Ejegg: Finally set a version number - 0.5 [wikimedia/fundraising/SmashPig] - https://gerrit.wikimedia.org/r/376775 (https://phabricator.wikimedia.org/T133556) [19:52:16] Fundraising Sprint Quill Pencil, Fundraising-Backlog, MediaWiki-extensions-CentralNotice, Wikimedia-CentralNotice-Administration, Unplanned-Sprint-Work: CN Campaign Suppression prior to scheduled start time - https://phabricator.wikimedia.org/T175358#3591257 (DStrine) [20:01:43] (PS9) Ejegg: Upgrade to new minfraud Composer package [extensions/DonationInterface] - https://gerrit.wikimedia.org/r/375454 (https://phabricator.wikimedia.org/T128902) [20:25:09] Jeff_Green: bahahaha, to continue the conversation from yesterday... [20:25:13] ...fredge. [20:25:21] fredgerator [20:25:22] yes [20:25:42] As far as I know, Fredgerick replaced the need for the fraud mailer. [20:26:26] I'd like to think we got fredge up and running right about the same time the fraud mailer stopped, but in retrospect there's probably no way we were that coordinated about it. [20:26:39] (and by "we", I basically mean "me") [20:28:10] ok [20:29:43] So, yeah. The important part is being able to confirm that we stopped caring about the mailer somewhere around... three years ago? Four? [20:30:35] I think I lost track of the original thread, or maybe my brain is just broken from staring at graphs too long, does this mean there's something we can shut down now? [20:31:07] Unclear. [20:31:16] ejegg: Hellooooo. [20:31:40] Fundraising Sprint Quill Pencil, Fundraising-Backlog, Wikimedia-Fundraising-CiviCRM, FR-Ingenico, and 2 others: Don't retry recurring donations after expired card or fraud flag errors - https://phabricator.wikimedia.org/T174450#3592884 (Ejegg) Open>Resolved [20:32:05] ejegg: Is there something you found in Minfraud, that you could clean up [read: delete] around the old fraud mailer functionality? [20:46:00] K4-713: oh hi! [20:47:06] I like how we're having two conversations at once. 
[20:47:25] I basically just rewrote the minfraud filter: https://gerrit.wikimedia.org/r/375454 [20:57:05] I think the old mailer must have been parsing the logs [20:57:16] the old link pointed to a separate subversion repository [20:57:57] the current code does try to send off a mail right after running the filter (once per day) if the reported queries remaining is below a threshold [20:58:10] but I'm not sure if that would even work in our current prod environment [20:58:49] ejegg: That was a different mailer thingy, yes. [20:58:59] I'm surprised that's not generating warnings or something. [20:59:41] I think we haven't let it get below the threshold lately [20:59:42] ejegg: Do we still get a query count back from minfraud every time we run a query? [20:59:49] yep! [20:59:59] We should stuff that number in prometheus. [21:00:10] pleeeese. [21:00:10] oh hey, I guess we could [21:00:44] * ejegg needs to learn how to send stuff to the new system [21:04:20] i have been working on getting the python client deployed [21:04:31] where we can send arbitrary data [21:05:26] it uses some rather objectionable python idioms but seems to work well enough [21:08:15] * dstrine thinks of idioms... https://media.giphy.com/media/svAbhRO8jBMOs/giphy.gif [21:12:09] have we heard from AndyRussG ? [21:22:17] eileen: not to worry, he's safe! [21:23:20] yay [21:42:27] eileen: HI THANKS! [21:42:32] woops capslock [21:42:54] eileen: yes was about to send a reply to your very thoughtful e-mail :) [21:43:48] All good, in fact in this part of town we didn't even notice it. Other parts of Mexico City did feel it pretty strongly, but there was hardly any damage in this part of the country. 
Mostly in the south-west [21:43:56] thanks so much for checking :) [21:44:20] AndyRussG: good to hear [21:44:33] I looked at a map & it looked like you would be ok - but 8 is big [21:48:51] (PS5) Ejegg: Optionally send more Minfraud parameters [extensions/DonationInterface] - https://gerrit.wikimedia.org/r/275043 (https://phabricator.wikimedia.org/T128902) [21:49:25] Fundraising Sprint Quill Pencil, Fundraising-Backlog, Wikimedia-Fundraising-CiviCRM, Fr-CiviCRM-dedupe-FY2017/18, Patch-For-Review: Find source of unlimited dedupe queries, prevent them - https://phabricator.wikimedia.org/T175382#3593060 (Ejegg) a:Ejegg [21:50:12] Fundraising Sprint Quill Pencil, Fundraising-Backlog, Wikimedia-Fundraising-CiviCRM, Fr-CiviCRM-dedupe-FY2017/18, Patch-For-Review: Find source of unlimited dedupe queries, prevent them - https://phabricator.wikimedia.org/T175382#3592103 (Ejegg) Patch for review is just the diagnostic part of... [21:52:09] Fundraising Sprint Quill Pencil, Fundraising-Backlog, MediaWiki-extensions-DonationInterface, Patch-For-Review: Send more parameters to Maxmind minfraud service - https://phabricator.wikimedia.org/T128902#2089698 (Ejegg) This turned out to require upgrading our minFraud SDK and basically rewritin... [21:53:19] Fundraising Sprint Quill Pencil, Fundraising-Backlog, FR-Ingenico, MediaWiki-extensions-DonationInterface, Spike: spike: investigate creating an ingenico form with no city and state - https://phabricator.wikimedia.org/T151769#2827507 (Ejegg) p:Triage>Normal a:Ejegg [21:57:47] (PS2) Ejegg: Fix a couple auto-inserted braces, break lines [wikimedia/fundraising/SmashPig] - https://gerrit.wikimedia.org/r/376306 [21:57:49] (PS3) Ejegg: Short array syntax [wikimedia/fundraising/SmashPig] - https://gerrit.wikimedia.org/r/376307 [21:59:44] (PS5) Ejegg: WIP getHostedCheckoutStatus [extensions/DonationInterface] - https://gerrit.wikimedia.org/r/366167 (https://phabricator.wikimedia.org/T163948) [22:17:04] eileen: yeah... 
people in Mexico City are pretty sensitive about quakes, too, since there was a very tragic one in 1985... https://en.wikipedia.org/wiki/1985_Mexico_City_earthquake [22:17:14] a lot of folks remember that [22:18:06] I was just reading that in NZ there are a lot of quakes, too! [22:24:31] Fundraising Sprint Quill Pencil, Fundraising-Backlog, MediaWiki-extensions-CentralNotice, Wikimedia-CentralNotice-Administration, Unplanned-Sprint-Work: CN Campaign Suppression prior to scheduled start time - https://phabricator.wikimedia.org/T175358#3593149 (AndyRussG) @Jseddon hey... I'm su... [22:47:25] AndyRussG: yeah not so many quakes where I live but NZ in general has quite a few [22:47:53] enough that we hear the number on an earthquake & it is meaningful to us :-) [22:48:07] right [22:48:25] Yeah here too that scale is commonly understood [22:48:56] Your home seems pretty solid in any case... Made of those nice solid Google Hangout pixels :) [22:50:13] (PS1) Ejegg: WIP allow alternate configurations with 'variant' option [extensions/DonationInterface] - https://gerrit.wikimedia.org/r/376850 (https://phabricator.wikimedia.org/T151769) [22:50:30] OK, I'm out for the weekend. Have a good one, everyone! [22:51:02] (PS1) VolkerE: Use consistent close icon [extensions/CentralNotice] - https://gerrit.wikimedia.org/r/376851 (https://phabricator.wikimedia.org/T50067) [22:51:26] (CR) jerkins-bot: [V: -1] WIP allow alternate configurations with 'variant' option [extensions/DonationInterface] - https://gerrit.wikimedia.org/r/376850 (https://phabricator.wikimedia.org/T151769) (owner: Ejegg) [23:30:08] PROBLEM - check_puppetrun on americium is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 6 minutes ago with 1 failures. Failed resources (up to 3 shown): Exec[generate-kafkatee.pyconf] [23:35:08] RECOVERY - check_puppetrun on americium is OK: OK: Puppet is currently enabled, last run 40 seconds ago with 0 failures
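
Note on the 20:59:59 suggestion to put the minFraud remaining-query count into Prometheus: the log does not say how the Python client mentioned at 21:04:20 submits data, so the following is only a rough sketch of one possible approach. It renders a single gauge in the Prometheus text exposition format and writes it atomically to a file, as could be picked up by a textfile-style collector. The metric name, help text, function names, and the idea of using a file at all are assumptions made for illustration.

```python
import os
import tempfile


def render_gauge(name: str, help_text: str, value: float) -> str:
    """Render one gauge sample in the Prometheus text exposition format."""
    return (
        f"# HELP {name} {help_text}\n"
        f"# TYPE {name} gauge\n"
        f"{name} {value}\n"
    )


def write_metrics_file(path: str, text: str) -> None:
    """Write the metrics text atomically so a reader never sees a partial file."""
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        handle.write(text)
    # os.replace is an atomic rename when source and target share a filesystem.
    os.replace(tmp_path, path)


# Hypothetical metric name and value; the real count would come from the
# "queries remaining" figure returned with each minFraud response.
EXAMPLE_TEXT = render_gauge(
    "minfraud_queries_remaining",
    "Queries remaining as reported by the minFraud service",
    12345,
)
```

With the count exported as a gauge, the low-threshold warning that the old code tried to send by mail could instead be expressed as an alert rule on the metric.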