[01:51:18] (CR) AndyRussG: Controls to purge banner content from front-end cache for a language (6 comments) [extensions/CentralNotice] - https://gerrit.wikimedia.org/r/364910 (https://phabricator.wikimedia.org/T168673) (owner: AndyRussG)
[03:07:12] (PS1) Eileen: Add filters to mailing report. [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/367847 (https://phabricator.wikimedia.org/T161758)
[03:30:54] (PS1) Eileen: Omnimailing - extendeded mailing report - add suppressed [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/367848 (https://phabricator.wikimedia.org/T161758)
[03:56:00] Fundraising Sprint Gondwanaland Reunification Engine, Fundraising Sprint Homebrew Hadron Collider, Fundraising Sprint Ivory Tower Defense Games, Fundraising Sprint Judgement Suspenders, and 8 others: retrieve the text/ html and statistics data for m... - https://phabricator.wikimedia.org/T161758#3473641
[03:56:33] Fundraising Sprint Loose Lego Carpeting, Fundraising Sprint Murphy's Lawyer, Fundraising Sprint Navel Warfare, Fundraising-Backlog, and 2 others: Add ability for MG to import to Primary address type - https://phabricator.wikimedia.org/T169025#3473642 (Eileenmcnaughton)
[03:56:41] Fundraising Sprint Loose Lego Carpeting, Fundraising Sprint Murphy's Lawyer, Fundraising Sprint Navel Warfare, Fundraising-Backlog, and 2 others: Add ability for MG to import to Primary address type - https://phabricator.wikimedia.org/T169025#3384910 (Eileenmcnaughton) Open>Resolved
[04:01:37] Fundraising Sprint Judgement Suspenders, Fundraising Sprint Kickstopper, Fundraising-Backlog, Wikimedia-Fundraising-CiviCRM, and 2 others: Silverpop Figure out how to deal with merged contacts mailing records - https://phabricator.wikimedia.org/T171703#3473652 (Eileenmcnaughton)
[04:21:19] (PS1) Eileen: Fix typo causing enotice & suppressed not to populate [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/367851 (https://phabricator.wikimedia.org/T161758)
[04:48:44] Fundraising Sprint Gondwanaland Reunification Engine, Fundraising Sprint Homebrew Hadron Collider, Fundraising Sprint Ivory Tower Defense Games, Fundraising Sprint Judgement Suspenders, and 8 others: Drush not handling spaces in quotes / schedule Si... - https://phabricator.wikimedia.org/T171435#3473673
[04:55:17] PROBLEM - check_mysql on frdb2001 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 2186
[04:59:13] Hmm that alert might be just the volume of mailing data being loaded
[05:00:08] RECOVERY - check_mysql on frdb2001 is OK: Uptime: 1260470 Threads: 1 Questions: 46313750 Slow queries: 6781 Opens: 9557 Flush tables: 1 Open tables: 608 Queries per second avg: 36.743 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 0
[09:32:02] Fundraising-Backlog, Wikimedia-Fundraising-CiviCRM, Patch-For-Review: Populate country column when creating c_t rows during offline import - https://phabricator.wikimedia.org/T171658#3474004 (Pcoombe)
[10:37:44] (Abandoned) Hashar: Jenkins job validation (DO NOT SUBMIT) [extensions/DonationInterface] - https://gerrit.wikimedia.org/r/324066 (owner: Hashar)
[11:08:18] (CR) Hashar: "recheck" [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/363141 (owner: Hashar)
[11:09:49] (CR) jerkins-bot: [V: -1] CI: install CiviCRM with a fake sendmail [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/363141 (owner: Hashar)
[11:12:01] (PS2) Hashar: CI: install CiviCRM with a fake sendmail [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/363141
[11:13:32] (CR) jerkins-bot: [V: -1] CI: install CiviCRM with a fake sendmail [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/363141 (owner: Hashar)
[11:13:51] (CR) Hashar: "The mysterious failure only occurs on integration-slave-jessie-1001 while -1002 works fine bah" [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/363141 (owner: Hashar)
[11:17:39] Wikimedia-Fundraising-CiviCRM, Continuous-Integration-Infrastructure, Release-Engineering-Team (Kanban): wikimedia-fundraising-civicrm fails with Call to a member function getDriver() on null in phar:///srv/jenkins-workspace/workspace/wikimedia-fundrais... - https://phabricator.wikimedia.org/T171724#3474287
[12:07:24] Wikimedia-Fundraising-CiviCRM, Continuous-Integration-Infrastructure, Release-Engineering-Team (Kanban): wikimedia-fundraising-civicrm fails with Call to a member function getDriver() on null in phar:///srv/jenkins-workspace/workspace/wikimedia-fundrais... - https://phabricator.wikimedia.org/T171724#3474393
[12:09:15] (CR) Hashar: "it fails on both hosts :(" [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/363141 (owner: Hashar)
[12:32:02] (Restored) Hashar: Jenkins job validation (DO NOT SUBMIT) [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/151840 (owner: Hashar)
[12:32:08] (PS4) Hashar: Jenkins job validation (DO NOT SUBMIT) [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/151840
[12:51:44] (CR) Hashar: "Found it:" [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/363141 (owner: Hashar)
[12:55:05] Wikimedia-Fundraising-CiviCRM, Continuous-Integration-Infrastructure, Release-Engineering-Team (Kanban): wikimedia-fundraising-civicrm fails with Call to a member function getDriver() on null in phar:///srv/jenkins-workspace/workspace/wikimedia-fundrais... - https://phabricator.wikimedia.org/T171724#3474628
[12:56:59] (PS1) Hashar: (DO NOT SUBMIT) Strict and verbose ci-populate-dbs [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/367886 (https://phabricator.wikimedia.org/T171724)
[12:58:36] (CR) jerkins-bot: [V: -1] (DO NOT SUBMIT) Strict and verbose ci-populate-dbs [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/367886 (https://phabricator.wikimedia.org/T171724) (owner: Hashar)
[13:03:44] (CR) Hashar: [C: -1] "sendmail_path is php.ini setting not an amp one. Found out via travis.yaml file:" [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/363141 (owner: Hashar)
[13:05:49] Good morning MBeat! Big English banners just went up
[13:06:03] thanks, pcoombe !
[13:06:37] I like the earlier start time
[13:34:04] (PS2) Hashar: Make CI scripts more stricts [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/367886 (https://phabricator.wikimedia.org/T171724)
[13:45:10] PROBLEM - check_mysql on frdb2001 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3394
[13:50:06] (PS3) Hashar: Make CI scripts more stricts [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/367886 (https://phabricator.wikimedia.org/T171724)
[13:50:10] PROBLEM - check_mysql on frdb2001 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3390
[13:51:10] (PS1) Hashar: Fix misc bash oddities in the CI scripts [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/367893
[13:55:11] PROBLEM - check_mysql on frdb2001 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3376
[14:00:10] PROBLEM - check_mysql on frdb2001 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3375
[14:05:10] PROBLEM - check_mysql on frdb2001 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3369
[14:05:47] (PS3) Hashar: CI: install CiviCRM with a fake sendmail [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/363141 (https://phabricator.wikimedia.org/T161724)
[14:06:27] MBeat: Big English test just finished. All looks good from my end :)
[14:06:39] (PS4) Hashar: CI: install CiviCRM with a fake sendmail [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/363141 (https://phabricator.wikimedia.org/T161724)
[14:06:44] great, thank you! nothing glaring in Zendesk
[14:10:10] PROBLEM - check_mysql on frdb2001 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3419
[14:11:44] (PS5) Hashar: CI: install CiviCRM with a fake sendmail [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/363141 (https://phabricator.wikimedia.org/T161724)
[14:13:11] (CR) jerkins-bot: [V: -1] CI: install CiviCRM with a fake sendmail [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/363141 (https://phabricator.wikimedia.org/T161724) (owner: Hashar)
[14:13:32] Wikimedia-Fundraising-CiviCRM, Continuous-Integration-Infrastructure, Patch-For-Review, Release-Engineering-Team (Kanban): wikimedia-fundraising-civicrm fails with Call to a member function getDriver() on null in phar:///srv/jenkins-workspace/worksp... - https://phabricator.wikimedia.org/T171724#3474819
[14:15:10] PROBLEM - check_mysql on frdb2001 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3449
[14:15:11] PROBLEM - check_mysql on frdb1002 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1215
[14:20:10] PROBLEM - check_mysql on frdb1002 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1301
[14:20:11] PROBLEM - check_mysql on frdb2001 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3458
[14:25:10] PROBLEM - check_mysql on frdb1002 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1394
[14:25:15] PROBLEM - check_mysql on frdb2001 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3454
[14:30:10] PROBLEM - check_mysql on frdb1002 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1479
[14:30:10] PROBLEM - check_mysql on frdb2001 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3458
[14:35:10] PROBLEM - check_mysql on frdb1002 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1555
[14:35:20] PROBLEM - check_mysql on frdb2001 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3460
[14:40:10] PROBLEM - check_mysql on frdb2001 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3450
[14:40:11] PROBLEM - check_mysql on payments2001 is CRITICAL: Slave IO: Connecting Slave SQL: Yes Seconds Behind Master: (null)
[14:45:10] RECOVERY - check_mysql on frdb1002 is OK: Uptime: 1296929 Threads: 1 Questions: 69285377 Slow queries: 7992 Opens: 10298 Flush tables: 1 Open tables: 610 Queries per second avg: 53.422 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 0
[14:45:20] PROBLEM - check_mysql on frdb2001 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3470
[14:45:21] PROBLEM - check_mysql on payments2001 is CRITICAL: Slave IO: Connecting Slave SQL: Yes Seconds Behind Master: (null)
[14:47:24] (CR) Mepps: "One quick question but overall looks good" (1 comment) [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/365543 (owner: Eileen)
[14:50:10] PROBLEM - check_mysql on frdb2001 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3498
[14:50:11] PROBLEM - check_mysql on payments2001 is CRITICAL: Slave IO: Connecting Slave SQL: Yes Seconds Behind Master: (null)
[14:51:21] oh wow, those silverpop activity imports are brutal on the db...
[14:55:20] PROBLEM - check_mysql on frdb2001 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3505
[14:55:20] PROBLEM - check_mysql on payments2001 is CRITICAL: Slave IO: Connecting Slave SQL: Yes Seconds Behind Master: (null)
[14:57:58] hopefully we can move some of these type of alerts to prometheus and have more of a sliding scale
[15:00:10] PROBLEM - check_mysql on frdb2001 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3530
[15:00:20] PROBLEM - check_mysql on payments2001 is CRITICAL: Slave IO: Connecting Slave SQL: Yes Seconds Behind Master: (null)
[15:03:33] Wikimedia-Fundraising-CiviCRM, Continuous-Integration-Infrastructure, Patch-For-Review, Release-Engineering-Team (Kanban): wikimedia-fundraising-civicrm fails with Call to a member function getDriver() on null in phar:///srv/jenkins-workspace/worksp... - https://phabricator.wikimedia.org/T171724#3474977
[15:05:10] PROBLEM - check_mysql on frdb2001 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3570
[15:05:11] PROBLEM - check_mysql on payments2001 is CRITICAL: Slave IO: Connecting Slave SQL: Yes Seconds Behind Master: (null)
[15:05:49] cwd ah, also this is an initial load of all of last year's bulk mailings
[15:06:15] k, rain has lessened, gonna bike the rest of the way to the office
[15:10:10] PROBLEM - check_mysql on frdb2001 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 3596
[15:10:11] PROBLEM - check_mysql on payments2001 is CRITICAL: Slave IO: Connecting Slave SQL: Yes Seconds Behind Master: (null)
[15:15:10] RECOVERY - check_mysql on frdb2001 is OK: Uptime: 1297370 Threads: 1 Questions: 68625910 Slow queries: 7270 Opens: 10231 Flush tables: 1 Open tables: 608 Queries per second avg: 52.896 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 0
[15:15:13] AndyRussG my head is in the clouds today, do you want to meet?
[15:15:20] RECOVERY - check_mysql on payments2001 is OK: Uptime: 1789429 Threads: 4 Questions: 18090 Slow queries: 0 Opens: 17 Flush tables: 1 Open tables: 80 Queries per second avg: 0.010 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 0
[15:26:42] hi mepps and AndyRussG
[15:26:52] hi ejegg!
[15:27:46] ejegg: hey, got a second to talk about the mysql issue?
[15:27:55] cwd sure
[15:28:18] afaik it should dissipate soon
[15:28:24] basically we want to avoid excessive replag
[15:28:40] if anything happened to the master at this point we'd lose data permanently
[15:28:41] lemme see how far through the past year it's gotten
[15:28:55] plus i have 250+ text messages from icinga
[15:29:26] yeah, so we should do something special when we need to run huge initial data loads like this one
[15:29:45] yeah
[15:29:54] what was the actual process? a drush command?
[15:30:10] PROBLEM - check_mysql on frdb2001 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1932
[15:30:18] yep, it's the new process-control job omnimail_recipient_load;
[15:30:50] gotcha
[15:31:29] ejegg: would you mind writing a note to tech explaining what happened so we can have a conversation about how to handle it more generally?
[15:31:41] i'm sure there's no catch-all fix to long running queries
[15:31:47] but we could discuss different approaches
[15:32:15] cwd it might not even be a long-running query, just a sustained volume of inserts
[15:33:11] it's added 22M rows to civicrm_mailing_provider_data over the past 10 hrs, and 6M to civicrm_mailing_recipients
[15:34:33] mepps: oooops!!!!!
[15:34:43] aaaarg
[15:34:45] sorrrryy!
[15:34:49] Totally spaced out
[15:35:20] PROBLEM - check_mysql on frdb2001 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1289
[15:35:26] ejegg: let's stop the job and evaluate
[15:35:40] how about in 1/2 hour?
[15:40:10] PROBLEM - check_mysql on frdb2001 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1259
[15:40:49] cwd wrote that note
[15:41:19] thanks!
[15:41:19] (Abandoned) Awight: WIP: DonationForm [extensions/DonationInterface] - https://gerrit.wikimedia.org/r/61929 (owner: Awight)
[15:41:28] (Abandoned) Awight: WIP device filtering in GlobalAllocation [extensions/CentralNotice] - https://gerrit.wikimedia.org/r/63100 (owner: Awight)
[15:41:30] (Abandoned) Awight: [WIP] Minor payment_submethod cleanup [extensions/DonationInterface] - https://gerrit.wikimedia.org/r/64236 (owner: Awight)
[15:41:33] (Abandoned) Awight: WIP Adapter is not always initialized with data [extensions/DonationInterface] - https://gerrit.wikimedia.org/r/64345 (owner: Awight)
[15:41:35] (Abandoned) Awight: [WIP] GatewayAdapter::isSupported [extensions/DonationInterface] - https://gerrit.wikimedia.org/r/64872 (owner: Awight)
[15:41:39] ooh, ghosts!
[15:41:41] (Abandoned) Awight: WIP tests for the return_value_map [extensions/DonationInterface] - https://gerrit.wikimedia.org/r/86790 (owner: Awight)
[15:41:50] (Abandoned) Awight: WIP dedupe report [wikimedia/fundraising/tools] - https://gerrit.wikimedia.org/r/93429 (owner: Awight)
[15:41:54] AndyRussG, that should work, I'll be here until your time 11:45/12:45 EST
[15:41:57] somebody doesn't want to be haunted anymore
[15:42:08] mepps: ok thanks!!!! many apologies
[15:42:17] the power of gerrit compels you!
[15:42:18] AndyRussG, it's okay I spaced too
[15:42:45] (Abandoned) Awight: [WIP] protect findAccount in case there is no -AccountInfo [extensions/DonationInterface] - https://gerrit.wikimedia.org/r/95873 (owner: Awight)
[15:42:49] (Abandoned) Awight: WIP example worldpay audit conf [wikimedia/fundraising/tools] - https://gerrit.wikimedia.org/r/129212 (owner: Awight)
[15:42:51] (Abandoned) Awight: [WIP] opt_out preferences apply to email addresses separately [wikimedia/fundraising/tools] - https://gerrit.wikimedia.org/r/133181 (owner: Awight)
[15:42:55] (Abandoned) Awight: [WIP] Transparent workaround for access control [extensions/DonationInterface] (php54_test_adapter_collapse) - https://gerrit.wikimedia.org/r/133509 (owner: Awight)
[15:43:37] lol
[15:43:45] * awight rattles chains in the wings
[15:43:50] awight: boo!
[15:44:10] * awight faints at potentially running into a real ghost
[15:45:18] PROBLEM - check_mysql on frdb2001 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1235
[15:49:45] Jeff_Green: ok, rescheduled it
[15:49:54] oops, yaml error
[15:49:57] to never?
[15:50:08] PROBLEM - check_mysql on frdb2001 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1213
[15:50:09] to after work hrs
[15:50:31] please just shut it off until we can evaluate
[15:50:57] it really should have been shut off as soon as it was evident it was causing the replag
[15:51:04] ah, ok
[15:51:33] i spent a chunk of this AM looking for it but because it was small queries I didn't connect the lag to the job
[15:52:48] i don't know enough about what it's doing to try to look for a better option, but it might be better to schedule downtime and do it as a bulk insert
[15:54:34] i would agree that if the lag is totally unavoidable we should schedule downtime to do it
[15:54:37] ok, it's out of the schedule
[15:54:51] but maybe there's an easier way to throttle it on the way in
[15:55:15] i imagine it'll be different for every case
[15:55:57] if those API calls are each wrapped in a big txn, we could at least break those down
[15:56:31] yeah, although there could be adverse effects to that too
[15:56:44] like if multiple txns caused excessive index recalculation
[15:56:58] oh?
[15:57:14] pure conjecture
[15:57:23] what's the end result in the db of the API call, does it just add a row to one table or is it touching a bunch of tables?
[15:57:24] so, if a txn runs for 20 minutes, that means 20 minutes of replag, right?
[15:57:51] hmm, probably depends on the isolation level of the txn
[15:58:18] Jeff Green looks like inserting to two tables, both only used for this 3rd party mailer data
[15:58:44] hmm, I see the inserts to one table in the code, maybe the other table is populated by a different call?
[16:00:48] that's a good question re. replag, I'm a little surprised it was so bad for single row inserts
[16:01:37] i wonder if it would be better to do something like temporarily disable indexes, "load data infile" or similar, and reenable indexes
[16:02:36] is this a once-a-year kind of thing? or is this the beginning of something more frequent?
[16:02:59] Jeff_Green: this is a once-in-a-lifetime initial load of a year's worth of data
[16:03:12] ok
[16:03:13] once we're up to date, we'll be loading the past half hour's worth each time
[16:03:47] how much is left to do?
[16:04:09] lemme see, we're up to at least the 10th of December
[16:04:23] which should be the majority of the mailings
[16:04:36] I'll see if I can get stats in the Silverpop console
[16:04:41] k
[16:05:32] mepps: anytime now is cool!
[16:07:58] AndyRussG, great, meet in queenmary?
[16:08:16] mepps: K!
[16:08:28] * Jeff_Green afk for lunch, biab
[16:53:44] Wikimedia-Fundraising-CiviCRM, Continuous-Integration-Infrastructure, Patch-For-Review, Release-Engineering-Team (Kanban): wikimedia-fundraising-civicrm fails with Call to a member function getDriver() on null in phar:///srv/jenkins-workspace/worksp... - https://phabricator.wikimedia.org/T171724#3475326
[16:55:12] PROBLEM - check_mysql on frdb1002 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1233
[16:56:44] hmmm
[16:56:52] hmm indeed
[16:57:06] checking on the job
[16:58:35] oh hey, the report in the silverpop console came back
[17:00:12] PROBLEM - check_mysql on frdb1002 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1297
[17:00:48] ejegg: INSERT IGNORE INTO civicrm_mailing_provider_dat
[17:00:56] did that thing restart itself or something?
[17:01:47] cwd still running the job from almost 2 hours ago
[17:02:09] no idea how it decides on the batch size
[17:03:12] ah, it's tracking click throughs, opens, and bounces as well
[17:05:12] PROBLEM - check_mysql on frdb1002 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1370
[17:08:45] cool! silverpop -> Civi syncing?
[17:09:52] awight: yeppers!
[17:10:05] so far just dumping into an unconnected table
[17:10:12] PROBLEM - check_mysql on frdb1002 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1444
[17:10:13] but there's a flag for 'processed into civi'
[17:10:53] meaning we'll be making actual civimail records at some point, i think
[17:11:03] fr-tech any news or requests for scrum of scrums?
[17:11:56] nice way to do it.
[17:12:37] ejegg: you could mention that we are starting the prometheus transition, i think other teams will be happy about that
[17:12:44] ganglia->prometheus
[17:12:49] sure thing!
[17:13:12] ty!
[17:15:12] PROBLEM - check_mysql on frdb1002 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1519
[17:20:12] PROBLEM - check_mysql on frdb1002 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1593
[17:23:30] cwd: How does that relate to grafana? Is the foundation also trying to go grafana -> prometheus?
[17:23:46] lol, I’m still trying to sort out ganglia -> grafana
[17:24:54] awight: prometheus provides the data, grafana provides the graphs
[17:25:01] so they work in concert
[17:25:08] aha, cool thanks
[17:25:12] PROBLEM - check_mysql on frdb1002 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1667
[17:25:23] yipes—and grafana provides the… something for icinga
[17:25:38] provides the extra layer of complexity ;-)
[17:25:45] does it? i thought icinga remained its own thing
[17:25:55] but i am only just starting this Great Adventure
[17:27:07] ejegg: thx, nothing here for now!
[17:27:21] haha the first step is for us to leave behind our egos, which tell us that we know anything at all
[17:29:26] amen brother
[17:29:45] another thing i don't know is why the replag on 1002 continues to grow
[17:29:57] i can't find any drush procs
[17:30:12] PROBLEM - check_mysql on frdb1002 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1747
[17:30:13] RECOVERY - check_mysql on frdb2001 is OK: Uptime: 1305470 Threads: 1 Questions: 78869929 Slow queries: 7270 Opens: 10721 Flush tables: 1 Open tables: 608 Queries per second avg: 60.414 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 0
[17:30:56] Jeff_Green: maybe i should restart replication there?
[17:31:22] State: closing tables
[17:31:23] cwd huh, the last run of the big recipient load finished about 20 min ago
[17:31:27] i wonder if something hung up
[17:31:37] cwd: I think it's just slow queries
[17:32:07] isn't it weird that 2001 caught up?
[17:32:37] you shouldn't need to restart replication if it's reporting that it's connected and whatnot
[17:33:27] i suspect what's happening on 2001 is that the degraded RAID means disk IO is reduced
[17:33:51] i haven't confirmed that but I've seen it happen before whenever hardware RAID is degraded
[17:34:11] that's the weird part, 2001 is the one that caught up, 1002 is still hollering
[17:34:18] orlly
[17:34:33] 1002 has other jobs that 2001 doesn't
[17:34:45] ah yeah good point
[17:35:02] it is the read-db fqdn yeah?
[17:35:12] PROBLEM - check_mysql on frdb1002 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1824
[17:35:48] processlist looks unexciting except for that "closing tables" thing that seems to be hung
[17:37:08] it is the read db yeah
[17:37:34] i saw a long-running "closing tables" earlier on frdb2001
[17:38:12] it would be nice to know ~which~ tables it's closing :-)
[17:38:19] srsly
[17:38:28] this seems like a common problem without a common solution
[17:40:12] RECOVERY - check_mysql on frdb1002 is OK: Uptime: 1307429 Threads: 1 Questions: 79100687 Slow queries: 7998 Opens: 10777 Flush tables: 1 Open tables: 610 Queries per second avg: 60.500 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 0
[17:40:31] welp
[17:40:36] i guess it was just really slow
[17:41:35] Jeff_Green: you didn't turn any knobs to fix that did you?
[17:41:46] nope
[17:42:12] and the next time I checked after that process ended, there's a new one this time with the query
[17:43:47] yeah i see that
[18:03:56] (CR) Ejegg: [C: 2] Fix typo causing enotice & suppressed not to populate [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/367851 (https://phabricator.wikimedia.org/T161758) (owner: Eileen)
[18:08:35] (CR) Mepps: [C: 2] Update SmashPig and DonationInterface [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/365069 (owner: Ejegg)
[18:09:22] woohoo, thanks mepps!
[18:09:58] cwd once i deploy ^^^ we can get rid of the legacy SmashPig.yaml
[18:11:40] fr-tech i have to take john to the doctor during standup today, will catch up on work a bit later this evening after james goes to sleep
[18:11:54] (Merged) jenkins-bot: Fix typo causing enotice & suppressed not to populate [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/367851 (https://phabricator.wikimedia.org/T161758) (owner: Eileen)
[18:11:54] fr-tech i'm still here until 3
[18:18:06] (Merged) jenkins-bot: Update SmashPig and DonationInterface [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/365069 (owner: Ejegg)
[18:33:42] mepps: hope everything's OK!
[18:34:10] AndyRussG, it is, it's a foot injury, not a critical illness
[18:39:03] hope it heals up quick! foot injuries are still a bummer
[18:44:29] ejegg true! yes i'm hoping it's nothing too major before all our travels coming up!
[18:45:46] mepps: ah... hope it gets better in a skip, hop and a jump :)
[18:54:04] ejegg: mepps: am I crazy, or is there something fundamentally wrong about this? https://github.com/wikimedia/mediawiki-extensions-CentralNotice/blob/e2a4ff9f87e9ee5a9daf5886e2bf0b7a64c8000f/special/SpecialCentralNotice.php#L820-L829
[18:54:13] AndyRussG, that might make it worse ;)
[18:54:42] awww :( sorry didn't mean it that way!
[18:56:06] about the CN code ^ seems that after a banner save, it displays many fields in the form just based on what was sent in the post request to save, not what was actually saved in the DB
[18:56:19] so users might think that settings were saved correctly, even if they weren't!
[18:56:43] AndyRussG ahh yeah that's a good catch
[18:58:12] maybe it was a "workaround" for DB replication lag... though it shouldn't be an issue, in theory, because something behind the scenes is supposed to make sure a user gets an up-to-date snapshot after they save
[18:58:25] (something in our DB infrastructure, IIRC)
[19:04:04] hmmm not sure that code ever runs, now
[19:08:07] hmm that seems problematic too
[19:09:40] welcome to CentralNotice!
[19:10:10] (PS1) Ejegg: Update SmashPig, DonationInterface, and dependencies [wikimedia/fundraising/crm/vendor] - https://gerrit.wikimedia.org/r/367943
[19:10:16] * AndyRussG tries to escape silly bitterness
[19:10:25] (CR) Ejegg: [C: 2] Update SmashPig, DonationInterface, and dependencies [wikimedia/fundraising/crm/vendor] - https://gerrit.wikimedia.org/r/367943 (owner: Ejegg)
[19:11:35] (PS1) Ejegg: Merge branch 'master' into deployment [wikimedia/fundraising/crm] (deployment) - https://gerrit.wikimedia.org/r/367944
[19:11:55] aaarg, no, it does run
[19:13:18] (CR) Ejegg: [C: 2] Merge branch 'master' into deployment [wikimedia/fundraising/crm] (deployment) - https://gerrit.wikimedia.org/r/367944 (owner: Ejegg)
[19:19:38] (Merged) jenkins-bot: Update SmashPig, DonationInterface, and dependencies [wikimedia/fundraising/crm/vendor] - https://gerrit.wikimedia.org/r/367943 (owner: Ejegg)
[19:19:40] (Merged) jenkins-bot: Merge branch 'master' into deployment [wikimedia/fundraising/crm] (deployment) - https://gerrit.wikimedia.org/r/367944 (owner: Ejegg)
[19:23:37] Fundraising-Backlog, MediaWiki-extensions-CentralNotice: CentralNotice: On saving a banner, form shows values from save request without checking DB - https://phabricator.wikimedia.org/T171774#3475902 (AndyRussG)
[19:23:59] ejegg: mepps: ^ task for the abovmentioned shiew
[19:24:08] *ishiew
[19:24:56] !log disabled queue consumers for CiviCRM update
[19:25:05] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:25:06] eschew?
[19:35:04] (PS1) Ejegg: hack out php54 polyfill stuff [wikimedia/fundraising/crm/vendor] - https://gerrit.wikimedia.org/r/367948
[19:35:14] (CR) Ejegg: [C: 2] hack out php54 polyfill stuff [wikimedia/fundraising/crm/vendor] - https://gerrit.wikimedia.org/r/367948 (owner: Ejegg)
[19:35:45] (PS1) Ejegg: Update vendor (get rid of php54 polyfill includes) [wikimedia/fundraising/crm] (deployment) - https://gerrit.wikimedia.org/r/367950
[19:35:53] (CR) Ejegg: [C: 2] Update vendor (get rid of php54 polyfill includes) [wikimedia/fundraising/crm] (deployment) - https://gerrit.wikimedia.org/r/367950 (owner: Ejegg)
[19:43:09] (Merged) jenkins-bot: hack out php54 polyfill stuff [wikimedia/fundraising/crm/vendor] - https://gerrit.wikimedia.org/r/367948 (owner: Ejegg)
[19:43:11] (Merged) jenkins-bot: Update vendor (get rid of php54 polyfill includes) [wikimedia/fundraising/crm] (deployment) - https://gerrit.wikimedia.org/r/367950 (owner: Ejegg)
[19:46:30] (PS1) Ejegg: Fixes for SmashPig update [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/367953
[19:47:11] oh hey, is fr-tech standup even happening today?
[20:07:52] (PS4) Ejegg: Make CI scripts more stricts [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/367886 (https://phabricator.wikimedia.org/T171724) (owner: Hashar)
[20:08:40] (CR) Ejegg: "Thanks, hashar!" [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/367886 (https://phabricator.wikimedia.org/T171724) (owner: Hashar)
[20:09:55] (CR) Eileen: Update Omnimail GET to add rml fields (1 comment) [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/365543 (owner: Eileen)
[20:10:18] (CR) Ejegg: [C: 2] Make CI scripts more stricts [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/367886 (https://phabricator.wikimedia.org/T171724) (owner: Hashar)
[20:16:35] (Merged) jenkins-bot: Make CI scripts more stricts [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/367886 (https://phabricator.wikimedia.org/T171724) (owner: Hashar)
[20:16:37] Fundraising-Backlog, fundraising-tech-ops: process-control repeated failure handling - https://phabricator.wikimedia.org/T161567#3476087 (cwdent) p:Normal>High Today's mailstrom (ha ha) warrants re-prioritizing this issue. p-c should stop jobs at a fail mail threshold, something like 5 mails in...
[20:16:53] dstrine: i hope it is ok that i moved this to high priority ^
[20:17:39] we have succumbed to "alert fatigue" on a few fronts and are seeing too many meaningless ones at this point
[20:21:51] cwd I was just talking about that in standup
[20:22:08] at least as concerns paypal audit parsing
[20:22:36] awesome
[20:22:57] seems like some annoying but predictable errors in the audit files?
[20:23:03] (PS4) Eileen: Update Omnimail GET to add rml fields [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/365543
[20:24:13] cwd yeah, it's a recurring thing where recurring payments are missing the subscription identifier
[20:24:22] meta-recurring bug
[20:25:28] think it's safe to catch it and bail in a way that doesn't send mail?
[20:25:45] don't want to make it too accepting of bad data or anything [20:25:50] but seems like it happens regularly [20:25:59] and i doubt we are going to get them to fix it [20:26:44] cwd we need to email them to get corrected files whenever it happens [20:27:01] and to yell at them some more about fixing the underlying bug [20:27:36] aah sure [20:27:43] so we don't want to just mask the failure [20:27:54] right [20:29:18] (PS2) Ejegg: Fix misc bash oddities in the CI scripts [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/367893 (owner: Hashar) [20:29:40] (CR) Ejegg: [C: 2] "Oh hey, that's a really handy tool!" [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/367893 (owner: Hashar) [20:30:28] ejegg: is a general purpose mechanism to shut p-c jobs off on excessive failure better for this case? [20:31:29] that would be great for this morning's "aborting, still running" errors [20:32:07] for the audit parsing, I think we'd want the parser itself to decide how many bad lines are too many and to shut itself down [20:32:19] at a predictable point rather than just being killed [20:32:34] sounds reasonable [20:32:39] catch that exception in a loop? [20:33:04] ejegg: thank you for the civicrm reviews / +2 [20:33:14] yeah, it's actually caught in a loop right now, there's just nothing accumulating the failures for a batch failmail or an action on too many [20:33:15] ejegg: sorry for the ton of spam I have emitted earlier today :( [20:33:25] hashar: thank you for the improvements! [20:33:47] heh, and whatever spam CI emitted was drowned out by fundraising's own monitoring spam [20:34:03] ejegg: all of that to move the civicrm job to Nodepool instances (and ensure jobs start with a fresh env on every build) [20:34:07] ahah [20:35:02] oh cool, so they'll be able to run on nodepool now! 
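Editorial sketch of the audit-parsing idea discussed above: the parser already catches bad lines in a loop, so the missing piece is something that accumulates the failures, produces a single batch failmail, and stops at a predictable point once too many lines are bad. Everything here is an assumption made for illustration, not the actual audit parser code: the three-field line format, the names `parse_line`, `parse_audit_file` and `failmail_body`, and the default threshold of 5.

```python
from dataclasses import dataclass, field


@dataclass
class ParseReport:
    parsed: list = field(default_factory=list)
    failures: list = field(default_factory=list)  # (line_number, reason) pairs
    aborted: bool = False


def parse_line(line):
    """Hypothetical line format: 'txn_id,subscription_id,amount'."""
    txn_id, subscription_id, amount = line.strip().split(",")
    if not subscription_id:
        raise ValueError("recurring payment missing subscription identifier")
    return {
        "txn_id": txn_id,
        "subscription_id": subscription_id,
        "amount": float(amount),
    }


def parse_audit_file(lines, max_failures=5):
    """Parse every line, collecting failures instead of mailing per line.

    Stops as soon as max_failures bad lines have been seen, so the parser
    shuts itself down rather than being killed from outside.
    """
    report = ParseReport()
    for number, line in enumerate(lines, start=1):
        try:
            report.parsed.append(parse_line(line))
        except ValueError as err:
            report.failures.append((number, str(err)))
            if len(report.failures) >= max_failures:
                report.aborted = True
                break
    return report


def failmail_body(report):
    """Build one summary message for the whole file, or None if it was clean."""
    if not report.failures:
        return None
    status = "aborted" if report.aborted else "completed with errors"
    details = "\n".join(f"line {n}: {reason}" for n, reason in report.failures)
    return f"Audit parse {status}: {len(report.failures)} bad line(s)\n{details}"
```

The failure is still reported, so nothing is masked and corrected files can still be requested, but a bad file yields one mail instead of one per line.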
[20:35:31] (Merged) jenkins-bot: Fix misc bash oddities in the CI scripts [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/367893 (owner: Hashar) [20:36:37] hashar: oh yeah, looks like that 'move to nodepool' patch has been in the works since before we updated things to request php5.6 [20:37:00] which should still be fine on nodepool, right? [20:43:16] ejegg: hopefully :] [20:43:32] I will probably craft a transient job to test it is working all fine [20:43:34] then switch [20:43:54] cool [20:43:54] I dont want you people to be blocked by CI randomly voting -1 on everything [20:44:05] heh, that would indeed be an impediment [20:45:09] (PS6) Hashar: CI: install CiviCRM with a fake sendmail [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/363141 (https://phabricator.wikimedia.org/T161724) [20:47:20] (CR) Hashar: "Untested. That is meant to let me move the wikimedia-fundraising-civicrm job toward Nodepool instances and thus ensure a clean env on ever" [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/363141 (https://phabricator.wikimedia.org/T161724) (owner: Hashar) [20:47:42] ejegg: and that last one, I havent tested it at all. I think I will take care of it tomorrow [20:47:52] k, have a good evening! [20:48:06] if the CI pass on an instance that lacks sendmail, I think I will +2 it [20:48:11] and then migrate the jenkins job [20:48:25] but yeah tomorrow. I dont want to break anything when developers are active :] [20:48:29] thanks for all the reviews! [20:48:46] thanks again for keeping our stuff up to date [21:08:37] (CR) Ejegg: "Looks great, just a few questions inline" (4 comments) [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/365543 (owner: Eileen) [21:13:35] ejegg: cwd just catching up here. RE: T161567 does this need to be done soon? 
Please note I don't expect a lot to get done next sprint cause it's wikimania time [21:13:36] T161567: process-control repeated failure handling - https://phabricator.wikimedia.org/T161567 [21:13:53] dstrine: it's another nice-to-have [21:14:01] ok [21:14:04] but even if we code it, I'm not sure how long it'll take to get out [21:14:09] hmm [21:14:16] ok [21:14:27] since deployment of that particular tool seems to be a huge headache for ops [21:22:37] heh, we can make time [21:22:43] saves headaches like this morning [21:23:04] I'm more concerned about that die-silently-on-bad-utf8 bug that's still out there [21:23:36] that's cause you didn't get 250 text messages at 7am today :) [21:23:53] but srsly we can roll up a new p-c soon [21:24:21] hopefully knock out all those things [21:27:49] guys just retrying the silverpop get with a shorter time period (from command line not scheduled) to see if that gets through without hurting stuff [21:29:15] (PS10) Ejegg: Unify queue message handling with SmashPig [extensions/DonationInterface] - https://gerrit.wikimedia.org/r/355453 (https://phabricator.wikimedia.org/T95647) [21:29:54] eileen: cool, yeah doing it in smaller bits that don't cause the replag is a fine solution too [21:30:45] cwd right - the issue is to find the tolerance point - because unless I magically find it first go there will be a few rounds of causing annoyance while I figure it out [21:31:01] (PS3) Ejegg: Add country to c_t rows created during imports [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/367806 (https://phabricator.wikimedia.org/T171658) [21:31:09] btw - I didn't get the emails about the delay - I thought I used to? 
[21:31:17] it will be a bit of a moving target too in case it coincides with some other job [21:31:32] eileen: i can check on that [21:31:53] cwd yeah - possibly - although I think most other jobs the sore point is not the db replication [21:32:05] my basic instructions at this point are to kill any process or query that creates replag warnings [21:32:15] ah ok [21:32:26] well hopefully my 12 hour one will sneak through [21:32:32] :) [21:36:00] eileen: as far as emails, i'm pretty sure it's a prod puppet thing so i'll have to file a ticket, also pretty sure it's just me and jeff right now cause other folks were getting annoyed [21:37:26] gotta run, back in a bit [21:44:23] (PS4) Ejegg: Add country to c_t rows created during imports [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/367806 (https://phabricator.wikimedia.org/T171658) [21:46:58] (CR) Eileen: "My head hurts - my reply was quite long & it disappeared!" (3 comments) [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/365543 (owner: Eileen) [21:53:32] cwd - my 12 hour run survived without any emails it seems! 
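Editorial sketch of the "smaller bits" approach discussed above: split the retrieval period into 12-hour windows and hold off on the next window while the replicas are lagging. This is only an illustration under stated assumptions. `fetch` and `get_lag_seconds` are hypothetical callables standing in for the Silverpop retrieval and a replication-lag check, and the 600-second tolerance and 30-second poll interval are made-up values, since the real tolerance point is exactly what is still being worked out in the conversation.

```python
import time
from datetime import datetime, timedelta


def split_range(start, end, window=timedelta(hours=12)):
    """Yield consecutive (lower, upper) windows covering start..end."""
    cursor = start
    while cursor < end:
        upper = min(cursor + window, end)
        yield cursor, upper
        cursor = upper


def run_in_windows(start, end, fetch, get_lag_seconds,
                   max_lag=600, poll_seconds=30, wait=time.sleep):
    """Call fetch(lower, upper) per window, pausing while replicas lag.

    fetch and get_lag_seconds are supplied by the caller (hypothetical
    hooks); wait is injectable so the loop can be tested without sleeping.
    """
    done = []
    for lower, upper in split_range(start, end):
        # Let replication catch up before loading the next chunk of data.
        while get_lag_seconds() > max_lag:
            wait(poll_seconds)
        fetch(lower, upper)
        done.append((lower, upper))
    return done
```

For example, `list(split_range(datetime(2017, 1, 1), datetime(2017, 1, 2, 6)))` (arbitrary dates) gives three windows: two of 12 hours and a final one of 6 hours.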
[21:55:51] what I noticed is that on staging the number of rows in that table is still going up even though finished on live - so I guess there is some replication lag - but not triggered the concern yet [21:57:52] (PS1) Ejegg: clean up insert_contribution_tracking signature [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/368103 [21:59:43] (PS5) Ejegg: Add country to c_t rows created during imports [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/367806 (https://phabricator.wikimedia.org/T171658) [22:00:00] hmm - not fully caught up yet - I guess it could still cross into the dreaded icinga terrory [22:00:18] (meant to say territory but I kinda like terrory) [22:01:01] hehe [22:07:54] oh - it's stopped updating on dev - survived - will try another 12 hours from the command line [22:09:11] eileen: ack, just realized the queue consumers are still off from my update attempt earlier! [22:09:41] would you mind blessing https://gerrit.wikimedia.org/r/367953 ? [22:09:51] then I'll re-deploy and turn stuff back on slowly [22:10:42] ok looking now [22:10:45] Thanks! [22:11:49] (PS6) Ejegg: Add country to c_t rows created during imports [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/367806 (https://phabricator.wikimedia.org/T171658) [22:13:17] Fundraising-Backlog, Wikimedia-Fundraising-CiviCRM: Direct Mail Appeal not reflecting in contribution records for event - https://phabricator.wikimedia.org/T171794#3476345 (LeanneS) [22:21:27] (CR) Eileen: [C: 2] "This seems consistent with other changes around the queue. The change is only really changing" [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/367953 (owner: Ejegg) [22:26:13] thanks eileen ! 
[22:27:28] (Merged) jenkins-bot: Fixes for SmashPig update [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/367953 (owner: Ejegg) [22:29:07] (PS1) Ejegg: Merge branch 'master' into deployment [wikimedia/fundraising/crm] (deployment) - https://gerrit.wikimedia.org/r/368105 [22:29:15] (CR) Ejegg: [C: 2] Merge branch 'master' into deployment [wikimedia/fundraising/crm] (deployment) - https://gerrit.wikimedia.org/r/368105 (owner: Ejegg) [22:29:41] trying another silverpop job [22:29:46] 12 hours [22:29:50] cool cool [22:30:02] (Merged) jenkins-bot: Merge branch 'master' into deployment [wikimedia/fundraising/crm] (deployment) - https://gerrit.wikimedia.org/r/368105 (owner: Ejegg) [22:30:07] will nurse it through peak big english & then schedule again [22:30:32] (PS1) Eileen: Update silverpopXmlConnector [wikimedia/fundraising/crm/vendor] - https://gerrit.wikimedia.org/r/368106 [22:30:36] eileen: I'm about to deploy that update again [22:30:47] unless you think it'll disrupt your running job [22:31:51] I don't think it will - the bottleneck seems to just be communicating db transactions to the other db [22:32:00] & that won't be a massive volume will it? [22:33:06] (CR) jerkins-bot: [V: -1] Unify queue message handling with SmashPig [extensions/DonationInterface] - https://gerrit.wikimedia.org/r/355453 (https://phabricator.wikimedia.org/T95647) (owner: Ejegg) [22:34:10] !log updated CiviCRM from 461900edc1e6f2443894b41c4bfa1c88160f9096 to fb83798f068ba3365a286e7f131eb5eb5b0e7aae [22:34:20] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [22:34:59] ok, that seemed not to break everything [22:35:16] phew [22:35:32] I feel like it would be a hard fail if it were one [22:35:59] yeah, we're only actually touching SmashPig stuff in the queue consumers and a few other places [22:36:15] I just tested the damaged message db re-queueing, and that worked [22:36:21] great! 
[22:36:29] so I'm going to turn on the queue consumers, starting with antifraud/init [22:36:36] I think I need to understand smash pig better [22:36:52] just want to get silverpop all dusted though at the moment [22:37:12] yeah, good call. smashpig is still kind of a basket of functionality [22:38:07] oh you merged that suppressed fix! I should get that deployed [22:38:13] !log reactivated antifraud / payment-init queue consumer [22:38:24] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [22:38:25] eileen that just went out along with the smashpig update! [22:38:39] sorry, should have mentioned it [22:39:39] ah great [22:39:47] I'll need to rerun some grabs [22:40:15] although it won't show on the report I want to send caitling to review without this https://gerrit.wikimedia.org/r/#/c/367848/ [22:40:39] failmail on queue just now? [22:43:59] eileen: oops, that was my fault - I ran the thing manually cause I was impatient waiting for the cronjob to fire [22:44:25] yep, looking at the filters and that one right now! 
[22:44:37] ok, queue consumers look fine, I'll turn the rest back on [22:46:20] !log reactivated remaining fundraising queue consumers [22:46:29] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [22:48:35] (PS1) Ejegg: Fix blank i18n message added by TranslateWiki [extensions/DonationInterface] - https://gerrit.wikimedia.org/r/368108 [22:48:40] (CR) Ejegg: [C: 2] Fix blank i18n message added by TranslateWiki [extensions/DonationInterface] - https://gerrit.wikimedia.org/r/368108 (owner: Ejegg) [22:49:16] (PS11) Ejegg: Unify queue message handling with SmashPig [extensions/DonationInterface] - https://gerrit.wikimedia.org/r/355453 (https://phabricator.wikimedia.org/T95647) [22:51:18] (CR) jerkins-bot: [V: -1] Fix blank i18n message added by TranslateWiki [extensions/DonationInterface] - https://gerrit.wikimedia.org/r/368108 (owner: Ejegg) [22:51:56] (PS2) Ejegg: Fix blank i18n message added by TranslateWiki [extensions/DonationInterface] - https://gerrit.wikimedia.org/r/368108 [22:52:13] (PS12) Ejegg: Unify queue message handling with SmashPig [extensions/DonationInterface] - https://gerrit.wikimedia.org/r/355453 (https://phabricator.wikimedia.org/T95647) [22:52:34] cd [22:52:37] derp [22:53:19] * ejegg searches for the irssi plugin which asks for confirmation when you enter a valid command line [23:01:32] (CR) Ejegg: "filters work great!" (2 comments) [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/367847 (https://phabricator.wikimedia.org/T161758) (owner: Eileen) [23:01:36] (PS2) Ejegg: Add filters to mailing report. [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/367847 (https://phabricator.wikimedia.org/T161758) (owner: Eileen) [23:01:42] (CR) Ejegg: [C: 2] Add filters to mailing report. 
[wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/367847 (https://phabricator.wikimedia.org/T161758) (owner: Eileen) [23:01:59] (PS2) Ejegg: Omnimailing - extendeded mailing report - add suppressed [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/367848 (https://phabricator.wikimedia.org/T161758) (owner: Eileen) [23:02:06] (CR) Ejegg: [C: 2] Omnimailing - extendeded mailing report - add suppressed [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/367848 (https://phabricator.wikimedia.org/T161758) (owner: Eileen) [23:03:53] (CR) Ejegg: [C: 2] "not sure why composer decided to shuffle installed.json, but this looks fine!" [wikimedia/fundraising/crm/vendor] - https://gerrit.wikimedia.org/r/368106 (owner: Eileen) [23:05:57] (PS5) Ejegg: Update Omnimail GET to add rml fields [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/365543 (owner: Eileen) [23:06:53] (CR) Ejegg: [C: 2] "Looks ready for a road test!" [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/365543 (owner: Eileen) [23:07:16] ok eileen, i'm heading out for now [23:07:34] ejegg: thanks - I've still got some tweaks to do on that groupmember get, but I think having it merged up to date is cleaner [23:08:05] (Merged) jenkins-bot: Add filters to mailing report. 
[wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/367847 (https://phabricator.wikimedia.org/T161758) (owner: Eileen) [23:14:15] (Merged) jenkins-bot: Omnimailing - extendeded mailing report - add suppressed [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/367848 (https://phabricator.wikimedia.org/T161758) (owner: Eileen) [23:15:10] PROBLEM - check_mysql on frdb2001 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 2758 [23:18:10] (Merged) jenkins-bot: Update silverpopXmlConnector [wikimedia/fundraising/crm/vendor] - https://gerrit.wikimedia.org/r/368106 (owner: Eileen) [23:20:10] PROBLEM - check_mysql on frdb2001 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 2733 [23:20:21] (Merged) jenkins-bot: Update Omnimail GET to add rml fields [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/365543 (owner: Eileen) [23:24:00] (PS1) Eileen: Merge branch 'master' of https://gerrit.wikimedia.org/r/wikimedia/fundraising/crm into deployment [wikimedia/fundraising/crm] (deployment) - https://gerrit.wikimedia.org/r/368111 [23:25:20] PROBLEM - check_mysql on frdb2001 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1947 [23:25:39] (CR) Eileen: [C: 2] Merge branch 'master' of https://gerrit.wikimedia.org/r/wikimedia/fundraising/crm into deployment [wikimedia/fundraising/crm] (deployment) - https://gerrit.wikimedia.org/r/368111 (owner: Eileen) [23:26:27] (Merged) jenkins-bot: Merge branch 'master' of https://gerrit.wikimedia.org/r/wikimedia/fundraising/crm into deployment [wikimedia/fundraising/crm] (deployment) - https://gerrit.wikimedia.org/r/368111 (owner: Eileen) [23:27:10] (PS1) Eileen: Update vendor submodule e2f13e9 Update silverpopXmlConnector [wikimedia/fundraising/crm] (deployment) - https://gerrit.wikimedia.org/r/368112 [23:27:26] (CR) Eileen: [C: 2] Update vendor submodule e2f13e9 Update silverpopXmlConnector 
[wikimedia/fundraising/crm] (deployment) - https://gerrit.wikimedia.org/r/368112 (owner: Eileen) [23:28:04] (Merged) jenkins-bot: Update vendor submodule e2f13e9 Update silverpopXmlConnector [wikimedia/fundraising/crm] (deployment) - https://gerrit.wikimedia.org/r/368112 (owner: Eileen) [23:30:10] PROBLEM - check_mysql on frdb2001 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1923 [23:30:11] PROBLEM - check_mysql on frdb1002 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1210 [23:33:39] !log civicrm update from fb83798f068ba3365a286e7f131eb5eb5b0e7aae to e83c012581305012145eae45495e7e8ea6f4e249 [23:33:51] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [23:35:10] PROBLEM - check_mysql on frdb1002 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1355 [23:35:10] PROBLEM - check_mysql on frdb2001 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1891 [23:40:10] PROBLEM - check_mysql on frdb1002 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1506 [23:40:11] PROBLEM - check_mysql on frdb2001 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1872 [23:45:10] PROBLEM - check_mysql on frdb1002 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1662 [23:45:11] PROBLEM - check_mysql on frdev1001 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1320 [23:45:11] PROBLEM - check_mysql on frdb2001 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1836 [23:50:10] PROBLEM - check_mysql on frdb1002 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1810 [23:50:11] RECOVERY - check_mysql on frdev1001 is OK: Uptime: 1329089 Threads: 1 Questions: 86941942 Slow queries: 21378 Opens: 12265 Flush 
tables: 1 Open tables: 1009 Queries per second avg: 65.414 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 0 [23:50:12] PROBLEM - check_mysql on frdb2001 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1805 [23:55:11] PROBLEM - check_mysql on frdb1002 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1962 [23:55:12] RECOVERY - check_mysql on frdb2001 is OK: Uptime: 1328570 Threads: 1 Questions: 84756143 Slow queries: 7270 Opens: 11624 Flush tables: 1 Open tables: 608 Queries per second avg: 63.795 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 0