[00:00:09] PROBLEM - check_mysql on frdb1002 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1266
[00:03:14] :|
[00:05:08] PROBLEM - check_mysql on frdb1002 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 1430
[00:10:09] RECOVERY - check_mysql on frdb1002 is OK: Uptime: 1724531 Threads: 1 Questions: 177637565 Slow queries: 1160292 Opens: 6703 Flush tables: 1 Open tables: 779 Queries per second avg: 103.006 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 0
[00:10:57] oh snap
[00:11:25] ah right, that was just the log table deletes
[00:14:10] yeah looked fairly normal
[09:16:56] morning peoples
[12:53:46] hi jgleeson!
[13:07:58] hey mepps :)
[13:08:05] how are you?
[13:08:47] possibly slightly sick jgleeson :( but i wound up sleeping in this morning
[13:08:54] and it's only slightly
[13:09:03] aww no :(
[13:09:17] hope it doesn't develop into anything bad
[13:09:30] nothing worse when it's cold
[13:10:15] I'm considering using the gym today for the first time in a few months
[13:10:35] my body is not looking forward to it
[13:11:49] oh nice!
[13:12:23] haha hopefully you'll feel better afterwards
[13:13:22] yeah it's always slow getting back into the swing at first but pays off
[15:07:50] howdy ejegg :)
[15:17:55] hi jgleeson
[15:18:03] how's it going today?
[15:21:12] it's going good so far, just working on the compound type stats stuff we discussed yesterday. It's coming together now so hopefully be able to demo it later today to the team
[15:21:18] great!
[15:23:26] hi AndyRussG :)
[15:23:36] jgleeson: morning!
[15:23:40] Or afternoon, for you :)
[15:23:43] How's it going?
[15:24:01] yup 3:23pm here :) morning to you
[15:24:06] ;)
[15:24:28] it's going good so far, how about you.. just getting coffee I guess?
[15:24:41] Heh well I've been up for 3 hours, but just really getting to do some work now
[15:24:53] And yeah just got some coffee
[15:25:11] how is dog'o
[15:25:15] I was hoping that all the dog-walking would be done by my kids on days when they're here with me
[15:25:23] lol
[15:25:35] However, adolescent laziness syndrome apparently has set in early with my 12-year old
[15:25:50] haha
[15:26:04] ah 12, I bet that's a challenge
[15:26:13] hard enough at 2 for me
[15:26:26] or 23 months :)
[15:26:30] AndyRussG: you got a dog?
[15:27:01] hehe it's like moving up levels in a video game... aaaarg I have tons of non-work stuff flying about... At 2 pm (12 pm SF time) I have to go see the landlady of the house I want to rent, and before that the owner of the apartment I'm leaving is coming over
[15:27:09] cwd: yeah!!! We adopted a stray last week!!!!
[15:27:18] awww :)
[15:27:25] His name is Toffee, but I mostly call him Doggo
[15:27:32] sounds like a busy one Andy!
[15:28:14] jgleeson: yeah... Fortunately after my kids go with their mom on Saturday evening, I'll have all Sunday to make up for lost time
[15:28:28] Today also the kids are home! It's a holiday for "Day of the Dead" here
[15:28:44] https://en.wikipedia.org/wiki/Day_of_the_Dead
[15:29:29] ah I see
[15:29:30] cwd: I'll send you a pic in a bit, eh? He's a mid-sized short-haired mix of who knows what breeds
[15:29:34] interesting stuff
[15:29:43] jgleeson: yeah! it's an interesting festivity
[15:29:51] AndyRussG: i'd love that!
[15:30:03] was he literally wandering the street and you took him in?
[15:30:16] yesterday was the whole festival at Cecilia's school, which also lasted a bit longer that I'd hoped
[15:31:07] cwd: yeah, wandering around the apartment complex I'm in. Sofía, my older daughter, especially insisted we take him in.
She and her sister played with the dog around the apartment complex for a few hours, and when it was clear he was friendly, we took him to the vet to be checked out
[15:31:33] The vet thinks he's around 3 years old. He's fine with being on a leash and has never done his business in the house
[15:31:41] And very sweet
[15:32:18] awesome :)
[15:32:22] yeah!
[15:34:28] jgleeson: ejegg: mepps: cwd is there an overview somewhere of our Grafana and related back-end (Prometheus?) infrastructure? Just to try to keep it in mind, for possible integration, as I look at https://phabricator.wikimedia.org/T178930... thx!!!
[15:35:42] AndyRussG: nothing visual that I know of
[15:35:51] no, but there should be
[15:36:06] * cwd makes note
[15:36:18] but I know we're doing some of our reporting by writing textfiles to a directory on the Civi box
[15:36:24] AndyRussG: at this point we are using prod grafana
[15:36:34] which is scraped once a minute by the main prometheus server
[15:36:46] but we have a request in for some new computers of our own so we can run a private instance
[15:37:22] https://wikitech.wikimedia.org/wiki/Prometheus then?
[15:37:26] the load balancers (pay-lvs) all run prometheus
[15:37:36] our prometheus servers are separate from prod
[15:37:49] prod grafana is set up to scrape them for dat
[15:37:51] a
[15:38:15] Ah OK so we have our own Prometheus server but not our own Grafana, but we will eventually?
[15:38:39] yessir
[15:38:43] cool!
[15:38:55] so basically each server runs a daemon that exposes a dir of text files
[15:39:01] that is the prometheus "client"
[15:39:16] and the prometheus servers collect that data, and grafana displays it
[15:39:57] so it's pretty easy to have whatever process dump textfiles in that dir
[15:40:20] So, basically, a data collector script (or config?) for Prometheus can be just anything that writes a text file in a specific format? Any requirements require wrt language or synchronicity, coordination stuff?
[15:40:48] The idea I have is a common lib analytics that can be used both by Prometheus and from within a Jupyter notebook
[15:41:02] https://wikitech.wikimedia.org/wiki/SWAP
[15:41:20] And should be able to access Analytics's Druid and Hive stores
[15:41:24] AndyRussG the component I am working on with hopefully make record and writing stats data out to Prometheus easy for all
[15:41:30] recording*
[15:41:35] that should be just fine, the format is totally generic
[15:41:39] the file looks like:
[15:41:41] key val
[15:41:43] key val
[15:42:21] cwd: jgleeson cool!
[15:42:42] jgleeson: which repo were you working in? A temporary github space was it?
[15:42:44] it is meant to display "time series" data, which is to say snapshots every so often, but ejegg has made it do other things as well by manipulating the timestamps
[15:42:49] yup
[15:42:55] will post
[15:43:25] jgleeson: thx!
[15:43:51] just gonna push latest changes
[15:43:59] good time to run into an exception :)
[15:44:55] jgleeson: no rush!
[15:54:07] Fundraising Sprint Uptight Piano, Fundraising-Backlog, FR-Adyen, FR-Smashpig, and 2 others: Adyen jobs should retry at least once on connect failure - https://phabricator.wikimedia.org/T177893#3729928 (mepps) Oops! The merged task was supposed to be tagged to https://phabricator.wikimedia.org/T17...
[15:54:38] AndyRussG: https://github.com/jackgleeson/stats-collector
[15:54:45] samples.php is a good place to start for an idea of the API
[15:57:10] jgleeson: thanks!!!
[17:29:17] ejegg: o/!
[17:29:22] I’m poking at the process-control PR
[17:29:38] Probably running into the same issues you were in actually building the deb though.
[17:29:44] Have you seen this one?
> dh_installexamples -O--buildsystem=pybuild
[17:29:44] examples/job.example.yaml: 7: examples/job.example.yaml: name:: not found
[17:29:45] examples/job.example.yaml: 13: examples/job.example.yaml: Syntax error: newline unexpected
[17:29:46] dh_installexamples: debian/examples (executable config) returned exit code 2
[17:29:49] That’s odd.
[17:30:42] Looks like it’s trying to read the .yaml files as something they’re not.
[17:31:41] hi awight !
[17:32:23] weird, though I hadn't even gotten that far...
[17:32:34] There was a debian/docs glitch, I had to remove the file cos apparently anything beginning README* is included by default.
[17:32:46] I can help you with the env setup if it’s a good time.
[17:32:55] sure!
[17:33:27] apt install devscripts build-essential lintian, mostly
[17:59:08] fundraising-tech-ops, Operations, netops, ops-eqiad: connect second interface for each frack to opposite switch for each eqiad host - https://phabricator.wikimedia.org/T176975#3643120 (Cmjohnson) the 2nd interfaces are connected, updated the switch descriptions, I did not enable the ports.
[18:07:51] ejegg: Holler when you’re wifi’d and have those pkgs…. The next step is just to “make deb”
[18:08:10] It might complain about a version mismatch, if the debian/ version > the Makefile, something I need to hack better.
[18:08:23] in that case, just increment the Makefile version to 1.0.6
[18:08:24] awight: got 'em installed! everything go smoothly with your deployment?
[18:08:27] oh nice
[18:08:41] I… am pretty sure but we have this wacky 5xx graph that hasn’t settled yet.
[18:09:04] The problem is that it does some kind of crappy moving average, maybe? Whatever it’s doing, it doesn’t give real numbers.
[18:09:23] sorry, where exactly does the process-control source itself go in relation to the deb packaging dir?
[18:11:21] ah, clone process-control and then clone process-control-debian to process-control/debian
[18:11:29] thanks!
[18:11:45] The independent repo thing is annoying, but I see why Debian wants it that way.
[18:12:32] heh, it's more humble than insisting every project put your OS's specific files in with the rest of its source
[18:13:29] hmm, python3-all is a build-dep? think I'll override that...
[18:13:45] good point
[18:13:51] oh, that's just 1k, thought it meant ALL the python3 packages
[18:14:31] Note this though, https://wiki.debian.org/Python/LibraryStyleGuide#Build-Depends
[18:14:42] cool
[18:15:01] ack, rain, really do have to relocate
[18:15:03] brb!
[18:15:07] haha
[18:15:21] Cloud doesn’t seem intent on going in my direction
[18:15:30] The microclimates here are cray cray
[18:16:50] hey ejegg, I'm gonna pop out to pick up family but will be back in an hour so would be good to catch you for 10 minutes before I shoot to chat about the stats work :) latest changes are here (updates to samples) if you wanna have a play before I'm back
[18:17:16] catch you for 10 minutes, when I'm back in an hour... sorry I know that wasn't clear
[18:17:22] also the link https://github.com/jackgleeson/stats-collector :)
[18:17:26] ttyl
[18:30:28] awight: here's my latest error message - add_directory() only handles directories at /usr/share/perl5/Dpkg/Source/Package/V2.pm line 602.
[18:32:39] ah, oops, can't be a symlink
[18:33:33] wat.
[18:33:48] ah some dpkg internals
[18:34:00] I’m gonna relocate, back in 15
[18:34:26] awight: built!
[18:34:33] oh snaps, you’re ahead of me then
[18:34:38] lessee if it installs those completion scripts...
[18:34:39] g'luck!
[18:34:45] rad addition.
[18:44:29] (PS1) Jforrester: Replace OutputPage::setSquidMaxage with ::setCdnMaxage [extensions/FundraiserLandingPage] - https://gerrit.wikimedia.org/r/388140
[18:51:40] (PS1) Mepps: Merge branch 'master' into deployment [extensions/DonationInterface] (deployment) - https://gerrit.wikimedia.org/r/388146
[18:52:15] (CR) Mepps: [C: 2] Merge branch 'master' into deployment [extensions/DonationInterface] (deployment) - https://gerrit.wikimedia.org/r/388146 (owner: Mepps)
[18:55:03] (Merged) jenkins-bot: Merge branch 'master' into deployment [extensions/DonationInterface] (deployment) - https://gerrit.wikimedia.org/r/388146 (owner: Mepps)
[18:56:20] (PS1) Mepps: Update DI with default processor_form value [core] (fundraising/REL1_27) - https://gerrit.wikimedia.org/r/388149
[18:56:30] (CR) Mepps: [C: 2] Update DI with default processor_form value [core] (fundraising/REL1_27) - https://gerrit.wikimedia.org/r/388149 (owner: Mepps)
[19:01:19] (Merged) jenkins-bot: Update DI with default processor_form value [core] (fundraising/REL1_27) - https://gerrit.wikimedia.org/r/388149 (owner: Mepps)
[19:07:03] ejegg i deployed and everything works on non-safari browsers but got an error in safari
[19:07:15] so rolling back and making sure the settings are correct
[19:07:16] mepps what's the error?
[19:07:21] or rather where?
[19:08:01] woah and then i went back to safari after rolling back but didn't reload and it works!
[19:08:24] err, maybe something in session?
[19:08:34] it redirected me ejegg but then i got an error from adyen
[19:09:13] oh darn, like we didn't sign it right or something?
[19:10:11] yes but then like i said, i retried it afterwards and started getting the right skin
[19:10:30] i just opened the link again in a new window in safari now that it's rolledback and got an even weirder error
[19:10:37] where it just sent me to a thank you page
[19:10:49] err
[19:10:57] yeah....
[19:12:11] gotcha ejegg, maybe i'm seeing the error we were trying to fix?
[19:12:40] i'm wondering if i should redeploy then since i was able to make it work..
[19:35:25] (PS1) Ejegg: Update bash completion [wikimedia/fundraising/process-control] - https://gerrit.wikimedia.org/r/388158
[19:41:05] fundraising-tech-ops, Operations, netops: bonded/redundant network connections for fundraising hosts - https://phabricator.wikimedia.org/T171962#3730671 (Jgreen) Note ubuntu/trusty config is fairly different, here's a writeup that worked, I just changed bond-mode to active-backup: https://paulmellor...
[19:45:51] !log updated payments-wiki from b88b805c3b223404578c3ef8bb675e710611aa2d to a539d275377b251762f4935c3014978253f39c08
[19:45:56] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:57:13] PROBLEM - Host alnilam is DOWN: PING CRITICAL - Packet loss = 100%
[20:10:33] RECOVERY - Host alnilam is UP: PING OK - Packet loss = 0%, RTA = 36.18 ms
[20:12:56] (PS1) Ejegg: Fix record_skipped call [wikimedia/fundraising/process-control] - https://gerrit.wikimedia.org/r/388163
[20:13:25] cwd so yesterday's deploy just redirected that failspam to root... oops!
[20:13:33] Actually tested ^^^ locally
[20:13:46] ah heheh
[20:13:51] also, I figured out what was missing for bash completion
[20:14:06] what was it?
[20:14:18] --with bash-completion
[20:14:21] in the rules file
[20:14:27] aah gotcha
[20:14:33] i don't know much about that aspect of packaging
[20:14:37] that, and I needed it to rename the completion file
[20:14:55] cool
[20:15:11] i am in the middle of hacking atm but i will be happy to build that later on
[20:15:16] is it a lot of cronspam?
[20:15:18] turns out there's an easy way to point to a file in the main source repo, which makes more sense than putting the completion script in the debian bits
[20:15:27] i just see the one so far
[20:15:41] aah yeah indeed
[20:15:42] cwd nah, it's exactly the ones we used to get at fr-tech
[20:15:45] (CR) Legoktm: [C: 2] Replace OutputPage::setSquidMaxage with ::setCdnMaxage [extensions/FundraiserLandingPage] - https://gerrit.wikimedia.org/r/388140 (owner: Jforrester)
[20:15:54] cool
[20:18:08] (Merged) jenkins-bot: Replace OutputPage::setSquidMaxage with ::setCdnMaxage [extensions/FundraiserLandingPage] - https://gerrit.wikimedia.org/r/388140 (owner: Jforrester)
[20:55:24] (CR) Ejegg: [C: -1] "missing quotes" (1 comment) [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/387311 (https://phabricator.wikimedia.org/T178003) (owner: Mepps)
[21:12:35] (PS6) Ejegg: Make tests more readable by using dates [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/387765 (https://phabricator.wikimedia.org/T179357) (owner: Eileen)
[21:12:44] (CR) Ejegg: [C: 2] Make tests more readable by using dates [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/387765 (https://phabricator.wikimedia.org/T179357) (owner: Eileen)
[21:41:27] (CR) Ejegg: "Those are some really creative tests! Just wondering if we get a double impression when the promise resolves AFTER the timeout." (1 comment) [extensions/CentralNotice] - https://gerrit.wikimedia.org/r/386883 (https://phabricator.wikimedia.org/T176334) (owner: AndyRussG)
[22:03:44] Fundraising Sprint Uptight Piano, Fundraising-Backlog, Wikimedia-Fundraising-CiviCRM, Patch-For-Review: Separate 'donations' and 'drupal' db connection strings mean txn overhead - https://phabricator.wikimedia.org/T178639#3731094 (Ejegg) Open>Resolved Doesn't seem to have hurt anything, b...
[22:04:29] Fundraising Sprint Synchronized Screaming, Fundraising Sprint Uptight Piano, Fundraising Sprint turtles that are robotic that destroy the whole world with their foot, Fundraising-Backlog, Unplanned-Sprint-Work: donatewiki_counts and banner impre... - https://phabricator.wikimedia.org/T177331#3731099
[22:04:32] Fundraising Sprint Synchronized Screaming, Fundraising Sprint Uptight Piano, Fundraising Sprint turtles that are robotic that destroy the whole world with their foot, Fundraising-Backlog, and 2 others: Back-fill missing pgehres data - https://phabricator.wikimedia.org/T178009#3731098 (Ejegg) O...
[22:05:09] Fundraising Sprint Uptight Piano, Fundraising Sprint turtles that are robotic that destroy the whole world with their foot, Fundraising-Backlog, Patch-For-Review, Unplanned-Sprint-Work: Email stats may be off (possible link to missing data in pg... - https://phabricator.wikimedia.org/T178819#3731100
[22:12:08] cwd is now a good time to take another stab at process-control?
[22:12:59] there's this patch to merge in gerrit for the completion script update: https://gerrit.wikimedia.org/r/388158
[22:13:11] and this one for the rootspam: https://gerrit.wikimedia.org/r/388163
[22:36:57] hey all I'm heading out a little early. see you tomorrow
[22:37:05] see ya dstrine !
[22:45:26] ejegg: hey sorry my parents stopped by
[22:45:33] no worries
[22:45:50] yeah let me take a look...
[22:48:44] (CR) Cdentinger: [C: 2] Update bash completion [wikimedia/fundraising/process-control] - https://gerrit.wikimedia.org/r/388158 (owner: Ejegg)
[22:49:38] (Merged) jenkins-bot: Update bash completion [wikimedia/fundraising/process-control] - https://gerrit.wikimedia.org/r/388158 (owner: Ejegg)
[22:52:40] (CR) Cdentinger: [C: 2] Fix record_skipped call [wikimedia/fundraising/process-control] - https://gerrit.wikimedia.org/r/388163 (owner: Ejegg)
[22:53:09] (Merged) jenkins-bot: Fix record_skipped call [wikimedia/fundraising/process-control] - https://gerrit.wikimedia.org/r/388163 (owner: Ejegg)
[22:53:54] rockin!
[22:54:09] ejegg: so nothing in the debian directory should have to change?
[22:54:18] So then... the process-control.bash-completion file is just this
[22:54:49] https://github.com/ejegg/process-control-debian/blob/master/process-control.bash-completion
[22:55:03] and the rules file needs another --with
[22:55:23] https://github.com/ejegg/process-control-debian/blob/master/rules#L7
[22:56:29] aah cool
[22:56:40] yeah does seem nicer to keep it local
[22:57:50] riight, since it changes with the run-job options
[22:58:53] (Abandoned) Ejegg: Update CiviCRM submodule [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/386634 (owner: Ejegg)
[23:17:29] ejegg: ok, ready to upgrade that pkg
[23:17:35] nice!
[23:17:40] Doing the turn-off
[23:19:21] ok, that's cron stopped
[23:19:29] cool, updatin'
[23:20:19] ejegg: ok, turn back on?
[23:21:19] woohoo, bash-completion works!
[23:21:24] a bit slow, but meh
[23:21:42] wicked
[23:22:38] ok, turned it back on!
[23:23:28] and.. I better mosey along. will have my compy and be looking out for failmail
[23:23:44] but i'm really not expecting any
[23:23:51] sounds good
[23:23:53] have a good night!
[23:24:05] you too!
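
Note on the 15:38-15:41 exchange: the "dump textfiles in that dir" approach described there could look roughly like the sketch below. It is only an illustration under assumptions, not the team's actual collector. The function names, the metric names, and the ".prom" filename are hypothetical, and the log does not say which directory or daemon is in use. The sketch writes "key val" lines to a temporary file and renames it into place, so that whatever reads the directory should not see a half-written file.

```python
import os
import tempfile


def format_metrics(metrics):
    """Render a mapping of metric name -> numeric value as 'key val' lines."""
    lines = [f"{name} {value}" for name, value in sorted(metrics.items())]
    return "\n".join(lines) + "\n"


def write_metrics(directory, filename, metrics):
    """Write metrics to directory/filename via a temp file plus rename.

    `directory` stands in for the dir the daemon exposes (an assumption;
    the real path is not given in the log).
    """
    os.makedirs(directory, exist_ok=True)
    target = os.path.join(directory, filename)
    # Create the temp file in the same directory so the rename stays on
    # one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(format_metrics(metrics))
        # mkstemp creates the file as 0600; loosen it so a reader running
        # as another user can open it.
        os.chmod(tmp_path, 0o644)
        os.replace(tmp_path, target)
    except BaseException:
        # Clean up the temp file if the write or rename failed.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return target
```

A cron job or queue consumer could call something like write_metrics("/tmp/metrics-demo", "donations.prom", {"donations_total": 42}) at the end of each run (path and metric name invented for the example). Since each run overwrites the whole file, the scraper would only ever see the latest snapshot, which fits the "snapshots every so often" description at 15:42:44.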