[01:08:27] cstone: I'm around if you want to talk through anything on that recurring fix
[01:09:30] hey ejegg I just got back to looking at food shenanigans took longer than I thought
[01:10:12] no worries. I'll also be around tomorrow
[01:12:57] I was thinking just use the modified date to reset the next sched date
[01:15:41] (PS1) Cstone: Add fix to reset next_sched_contribution_date for ingenico recurrings. [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/633046
[01:15:45] ok, that works!
[01:15:54] that worked locally let me test a couple more
[01:18:14] hmm, might need to window that
[01:18:36] looks like the earliest modified_date picked up by that is August 19th
[01:18:53] so you'd be setting the next_sched to Sept 19th
[01:19:05] just a bit too early to be picked up
[01:20:11] so maybe for anything modified after Sept 2nd or 3rd add one month, and for the earlier ones add two?
[01:20:38] We miss a bunch of make-up payments but we keep the charges closer to the cycle_date
[01:21:24] and don't trigger the too-close alarm
[01:21:55] hmm that works, this two month time math is confusing
[01:22:08] I was thinking too much about the September ones
[01:22:23] simplest to just do 2 updates, right?
[01:22:32] yeah
[01:30:12] (PS2) Cstone: Add fix to reset next_sched_contribution_date for ingenico recurrings. [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/633046
[01:31:46] bah that's not right
[01:32:05] (CR) jerkins-bot: [V: -1] Add fix to reset next_sched_contribution_date for ingenico recurrings. [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/633046 (owner: Cstone)
[01:33:03] oh, just the underscore, or something else?
[01:33:49] oh quotes too
[01:34:09] quote chaos
[01:34:23] are we going to leave it at 3 days in the past for catch up?
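The "two updates" date math settled on above can be sketched as follows. This is only an illustration of the logic discussed (the actual fix is the SQL in the Gerrit patch); the cutoff of Sept 3rd and the function names here are assumptions for the sketch:

```python
from datetime import date
import calendar

def add_months(d, months):
    """Add whole months to a date, clamping to the last day of the target month."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)

def next_sched_from_modified(modified, cutoff=date(2020, 9, 3)):
    """Rows modified on/after the cutoff get +1 month; earlier ones get +2,
    so the reset next_sched_contribution_date isn't too early to be picked up."""
    months = 1 if modified >= cutoff else 2
    return add_months(modified, months)
```

With this split, a row modified August 19th gets a next scheduled date of October 19th rather than the too-early September 19th, at the cost of skipping some make-up payments.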
[01:34:30] or go back a bit further
[01:35:00] I think we've got that setting at 14 days
[01:35:21] which is actually not compatible with our 23-day closeness filter
[01:35:29] but right now it's doing 3 right
[01:36:00] are we overriding that setting on the command line?
[01:36:15] oh nope I'm just reading the wrong box
[01:36:26] well then
[01:37:49] so if something has cycle day 1 and we charge them now, I guess we just barely squeak by the 23 day filter next month
[01:38:03] ok, that cutoff date should work
[01:39:32] should it be 8-25 though
[01:39:35] to grab the most of them?
[01:40:08] oh I see what you mean
[01:40:16] man my brain is not on top of this time math tonight
[01:42:28] (PS3) Cstone: Add fix to reset next_sched_contribution_date for ingenico recurrings. [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/633046
[01:44:58] looks good in gerrit, just pulling it down to peek in my IDE
[01:45:40] cstone your first update has a space instead of an underscore in the modified_date filter
[01:45:56] it does indeed
[01:46:00] I think it needs one more day
[01:47:37] oh yeah, it's already the 9th UTC
[01:50:16] (PS4) Cstone: Add fix to reset next_sched_contribution_date for ingenico recurrings. [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/633046
[01:54:52] okok, that looks like the right stuff
[01:55:51] (CR) Ejegg: [C: +2] "Looks good!" [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/633046 (owner: Cstone)
[02:00:31] run it tomorrow after the recurring is done?
[02:00:50] oh ok
[02:01:09] sure, while it's still the 9th UTC
[02:02:55] (Merged) jenkins-bot: Add fix to reset next_sched_contribution_date for ingenico recurrings. [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/633046 (owner: Cstone)
[02:03:19] thanks cstone!
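The "just barely squeak by" remark above can be checked with simple date math, assuming the 23-day closeness filter skips any charge falling fewer than 23 days after the previous one (an assumption about the filter's behavior for this sketch):

```python
from datetime import date

def passes_closeness_filter(last_charge, next_charge, min_gap_days=23):
    """Assumed behaviour: a charge is allowed only when it falls at least
    min_gap_days after the previous successful charge."""
    return (next_charge - last_charge).days >= min_gap_days

# Cycle day 1, charged late on Oct 9 as a catch-up: Nov 1 is exactly
# 23 days later, so next month's regular charge just barely passes.
gap = (date(2020, 11, 1) - date(2020, 10, 9)).days
```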
[02:03:22] have a good night
[02:05:27] oh damn, I forgot I did this: https://gerrit.wikimedia.org/r/c/wikimedia/fundraising/crm/+/556264/
[02:05:37] did we just never make the matching grafana graph?
[02:05:49] or is it there and we're just not paying attention?
[02:06:13] huh
[02:08:51] lessee, how do I get the total for a day of a stat
[02:11:15] Fundraising-Backlog, fundraising sprint Theme songs for programming languages, FR-Ingenico, Recurring-Donations, fr-donorservices: Oct. 2020 Ingenico recurrings failing to attempt new donations - https://phabricator.wikimedia.org/T264954 (MBeat33) Thank you @Cstone Will this patch restart all...
[02:22:13] Fundraising-Backlog, Wikimedia-Fundraising-CiviCRM, fr-donorservices: Civi: mailing events count really high for 10/6 email send - https://phabricator.wikimedia.org/T265073 (MNoorWMF) If there isn't a technical explanation, some possible reasons for this might be: 1. Many email clients like (Outlook...
[02:23:12] grr, sum_over_time is giving me stupid big numbers
[02:29:41] Fundraising-Backlog, fundraising sprint Theme songs for programming languages, FR-Ingenico, Recurring-Donations, fr-donorservices: Oct. 2020 Ingenico recurrings failing to attempt new donations - https://phabricator.wikimedia.org/T264954 (Cstone) @MBeat33 yep it will. For ones that were sche...
[02:30:13] well, here's the first stab at it anyway: https://frmon.frdev.wikimedia.org/d/Pq1YNMviz/fundraising-overview?editPanel=35&viewPanel=35&orgId=1&refresh=1m&from=now-90d&to=now
[02:31:04] Fundraising-Backlog, fundraising sprint Theme songs for programming languages, FR-Ingenico, Recurring-Donations, fr-donorservices: Oct. 2020 Ingenico recurrings failing to attempt new donations - https://phabricator.wikimedia.org/T264954 (MBeat33) Cool, thank you!
[02:31:59] nice ejegg|away !
[13:27:33] fundraising-tech-ops, DC-Ops, Operations, ops-eqiad: (Need By: 2020-09-30) rack/setup/install frdb1004.frack.eqiad.wmnet - https://phabricator.wikimedia.org/T260379 (Cmjohnson) Open→Invalid This is an old ticket, @jgreen just made a new task for the same server. Killing this off
[14:03:30] Fundraising-Backlog, Wikimedia-Fundraising-CiviCRM, fr-donorservices: Civi: mailing events count really high for 10/6 email send - https://phabricator.wikimedia.org/T265073 (MBeat33) p:Triage→Low Thanks so much for the information, @MNoorWMF that's really helpful. This doesn't impact Donor...
[14:43:01] Fundraising-Backlog, Wikimedia-Fundraising-CiviCRM, fr-donorservices: Civi: mailing events count really high for 10/6 email send - https://phabricator.wikimedia.org/T265073 (MNoorWMF) FWIW, I took a look at our unique opens vs our total opens and this little blip did show up but it seems it was only...
[15:13:37] Wikimedia-Fundraising-Banners, Wikimedia-production-error: Fundraising banner throwing high amount of errors relating to bad jQuery selector - https://phabricator.wikimedia.org/T264786 (Jdlrobson) Open→Resolved a:Jdlrobson We got a quick fix out of this - thanks @Pcoombe {F32379443} I am go...
[15:23:30] hi fr-tech!
[15:23:42] ejegg: heyyyyy
[15:23:52] hey AndyRussG
[15:23:57] howsie goesies?
[15:24:03] oh pretty good
[15:24:34] thought we might buy a lamp this morning but we woke up late so we'll do that tomorrow
[15:24:48] hmmm
[15:25:14] yeah those early-morning lighting retailers have crazy hours
[15:25:18] ;p
[15:25:19] how's things with you?
[15:25:31] eh hangin' in there!
[15:25:54] kiddos are doing their Zoom classes
[15:26:09] doggo is snoozing in the sun
[15:26:15] and I'm learning about IP tables
[15:26:54] fun!
[15:26:56] I got the docker container set up with the right options that it should be able to forward udp 514 packets correctly
[15:26:58] yeee
[15:27:23] just looking at a few posts online that explain how to do it, trying to understand what they do
[15:27:33] yeah, it's fairly complex from the little I've seen
[15:28:08] pfff yes it's a maze
[15:28:31] here's the post with the answer that at least is thoroughly explained: https://serverfault.com/questions/646522/port-forward-with-iptables
[15:29:12] and here's a Wiki article with pics of the maze: https://en.wikipedia.org/wiki/Netfilter
[15:29:52] Fundraising-Backlog, Wikimedia-Fundraising-CiviCRM, Recurring-Donations: Add metrics for recurring charge job to Grafana - https://phabricator.wikimedia.org/T199390 (Ejegg) Here's a first stab at it: https://frmon.frdev.wikimedia.org/d/Pq1YNMviz/fundraising-overview?editPanel=35&viewPanel=35&orgId=1&...
[15:30:46] Jeff_Green or dwisehaupt I'm trying to get a graph of the total recurring donations charged per day in grafana, but I think we may be collecting the data in a way that makes it impossible
[15:30:53] So the job runs once per hour
[15:31:09] and writes a .prom file at the end
[15:31:09] ok
[15:31:36] but when I try doing a total_over_time(recurring_smashpig_completed[1d]) I get ridiculously high numbers
[15:32:00] are they ~60x what they should be?
[15:32:05] I think because prometheus is collecting that statistic every minute
[15:32:08] yeah, I think that's it
[15:32:26] so we just divide by 60, I guess
[15:32:37] or report it every minute?
[15:32:40] the other issue is that I don't really want a rolling 24 hour sum
[15:32:46] we only run the job 1x per hour
[15:32:57] and when the job is done we report the number charged
[15:33:05] it would be pretty tricky to report 1x per hour
[15:33:12] thinking
[15:33:16] hmm, I guess we could get the info from the db
[15:33:49] so yeah, the really interesting statistics are how many we charge in each calendar dat
[15:33:52] *day
[15:34:08] and how that differs from the same day in the previous month
[15:34:37] I wonder whether you can timestamp the data in the prom file to cause prometheus to collect it only once
[15:34:38] some variation is expected as cards expire and new people sign up
[15:35:20] that timestamp could be a good solution, and save us cluttering up the storage
[15:35:58] I can't say with any confidence that it works, but it seems like it should be possible
[15:36:40] oh ha https://prometheus.io/docs/instrumenting/writing_exporters/
[15:36:48] "Accordingly, you should not set timestamps on the metrics you expose, let Prometheus take care of that. If you think you need timestamps, then you probably need the Pushgateway instead."
[15:38:41] hmph
[15:39:26] pushing stuff to prom seems more complicated than writing files
[15:39:35] agreed
[15:40:12] so what's the process for adding a metric from mysql?
[15:40:25] a query to run 1x per day even
[15:40:32] I'm not sure prometheus does anything useful with the timestamp, it's conceivable it's just used as a measure of staleness, in which case the same metric could get scraped a couple times before it decided it is stale
[15:41:11] I think that's not really what prometheus was designed to do
[15:41:52] one thought would be to use a second backend alongside prometheus, I think grafana can talk to mysql for instance
[15:42:45] don't we already have some things populated from mysql queries?
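The ~60x inflation discussed above comes from Prometheus re-scraping an unchanged gauge every minute while the job behind it only refreshes the value once an hour. A minimal simulation of what sum_over_time sees (the metric name and values are illustrative, not real data):

```python
# What Prometheus stores when it scrapes a .prom gauge every minute but the
# underlying job only updates the number once an hour: the same value is
# sampled ~60 times, so sum_over_time(metric[1h]) is ~60x the real figure.
hourly_value = 100                       # e.g. recurring charges completed this hour
scrapes_per_hour = 60                    # 1m scrape interval
samples = [hourly_value] * scrapes_per_hour

inflated = sum(samples)                  # what sum_over_time effectively returns
corrected = inflated / scrapes_per_hour  # the "just divide by 60" workaround
```

This also shows why dividing by 60 is fragile: it silently breaks if the scrape interval ever changes.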
[15:43:32] we do, but I think they're all instantaneous readings rather than per-time metrics
[15:44:26] I don't totally understand the metric in question, but could it be done as an ever-increasing value?
[15:44:58] so you just keep reporting the value and it jumps up every hour or every day?
[15:45:22] if you do it that way you can use grafana/prometheus functions to determine the rate of change etc
[15:45:54] sure, lemme see how long those queries take
[15:46:24] and we can do those queries on a replica, right?
[15:46:39] I'm not saying you need to repoll it every minute
[15:46:45] sure
[15:46:56] we really do want to know the number charged in a calendar day though
[15:46:59] we could build them into the existing mysql scrape-to-prom thingie
[15:47:05] you would
[15:47:29] trying to think of an example
[15:50:24] I think this is how the network_errors graph works here: https://frmon.frdev.wikimedia.org/d/000000377/host-overview?orgId=1&refresh=5m
[15:50:52] I'm pretty sure the underlying metric is just an ever-increasing counter as long as the machine is up
[15:50:58] Jeff_Green: how would you group those by calendar day?
[15:51:12] so if there are no errors, you'd have repeated scrapes of the same value going into prometheus
[15:51:27] For recurring charges, if we do 3000 on Aug 9th and only 2000 on Sep 9th, we should look into that drop
[15:52:29] maybe the irate() function would work? this graph uses irate in 5m intervals, maybe you can use 1d?
[15:53:15] I don't see any way of windowing that to calendar days
[15:53:31] "calculates the per-second instant rate of increase of the time series in the range vector. This is based on the last two data points"
[15:53:42] "irate should only be used when graphing volatile, fast-moving counters. Use rate for alerts and slow-moving counters"
[15:54:09] It would have to be something like sum_over_time
[15:54:34] but with a qualifier in the [] to limit the time range to specific days
[15:54:50] and I only see examples of sliding windows
[15:55:12] I don't know, also what is a day?
[15:55:15] GMT day?
[15:56:01] yep, a UTC day
[15:56:18] we mark each recurring record with a cycle_day
[15:56:25] between 1 and 31 inclusive
[15:57:00] then when that day (or the nearest that will occur this month) rolls around in UTC, we start charging the recurrings
[15:57:14] only sometimes we screw up and miss a thousand or so
[15:57:28] and I REALLY REALLY REALLY want to be able to see that discrepancy in a graph
[15:57:41] I can express that in a SQL query without too much trouble
[15:58:26] If I understand correctly I think it's going to be very difficult to get this to line up perfectly in prometheus, if for no other reason than that AFAIK you can't really define the precise time that prometheus assigns it to
[15:58:28] and I guess I can write a script to run once a day at midnight and write the number charged per gateway in the previous day
[15:58:48] plus the difference between that and the number charged on the same day in the previous month
[15:59:23] then that file just sits there all day
[15:59:29] representing yesterday
[15:59:31] could this be done in superset?
[15:59:56] eyener would know better than I would about that!
[16:01:03] that's worth exploring, the analytics DB already has a realtime replica of the civi db to work with
[16:02:01] sorry, just catching up...what is the ask?
[16:03:06] hi fr-tech
[16:03:36] eyener we want a graph with the total number of recurrings charged in the past day (per gateway)
[16:03:52] plus the difference between that number and the total number charged on the same day last month
[16:03:56] hi cstone!
[16:04:16] I was trying to figure out how to get that in grafana, but it's really meant for sliding averages, not calendar day totals
[16:04:46] would it be easy to get those calendar day + month over month comparisons in superset?
[16:05:10] hmm could we somehow count how many we expected to charge too
[16:05:29] I believe so ejegg!
[16:17:45] (PS1) Cstone: Merge branch 'master' into deployment [wikimedia/fundraising/crm] (deployment) - https://gerrit.wikimedia.org/r/633206
[16:27:37] ejegg For your purposes you don't care about pre-payment recurring signups, right? When a donor makes a 1 time donation and also selects to 'make it recurring' and then their recurring donation starts the next month
[16:27:58] right eyener, this is to monitor the health of the recurring charge job
[16:31:14] fr-tech I made port forwarding for syslog in docker work!!!!!!
[16:31:58] I mean, I can send text to 127.0.0.1:514 on container one and see it output at port 9014 on container two!!!
[16:32:41] AndyRussG: nice!
[16:39:17] (Abandoned) Cstone: Merge branch 'master' into deployment [wikimedia/fundraising/crm] (deployment) - https://gerrit.wikimedia.org/r/633206 (owner: Cstone)
[16:55:16] Fundraising-Backlog, fr-donorservices: French TY page on mobile: code visible for App user - https://phabricator.wikimedia.org/T265160 (MBeat33)
[17:09:52] Fundraising-Backlog: As a SurveyMonkey account owner, I would like to know if we need to update our email allowlist for SurveyMonkey - https://phabricator.wikimedia.org/T265161 (KHaggard)
[17:13:51] Fundraising-Backlog: As a SurveyMonkey account owner, I would like to know if we need to update our email allowlist for SurveyMonkey - https://phabricator.wikimedia.org/T265161 (DStrine) @Pcoombe are you linking survey monkey in any particularly complicated way? If not then I think this would not affect simp...
[17:15:34] Fundraising-Backlog, Thank-You-Page, fr-donorservices: French TY page on mobile: code visible for App user - https://phabricator.wikimedia.org/T265160 (Pcoombe)
[17:16:18] (PS1) Ejegg: Update CiviCRM to 5.31 beta1 [wikimedia/fundraising/crm/civicrm] - https://gerrit.wikimedia.org/r/633212
[17:16:20] (PS1) Ejegg: re-add remaining wmf hacks. [wikimedia/fundraising/crm/civicrm] - https://gerrit.wikimedia.org/r/633213
[17:16:56] eyener: yesterday I replied more fully to your questions on IRC, but maybe it was late for u? dunno if you saw... tl;dr you can't get the legacy banner_count field from pgeheres, it seems, after a quick review, anyway
[17:17:23] eyener: but you can get it from Hive. But I'd kinda want to double-check how it's working, too, which wouldn't take long
[17:17:27] dwisehaupt: thanks!! :)
[17:17:50] Fundraising-Backlog: As a SurveyMonkey account owner, I would like to know if we need to update our email allowlist for SurveyMonkey - https://phabricator.wikimedia.org/T265161 (KHaggard) @DStrine Just chiming in that we just embed survey URLs directly into the Thank You pages, if that helps! I'm thinking m...
[17:18:48] (PS1) Cstone: Add fix to reset next_sched_contribution_date for ingenico recurrings. [wikimedia/fundraising/crm] (deployment) - https://gerrit.wikimedia.org/r/633214
[17:19:58] ejegg: just want to make sure I did that correctly ^
[17:20:25] yep, a cherry-pick shows up as just that patch on the deployment branch
[17:21:10] ok cool thanks
[17:21:23] oh thanks for the tl;dr AndyRussG! I'd be interested to see where that lives in Hive, but no real rush or anything. Since it's not in the FR infrastructure, there's no huge difference between using event_sanitized.centralnoticebannerhistory and another source in my view (at the moment at least)
[17:21:35] (CR) Cstone: [C: +2] Add fix to reset next_sched_contribution_date for ingenico recurrings. [wikimedia/fundraising/crm] (deployment) - https://gerrit.wikimedia.org/r/633214 (owner: Cstone)
[17:33:14] hmm, what's up with the "Exception: Value already exists in the database" failmail
[17:33:33] trying to add a campaign value that's already there, it looks like
[17:34:12] (CR) jerkins-bot: [V: -1] Update CiviCRM to 5.31 beta1 [wikimedia/fundraising/crm/civicrm] - https://gerrit.wikimedia.org/r/633212 (owner: Ejegg)
[17:35:56] Wikimedia-Fundraising-Banners, JavaScript, Wikimedia-production-error: Banner code faulty when no disk space available - https://phabricator.wikimedia.org/T264952 (Pcoombe) It appears that this can happen from merely checking for the 'localStorage' object in IE/Edge when no disk space is available, o...
[17:36:36] (CR) jerkins-bot: [V: -1] re-add remaining wmf hacks. [wikimedia/fundraising/crm/civicrm] - https://gerrit.wikimedia.org/r/633213 (owner: Ejegg)
[17:39:30] (CR) Cstone: [C: +2] "recheck" [wikimedia/fundraising/crm] (deployment) - https://gerrit.wikimedia.org/r/633214 (owner: Cstone)
[17:39:56] Fundraising-Backlog: As a SurveyMonkey account owner, I would like to know if we need to update our email allowlist for SurveyMonkey - https://phabricator.wikimedia.org/T265161 (Pcoombe) @DStrine The links on the TY page are just simple URLs like https://www.surveymonkey.com/r/P7WB8JZ?country=US. I know noth...
[17:42:17] Fundraising-Backlog, Thank-You-Page, Wikipedia-iOS-App-Backlog, fr-donorservices: Donation redirect to app shows broken inset - https://phabricator.wikimedia.org/T264259 (MBeat33)
[17:52:05] !log civicrm revision changed from b86a15a430 to 585eb835d8, config revision is 57843925bb
[17:52:10] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[17:53:05] !log upgrading payments1004 to buster
[17:53:09] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[17:54:17] ejegg: I'm blanking on how to run this, is it just drush update
[17:56:09] drush updb
[17:56:13] cstone: ^^
[17:56:22] ah of course thanks
[17:58:40] fundraising-tech-ops: templatize smashpig main.yaml configuration in fundraising::smashpig - https://phabricator.wikimedia.org/T265162 (Jgreen)
[17:58:45] Wikimedia-Fundraising-Banners, JavaScript, Wikimedia-production-error: Banner code faulty when no disk space available - https://phabricator.wikimedia.org/T264952 (Jdlrobson) Yeh unfortunately to use localStorage there are a few hurdles. You might want to consider using the existing mediawiki abstra...
[18:01:18] ok the fix has been run, I'll watch the charge bot to just double check everything
[18:02:08] fundraising-tech-ops: templatize smashpig main.yaml configuration in fundraising::smashpig - https://phabricator.wikimedia.org/T265162 (Jgreen)
[18:03:37] PROBLEM - check_apache2 on payments1004 is CRITICAL: PROCS CRITICAL: 0 processes with command name apache2
[18:03:43] PROBLEM - check_mysql on payments1004 is CRITICAL: Cant connect to local MySQL server through socket /var/run/mysqld/mysqld.sock (2)
[18:03:43] PROBLEM - check_payments_wiki on payments1004 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - string OK not found on https://payments1004.frack.eqiad.wmnet:443https://payments.wikimedia.org/index.php/Special:SystemStatus - 384 bytes in 0.009 second response time
[18:03:47] PROBLEM - check_puppetrun on payments1004 is CRITICAL: CRITICAL: Puppet has 66 failures. Last run 1 minute ago with 66 failures. Failed resources (up to 3 shown): Package[libapache2-mod-php],Package[aide-common],Package[libapache2-mod-security2],Package[python-netaddr]
[18:10:14] looks like that's all from the upgrade?
[18:12:51] thanks cstone
[18:13:31] I'mma be afk for a bit but when I'm back I'd be happy to talk through the per-country monthlyConvert stuff
[18:13:43] RECOVERY - check_apache2 on payments1004 is OK: PROCS OK: 7 processes with command name apache2
[18:13:53] RECOVERY - check_puppetrun on payments1004 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[18:18:15] (PS11) Cstone: Create $wgMonthlyConvertCountries Allows turning on base monthlyConvert variant by country [extensions/DonationInterface] - https://gerrit.wikimedia.org/r/632793 (https://phabricator.wikimedia.org/T250918) (owner: Mepps)
[18:18:37] RECOVERY - check_mysql on payments1004 is OK: Uptime: 94 Threads: 10 Questions: 168 Slow queries: 0 Opens: 34 Flush tables: 1 Open tables: 28 Queries per second avg: 1.787
[18:18:43] RECOVERY - check_payments_wiki on payments1004 is OK: HTTP OK: HTTP/1.1 200 OK - 417 bytes in 0.026 second response time
[18:28:28] fr-tech, looks like there's a poll running on vagrant usage, I imagine our data might be useful to them.
[18:28:31] poll is here: https://phabricator.wikimedia.org/T265164
[18:30:53] !log upgrading payments1003 to buster
[18:30:57] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[18:50:19] AndyRussG, a while back you sent me some info on pageviews_hourly schema (hive) - I can't seem to find it, do you have it handy?
[18:51:44] I'm trying to confirm my assumption that we don't track referrer (page or source) on pageviews. I'm 99.999% sure that is true
[18:52:42] eyener: I think we do?
[18:52:52] but maybe not in Druid
[18:54:37] I'm looking around anywhere really...it doesn't seem like the kind of thing we'd have?
[18:55:17] eyener: so it looks like it's not in pageviews, but it is in webrequests, which pageviews is derived from, I just saw
[18:55:20] https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Traffic/Pageview_hourly
[18:55:38] https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Traffic/Webrequest
[18:56:10] and webrequest has an is_pageview field that you can filter on
[18:56:44] ah cool AndyRussG. it looks like the hive table wmf.pageview_hourly actually has that `referer_class` attribute as well but I didn't know what it meant
[18:57:03] yeah looks like referer_class is much less complete
[18:57:29] eyener: as regards the earlier point, you can get beacon/impression calls directly in Hive, with the full data and all the fields that get sent. Much more complete than what gets sent to pgeheres, which it is a source for
[18:57:56] beacon/impression calls are webrequests, so you have to query that table and filter for the url path of beacon/impression, basically
[18:59:58] aaah this is really cool (webrequests) AndyRussG!
[19:00:30] re: beacon/impression : I thought that nothing is (currently) sent to pgeheres?
[19:02:00] do you know what DB beacon/impression would be in (Hive)? I'm interested to check it out
[19:05:34] (PS2) Cstone: Add nl,it,nb,pl,pt,ro,es translations for failed recurring messages. [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/632365
[19:14:14] !log upgrading payments1002 to buster
[19:14:18] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:55:01] eyener: beacon/impression is in webrequests
[19:55:22] webrequests is just all the communication between browsers (or other programs that act like browsers) and our servers
[19:55:43] it's pretty well the raw web logs, like what you normally get output in text form from a web server
[19:56:09] so beacon/impression is just one of many things like that, it's a bunch of needles in that giant haystack
[19:56:40] it's a web request because that's how the browser sends that information about centralnotice operation to our servers
[19:56:54] and it is used as the basis for the banner impression data in pgeheres
[20:04:06] !log upgrading payments1001 to buster
[20:04:10] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:11:47] PROBLEM - check_mysql on payments1003 is CRITICAL: Slave IO: Connecting Slave SQL: Yes Seconds Behind Master: (null)
[20:13:47] PROBLEM - check_mysql on payments1002 is CRITICAL: Slave IO: Connecting Slave SQL: Yes Seconds Behind Master: (null)
[20:23:47] RECOVERY - check_mysql on payments1002 is OK: Uptime: 2256 Threads: 12 Questions: 2819 Slow queries: 0 Opens: 35 Flush tables: 1 Open tables: 29 Queries per second avg: 1.249 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 0
[20:26:45] RECOVERY - check_mysql on payments1003 is OK: Uptime: 5533 Threads: 11 Questions: 6833 Slow queries: 0 Opens: 34 Flush tables: 1 Open tables: 28 Queries per second avg: 1.234 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 0
[20:31:44] !log upgrading pay-lvs1002 to buster
[20:31:48] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:41:51] !log upgrading pay-lvs1001 to buster
[20:41:55] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:42:27] PROBLEM - check_puppetrun on pay-lvs1002 is CRITICAL: CRITICAL: Puppet has 11 failures. Last run 4 minutes ago with 11 failures. Failed resources (up to 3 shown): File[/etc/rssh.conf],File[/etc/vim/vimrc.local],File[/etc/update-motd.d/99-footer],File[/etc/motd.tail]
[20:44:04] that error must be laggy. just tested and there is a clean puppet run.
[20:46:42] hmm AndyRussG sorry, I'm not quite following where to find beacon/impression - I'm looking in the wmf.webrequest table
[20:47:27] RECOVERY - check_puppetrun on pay-lvs1002 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[20:49:53] eyener: right so that's the massive table with allll the interactions between servers and web browsers (and similar)
[20:50:06] you can filter that in many ways
[20:50:22] in this case, what distinguishes beacon/impression is the URL
[20:51:04] eyener: specifically you'll want to filter for uri_path=beacon/impression, or something similar
[20:51:20] since the path is the part of the URL after the server name
[20:51:27] aaah thanks!
[20:51:39] then you'll want to parse the uri_query field for the actual contents
[20:51:46] eyener: we have example code that does all that, one sec
[20:52:00] oooh awesome
[20:52:53] eyener: you can model queries on the code starting at line 40 here: https://gerrit.wikimedia.org/r/plugins/gitiles/analytics/refinery/+/master/oozie/banner_activity/druid/daily/generate_daily_druid_banner_activity.hql
[20:53:19] that's the code that queries Hive and generates the Druid data on impressions that you see in Superset and Turnilo
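The "parse the uri_query field" step above can be sketched in plain Python. This is an illustration only: the parameter names in the example string are hypothetical, since the real beacon/impression calls carry whatever fields CentralNotice sends (the linked .hql file shows the production parsing):

```python
from urllib.parse import parse_qs

def parse_impression_query(uri_query):
    """Flatten a webrequest uri_query string into a dict of single values.
    The leading '?' is stripped; repeated keys keep their first value."""
    parsed = parse_qs(uri_query.lstrip('?'))
    return {key: values[0] for key, values in parsed.items()}

# Hypothetical example row, filtered by uri_path as described above:
row = parse_impression_query('?banner=ExampleBanner&campaign=ExampleCampaign&country=US')
```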