[00:38:29] (PS1) Ejegg: Rudiments of arbitrary campaigns [wikimedia/fundraising/dash] - https://gerrit.wikimedia.org/r/324634 (https://phabricator.wikimedia.org/T151820) [00:38:59] fr-tech does that look like the right direction for arbitrary campaigns? ^^ [00:45:15] erranding... thanks for making it rain! [00:51:59] ohh - another 23k and .... [01:02:44] anyone else hear that bell...... [01:11:06] (CR) Krinkle: [C: 2] Remove use of deprecated "json" module [extensions/CentralNotice] (wmf_deploy) - https://gerrit.wikimedia.org/r/324375 (owner: Krinkle) [01:11:09] (Merged) jenkins-bot: Remove use of deprecated "json" module [extensions/CentralNotice] (wmf_deploy) - https://gerrit.wikimedia.org/r/324375 (owner: Krinkle) [01:12:42] Krinkle: if u like lmk when it's on prod so I can check a few URLs that trigger the full CN campaign selection code (/me turns up caution to neurosis level) Thx again!!!! [01:27:32] (PS1) Ejegg: Update c3 to 4.11 [wikimedia/fundraising/dash/src/bower_modules] - https://gerrit.wikimedia.org/r/324638 [01:27:45] (CR) Ejegg: [C: 2] Update c3 to 4.11 [wikimedia/fundraising/dash/src/bower_modules] - https://gerrit.wikimedia.org/r/324638 (owner: Ejegg) [01:28:31] (PS1) Ejegg: Update c3 library to 4.11 [wikimedia/fundraising/dash] - https://gerrit.wikimedia.org/r/324639 [01:31:50] (CR) Ejegg: [V: 2] Update c3 to 4.11 [wikimedia/fundraising/dash/src/bower_modules] - https://gerrit.wikimedia.org/r/324638 (owner: Ejegg) [01:32:58] (CR) Ejegg: [C: 2] Update c3 library to 4.11 [wikimedia/fundraising/dash] - https://gerrit.wikimedia.org/r/324639 (owner: Ejegg) [01:34:54] (Merged) jenkins-bot: Update c3 library to 4.11 [wikimedia/fundraising/dash] - https://gerrit.wikimedia.org/r/324639 (owner: Ejegg) [01:36:24] (PS2) Ejegg: Rudiments of arbitrary campaigns [wikimedia/fundraising/dash] - https://gerrit.wikimedia.org/r/324634 (https://phabricator.wikimedia.org/T151820) [02:23:11] (PS1) Ejegg: CSS fixes for new c3 version [wikimedia/fundraising/dash] - 
https://gerrit.wikimedia.org/r/324643 [02:28:14] aargh, that's 2 hrs wasted on a c3 bug and CSS [02:28:23] signing off to save my sanity [02:51:55] Did someone mention sanity? [02:52:02] * AndyRussG want some [05:40:55] (PS1) Eileen: WIP dedupe throttling [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/324655 [05:44:10] (PS2) Eileen: WIP dedupe throttling [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/324655 [05:46:35] (PS3) Eileen: WIP dedupe throttling [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/324655 [05:48:29] (PS4) Eileen: WIP dedupe throttling [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/324655 [05:55:35] (PS5) Eileen: WIP dedupe throttling [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/324655 [06:36:42] Fundraising Sprint Waiting for Godot, Fundraising-Backlog, Wikimedia-Fundraising-CiviCRM, Unplanned-Sprint-Work: Add option to abort dedupe jobs based on the volume of contributions being processed - https://phabricator.wikimedia.org/T152072#2837364 (Eileenmcnaughton) [06:37:10] Fundraising Sprint Waiting for Godot, Fundraising-Backlog, Wikimedia-Fundraising-CiviCRM, Unplanned-Sprint-Work: Add option to early-exit dedupe jobs based on the volume of contributions being processed - https://phabricator.wikimedia.org/T152072#2837377 (Eileenmcnaughton) [06:37:45] (PS6) Eileen: Add option to early-exit dedupe jobs based on the volume of contributions being processed [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/324655 (https://phabricator.wikimedia.org/T152072) [06:43:46] Fundraising Sprint Waiting for Godot, Fundraising-Backlog, Wikimedia-Fundraising-CiviCRM, Unplanned-Sprint-Work: Add option to early-exit dedupe jobs based on the volume of contributions being processed - https://phabricator.wikimedia.org/T152072#2837382 (Eileenmcnaughton) This code is currently... 
[13:50:30] Fundraising Sprint Value Subtracting, Fundraising Sprint Waiting for Godot, Fundraising-Backlog, Wikimedia-Fundraising-CiviCRM, FR-2016-17-Q2-Bugs: Engage import failing to import certain significant fields - https://phabricator.wikimedia.org/T146295#2838156 (CCogdill_WMF) Hm, so I see your p... [14:38:15] Fundraising-Analysis, Fundraising-Backlog: Create new git repository for fundraising stats tools - https://phabricator.wikimedia.org/T151982#2838254 (JakeTheDeveloper) a:JakeTheDeveloper>None Ohh I thought this was a different task. Wrong task. [16:31:03] Fundraising-Backlog, fundraising-tech-ops: migrate fundraising.listener.org off of the civicrm webserver - https://phabricator.wikimedia.org/T152106#2838575 (Jgreen) [16:39:19] hey AndyRussG you around? fr-online is talking about restricting impressions to people who have seen more than 10 banners. However... is it possible that this is being calculated wrong as well? Could we be showing tons of banners and not counting them? Maybe to people who have already donated? Is it possible "best day ever" = "most annoying banners ever" ? [16:55:27] dstrine: hi! I don't think we're showing banners to people who have already donated... [16:59:53] dstrine: where can I see whatever calculations/reports people are using? Is this on the dash? [17:04:38] AndyRussG: ask that question on the impressions thread I started yesterday [17:16:38] Where is that pesky list of Drupal modules to enable... [17:17:04] aha. sites/default/enabled_modules [17:19:51] (PS1) Awight: Revert "Link to the correct configuration form" [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/324757 [17:26:09] crm/bin install scripts FTW. though we need to add a "drush en `cat sites/default/enabled_modules`" [17:27:09] the-wub: we're in business! https://github.com/wikimedia/wikimedia-fundraising-stats [17:27:22] I'll upload the existing code to that repo momentarily... [17:27:43] oh. breakfast [17:28:21] random sql question. 
is there a variable (i'm hoping in civicrm_contribution) that one could use to differentiate a banner contribution from an email contribution? [17:33:54] dstrine: utm_medium in drupal.contribution_tracking [17:34:23] see also https://www.mediawiki.org/wiki/Fundraising_tech/Database_schema#Joining_civicrm_contribution_and_contribution_tracking [17:41:32] thanks! I haven't played with anything in drupal yet. very cool [17:47:33] dstrine: max just sent a report! [17:49:39] two reports, actually... Checking them out! It's all apparently reproducible too, since it's in pybook format [18:00:02] fr-tech: You have been selected for a secret mission. [18:00:02] -- discuss. [18:07:37] ahh, I broke the cutoff filter in that bandaid fix to get the November days to show up [18:09:27] fr-tech this one will fix the cutoff filter (as well as get ready for restoring year switches): https://gerrit.wikimedia.org/r/324634 [18:11:32] i better get dash working locally [18:12:24] I need to do that as well [18:13:36] going to need a replica of a 2013 vintage node environment [18:18:30] ejegg: if this project now has long term goals, don't you think it'd be worth porting to a supported node version? 
[18:18:56] it seems like a lot of work getting sunk into something where you might well hit a wall [18:19:30] I thought we were talking about a new framework tho [18:19:53] the version it's on isn't even on the "no longer supported" list: https://github.com/nodejs/LTS [18:20:14] hah, super ancient [18:20:52] yeah, we're just waiting on the box to get upgraded to a semirecent OS [18:21:58] I'm trying not to touch any of the node stuff - just SQL and javascript [18:22:45] since it has npm we should be able to get whatever version [18:22:54] the node workflow is pretty much separate from apt [18:23:19] at this point i think it's apt-get install npm && npm install n && n use 0.8.2 (for example) [18:26:09] (PS1) Awight: Copy files from /srv/br [wikimedia/fundraising/stats] - https://gerrit.wikimedia.org/r/324762 [18:26:14] the-wub: ^ want to bless that? [18:31:31] ejegg: was there ever any progress on getting dash into vagrant? [18:31:41] do you run it under vagrant? [18:32:08] nah, I haven't been using that much at all [18:32:22] partly just to extend battery life [18:32:41] yeah [18:32:53] i'd like to try the docker back end [18:36:20] cwd: ooh +1 to that yo [18:36:33] I mean, whatever backend but yes getting it into vagrant [18:38:54] (CR) Awight: [C: 2] "Self-merging import from the statistics staging server." [wikimedia/fundraising/stats] - https://gerrit.wikimedia.org/r/324762 (owner: Awight) [18:39:12] seems like the right solution to a time capsule environment [18:43:48] (CR) Awight: [V: 2] Copy files from /srv/br [wikimedia/fundraising/stats] - https://gerrit.wikimedia.org/r/324762 (owner: Awight) [18:44:05] dstrine: on first glance the BH data looks fine. I should dig into the code they're using to check that it, I guess [18:44:07] totes. 
also a good solution to tricky provisioning [18:44:45] I'll also reply to the question about tuning the impressions-pageview checks, and send the link for the related task [18:51:23] Fundraising-Backlog, MediaWiki-extensions-CentralNotice: Central Notice: possible CN issue early on December 1st UTC - https://phabricator.wikimedia.org/T152122#2839139 (DStrine) [18:51:37] AndyRussG: FYI^^ [18:51:54] Fundraising-Backlog, MediaWiki-extensions-CentralNotice: Central Notice: possible CN issue early on December 1st UTC - https://phabricator.wikimedia.org/T152122#2839154 (DStrine) p:Triage>High [18:52:03] Fundraising-Backlog, MediaWiki-extensions-CentralNotice: Central Notice: possible CN issue early on December 1st UTC - https://phabricator.wikimedia.org/T152122#2839139 (awight) ``` select sum(count) as impressions, hour(timestamp) as hour from bannerimpressions where timestamp between '20161201000000' a... [18:53:14] Fundraising-Backlog, MediaWiki-extensions-CentralNotice: Central Notice: possible CN issue early on December 1st UTC - https://phabricator.wikimedia.org/T152122#2839170 (awight) The same query for Nov 30 shows that impressions were almost perfectly flat during that hour. [18:54:19] dstrine: awight ahhhhmmmmmm, looking.... [18:56:03] Fundraising-Backlog, MediaWiki-extensions-CentralNotice: Central Notice: possible CN issue early on December 1st UTC - https://phabricator.wikimedia.org/T152122#2839196 (awight) Higher resolution on the drop: ``` select sum(count) as impressions, hour(timestamp) as hour, minute(timestamp) as minute from... [18:56:42] awight: ur query is on the pgehres database, right? Looks like the lowest point is at 8 am, not 6? [18:57:07] Let's try pulling from Hive. It has a random sample function that is prolly useful [18:57:12] K4-713: I have exact minutes in which banner service faltered, see the task. 08:10 - 09:02, but it ramps back up slowly. [18:57:18] We needa ask ops [18:57:36] Yeah, I saw that. 
Jeff_Green said he didn't see anything interesting... [18:59:34] AndyRussG: yep that was pgehres... [18:59:55] how do impressions vs landingpages line up? [19:00:35] because the divot in the landingpages log isn't as dramatic as it is in the impressions stuff from the pgehres db [19:03:19] awight: are you saying the same thing happened yesterday? [19:05:50] Jeff_Green: I think emails make up the difference [19:06:08] Fundraising-Backlog, FR-Adyen: Adyen: setup a form for Canada - https://phabricator.wikimedia.org/T152123#2839215 (DStrine) [19:08:00] looking at raw beaconimpression logs, we start a dropoff right around 8:07 [19:09:14] Huh. awight had it at 8:10. [19:09:34] AndyRussG: Are you tracking this? ^^ [19:09:39] K4-713: [19:09:42] Yes [19:09:53] K4-713: Jeff_Green: awight: can we check banner loader logs too? [19:09:54] okay, groovy. that's all I need to know for now, I think. [19:10:35] AndyRussG: I have provided just about all the info I can. [19:10:39] K4-713: mmm never know what you might have needed to know but didn't know you needed (to know) [19:10:41] thx!!! [19:11:02] awight: Jeff_Green: can we etherpad to share queries and results? [19:11:29] https://etherpad.wikimedia.org/p/T152122_notes [19:12:07] AndyRussG: we've already started a phab ticket, should we just continue there? [19:12:56] Jeff_Green: also! I've found the etherpad is fun sometimes to tweak and share queries without clogging up the eternally logged public discussion on the ticket... but whatever you prefer is great! :) [19:13:43] (but for example: what u just said about raw beaconimpression logs...u pulled that from Hive i guess? or elsewhere?) [19:14:43] I grepped collected logs along the banner impression pipeline, it's ultimately out of kafka [19:14:59] Jeff_Green: ah K thx :) [19:15:12] we have two established collectors to feed the pgehres db [19:15:36] We didn't have any other more general issues at that time? What do pageviews in general look like at that time? 
[19:15:52] Maybe some more general Internet outage like we had a month or so ago? [19:16:39] Fundraising-Backlog, MediaWiki-extensions-CentralNotice: Central Notice: possible CN issue early on December 1st UTC - https://phabricator.wikimedia.org/T152122#2839139 (Jgreen) line counts from beaconImpressions logs in the banner impression pipeline: 2016-12-01T08:00 5494 2016-12-01T08:01 5296 2016-12... [19:16:57] whatever it was really looks like it took exactly an hour [19:18:53] dstrine: what do you think of me pulling this in: https://phabricator.wikimedia.org/T99869 ? some renewed interest. [19:20:16] cwd: can we discuss at standup? We have a couple things to sort. I just want to discuss the whole picture [19:20:27] sounds good [19:22:14] (PS1) Awight: Add gitignore and tox support [wikimedia/fundraising/stats] - https://gerrit.wikimedia.org/r/324774 [19:25:43] Fundraising-Backlog: Use referrer and change utm_medium for social media sources - https://phabricator.wikimedia.org/T152125#2839292 (DStrine) [19:30:40] Jeff_Green: awight: just to note, I don't see any deployments or prod deploy logs at that time... (https://wikitech.wikimedia.org/wiki/Server_Admin_Log, https://wikitech.wikimedia.org/wiki/Deployments#Thursday.2C.C2.A0December.C2.A001) [19:37:09] Fundraising Dash, Fundraising-Backlog, MediaWiki-Vagrant: create vagrant role for fundraising dash - https://phabricator.wikimedia.org/T99869#2839338 (DStrine) [19:43:01] Hmmm how do I group by minute in Hive? [19:43:27] Fundraising-Backlog, Wikimedia-Fundraising: Thank You page to Facebook error - https://phabricator.wikimedia.org/T152026#2835881 (Ejegg) Let's update Special:FundraiserRedirector to use the GeoIP cookie when the country code on the URL is invalid - it currently ignores the cookie if there's any country p... 
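On the 19:43 question about grouping by minute, and the per-minute line counts posted to the task at 19:16: once rows are exported, the same bucketing can be done outside Hive. This is only a minimal Python sketch, not the query anyone ran. It assumes each event carries an ISO 8601 timestamp string; the function name `count_per_minute` and the sample timestamps are made up for illustration.

```python
from collections import Counter
from datetime import datetime


def count_per_minute(timestamps):
    """Count events per minute.

    `timestamps` is an iterable of strings such as "2016-12-01T08:00:05"
    (an assumed format, not taken from the real logs).
    """
    counts = Counter()
    for ts in timestamps:
        parsed = datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S")
        # Truncate to the minute so all events in that minute share a key.
        counts[parsed.strftime("%Y-%m-%dT%H:%M")] += 1
    return dict(sorted(counts.items()))


# Hypothetical sample data, not real impression logs.
sample = [
    "2016-12-01T08:00:05",
    "2016-12-01T08:00:59",
    "2016-12-01T08:01:10",
]
per_minute = count_per_minute(sample)
for minute, n in per_minute.items():
    print(minute, n)
```

The output has the same shape as the "2016-12-01T08:00 5494" lines in the task comment: one minute key followed by a count.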
[19:57:17] Jeff_Green: awight: also chatting a bit with ottomata and nuria in #wikimedia-analytics :) [19:57:33] Fundraising Sprint Value Subtracting, Fundraising Sprint Waiting for Godot, Fundraising-Backlog, Wikimedia-Fundraising-CiviCRM, FR-2016-17-Q2-Bugs: Engage import failing to import certain significant fields - https://phabricator.wikimedia.org/T146295#2839425 (CCogdill_WMF) From my perspective... [20:15:14] (PS1) Ejegg: Ignore invalid country codes from query string [extensions/FundraiserLandingPage] - https://gerrit.wikimedia.org/r/324789 (https://phabricator.wikimedia.org/T152026) [20:41:49] Fundraising Dash, Fundraising-Backlog: dash does not respect the cutoff value - https://phabricator.wikimedia.org/T152138#2839651 (DStrine) [20:50:49] Fundraising-Backlog, Wikimedia-Fundraising-CiviCRM: Search for email only works if primary email address - https://phabricator.wikimedia.org/T152048#2839688 (DStrine) p:Triage>Normal [20:52:25] Fundraising Sprint Waiting for Godot, Fundraising-Backlog, MediaWiki-extensions-CentralNotice, Unplanned-Sprint-Work: Central Notice: possible CN issue early on December 1st UTC - https://phabricator.wikimedia.org/T152122#2839693 (DStrine) a:AndyRussG [20:56:18] Fundraising Dash, Fundraising-Backlog: dash does not respect the cutoff value - https://phabricator.wikimedia.org/T152138#2839651 (Ejegg) This was something I broke when I made the band-aid fix to show November 29 & 30. It's fixed by https://gerrit.wikimedia.org/r/324634 [20:56:48] Fundraising Dash, Fundraising Sprint Waiting for Godot, Fundraising-Backlog: dash does not respect the cutoff value - https://phabricator.wikimedia.org/T152138#2839719 (Ejegg) p:Triage>Normal a:Ejegg [20:59:29] Fundraising Sprint Waiting for Godot, Fundraising-Backlog, MediaWiki-extensions-CentralNotice, Unplanned-Sprint-Work: Central Notice: possible CN issue early on December 1st UTC - https://phabricator.wikimedia.org/T152122#2839731 (DStrine) p:High>Unbreak! 
[21:14:08] Fundraising Dash, Fundraising Sprint Waiting for Godot, Fundraising-Backlog: Top days / top hours widget - https://phabricator.wikimedia.org/T152028#2839778 (XenoRyet) a:XenoRyet>None [21:14:15] Fundraising Dash, Fundraising Sprint Waiting for Godot, Fundraising-Backlog, MediaWiki-Vagrant, Unplanned-Sprint-Work: create vagrant role for fundraising dash - https://phabricator.wikimedia.org/T99869#2839781 (DStrine) [21:15:04] Fundraising Sprint Waiting for Godot, Fundraising-Backlog, FR-Adyen, Unplanned-Sprint-Work: Adyen: setup a form for Canada - https://phabricator.wikimedia.org/T152123#2839791 (DStrine) a:XenoRyet [21:18:07] Fundraising-Backlog: Use referrer and change utm_medium for social media sources - https://phabricator.wikimedia.org/T152125#2839820 (DStrine) p:Triage>Normal [21:25:21] Fundraising Sprint Waiting for Godot, Fundraising-Backlog, MediaWiki-extensions-CentralNotice, Unplanned-Sprint-Work: Central Notice: possible CN issue early on December 1st UTC - https://phabricator.wikimedia.org/T152122#2839854 (AndyRussG) Just to note: - CentralNotice logs don't show any chang... [21:30:01] Fundraising Sprint Waiting for Godot, Fundraising-Backlog, MediaWiki-extensions-CentralNotice, Unplanned-Sprint-Work: Central Notice: possible CN issue early on December 1st UTC - https://phabricator.wikimedia.org/T152122#2839903 (AndyRussG) Filtering logs of requests to Special:BannerLoader show... [21:42:03] Fundraising-Backlog, Wikimedia-Fundraising, Patch-For-Review: Thank You page to Facebook error - https://phabricator.wikimedia.org/T152026#2839952 (Pcoombe) That sounds like a great idea, thanks @Ejegg ! [22:03:06] AndyRussG|sortof: RoanKattouw explained that the es* servers are actually "external store", where wiki text content lives. [22:03:42] That isn't a particularly smoky gun however, cos there are caching layers in front of the store that would probably survive even if the store were suddenly empty. 
[22:04:41] awight: Another weird thing is that the impression rate didn't go down to 0 or anywhere close [22:04:48] It stayed north of 900/min the whole time [22:05:07] While that's 20-30x lower than before, it was definitely still working for some people [22:05:39] *right [22:05:46] That's why I want to check data center [22:06:15] although we determined that donation counts went down in a distributed way across the three continents where we're fundraising [22:06:28] (fyi, ('AU', 'CA', 'IE', 'NZ', 'US', 'GB')) [22:07:01] ah--ooh. right, we still need to tally *which banners were still appearing during the dip [22:07:45] awight: RoanKattouw: exactly, let's try to sort out what was working... Maybe also group by device, country, language, project? [22:08:05] status in beacon/impression [22:08:09] both before and after the dip [22:08:15] Back fully soon! [22:08:42] Take yr time! [22:11:27] Fundraising Sprint Waiting for Godot, Fundraising-Backlog, MediaWiki-extensions-CentralNotice, Unplanned-Sprint-Work: Central Notice: possible CN issue early on December 1st UTC - https://phabricator.wikimedia.org/T152122#2839139 (BBlack) What about the non-geotargeted, did they drop off as well?... [22:16:31] I'm probably being totally obnoxious here, but: If it was a GeoIPLookup problem, users with their GeoIP info already set would be able to pull banners just fine, right? [22:16:43] * K4-713 waits for someone to tell me to scroll up [22:17:03] We stuff that in a cookie, right? [22:17:30] k4 yep, that would totally explain a 90+% dropoff rather than total failure [22:17:49] * K4-713 squints suspiciously [22:18:03] check also ip version? [22:18:14] Have we looked at platforms? [22:18:26] Again, I'm not sure anybody has 90% market share. [22:18:33] But, worth being explicit. [22:19:10] (see last query in the etherpad) I found that it's both mobile and desktop. Seems that all banners are not affected equally. However, it's not just WMF FR banners. [22:19:26] Browser? 
[22:20:41] donno yet, but that would be odd [22:21:12] Yeah... that would be like one tiny browser in a sea of choices doing something obnoxiously which allowed things to work. [22:22:11] Can't even imagine what that could possibly be. [22:24:38] Also, why am I having mad deja-vu right now? [22:25:20] geoip cookie lasts for quite awhile, i'd actually be surprised if only 10% of browsers have been to wikipedia lately [22:25:32] awight: Is this starting to ring tiny bells for you too, or is that just me? [22:25:51] It's not :( [22:26:01] k see you in another window [22:26:04] yep [22:26:20] * K4-713 removes tiny bells [22:35:21] https://aws.amazon.com/snowmobile/ [22:36:58] haha [22:37:04] wait, really? [22:38:14] Back in school my networks professor used to say: "Never underestimate the bandwidth of a station wagon full of hard drives on the highway." [22:41:24] crazy [22:41:54] K4-713: the only thing that rings a bell for me is an unexplained brief gap in banner histories last year, shortly after BH was turned on [22:44:49] We had been planning to investigate but eventually that drifted off the edge of the planning world (which is flat, as everyone knows) [22:47:12] K4-713: Yeah when I saw 90% I was thinking well maybe that's how many people have a cold cache [22:50:13] RoanKattouw: what do u mean? [22:50:51] like, no RL modules, no geo cookie? [22:54:22] Looks like cwd has a good point ^ GeoIP cookies last for a while. 
The one I just got has a pull date of 1 year [22:55:18] would be interesting to see how many page views those cookies last in general [22:55:40] i gotta run some errands but i will be back later and finish this vagrant thing [22:56:13] hmm, there was one geotargeted campaign that didn't seem to be affected though: WMES_Wiki_Loves_Folk_2016 [22:56:28] AndyRussG: ejegg: Yeah exactly, a number of people come to the site with no GeoIP cookie, no RL stuff in localStorage, no RL code in the browser cache, basically as if they've never been to the site before [22:56:46] But some people do have cookies and stuff in their cache [22:57:07] Fundraising Sprint Waiting for Godot, Fundraising-Backlog, MediaWiki-extensions-CentralNotice, Unplanned-Sprint-Work: Central Notice: possible CN issue early on December 1st UTC - https://phabricator.wikimedia.org/T152122#2840168 (AndyRussG) >>! In T152122#2840070, @BBlack wrote: > What about the... [22:57:27] targeted to EA, ES, and IC, and impressions rise steadily from 0700-1000 [23:04:13] Fundraising Sprint Waiting for Godot, Fundraising-Backlog, MediaWiki-extensions-CentralNotice, Unplanned-Sprint-Work: Central Notice: possible CN issue early on December 1st UTC - https://phabricator.wikimedia.org/T152122#2839139 (Ejegg) There wasn't much non-geotargeted stuff running - just Czech... [23:14:54] RoanKattouw: I think the LocalStorage RL cache is still off for FF users? (not that it's really a possible cause... [23:14:56] ) [23:15:11] Yeah but only for people who have been to lots of wikis [23:15:14] (i.e. 
people like us) [23:16:24] Hmmm [23:17:29] At this point the most plausible cause I can imagine is some DB issue whereby ChoiceData (which sends possible campaign options) was somehow corrupt [23:17:40] GeoIP seems to be out [23:17:52] Yeah Adam and I just chatted IRL [23:18:18] He suggested we look at what country codes were logged both for displayed banners as well as for "there is no banner to display" decisions [23:18:29] To see if there are weird country codes like EU or RF in there [23:18:51] And I suggested that we look at whether the quantity of (banner impressions + no-banner-to-display events) changes [23:18:53] *changed [23:19:03] Except there is no impression log for "there is no campaign to display" decisions [23:19:05] If there was a shift from impressions to no-banner, then that sum would be stable [23:19:17] Hmph, he seemed to suggest there was some sort of logging for that [23:19:28] But I don't know enough about CN etc to know what's logged let alone how to get it [23:19:40] The geotargeting decision is made at the campaign selection stage, no beacon/impression logging for that [23:20:01] Hmm. He said something about a URL param, I'll ask him in a minute, he's finishing something now [23:21:18] I guess if it were somehow a weird partial GeoIP outage--i.e., some countries were being blanked, others not--we might see results ejegg just posted to the task [23:21:37] * AndyRussG 's always finishing a finite number of unfinishable things [23:24:24] RoanKattouw: fr-tech I'd like to craft a Hive query that would pull a lot of data for three full hours (7, 8 and 9 hrs today) and that we could download in csv and fiddle around with more comfortably. Does that sound like a good idea? [23:24:48] (ohnoes, RoanKattouw now knows our secret ping word) [23:25:00] +1 [23:25:17] that's what I've been doing, too. 
I usually jam it into mysql locally [23:26:28] AndyRussG: awight just did the NOT IN version of your query [23:26:36] To see if the non-English countries were flat(tre) [23:26:39] *flat(ter) [23:29:43] Ah... Result? [23:29:52] It's... hm [23:30:06] 13000 outside the dip, 3300 inside it [23:30:13] what about the IN version? [23:30:15] Countries? [23:30:19] * awight looks at task [23:30:28] yes. 0: jdbc:hive2://analytics1003.eqiad.wmnet:100> and geocoded_data['country_code'] not in ('AU', 'CA', 'IE', 'NZ', 'US', 'GB') [23:30:34] otherwise same as your query [23:30:56] Can u paste somewhere? [23:31:02] Do we have ip version somewhere? [23:31:06] Fundraising Sprint Waiting for Godot, Fundraising-Backlog, MediaWiki-extensions-CentralNotice, Unplanned-Sprint-Work: Central Notice: possible CN issue early on December 1st UTC - https://phabricator.wikimedia.org/T152122#2840263 (Ejegg) All the datacenters show the problem, but to different degr... [23:31:21] dropped it in the etherpad [23:34:37] awight: I don't understand ur "13000 outside the dip, 3300 inside it"... where did u get that? can u put the result of that query somewhere? [23:34:42] thx :) [23:34:53] yes, one moment pls [23:34:58] aww, I nuked it. [23:35:13] Well, I'm rerunning that query + a suggestion from RoanKattouw that I include DE in the NOT IN. [23:35:29] That should pretty nearly isolate the geolocated banners from the non-geolocated ones, right? [23:35:33] ... [23:35:40] lemme make a list of the banners in the etherpad [23:35:51] oof. [23:35:57] not sure what this all means [23:36:09] awight: I could only find 2 non-geotargeted things running [23:36:11] when I included DE, it goes from 6300 to 3300. 
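The IN / NOT IN comparison discussed from 23:26 onward can also be run on exported rows, which is roughly what "jam it into mysql locally" does. This is a hedged Python sketch rather than the Hive query itself: the country list is the one quoted at 22:06 and 23:30, while the row layout `(hour, country_code, impressions)`, the function name `split_by_targeting`, and all numbers are assumptions made up for illustration.

```python
from collections import defaultdict

# Country codes quoted in the chat for the fundraising campaigns.
FUNDRAISING_COUNTRIES = {"AU", "CA", "IE", "NZ", "US", "GB"}


def split_by_targeting(rows, targeted=FUNDRAISING_COUNTRIES):
    """Sum impressions per hour, split into targeted and other countries.

    `rows` is an iterable of (hour, country_code, impressions) tuples,
    a hypothetical export format.
    """
    totals = defaultdict(lambda: {"in": 0, "not_in": 0})
    for hour, country, impressions in rows:
        key = "in" if country in targeted else "not_in"
        totals[hour][key] += impressions
    return {hour: dict(v) for hour, v in sorted(totals.items())}


# Made-up rows: hour 8 stands in for the dip.
rows = [
    (7, "US", 100), (7, "DE", 40),
    (8, "US", 5), (8, "GB", 3), (8, "DE", 38),
]
split = split_by_targeting(rows)
print(split)
```

Adding a code such as DE to `targeted` moves its rows from the "not_in" side to the "in" side, the same effect as adding it to the NOT IN list in the query.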
[23:36:11] k [23:36:21] one affected, one not [23:36:45] k it's in etherpad [23:36:46] there's a definite difference between datacenters though [23:36:49] ooh [23:36:56] eqiad loses 97% [23:37:01] ulsfo only 65% [23:37:14] ejegg: ah that could just be that geotargeted countries happen to be served by that center [23:37:15] ulsfo is for caching, though [23:38:10] yeah, also definitely correlated with country [23:39:36] I'm going to go through the active banners and... bold the ones which are not in a geolocated campaign [23:39:39] (in etherpad) [23:39:42] eqiad = its region + all logged-in users [23:40:05] So ulsfo = western US + East Asia but only anons in those regions [23:40:37] I forget exactly which regions go to which data center, that's in a config file somewhere [23:43:14] Only two of the banners are not geolocated: CzechWikiCon and WikiFranca_contribution_month2016_call [23:43:18] awight: so, the non-geotargeted campaigns are such a tiny fraction of the total [23:43:20] checking whether they experienced the dip. [23:43:43] czech didn't, wikifranca did [23:43:51] but wikifranca had a tiny sample size [23:44:44] yeah it's what you said. Sample size is too small [23:44:56] Why do you say wikifranca did? [23:45:13] I only see 7 bannerimpressions sample rows [23:45:41] I just summed by hour and got 36, 19, 70 (7-8, 8-9, 9-10) [23:45:54] got... [23:45:56] ah [23:46:21] I put my query at the end of the etherpad [23:46:29] aren't there 4 datacenters? [23:47:10] Did any of the FR campaign/banners get thru? And if so, what distinguishes those? [23:47:15] codfw lost 92% and esams lost 86% [23:47:36] I mean, I think 2 datacenters running php and 2 that are only varnish and a bit, right? [23:47:47] yeah, the caching ones lost the least [23:48:13] hmm, can we tell which varnish servers ppl came through? 
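The per-datacenter figures above (eqiad 97%, codfw 92%, esams 86%, ulsfo 65%) are percentage drops between a baseline window and the dip. A minimal Python sketch of that arithmetic follows; the raw counts are invented so that two of the results echo the chat's percentages, and the function name `percent_drop` is hypothetical.

```python
def percent_drop(before, during):
    """Percentage lost going from `before` to `during`, to one decimal."""
    if before <= 0:
        raise ValueError("baseline count must be positive")
    return round(100.0 * (before - during) / before, 1)


# Invented (baseline, dip) counts per datacenter, NOT the real numbers.
counts = {
    "eqiad": (1000, 30),
    "ulsfo": (1000, 350),
}
drops = {dc: percent_drop(before, during) for dc, (before, during) in counts.items()}
for dc, drop in sorted(drops.items(), key=lambda item: -item[1]):
    print(dc, drop)
```

Sorting by drop puts the hardest-hit datacenter first, which makes it easy to check the observation that the caching-only sites lost the least.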
[23:49:19] x_cache [23:49:21] I think that's in webrequest [23:49:35] What if instead of downloading a csv we copy to a user table in Hive all the BannerLoader requests from these 3 hours? That'd make querying fast, no? [23:50:29] good call [23:51:58] I'm just imagining what a PITA it'd be to futz with more detailed data in a csv [23:52:24] I'd just go to our sampled db but there may be interesting info that's two minute to show up there, maybe? [23:52:50] too minute ("mine-OOt") [23:53:09] There are 2 data centers capable of running PHP (eqiad and codfw), but only eqiad actually runs PHP right now [23:53:21] Ah interesting! [23:53:33] (We have briefly switched to codfw as the primary in the past, and we are working on making multi-DC possible but it's a big project) [23:53:45] The caching-only DCs are ulsfo and esams [23:54:08] RoanKattouw: but for the purpose of serving RL stuff and anon users, all are pretty close to equal, no? [23:54:27] What about geolocation cookie setting? That must work across all 4 [23:54:27] Yeah there's not much difference [23:55:02] Your request first goes to the nearest data center, and if that has a cache miss (or something about your request is uncacheable, e.g. you are logged in or the request is a POST), it forwards to eqiad [23:55:37] I don't know too much about how geoip is implemented these days, but I hear it's in VCL, so that means it runs in each dc separately [23:55:55] So in theory I guess you could have something like "geoip died but only in ulsfo" but that seems strange [23:56:07] Also because it runs on Varnishes, an actual outage of geoip seems implausible [23:56:22] awight believed it was more plausible that it would have returned nonsense [23:56:40] that's some kind of cdb file lookup or something as opposed to db calls? 
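The routing rule described at 23:55:02 (nearest data center first, forward to eqiad on a cache miss or an uncacheable request) can be written down as a toy model. This Python sketch only restates that one chat message; it is not how the Varnish configuration is actually written, and the function name `serving_dc` and its parameters are hypothetical.

```python
PRIMARY_DC = "eqiad"  # per the chat, the only DC running PHP at the time


def serving_dc(nearest_dc, cache_hit, logged_in=False, method="GET"):
    """Return which data center ends up answering a request (toy model).

    A request is treated as uncacheable if the user is logged in or the
    method is POST, the two examples given in the chat.
    """
    cacheable = (not logged_in) and method != "POST"
    if cacheable and cache_hit:
        return nearest_dc
    # Cache miss or uncacheable request: forwarded to the primary DC.
    return PRIMARY_DC


print(serving_dc("ulsfo", cache_hit=True))                  # answered at the edge
print(serving_dc("ulsfo", cache_hit=False))                 # forwarded
print(serving_dc("esams", cache_hit=True, logged_in=True))  # forwarded
```

Under this model an anonymous user with a warm edge cache never reaches eqiad, which is one way to read the observation that the caching-only sites lost the least.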
[23:56:44] Fundraising Sprint Waiting for Godot, Fundraising-Backlog, MediaWiki-extensions-CentralNotice, Unplanned-Sprint-Work: Central Notice: possible CN issue early on December 1st UTC - https://phabricator.wikimedia.org/T152122#2840346 (awight) @ejegg Nice discoveries! I don't think `WikiFranca_contri... [23:56:58] isn't geoip implemented as varnish module now? [23:57:08] i.e. it runs locally on each proxy [23:57:55] Yeah it's a compiled VCL module which calls a local .so, and reads local db files. [23:58:18] Yeah exactly [23:58:37] And is the DB for that local on each proxy node? [23:58:39] I spot checked the db on one of the cp boxes earlier and it last changed 11/16th I think [23:58:50] Right [23:58:53] yeah /usr/share/GeoIP/* [23:58:54] We did see extreme weirdness just *before the first day of the fundraiser last year and I believe 2 years ago, where we were hearing that some people were geolocating to "EU" and stuff. [23:59:05] x_cache header is no smoking gun though [23:59:12] Hmm, do we still have issues with IPv6 geoip? [23:59:14] last year it operated differently I think, not 100% sure [23:59:22] RoanKattouw: with precision? [23:59:23] I know we did last year [23:59:36] Well last year IPv6 users in SF were not seeing banners at all [23:59:41] ah [23:59:42] I haven't been home enough yet to test from my connection [23:59:56] And where I was for Thanksgiving had a v4-only connection