[00:28:23] Analytics, Product-Analytics, SDC General, Wikidata: Data about how many file pages on Commons contain at least one structured data element - https://phabricator.wikimedia.org/T238878 (Nuria) After looking at this for a bit with @Ladsgroup and @mpopov (cc @Abit ) A commons page can have data fr...
[00:30:04] Analytics, Analytics-Kanban, Product-Analytics, SDC General, Wikidata: Create reportupdater reports that execute SDC requests - https://phabricator.wikimedia.org/T239565 (Nuria) Please see my comment on https://phabricator.wikimedia.org/T238878#5726624 Seems like the 7.9 million items are fro...
[00:31:44] Analytics, MediaWiki-Cache, Research: Creating a wikipedia CDN caching trace - https://phabricator.wikimedia.org/T239885 (1a1a11a) Open→Resolved a:1a1a11a Hi all, Thank you for the quick reply! I have been reading the task thread @mforns has pointed out and the 2016 task thread, they...
[00:40:36] Analytics, Product-Analytics, SDC General, Wikidata: Data about how many file pages on Commons contain at least one structured data element - https://phabricator.wikimedia.org/T238878 (Nuria) Select for @mpopov to look at ` select count(distinct eu_page_id) from mediawiki_page as P JOIN mediawi...
[00:43:05] milimetric: can you take a look at my patch with bot changes and let me know if the naming is OK?
[05:06:07] PROBLEM - Check the last execution of wikimedia-discovery-golden on stat1007 is CRITICAL: CRITICAL: Status of the systemd unit wikimedia-discovery-golden https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[05:33:56] (CR) Milimetric: "I don't think you staged your changes, PS7 and PS8 seem the same" (3 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/552943 (https://phabricator.wikimedia.org/T238360) (owner: Nuria)
[07:27:02] (PS9) Nuria: [WIP] Table and workflow for features computations per session [analytics/refinery] - https://gerrit.wikimedia.org/r/552943 (https://phabricator.wikimedia.org/T238360)
[07:32:23] hola nuria!
[07:46:39] RECOVERY - Check the last execution of wikimedia-discovery-golden on stat1007 is OK: OK: Status of the systemd unit wikimedia-discovery-golden https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[08:02:23] Hi team
[08:09:25] bonjour!
[10:02:01] a-team: can someone give me access to https://phabricator.wikimedia.org/T226730? I'm trying to understand https://phabricator.wikimedia.org/T239672 and saw it mentioned as the reason for the change in the treatment of Special page views.
[10:06:43] Analytics, Event-Platform, User-Elukey: Missing Event Stream's stream for #central logins - https://phabricator.wikimedia.org/T240182 (elukey) Users in #central are: ` 10:56 :: Users #central 10:56 [@rc-pmtpa] [ natuur12] [ Steinsplitter] 10:56 [ BanBot ] [ GVMBot] [ RxyBotLT] [ SULW ] ` Fe...
[10:12:50] Analytics, Product-Analytics: Many special pages missing from pageview_hourly dataset starting on July 23, 2019 - https://phabricator.wikimedia.org/T239672 (Neil_P._Quinn_WMF) @Nuria I don't understand the logic of whitelisting //only// those three special pages (Search, RecentChanges, and Version). The...
[10:17:47] Analytics, Event-Platform, User-Elukey: Event Stream's stream to replace irc.wikimedia.org's #central channel feed - https://phabricator.wikimedia.org/T240182 (elukey)
[11:06:38] Analytics, Event-Platform, User-Elukey: Event Stream's stream to replace irc.wikimedia.org's #central channel feed - https://phabricator.wikimedia.org/T240182 (elukey) @Legoktm @Krinkle hi :) do you have any context about the above bots? And if we need to keep #central feed long term?
[11:08:28] Analytics, Analytics-Kanban, Event-Platform, User-Elukey: Documentation improvements for Eventstreams - https://phabricator.wikimedia.org/T240181 (elukey) >>! In T240181#5724071, @Ottomata wrote: >> Description of parameters in https://stream.wikimedia.org/?doc#/Streams/get_v2_stream_recentchange...
[11:13:30] Analytics, Operations, SRE-Access-Requests: Add accraze to analytics-privatedata-users - https://phabricator.wikimedia.org/T240243 (jcrespo) @Nuria See original request at T226204#5279623 where @Ottomata suggested this group but was not added. Is this something you approve, as an addendum to the orig...
[11:17:46] Analytics, Operations, SRE-Access-Requests: Add accraze to analytics-privatedata-users - https://phabricator.wikimedia.org/T240243 (elukey) To add some context, this was originated by me not finding the user among analytics-privatedata-users when trying to add kerberos credentials. All the members of...
[11:26:33] Analytics, Operations, SRE-Access-Requests, Patch-For-Review: Add accraze to analytics-privatedata-users - https://phabricator.wikimedia.org/T240243 (jcrespo) ^I have prepared the patch to merge it as soon as everybody agrees.
[11:27:14] Analytics, Operations, SRE-Access-Requests, Patch-For-Review: Add accraze to analytics-privatedata-users - https://phabricator.wikimedia.org/T240243 (jcrespo) a:Nuria Please reassign to me when ok or if there are comments.
[11:35:48] * elukey lunch!
[13:22:14] Analytics, ArticlePlaceholder, Wikidata, Wikidata-Campsite, wikidata-tech-focus: ArticlePlaceholder dashboard stopped tracking page views - https://phabricator.wikimedia.org/T236895 (Addshore)
[13:34:46] hellooo
[13:37:24] o/
[13:47:26] elukey: i need to restart varnishkafka-webrequest & varnishkafka-eventlogging. there is a note on the service restart guide to a) highlight the activity to analytics (can i consider this done or is more lead time required?) b) do the change in small batches; how small should the batches be and how long should i leave between each one?
[13:48:43] jbond42: hey, thanks for the heads up! The purpose of the small batches is to do it gently, say 3/4 instances at a time, and leave a few seconds to let them recover, nothing more
[13:49:33] so i can just run with `cumin -s5 -b 3`?
[13:50:25] yes exactly
[13:50:39] cool, am i good to kick it off now?
[14:07:42] Analytics, Operations, ops-eqiad: Degraded RAID on dbstore1003 - https://phabricator.wikimedia.org/T239217 (Marostegui) Disk replaced by John and I can see it rebuilding: ` root@dbstore1003:~# megacli -PDRbld -ShowProg -physdrv[32:4] -aALL Rebuild Progress on Device at Enclosure 32, Slot 4 Completed...
[14:08:50] !log rolling restart of varnishkafka-webrequest and varnishkafka-eventlogging
[14:08:51] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[14:12:13] Analytics, Performance-Team (Radar): Send X-Analytics information from Varnish to Hadoop with VCL_Log - https://phabricator.wikimedia.org/T196558 (Gilles)
[14:12:36] Analytics, Performance-Team (Radar): Send X-Analytics information from Varnish to Hadoop with VCL_Log - https://phabricator.wikimedia.org/T196558 (Gilles) @ema what's the current state or plan for this in ATS?
[14:14:13] Analytics, Product-Analytics, SDC General, Wikidata: Data about how many file pages on Commons contain at least one structured data element - https://phabricator.wikimedia.org/T238878 (matthiasmullie) > Ping @matthiasmullie to assess this info That is correct, except for 1 little detail: wbc_ent...
[14:15:02] Analytics, Operations, ops-eqiad: Degraded RAID on dbstore1003 - https://phabricator.wikimedia.org/T239217 (Jclark-ctr) Open→Resolved Replaced Failed Drive
[14:16:21] Analytics, Analytics-Kanban, Product-Analytics, SDC General, Wikidata: Create reportupdater reports that execute SDC requests - https://phabricator.wikimedia.org/T239565 (matthiasmullie) I'm not really sure what number we want to go with, but I can probably help clarify what kind of data is i...
[14:18:04] Analytics: Check home leftovers of maxsem - https://phabricator.wikimedia.org/T239047 (elukey) Open→Resolved a:elukey All cleaned up!
[14:18:59] Analytics: Check home leftovers of dfoy - https://phabricator.wikimedia.org/T239571 (elukey) @DFoy if you are still reading phab notifications, can you tell us if the above can be removed?
[14:19:07] Analytics, Analytics-Kanban, Event-Platform, User-Elukey: Documentation improvements for Eventstreams - https://phabricator.wikimedia.org/T240181 (Ottomata) > Just to understand, does this need a deployment of ES or something else? Ya, just haven't done that (need to rebuild scap deploy repo, et...
[14:19:23] Analytics: Check home leftovers of dfoy - https://phabricator.wikimedia.org/T239571 (elukey) @Nuria can you triple check the above files and let me know if we can delete?
[14:20:24] Analytics, Operations, ops-eqiad: analytics1057's BBU is faulty - https://phabricator.wikimedia.org/T239045 (elukey) Open→Resolved I have set puppet to check for WriteThrough, not WriteBack, so alarms will go away. This host will be refreshed during the next months.
[14:34:58] !log shutdown of stat1004 to check if it can hold a GPU
[14:34:59] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[14:35:14] ottomata: o/ - I realized today that the suggested replacement date for stat1004 is 2020
[14:35:28] but we don't have it scheduled for replacement right?
[14:38:23] Analytics, Analytics-Cluster: an-coord1001 hive metastore not listening on ipv6 - https://phabricator.wikimedia.org/T240255 (elukey) Interesting, I have never realized this. The hive daemons are running with `-Djava.net.preferIPv4Stack=true`, probably similar to all the other hadoop daemons (see T225296#...
[14:39:35] oh crazy!
[14:39:39] no
[14:39:46] but we could do it next FY?
[14:40:44] Analytics, Analytics-Cluster: an-coord1001 hive metastore not listening on ipv6 - https://phabricator.wikimedia.org/T240255 (Ottomata) I doubt we want to prefer IPv6, (do we?) but maybe we can make Hive listen on both IPs?
[14:47:28] ottomata: sure sure
[14:51:06] Hi yall!
[14:51:15] I have a notebook, and I want to publish it...
[14:51:29] I was going to generate html and add it to the data sets web space next to the data...
[14:51:52] However when exporting the HTML it loaded files from cloudflare etc, im guessing that is not okay? or is it? should i find replacements?
[14:52:02] / is there a better way / docs etc bla bla bla? :)
[14:52:55] addshore: hi! there is https://wikitech.wikimedia.org/wiki/SWAP#Sharing_notebooks
[14:54:59] Analytics-Kanban, Better Use Of Data, Event-Platform, Operations, and 8 others: Set up eventgate-logging-external in production - https://phabricator.wikimedia.org/T236386 (jlinehan) >>! In T236386#5725166, @Ottomata wrote: > @jlinehan thoughts? I'm considering moving forward with intake-{analyt...
[14:55:50] Gone for kids - back in a while
[15:14:00] Analytics-Kanban, Better Use Of Data, Event-Platform, Operations, and 8 others: Set up eventgate-logging-external in production - https://phabricator.wikimedia.org/T236386 (Ottomata) > Changing the URL is easy is not really that easy :/ Possible though. Ooook.
[15:24:44] * elukey afk, interview
[16:04:26] (CR) Mforns: [C: -1] "Left some comments, mostly typos." (8 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/552943 (https://phabricator.wikimedia.org/T238360) (owner: Nuria)
[16:14:05] Analytics, Analytics-Kanban, Event-Platform: Update MW Vagrant to work with EventLogging and EventGate changes - https://phabricator.wikimedia.org/T240355 (Ottomata)
[16:31:57] Analytics, Product-Analytics, SDC General, Wikidata: Data about how many file pages on Commons contain at least one structured data element - https://phabricator.wikimedia.org/T238878 (Addshore) Indeed it will ` mysql:research@dbstore1004.eqiad.wmnet [commonswiki]> select count(*) from wbc_entit...
[16:36:46] Analytics, Research: Section fragment information stripped from webrequests - https://phabricator.wikimedia.org/T240359 (Isaac)
[16:38:48] Analytics, Analytics-Cluster: an-coord1001 hive metastore not listening on ipv6 - https://phabricator.wikimedia.org/T240255 (elukey) Very interesting: ` /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -Xmx256m -Xms4g -Xmx10g -Xms4g -Xmx10g -Djava.net.preferIPv4Stack=false -Dcom.sun.management.jmxremote.p...
[16:46:15] Analytics, Analytics-Kanban, Product-Analytics, SDC General, Wikidata: Create reportupdater reports that execute SDC requests - https://phabricator.wikimedia.org/T239565 (Addshore) ` SELECT COUNT(*) FROM ( SELECT DISTINCT page_id FROM page INNER JOIN slots ON slot_revision_id = page_latest...
[16:52:15] Analytics, Product-Analytics, SDC General, Wikidata: Data about how many file pages on Commons contain at least one structured data element - https://phabricator.wikimedia.org/T238878 (Nuria) >but by and large, it's currently wikidata entities being pulled in via Lua It is exclusively wikidata it...
[16:54:49] Analytics, Operations, SRE-Access-Requests, Patch-For-Review: Add accraze to analytics-privatedata-users - https://phabricator.wikimedia.org/T240243 (Nuria) Approved on my end
[16:58:21] Analytics, Product-Analytics, SDC General, Wikidata: Data about how many file pages on Commons contain at least one structured data element - https://phabricator.wikimedia.org/T238878 (Addshore) >>! In T238878#5728735, @Nuria wrote: >>but by and large, it's currently wikidata entities being pulle...
[17:01:36] ping nuria milimetric stand up yoohooo
[17:01:54] elukey: for the html files thing, is it fine to have things loading js and css from cloudflare on *.wikimedia.org though?
[17:03:20] addshore: where did you get those sources? In theory if they are legit I don't see a big issue, the main problem is that you'll depend on their availability, but since it is a notebook it could be fine
[17:05:24] Analytics, Operations, SRE-Access-Requests, Patch-For-Review: Add accraze to analytics-privatedata-users - https://phabricator.wikimedia.org/T240243 (jcrespo) a:Nuria→jcrespo
[17:05:26] Analytics, Event-Platform, User-Elukey: Event Stream's stream to replace irc.wikimedia.org's #central channel feed - https://phabricator.wikimedia.org/T240182 (Nuria) From https://www.wikizero.com/de/Benutzer:GiftBot GVMBot seems to be a vandalism fighting bot, this data stream is probably not of mu...
[17:05:38] Analytics, Operations, SRE-Access-Requests, Patch-For-Review: Add accraze to analytics-privatedata-users - https://phabricator.wikimedia.org/T240243 (jcrespo) Thanks!
[17:46:39] Analytics, Analytics-Cluster: an-coord1001 hive metastore not listening on ipv6 - https://phabricator.wikimedia.org/T240255 (elukey) Ok I know what happens, this is the chain of events: 1) /etc/init.d/hive-metastore eventually calls `/usr/lib/hive/bin/ext/metastore.sh` 2) the file contains ` export HA...
[18:03:25] Analytics, Analytics-Kanban: Estimate percentage wise the number of requests on mediarequest dataset that are previews - https://phabricator.wikimedia.org/T240362 (Nuria)
[18:03:49] Analytics, Event-Platform, User-Elukey: Event Stream's stream to replace irc.wikimedia.org's #central channel feed - https://phabricator.wikimedia.org/T240182 (Nuria)
[18:03:51] Analytics, User-Elukey: Redesign architecture of irc-recentchanges on top of Kafka - https://phabricator.wikimedia.org/T234234 (Nuria)
[18:05:17] At the moment, all URLs without http(s):// are classified as referer:unknown (ex. 'en.wikipedia.org', 'google.com'). Would there be any issues with removing that condition, or is there some reason that those aren't classified like their extended counterparts?
[18:11:24] Analytics, Analytics-Kanban, Product-Analytics, SDC General, Wikidata: Create reportupdater reports that execute SDC requests - https://phabricator.wikimedia.org/T239565 (matthiasmullie) No it shouldn’t count pages more than once. UNION omits duplicates (UNION ALL doesn’t), so no need to DIST...
[18:11:59] Analytics, Multimedia, Tool-Pageviews: Statistics for views of individual Wikimedia images - https://phabricator.wikimedia.org/T210313 (Nuria)
[18:12:01] Analytics, Analytics-Kanban: Estimate percentage wise the number of requests on mediarequest dataset that are previews - https://phabricator.wikimedia.org/T240362 (Nuria)
[18:17:03] ebernhar1son: o/
[18:17:14] elukey: howdy
[18:17:19] morning :)
[18:17:26] if you have 5 mins can we chat about kerberos?
[18:17:34] elukey: sure! I'm probably misunderstanding something there
[18:18:02] ebernhar1son: nono I just want to be on the same page, I am ignorant about airflow
[18:18:35] my only fear was the need of https://airflow.apache.org/docs/stable/security.html#hadoop
[18:19:07] for example, hive in kerberos settings will be able to impersonate users (that should be proxy I believe)
[18:19:09] elukey: for the purposes of kerberos, i was thinking that kerberos shouldn't know or care that airflow exists anywhere, just whatever jobs get submitted
[18:19:33] ebernhardson: that makes sense, I am ok to allow 'airflow' to access 'analytics-search'
[18:19:39] (the keytab I mean)
[18:19:51] you can even specify the principal, so it should work
[18:20:15] but in general the idea is to always launch jobs as analytics-search via airflow
[18:20:23] yea that makes sense, spark lets me specify principal and keytab
[18:20:28] not to have users send jobs to airflow that in turn sends them to yarn etc..
[18:21:51] elukey: right, although will that run into kinit problems?
[18:21:59] basically, the tokens expiring or whatever?
[18:22:22] ebernhardson: IIUC airflow will use the keytab to renew periodically the token
[18:22:24] from the airflow perspective, when i create a spark task i set two values, principal and keytab, and it goes from there
[18:22:58] elukey: ok i'll have to check that out, so far airflow has no knowledge about kerberos. It's just some values airflow passes on to the spark-submit s
[18:23:02] spark-submit CLI command
[18:23:30] so should be as easy as https://airflow.apache.org/docs/stable/security.html#airflow
[18:23:48] spark-submit has also --principal --keytab
[18:24:13] elukey: hmm, i'll have to re-review the airflow kerberos integration, but i know that for spark integration it doesn't touch the airflow kerberos, it just passes raw cli args
[18:24:18] Analytics, Multimedia, Tool-Pageviews: Statistics for views of individual Wikimedia images - https://phabricator.wikimedia.org/T210313 (Tgr) Requests to media/beacon contain the image URI and the viewing duration in the query parameters. Currently those are not used in any way. MediaViewer preload re...
[18:24:20] generally it is not needed, since when spark runs it will automatically fetch delegation tokens for hdfs and the metastore
[18:25:43] ebernhardson: so in theory airflow will have a new thing that will periodically renew the token for airflow, so that when spark-submit is launched it will have something valid in the credential cache
[18:25:50] of the user that runs the spark job
[18:26:11] ahh, ok that makes sense. So airflow basically keeps it valid, so when spark-submit asks for auth it works
[18:26:51] yes, in theory it should work
[18:27:18] ebernhardson: you guys launch spark jobs as analytics-search?
[18:27:47] elukey: tbh, many of them are launched as my user, because no one else manages them and hue only lets me manage ebernhardson jobs
[18:27:54] elukey: but in theory, and moving forward, it should all be analytics-search
[18:28:13] ebernhardson: yes yes let's standardize this :)
[18:28:21] w/ oozie i always had to ssh in and sudo to run oozie commands
[18:29:08] ebernhardson: but the spark jobs launched by airflow are completely automated right?
[18:29:14] elukey: yes
[18:29:21] all right, then it should work
[18:29:28] tomorrow I'll deploy the keytab and the config
[18:29:59] if anything breaks on Monday I'll follow up asap to fix the use case
[18:30:01] alright, is kerberos in a "working, but optional" state right now in prod, or is it still getting everything in place for a big-bang switch?
[18:30:24] ebernhardson: on Monday we'll do the switch
[18:30:27] ok
[18:30:33] so everything hadoop related will need to be kerberized
[18:31:12] ok, got it
[18:32:33] also for that patch, it cherry-picks to production branch fine, not sure why gerrit is complaining. It just likes to complain about unmergeable things that should probably be cherry-picked instead of merged (i'm biased, i almost never merge and always cherry-pick)
[18:32:59] I'll file a new patch in case, it takes a min
[18:33:20] ebernhardson: is it ok to explicitly turn kerberos on Monday when you come online?
[18:33:39] or is airflow critical and needs to have a procedure when I turn kerberos on?
[18:33:45] (PS1) Mforns: Allow drop-older-than to delete under event_sanitized [analytics/refinery] - https://gerrit.wikimedia.org/r/556231 (https://phabricator.wikimedia.org/T237124)
[18:33:46] (just to add it in my list)
[18:33:53] elukey: yea, nothing that we run that ships analytics to prod is super critical, likely no one except me would know that anything is broken for a month or so :)
[18:34:25] ebernhardson: ahhaha okok
[18:34:34] Analytics doesn't want to break you!
[18:34:47] also thanks for the nerd-snipe of ipv6 and hive
[18:34:49] well played
[18:34:50] :D
[18:35:02] lol
[18:49:59] * elukey off!
[18:50:00] o/
[18:53:35] Analytics, Multimedia, Tool-Pageviews: Statistics for views of individual Wikimedia images - https://phabricator.wikimedia.org/T210313 (Nuria) >Requests to media/beacon contain the image URI and the viewing duration in the query parameters. Currently those are not used in any way. Understood, we file...
[18:55:08] Analytics, Research: Section fragment information stripped from webrequests - https://phabricator.wikimedia.org/T240359 (Pcoombe) Fragments aren't even sent in requests, they are handled entirely client side. I was just thinking recently it would be interesting to see how commonly fragment links are use...
[18:58:18] (CR) Mforns: "@Nuria, for the record, here's the documentation on the city entropy metric: https://wikitech.wikimedia.org/wiki/Analytics/Data_quality/Tr" [analytics/refinery] - https://gerrit.wikimedia.org/r/550498 (https://phabricator.wikimedia.org/T234484) (owner: Mforns)
[18:58:51] (CR) Mforns: [V: +2] "I tested this thoroughly, and it's actually running and populating:" [analytics/refinery] - https://gerrit.wikimedia.org/r/550498 (https://phabricator.wikimedia.org/T234484) (owner: Mforns)
[19:00:13] (CR) Mforns: [V: +2] "This is also tested, it's actually running and populating:" [analytics/refinery] - https://gerrit.wikimedia.org/r/547320 (https://phabricator.wikimedia.org/T235486) (owner: Mforns)
[19:12:58] Analytics, Research: Section fragment information stripped from webrequests - https://phabricator.wikimedia.org/T240359 (Isaac) Open→Resolved a:Isaac > Fragments aren't even sent in requests, they are handled entirely client side. @Pcoombe oh yikes, good point, thanks! I had thought I had ver...
[19:22:23] (PS1) Joal: [WIP] Add HdfsRsync scala tool [analytics/refinery/source] - https://gerrit.wikimedia.org/r/556237
[20:06:07] Analytics, Product-Analytics, SDC General, Wikidata: Data about how many file pages on Commons contain at least one structured data element - https://phabricator.wikimedia.org/T238878 (Nuria) I see 7.9 million wikidata items on that table and 1936 mediainfo items.
[20:47:21] Analytics, Analytics-Kanban, Product-Analytics, SDC General, Wikidata: Create reportupdater reports that execute SDC requests - https://phabricator.wikimedia.org/T239565 (Nuria) >So, it seems we have 2 completely separate definitions of "structured data": Well, the intent of these numbers is...
[20:49:09] (CR) Nuria: Allow drop-older-than to delete under event_sanitized (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/556231 (https://phabricator.wikimedia.org/T237124) (owner: Mforns)
[20:51:10] (CR) Nuria: [C: +2] Add data quality metric: traffic variations per country [analytics/refinery] - https://gerrit.wikimedia.org/r/550498 (https://phabricator.wikimedia.org/T234484) (owner: Mforns)
[20:51:20] (CR) Mforns: Allow drop-older-than to delete under event_sanitized (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/556231 (https://phabricator.wikimedia.org/T237124) (owner: Mforns)
[20:54:16] (CR) Nuria: "I added a couple of comments on naming, maybe @milimetric can take a look?" (3 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/547320 (https://phabricator.wikimedia.org/T235486) (owner: Mforns)
[21:00:31] (CR) Ottomata: "Hm, I thought you wanted to get rid of refinery-tools! :)" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/556237 (owner: Joal)
[21:01:29] Analytics, Fundraising-Backlog: Identify source of discrepancy between HUE query in Count of event.impression and druid queries via turnilo/superset - https://phabricator.wikimedia.org/T204396 (DStrine)
[21:07:40] (CR) Nuria: Allow drop-older-than to delete under event_sanitized (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/556231 (https://phabricator.wikimedia.org/T237124) (owner: Mforns)
[21:20:44] Analytics, Product-Analytics: Many special pages missing from pageview_hourly dataset starting on July 23, 2019 - https://phabricator.wikimedia.org/T239672 (Nuria) @Neil_P._Quinn_WMF Indeed it makes sense to include things such as Special:Book. Can you outline the set of pages that you think denote conte...
[21:37:03] (CR) Milimetric: Refactor data_quality oozie bundle to fix too many partitions (2 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/547320 (https://phabricator.wikimedia.org/T235486) (owner: Mforns)
[21:38:51] thanks nuria and milimetric :]
[21:39:55] why is "stats" allowed to be plural and "metrics" not? :P
[21:40:30] *"measures"
[22:39:05] (PS2) Joal: [WIP] Add HdfsRsync scala tool [analytics/refinery/source] - https://gerrit.wikimedia.org/r/556237
[22:41:49] (CR) Joal: "I indeed want to get rid of refinery-tools :) Let's discuss jar size and make decision with elukey." [analytics/refinery/source] - https://gerrit.wikimedia.org/r/556237 (owner: Joal)
[22:54:02] mforns: i do not think stats has a singular in either english or spanish right? is like 'people', it's a 'collective' name
[22:55:12] I see
[23:42:47] (PS10) Nuria: [WIP] Table and workflow for features computations per session [analytics/refinery] - https://gerrit.wikimedia.org/r/552943 (https://phabricator.wikimedia.org/T238360)
[23:47:32] (PS11) Nuria: [WIP] Table and workflow for features computations per session [analytics/refinery] - https://gerrit.wikimedia.org/r/552943 (https://phabricator.wikimedia.org/T238360)
[23:53:22] (CR) Nuria: [WIP] Table and workflow for features computations per session (9 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/552943 (https://phabricator.wikimedia.org/T238360) (owner: Nuria)