[00:04:36] <ottomata>	 jdlrobson:  kafka doesn't have good timestamp based consumption support in the version we have
[00:04:46] <ottomata>	 you kinda can do it with a kafka client, but you'd have to code a little bit
[00:04:51] <ottomata>	 but, you can consume from an offset in the past
[00:05:02] <ottomata>	 so you can guess around till you find a timestamp that's about where you want
[00:05:08] <ottomata>	 also, you'll have to filter
[00:05:10] <ottomata>	 OR
[00:05:25] <ottomata>	 you can use the events in hadoop, buuuuuuut, i'd have to check on their status.  They will be better soon...
[00:06:38] <ottomata>	 from stat1002 you could try
[00:09:01] <ottomata>	  kafkacat -C -b kafka1012.eqiad.wmnet:9092  -c 1 -t eqiad.mediawiki.revision-create  -f "%o %s"
[00:09:08] <ottomata>	 that will print out a single message and its offset
[00:09:19] <ottomata>	 then, to request a specific message at an offset
[00:09:31] <ottomata>	 kafkacat -C -b kafka1012.eqiad.wmnet:9092  -c 1 -t eqiad.mediawiki.revision-create  -f "%o %s" -o <offset>
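A minimal sketch of the "guess around till you find a timestamp" approach described above, written as a binary search over offsets. The `fetch_timestamp` callable is hypothetical: it stands for whatever wraps the per-offset request (for example, running the kafkacat command above with `-o <offset>` and parsing a timestamp out of the message body). The sketch assumes a single partition and timestamps that do not decrease as offsets increase, which may only be approximately true for real event data.

```python
from typing import Callable


def find_offset_for_timestamp(
    fetch_timestamp: Callable[[int], int],
    low: int,
    high: int,
    target_ts: int,
) -> int:
    """Return the smallest offset in [low, high] whose timestamp is >= target_ts.

    Returns high + 1 if every message in the range is older than target_ts.
    fetch_timestamp is a hypothetical helper: it takes an offset and returns
    the timestamp of the message stored at that offset.
    """
    while low <= high:
        mid = (low + high) // 2
        if fetch_timestamp(mid) < target_ts:
            # Message at mid is too old: the answer lies to the right.
            low = mid + 1
        else:
            # Message at mid is new enough: keep looking to the left.
            high = mid - 1
    return low


if __name__ == "__main__":
    # Fake partition for illustration: offset i carries timestamp 1000 + 10 * i.
    fake_log = {i: 1000 + 10 * i for i in range(100)}
    print(find_offset_for_timestamp(fake_log.__getitem__, 0, 99, 1255))
```

Each probe costs one single-message request, so a range of a few million offsets needs roughly twenty requests. Messages consumed from the resulting offset onward still have to be filtered, as noted above.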
[05:02:34] <wikibugs>	 Analytics-Kanban: Preserve userAgent field in apps schemas - https://phabricator.wikimedia.org/T164125#3331227 (Tbayer) @mforns Great point; thanks for checking, and sorry about the delayed response - I had to take a second look at some of these myself.  Your suggestions all look great to me, except that for...
[05:21:57] <wikibugs>	 Analytics-Kanban: Preserve userAgent field in apps schemas - https://phabricator.wikimedia.org/T164125#3331235 (Tbayer) PS: There are also [[https://meta.wikimedia.org/w/index.php?search=%22MobileApp%2A%22&title=Special:Search&profile=advanced&fulltext=1&ns470=1 | four older app schemas]] (each last revised...
[07:19:35] <elukey>	 morning!
[07:19:40] <elukey>	 first failure for cache-maps :)
[07:19:48] <elukey>	 that is now officially gone away
[07:21:42] <elukey>	 !log suspended cache maps as temporary measure to avoid oozie spamming
[07:21:44] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[07:23:24] <elukey>	 I'd probably just kill it
[07:25:31] <elukey>	 !log kill maps webrequest load coordinator as temporary measure to avoid oozie spamming
[07:25:32] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[07:34:13] <elukey>	 for fun I am trying to clean up the refinery
[07:35:48] <elukey>	 ahhh and the camus config is also in puppet
[07:39:10] <elukey>	 https://gerrit.wikimedia.org/r/#/c/357768
[07:40:26] <wikibugs>	 (PS1) Elukey: Remove any trace of the maps cluster [analytics/refinery] - https://gerrit.wikimedia.org/r/357769
[07:43:19] <wikibugs>	 (PS2) Elukey: Remove any trace of the maps cluster [analytics/refinery] - https://gerrit.wikimedia.org/r/357769
[07:44:35] <wikibugs>	 (PS3) Elukey: Remove any trace of the maps cluster [analytics/refinery] - https://gerrit.wikimedia.org/r/357769
[07:54:52] <joal>	 hi elukey 
[07:56:54] <joal>	 elukey: on analytics1003 --> grep " Error on topic webrequest_maps" /var/log/camus/webrequest.log
[07:57:21] <joal>	 elukey: maps data is indeed gone :)
[08:03:20] <wikibugs>	 Analytics-Kanban, Operations, ops-eqiad, Patch-For-Review, User-Elukey: Review Megacli Analytics Hadoop workers settings - https://phabricator.wikimedia.org/T166140#3331391 (elukey) Based on several guides like http://download.intel.com/support/motherboards/server/sb/configuring_raid_for_opti...
[08:03:56] <elukey>	 joal: o/
[08:05:49] <elukey>	 merged the puppet change, applying it to an1003
[08:07:50] <joal>	 great elukey 
[08:08:06] <joal>	 elukey: will monitor camus logs
[08:09:49] <elukey>	 joal: all good during the last days?
[08:09:56] <joal>	 elukey: yes sir :)
[08:10:01] <joal>	 elukey: busy, but good !@
[08:10:30] <joal>	 elukey: you?
[08:14:07] <elukey>	 joal: same :)
[08:14:10] <elukey>	 we missed you!
[08:14:57] <joal>	 elukey: While I missed my daily time with you guys, I didn't miss work a lot ;)
[08:15:03] <joal>	 I didn't stop long enough :)
[08:16:41] <joal>	 elukey: no more errors in camusPartitionChecker :)
[08:20:07] <elukey>	 \o/
[08:28:19] <wikibugs>	 (CR) Joal: [C: 1] "One mini-nit, and great :)" (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/357769 (owner: Elukey)
[08:34:20] <wikibugs>	 (PS4) Elukey: Remove any trace of the maps cluster [analytics/refinery] - https://gerrit.wikimedia.org/r/357769
[08:34:27] <elukey>	 joal: like this?? --^
[08:35:33] <joal>	 Yay !
[08:35:53] <elukey>	 \o/
[08:36:07] <wikibugs>	 (CR) Joal: [V: 2 C: 2] "Yes for me! Let's merge :)" [analytics/refinery] - https://gerrit.wikimedia.org/r/357769 (owner: Elukey)
[08:36:27] <joal>	 elukey: Have we deployed the cluster either yesterday or the day before?
[08:36:54] <elukey>	 not that I am aware of, the last deploy IIRC was done by Marcel last week
[08:36:59] <elukey>	 but I could be wrong
[08:37:13] <joal>	 elukey: okey
[08:37:30] <joal>	 elukey: Will probably synchronise with nuria_ to deploy today
[08:38:01] <elukey>	 super
[08:38:10] <elukey>	 joal: whenever you have time I'd like to discuss with you https://phabricator.wikimedia.org/T166140#3331391
[08:38:26] <elukey>	 (amending it, replicated comments :P)
[08:39:53] <joal>	 elukey: sure, but I don't know really what it's about :)
[08:40:07] <elukey>	 me too! :D
[08:40:28] <elukey>	 jokes aside, the main goal is to unify our configuration for hadoop workers
[08:40:40] <elukey>	 basically we have a strange config for each of them
[08:40:51] <joal>	 elukey: Understood that point
[08:41:05] <elukey>	 each of the 12 disks for data partition is one raid-0 (composed by one disk)
[08:41:09] <elukey>	 it is sort of a JBOD
[08:41:13] <joal>	 elukey: What I don't understand is why we need raid settings while I thought we were J
[08:41:17] <joal>	 BJBOD sorry
[08:41:46] <joal>	 hm - So we have 12 raid arrays of 1 disk each?
[08:42:02] <elukey>	 exactly
[08:42:27] <elukey>	 from our perspective it is essentially JBOD
[08:42:55] <elukey>	 last week Jaime added a check for all the dbs to verify that WriteBack was in place for raid controllers of dbs
[08:43:25] <joal>	 hm - Not sure what it means
[08:43:34] <elukey>	 (writing :)
[08:43:49] <joal>	 :D
[08:44:12] <joal>	 dumb question from a no-hardware man: Why do we use RAID for single disks?
[08:44:31] <elukey>	 so I wasn't aware of WriteBack at all, but basically when a write happens the raid controller puts it in cache and acks the I/O op as done, without waiting for the disk to have effectively done it
[08:45:07] <elukey>	 the caveat to prevent losses is that a physical battery called BBU powers the raid cache
[08:45:34] <elukey>	 so in case of power loss of the host data is preserved for X amount of time
[08:45:36] <joal>	 elukey: You are my enlightener
[08:45:39] <elukey>	 ahahah
[08:45:57] <elukey>	 so we discovered two faulty BBUs with this alarm
[08:46:19] <elukey>	 because two hosts showed WriteThrough, that means "go straight to the I/O on disk, do not cache"
[08:46:25] <joal>	 elukey: Other thing I found: it's interesting to have RAID configured even for single-disk arrays to benefit from the caching of the raid controller
[08:47:59] <joal>	 elukey: I like the config you suggested based on my very small understanding of what it means
[08:48:21] <elukey>	 super, this is what I wanted to ask you :)
[08:49:29] <wikibugs>	 Analytics-Kanban, Operations, ops-eqiad, Patch-For-Review, User-Elukey: Review Megacli Analytics Hadoop workers settings - https://phabricator.wikimedia.org/T166140#3286412 (JAllemandou) Looks good to me (even if I don't understand in depth what it means). I particularly like the idea of havi...
[08:49:34] <joal>	 elukey: just commented on ticket --^
[08:51:04] <elukey>	 thanks!
[09:06:24] <elukey>	 joal: FYI I am upgrading zookeeper everywhere, if you notice something weird let me know
[09:06:48] <joal>	 sure 
[09:26:28] <elukey>	 main-codfw upgraded, druid just done
[09:39:46] <moritzm>	 I'm upgrading the remaining mysql-connector-java packages on the hadoop cluster now, 1031 has been running this for over a week, so should be fine
[09:40:27] <elukey>	 super
[10:02:35] <wikibugs>	 Analytics-Tech-community-metrics, Developer-Relations (Jul-Sep 2017): Automatically sync mediawiki-identities/wikimedia-affiliations.json DB dump file with the data available on wikimedia.biterg.io - https://phabricator.wikimedia.org/T157898#3331629 (Albertinisg) All right so, I've generated new indexes...
[10:28:22] <wikibugs>	 Analytics: Add 'page_is_redirect' field to the mediawiki_history Data Lake tables - https://phabricator.wikimedia.org/T167396#3331684 (Tbayer)
[10:28:42] <elukey>	 !log executed megacli -LDSetProp NoCachedBadBBU -LALL -aALL on analytics1032 as test - T166140
[10:28:44] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[10:28:44] <stashbot>	 T166140: Review Megacli Analytics Hadoop workers settings - https://phabricator.wikimedia.org/T166140
[10:30:52] <wikibugs>	 Analytics-Kanban, Operations, ops-eqiad, Patch-For-Review, User-Elukey: Review Megacli Analytics Hadoop workers settings - https://phabricator.wikimedia.org/T166140#3331701 (elukey) Better view:  ``` elukey@neodymium:~$ sudo cumin 'R:class = role::analytics_cluster::hadoop::worker' 'megacli -...
[10:32:20] <wikibugs>	 Analytics-Tech-community-metrics, Developer-Relations (Jul-Sep 2017): Automatically sync mediawiki-identities/wikimedia-affiliations.json DB dump file with the data available on wikimedia.biterg.io - https://phabricator.wikimedia.org/T157898#3331703 (Aklapper) @Albertinisg: Nice! Please override the dump...
[10:35:24] <elukey>	 !log executed megacli -LDSetProp NoCachedBadBBU -LALL -aALL on analytics1049/45 
[10:35:25] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[10:44:28] * elukey going to lunch!
[12:16:21] <elukey>	 !log run megacli -LDSetProp ADRA -LALL -aALL on analytics1047 to set ReadAheadAdaptive  - T166140
[12:16:23] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[12:16:23] <stashbot>	 T166140: Review Megacli Analytics Hadoop workers settings - https://phabricator.wikimedia.org/T166140
[12:54:56] <elukey>	 !log run megacli -LDSetProp ADRA -LALL -aALL to set ReadAheadAdaptive on analytics[1042-1046,1048-1057].eqiad.wmnet - T166140
[12:54:59] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[12:55:00] <stashbot>	 T166140: Review Megacli Analytics Hadoop workers settings - https://phabricator.wikimedia.org/T166140
[12:56:04] <elukey>	 I am doing --^ one host at a time with 120s between them
[13:05:21] <milimetric>	 hi! :)
[13:05:27] <joal>	 Hi milimetric  !!
[13:05:46] <milimetric>	 I'm back if anyone needs me, I'm going to catch up on all the pings
[13:07:01] <elukey>	 milimetric: how's going
[13:07:02] <elukey>	 ?
[13:07:54] <milimetric>	 everything's ok
[13:08:01] <milimetric>	 not great, but manageable
[13:11:43] <elukey>	 :)
[13:15:27] <mforns>	 hey milimetric glad you're back
[13:15:46] <milimetric>	 hi! It's nice to be back
[13:32:28] <ottomata>	 oo ya maps is gone!
[13:32:40] <ottomata>	 joal:  we should remove the coordinator from the bundle, ya?
[13:32:51] <joal>	 ottomata: look at your gerrit emails ;)
[13:32:56] <fdans>	 sorry I didn't say hellooooooooo a-team
[13:33:03] * joal points finger to elukey 
[13:33:07] <joal>	 ottomata: --^
[13:33:08] <ottomata>	 eh? gerrit emails?
[13:33:10] <fdans>	 catching up with a buncha emails
[13:33:10] <ottomata>	 oh
[13:33:13] <mforns>	 hellooooooo fdans
[13:33:14] <ottomata>	 sorry there are more emails
[13:33:15] <ottomata>	 looking
[13:33:17] <joal>	 Hi fdans :)
[13:33:22] <ottomata>	 thought i got them all already
[13:33:37] <ottomata>	 i was like "OO i know what to do!  they haven't done it yet!  "
[13:33:40] <ottomata>	 of COURSE they have :)
[13:34:40] <joal>	 Man, I arrived at usual time, elukey had done it all already
[13:38:22] * elukey plays the jingle TTTEEEEAAAAAMMMMMM EEEEEUUUURROOOOPEEEEE
[13:38:36] * elukey blames fdans for the brainwash
[13:39:29] <joal>	 POZORRRR!
[13:41:35] <elukey>	 ahahahah
[13:44:56] <elukey>	 !log AQS cluster in beta wiped and re-bootstrapped due to T167222
[13:44:57] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[13:44:57] <stashbot>	 T167222: Cassandra table storage backend error in Deployment-prep - https://phabricator.wikimedia.org/T167222
[13:48:03] <fdans>	 elukey joal aw I missed you
[13:48:15] <wikibugs>	 Analytics-Kanban, Operations, ops-eqiad, Patch-For-Review, User-Elukey: Review Megacli Analytics Hadoop workers settings - https://phabricator.wikimedia.org/T166140#3332220 (elukey) Current status is:  ``` elukey@neodymium:~$ sudo cumin 'R:class = role::analytics_cluster::hadoop::worker' 'meg...
[13:48:27] <elukey>	 ottomata: --^
[13:49:11] <wikibugs>	 Analytics-Kanban: Preserve userAgent field in apps schemas - https://phabricator.wikimedia.org/T164125#3332222 (mforns) Hey @Tbayer, thanks for looking into this!  > Your suggestions all look great to me, except that for the MobileWikiAppPageScroll, the "funnel" aspect is actually the less important part. Ca...
[13:54:47] <wikibugs>	 Analytics: Use hive dynamic partitioning to split webrequest on tags - https://phabricator.wikimedia.org/T164020#3332225 (JAllemandou)
[13:55:05] <wikibugs>	 Analytics-Kanban: Use hive dynamic partitioning to split webrequest on tags - https://phabricator.wikimedia.org/T164020#3218541 (JAllemandou)
[13:55:14] <wikibugs>	 Analytics-Kanban: Use hive dynamic partitioning to split webrequest on tags - https://phabricator.wikimedia.org/T164020#3218541 (JAllemandou) a:JAllemandou
[13:55:29] <wikibugs>	 (PS1) Joal: [WIP] Split webrequest into smaller datasets [analytics/refinery] - https://gerrit.wikimedia.org/r/357814 (https://phabricator.wikimedia.org/T164020)
[13:55:46] <joal>	 ottomata: If you have a minute, can you take a look at --^ ?
[13:56:10] <joal>	 Ok, taking a break now a-team
[13:56:27] <mforns>	 cyaaa
[14:03:49] <ottomata>	 joal:  was just reading about some hive timestamp stuff
[14:03:50] <ottomata>	 https://issues.apache.org/jira/browse/HIVE-9298
[14:04:02] <ottomata>	 would be nice, would allow us to keep the storage value as ISO 8601
[14:04:44] <ottomata>	 but we don't have hive 1.2 :(
[14:23:29] <elukey>	 mforns: o/ - I added the whitelist field check that you asked yesterday to https://gerrit.wikimedia.org/r/356383
[14:23:46] <wikibugs>	 (CR) Ottomata: "Thinking about the name 'split'.  Maybe we should refer to this as webrequest_exploded?  Or something else?  Also, I would like to see mor" (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/357814 (https://phabricator.wikimedia.org/T164020) (owner: Joal)
[14:23:46] <mforns>	 elukey, will review :D
[14:24:33] <elukey>	 (plus test added)
[14:24:53] <elukey>	 mforns: is there anything else pending to do for the script?
[14:25:01] <mforns>	 elukey, I don't think so
[14:25:05] <elukey>	 super
[14:25:50] <wikibugs>	 (CR) Ottomata: [WIP] Split webrequest into smaller datasets (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/357814 (https://phabricator.wikimedia.org/T164020) (owner: Joal)
[14:26:04] <mforns>	 elukey, I think though we should also check that all schema prefixes in the white-list also have at least 1 table in the db
[14:26:16] <mforns>	 I mean check that outside sanitize
[14:26:45] <mforns>	 so that if the whitelist has a typo in the schema name, then we do not execute the script and purge data that is expected to live
[14:27:05] <mforns>	 will comment in-line in gerrit
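A minimal sketch of the pre-flight check mforns describes: before any purge runs, confirm that every schema prefix in the whitelist matches at least one table. The function name and the table names in the example are illustrative assumptions, not taken from the actual cleaner script.

```python
from typing import Iterable, List


def unmatched_prefixes(
    whitelist_prefixes: Iterable[str],
    table_names: Iterable[str],
) -> List[str]:
    """Return the whitelist prefixes that match no table name at all.

    A non-empty result suggests a typo in the whitelist, in which case the
    caller should refuse to run the purge.
    """
    tables = list(table_names)
    return [
        prefix
        for prefix in whitelist_prefixes
        if not any(table.startswith(prefix) for table in tables)
    ]


if __name__ == "__main__":
    # Made-up table names following a Schema_revision pattern.
    tables = ["Popups_16364296", "ReadingDepth_16325045"]
    missing = unmatched_prefixes(["Popups", "RaedingDepth"], tables)
    if missing:
        raise SystemExit(f"Whitelist prefixes without tables: {missing}")
```

Running the check outside the sanitize step, as suggested, means a misspelled schema name stops the script before any data that is expected to live gets purged.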
[14:27:23] <elukey>	 I'll add it now, not needed 
[14:27:27] <elukey>	 good point though
[14:32:21] <ottomata>	 ooo elukey there is a kafka 0.11.0.0 RC out... :o
[14:32:49] <ottomata>	 * Exactly-once delivery and transactional messaging
[14:32:49] <ottomata>	 * Streams exactly-once semantics
[14:32:49] <ottomata>	 * Admin client with support for topic, ACLs and config management
[14:32:49] <ottomata>	 * Record headers
[14:32:49] <ottomata>	 * Request rate quotas
[14:32:49] <ottomata>	 * Improved resiliency: replication protocol improvement and single-threaded controller
[14:32:49] <ottomata>	 * Richer and more efficient message format
[14:34:09] <elukey>	 wow exactly-once * is big
[14:34:24] <wikibugs>	 (CR) Joal: "We can definitely use another database to keep that table, like for instance wmf_exploded, or wmf_optim." (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/357814 (https://phabricator.wikimedia.org/T164020) (owner: Joal)
[14:36:02] <mforns>	 elukey, oh, didn't read and commented anyway in gerrit, sorry
[14:36:17] <elukey>	 no need, the solution looks great!
[14:42:10] <ottomata>	 elukey:  we will have to see of course, but even though it's just RC, i'm inclined to wait for the release for our upgrade :o
[14:42:22] <ottomata>	 it is a major version upgrade, and it will be really nice not to have to do that live
[14:42:25] <ottomata>	 but install on a new cluster
[14:43:23] <elukey>	 yep definitely
[14:43:29] <elukey>	 but let's make sure that it is not too new :)
[14:43:41] <ottomata>	 haha
[14:43:42] <ottomata>	 indeed
[14:47:35] <nuria_>	 hola
[14:47:37] <nuria_>	 bearloga: FYI regarding maps cluster: https://gerrit.wikimedia.org/r/#/c/357769/
[15:10:27] <wikibugs>	 Analytics-Kanban, Patch-For-Review: Test failures in refinery master - https://phabricator.wikimedia.org/T166334#3332565 (Nuria) a:Nuria
[15:27:35] <wikibugs>	 Analytics, Analytics-Cluster, Operations, ops-eqiad, Patch-For-Review: rack/setup/install replacement stat1006 (stat1003 replacement) - https://phabricator.wikimedia.org/T165366#3332608 (Cmjohnson)
[15:29:32] <wikibugs>	 Analytics, Analytics-Cluster, Operations, ops-eqiad, Patch-For-Review: rack/setup/install replacement to stat1005 (stat1002 replacement) - https://phabricator.wikimedia.org/T165368#3332623 (Cmjohnson)
[15:37:32] <elukey>	 mforns: new version of el_cleaner out! :)
[15:39:27] <mforns>	 elukey, looking :]
[15:48:05] <wikibugs>	 Analytics, Analytics-EventLogging: EventLogging tests fail for python 3.4 in Jenkins - https://phabricator.wikimedia.org/T164409#3332667 (Nuria) p:Normal>High
[15:48:19] <wikibugs>	 Analytics-EventLogging, Analytics-Kanban: EventLogging tests fail for python 3.4 in Jenkins - https://phabricator.wikimedia.org/T164409#3232541 (Nuria)
[15:51:13] <wikibugs>	 Analytics: Serve global unique device counts externally - https://phabricator.wikimedia.org/T157981#3332674 (Nuria) There are two parts:  - serve files (ongoing)  - changes on aqs to serve two values on api
[15:51:53] <wikibugs>	 Analytics: Serve global unique device counts externally - https://phabricator.wikimedia.org/T157981#3332677 (Nuria) p:Normal>Low
[15:56:02] <wikibugs>	 Analytics: Provide top domain and data to truly test superset - https://phabricator.wikimedia.org/T166689#3332698 (Nuria) Superset is python and thus harder to deploy than a node counterpart.   Can we deploy this virtual env (whole environment is uploaded to gerrit)  Superset can consume from druid as it is.
[15:56:17] <wikibugs>	 Analytics: Setup superset to consume from mysql slaves - https://phabricator.wikimedia.org/T167427#3332699 (Nuria)
[15:56:46] <wikibugs>	 Analytics: Configure superset to query mysql slaves  - https://phabricator.wikimedia.org/T167427#3332699 (Nuria)
[15:57:48] <wikibugs>	 Analytics: Configure superset to query mysql slaves - https://phabricator.wikimedia.org/T167427#3332699 (Nuria)
[15:58:32] <wikibugs>	 Analytics: Provide top domain and data to truly test superset - https://phabricator.wikimedia.org/T166689#3332725 (Nuria) p:Triage>Normal
[16:00:31] <wikibugs>	 Analytics, Analytics-Cluster: Upgrade Analytics Cluster to Java 8 - https://phabricator.wikimedia.org/T166248#3289857 (Nuria) p:High>Normal
[16:17:28] <wikibugs>	 Analytics, Analytics-EventLogging: Bulk/Batch event endpoint - https://phabricator.wikimedia.org/T166249#3332848 (Nuria) Tasks:  * how do we handle posts? Does varnishlog have access to post body? (likely not) * do we need a new endpoint to consume this (outside varnish) , say node consumer that sends da...
[16:19:00] <wikibugs>	 Analytics, Analytics-EventLogging: Bulk/Batch event endpoint - https://phabricator.wikimedia.org/T166249#3332850 (Nuria) p:Triage>Normal
[16:23:10] <wikibugs>	 Analytics-Kanban, Operations, Traffic, Patch-For-Review: Replace Analytics XFF/client.ip data with X-Client-IP - https://phabricator.wikimedia.org/T118557#3332875 (Nuria)
[16:25:14] <wikibugs>	 Analytics: Alarms on pageview API latency increase - https://phabricator.wikimedia.org/T164243#3332895 (Nuria)
[16:25:26] <wikibugs>	 Analytics-Kanban: Alarms on pageview API latency increase - https://phabricator.wikimedia.org/T164243#3226937 (Nuria)
[16:25:54] <wikibugs>	 Analytics: Reinstate a subset of reports removed from the reportcard until WikiStats 2.0 is back - https://phabricator.wikimedia.org/T166679#3332902 (Nuria)
[16:26:32] <wikibugs>	 Analytics-Kanban: Document that old deleted pages have empty fields in Analytics Cluster edit data - https://phabricator.wikimedia.org/T165201#3332903 (Nuria) a:mforns
[16:27:19] <wikibugs>	 Analytics-Kanban, DBA, Operations, ops-eqiad, User-Elukey: db1046 BBU looks faulty - https://phabricator.wikimedia.org/T166141#3332904 (elukey) @Cmjohnson sorry to ping :) Any idea if we have a spare BBU for db1046?
[16:29:28] <wikibugs>	 Analytics-Kanban, DBA, Operations, ops-eqiad, User-Elukey: db1046 BBU looks faulty - https://phabricator.wikimedia.org/T166141#3332932 (Cmjohnson) @elukey. Yes, I have another decommissioned r510 to take it from.  Ping you in a hour or so to replace
[16:35:30] <wikibugs>	 Analytics, Analytics-EventLogging, Reading Epics (Analytics): Bulk/Batch event endpoint - https://phabricator.wikimedia.org/T166249#3332965 (Fjalapeno) @Ottomata that was me you were talking with at the offsite earlier this year…  And yeah, sounds like a POST would be ideal here.   @Nuria let me know...
[16:35:50] <elukey>	 ottomata: ---^ db1046's bbu - when/how should we replace it?
[16:40:31] <wikibugs>	 Analytics: Implement purging settings for Schema:ReadingDepth - https://phabricator.wikimedia.org/T167439#3332973 (Tbayer)
[16:41:07] <nuria_>	 !log deploying refinery to  cluster 
[16:41:08] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[16:41:42] <wikibugs>	 Analytics-Kanban, DBA, Operations, ops-eqiad, User-Elukey: db1046 BBU looks faulty - https://phabricator.wikimedia.org/T166141#3332986 (elukey) @Cmjohnson thanks! Would it be possible to do the swap next week? Since this is an important DB I'd need to coordinate my team and Jaime/Manuel first.
[16:41:59] <elukey>	 just replied that we'd need to decide when
[16:42:52] <wikibugs>	 Analytics, Analytics-EventLogging, Reading Epics (Analytics): Bulk/Batch event endpoint - https://phabricator.wikimedia.org/T166249#3332987 (Nuria) @Fjalapeno : we think we might be able to start working on this possibly by end of Q2
[16:43:40] <wikibugs>	 Analytics-Kanban: Preserve userAgent field in apps schemas - https://phabricator.wikimedia.org/T164125#3332992 (Tbayer) >>! In T164125#3332222, @mforns wrote: > Hey @Tbayer, thanks for looking into this! >  ... >  >> PS: There are also four older app schemas (each last revised in 2013, but still active) that...
[16:46:44] <elukey>	 all right people logging off, will do some peer reviews later on
[16:46:49] <elukey>	 byyyeeee
[16:46:50] * elukey off!
[16:46:58] <nuria_>	 kudos to a-team on updating docs about cluster deploy
[16:51:46] <nuria_>	 a-team: deployed new refinery code (not source, just oozie's)
[16:58:53] <joal>	 Thanks nuria_, will take on reordering uniques jobs later
[17:13:08] <wikibugs>	 Analytics, Page-Previews: Update purging settings for Schema:Popups - https://phabricator.wikimedia.org/T167449#3333176 (Tbayer)
[17:22:35] <wikibugs>	 Analytics: Implement purging settings for Schema:MobileWikiAppFeed - https://phabricator.wikimedia.org/T167453#3333268 (Tbayer)
[17:24:29] <wikibugs>	 Analytics, Pageviews-API: Endpoint for average view rate in Pageview API - https://phabricator.wikimedia.org/T162933#3333284 (Nettrom) Coming back to this I have a bunch of questions, so I'll just ask them and see where we go from there. Apologies if this is counterproductive, feel free to let me know ho...
[17:29:32] <wikibugs>	 Analytics: Implement purging settings for Schema:MobileWikiAppFeed - https://phabricator.wikimedia.org/T167453#3333313 (Tbayer) (This is actually already covered T164125#3300396 - sorry, got mixed up here.)
[17:31:54] <wikibugs>	 Analytics: Implement purging settings for Schema:MobileWikiAppFeed - https://phabricator.wikimedia.org/T167453#3333323 (Tbayer)
[17:44:23] <wikibugs>	 Analytics-Kanban: Preserve userAgent field in apps schemas - https://phabricator.wikimedia.org/T164125#3333362 (Dbrant) ^ This is correct. The fields in the schemas you mentioned do not need to be preserved.
[18:20:43] <milimetric>	 nuria_: I was just looking at https://phabricator.wikimedia.org/T166335, this is throwing errors in prod but the team triaged it low
[18:20:58] <milimetric>	 shouldn't I work on it now?
[18:21:32] <wikibugs>	 Analytics-Dashiki, Analytics-Kanban, Wikimedia-log-errors: Warning: JsonConfig: Invalid $wgJsonConfigModels['JsonConfig.Dashiki'] array value, 'class' not found - https://phabricator.wikimedia.org/T166335#3333542 (Milimetric) p:Low>Normal a:Milimetric
[18:22:30] <nuria_>	 milimetric: if the errors are consistent (was not clear  from report) please do
[18:22:42] <milimetric>	 yeah, it's apparently throwing this error all the time
[18:23:02] <milimetric>	 I forgot about it this morning and people looking at logs see it and remind me
[18:50:16] <joal>	 team, I just realised that I forgot a dependency in oozie changes for uniques
[18:50:38] <wikibugs>	 Analytics, Analytics-Cluster, Operations, ops-eqiad, Patch-For-Review: rack/setup/install replacement stat1006 (stat1003 replacement) - https://phabricator.wikimedia.org/T165366#3333636 (Cmjohnson)
[18:51:30] <wikibugs>	 Analytics-Dashiki, Analytics-Kanban, Wikimedia-log-errors: Warning: JsonConfig: Invalid $wgJsonConfigModels['JsonConfig.Dashiki'] array value, 'class' not found - https://phabricator.wikimedia.org/T166335#3333637 (Milimetric) From analyzing logstash (just searching for [[ https://logstash.wikimedia.o...
[18:51:47] <wikibugs>	 (PS1) Joal: Update cassandra loading for unique devices [analytics/refinery] - https://gerrit.wikimedia.org/r/357871
[18:58:55] <wikibugs>	 Analytics, Analytics-Cluster, Operations, ops-eqiad, Patch-For-Review: rack/setup/install replacement stat1006 (stat1003 replacement) - https://phabricator.wikimedia.org/T165366#3333657 (Cmjohnson) a:Cmjohnson>RobH @robh added mac address to dhcpd already, verified on switch that it's...
[19:08:40] <wikibugs>	 Analytics, Analytics-Cluster, Operations, ops-eqiad, Patch-For-Review: rack/setup/install replacement to stat1005 (stat1002 replacement) - https://phabricator.wikimedia.org/T165368#3333707 (Cmjohnson)
[19:09:15] <wikibugs>	 Analytics, Analytics-Cluster, Operations, ops-eqiad, Patch-For-Review: rack/setup/install replacement to stat1005 (stat1002 replacement) - https://phabricator.wikimedia.org/T165368#3264256 (Cmjohnson) a:Cmjohnson>RobH
[19:21:14] <wikibugs>	 (CR) Nuria: "Friendly reminder to us that these split datasets also need removal after 90 days." [analytics/refinery] - https://gerrit.wikimedia.org/r/357814 (https://phabricator.wikimedia.org/T164020) (owner: Joal)
[19:33:30] <wikibugs>	 (CR) Nuria: Update cassandra loading for unique devices (2 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/357871 (owner: Joal)
[19:42:09] <wikibugs>	 Analytics, Analytics-EventLogging, Contributors-Analysis, EventBus, and 5 others: Record an EventLogging event every time a new content namespace page is created - https://phabricator.wikimedia.org/T150369#3333810 (kaldari) @Ottomata - Now that we're collecting all the data via EventBus, what's t...
[19:51:24] <wikibugs>	 Analytics, Analytics-EventLogging, Contributors-Analysis, EventBus, and 5 others: Record an EventLogging event every time a new content namespace page is created - https://phabricator.wikimedia.org/T150369#3333855 (Nuria) @kaldari: I think we need to describe what are we counting/exposing/visuali...
[20:37:24] <wikibugs>	 Analytics, Page-Previews, Reading-Web-Backlog: Update purging settings for Schema:Popups - https://phabricator.wikimedia.org/T167449#3334036 (Jdlrobson) @tbayer or @mforns are you planning to do this or do you need web team assistance (ie. do we need to plan it for an upcoming sprint?)
[20:43:21] <hall1467>	 Hi, I'm curious how redirects are handled in the wmf.pageview_hourly table. How would a view coming from a redirect be logged?
[20:43:34] <halfak>	 o/ hey folks
[20:43:53] <hall1467>	 o/
[20:44:00] <halfak>	 I had a thread with Nuria about this, but it's still confusing and hall1467 wants to get view rates matched with page IDs for his work
[20:46:04] <nuria_>	 halfak: in order to use that data you need to first assess its quality
[20:46:41] <halfak>	 nuria_, yeah.  That's a whole research project. 
[20:47:03] <nuria_>	 halfak: agreed
[20:47:08] <halfak>	 Before we assess its quality we want to learn why you question its quality. 
[20:47:17] <nuria_>	 halfak: that is why that disclaimer is there
[20:47:34] <halfak>	 nuria_, why not just put that disclaimer on all the data?
[20:48:07] <nuria_>	 halfak: cause quality measures are in place for much of our data but we have never vetted page_id
[20:48:17] <nuria_>	 halfak: that is not true for all fields in x_analytics
[20:48:51] <nuria_>	 page_id is basically a project that (as far as i know) was never completed but hey, would love to be told otherwise
[20:49:01] <nuria_>	 ^ halfak 
[20:50:04] <nuria_>	 halfak: also pageview hourly only has data for pageviews
[20:50:13] <nuria_>	 halfak: if you are talking about 301 redirects 
[20:50:28] <nuria_>	 halfak: those are never propagated to that table cc hall1467 
[20:50:56] <nuria_>	 halfak: pageviews are not 301, 307 or 302s
[20:50:59] <halfak>	 oh good.  Wouldn't have been a concern either way though. 
[20:51:00] <nuria_>	 cc hall1467 
[20:53:42] <halfak>	 nuria_, what makes you believe that we might get page_ids from redirecting pages?
[20:53:56] <halfak>	 what's the mechanism by which the page_id is identified?
[20:54:08] <nuria_>	 halfak: because we never checked that it would be otherwise
[20:54:32] <nuria_>	 halfak: this one: https://github.com/wikimedia/mediawiki-extensions-WikimediaEvents/blob/master/WikimediaEventsHooks.php#L39
[20:54:43] <nuria_>	 halfak: not sure if that is what you are asking
[20:57:34] <wikibugs>	 Analytics, Page-Previews, Reading-Web-Backlog: Update purging settings for Schema:Popups - https://phabricator.wikimedia.org/T167449#3334058 (Tbayer) @Jdlrobson As you can see, it's assigned to a member of the Analytics team. I expect that no work from the web team will be required, so feel free to re...
[21:02:24] <hall1467>	 Sorry, I'm a bit confused here. So basically, we don't know if a redirect page gets logged in this table or not?
[21:12:53] <hall1467>	 nuria_
[21:38:59] <halfak>	 What's the recommended pattern for querying hive and writing the result to a file?
[21:53:55] <nuria_>	 halfak: hive -f some.hql  > out.txt
[21:54:16] <nuria_>	 hall1467: no, we know in that table we do not have redirect information
[21:54:18] <halfak>	 Cool.  stdout should be clean then?  This is nice because I can stream compress the output :) 
[21:55:19] <nuria_>	 hall1467: if that is the info you need you need to seek it elsewhere, that table only has information about pageviews
[21:56:04] <nuria_>	 halfak: I think so, I normally do that on a screen and after if files are huge i just gzip them
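A minimal sketch of the "stream compress the output" idea halfak mentions, assuming the query results arrive on stdout and that Hive's log chatter goes to stderr (not verified here). The helper name `run_and_gzip` and the output path are illustrative; `hive -f some.hql` is the invocation nuria_ gives above.

```python
import gzip
import shutil
import subprocess
from typing import Sequence


def run_and_gzip(command: Sequence[str], out_path: str) -> int:
    """Run command, stream its stdout into a gzip file, return the exit code.

    stderr is left untouched so that log messages do not end up in the
    compressed result file.
    """
    with subprocess.Popen(command, stdout=subprocess.PIPE) as proc, \
            gzip.open(out_path, "wb") as out:
        # Copy in chunks so large result sets never sit fully in memory.
        shutil.copyfileobj(proc.stdout, out)
    # Leaving the Popen context waits for the process, so returncode is set.
    return proc.returncode


if __name__ == "__main__":
    # Requires the hive CLI on the PATH; the file names are placeholders.
    exit_code = run_and_gzip(["hive", "-f", "some.hql"], "out.txt.gz")
    print("hive exited with", exit_code)
```

Compressing while the query runs avoids writing the uncompressed file first, which matters mostly when the result is huge.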