[00:04:36] jdlrobson: kafka doesn't have good timestamp based consumption support in the version we have [00:04:46] you kinda can do it with a kafka client, but you'd have to code a little bit [00:04:51] but, you can consume from an offset in the past [00:05:02] so you can guess around till you find a timestamp that's about where you want [00:05:08] also, you'll have to filter [00:05:10] OR [00:05:25] you can use the events in hadoop, buuuuuuut, i'd have to check on their status. They will be better soon... [00:06:38] from stat1002 you could try [00:09:01] kafkacat -C -b kafka1012.eqiad.wmnet:9092 -c 1 -t eqiad.mediawiki.revision-create -f "%o %s" [00:09:08] that will print out a single message and its offset [00:09:19] then, to request a specific message at an offset [00:09:31] kafkacat -C -b kafka1012.eqiad.wmnet:9092 -c 1 -t eqiad.mediawiki.revision-create -f "%o %s" -o [05:02:34] 10Analytics-Kanban: Preserve userAgent field in apps schemas - https://phabricator.wikimedia.org/T164125#3331227 (10Tbayer) @mforns Great point; thanks for checking, and sorry about the delayed response - I had to take a second look at some of these myself. Your suggestions all look great to me, except that for... [05:21:57] 10Analytics-Kanban: Preserve userAgent field in apps schemas - https://phabricator.wikimedia.org/T164125#3331235 (10Tbayer) PS: There are also [[https://meta.wikimedia.org/w/index.php?search=%22MobileApp%2A%22&title=Special:Search&profile=advanced&fulltext=1&ns470=1 | four older app schemas]] (each last revised... [07:19:35] morning! 
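(Editor's sketch, referring back to the 00:05:02 suggestion to "guess around till you find a timestamp that's about where you want".) One way to make that guessing systematic is a binary search over offsets. The sketch below is only an illustration under stated assumptions: `timestamp_at` is a hypothetical callback that returns the timestamp of the message at a given offset (for instance by wrapping the `kafkacat ... -o` call shown above and parsing the message), and timestamps are assumed never to decrease as offsets grow. It is not part of any Kafka client API.

```python
from typing import Callable


def first_offset_at_or_after(
    target_ts: int,
    low: int,
    high: int,
    timestamp_at: Callable[[int], int],
) -> int:
    """Return the smallest offset in [low, high] whose timestamp is >= target_ts.

    Returns high + 1 when every message in the range is older than target_ts.
    Assumes timestamps are non-decreasing with offset (an assumption, not a
    guarantee made anywhere in the log).
    """
    while low <= high:
        mid = (low + high) // 2
        if timestamp_at(mid) >= target_ts:
            high = mid - 1  # mid is a candidate; look for an earlier one
        else:
            low = mid + 1   # mid is too old; search the newer half
    return low


if __name__ == "__main__":
    # Hypothetical stand-in for a topic: list index = offset, value = timestamp.
    fake_topic = [100, 105, 105, 110, 130, 131, 150]
    offset = first_offset_at_or_after(
        110, 0, len(fake_topic) - 1, fake_topic.__getitem__
    )
    print(offset)
```

Each probe costs one single-message fetch, so even a topic with millions of messages needs only a couple dozen lookups; the filtering mentioned at 00:05:08 would still be needed afterwards.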
[07:19:40] first failure for cache-maps :) [07:19:48] that is now officially gone away [07:21:42] !log suspended cache maps as temporary measure to avoid oozie spamming [07:21:44] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [07:23:24] I'd probably just kill it [07:25:31] !log kill maps webrequest load coordinator as temporary measure to avoid oozie spamming [07:25:32] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [07:34:13] for fun I am trying to clean up the refinery [07:35:48] ahhh and the camus config is also in puppet [07:39:10] https://gerrit.wikimedia.org/r/#/c/357768 [07:40:26] (03PS1) 10Elukey: Remove any trace of the maps cluster [analytics/refinery] - 10https://gerrit.wikimedia.org/r/357769 [07:43:19] (03PS2) 10Elukey: Remove any trace of the maps cluster [analytics/refinery] - 10https://gerrit.wikimedia.org/r/357769 [07:44:35] (03PS3) 10Elukey: Remove any trace of the maps cluster [analytics/refinery] - 10https://gerrit.wikimedia.org/r/357769 [07:54:52] hi elukey [07:56:54] elukey: on analytics1003 --> grep " Error on topic webrequest_maps" /var/log/camus/webrequest.log [07:57:21] elukey: maps data is indeed gone :) [08:03:20] 10Analytics-Kanban, 10Operations, 10ops-eqiad, 10Patch-For-Review, 10User-Elukey: Review Megacli Analytics Hadoop workers settings - https://phabricator.wikimedia.org/T166140#3331391 (10elukey) Based on several guides like http://download.intel.com/support/motherboards/server/sb/configuring_raid_for_opti... [08:03:56] joal: o/ [08:05:49] merged the puppet change, applying it to an1003 [08:07:50] great elukey [08:08:06] elukey: will monitor camus logs [08:09:49] joal: all good during the last days? [08:09:56] elukey: yes sir :) [08:10:01] elukey: busy, but good !@ [08:10:30] elukey: you? [08:14:07] joal: same :) [08:14:10] we missed you! 
[08:14:57] elukey: While I missed my daily time with you guys, I didn't miss work a lot ;) [08:15:03] I didn't stop long enough :) [08:16:41] elukey: no more errors in camusPartitionChecker :) [08:20:07] \o/ [08:28:19] (03CR) 10Joal: [C: 031] "One mini-nit, and great :)" (031 comment) [analytics/refinery] - 10https://gerrit.wikimedia.org/r/357769 (owner: 10Elukey) [08:34:20] (03PS4) 10Elukey: Remove any trace of the maps cluster [analytics/refinery] - 10https://gerrit.wikimedia.org/r/357769 [08:34:27] joal: like this?? --^ [08:35:33] Yay ! [08:35:53] \o/ [08:36:07] (03CR) 10Joal: [V: 032 C: 032] "Yes for me! Let's merge :)" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/357769 (owner: 10Elukey) [08:36:27] elukey: Have we dpeloyed the cluster either yesterday or the day before? [08:36:54] not that I am aware of, the last deploy IIRC was done by Marcel last week [08:36:59] but I could be wrong [08:37:13] elukey: okey [08:37:30] elukey: Will probably synchronise with nuria_ to deploy today [08:38:01] super [08:38:10] joal: whenever you have time I'd like to discuss with you https://phabricator.wikimedia.org/T166140#3331391 [08:38:26] (amending it, replicated comments :P) [08:39:53] elukey: sure, but I don't know really what it's about :) [08:40:07] me too! :D [08:40:28] jokes aside, the main goal is to unify our configuration for hadoop workers [08:40:40] basically we have a strange config for each of them [08:40:51] elukey: Understood that point [08:41:05] each of the 12 disks for data partition is one raid-0 (composed by one disk) [08:41:09] it is sort of a JBOD [08:41:13] elukey: What I don't get understand is why we need raid settings while I thought we were J [08:41:17] BJBOD sorry [08:41:46] hm - So we have 12 raid arrays of 1 disk each? 
[08:42:02] exactly [08:42:27] from our perspective it is essentially JBOD [08:42:55] last week Jaime added a check for all the dbs to verify that WriteBack was in place for raid controllers of dbs [08:43:25] hm - Not sure what it means [08:43:34] (writing :) [08:43:49] :D [08:44:12] dumb question from a no-hardware man: Why do we use RAID for single disks? [08:44:31] so I wasn't aware of WriteBack at all, but basically when a write happens the raid controller puts it in cache and acks the I/O op as done, without waiting for the disk to have effectively done it [08:45:07] the caveat to prevent losses is that a phisical battery called BBU powers the raid cache [08:45:34] so in case of power loss of the host data is preserved for X amount of time [08:45:36] elukey: You are my enlightener [08:45:39] ahahah [08:45:57] so we discovered two faulty BBUs with this alarm [08:46:19] because two hosts showed WriteThrough, that means "go straight to the I/O on disk, do not cache" [08:46:25] elukey: Other thing I found: it's interesting to have RAID configure even for single disks array to benefit from the caching of the raid controller [08:47:59] elukey: I like the config you suggested based on my very small understanding of what it means [08:48:21] super, this is what I wanted to ask you :) [08:49:29] 10Analytics-Kanban, 10Operations, 10ops-eqiad, 10Patch-For-Review, 10User-Elukey: Review Megacli Analytics Hadoop workers settings - https://phabricator.wikimedia.org/T166140#3286412 (10JAllemandou) Looks good to me (even if I don't understand in depth what it means). I particularly like the idea of havi... [08:49:34] elukey: just commented on ticket --^ [08:51:04] thanks! 
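(Editor's sketch of the WriteBack / WriteThrough check discussed above.) The alarm described at 08:42:55 boils down to: for every logical drive, look at the current cache policy and flag the ones that have fallen back to WriteThrough. The snippet below illustrates that idea on a hand-written report. The `SAMPLE_REPORT` text and its field names ("Virtual Drive", "Current Cache Policy") are assumptions modelled on the general shape of RAID controller output, not output captured from these hosts, and `drives_in_write_through` is a hypothetical helper name.

```python
import re
from typing import List

# Hand-written sample; the exact field names are an assumption.
SAMPLE_REPORT = """\
Virtual Drive: 0 (Target Id: 0)
Current Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache if Bad BBU
Virtual Drive: 1 (Target Id: 1)
Current Cache Policy: WriteThrough, ReadAdaptive, Direct, No Write Cache if Bad BBU
"""


def drives_in_write_through(report: str) -> List[int]:
    """Return the ids of virtual drives whose current policy is WriteThrough."""
    flagged: List[int] = []
    current_drive = None
    for line in report.splitlines():
        drive_match = re.match(r"\s*Virtual Drive:\s*(\d+)", line)
        if drive_match:
            current_drive = int(drive_match.group(1))
            continue
        if current_drive is not None and line.strip().startswith(
            "Current Cache Policy:"
        ):
            # Policy is a comma-separated list; the write mode is one entry.
            modes = [part.strip() for part in line.split(":", 1)[1].split(",")]
            if "WriteThrough" in modes:
                flagged.append(current_drive)
    return flagged


if __name__ == "__main__":
    print(drives_in_write_through(SAMPLE_REPORT))
```

A drive showing up in the result does not by itself prove a faulty BBU; as explained at 08:46:19 it only says writes are going straight to disk, which is what prompted the battery check.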
[09:06:24] joal: FYI I am upgrading zookeeper everywhere, if you notice something weird let me know [09:06:48] sure [09:26:28] main-codfw upgraded, druid just done [09:39:46] I'm upgrading the remaining mysql-connector-java packages on the hadoop cluster now, 1031 has been running this for other a week, so should be fine [09:40:27] super [10:02:35] 10Analytics-Tech-community-metrics, 10Developer-Relations (Jul-Sep 2017): Automatically sync mediawiki-identities/wikimedia-affiliations.json DB dump file with the data available on wikimedia.biterg.io - https://phabricator.wikimedia.org/T157898#3331629 (10Albertinisg) All right so, I've generated new indexes... [10:28:22] 10Analytics: Add 'page_is_redirect' field to the mediawiki_history Data Lake tables - https://phabricator.wikimedia.org/T167396#3331684 (10Tbayer) [10:28:42] !log executed megacli -LDSetProp NoCachedBadBBU -LALL -aALL on analytics1032 as test - T166140 [10:28:44] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [10:28:44] T166140: Review Megacli Analytics Hadoop workers settings - https://phabricator.wikimedia.org/T166140 [10:30:52] 10Analytics-Kanban, 10Operations, 10ops-eqiad, 10Patch-For-Review, 10User-Elukey: Review Megacli Analytics Hadoop workers settings - https://phabricator.wikimedia.org/T166140#3331701 (10elukey) Better view: ``` elukey@neodymium:~$ sudo cumin 'R:class = role::analytics_cluster::hadoop::worker' 'megacli -... [10:32:20] 10Analytics-Tech-community-metrics, 10Developer-Relations (Jul-Sep 2017): Automatically sync mediawiki-identities/wikimedia-affiliations.json DB dump file with the data available on wikimedia.biterg.io - https://phabricator.wikimedia.org/T157898#3331703 (10Aklapper) @Albertinisg: Nice! Please override the dump... [10:35:24] !log executed megacli -LDSetProp NoCachedBadBBU -LALL -aALL on analytics1049/45 [10:35:25] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [10:44:28] * elukey going to lunch! 
[12:16:21] !log run megacli -LDSetProp ADRA -LALL -aALL on analytics1047 to set ReadAheadAdaptive - T166140 [12:16:23] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [12:16:23] T166140: Review Megacli Analytics Hadoop workers settings - https://phabricator.wikimedia.org/T166140 [12:54:56] !log run megacli -LDSetProp ADRA -LALL -aALL on analytics1047 to set ReadAheadAdaptive on analytics[1042-1046,1048-1057].eqiad.wmnet - T166140 [12:54:59] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [12:55:00] T166140: Review Megacli Analytics Hadoop workers settings - https://phabricator.wikimedia.org/T166140 [12:56:04] I am doing --^ one host at the time with 120s between them [13:05:21] hi! :) [13:05:27] Hi milimetric !! [13:05:46] I'm back if anyone needs me, I'm going to catch up on all the pings [13:07:01] milimetric: how's going [13:07:02] ? [13:07:54] everything's ok [13:08:01] not great, but manageable [13:11:43] :) [13:15:27] hey milimetric glad you're back [13:15:46] hi! It's nice to be back [13:32:28] oo ya maps is gone! [13:32:40] joal: we should remove the coordinator from the bundle, ya? [13:32:51] ottomata: look at your gerrit emails ;) [13:32:56] sorry I didn't say hellooooooooo a-team [13:33:03] * joal points finger to elukey [13:33:07] ottomata: --^ [13:33:08] eh? gerrit emails? [13:33:10] catching up with a buncha emails [13:33:10] oh [13:33:13] hellooooooo fdans [13:33:14] sorry there are more emails [13:33:15] looking [13:33:17] Hi fdans :) [13:33:22] thought i got them all already [13:33:37] i was like "OO i know what to do! they haven't done it yet! " [13:33:40] of COURSE they have :) [13:34:40] Man, I arrived at usual time, elukey had done it all already [13:38:22] * elukey plays the gingle TTTEEEEAAAAAMMMMMM EEEEEUUUURROOOOPEEEEE [13:38:36] * elukey blames fdans for the brainwash [13:39:29] POZORRRR! 
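(Editor's sketch of the "one host at the time with 120s between them" rollout mentioned at 12:56:04.) The pattern is: expand the host expression from the !log entry into individual names, then apply the change host by host with a pause in between so a bad setting shows up before it reaches the whole cluster. Both `expand_hosts` and `rolling_apply` are hypothetical helpers written for illustration; how the command is actually run on each host (the `action` callback) is left open, and this is not how cumin or any other tool implements it.

```python
import re
import time
from typing import Callable, List, Sequence


def expand_hosts(expr: str) -> List[str]:
    """Expand 'analytics[1042-1046,1048-1057].eqiad.wmnet' into host names.

    Only handles a single bracket group of plain integers and ranges;
    zero-padded numbers are not preserved.
    """
    match = re.fullmatch(r"([^\[\]]*)\[([^\[\]]+)\]([^\[\]]*)", expr)
    if match is None:
        return [expr]
    prefix, body, suffix = match.groups()
    hosts: List[str] = []
    for part in body.split(","):
        start, _, end = part.partition("-")
        for number in range(int(start), int(end or start) + 1):
            hosts.append(f"{prefix}{number}{suffix}")
    return hosts


def rolling_apply(
    hosts: Sequence[str],
    action: Callable[[str], None],
    pause_seconds: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Call action(host) for each host, pausing between consecutive hosts."""
    for index, host in enumerate(hosts):
        if index:
            sleep(pause_seconds)  # no pause before the first host
        action(host)


if __name__ == "__main__":
    targets = expand_hosts("analytics[1042-1046,1048-1057].eqiad.wmnet")
    print(len(targets), targets[0], targets[-1])
```

Passing `sleep` as a parameter keeps the pause testable without actually waiting two minutes per host.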
[13:41:35] ahahahah [13:44:56] !log AQS cluster in beta wiped and re-bootstrapped due to T167222 [13:44:57] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [13:44:57] T167222: Cassandra table storage backend error in Deployment-prep - https://phabricator.wikimedia.org/T167222 [13:48:03] elukey joal aw I missed you [13:48:15] 10Analytics-Kanban, 10Operations, 10ops-eqiad, 10Patch-For-Review, 10User-Elukey: Review Megacli Analytics Hadoop workers settings - https://phabricator.wikimedia.org/T166140#3332220 (10elukey) Current status is: ``` elukey@neodymium:~$ sudo cumin 'R:class = role::analytics_cluster::hadoop::worker' 'meg... [13:48:27] ottomata: --^ [13:49:11] 10Analytics-Kanban: Preserve userAgent field in apps schemas - https://phabricator.wikimedia.org/T164125#3332222 (10mforns) Hey @Tbayer, thanks for looking into this! > Your suggestions all look great to me, except that for the MobileWikiAppPageScroll, the "funnel" aspect is actually the less important part. Ca... [13:54:47] 10Analytics: Use hive dynamic partitioning to split webrequest on tags - https://phabricator.wikimedia.org/T164020#3332225 (10JAllemandou) [13:55:05] 10Analytics-Kanban: Use hive dynamic partitioning to split webrequest on tags - https://phabricator.wikimedia.org/T164020#3218541 (10JAllemandou) [13:55:14] 10Analytics-Kanban: Use hive dynamic partitioning to split webrequest on tags - https://phabricator.wikimedia.org/T164020#3218541 (10JAllemandou) a:03JAllemandou [13:55:29] (03PS1) 10Joal: [WIP] Split webrequest into smaller datasets [analytics/refinery] - 10https://gerrit.wikimedia.org/r/357814 (https://phabricator.wikimedia.org/T164020) [13:55:46] ottomata: If you have a minute, can you take a look at --^ ? 
[13:56:10] Ok, taking a break now a-team [13:56:27] cyaaa [14:03:49] joal: was just reading about some hive timestamp stuff [14:03:50] https://issues.apache.org/jira/browse/HIVE-9298 [14:04:02] would be nice, would allow us to keep the storage value as ISO 8601 [14:04:44] but we don't have hive 1.2 :( [14:23:29] mforns: o/ - I added the whitelist field check that you asked yesterday to https://gerrit.wikimedia.org/r/356383 [14:23:46] (03CR) 10Ottomata: "Thinking about the name 'split'. Maybe we should refer to this as webrequest_exploded? Or something else? Also, I would like to see mor" (031 comment) [analytics/refinery] - 10https://gerrit.wikimedia.org/r/357814 (https://phabricator.wikimedia.org/T164020) (owner: 10Joal) [14:23:46] elukey, will review :D [14:24:33] (plus test added) [14:24:53] mforns: is there anything else pending to do for the script? [14:25:01] elukey, I don't think so [14:25:05] super [14:25:50] (03CR) 10Ottomata: [WIP] Split webrequest into smaller datasets (031 comment) [analytics/refinery] - 10https://gerrit.wikimedia.org/r/357814 (https://phabricator.wikimedia.org/T164020) (owner: 10Joal) [14:26:04] elukey, I think though we should also check that all schema prefixes in the white-list also have at least 1 table in the db [14:26:16] I mean check that outside sanitize [14:26:45] so that if the whitelist has a typo in the schema name, then we do not execute the script and purge data that is expected to live [14:27:05] will comment in-line in gerrit [14:27:23] I'll add it now, not needed [14:27:27] good point though [14:32:21] ooo elukey there is a kafka 0.11.0.0 RC out... 
:o [14:32:49] * Exactly-once delivery and transactional messaging [14:32:49] * Streams exactly-once semantics [14:32:49] * Admin client with support for topic, ACLs and config management [14:32:49] * Record headers [14:32:49] * Request rate quotas [14:32:49] * Improved resiliency: replication protocol improvement and single-threaded controller [14:32:49] * Richer and more efficient message format [14:34:09] wow exactly-once * is big [14:34:24] (03CR) 10Joal: "We can definitely use another database to keep that table, like for instance wmf_exploded, or wmf_optim." (031 comment) [analytics/refinery] - 10https://gerrit.wikimedia.org/r/357814 (https://phabricator.wikimedia.org/T164020) (owner: 10Joal) [14:36:02] elukey, oh, didn't read and commented anyway in gerrit, sorry [14:36:17] no need, the solution looks great! [14:42:10] elukey: we will have to see of course, but even though its just RC, i'm inclined to wait for the release for our upgrade :o [14:42:22] it is a major version upgrade, and it will be reallly nice not to have to do that live [14:42:25] but install on a new cluster [14:43:23] yep definitely [14:43:29] but let's make sure that it is not too new :) [14:43:41] haha [14:43:42] indeed [14:47:35] hola [14:47:37] bearloga: FYI regarding maps cluster: https://gerrit.wikimedia.org/r/#/c/357769/ [15:10:27] 10Analytics-Kanban, 10Patch-For-Review: Test failures in refinery master - https://phabricator.wikimedia.org/T166334#3332565 (10Nuria) a:03Nuria [15:27:35] 10Analytics, 10Analytics-Cluster, 10Operations, 10ops-eqiad, 10Patch-For-Review: rack/setup/install replacement stat1006 (stat1003 replacement) - https://phabricator.wikimedia.org/T165366#3332608 (10Cmjohnson) [15:29:32] 10Analytics, 10Analytics-Cluster, 10Operations, 10ops-eqiad, 10Patch-For-Review: rack/setup/install replacement to stat1005 (stat1002 replacement) - https://phabricator.wikimedia.org/T165368#3332623 (10Cmjohnson) [15:37:32] mforns: new version of el_cleaner out! 
:) [15:39:27] elukey, looking :] [15:48:05] Analytics, Analytics-EventLogging: EventLogging tests fail for python 3.4 in Jenkins - https://phabricator.wikimedia.org/T164409#3332667 (Nuria) p:Normal>High [15:48:19] Analytics-EventLogging, Analytics-Kanban: EventLogging tests fail for python 3.4 in Jenkins - https://phabricator.wikimedia.org/T164409#3232541 (Nuria) [15:51:13] Analytics: Serve global unique device counts externally - https://phabricator.wikimedia.org/T157981#3332674 (Nuria) There are two parts: - serve files (ongoing) - changes on aqs to serve two values on api [15:51:53] Analytics: Serve global unique device counts externally - https://phabricator.wikimedia.org/T157981#3332677 (Nuria) p:Normal>Low [15:56:02] Analytics: Provide top domain and data to truly test superset - https://phabricator.wikimedia.org/T166689#3332698 (Nuria) Superset is python and thus harder to deploy than a node counterpart. Can we deploy this virtual env (whole environment is uploaded to gerrit) Superset can consume from druid as it is. [15:56:17] Analytics: Setup superset to consume from mysql slaves - https://phabricator.wikimedia.org/T167427#3332699 (Nuria) [15:56:46] Analytics: Configure superset to query mysql slaves - https://phabricator.wikimedia.org/T167427#3332699 (Nuria) [15:57:48] Analytics: Configure superset to query mysql slaves - https://phabricator.wikimedia.org/T167427#3332699 (Nuria) [15:58:32] Analytics: Provide top domain and data to truly test superset - https://phabricator.wikimedia.org/T166689#3332725 (Nuria) p:Triage>Normal [16:00:31] Analytics, Analytics-Cluster: Upgrade Analytics Cluster to Java 8 - https://phabricator.wikimedia.org/T166248#3289857 (Nuria) p:High>Normal [16:17:28] Analytics, Analytics-EventLogging: Bulk/Batch event endpoint - https://phabricator.wikimedia.org/T166249#3332848 (Nuria) Tasks: * how do we handle posts? Does varnishlog have access to post body? 
(likely nbt) * do we need a new endpoint to consume this (outside varnish) , say node consumer that sends da... [16:19:00] Analytics, Analytics-EventLogging: Bulk/Batch event endpoint - https://phabricator.wikimedia.org/T166249#3332850 (Nuria) p:Triage>Normal [16:23:10] Analytics-Kanban, Operations, Traffic, Patch-For-Review: Replace Analytics XFF/client.ip data with X-Client-IP - https://phabricator.wikimedia.org/T118557#3332875 (Nuria) [16:25:14] Analytics: Alarms on pageview API latency increase - https://phabricator.wikimedia.org/T164243#3332895 (Nuria) [16:25:26] Analytics-Kanban: Alarms on pageview API latency increase - https://phabricator.wikimedia.org/T164243#3226937 (Nuria) [16:25:54] Analytics: Reinstate a subset of reports removed from the reportcard until WikiStats 2.0 is back - https://phabricator.wikimedia.org/T166679#3332902 (Nuria) [16:26:32] Analytics-Kanban: Document that old deleted pages have empty fields in Analytics Cluster edit data - https://phabricator.wikimedia.org/T165201#3332903 (Nuria) a:mforns [16:27:19] Analytics-Kanban, DBA, Operations, ops-eqiad, User-Elukey: db1046 BBU looks faulty - https://phabricator.wikimedia.org/T166141#3332904 (elukey) @Cmjohnson sorry to ping :) Any idea if we have a spare BBU for db1046? [16:29:28] Analytics-Kanban, DBA, Operations, ops-eqiad, User-Elukey: db1046 BBU looks faulty - https://phabricator.wikimedia.org/T166141#3332932 (Cmjohnson) @elukey. Yes, I have another decommissioned r510 to take it from. Ping you in a hour or so to replace [16:35:30] Analytics, Analytics-EventLogging, Reading Epics (Analytics): Bulk/Batch event endpoint - https://phabricator.wikimedia.org/T166249#3332965 (Fjalapeno) @Ottomata that was me you were talking with at the offsite earlier this year… And yeah, sounds like a POST would be ideal here. @Nuria let me know... [16:35:50] ottomata: ---^ db1046's bbu - when/how should we replace it? 
[16:40:31] Analytics: Implement purging settings for Schema:ReadingDepth - https://phabricator.wikimedia.org/T167439#3332973 (Tbayer) [16:41:07] !log deploying refinery to cluster [16:41:08] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [16:41:42] Analytics-Kanban, DBA, Operations, ops-eqiad, User-Elukey: db1046 BBU looks faulty - https://phabricator.wikimedia.org/T166141#3332986 (elukey) @Cmjohnson thanks! Would it be possible to do the swap next week? Since this is an important DB I'd need to coordinate my team and Jaime/Manuel first. [16:41:59] just replied that we'd need to decide when [16:42:52] Analytics, Analytics-EventLogging, Reading Epics (Analytics): Bulk/Batch event endpoint - https://phabricator.wikimedia.org/T166249#3332987 (Nuria) @Fjalapeno : we think we might be able to start working on this possibly by end of Q2 [16:43:40] Analytics-Kanban: Preserve userAgent field in apps schemas - https://phabricator.wikimedia.org/T164125#3332992 (Tbayer) >>! In T164125#3332222, @mforns wrote: > Hey @Tbayer, thanks for looking into this! > ... > >> PS: There are also four older app schemas (each last revised in 2013, but still active) that... [16:46:44] all right people logging off, will do some peer reviews later on [16:46:49] byyyeeee [16:46:50] * elukey off! 
[16:46:58] kudos to a-team on updating docs about cluster deploy [16:51:46] a-team: deployed new reginery code (not source , just oozie's) [16:51:49] *refinery [16:58:53] Thanks nuria_, will take on reordering uniques jobs later [17:13:08] Analytics, Page-Previews: Update purging settings for Schema:Popups - https://phabricator.wikimedia.org/T167449#3333176 (Tbayer) [17:22:35] Analytics: Implement purging settings for Schema:MobileWikiAppFeed - https://phabricator.wikimedia.org/T167453#3333268 (Tbayer) [17:24:29] Analytics, Pageviews-API: Endpoint for average view rate in Pageview API - https://phabricator.wikimedia.org/T162933#3333284 (Nettrom) Coming back to this I have a bunch of questions, so I'll just ask them and see where we go from there. Apologies if this is counterproductive, feel free to let me know ho... [17:29:32] Analytics: Implement purging settings for Schema:MobileWikiAppFeed - https://phabricator.wikimedia.org/T167453#3333313 (Tbayer) (This is actually already covered T164125#3300396 - sorry, got mixed up here.) [17:31:54] Analytics: Implement purging settings for Schema:MobileWikiAppFeed - https://phabricator.wikimedia.org/T167453#3333323 (Tbayer) [17:44:23] Analytics-Kanban: Preserve userAgent field in apps schemas - https://phabricator.wikimedia.org/T164125#3333362 (Dbrant) ^ This is correct. The fields in the schemas you mentioned do not need to be preserved. [18:20:43] nuria_: I was just looking at https://phabricator.wikimedia.org/T166335, this is throwing errors in prod but the team triaged it low [18:20:58] shouldn't I work on it now? 
[18:21:32] Analytics-Dashiki, Analytics-Kanban, Wikimedia-log-errors: Warning: JsonConfig: Invalid $wgJsonConfigModels['JsonConfig.Dashiki'] array value, 'class' not found - https://phabricator.wikimedia.org/T166335#3333542 (Milimetric) p:Low>Normal a:Milimetric [18:22:30] milimetric: if the errors are consistent (was not clear from report) please do [18:22:42] yeah, it's apparently throwing this error all the time [18:23:02] I forgot about it this morning and people looking at logs see it and remind me [18:50:16] team, I just realised that I forgot a dependency in oozie changes for uniques [18:50:38] Analytics, Analytics-Cluster, Operations, ops-eqiad, Patch-For-Review: rack/setup/install replacement stat1006 (stat1003 replacement) - https://phabricator.wikimedia.org/T165366#3333636 (Cmjohnson) [18:51:30] Analytics-Dashiki, Analytics-Kanban, Wikimedia-log-errors: Warning: JsonConfig: Invalid $wgJsonConfigModels['JsonConfig.Dashiki'] array value, 'class' not found - https://phabricator.wikimedia.org/T166335#3333637 (Milimetric) From analyzing logstash (just searching for [[ https://logstash.wikimedia.o... [18:51:47] (PS1) Joal: Update cassandra loading for unique devices [analytics/refinery] - https://gerrit.wikimedia.org/r/357871 [18:58:55] Analytics, Analytics-Cluster, Operations, ops-eqiad, Patch-For-Review: rack/setup/install replacement stat1006 (stat1003 replacement) - https://phabricator.wikimedia.org/T165366#3333657 (Cmjohnson) a:Cmjohnson>RobH @robh added mac address to dhcpd already, verified on switch that it's... 
[19:08:40] Analytics, Analytics-Cluster, Operations, ops-eqiad, Patch-For-Review: rack/setup/install replacement to stat1005 (stat1002 replacement) - https://phabricator.wikimedia.org/T165368#3333707 (Cmjohnson) [19:09:15] Analytics, Analytics-Cluster, Operations, ops-eqiad, Patch-For-Review: rack/setup/install replacement to stat1005 (stat1002 replacement) - https://phabricator.wikimedia.org/T165368#3264256 (Cmjohnson) a:Cmjohnson>RobH [19:21:14] (CR) Nuria: "Friendly reminder to us that these split datasets also need removal after 90 days." [analytics/refinery] - https://gerrit.wikimedia.org/r/357814 (https://phabricator.wikimedia.org/T164020) (owner: Joal) [19:33:30] (CR) Nuria: Update cassandra loading for unique devices (2 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/357871 (owner: Joal) [19:42:09] Analytics, Analytics-EventLogging, Contributors-Analysis, EventBus, and 5 others: Record an EventLogging event every time a new content namespace page is created - https://phabricator.wikimedia.org/T150369#3333810 (kaldari) @Ottomata - Now that we're collecting all the data via EventBus, what's t... [19:51:24] Analytics, Analytics-EventLogging, Contributors-Analysis, EventBus, and 5 others: Record an EventLogging event every time a new content namespace page is created - https://phabricator.wikimedia.org/T150369#3333855 (Nuria) @kaldari: I think we need to describe what are we counting/exposing/visuali... [20:37:24] Analytics, Page-Previews, Reading-Web-Backlog: Update purging settings for Schema:Popups - https://phabricator.wikimedia.org/T167449#3334036 (Jdlrobson) @tbayer or @mforns are you planning to do this or do you need web team assistance (ie. do we need to plan it for an upcoming sprint?) [20:43:21] Hi, I'm curious how redirects are handled in the wmf.pageview_hourly table. How would a view coming from a redirect be logged? 
[20:43:34] o/ hey folks [20:43:53] o/ [20:44:00] I had a thread with Nuria about this, but it's still confusing and hall1467 wants to get view rates matched with page IDs for his work [20:46:04] halfak: in order to use that data you need to first asses of its quality [20:46:41] nuria_, yeah. That's a whole research project. [20:47:03] halfak: agreed [20:47:08] Before we assess it's quality we want to learn why you question its quality. [20:47:17] halfak: that is why that disclaimer is there [20:47:34] nuria_, why not just put that disclaimer on all the data? [20:48:07] halfak: cause quality measures are in place for much of our data but we have never vetted page_id [20:48:17] halfak: taht is not true on all fields on x-nalytics [20:48:21] *that [20:48:28] *x _analytics [20:48:51] page_id is basically a project that (as far as i know) was never completed but hey, would love to be told otherwise [20:49:01] ^ halfak [20:50:04] halfak: also pageview hourly only has data for pageviews [20:50:13] halfak: if you are talking about 301 redirects [20:50:28] halfak: those are never propagated to taht table cc hall1467 [20:50:32] *that [20:50:56] halfak: pageveiws are not 301 , 307 or 302s [20:50:59] oh good. Wouldn't have been a concern either way though. [20:51:00] cc hall1467 [20:53:42] nuria_, what makes you believe that we might get page_ids from redirecting pages? [20:53:56] what's the mechanism by which the page_id is identified? [20:54:08] halfak: because we never checked that it would be otherwise [20:54:32] halfak: this one: https://github.com/wikimedia/mediawiki-extensions-WikimediaEvents/blob/master/WikimediaEventsHooks.php#L39 [20:54:43] halfak: not sure if that is what you are asking [20:57:34] 10Analytics, 10Page-Previews, 10Reading-Web-Backlog: Update purging settings for Schema:Popups - https://phabricator.wikimedia.org/T167449#3334058 (10Tbayer) @Jdlrobson As you can see, it's assigned to a member of the Analytics team. 
I expect that no work from the web team will be required, so feel fre to re... [21:02:24] Sorry, I'm a bit confused here. So basically, we don't know if a redirect page gets logged in this table or not? [21:12:53] nuria_ [21:38:59] What's the recommended pattern for querying hive and writing the result to a file? [21:53:55] halfak: hive -f some.hql > out.txt [21:54:16] hall1467: no, we know in that table we do not have redirect information [21:54:18] Cool. stdout should be clean then? This is nice because I can stream compress the output :) [21:55:19] hall1467: if that is the info you need you need to seek it elsewhere, that table only has information about pageviews [21:56:04] halfak: I think so, I normally do that on a screen and after if files are huge i just gzip them
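
(Editor's sketch of the "query hive, write the result to a file, stream compress" pattern from 21:53:55 to 21:56:04.) The snippet below runs a command and streams its stdout straight into a gzip file, so the uncompressed result never has to sit on disk. `run_to_gzip` is a hypothetical helper, and it rests on the assumption voiced in the channel ("I think so") that the query result is the only thing written to stdout.

```python
import gzip
import shutil
import subprocess
from typing import Sequence


def run_to_gzip(command: Sequence[str], out_path: str) -> int:
    """Run command, stream its stdout into a gzip file, return the exit code.

    stderr is left untouched, so log output from the command still reaches
    the terminal (or the screen session) instead of the output file.
    """
    with subprocess.Popen(command, stdout=subprocess.PIPE) as proc:
        with gzip.open(out_path, "wb") as compressed:
            # Copy in chunks; the full result is never held in memory.
            shutil.copyfileobj(proc.stdout, compressed)
    # Leaving the Popen context waits for the process, so returncode is set.
    return proc.returncode


# Intended use, mirroring the command quoted in the channel
# (not executed here because it needs a hive installation):
#     run_to_gzip(["hive", "-f", "some.hql"], "out.txt.gz")
```

In a shell the same thing is the one-liner `hive -f some.hql | gzip > out.txt.gz`; the Python form mainly helps when the exit code needs to be checked before the file is used.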