[07:35:18] anyone knows what is the format/encoding of the title page in the raw pagecounts?
[08:15:24] Analytics, Discovery: Many crapped encoded titles in pagecounts raw - https://phabricator.wikimedia.org/T137013#2355162 (eranroz)
[09:24:32] joal: any chance that you are around?
[09:24:39] elukey: I am :)
[09:24:45] hellooooo
[09:24:48] What-s up ?
[09:24:53] HiiiiiIIIIIiii :)
[09:24:58] https://grafana.wikimedia.org/dashboard/db/kafka?panelId=35&fullscreen
[09:25:26] kafka1022 is loosing 1% of free space every two hours
[09:26:03] checking in the data logs I would need to lower down the upload global retention to 1 day
[09:26:06] to purge
[09:26:51] ok
[09:27:51] -rw-r--r-- 1 kafka kafka 512M Jun 3 08:35 00000000012543942505.log
[09:27:51] -rw-r--r-- 1 kafka kafka 137K Jun 3 08:35 00000000012543942505.index
[09:27:54] elukey@kafka1022:/var/spool/kafka/b/data/webrequest_text-16$ date
[09:27:57] Sat Jun 4 09:17:54 UTC 2016
[09:28:15] the first log is the last line of ls -lht /var/spool/kafka/b/data/webrequest_text-16
[09:28:58] it will purge a lot of other logs across the brokers but I think it is not avoidable :(
[09:29:26] so the proposal is
[09:29:34] kafka configs --alter --entity-type topics --entity-name webrequest_upload --add-config retention.ms=86400000
[09:32:44] joal: ---^
[09:33:01] elukey: a lot of purge indeed !
[09:33:10] elukey: if nothing better, go for it
[09:33:21] most of the logs have Jun 3 after the cluster restart :(
[09:33:37] mwarf
[09:36:13] Analytics, Analytics-Cluster: Kafka 0.9's partitions rebalance causes data log mtime reset messing up with time based log retention - https://phabricator.wikimedia.org/T136690#2355215 (elukey) After the cluster restart, kafka1022 doesn't look good: ``` elukey@kafka1022:/var/spool/kafka/b/data/webrequest...
[09:36:43] joal: ok proceeding
[09:38:23] !log Lowering down temporarily the Analytics kafka upload retention time to 24h to free space (T136690)
[09:38:24] T136690: Kafka 0.9's partitions rebalance causes data log mtime reset messing up with time based log retention - https://phabricator.wikimedia.org/T136690
[09:38:24] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log, Master
[09:47:50] !log removed temporary Analytics Kafka upload retention override (T136690)
[09:47:51] T136690: Kafka 0.9's partitions rebalance causes data log mtime reset messing up with time based log retention - https://phabricator.wikimedia.org/T136690
[09:47:51] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log, Master
[09:47:57] joal: all good now :)
[09:48:05] thanks for double checking
[10:03:26] elukey: I didn't do anything :)
[10:03:30] Thanks YOU :)
[10:04:08] have a good weekend :)
[10:05:09] You too !
[10:56:30] http://blog.cloudera.com/blog/2016/05/new-in-cdh-5-7-improved-performance-security-and-sql-experience-in-hue/
[16:31:59] Analytics, Commons, Multimedia, Tabular-Data, and 4 others: Review shared data namespace (tabular data) implementation - https://phabricator.wikimedia.org/T134426#2355421 (Yurik)
[16:48:33] heads up analytics - i just sent an email to wikitech-l re shared data ns. Please comment and support :)
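
Note on the retention override discussed at 09:29:34: the value `retention.ms=86400000` is 24 hours in milliseconds, and time-based retention compares a segment's age (derived from its file mtime, per T136690) against that value. The sketch below redoes that arithmetic for the segment shown at 09:27:51. The helper names `retention_ms` and `segment_expired` are hypothetical, the `ls` timestamp is assumed to be UTC, and the 168-hour figure is Kafka's stock default rather than this cluster's setting, which the log does not state.

```python
from datetime import datetime, timedelta, timezone


def retention_ms(hours: float) -> int:
    """Convert a retention period in hours to milliseconds (hypothetical helper)."""
    return int(timedelta(hours=hours).total_seconds() * 1000)


def segment_expired(mtime: datetime, now: datetime, retention: int) -> bool:
    """Return True when the segment's age in ms exceeds the retention in ms.

    Mirrors the idea of mtime-based retention only; it is not Kafka's code.
    """
    age_ms = (now - mtime) / timedelta(milliseconds=1)
    return age_ms > retention


# Values taken from the log; the mtime timezone (UTC) is an assumption.
segment_mtime = datetime(2016, 6, 3, 8, 35, tzinfo=timezone.utc)
now = datetime(2016, 6, 4, 9, 17, 54, tzinfo=timezone.utc)

one_day = retention_ms(24)      # the override proposed in the chat
one_week = retention_ms(168)    # Kafka's default, used here only for contrast

print(one_day)
print(segment_expired(segment_mtime, now, one_day))
print(segment_expired(segment_mtime, now, one_week))
```

Under these assumptions a segment whose mtime was reset to Jun 3 08:35 only becomes eligible for deletion once retention drops to about a day, which is consistent with the override being applied temporarily and then removed at 09:47:50.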