[06:23:43] joal: o/
[06:23:51] I am reading https://issues.apache.org/jira/browse/KAFKA-1489
[06:24:51] and it seems to me that the 'log' retention.bytes refers to each partition, not the global topic.. this could explain why we didn't get any effect yesterday
[06:26:53] (brb)
[07:03:53] I took a look at the kafka metrics and came up with: https://grafana-admin.wikimedia.org/dashboard/db/kafka?panelId=37&fullscreen
[07:04:52] aliasByNode(sortByMaxima(highestMax(kafka.cluster.$cluster.kafka.$kafka_brokers.kafka.log.Log.Size.*.Value, 6)), 9)
[07:05:11] so the Log Size metric is related to each partition
[07:05:15] that makes sense
[07:05:49] top 6 are of course text related and they are on avg ~600GiB
[07:06:24] sorry, high peak is ~600/650 GiB
[07:07:04] so the retention.bytes is about broker partitions imho
[07:07:13] s/imho/afaik
[07:08:01] I renamed the graph "Broker Log Size" to "Aggregate Partitions size per node"
[07:08:48] because in there we sum all partitions' sizes per node regardless of the topic
[07:10:19] mmmm no 12TB per node is impossible
[07:12:15] no it is not, just did the calculation of /var/spool/kafka/* sizes on kafka1020
[07:12:36] :O
[07:20:24] so probably limiting retention.bytes to 600GB would do the trick for us
[07:49:38] * elukey commutes to the office
[08:27:07] Hi elukey
[08:29:18] o/
[08:29:36] Thanks for the explanations elukey !
[08:29:41] Makes a lot of sense :)
[08:29:45] \o/
[08:30:26] I think you can try setting retention.bytes to either 600GiB or even 550GiB if you prefer to have some mitigation
[08:31:09] I am wondering if I test it in labs, or maybe in upload/misc first
[08:31:30] I am 90% positive that this will work but if I got it wrong there is the possibility to wipe a lot of data :D
[08:31:57] elukey: I'd have no objection to determining a peak value for misc and testing with a lower value on that one
[08:32:21] poor misc :P
[08:32:37] huh
[08:34:05] I am also about to merge the vk config change
[08:34:12] YAY !
[08:34:30] elukey: Let me know when done, I'd like to double check the dataloss :)
[08:35:29] I need to merge https://gerrit.wikimedia.org/r/#/c/293327/, sanity checking with traffic atm :P
[08:38:26] ok cool
[08:53:52] joal: puppet is running!
[08:54:10] * joal is in the starting blocks
[08:55:12] it has run on cp3009.esams.wmnet and tailing kafka logs from stat1002 looks good
[08:55:18] timestamp is correctly displayed
[08:55:22] cool
[08:56:11] so, UTC wise, I should see a difference starting from 9:00
[08:56:15] elukey: --^ right ?
[08:57:18] I would say probably from 10:00, puppet runs staggered across the cp* servers and it'll take a bit before upgrading vk
[08:57:43] elukey: ok :)
[08:57:57] :)
[08:59:49] elukey: thanks for yesterday's merge, sir!
[09:00:12] Hi urandom
[09:00:20] long time no talk, urandom :)
[09:00:24] joal: yeah!
[09:00:46] urandom: Wanted to thank you for the good advice on compaction, multi-instance etc
[09:01:05] joal: oh, no worries
[09:01:06] currently testing with more and more data, and results are encouraging :)
[09:02:04] joal: i'd still like to sit down some day and look at (and think about) your data model
[09:02:19] urandom: thanks for the sir! :)
[09:02:32] heh
[09:02:43] urandom: completely agreed, I have the feeling that it's not as optimised as it could be
[09:03:06] urandom: I think it's 'not bad', but could be better
[09:09:44] joal: are you going to wikimania?
[09:09:53] urandom: I am not :(
[09:10:02] urandom: do you stay in Europe a bit after?
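
A quick way to cross-check the per-partition interpretation above is to look at a broker's data directory directly: each topic partition is a directory containing that partition's log segments. A minimal sketch, assuming the /var/spool/kafka data directory mentioned above and GNU coreutils; the webrequest_text glob is illustrative:

    # Size of each topic-partition directory on this broker.
    du -sh /var/spool/kafka/webrequest_text-* | sort -h

    # Total across all partitions on the node, i.e. what the renamed
    # "Aggregate Partitions size per node" graph is summing.
    du -shc /var/spool/kafka/* | tail -1
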
[09:10:17] Oh, I recall we discussed you were in Berlin just before us
[09:10:26] joal: we have an offsite the week after
[09:10:45] urandom: we are in berlin next week
[09:10:53] what a shame :(
[09:10:59] joal: i know :(
[09:11:05] joal: I'm in Berlin right now
[09:11:13] mwarf
[09:11:13] i leave just in time for you all to arrive
[09:12:20] timing fail.
[09:12:27] urandom: yeah
[09:12:59] urandom: And then I have no plan to move to the US, and I assume you are not coming to Europe soon
[09:13:32] probably not this year, no
[09:13:48] Right .... So the correct schedule would be next year's all hands
[09:13:54] heh
[09:13:58] It's a bit far, but at least we have it
[09:14:08] true
[09:14:49] It'll allow us to gather more experience with the current compaction etc before trying something new :)
[09:15:00] joal: we could always compromise and find some place midway, perhaps Jamaica, or the Bahamas?
[09:15:09] urandom: works for me !!!
[09:15:12] :)
[09:15:32] if the foundation agrees on the need, let's do it ;)
[09:21:17] joal: vk restarted on all misc-maps nodes
[09:21:23] ok great
[09:21:24] so we are running the new version now
[09:21:27] Will monitor dataloss
[09:21:55] elukey: the a-team owes you beers for this vk stuff !
[09:22:14] let's wait to see if it works before that :P
[09:22:18] huhu
[10:14:56] joal: I am trying on our kafka brokers in labs
[10:15:03] k elukey
[10:15:11] and the retention.bytes override doesn't seem to work
[10:15:12] :/
[10:15:22] mwarf? :/
[10:16:38] yeah we have 4MB partitions in there for webrequest_test, so I added retention.bytes=2000000 to test
[10:16:41] no action
[10:17:15] that's weird though ... Have you tried to remove the temporal limit?
[10:17:27] Like maybe when there is both, it takes only time?
[10:17:45] I removed the topic override for retention.ms, but not the global config
[10:57:55] joal: no joy, tried also with the global config removing retention.ms
[10:58:00] so it must be something else
[10:59:04] mmm maybe because inside the partitions there is only one log
[10:59:20] err only one segment
[10:59:22] mmmm
[10:59:49] no at this point I think that misc is the best bet
[11:03:52] urandom: whenever you have time can we chat about https://gerrit.wikimedia.org/r/#/c/292568/ ?
[11:10:00] elukey: I'm ok for a misc try, but possibly better to wait for ottomata
[11:11:12] so we could set misc to retention.bytes=2000000000
[11:11:16] that is 2GB
[11:11:39] we are now at 2.3GiB
[11:12:13] sure we can wait!
[11:12:15] going to lunch now
[11:12:20] ttl
[12:33:38] mmmm oozie registered some loss for 2016-06-09-11
[12:34:01] hope that this will be the last spam email
[12:59:20] elukey: re: https://gerrit.wikimedia.org/r/#/c/292568, I think we should go with the left-hand option
[12:59:32] elukey: the one that hashes to fbc35ba7
[12:59:39] :)
[13:03:45] urandom: sorry didn't get it
[13:04:07] yeah, sorry, i have a horrible sense of humor
[13:04:51] i was inspired by jzerebecki's: "I would suggest to go with the only winning option."
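
On the labs test above (retention.bytes=2000000 on webrequest_test with no visible effect): the "only one segment" hunch in the conversation is consistent with how size-based retention works, since Kafka only deletes whole closed segments and never the active one, so a partition made of a single open segment will not shrink. A hedged sketch of how that could be checked and worked around on a test topic; the data-directory path and the 1 MB segment size are assumptions, and the `kafka configs` wrapper syntax mirrors the commands pasted later in this log:

    # If a test partition holds a single .log file, it is the active
    # segment and size-based retention cannot delete it yet.
    ls -lh /var/spool/kafka/webrequest_test-0/

    # Smaller segments on the test topic would let old segments roll
    # and then be deleted once retention.bytes is exceeded.
    kafka configs --alter --entity-type topics --entity-name webrequest_test \
        --add-config segment.bytes=1000000
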
[13:07:09] elukey: i was hoping that alexandros would chime in again, since his "And yes you can parameterize the module class to accept the contact_groups but..." statement seemed to indicate that he thought we should do something other than, well... parameterizing the module class
[13:07:10] :)
[13:07:41] elukey: but we do exactly that all over the place
[13:08:18] elukey: anyway, my puppet-fu is weak, so I defer to your judgement
[13:09:29] urandom: :) So the super correct thing to do would be not to include specific monitoring (installation related) settings in a module but defer it to the role configuration
[13:09:31] your plan to work up an alternative to have something concrete to discuss sounds good to me
[13:09:53] but a configurable monitoring class would be good imho
[13:09:57] elukey: is there an example of that elsewhere in the repo?
[13:11:12] let me see
[13:17:51] urandom: I tried to search "nrpe::monitor_service" in the puppet repo
[13:18:08] and effectively I can see it in a lot of role classes
[13:18:09] heh. yes I didn't look at the change.
[13:19:32] or for example modules/base/manifests/monitoring/host.pp
[13:19:44] elukey: just double checked values in dataloss for misc: getting higher, from 0 to 12 :(
[13:21:41] that doesn't make any sense
[13:21:56] elukey: indeed :(
[13:22:19] do you have examples of holes
[13:22:20] ?
[13:22:33] I have not checked details, only globals
[13:23:05] buuuu
[13:23:07] :(
[13:23:33] so this means that we probably need to add the sequence timestamp too
[13:24:03] hm
[13:24:21] but data first
[13:24:34] I want to see how we are doing now
[13:24:43] and why it is worse
[13:25:07] elukey: I'm not sure if it's really worse, or just a side effect
[13:25:16] we assumed that picking the timestamp closer to when the sequence number was generated would have bucketed things clearly
[13:25:23] Could be a side effect of restarts
[13:26:51] Let's wait a few hours before taking action
[13:27:18] * elukey puts his battle gear back into the closet
[13:27:28] :)
[13:31:28] elukey: you keep your battle gear in the closet?
[13:32:45] urandom: yes, always useful when you need to write code
[13:32:46] :D
[13:33:37] elukey: oh, i use a shovel and rubber boots for that!
[13:34:08] ahahhaha
[13:36:48] I just wear a helmet for when I have to bang my head into the desk :)
[13:38:20] joal / mforns: thanks for the advice, relaunching with that. I had tried 2G of driver memory on Madhu's advice last night, but that wasn't enough. I'll try 4G and then I'll keep thinking about the algorithm if that doesn't work either.
[13:38:41] Hi milimetric :)
[13:38:43] but 2G died even with me sampling the page table at 5%!! so I was thinking the algorithm probably needs work
[13:38:46] hi milimetric OK
[13:39:13] mmmmmhhhh
[13:39:23] milimetric: Actually the thing might fail even before starting the actual work depending on how you read data
[13:39:30] milimetric, let me know if you want to mob program :]
[13:39:41] milimetric, mforns : YAY !
[13:39:59] heh, ok, I'll see how it runs on 4G and think about it some more, then maybe batcave later? Are you both around until standup?
[13:41:36] Analytics, Services: Refactor the default cassandra monitoring into a separate class - https://phabricator.wikimedia.org/T137422#2367810 (elukey)
[13:41:56] urandom: --^
[13:42:00] Mostly milimetric
[13:42:12] Analytics, Services, cassandra: Refactor the default cassandra monitoring into a separate class - https://phabricator.wikimedia.org/T137422#2367822 (Eevans)
[13:42:20] elukey: yup!
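
On the driver-memory experiments milimetric mentions above: a hedged sketch of how the memory bump is typically passed to a Spark-on-YARN job; the class and jar names are placeholders rather than the actual job, and 4g is simply the value from the conversation:

    # Placeholder class and jar names, for illustration only.
    spark-submit \
        --master yarn \
        --driver-memory 4g \
        --executor-memory 4g \
        --class org.example.PageTableJob \
        page-table-job.jar
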
[13:42:27] mforns: we briefly discussed http://events.linuxfoundation.org/events/apache-big-data-europe with elukey
[13:42:38] yesssss
[13:42:44] milimetric, yes
[13:42:47] mforns, elukey : Would you go with me, maybe even propose a talk ?
[13:43:03] joal, looking
[13:43:38] joal: I'd really like to go and can definitely help preparing a talk
[13:44:40] joal, looks interesting! and yea, if I can help, I'll be pleased :]
[13:46:39] joal: FYI I just removed the old AQS alarms (not multi-instance aware) and created https://phabricator.wikimedia.org/T137422
[13:47:12] elukey I'm following from afar your discussions with ops ;)
[13:47:26] elukey: That's good work :)
[13:47:26] gooood
[13:47:31] elukey: Thanks for doing that
[13:51:48] I get a lot of these, it seems ok though:
[13:51:49] ERROR YarnScheduler: Lost executor 33 on analytics1035.eqiad.wmnet: remote Rpc client disassociated
[13:51:56] mforns, elukey: if talk there is, I'd go for Pageview end-to-end
[13:52:13] milimetric: this is indeed ok, it means spark is releasing resources it's not using anymore
[13:52:27] joal: would be awesome
[13:52:29] milimetric: This log is removed in spark 1.6, and we have 1.5 ;)
[13:52:38] oh ok
[13:52:45] from Varnish to Cassandra, a journey in the Analytics Jungle
[13:52:57] Cause it says warning, but it's actually an expected behavior :)
[13:53:33] ok, out of memory again hmmmm
[13:53:49] elukey: From pageview to pageview - traffic analysis at WMF
[13:54:15] more explicit: From page view to pageview count
[13:57:22] milimetric: mwarf :(
[13:57:37] milimetric: batcave?
[13:57:37] Counting pageviews? anyhow, whatever, it would be awesome anyhow
[13:57:49] joal: let me stare at it a bit
[13:58:08] sure milimetric, ping me if you feel like it :)
[13:58:30] milimetric: I'm trying to find excuses to stop (trying to) write documentation
[14:00:03] ahaha, ok, I'm trying to find excuses to stop sucking at scala
[14:01:07] I'm going to rewrite it to the newer version that marcel has and I'll try again with that
[14:01:41] milimetric, ok, but I'm still in the middle of it
[14:02:08] below ---------------------- it's old code
[14:03:10] thx
[14:06:52] Analytics-Cluster, Analytics-Kanban, Operations, ops-eqiad: analytics1049.eqiad.wmnet disk failure - https://phabricator.wikimedia.org/T137273#2367846 (Ottomata)
[14:07:05] Analytics-Cluster, Analytics-Kanban, Operations, ops-eqiad: Smartctl disk defects on kafka1012 - https://phabricator.wikimedia.org/T136933#2367848 (Ottomata) p:Triage>Normal
[14:08:32] ottomata: aloha, let me know when you have a minute :)
[14:09:47] hiiii
[14:09:49] tell me
[14:09:57] Hi ottomata :)
[14:09:57] i have minutes!
[14:10:03] HIIIIII
[14:10:24] I modified some graphs in https://grafana-admin.wikimedia.org/dashboard/db/kafka
[14:10:43] I was investigating why retention.bytes has not worked
[14:10:55] and realized that probably a log is a topic partition
[14:10:59] not the whole topic
[14:11:17] that more or less matches with the metrics
[14:11:21] you think? i thought the docs say topic...but maybe not!
[14:11:32] i see two new graphs, is that what you changed?
[14:11:37] yep!
[14:11:56] hm, yeah i guess it just says the log
[14:11:58] maybe you are right!
[14:12:10] and here comes the question
[14:12:19] elukey: btw, you also saved the default view at 30 days :)
[14:12:27] ah snap
[14:12:33] going to fix it sorry
[14:12:40] np
[14:12:43] oh!
[14:12:44] oh oh
[14:12:49] and now we have the right number of graphs!
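
One consequence of the "a log is a topic partition" reading discussed above: retention.bytes caps each partition separately, so the on-disk footprint of a whole topic is roughly the limit times the partition count, spread across the brokers. A small sketch of that arithmetic, using a hypothetical 12-partition topic (the real partition counts are not stated in this log):

    # Approximate upper bound for a whole topic when retention.bytes
    # applies per partition (hypothetical 12 partitions).
    echo $((2000000000 * 12))   # 24000000000 bytes, ~22 GiB in total
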
[14:13:05] you can change the span on the 3 about disk space
[14:13:10] so that they line up
[14:13:11] so I tried in labs to do
[14:13:14] woooooo
[14:13:15] yes!
[14:13:23] haha
[14:13:25] ok elukey continue...
[14:13:26] :)
[14:13:36] * elukey is grabbing the command
[14:14:09] kafka configs --alter --entity-type topics --entity-name webrequest_test --add-config retention.bytes=2000000
[14:14:20] because partitions were 4MB but it didn't work
[14:14:35] BUT this might be due to the fact that each of them has only one segment
[14:14:40] SO
[14:14:46] I'd like to try with misc :P
[14:15:23] setting retention.bytes to 2GB
[14:15:54] but since this might be completely stupid I wanted to have your opinion
[14:16:46] naw, I think you are on to something
[14:16:48] worth a try
[14:17:07] do it!
[14:18:14] YAY
[14:21:56] * elukey plays Johnny B. Goode
[14:23:57] kafka configs --alter --entity-type topics --entity-name webrequest_misc --delete-config retention.bytes=2000000
[14:24:10] joal, ottomata --^
[14:24:46] elukey: add config ?
[14:24:54] argh
[14:24:57] yes wrong paste
[14:24:58] :P
[14:25:24] kafka configs --alter --entity-type topics --entity-name webrequest_misc --add-config retention.bytes=2000000
[14:25:45] * joal counts zeros
[14:25:48] ok
[14:28:33] that's megabytes, no?
[14:28:50] elukey: ^
[14:28:55] Correct ottomata !!!!
[14:29:02] yes corrected
[14:29:04] * joal gets back to counting zeros
[14:31:42] elukey: btw, I stole your analytics cluster megaraid stuff and slightly altered it for JBOD on analytics kafka
[14:31:42] https://wikitech.wikimedia.org/wiki/Analytics/Cluster/Kafka/Administration#Swapping_broken_disk
[14:31:44] :)
[14:32:21] woa! Niceeeee
[14:32:24] thanks!!
[14:32:46] it wooooorksssssssss \o/
[14:33:17] Great elukey :)
[14:33:25] all the misc partitions went down to 1.83GiB
[14:33:36] well thanks to ottomata I didn't wipe misc :P
[14:34:17] ok!
[14:34:38] elukey: that makes sense cause
[14:34:54] 2000000000 bytes is 1.83GB :)
[14:34:59] yep!
[14:35:25] nice
[14:36:24] !log Tested retention.bytes=2G for kafka webrequest_misc - setting removed
[14:37:45] mmmm log not working?
[14:37:59] !log Tested retention.bytes=2G for kafka webrequest_misc
[14:38:47] mmmmm
[14:38:52] well I logged in ops
[14:39:16] now, upload or text could be next
[14:39:33] text's partitions are around 669GiB
[14:40:36] maybe we can try retention.bytes=700000000000 ?
[14:41:32] elukey: can we do powers of two? i'd prefer to be able to do say du -sh and have it match what I expect :)
[14:41:55] sure, makes sense
[14:44:01] elukey: told you about GiB :-P
[14:44:03] huhuhu
[14:45:03] elukey: hm, um, i'm looking for a text partition that large
[14:45:11] maybe i'm looking on nodes where we've cleared stuff
[14:45:20] but the few i've found are around 250G
[14:45:43] yes I think it is only on kafka1020
[14:46:00] the others are good
[14:46:40] elukey: if we set it at 250G, that might be a good upper bound. the ones i've looked at are actually a little smaller
[14:47:00] should keep it from growing after a restart, and have a little room to grow?
[14:47:07] naturally
[14:48:00] I would put a worst case scenario upper bound just in case.. maybe around 500/600GB?
[14:48:00] right?
[14:48:01] what do you think
[14:48:11] hm, that would allow the text logs to double in size though, right?
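
Since GB/GiB confusion comes up a few times above, a small conversion sketch (GNU/POSIX tools assumed) for the values being discussed: the 2 GB misc test limit is about 1.86 GiB, which is why the partitions settled just under it at 1.83 GiB; the 700000000000 trial value is about 652 GiB; and 500 GiB, the value eventually picked below, is 536870912000 bytes:

    # 500 GiB in bytes (the retention.bytes value chosen further down).
    echo $((500 * 1024 * 1024 * 1024))                            # 536870912000

    # The 2 GB misc test value and the 700 GB text trial value, in GiB.
    awk 'BEGIN { printf "%.2f GiB\n", 2000000000 / 1024^3 }'      # ~1.86 GiB
    awk 'BEGIN { printf "%.2f GiB\n", 700000000000 / 1024^3 }'    # ~651.93 GiB
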
[14:48:24] which is what we are trying to avoid, since a restart can let them stick around for 2 weeks instead of 1
[14:49:14] yeah but a single partition would take max 600GB, that is acceptable on a 1.8TB disk partition.. usually in the worst case we get two text partitions + something else..
[14:49:27] I am only trying to be veeery conservative
[14:50:02] hmmm
[14:50:18] ok, i thought maybe you were trying to have a byte limit that was close to what 7 days gives us
[14:50:30] elukey: ok, let's do 500G
[14:50:33] ja?
[14:51:37] can we do 700GB (or whatever the GiB equivalent is) as a first try? Just to make sure that nothing will explode
[14:51:53] then we'll lower it down
[14:52:07] sure
[14:53:27] thanks, I know that I am over-paranoid
[14:53:34] be patient :)
[14:54:21] elukey: you won't see any change with 700G, will you?
[14:54:36] the biggest partition is 671 GB (thanks to your graph, easy to see! :) )
[14:54:45] 671 Gib
[14:54:48] GiB
[14:55:23] so something like 720GB no?
[14:55:54] at least, that was the unit of the graph.. I basically duplicated the one that you built before :P
[14:56:07] but the 2.0GB misc setting was ok
[14:56:11] so it should be good
[14:59:10] oh heh, i guess I'm always talking in GiB
[15:00:13] elukey: proceed, i trust ya, i think you got it :)
[15:01:23] kafka configs --alter --entity-type topics --entity-name webrequest_text --add-config retention.bytes=700000000000
[15:01:30] ottomata: --^ ?
[15:01:57] elukey: if this is just your trial, 700000000000 is cool. if we are going to pick something and leave it for a while, let's do a power of 2
[15:02:09] ah yes this one is the trial
[15:02:12] k cool
[15:02:14] proceed! :)
[15:03:18] done
[15:04:56] * elukey tails logs
[15:06:46] cool! br
[15:06:47] brb
[15:09:44] it worked, goood :)
[15:11:46] joal: retention.bytes=536871000000 for 500GB?
[15:13:10] elukey: 536870912000 in my calculator (1024*1024*1024*500)
[15:13:29] ah yes I got it rounded, lazy me :)
[15:14:13] kafka configs --alter --entity-type topics --entity-name webrequest_text --add-config retention.bytes=536870912000
[15:17:31] mforns, milimetric : https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake#Project_Documentation
[15:17:53] cool, will look closer after standup
[15:21:42] gooooood now kafka1020 text partitions are down to 500GiB
[15:21:56] retention.bytes works like a charm
[15:22:44] free disk space on partitions up to 30%
[15:26:23] yay
[15:30:37] joal, looks awesome! thanks a lot for doing that!
[15:30:54] cool mforns :)
[15:31:37] currently adding a "stuff we said we'd not forget" section in the ongoing work section :)
[15:38:23] elukey: awesome!
[15:38:24] nice job
[15:42:17] \o/
[15:58:43] ottomata: also updated https://gerrit.wikimedia.org/r/#/c/293270/
[15:59:25] +1!
[16:02:39] AHH STANDUP
[16:03:35] Analytics-Cluster, Analytics-Kanban, Operations, ops-eqiad: analytics1049.eqiad.wmnet disk failure - https://phabricator.wikimedia.org/T137273#2368252 (Milimetric) a:elukey
[16:03:41] Analytics-Cluster, Analytics-Kanban, Operations, ops-eqiad: Smartctl disk defects on kafka1012 - https://phabricator.wikimedia.org/T136933#2368253 (Milimetric) a:elukey
[16:08:27] Analytics-Kanban: Extract edit oriented data from MySQL for small wiki - https://phabricator.wikimedia.org/T134790#2368258 (Milimetric) docs from Joseph: https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake#Project_Documentation
[17:24:46] going offline team! byyeeee o/
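
Going back to the retention.bytes=536870912000 change earlier in this log, a hedged sketch of how the result could be verified on a broker; the --describe form is assumed to be supported by the same `kafka` wrapper used above, and the data-directory path is the one mentioned earlier:

    # Confirm the per-topic override was recorded.
    kafka configs --describe --entity-type topics --entity-name webrequest_text

    # Check the space freed once old segments are deleted.
    df -h /var/spool/kafka
    du -sh /var/spool/kafka/webrequest_text-* | sort -h
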
[17:24:56] alarms are now directed to this channel :)
[18:04:46] Analytics-Kanban: Bot from an Azure cloud cluster is causing a false pageview spike - https://phabricator.wikimedia.org/T137454#2368793 (Milimetric)
[18:53:53] o/ ottomata
[18:54:05] Sorry to miss the Analytics/Dev/Researcher checkin
[18:54:07] I had a conflict
[18:54:16] Did I miss anything fun -- e.g. eventbus stuff
[18:54:52] hmmmm I think my fixed point algorithm has an infinite loop :)
[18:55:10] *Devops
[18:55:37] s'ok! nobody came cept me and joseph!
[18:55:39] hehe
[21:32:18] joal: are you gone?
[21:36:50] Analytics-Kanban, Patch-For-Review: Update mediwiki hooks to generate data for new event-bus schemas - https://phabricator.wikimedia.org/T137287#2369942 (Ottomata) In [[ https://phabricator.wikimedia.org/T134502#2326641 | T134502#2326641 ]] I wrote about how to implement `user_blocks_change` event: >The...
[22:03:49] Analytics-Kanban, Patch-For-Review: Update mediwiki hooks to generate data for new event-bus schemas - https://phabricator.wikimedia.org/T137287#2370016 (Ottomata) Ok, it looks like `processForm()` does actually look up the previous block before submitting, if an initial insert fails. The `$currentBlock...