[13:32:58] o/
[13:36:34] dcausse there's an ongoing incident with the DBs, and some of the SREs are wondering if mirrormaker is involved...not sure if you have any special knowledge on that, but you might wanna read the scrollback in #wikimedia-sre just in case
[13:36:59] Ben and Andrew are already working on it actively, so this is just an FYI
[13:37:27] inflatador: yes just saw that in the sec channel, unsure I have much to add...
[13:38:19] dcausse ACK, thanks for confirming
[14:50:53] Do we retrospect today or shall we skip?
[14:57:08] fine to skip for me
[15:03:00] skip is ok
[15:44:00] dcausse: I was able to run a spark-kafka-writer example today with 3 partitions and 3 executors. Despite that, the kafka records appear in perfect order of creation. According to the YARN UI, 4 containers have been allocated to run the job. But how can I make sure all those containers actually processed a partition?
[15:44:56] pfischer: did you set executor cores to 1?
[15:45:05] yes
[15:45:16] (that's the default for --master yarn)
[15:46:05] perhaps data locality optimizations are taking precedence and spark prefers waiting for the same jvm because the data is there?
[15:46:42] spark.locality.wait=0 perhaps?
[15:47:20] or spark.sql.sources.ignoreDataLocality ?
[15:47:47] just guessing possible causes...
[15:50:00] Still getting the same result (this time 5 containers were allocated by the attempt)
[15:51:16] commuting, will be back on later tonight
[15:58:22] would have to read the code but perhaps it's bound to the number of partitions in the target topic? not the number of input partitions & number of spark workers
[15:59:53] maybe you can slow down production by providing a kafka.partitioner.class that waits, to ease testing?
[16:00:24] heading out as well, back later tonight
[16:02:32] workout, back in ~40
[16:48:25] back
[17:47:54] lunch, back in ~40
[20:41:44] not seeing anything related to aligning the spark kafka writer job with the target topic's number of partitions (not really surprising since the topic is part of the dataset)
[20:42:51] seems like it just does a foreachPartition with a task that writes to kafka (https://github.com/apache/spark/blob/95b2d27079c2e012ab5bfb8c1dd83b11d7848258/connector/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaWriter.scala#L72)
[20:43:26] so unless I'm missing something it should be bound to the number of partitions in the input dataset and the number of executors
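
A minimal sketch (not from the conversation) of one way to answer pfischer's 15:44 question, i.e. checking which executor actually processed each partition: log the executor id and partition id from inside each task. The object name and the toy 3-partition dataset are illustrative; spark.locality.wait=0 is only there because dcausse suggested trying it.

```scala
import org.apache.spark.{SparkEnv, TaskContext}
import org.apache.spark.sql.SparkSession

object PartitionPlacementCheck {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("partition-placement-check")
      .config("spark.locality.wait", "0") // setting suggested in the chat, to rule out locality waits
      .getOrCreate()

    // Toy dataset with 3 partitions, mirroring the 3-partition / 3-executor setup discussed above.
    val ds = spark.range(0, 300).repartition(3)

    // For each partition, record which executor ran it: the executor id comes from
    // SparkEnv, the partition id from TaskContext.
    val placements = ds.rdd.mapPartitions { iter =>
      val executorId = SparkEnv.get.executorId
      val partitionId = TaskContext.get().partitionId()
      Iterator((partitionId, executorId, iter.size))
    }.collect()

    placements.foreach { case (partition, executor, count) =>
      println(s"partition=$partition executor=$executor records=$count")
    }

    spark.stop()
  }
}
```

If the three partitions report three different executor ids, the containers were all used; if they all report the same id, the work collapsed onto one JVM as speculated at 15:46.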
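A rough sketch of dcausse's 15:59 suggestion: a kafka.partitioner.class that waits, so production is slow enough to observe ordering during testing. The class name, the 100 ms delay, and the round-robin assignment are assumptions for illustration, not anything stated in the chat.

```scala
import java.util.{Map => JMap}
import java.util.concurrent.atomic.AtomicInteger
import org.apache.kafka.clients.producer.Partitioner
import org.apache.kafka.common.Cluster

// Deliberately slow, round-robin Kafka partitioner intended only for test runs.
class SlowRoundRobinPartitioner extends Partitioner {
  private val counter = new AtomicInteger(0)

  override def partition(topic: String, key: AnyRef, keyBytes: Array[Byte],
                         value: AnyRef, valueBytes: Array[Byte], cluster: Cluster): Int = {
    Thread.sleep(100L) // pause before every record to ease manual observation
    val partitions = cluster.partitionsForTopic(topic).size()
    math.abs(counter.getAndIncrement()) % partitions
  }

  override def close(): Unit = {}
  override def configure(configs: JMap[String, _]): Unit = {}
}
```

Since the Spark Kafka sink forwards options prefixed with `kafka.` to the underlying producer, the class would presumably be wired in with something like `.option("kafka.partitioner.class", "example.SlowRoundRobinPartitioner")` (path is a placeholder).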
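For reference, a simplified sketch of the pattern dcausse describes at 20:42 in KafkaWriter.scala: one task per input partition, each opening its own producer and writing that partition's rows, so parallelism is bounded by the input partitions and available executors rather than the target topic. Bootstrap servers, serializers, and the topic name below are placeholders, not values from the chat.

```scala
import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}
import org.apache.spark.sql.SparkSession

object ForeachPartitionKafkaSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("foreach-partition-kafka").getOrCreate()
    val ds = spark.range(0, 300).repartition(3)

    // One write task per input partition, mirroring the foreachPartition approach
    // in the linked KafkaWriter code.
    ds.rdd.foreachPartition { rows =>
      val props = new Properties()
      props.put("bootstrap.servers", "localhost:9092") // placeholder
      props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
      props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
      val producer = new KafkaProducer[String, String](props)
      try {
        rows.foreach(n => producer.send(new ProducerRecord[String, String]("test-topic", n.toString)))
      } finally {
        producer.close()
      }
    }

    spark.stop()
  }
}
```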