Issue
I set up a new Kafka server (1 broker with 1 partition) and successfully produced and consumed messages from it using Java code, but I'm not satisfied with the number of events per second I can read as a consumer.
I have already played with the following consumer setting:
AUTO_OFFSET_RESET_CONFIG = "earliest"
FETCH_MAX_BYTES_CONFIG = 52428800
MAX_PARTITION_FETCH_BYTES_CONFIG = 1048576
MAX_POLL_RECORDS_CONFIG = 10000
pollDuration = 3000
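As a sketch, the settings above map onto the standard Kafka consumer property keys like this (using plain `java.util.Properties`, which is what `KafkaConsumer` accepts; the bootstrap address is a placeholder):

```java
import java.util.Properties;

public class ConsumerTuning {
    // Builds the consumer configuration described above, using the string
    // keys that the ConsumerConfig constants resolve to.
    public static Properties buildConsumerProps() {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092");  // placeholder address
        props.setProperty("auto.offset.reset", "earliest");        // AUTO_OFFSET_RESET_CONFIG
        props.setProperty("fetch.max.bytes", "52428800");          // FETCH_MAX_BYTES_CONFIG (50 MiB)
        props.setProperty("max.partition.fetch.bytes", "1048576"); // MAX_PARTITION_FETCH_BYTES_CONFIG (1 MiB)
        props.setProperty("max.poll.records", "10000");            // MAX_POLL_RECORDS_CONFIG
        return props;
    }

    public static void main(String[] args) {
        System.out.println(buildConsumerProps());
    }
}
```

`pollDuration = 3000` is not a broker/consumer property; it is the timeout passed to `consumer.poll(...)` in the application code.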
But no matter what values I entered for these settings, the result stayed the same.
Currently, I produce 100,000 messages to Kafka, each 2 kilobytes in size. Reading all batches of the 100,000 records took 20,669 milliseconds (about 20 seconds), which works out to roughly 5,000 records per second.
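The throughput figure can be double-checked with a little arithmetic on the numbers above:

```java
public class ThroughputCheck {
    public static void main(String[] args) {
        long records = 100_000;          // messages read
        double seconds = 20.669;         // measured total read time
        long bytesPerRecord = 2 * 1024;  // 2 KB per message

        long recordsPerSec = Math.round(records / seconds);
        double mibPerSec = records * bytesPerRecord / seconds / (1024 * 1024);

        // ~4838 records/s, under 10 MiB/s over the wire
        System.out.println(recordsPerSec + " records/s, "
                + String.format("%.1f", mibPerSec) + " MiB/s");
    }
}
```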
I expect it to be much higher. What are the ideal values I can set? Do I need other settings, or should I configure my Kafka server differently (multiple brokers or partitions)?
Solution
Apart from the settings you mentioned and ignoring horizontal scaling/partitioning:
if you are not using compression, enable it!
From the wiki:
If enabled, data will be compressed by the producer, written in compressed format on the server and decompressed by the consumer.
The lz4 compression type has proved to be a good one in my experience. Sample settings for the producer:
compression.type = lz4
batch.size = 131072
linger.ms = 10
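A minimal sketch of those producer settings as plain properties (the string keys match the `ProducerConfig` constants; the broker address is a placeholder):

```java
import java.util.Properties;

public class ProducerTuning {
    // Builds the producer configuration from the answer: lz4 compression
    // plus a larger batch size and a small linger to encourage batching.
    public static Properties buildProducerProps() {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder address
        props.setProperty("compression.type", "lz4"); // compress record batches with lz4
        props.setProperty("batch.size", "131072");    // 128 KiB per batch
        props.setProperty("linger.ms", "10");         // wait up to 10 ms to fill a batch
        return props;
    }

    public static void main(String[] args) {
        System.out.println(buildProducerProps());
    }
}
```

A larger `batch.size` combined with a non-zero `linger.ms` lets the producer send fewer, bigger, compressed batches instead of many small requests.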
That means less data has to be transmitted over the network, at the cost of more CPU usage for compression/decompression.
You can find more info on batch size and linger time in another answer I gave about timeouts, although it is focused on the producer side.
Answered By - Paizo Answer Checked By - Clifford M. (PHPFixing Volunteer)