Resetting Kafka Topics

Let’s say you’re working on a program to load a Kafka topic and you mess up and want to start over. There are two good ways of doing this. Both of these methods involve connecting to the name node and running shell scripts in /usr/hdp/[version]/kafka/bin (for the Hortonworks Data Platform; for some other distro, I leave it as an exercise to the reader to find the appropriate directory, mostly because I wouldn’t know where it was).

Method One:  Delete And Re-Create

The method that I’ve shown already is the delete and re-create method.  This one is pretty simple:  we delete the existing topic and then generate a new one with the same name.

./kafka-topics.sh --delete --zookeeper localhost:2181 --topic test

When you delete the topic, you’ll see the following warning message, which notes that the deletion has no effect unless delete.topic.enable is set to true:

[Screenshot: Kafka topic deletion warning]

You can check the delete.topic.enable setting in Ambari by going to the Kafka -> Configs section:

[Screenshot: delete.topic.enable setting in Ambari]

Then, once we’ve deleted the topic, we can re-create it.

./kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test
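
If you’d rather script the whole cycle, here is a rough sketch of the two commands above glued together. Topic deletion happens asynchronously, so the sketch polls kafka-topics.sh --list until the topic is gone before re-creating it. The KAFKA_BIN, ZOOKEEPER, and POLL_SECONDS variables, the function names, and the retry count are my own inventions for illustration, not anything Kafka or HDP ships with.

```shell
#!/usr/bin/env bash
# Sketch only: KAFKA_BIN, ZOOKEEPER and POLL_SECONDS are assumed names; adjust for your cluster.
KAFKA_BIN="${KAFKA_BIN:-.}"
ZOOKEEPER="${ZOOKEEPER:-localhost:2181}"
POLL_SECONDS="${POLL_SECONDS:-2}"

# Succeeds if the topic still shows up in --list. Matching on the first field
# also catches entries such as "test - marked for deletion".
topic_exists() {
  "$KAFKA_BIN/kafka-topics.sh" --list --zookeeper "$ZOOKEEPER" |
    awk -v t="$1" '$1 == t { found = 1 } END { exit !found }'
}

# Delete a topic, wait for the delete to finish, then create it again.
# Usage: reset_topic <topic> [partitions] [replication-factor] [max-polls]
reset_topic() {
  local topic="$1" partitions="${2:-1}" replication="${3:-1}" tries="${4:-30}"

  "$KAFKA_BIN/kafka-topics.sh" --delete --zookeeper "$ZOOKEEPER" \
    --topic "$topic" || return 1

  # Poll until the topic disappears, giving up after the allotted number of tries.
  while topic_exists "$topic"; do
    tries=$((tries - 1))
    if [ "$tries" -le 0 ]; then
      echo "Topic $topic still exists; is delete.topic.enable set to true?" >&2
      return 1
    fi
    sleep "$POLL_SECONDS"
  done

  "$KAFKA_BIN/kafka-topics.sh" --create --zookeeper "$ZOOKEEPER" \
    --replication-factor "$replication" --partitions "$partitions" \
    --topic "$topic"
}
```

After sourcing this, running reset_topic test from the Kafka bin directory should do the same thing as the two commands above, with a wait in between.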

Method Two:  Retention Policy Shenanigans

The first method works fine for non-production scenarios where you can stop all of the producers and consumers, but let’s say that you want to flush the topic while leaving your producers and consumers up (but maybe you have a downtime window where you know the producers aren’t pushing anything).  In this case, we can change the retention period to something very short, let the queue flush, and bring it back to normal, all using the kafka-configs shell script.

First, let’s check out our current configuration settings for the topic called test:

./kafka-configs.sh --zookeeper localhost:2181 --describe --entity-type topics --entity-name test

[Screenshot: kafka-configs output showing no topic-level overrides]

This might look odd at first, but it’s just the Kafka configuration script’s way of saying that you’re using the default settings.  Incidentally, our default setting has a retention period of 168 hours, as we can see in Ambari.

[Screenshot: Kafka retention settings in Ambari]
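
If you want to check for an override from a script instead of by eye, something like the sketch below might do. It assumes the describe output lists overrides as comma-separated key=value pairs after the topic name, which is what I’d expect from this generation of kafka-configs.sh but is worth confirming against your version. The retention_override function and the KAFKA_BIN and ZOOKEEPER variables are hypothetical names of mine.

```shell
#!/usr/bin/env bash
# Sketch only: KAFKA_BIN and ZOOKEEPER are assumed names; adjust for your cluster.
KAFKA_BIN="${KAFKA_BIN:-.}"
ZOOKEEPER="${ZOOKEEPER:-localhost:2181}"

# Print the topic-level retention.ms override, or "default" if there is none.
retention_override() {
  local out value
  out="$("$KAFKA_BIN/kafka-configs.sh" --zookeeper "$ZOOKEEPER" --describe \
    --entity-type topics --entity-name "$1")" || return 1

  # Split on spaces and commas so that retention.ms is not confused with
  # other keys that end the same way, such as delete.retention.ms.
  value="$(printf '%s\n' "$out" | tr ' ,' '\n\n' |
    sed -n 's/^retention\.ms=//p' | head -n 1)"

  echo "${value:-default}"
}
```

Calling retention_override test prints either the override in milliseconds or the word default.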

Now that we know the current settings, we can run the following command to set our retention policy to something a bit shorter:

./kafka-configs.sh --zookeeper localhost:2181 --alter --entity-type topics --entity-name test --add-config retention.ms=1000

[Screenshot: kafka-configs output showing retention.ms=1000]

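Now we can see that the retention period is 1000 milliseconds, or one second. Give that a few minutes to take hold (the broker only checks for expired log segments every log.retention.check.interval.ms, which defaults to five minutes) and then we can run the following to remove the special configuration: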

./kafka-configs.sh --zookeeper localhost:2181 --alter --entity-type topics --entity-name test --delete-config retention.ms

[Screenshot: kafka-configs output back to the default settings]

And we’re back, with no real downtime.  As long as the producers were temporarily paused, we didn’t lose any data and our producers can go about their business like nothing happened.
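
Rather than guessing how long to wait, you could script the retention trick and poll until the topic is actually empty. The sketch below uses the GetOffsetShell tool that ships with Kafka to compare each partition’s earliest offset (--time -2) with its latest (--time -1); when they match, there is nothing left to read. The function names, the KAFKA_BIN, ZOOKEEPER, BROKERS, and POLL_SECONDS variables, and the retry count are my own assumptions. So is the broker address: HDP typically listens on port 6667 where stock Kafka uses 9092, so check yours.

```shell
#!/usr/bin/env bash
# Sketch only: every variable name here is assumed; adjust for your cluster.
KAFKA_BIN="${KAFKA_BIN:-.}"
ZOOKEEPER="${ZOOKEEPER:-localhost:2181}"
BROKERS="${BROKERS:-localhost:6667}"
POLL_SECONDS="${POLL_SECONDS:-10}"

# Print "topic:partition:offset" lines. Pass -1 for latest, -2 for earliest.
topic_offsets() {
  "$KAFKA_BIN/kafka-run-class.sh" kafka.tools.GetOffsetShell \
    --broker-list "$BROKERS" --topic "$1" --time "$2" | sort
}

# Succeeds when every partition's earliest offset equals its latest offset.
topic_is_empty() {
  local earliest latest
  earliest="$(topic_offsets "$1" -2)"
  latest="$(topic_offsets "$1" -1)"
  [ -n "$earliest" ] && [ "$earliest" = "$latest" ]
}

# Shorten retention, wait for the topic to drain, then remove the override.
# Usage: flush_topic <topic> [max-polls]
flush_topic() {
  local topic="$1" tries="${2:-60}"

  "$KAFKA_BIN/kafka-configs.sh" --zookeeper "$ZOOKEEPER" --alter \
    --entity-type topics --entity-name "$topic" \
    --add-config retention.ms=1000 || return 1

  until topic_is_empty "$topic"; do
    tries=$((tries - 1))
    [ "$tries" -gt 0 ] || break
    sleep "$POLL_SECONDS"
  done

  # Remove the override whether or not the topic drained in time, so the
  # one-second retention never lingers.
  "$KAFKA_BIN/kafka-configs.sh" --zookeeper "$ZOOKEEPER" --alter \
    --entity-type topics --entity-name "$topic" \
    --delete-config retention.ms || return 1

  # Report success only if the topic really is empty.
  topic_is_empty "$topic"
}
```

One caveat with this sketch: if the script is killed during the wait, the one-second retention stays in place, so it’s worth running the describe command afterward to make sure the override is gone.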

Conclusion

There are at least two different methods for clearing out a Kafka topic.  Before you break out the hammer, see if monkeying with the retention period will solve your problem without as much disruption.
