Friday, June 19, 2015

How Do You Scale a Logging Infrastructure to Accept a Billion Messages a Day by Paul Stack

This talk describes how Paul Stack replaced a legacy solution that used SQL for storing logs with a more scalable one.

To replace the SQL solution they moved to the ELK stack, which is made up of ElasticSearch, LogStash, and Kibana. In the beginning a developer took a single JAR file and ran it; this worked well enough but had some stability issues.
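The core idea of the pipeline is that raw log lines are reshaped into structured JSON documents before being indexed into ElasticSearch. A minimal sketch of that shaping step, in Python (the field names and the `to_es_document` helper are my assumptions following common Logstash conventions, not anything from the talk):

```python
import json
from datetime import datetime, timezone

def to_es_document(message, level="INFO", source="app"):
    """Shape a raw log line into a JSON document ready for indexing.

    The schema here (@timestamp, message, level, source) mimics the
    usual Logstash event layout, but the exact fields are illustrative.
    """
    return json.dumps({
        "@timestamp": datetime.now(timezone.utc).isoformat(),
        "message": message,
        "level": level,
        "source": source,
    })

doc = to_es_document("user logged in", level="INFO", source="web-01")
```

Once events are JSON documents like this, ElasticSearch can index them and Kibana can search and chart them without caring where they came from.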

In the next iteration Redis was used as a transport medium for the logs before they were pushed to ElasticSearch. Due to problems with Redis overflowing, they moved to Apache Kafka. Kafka is a high-throughput distributed messaging system and is backed by ZooKeeper, which is essentially a distributed key-value store used for coordination.
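The overflow problem comes from Redis holding the log queue in bounded memory: if consumers fall behind, old messages get dropped. A rough Python simulation of that failure mode (the `BoundedBuffer` class, capacity, and drop-oldest policy are all illustrative assumptions, not details from the talk):

```python
from collections import deque

class BoundedBuffer:
    """An in-memory buffer with a hard capacity that evicts the oldest
    entry when full -- a stand-in for a memory-limited Redis transport
    whose consumers have fallen behind."""

    def __init__(self, capacity):
        self.queue = deque(maxlen=capacity)  # deque evicts from the left when full
        self.dropped = 0

    def push(self, msg):
        if len(self.queue) == self.queue.maxlen:
            self.dropped += 1  # an old message is about to be lost
        self.queue.append(msg)

buf = BoundedBuffer(capacity=3)
for i in range(5):
    buf.push(f"log-{i}")
# buf.dropped is now 2: the two oldest messages were silently lost
```

Kafka sidesteps this by persisting the log to disk with a retention window, so slow consumers can catch up instead of causing message loss.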

In the following iteration they tried to improve the server structure; however, this mostly increased the complexity of the system without improving it. The system could now handle a peak load of 12 billion messages per day.

Mr. Stack recommends reading the Jepsen database series when choosing technologies. A lesson learned from the project is to not reinvent the wheel: the simple solution would have been to buy Splunk. However, during development they learned a lot about their system.

I’m not that deep into DevOps, but I found this talk very informative.
