At Chaordic we have been using Apache Cassandra to store data at scale since 2012, when we faced exponential growth and migrated from MySQL. Since then Cassandra is a key technology here, allowing us to scale from a few hundred million to tens of billions requests per month, and growing… In this post we will share some of our experiences at the Cassandra Summit, held in San Francisco from September 10th to 13th this year.
The conference had over 2000 participants from around the globe and awesome talks from leading companies in many fields sharing their experiences with Cassandra. Cassandra is a fully distributed database, and its conference couldn’t be different: the keynotes were broadcast to over 20 locations worlwide, including Chaordic headquarters in Florianópolis, where the rest of the team watched the main announcements of the Summit.
One of the Summit highlights was the official announcement of Cassandra 2.1, with really neat features like incremental repair (a major pain point since the beginning of the project), incremental compactions and a performance boost of over 100% of CQL against thrift access. After this, we’re definitely including migration of thrift data model to CQL and upgrade to C* 2.1 in our roadmap. We hope to present these experiences and improvements at the next Cassandra Summit in 2015. ;)
A very cool moment of the Summit was during Aaron’s Morton talk “Lesser Known Features of Cassandra 2.0 and 2.1”, when he mentioned one of our contributions to the Cassandra codebase, a flag to enable more aggressive tombstone compactions (CASSANDRA-6563). It was awesome to have our patch mentioned at the main stage of the Cassandra Summit! :-)
Overall the conference was great, and I was quite impressed with the recent wide-scale adoption of Cassandra throughout industry and how fast the system is evolving to become a first class database, even competing with Oracle in more traditional markets, like banking, government, etc.
After the main conference, some of us joined Apache Cassandra commiters for the next two days for an intensive Bootcamp to learn and hack the Cassandra internals. On the first day, we had workshops on “hairy” cassandra topics like compaction and storage engine internals and CQL query parsing. After that we’ve done a few challenging exercises on the Cassandra codebase, like implementing our own compaction strategy.
In the second day, we had a free day of hacking Cassandra on any ticket of our choice, with the support of Cassandra developers. I chose to fix the repair procedure within a single datacenter, an issue we faced at Chaordic, so I thought it would be interesting to work on that. By the end of the day, after a delicious lunch and a few beers, I was able to complete the patch that was reviewed and committed by Yuki Morishita. If you want to see more details about the issue, you can find it here: CASSANDRA-7450.
It was a unique experience to get up close and personal with the Cassandra team and deep dive into the Cassandra codebase. Furthermore, it was a great opportunity share experiences with Cassandra professionals from other companies from all over the world. We were really happy and proud to be part of that. Many thanks to DataStax and the Cassandra team for organizing this gathering!
Would you like to work with Cassandra and other big data technologies at Chaordic, the largest recommendations provider in South America? We’re hiring! :)