
Why we ditched DynamoDB

Posted on October 24, 2022

Categories: AWS

I can’t believe I’m writing this post. Still, we’ve entirely abandoned DynamoDB after being publicly thrilled about it for more than a year and amassing a sizable mailing list around our plans to move our whole database into DynamoDB.

DynamoDB had the potential to revolutionize our world, but it turned out to be a significant obstacle to growing our web analytics business at a reasonable price.

In this piece, I’ll describe how we arrived at DynamoDB’s door and why we decided to burn the whole house down.

It was all a dream

As many of you are aware, I am a strong supporter of serverless technologies. I’ve been upfront about not wanting to manage servers, keep track of demand, or plan capacity. I want to turn it on, hit a button, and have it scale automatically. We’re all-in on Amazon Web Services (yes, fellow nerd, I’m aware that EC2 powers Lambda), we are a small company, and we lack the time to run servers.

I was intrigued by DynamoDB since it offered the following benefits:

  • Endless scale (no CPU, memory, or disk planning)
  • Pay for what you use
  • Highly available
  • Automatic backups

And we place a high emphasis on each of those principles.

Once we started operating at an enormous scale, we faced scaling issues. We began preparing to switch to DynamoDB, believing it would address all of them.

  • Endless ingest? Check.
  • No issues with scaling? Check.
  • Pay for the things we use? Check.

We were ready to roll.

Delusions of grandeur

We were eager to work with DynamoDB at this point, but we also needed to query the data flexibly. How could we ingest all our pageviews straight into DynamoDB and still query them however we wanted?

I discussed this extensively in my post on building the world’s fastest analytics, where I attempted to integrate DynamoDB with Rockset. The idea was that DynamoDB would store all of the pageview data and Rockset could query it freely. The queries turned out to be extremely expensive and sluggish, and there would have been a delay between DynamoDB and Rockset, which was something we didn’t want.

I’ve always been wary of fixed-size services since I’ve had terrible encounters with them at small scale. But once I discovered SingleStore, I realized they could manage such absurdly high throughput that I stopped worrying. We could afford to over-provision, and they didn’t object when we sent tens of thousands of requests per second while spending only $3,000 per month.

Overcoming a scalable infrastructure design challenge

Until September 2021, our infrastructure wasn’t configured in a way that would scale. For each incoming request (API request, pageview, event, etc.), a Lambda function would create a new database connection, run a query, and then kill the connection. I’ve always built applications this way, and it works wonderfully at modest sizes, but this isn’t how it’s done at scale. I wasn’t aware of the significant CPU overhead this put on our SingleStore cluster until their staff informed us of it.

Lambda function from earlier
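To make the difference concrete, here is a minimal sketch of the two connection strategies. This is not Fathom’s actual code: a stand-in class replaces the real MySQL-compatible driver, and all names are illustrative.

```python
class FakeConnection:
    """Stand-in for a real driver connection (e.g. a MySQL client for
    SingleStore). Each instantiation represents one TCP/auth handshake,
    which is where the CPU overhead described above comes from."""
    opens = 0  # handshakes performed, across all instances

    def __init__(self):
        FakeConnection.opens += 1

    def query(self, sql):
        return f"result of {sql!r}"

    def close(self):
        pass


def handler_per_request(event):
    # Anti-pattern: our old setup opened and closed a fresh
    # connection on every single request.
    conn = FakeConnection()
    try:
        return conn.query("INSERT INTO pageviews VALUES (...)")
    finally:
        conn.close()


_conn = None  # lives as long as the worker process


def handler_persistent(event):
    # Better: connect once per worker and reuse the connection for
    # every request that worker handles.
    global _conn
    if _conn is None:
        _conn = FakeConnection()
    return _conn.query("INSERT INTO pageviews VALUES (...)")
```

Running 1,000 requests through the first handler performs 1,000 handshakes; the second performs one per worker process, which is roughly what moving to persistent connections bought us.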

We didn’t have time to undertake a rework. However, SingleStore recently created a fantastic HTTP API that takes care of connection pooling and results in significant speed benefits. So we chose an alternative route rather than switching to the HTTP API.

Two components of our application were the root of the issues we were having:

  1. The Fathom dashboard (private API, jobs, etc.)
  2. The Fathom collector queue (the queue runs in the background and processes incoming pageviews)

In light of this, we transferred everything else to Heroku while keeping the primary Fathom ingest endpoint on Laravel Vapor (the ingest endpoint writes to SQS, our queuing system).
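A rough sketch of what that ingest path looks like. This is an illustration, not Fathom’s actual code: the queue URL and payload fields are made up, and the client is injected so anything with a boto3-style `send_message` method works.

```python
import json
import time


def build_pageview_message(site_id, path, referrer=None):
    # The ingest endpoint does no database work at all; it just
    # serializes the pageview into a queue message.
    return json.dumps({
        "site_id": site_id,
        "path": path,
        "referrer": referrer,
        "timestamp": int(time.time()),
    })


def ingest_pageview(sqs_client, queue_url, site_id, path, referrer=None):
    # In production, sqs_client would be a boto3 SQS client; here it is
    # any object exposing send_message(QueueUrl=..., MessageBody=...).
    body = build_pageview_message(site_id, path, referrer)
    sqs_client.send_message(QueueUrl=queue_url, MessageBody=body)
    return body
```

The queue workers then drain SQS in the background and write to the database, so the ingest endpoint itself stays stateless and fast.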

UPDATE: Less than 24 hours after this post was published, the Laravel Vapor team introduced a change to support persistent connections within the queue. I didn’t believe that was feasible. We have actually moved our queue back to Laravel Vapor.

As you can imagine, the change required some effort, but it allowed us to use persistent database connections. To put it simply for the non-technical folks following along (let me know on Twitter if you exist): persistent connections meant we no longer had to open and close a database connection for every single request.

Our queue back on Laravel Vapor

This change drastically decreased our CPU overhead (by 60–80%), meaning we had finally reached the point where we were using SingleStore as designed. After the migration was finished, my dopamine levels spiked for around five days. I’m happy we took the time to do this; Fathom has improved as a result and is now much more scalable.

SingleStore now handles everything over persistent connections. We use it for all key/value lookups (which we previously used DynamoDB for) and tens of thousands of inserts per second (perhaps more at times). It handles all our application rate limiting and supports our unique tracking system. What a time to be alive.
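As an illustration of the kind of key/value work a SQL database can absorb, here is a minimal fixed-window rate limiter sketch. This is not Fathom’s implementation: sqlite stands in for SingleStore, and the table and column names are invented. With a pooled connection, each check is a single cheap upsert plus a read.

```python
import sqlite3
import time


def make_db():
    # One row per (key, window); in SingleStore this would be a
    # regular table reached over a persistent connection.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE rate_limits ("
        " rl_key TEXT, win INTEGER, hits INTEGER,"
        " PRIMARY KEY (rl_key, win))"
    )
    return conn


def allow(conn, key, limit, window_seconds=60, now=None):
    # Fixed-window counter: bump the hit count for this key's
    # current window, then check it against the limit.
    now = time.time() if now is None else now
    win = int(now // window_seconds)
    conn.execute(
        "INSERT INTO rate_limits (rl_key, win, hits) VALUES (?, ?, 1) "
        "ON CONFLICT(rl_key, win) DO UPDATE SET hits = hits + 1",
        (key, win),
    )
    (hits,) = conn.execute(
        "SELECT hits FROM rate_limits WHERE rl_key = ? AND win = ?",
        (key, win),
    ).fetchone()
    return hits <= limit
```

A fixed window is the simplest possible scheme; the point is only that this workload is a fit for a fast SQL store, not that this is the exact algorithm Fathom uses.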

Savings of $3,000

Our DynamoDB charges were rapidly increasing as we expanded. Although we could absorb the exorbitant DynamoDB expenditure and had adequate margins on the cheaper plans, it wouldn’t be sustainable if we kept attracting more prominent clients.

For many months, we kept coming back to the fact that we couldn’t get DynamoDB to work for us. If we continued to attract these enormous clients while maintaining the same rates, we would have an unsustainable business. But if we increased the cost of those larger plans, we would lose our ability to compete on pricing. Even though the pricing wouldn’t be an issue for some of the larger businesses, we also serve high-traffic SMBs that don’t have $1 million to spend on SaaS subscriptions. It was a challenging balance.

Price increases felt almost inevitable, but something didn’t feel right; we knew we could do better. That’s when we decided to use SingleStore instead. For a fixed monthly expense, it could easily handle the absurd throughput.

Our SingleStore usage

Our DynamoDB cost dropped to around $0.02 per day when we switched to SingleStore, while our SingleStore bill stayed the same. Our SingleStore cluster runs like this for most of the day. We will undoubtedly need to increase our bill at some point, but for now, everything is steady.

What happens if we see such a massive surge in traffic that SingleStore cannot handle the throughput? In that unlikely scenario, SQS (Simple Queue Service), which is endlessly scalable and retains all our pageviews, keeps us safe. The worst that can happen is we build up a brief backlog of pageviews that need to be processed.
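To make that failure mode concrete, here is a minimal sketch of a backlog-tolerant consumer. An in-memory deque stands in for SQS; the point is just that a failed insert leaves the message queued for a later retry rather than losing it.

```python
from collections import deque


def drain(queue, insert):
    # Pop a message, try to write it to the database, and only drop it
    # from the queue once the insert succeeds. SQS gives the same
    # guarantee via its visibility timeout plus an explicit delete.
    processed = 0
    while queue:
        message = queue[0]
        try:
            insert(message)
        except Exception:
            # Database overloaded: stop and leave the backlog queued.
            break
        queue.popleft()
        processed += 1
    return processed
```

When the database recovers, the next drain pass simply picks up where it left off, so a surge costs latency, not data.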

The future

We now have a single database that is the foundation for our whole application. High availability, reasonable over-provisioning, and excellent pricing. And the part that excites me most: whenever we do spend more money on SingleStore, adding CPUs and RAM will make our dashboards run faster. We are in an extraordinary situation right now. Many developers reading this manage Redis, DynamoDB, MySQL, and other stack components individually, and handling all of that is tough.

Although our database solution is not serverless, it is provisioned so that exceeding our capacity would require an additional 90%+ CPU consumption and a significant amount of additional memory. So all is well. We’ve adjusted to our new normal since the dopamine spikes ceased, but I’ll never forget how things were before we made this shift.

We’re growing quickly as more people choose Google Analytics alternatives like Fathom, so we’ll keep scaling out our database architecture and will perhaps make more modifications shortly.

A significant aspect of running a software business is reaching this inevitable fork in the road: when confronted with rising costs, you can either raise prices or optimize your infrastructure spending for higher margins while maintaining the same pricing. In our opinion, the latter better serves our current and potential customers. We were able to do this by building a stack that allows us to maintain long-term sustainability without uniformly raising costs.