2024

Startup Database Trends Report

Examining the database infrastructure trends of early-stage and growth-stage startups.

Preface

We put together this report because, as software developers, we noticed a lack of information about the current state of startup data infrastructure. While plenty of reports look at which databases developers use or admire, we couldn't find much that delved deeper into specifics like read replication, sharding, or workloads.

So, we decided to run our own database survey and share the results with the startup community. We're excited to present our findings and hope they provide insights for anyone interested in database infrastructure trends.

— The Springtail Team

Methodology

This report is based on a database infrastructure survey we conducted from late 2023 through early 2024.

The survey was distributed through various developer communities, social media platforms, and professional networks. It was shared with hundreds of startup professionals, including software developers, technical founders, and DevOps engineers, across a wide range of early-stage and growth-stage companies.

Although we made every effort to gather a diverse group of respondents, the voluntary nature of the survey might have introduced selection bias.

Highlights

PostgreSQL leads in the cloud

A significant proportion of respondents use PostgreSQL as their primary production database, with most opting for managed cloud services.

Replication prioritized over sharding

Most respondents operate with a single primary database. However, replication practices vary, with many having 2-5 read replicas or replicating for failover.

Transactional workloads prevail

Respondents most often describe their primary workload as either read-heavy with some writes or an even mix of reads and writes, and there is a clear preference for short-lived queries.

Trend 01

Production database

Pie chart of production database survey results.

What is your primary production database?

PostgreSQL was the most popular production database, likely driven by its open-source community and extensive features suitable for various applications.

MySQL and SQL Server also had significant usage, reflecting the preference for traditional relational databases in production settings.

A variety of other databases had single-digit representation, indicating a smaller contingent of startups favoring niche solutions for additional flexibility or specialized use cases like massive volumes or unstructured data.

Trend 02

Database deployment

Bar chart of database deployment survey results.

How is your primary database deployed?

Our survey results highlight a clear trend towards cloud-based database deployment. The preference for managed cloud services underscores the importance of convenience, scalability, and security for startup organizations.

Self-hosting in the cloud and on-premises deployments were notably less common, suggesting that, for most respondents, the benefits of managed hosting outweigh customization, data sovereignty, or compliance concerns.

Trend 03

Database sharding

Pie chart of database sharding survey results.

Is your database logically sharded across multiple primaries?

Most respondents favored simple database configurations with no sharding or only minimal sharding. Understandably, a single primary or a small number of primaries is easier to manage and is sufficient for many applications.

Few respondents reported complex sharding configurations, which may indicate that early-stage companies have not yet needed that level of database optimization or lack the infrastructure expertise to run it.

The uncertainty among some respondents may reflect a reliance on managed services, where the underlying architecture is less visible or less well understood.
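
For readers less familiar with the term, logical sharding means the application decides which primary owns a given piece of data. The sketch below shows one common approach, hashing a key to pick a primary. The connection strings, the `tenant_id` key, and the function names are hypothetical and chosen only for illustration; they do not describe any respondent's setup.

```python
import hashlib

# Hypothetical connection strings; the hosts are placeholders, not real endpoints.
PRIMARIES = [
    "postgres://primary-0.example.internal/app",
    "postgres://primary-1.example.internal/app",
    "postgres://primary-2.example.internal/app",
]


def shard_for(tenant_id: str, shard_count: int = len(PRIMARIES)) -> int:
    """Map a tenant ID to a shard index using a stable hash.

    hashlib is used instead of the built-in hash() because hash() is
    randomized per process for strings and would not route consistently.
    """
    digest = hashlib.sha256(tenant_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def primary_for(tenant_id: str) -> str:
    """Return the connection string of the primary that owns this tenant."""
    return PRIMARIES[shard_for(tenant_id)]
```

One caveat with plain modulo routing is that changing the number of primaries remaps most keys, which is part of why sharding adds operational overhead compared with a single primary.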

Trend 04

Read replication

Pie chart of database read replication survey results.

Do you maintain read replicas of your primary database?

Our survey results highlighted a fairly balanced approach to read replication.

Many organizations prioritized failover replication to ensure that their databases remain available during an unforeseen event. Maintaining a moderate number of read replicas was also common, reflecting a need to distribute the read load and improve performance for read-heavy applications.

Interestingly, fourteen percent of respondents indicated they did not have read replicas in place, relying solely on their primary database to keep up with demand.
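
As a rough illustration of how read replicas spread read load, the sketch below sends `SELECT` statements to replicas in round-robin order and everything else to the primary. The class name, the host labels, and the simple `SELECT` prefix check are assumptions made for this example, not something drawn from the survey.

```python
import itertools


class ReadWriteRouter:
    """Pick a database target per statement (illustrative sketch only)."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # With no replicas configured, every statement goes to the primary.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def target_for(self, sql):
        # Naive classification: treat statements starting with SELECT as reads.
        # This ignores cases such as SELECT ... FOR UPDATE and replication lag.
        words = sql.lstrip().split(None, 1)
        is_read = bool(words) and words[0].upper() == "SELECT"
        if is_read and self._replicas is not None:
            return next(self._replicas)
        return self.primary
```

A router built with no replicas behaves like the single-primary setup described above: all traffic, reads included, lands on the primary.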

Trend 05

Connected applications

Bar chart of connected applications database survey results.

How many separate applications connect to a single database?

Most respondents connect relatively few applications to their primary database, with over half having only 1 to 2 applications connected.

A smaller but significant number indicated 3 to 5 applications connected to their database, and nearly a quarter had 6 or more. These larger counts point to a more integrated environment where multiple services or applications interact with the database, likely reflecting a more extensive and distributed application infrastructure.

Trend 06

Primary workload

Pie chart of database workload survey results.

How would you describe your primary workload?

The survey results show an equal split between respondents who describe their workloads as "mostly read with some write/update" and those with a "roughly 50/50 mix of read and write." This indicates that many organizations have balanced workloads or tend towards read-heavy operations, which is typical in many database applications.

The uncertainty among some respondents regarding primary workloads might suggest a need for better monitoring and analysis tools to understand database usage patterns.

Trend 07

Long-running queries

Bar chart of long-running queries survey results.

Do you run analytics or long-running queries from your primary database?

The majority of respondents only run short-lived queries from their primary database. This approach helps maintain performance and responsiveness by avoiding resource-intensive operations on their primary.

Another prevalent strategy involves using ETL processes to offload analytic workloads from the primary database to a dedicated analytics warehouse.
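
The extract, transform, load pattern can be sketched in a few lines. The example below uses two SQLite connections as stand-ins for a primary database and an analytics warehouse. The `orders` and `customer_totals` tables, their columns, and the `run_etl` function are hypothetical names introduced for illustration.

```python
import sqlite3


def run_etl(primary, warehouse):
    """Copy per-customer order totals from the primary into the warehouse."""
    # Extract: one short, simple read against the primary.
    rows = primary.execute(
        "SELECT customer_id, amount_cents FROM orders"
    ).fetchall()

    # Transform: aggregate outside the primary so it does no heavy work.
    totals = {}
    for customer_id, amount_cents in rows:
        totals[customer_id] = totals.get(customer_id, 0) + amount_cents

    # Load: write the aggregated rows into the warehouse table.
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS customer_totals ("
        "customer_id INTEGER PRIMARY KEY, total_cents INTEGER)"
    )
    warehouse.executemany(
        "INSERT OR REPLACE INTO customer_totals VALUES (?, ?)",
        list(totals.items()),
    )
    warehouse.commit()
    return len(totals)
```

In practice the extract step is usually incremental, for example only rows changed since the last run, and runs on a schedule. The full-table read here just keeps the sketch short.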

A smaller contingent of respondents run heavy analytics directly on their primary database, which can require robust database infrastructure to handle the additional load without affecting transactional performance.

Acknowledgements

We want to extend our heartfelt gratitude to all the readers, participants, and supporters who contributed to this report. We sincerely appreciate the time and energy you invested in sharing our database survey with your networks. Your generous efforts have greatly enriched our research.
