Up through the first decade of the 21st century, the world of relational databases wasn’t a particularly exciting one. From their arrival in the 1970s, relational databases mostly supported unglamorous but important back-office applications with a handful of users and relatively small amounts of data. But times – and relational database requirements – have changed.
Today, it’s not uncommon to find enterprise and consumer applications supporting hundreds of thousands or even millions of users and massive volumes of high-velocity data. These are mission-critical applications that companies across industries rely on to engage and transact with their customers, partners, and workers. And these applications often require a relational database that is not only ACID compliant but also high performance, massively scalable, and highly available.
Consider DoorDash, the largest third-party delivery service in the world. Its mobile application has over 20 million users and processes hundreds of thousands of orders per day delivered by 200,000+ drivers. It supports on-demand delivery for more than 340,000 local businesses and restaurants in 4,400 cities across the United States and Canada, with an average order-to-delivery time of just 37 minutes.
Or Code.org, a Seattle-based non-profit dedicated to expanding access to computer science in schools. Every year, the organization hosts Hour of Code, a weeklong event in which millions of students and teachers around the world engage with hundreds of computer science tutorials in more than 45 languages on the Code.org platform.
Then there’s Samsung. The electronics giant developed Samsung Account, a user certification and authorization application, to enable its customers to sync data across Samsung devices, access Samsung Pay, and use the company’s Find My Mobile service, among other services. The application supports an eye-watering 1 billion users, who make on average 80,000 requests per second.
Traditional relational databases face critical challenges in supporting these ultra-demanding applications. For one, traditional relational databases – both old-guard commercial databases and open source databases – must be taken offline, sometimes for hours at a time, for maintenance, patches, and upgrades. With millions of users scattered across time zones relying on these applications 24/7, when is a good time to take the database offline for maintenance? The answer is never. Cost is another obstacle. Some of the old-guard commercial databases provide the performance and high availability these applications require, but they come with steep price tags and punitive licensing terms. Open source databases, on the other hand, are easier on the wallet, but they require skilled, expensive staff to continuously tune for performance and ensure reliability.
That’s why we at Amazon Web Services built Amazon Aurora. Aurora is a MySQL- and PostgreSQL-compatible relational database built for the cloud that combines the performance and availability of traditional enterprise databases with the simplicity and cost-effectiveness of open source databases. It is designed to support all manner of database workloads, including operational and transactional applications with the highest availability, durability, and performance requirements. Aurora is managed by the Amazon Relational Database Service (RDS), which automates database management tasks such as hardware provisioning, software patching, setup, configuration, and backups. And Aurora is the relational database that DoorDash, Code.org, Samsung, and tens of thousands of other companies rely on to power their most important and demanding database workloads.
Since its debut in 2015, Aurora has been the fastest-growing service in the history of AWS. Customers choose Aurora for a number of reasons; its benefits include:
- Performance: Testing on standard benchmarks has shown up to a 5x increase in throughput over stock MySQL on similar hardware and a 3x increase over stock PostgreSQL. Amazon Aurora uses a variety of software and hardware techniques to fully leverage available compute, memory, and networking. This includes pushing a number of core functions, like redo log processing and database snapshots, down to Aurora’s distributed storage layer instead of executing them in the database instance itself, so these operations don’t compete with your production workloads for instance resources.
- Scalability: Amazon Aurora automatically grows the size of your database volume as your storage needs grow. Your volume grows in increments of 10 GB, up to a maximum of 128 TB, so you don’t need to provision excess storage to handle future growth. Using the Amazon RDS APIs, or with a few clicks in the AWS Management Console, you can scale the compute and memory resources powering your deployment up or down. Compute scaling operations typically complete in a few minutes.
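The auto-growing volume behavior described above amounts to simple arithmetic: round the data size up to the next 10 GB increment, capped at 128 TB. Here is a minimal sketch of that calculation (a hypothetical helper for illustration, not an AWS API):

```python
import math

AURORA_INCREMENT_GB = 10       # the volume grows in 10 GB increments
AURORA_MAX_GB = 128 * 1024     # 128 TB ceiling

def provisioned_volume_gb(data_gb: float) -> int:
    """Return the volume size Aurora would auto-grow to for data_gb of data."""
    increments = math.ceil(data_gb / AURORA_INCREMENT_GB)
    # At least one increment is provisioned; growth stops at the 128 TB cap.
    return min(AURORA_MAX_GB, max(AURORA_INCREMENT_GB,
                                  increments * AURORA_INCREMENT_GB))

print(provisioned_volume_gb(7))    # 10  -- one 10 GB increment covers 7 GB
print(provisioned_volume_gb(25))   # 30  -- rounds up to the next increment
```

The point of the sketch is that capacity planning disappears: the provisioned size tracks actual data, never a guess about future growth.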
- Availability: Amazon Aurora is designed to deliver better than 99.99% availability. It does this, in part, by storing six copies of your data across three AWS Availability Zones (an Availability Zone is one or more discrete data centers with redundant power, networking, and connectivity in an AWS Region). To support globally distributed applications, Aurora Global Database enables a single Amazon Aurora database to span multiple AWS Regions, protecting against region-wide outages. Your data is also continuously backed up to Amazon S3, and Aurora transparently recovers from physical storage failures, typically in less than 30 seconds.
- Ease of Management: As mentioned, Amazon Aurora is managed by Amazon RDS, which makes it easy to set up, operate, and scale Aurora with just a few clicks. It provides cost-efficient and resizable capacity while automating time-consuming administration tasks such as hardware provisioning, database setup, patching and backups. This frees up database administrators to focus on higher value tasks, like optimizing application performance, and enables developers to spend more time doing what they do best – writing and shipping great software and applications.
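As a concrete illustration of that “few clicks” setup, creating an Aurora deployment from the AWS CLI takes two calls – one for the cluster, one for a database instance in it. The identifiers, instance class, and password below are placeholders:

```shell
# Create the Aurora MySQL-compatible cluster; storage, replication,
# and backups are managed by the service.
aws rds create-db-cluster \
    --db-cluster-identifier my-aurora-cluster \
    --engine aurora-mysql \
    --master-username admin \
    --master-user-password 'choose-a-strong-password'

# Add a database instance (the compute) to the cluster.
aws rds create-db-instance \
    --db-instance-identifier my-aurora-instance-1 \
    --db-cluster-identifier my-aurora-cluster \
    --engine aurora-mysql \
    --db-instance-class db.r5.large
```

Everything else the bullet lists – provisioning, patching, backups – happens behind those two calls with no further operator involvement.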
Speaking at AWS Global Summit New York, DoorDash co-founder and CTO Andy Fang cited Aurora’s dynamically growing storage, its ability to handle tables with billions of rows of data, its high I/O and memory capacity limits, and Aurora’s extremely fast replica lag as critical capabilities that help the company serve its millions of customers.
“We’ve been all in on AWS,” Fang said. “Because of the capabilities of Aurora and the data products that AWS provides, we’ve been able to focus on innovating, building products, and cementing our industry position.”
We’re excited that so many companies – from DoorDash, Code.org, and Samsung to Experian, The Pokémon Company, Dow Jones, and many more – have chosen Amazon Aurora to power their most important, demanding applications and database workloads. And we continue to innovate on your behalf: in just the last 12 months, we introduced important new features such as Database Activity Streams, which provides a near real-time stream of database activity for monitoring and auditing, and machine learning directly from the database, which enables developers to apply predictive models with no specialized skills required. And there’s more to come.
Modern applications require a modern database. To learn how Amazon Aurora can help your company power amazing applications that delight users, visit the Aurora page here. And don’t miss re:Invent 2020, taking place online between November 30 and December 18. The conference is free and will include dozens of sessions aimed at helping developers and DBAs make the most of their databases and applications. See you there!