PostgreSQL can scale rather well vertically. The more resources (CPU, memory, disk) that you can make available to your PostgreSQL server, the better it can perform. However, while some parts of Postgres can automatically make use of the increased resources, other parts need configuration changes before improvements can be noticed.
Sharding, also known as horizontal partitioning, is a popular scale-out approach for relational databases. Amazon Relational Database Service (Amazon RDS) is a managed relational database service that provides great features to make sharding easy to use in the cloud. In this post, I describe how to use Amazon RDS to implement a sharded database architecture to achieve high scalability, high availability, and fault tolerance for data storage. I discuss considerations for schema design and monitoring metrics when deploying Amazon RDS as a database shard. I also outline the challenges for resharding and highlight the push-button scale-up and scale-out solutions in Amazon RDS.
Many distributed storage systems achieve high data access throughput via partitioning and replication, each system with its own advantages and tradeoffs. In order to achieve high scalability, however, today’s systems generally reduce transactional support, disallowing single transactions from spanning multiple partitions. Calvin is a practical transaction scheduling and data replication layer that uses a deterministic ordering guarantee to significantly reduce the normally prohibitive contention costs associated with distributed transactions. Unlike previous deterministic database system prototypes, Calvin supports disk-based storage, scales near-linearly on a cluster of commodity machines, and has no single point of failure. By replicating transaction inputs rather than effects, Calvin is also able to support multiple consistency levels—including Paxosbased strong consistency across geographically distant replicas—at no cost to transactional throughput.