Papers

The Google File System

Bigtable: A Distributed Storage System for Structured Data

The Chubby Lock Service for Loosely-Coupled Distributed Systems

Spanner: Google's Globally-Distributed Database

MapReduce: Simplified Data Processing on Large Clusters

Large-scale cluster management at Google with Borg

F1 - The Fault-Tolerant Distributed RDBMS Supporting Google's Ad Business

Sinfonia: A New Paradigm for Building Scalable Distributed Systems

Finding a needle in Haystack: Facebook's photo storage

Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing

Scaling Distributed Machine Learning with the Parameter Server

Leases: An Efficient Fault-Tolerant Mechanism for Distributed File Cache Consistency

Dynamo: Amazon's Highly Available Key-value Store

S4: Distributed Stream Computing Platform

# More Reading

Paxos "Trilogy"

Paxos Made Simple

Paxos Made Practical

Paxos Made Live: An Engineering Perspective

Consensus Algorithm

The Raft Consensus Algorithm, equivalent to Paxos in fault-tolerance and performance.

Quorum mechanism

Weighted Voting for Replicated Data

A Quorum-Consensus Replication Method for Abstract Data Types

Articles

Amazon's Dynamo

Wheels

ZooKeeper, enables highly reliable distributed coordination. (ref: The Chubby Lock Service for Loosely-Coupled Distributed Systems)

Cassandra, scalability and high availability without compromising performance. (ref: Dynamo: Amazon's Highly Available Key-value Store)

Storm, distributed realtime computation system.

S4, processes continuous unbounded streams of data. (ref: S4: Distributed Stream Computing Platform)

Documents from SNS / Posts

Quora - What are some good resources for learning about distributed computing? Why?

Quora - How do I become a data scientist?

Nginx Module Development

Emiller's Guide To Nginx Module Development