FAWN: Trading Raw Speed for Radical Energy Efficiency

This post is adapted from a paper review I wrote during CMU’s 18-845: Internet Services course, where we studied seminal papers in distributed systems, web architecture, and virtualization. I’ve converted my reviews into blog posts to share them more broadly.

Paper [1]: “FAWN: A Fast Array of Wimpy Nodes” by David Andersen, Jason Franklin, Michael Kaminsky, Amar Phanishayee, Lawrence Tan, Vijay Vasudevan (SOSP, 2009)

Paper [2]: “Energy-efficient Cluster Computing with FAWN: Workloads and Implications” by V. Vasudevan, D. Andersen, M. Kaminsky, L. Tan, J. Franklin, I. Moraru (e-Energy, 2010)

TL;DR

FAWN proposes a cluster architecture built from many low-power (“wimpy”) embedded processors paired with local flash storage, demonstrating that such a system can be several times more energy-efficient than traditional servers for I/O-bound workloads. The authors validate the concept through FAWN-KV, a consistent, replicated, highly available key-value store that achieves strong performance while drawing only a few watts per node.

Summary

FAWN (Fast Array of Wimpy Nodes) is a cluster architecture built from a large number of slow but efficient nodes, each drawing only a few watts of power. It presents an energy-efficient alternative to conventional data-intensive computing systems. FAWN couples low-power embedded CPUs to small amounts of local flash storage and carefully balances computation and I/O capabilities to enable efficient, massively parallel access to data.

The papers provide design and implementation details for FAWN-KV, a consistent, replicated, highly available, and high-performance key-value storage system built on a FAWN prototype. The use of slower processors is justified by the observation that they are more efficient: they consume fewer joules of energy per instruction than high-speed processors.
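To make the efficiency argument concrete, here is a minimal sketch of the joules-per-instruction metric. The function name and both sample machines are hypothetical, and the numbers are chosen purely for illustration; they are not measurements from either paper.

```python
def joules_per_instruction(power_watts: float, instructions_per_second: float) -> float:
    """Energy per instruction: a watt is a joule per second, so J/instr = W / (instr/s)."""
    if instructions_per_second <= 0:
        raise ValueError("instructions_per_second must be positive")
    return power_watts / instructions_per_second


# Hypothetical figures for illustration only (not taken from the papers).
wimpy = joules_per_instruction(power_watts=4.0, instructions_per_second=1e9)
brawny = joules_per_instruction(power_watts=100.0, instructions_per_second=1e10)

# The faster machine finishes sooner, but here the slower one spends less energy per instruction.
print(f"wimpy:  {wimpy * 1e9:.1f} nJ/instruction")
print(f"brawny: {brawny * 1e9:.1f} nJ/instruction")
```

With these made-up inputs the fast machine is ten times quicker but draws twenty-five times the power, so it spends more energy on each instruction. That is the shape of the trade-off the authors exploit.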

Both papers back the FAWN concept with extensive experiments and benchmarks that argue it is both needed and practical. FAWN can be several times more efficient than traditional systems for I/O-bound workloads and is on par with or more efficient than traditional systems for many memory-bound and CPU-limited applications.

Strengths

Thorough power analysis. The paper devotes significant attention to power requirements. The in-depth analysis of how much power is consumed by each component effectively illustrates the enormous power demands of modern datacenters and motivates the FAWN approach.
Sound choice of ring-based architecture. The authors’ decision to use a Chord-like ring structure for key-value storage is well-reasoned. This design simplifies replication and integrates naturally with an append-only log structure.
Comprehensive evaluation. The evaluation and benchmarking in Paper [1] are particularly strong, clearly demonstrating that FAWN achieves its stated research goals. The general architectural comparison also provides an excellent overview of where FAWN sits within the broader computing spectrum.
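As a rough illustration of why a ring pairs well with replication and an append-only log, the sketch below places nodes on a 160-bit ring with SHA-1 and treats the R clockwise successors of a key as its replica chain. This is a simplified toy, not FAWN-KV's implementation: the class, method, and node names are hypothetical, virtual nodes are omitted, and the write is applied to every replica in one loop rather than being forwarded down the chain and acknowledged from the tail.

```python
import bisect
import hashlib


def ring_position(name: str) -> int:
    """Place a string on a 160-bit ring using SHA-1 (a 20-byte digest)."""
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest(), "big")


class Ring:
    """Toy ring: nodes own positions, and keys map to their clockwise successors."""

    def __init__(self, node_names):
        # Sorted (position, name) pairs allow successor lookup by binary search.
        self._points = sorted((ring_position(n), n) for n in node_names)
        self._positions = [pos for pos, _ in self._points]
        # One append-only log per node; entries are only ever appended.
        self.logs = {name: [] for name in node_names}

    def chain_for(self, key: str, r: int):
        """Return the r successor nodes of the key: head first, tail last."""
        if not 1 <= r <= len(self._points):
            raise ValueError("r must be between 1 and the number of nodes")
        start = bisect.bisect_left(self._positions, ring_position(key))
        n = len(self._points)
        return [self._points[(start + i) % n][1] for i in range(r)]

    def put(self, key: str, value: bytes, r: int):
        """Append the write to each replica's log in chain order (head to tail)."""
        chain = self.chain_for(key, r)
        for node in chain:
            self.logs[node].append((key, value))
        return chain


ring = Ring(["node-a", "node-b", "node-c", "node-d", "node-e"])
chain = ring.put("user:42", b"hello", r=3)
print("replica chain:", chain)
```

Because replicas are just the next nodes clockwise, the chain for any key falls out of the ring layout with no extra bookkeeping, and each replica only ever appends to its local log.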

Weaknesses

Node departure and replacement left unexplained. The authors should have explained what happens after a node leaves a chain: how is its replacement brought in? Does the node manager create a new node, or is an existing node added to the chain? It is unclear how the replication factor R is maintained during these transitions.
Black-box node administration. The paper does not explain how the key range or the parameters R (replication factor) and V (virtual nodes) are selected. Node administration remains largely a black box.
Inconsistent units in sort efficiency discussion. In Paper [2], Figure 5 depicts sort efficiency, and the accompanying text discusses power in watts, but the graph’s y-axis is in MB per joule. The text should have spelled out how the two units relate (a joule is one watt sustained for one second, so MB per joule is throughput in MB/s divided by power in watts) to avoid confusion.
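The unit relationship behind that last point is simple arithmetic, sketched below. The function name and the sample run are hypothetical; the figures are not taken from Paper [2].

```python
def sort_efficiency_mb_per_joule(data_mb: float, elapsed_s: float, avg_power_w: float) -> float:
    """MB sorted per joule consumed. Energy in joules = average watts * seconds."""
    energy_j = avg_power_w * elapsed_s
    return data_mb / energy_j


# Hypothetical run: 1000 MB sorted in 100 s at an average draw of 50 W.
efficiency = sort_efficiency_mb_per_joule(data_mb=1000.0, elapsed_s=100.0, avg_power_w=50.0)

# The same number falls out of throughput (MB/s) divided by power (W).
throughput_mb_s = 1000.0 / 100.0
same_efficiency = throughput_mb_s / 50.0

print(f"{efficiency:.2f} MB/J, or equivalently {same_efficiency:.2f} (MB/s)/W")
```

Reading the y-axis as throughput per watt makes it easier to connect the figure to a discussion phrased in watts.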

Discussion Questions

Maximum value size. What is the maximum size of values that can be stored in FAWN-KV? For comparison, RAMCloud supports objects up to 1 MB. The authors do not show any experiments with object sizes larger than 1 KB.
The id parameter in put operations. Figure 4 in Paper [1] shows the put operation as put(key, value, id). What does id correspond to? Is it a monotonically increasing unique identifier?
Latency of data migration. What is the latency of copying data when a node joins or leaves the FAWN-KV ring? For example, when a node joins, it must receive a copy of at least R-1 replica ranges. How long does this take when disk utilization is high?
Key length selection. The key is 160 bits. Is this length chosen to align with standard key hashing (e.g., SHA-1), or is it an arbitrary selection? Longer keys consume more disk space per entry.
Failure during chain reconfiguration. What happens if an existing node dies before sending an ACK to its chain successor while a new node is joining? Is there a self-healing mechanism to recover from this scenario?
Minimum cluster size. What is the minimum number of nodes required for a FAWN-KV ring? If V=3 and R=3, can a single physical node with three virtual nodes serve as its own replicas?
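The minimum-cluster-size question can be made concrete with a small sketch. It assumes a naive successor walk over virtual nodes with no check for distinct physical hosts; whether FAWN-KV behaves this way is exactly what the question asks, so treat the function and the identifiers below as hypothetical.

```python
def distinct_physical_nodes(virtual_ring, start, r):
    """Count distinct physical nodes among r consecutive virtual nodes.

    virtual_ring is a list of (virtual_id, physical_node) pairs in ring order.
    """
    n = len(virtual_ring)
    chain = [virtual_ring[(start + i) % n] for i in range(r)]
    return len({physical for _, physical in chain})


# One physical node "A" hosting three virtual nodes (hypothetical identifiers).
single = [("A.0", "A"), ("A.1", "A"), ("A.2", "A")]

# Three physical nodes with three virtual nodes each, interleaved for illustration.
three = [(f"{p}.{i}", p) for i in range(3) for p in ("A", "B", "C")]

print(distinct_physical_nodes(single, start=0, r=3))
print(distinct_physical_nodes(three, start=0, r=3))
```

Under this naive walk, the single-node ring builds a chain of length three whose replicas all live on one machine, so losing that machine loses every copy. A chain only provides fault isolation if its virtual nodes map to distinct physical nodes.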