Apple Business is Here
Apple Business combines device management, communication tools, and customer reach in a single platform for businesses and schools. I've been working on this for a while, and it's exciting...
Live Migration of Virtual Machines: Seamless OS Migration with Minimal Downtime
This paper presents a technique for migrating running virtual machines across physical hosts with remarkably low downtime -- as little as 60 milliseconds. Using a pre-copy approach that iteratively transfers...
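As a rough illustration of the pre-copy idea in the summary above, the sketch below simulates the rounds numerically: each round sends the pages dirtied during the previous one while the guest keeps running, and the guest is paused only for the small remainder. The function name, the `dirty_percent` model (pages dirtied per round as a fixed share of the pages just sent), and all the numbers are assumptions made for this example; none of them come from the paper.

```python
def precopy_migrate(total_pages, dirty_percent, stop_threshold, max_rounds=30):
    """Toy model of pre-copy migration (hypothetical, not the paper's code).

    Returns (rounds, pages_sent_live, pages_sent_paused).
    """
    to_send = total_pages      # the first round copies all of memory
    sent_live = 0              # pages copied while the guest is still running
    rounds = 0
    while to_send > stop_threshold and rounds < max_rounds:
        sent_live += to_send
        rounds += 1
        # Assumption: the guest dirties a fixed percentage of the pages
        # that were just sent while that round was in flight.
        to_send = to_send * dirty_percent // 100
    # Whatever is left is copied during the brief stop-and-copy pause.
    return rounds, sent_live, to_send


rounds, sent_live, sent_paused = precopy_migrate(100_000, 10, 50)
print(rounds, sent_live, sent_paused)
```

In this toy model the pause only has to cover `sent_paused` pages rather than all of memory, which is the intuition behind the short downtime. A real workload with a hot set of frequently written pages would not shrink this neatly, which is why the sketch caps the number of rounds.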
Nested Virtualization and the Turtles Project: Hypervisors All the Way Down
The Turtles Project demonstrates that hypervisors can efficiently run inside other hypervisors -- even without architectural support for nesting on x86. By implementing multi-dimensional paging for MMU virtualization and multi-level...
Xen and KVM: Two Approaches to Virtualization
Xen pioneered paravirtualization -- modifying guest OSes slightly to achieve near-native performance on a bare-metal hypervisor -- while KVM took a radically simpler approach by turning the Linux kernel itself...
Hardware vs. Software Virtualization: A Deep Dive into x86 Techniques
This paper provides a rigorous head-to-head comparison of software-based and hardware-based x86 virtualization. The surprising finding is that software VMMs outperform first-generation hardware VMMs for I/O-heavy and context-switch-heavy workloads, while...
Virtual Machine Monitors and Intel VT: Foundations of Modern Virtualization
These two papers provide a comprehensive look at virtualization from both the software and hardware sides. Rosenblum (a co-founder of VMware) and Garfinkel lay out the design goals and implementation...
FAWN: Trading Raw Speed for Radical Energy Efficiency
FAWN proposes a cluster architecture built from many low-power ("wimpy") embedded processors paired with local flash storage, demonstrating that such a system can be several times more energy-efficient than traditional...
Inside the Warehouse-Scale Computer: Datacenter Basics
This book reframes the modern datacenter not as a collection of co-located servers, but as a single warehouse-scale computer requiring holistic design. Chapter 4 dives into the physical fundamentals --...
Building Raft with Test-Driven Development in Go
Report: “Building Raft - An Exploration of Test-Driven Development in Go” by Sean Klein and Advaya Krishna (Student Project Report, 2015)
Dapper: Google's Large-Scale Distributed Tracing Infrastructure
Dapper is Google's production distributed tracing system that provides low-overhead, application-transparent instrumentation across Google's massive infrastructure. Originally conceived as a tracing tool, it evolved into a general-purpose monitoring platform --...
Mesa: Google's Geo-Replicated, Near Real-Time Data Warehousing System
Mesa is Google's distributed data warehousing system designed for storing and querying critical advertising metrics. It achieves a unique combination of geo-replication across multiple datacenters, near real-time update ingestion, strong...
Percolator: Large-Scale Incremental Processing at Google
Percolator is Google's system for incrementally processing updates to massive datasets, built on top of Bigtable to replace batch-oriented MapReduce pipelines for web search indexing. By providing ACID snapshot-isolation transactions...
Bigtable: Google's Distributed Storage System for Structured Data
Bigtable is Google's sparse, distributed, multi-dimensional sorted map designed to scale to petabytes of data across thousands of machines. Rather than building a traditional RDBMS, Google created a system organized...
Chubby: Google's Distributed Lock Service for Loosely-Coupled Systems
Chubby is Google's distributed lock service that provides coarse-grained locking and reliable low-volume storage for loosely-coupled distributed systems. Rather than exposing a raw Paxos library to developers, Chubby offers a...
Paxos Made Live: Bridging Theory and Production Systems
This paper chronicles the hard-won engineering lessons from building a production fault-tolerant database using the Paxos consensus algorithm inside Google's Chubby lock service. The key takeaway is that the gap...
MapReduce: Simplified Data Processing on Large Clusters
This landmark paper introduces MapReduce, a programming model and runtime framework for processing massive datasets across thousands of machines. By abstracting away the complexity of parallelization, fault tolerance, and load...
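To make the programming model in the summary above concrete, here is a minimal single-process sketch of a word count, the usual introductory example. The names `map_words`, `reduce_counts`, and `run_mapreduce` are made up for this illustration, and the toy runner only mimics the map, group-by-key, and reduce steps; it does none of the parallelization or fault tolerance that the real runtime handles.

```python
from collections import defaultdict


def map_words(doc_id, text):
    """Map step: emit a (word, 1) pair for every word in one document."""
    for word in text.lower().split():
        yield word, 1


def reduce_counts(word, counts):
    """Reduce step: combine all values emitted for a single key."""
    return word, sum(counts)


def run_mapreduce(documents, mapper, reducer):
    """Single-process stand-in for the runtime: map, group by key, reduce."""
    groups = defaultdict(list)
    for doc_id, text in documents.items():
        for key, value in mapper(doc_id, text):
            groups[key].append(value)
    return dict(reducer(key, values) for key, values in sorted(groups.items()))


docs = {"a.txt": "the quick fox", "b.txt": "the lazy dog the end"}
word_counts = run_mapreduce(docs, map_words, reduce_counts)
print(word_counts)
```

The user-supplied pieces are just the two small functions; everything in `run_mapreduce` is what a framework would take over and distribute across machines.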
Locality-Aware Request Distribution in Cluster-Based Web Servers
These two papers tackle the problem of intelligently distributing requests across web server clusters. The first introduces LARD, which routes requests to maximize backend cache locality while maintaining load balance....
The Birth of Google: Search Architecture and PageRank
These two foundational papers introduce the Google search engine and its core ranking algorithm, PageRank. The first paper describes Google's system architecture -- how web pages are crawled, indexed, and...
Measuring Web Server Capacity Under Realistic Conditions
This paper exposes critical shortcomings in how web server benchmarks were designed in the late 1990s -- they failed to push servers past their capacity and ignored the effects of...
Comparing Web Server Architectures: Events, Threads, and Pipelines
This paper provides a rigorous performance comparison of event-driven (userver), thread-per-connection (Knot), and hybrid pipeline (WatPipe) web server architectures. After carefully tuning each server and eliminating confounding factors, the authors...
Events vs. Threads: Two Sides of the Web Server Debate
These two papers represent opposing viewpoints in the classic events-versus-threads debate for web server design. The first proposes a scalable event delivery mechanism to replace the bottleneck-prone `select()` system call,...
Accept Strategies: A Simple Knob for Big Web Server Gains
This paper demonstrates that tuning a single parameter -- the accept limit, which controls how many incoming connections a server accepts at a time -- can significantly improve web server...
Flash: A Web Server Built on the AMPED Architecture
This paper introduces the AMPED (Asymmetric Multi-Process Event-Driven) architecture for web servers and evaluates it through the Flash web server. Flash combines the efficiency of event-driven designs with helper processes...
The End-to-End Design Principle: Placing Functions in the Right Layer
This foundational paper argues that functions placed at low levels of a system may be redundant or of little value when compared to the cost of providing them at that...
Welcome to My Blog
Welcome to my personal blog! I’m excited to share my thoughts, experiences, and musings!