The last few weeks have been anything but normal for many of us. I do hope that you and your loved ones are managing to stay safe. My routines have been disrupted too, and with the closure of schoo… | Continue reading
Serverless in the wild: characterizing and optimising the serverless workload at a large cloud provider, Shahrad et al., arXiv 2020 This is a fresh-from-the-arXivs paper that Jonathan Mace (@mpi_jc… | Continue reading
An empirical guide to the behavior and use of scalable persistent memory, Yang et al., FAST’20 We’ve looked at multiple papers exploring non-volatile main memory and its implications (e… | Continue reading
Understanding, detecting and localizing partial failures in large system software, Lou et al., NSDI’20 Partial failures (gray failures) occur when some but not all of the functionalities of a… | Continue reading
Rex: preventing bugs and misconfiguration in large services using correlated change analysis, Mehta et al., NSDI’20 and Check before you change: preventing correlated failures in service upda… | Continue reading
Characterizing, modeling, and benchmarking RocksDB key-value workloads at Facebook, Cao et al., FAST’20 You get good at what you practice. Or in the case of key-value stores, what you benchma… | Continue reading
Building an elastic query engine on disaggregated storage, Vuppalapati et al., NSDI’20 This paper describes the design decisions behind the Snowflake cloud-based data warehouse. As the saying goes, … | Continue reading
(Post updated to add links to write-ups of the papers now that the series is complete). We had to get here at some point! Inspired by the recent publication of Raft Refloated I thought it would be … | Continue reading
Millions of tiny databases, Brooker et al., NSDI’20 This paper is a real joy to read. It takes you through the thinking processes and engineering practices behind the design of a key part of … | Continue reading
Firecracker: lightweight virtualisation for serverless applications, Agache et al., NSDI’20 Finally the NSDI’20 papers have opened up to the public (as of last week), and what a great l… | Continue reading
Gandalf: an intelligent, end-to-end analytics service for safe deployment in cloud-scale infrastructure, Li et al., NSDI’20 Modern software systems at scale are incredibly complex, ever-changi… | Continue reading
Meaningful availability, Hauer et al., NSDI’20 With thanks to Damien Mathieu for the recommendation. This very clearly written paper describes the Google G Suite team’s search for a mea… | Continue reading
AnyLog: a grand unification of the Internet of Things, Abadi et al., CIDR’20 The Web provides decentralised publishing and direct access to unstructured data (searching / querying that data h… | Continue reading
Extending relational query processing with ML inference, Karanasos et al., CIDR’20 This paper provides a little more detail on the concrete work that Microsoft is doing to embed machine learning inf… | Continue reading
Cloudy with a high chance of DBMS: a 10-year prediction for enterprise-grade ML, Agrawal et al., CIDR’20 "Cloudy with a high chance of DBMS" is a fascinating vision paper from a gro… | Continue reading
Migrating a privacy-safe information extraction system to a software 2.0 design, Sheng et al., CIDR’20 This is a comparatively short (7 pages) but very interesting paper detailing the migration of a… | Continue reading
Programs, life cycles, and laws of software evolution, Lehman, Proc. IEEE, 1980 Today’s paper came highly recommended by Kevlin Henney and Nat Pryce in a Twitter thread last week, thank you b… | Continue reading
Let’s encrypt: an automated certificate authority to encrypt the entire web, Aas et al., CCS’19 This paper tells the story of Let’s Encrypt, from its early beginnings in 20… | Continue reading
Ten challenges for making automation a ‘team player’ in joint human-agent activity, Klein et al., IEEE Computer Nov/Dec 2004 With thanks to Thomas Depierre for the paper suggestion. Las… | Continue reading
Watching you watch: the tracking ecosystem of over-the-top TV streaming devices, Moghaddam et al., CCS’19 The results from this paper are all too predictable: channels on Over-The-Top (OTT) s… | Continue reading
Cloudburst: stateful functions-as-a-service, Sreekanti et al., arXiv 2020 Today’s paper choice is a fresh-from-the-arXivs take on serverless computing from the RISELab at Berkeley, addressing… | Continue reading
POTS: Protective optimization technologies, Kulynych, Overdorf et al., arXiv 2019 With thanks to @TedOnPrivacy for recommending this paper via Twitter. Last time out we looked at fairness in the co… | Continue reading
The measure and mismeasure of fairness: a critical review of fair machine learning, Corbett-Davies & Goel, arXiv 2018 With many thanks to Ben Fried and the ACM Queue editorial board for the pap… | Continue reading
The Tail at Scale – Dean and Barroso 2013 We’ve all become familiar with the importance of fault-tolerance and the techniques that can be used to achieve it. Less well-known is the idea… | Continue reading
Seamless offloading of web app computations from mobile device to edge clouds via HTML5 web worker migration, Jeong et al., SoCC’19 This paper caught my eye for its combination of an int… | Continue reading
Narrowing the gap between serverless and its state with storage functions, Zhang et al., SoCC’19 "Narrowing the gap" was runner-up in the SoCC’19 best paper awards. While bein… | Continue reading
Trade-offs under pressure: heuristics and observations of teams resolving internet service outages, Allspaw, Master’s thesis, Lund University 2015 This is part 2 of our look at Allspaw’s 2015 … | Continue reading