SLOG: Serializable, low-latency, geo-replicated transactions

SLOG: serializable, low-latency, geo-replicated transactions Ren et al., VLDB’19 SLOG is another research system motivated by the needs of the application developer (aka, user!). Building cor… | Continue reading


@blog.acolyer.org | 5 years ago

IPA: Invariant-preserving applications for weakly consistent replicated DB’s

IPA: invariant-preserving applications for weakly consistent replicated databases Balegas et al., VLDB’19 IPA for developers, happy days! Last week we looked at automating checks for invarian… | Continue reading


@blog.acolyer.org | 5 years ago

Choosing a cloud DBMS: architectures and tradeoffs

Choosing a cloud DBMS: architectures and tradeoffs Tan et al., VLDB’19 If you’re moving an OLAP workload to the cloud (AWS in the context of this paper), what DBMS setup should you go with? T… | Continue reading


@blog.acolyer.org | 5 years ago

Snuba: Automating weak supervision to label training data

Snuba: automating weak supervision to label training data Varma & Ré, VLDB 2019 This week we’re moving on from ICML to start looking at some of the papers from VLDB 2019. VLDB is a huge confere… | Continue reading


@blog.acolyer.org | 5 years ago

Learning to prove theorems via interacting with proof assistants

Learning to prove theorems via interacting with proof assistants Yang & Deng, ICML’19 Something a little different to end the week: deep learning meets theorem proving! It’s been a while … | Continue reading


@blog.acolyer.org | 5 years ago

Statistical Foundations of Virtual Democracy

Statistical foundations of virtual democracy Kahng et al., ICML’19 This is another paper on the theme of combining information and making decisions in the face of noise and uncertainty – … | Continue reading


@blog.acolyer.org | 5 years ago

Robust Learning from Untrusted Sources

Robust learning from untrusted sources Konstantinov & Lampert, ICML’19 Welcome back to a new term of The Morning Paper! Just before the break we were looking at selected papers from ICML’… | Continue reading


@blog.acolyer.org | 5 years ago

Split-Level IO Scheduling

Split-Level IO Scheduling – Yang et al. 2015 The central idea in today’s paper is pretty simple: block-level I/O schedulers (the most common kind) lack the higher level information nece… | Continue reading


@blog.acolyer.org | 5 years ago

Meta Learning Bloom Filters

Meta-learning neural bloom filters Rae et al., ICML’19 Bloom filters are wonderful things, enabling us to quickly ask whether a given set could possibly contain a certain value. They produce … | Continue reading


@blog.acolyer.org | 5 years ago

Gray failure: the Achilles’ heel of cloud-scale systems (2017)

Gray failure: the Achilles’ heel of cloud-scale systems Huang et al., HotOS’17 If you’re going to fail, fail properly dammit! All this limping along in degraded mode, doing your b… | Continue reading


@blog.acolyer.org | 5 years ago

Challenging common assumptions in the unsupervised learning of disentangled

Challenging common assumptions in the unsupervised learning of disentangled representations Locatello et al., ICML’19 Today’s paper choice won a best paper award at ICML’19. The ‘common assum… | Continue reading


@blog.acolyer.org | 5 years ago

Data Shapley

Data Shapley: equitable valuation of data for machine learning Ghorbani & Zou et al., ICML’19 It’s incredibly difficult from afar to make sense of the almost 800 papers published at ICML … | Continue reading


@blog.acolyer.org | 5 years ago

Recursive Programming

Recursive Programming – Dijkstra 1960 * Updated link to one that is not behind a paywall – thanks to Graham Markall for the catch * This paper deals with something we take so much for g… | Continue reading


@blog.acolyer.org | 5 years ago

View-centric performance optimization for database-backed web applications

View-centric performance optimization for database-backed web applications Yang et al., ICSE 2019 The problem set-up in this paper discusses the importance of keeping web page load times low as a f… | Continue reading


@blog.acolyer.org | 5 years ago

Three key checklists and remedies for trustworthy analysis of online controlled

Three key checklists and remedies for trustworthy analysis of online controlled experiments at scale Fabijan et al., ICSE 2019 Last time out we looked at machine learning at Microsoft, where we lea… | Continue reading


@blog.acolyer.org | 5 years ago

Software engineering for machine learning: a case study

Software engineering for machine learning: a case study Amershi et al., ICSE’19 Previously on The Morning Paper we’ve looked at the spread of machine learning through Facebook and Google and … | Continue reading


@blog.acolyer.org | 5 years ago

Automating Chaos Experiments in Production

Automating chaos experiments in production Basiri et al., ICSE 2019 Are you ready to take your system assurance programme to the next level? This is a fascinating paper from members of Netflix’s Re… | Continue reading


@blog.acolyer.org | 5 years ago

One SQL to rule them all: an efficient and syntactically idiomatic approach to

One SQL to rule them all: an efficient and syntactically idiomatic approach to management of streams and tables Begoli et al., SIGMOD’19 In data processing it seems, all roads eventually lead… | Continue reading


@blog.acolyer.org | 5 years ago

The Convoy Phenomenon

The convoy phenomenon Blasgen et al., IBM Research Report 1977 (revised 1979) Today we’re jumping from HotOS topics of 2019, to hot topics of 1977! With thanks to Pat Helland for the recommendation… | Continue reading


@blog.acolyer.org | 5 years ago

Machine learning systems are stuck in a rut

Machine learning systems are stuck in a rut Barham & Isard, HotOS’19 In this paper we argue that systems for numerical computing are stuck in a local basin of performance and programmabil… | Continue reading


@blog.acolyer.org | 5 years ago

Designing far memory data structures: think outside the box

Designing far memory data structures: think outside the box Aguilera et al., HotOS’19 Last time out we looked at some of the trade-offs between RInKs and LInKs, and the advantages of local in… | Continue reading


@blog.acolyer.org | 5 years ago

Fast key-value stores: an idea whose time has come and gone

Fast key-value stores: an idea whose time has come and gone Adya et al., HotOS’19 No controversy here! Adya et al. would like you to stop using Memcached and Redis, and start building 11-fact… | Continue reading


@blog.acolyer.org | 5 years ago

What bugs cause production cloud incidents?

What bugs cause production cloud incidents? Liu et al., HotOS’19 Last time out we looked at SLOs for cloud platforms, today we’re looking at what causes them to be broken! This is a stu… | Continue reading


@blog.acolyer.org | 5 years ago

Nines are not enough: meaningful metrics for clouds

Nines are not enough: meaningful metrics for clouds Mogul & Wilkes, HotOS’19 It’s hard to define good SLOs, especially when outcomes aren’t fully under the control of any single party. Th… | Continue reading


@blog.acolyer.org | 5 years ago

Towards Multiverse Databases

Towards multiverse databases Marzoev et al., HotOS’19 A typical backing store for a web application contains data for many users. The application makes queries on behalf of an authenticated u… | Continue reading


@blog.acolyer.org | 5 years ago

A case for managed and model-less inference serving

A case for managed and model-less inference serving Yadwadkar et al., HotOS’19 HotOS’19 is presenting me with something of a problem as there are so many interesting looking papers in the pro… | Continue reading


@blog.acolyer.org | 5 years ago

Beyond data and model parallelism for deep neural networks

Beyond data and model parallelism for deep neural networks Jia et al., SysML’2019 I’m guessing the authors of this paper were spared some of the XML excesses of the late nineties and early no… | Continue reading


@blog.acolyer.org | 5 years ago

PyTorch-BigGraph: a large-scale graph embedding system

PyTorch-BigGraph: a large-scale graph embedding system Lerer et al., SysML’19 We looked at graph neural networks earlier this year, which operate directly over a graph structure. Via graph au… | Continue reading


@blog.acolyer.org | 5 years ago

Towards federated learning at scale: system design

Towards federated learning at scale: system design Bonawitz et al., SysML 2019 This is a high level paper describing Google’s production system for federated learning. One of the most interesting t… | Continue reading


@blog.acolyer.org | 5 years ago

Data Validation for Machine Learning

Data validation for machine learning Breck et al., SysML’19 Last time out we looked at continuous integration testing of machine learning models, but arguably even more important than the mod… | Continue reading


@blog.acolyer.org | 5 years ago

Continuous integration of machine learning models with ease.ml/ci

Continuous integration of machine learning models with ease.ml/ci: towards a rigorous yet practical treatment Renggli et al., SysML’19 Developing machine learning models is no different from … | Continue reading


@blog.acolyer.org | 5 years ago

In Search of an Understandable Consensus Algorithm

In Search of an Understandable Consensus Algorithm (Extended Edition) – Ongaro & Ousterhout 2014 This is part 9 of a ten part series on consensus and replication. Here’s something t… | Continue reading


@blog.acolyer.org | 5 years ago

A case for lease-based, utilitarian resource management on mobile devices

A case for lease-based, utilitarian resource management on mobile devices Hu et al., ASPLOS’19 I’ve chosen another energy-related paper to end the week, addressing a problem many people can r… | Continue reading


@blog.acolyer.org | 5 years ago

What’s wrong with Git? A conceptual design analysis (2016)

What’s wrong with Git? A conceptual design analysis De Rosso & Jackson Onward! 2013 We finished up last week talking about how to find good concepts / abstractions in a software desig… | Continue reading


@blog.acolyer.org | 5 years ago

Boosted race trees for low energy classification

Boosted race trees for low energy classification Tzimpragos et al., ASPLOS’19 We don’t talk about energy as often as we probably should on this blog, but it’s certainly true that our data cen… | Continue reading


@blog.acolyer.org | 5 years ago

CheriABI: Enforcing valid pointer provenance and minimizing pointer privilege in

CheriABI: enforcing valid pointer provenance and minimizing pointer privilege in the POSIX C run-time environment Davis et al., ASPLOS’19 Last week we saw the benefits of rethinking memory an… | Continue reading


@blog.acolyer.org | 5 years ago

Cloud programming simplified: a Berkeley view on serverless computing

Cloud programming simplified: a Berkeley view on serverless computing Jonas et al., arXiv 2019 With thanks to Eoin Brazil who first pointed this paper out to me via Twitter…. Ten years ago Berkeley… | Continue reading


@blog.acolyer.org | 5 years ago

Compress objects, not cache lines: an object-based compressed memory hierarchy

Compress objects, not cache lines: an object-based compressed memory hierarchy Tsai & Sanchez, ASPLOS’19 Last time out we saw how Google have been able to save millions of dollars through … | Continue reading


@blog.acolyer.org | 5 years ago

Software-defined far memory in warehouse scale computers

Software-defined far memory in warehouse-scale computers Lagar-Cavilla et al., ASPLOS’19 Memory (DRAM) remains comparatively expensive, while in-memory computing demands are growing rapidly. … | Continue reading


@blog.acolyer.org | 5 years ago

RPCValet: NI-driven tail-aware balancing of µs-scale RPCs

RPCValet: NI-driven tail-aware balancing of µs-scale RPCs Daglis et al., ASPLOS’19 Last week we learned about the increased tail-latency sensitivity of microservices based applications with … | Continue reading


@blog.acolyer.org | 5 years ago

Understanding real-world concurrency bugs in Go

Understanding real-world concurrency bugs in Go Tu, Liu et al., ASPLOS’19 The design of a programming (or data) model not only makes certain problems easier (or harder) to solve, but also mak… | Continue reading


@blog.acolyer.org | 5 years ago

Seer: Leveraging big data to navigate the complexity of performance debugging in

Seer: leveraging big data to navigate the complexity of performance debugging in cloud microservices Gan et al., ASPLOS’19 Last time around we looked at the DeathStarBench suite of microservi… | Continue reading


@blog.acolyer.org | 5 years ago

Open-source benchmark for microservices and their HW-SW effect for cloud+edge

An open-source benchmark suite for microservices and their hardware-software implications for cloud & edge systems Gan et al., ASPLOS’19 Microservices are well known for producing ‘death … | Continue reading


@blog.acolyer.org | 5 years ago

Distributed Consensus Revised – Part III

Distributed consensus revised (part III) Howard, PhD thesis With all the ground work laid, the second half of the thesis progressively generalises the Paxos algorithm: weakening the quorum intersec… | Continue reading


@blog.acolyer.org | 5 years ago

Distributed Consensus Revised – Part II

Distributed consensus revised (part II) Howard, PhD thesis In today’s post we’re going to be looking at chapter 3 of Dr Howard’s thesis, which is a tour (“systematisation of knowledge”,… | Continue reading


@blog.acolyer.org | 5 years ago

Distributed consensus revised – Part I

Distributed consensus revised Howard, PhD thesis Welcome back to a new term of The Morning Paper! To kick things off, I’m going to start by taking a look at Dr Howard’s PhD thesis, ‘Distributed con… | Continue reading


@blog.acolyer.org | 5 years ago

End of Term

We’ve reached the end of term again on The Morning Paper, and I’ll be taking a two week break. The Morning Paper will resume on Tuesday 7th May (since Monday 6th is a public holiday in … | Continue reading


@blog.acolyer.org | 5 years ago

Keeping Master Green at Scale

Keeping master green at scale Ananthanarayanan et al., EuroSys’19 This paper provides a fascinating look at a key part of Uber’s software delivery machine. With a monorepo, and many thousands… | Continue reading


@blog.acolyer.org | 5 years ago