Sharding the shards: managing datastore locality at scale with Akkio Annamalai et al., OSDI’18 In Harry Potter, the Accio Summoning Charm summons an object to the caster of the spell, sometim… | Continue reading
an interesting/influential/important paper from the world of CS every weekday morning, as selected by Adrian Colyer | Continue reading
The FuzzyLog: a partially ordered shared log Lockerman et al., OSDI’18 If you want to build a distributed system then having a distributed shared log as an abstraction to build upon — one tha… | Continue reading
Moment-based quantile sketches for efficient high cardinality aggregation queries Gan et al., VLDB’18 Today we’re temporarily pausing our tour through some of the OSDI’18 papers in order to l… | Continue reading
Out of the Tar Pit – Moseley & Marks 2006 This is the final Desert Island Paper choice from Jonas Bonér, and a great way to round out the week. ‘Out of the Tar Pit’ was the 10… | Continue reading
Noria: dynamic, partially-stateful data-flow for high-performance web applications Gjengset, Schwarzkopf et al., OSDI’18 I have way more margin notes for this paper than I typically do, and t… | Continue reading
RobinHood: tail latency aware caching – dynamic reallocation from cache-rich to cache-poor Berger et al., OSDI’18 It’s time to rethink everything you thought you knew about caching! My … | Continue reading
Maelstrom: mitigating datacenter-level disasters by draining interdependent traffic safely and efficiently Veeraraghavan et al., OSDI’18 Here’s a really valuable paper detailing four plus yea… | Continue reading
LegoOS: a disseminated, distributed OS for hardware resource disaggregation Shan et al., OSDI’18 One of the interesting trends in hardware is the proliferation and importance of dedicated acc… | Continue reading
Orca: differential bug localization in large-scale services Bhagwan et al., OSDI’18 Earlier this week we looked at REPT, the reverse debugging tool deployed live in the Windows Error Reportin… | Continue reading
REPT: reverse debugging of failures in deployed software Cui et al., OSDI’18 REPT (‘repeat’) won a best paper award at OSDI’18 this month. It addresses the problem of debugging crashes in pro… | Continue reading
Capturing and enhancing in situ system observability for failure detection Huang et al., OSDI’18 The central idea in this paper is simple and brilliant. The place where we have the most relev… | Continue reading
Is Sound Gradual Typing Dead? – Takikawa et al. 2016 Last year we looked at the notion of gradual typing in an ECOOP 2015 paper by Takikawa et al. based on TypedRacket. Today’s choice f… | Continue reading
Automatic discovery of tactics in spatio-temporal soccer match data Decroos et al., KDD’18 Here’s a fun paper to end the week. Data collection from sporting events is now widespread. This fue… | Continue reading
Detecting spacecraft anomalies using LSTMs and nonparametric dynamic thresholding Hundman et al., KDD’18 How do you effectively monitor a spacecraft? That was the question facing NASA’s Jet P… | Continue reading
Online parameter selection for web-based ranking problems Agarwal et al., KDD’18 Last week we looked at production systems from Facebook, Airbnb, and Snap Inc., today it’s the turn of Linke… | Continue reading
I know you’ll be back: interpretable new user clustering and churn prediction on a mobile social application Yang et al., KDD’18 Churn rates (how fast users abandon your app / service) are re… | Continue reading
Customized regression model for Airbnb dynamic pricing Ye et al., KDD’18 This paper details the methods that Airbnb use to suggest prices to listing hosts (hosts ultimately remain in control … | Continue reading
Rosetta: large scale system for text detection and recognition in images Borisyuk et al., KDD’18 Rosetta is Facebook’s production system for extracting text (OCR) from uploaded images. In the… | Continue reading
Columnstore and B+ tree – are hybrid physical designs important? Dziedzic et al., SIGMOD’18 Earlier this week we looked at the design of column stores and their advantages for analytic … | Continue reading
The design and implementation of modern column-oriented database systems Abadi et al., Foundations and trends in databases, 2012 I came here by following the references in the Smoke paper we looked… | Continue reading
Smoke: fine-grained lineage at interactive speed Psallidas et al., VLDB’18 Data lineage connects the input and output data items of a computation. Given a set of output records, a backward li… | Continue reading
Same-different problems strain convolutional neural networks Ricci et al., arXiv 2018 Since we’ve been looking at the idea of adding structured representations and relational reasoning to deep lear… | Continue reading
Relational inductive biases, deep learning, and graph networks Battaglia et al., arXiv’18 Earlier this week we saw the argument that causal reasoning (where most of the interesting questions … | Continue reading
Why Functional Programming Matters John Hughes, Research Topics in Functional Programming, 1990 (based on an earlier Computer Journal paper that appeared in 1989). 1989/1990 must have been a fairly… | Continue reading
The seven tools of causal inference with reflections on machine learning Pearl, CACM 2018 With thanks to @osmandros for sending me a link to this paper on twitter. In this technical report Judea Pe… | Continue reading
An empirical analysis of anonymity in Zcash Kappos et al., USENIX Security’18 As we’ve seen before, in practice Bitcoin offers little in the way of anonymity. Zcash on the other hand was care… | Continue reading
QSYM: a practical concolic execution engine tailored for hybrid fuzzing Yun et al., USENIX Security 2018 There are two main approaches to automated test case generation for uncovering bugs and vulne… | Continue reading
Unveiling and quantifying Facebook exploitation of sensitive personal data for advertising purposes Cabañas et al., USENIX Security 2018 Earlier this week we saw how the determined can still bypas… | Continue reading
Who left open the cookie jar? A comprehensive evaluation of third-party cookie policies Franken et al., USENIX Security 2018 This paper won a ‘Distinguished paper’ award at USENIX Security… | Continue reading
Fear the reaper: characterization and fast detection of card skimmers Scaife et al., USENIX Security 2018 Until I can get my hands on a Skim Reaper I’m not sure I’ll ever trust an ATM or other expo… | Continue reading
STTR: A system for tracking all vehicles all the time at the edge of the network Xu et al., DEBS’18 With apologies for only bringing you two paper write-ups this week: we moved house, which t… | Continue reading
Learning the structure of generative models without labeled data Bach et al., ICML’17 For the last couple of posts we’ve been looking at Snorkel and BabbleLabble which both depend on data pro… | Continue reading
Training classifiers with natural language explanations Hancock et al., ACL’18 We looked at Snorkel earlier this week, which demonstrates that maybe AI isn’t going to take over all of our pro… | Continue reading
Snorkel: rapid training data creation with weak supervision Ratner et al., VLDB’18 Earlier this week we looked at Sparser, which comes from the Stanford Dawn project, “a five-year resea… | Continue reading
Filter before you parse: faster analytics on raw data with Sparser Palkar et al., VLDB’18 We’ve been parsing JSON for over 15 years. So it’s surprising and wonderful that with a fresh look at… | Continue reading
Fairness without demographics in repeated loss minimization Hashimoto et al., ICML’18 When we train machine learning models and optimise for average loss it is possible to obtain systems with… | Continue reading
Human-Robot Teaming for Rescue Missions: Team ViGIR’s Approach to the 2013 DARPA Robotics Challenge Trials – Kohlbrecher et al. 2014 Yesterday we looked at ROS, the Robot Operating Syst… | Continue reading
Obfuscated gradients give a false sense of security: circumventing defenses to adversarial examples Athalye et al., ICML’18 There has been a lot of back and forth in the research community on… | Continue reading
Delayed impact of fair machine learning Liu et al., ICML’18 “Delayed impact of fair machine learning” won a best paper award at ICML this year. It’s not an easy read (at least it … | Continue reading
Bounding data races in space and time Dolan et al., PLDI’18 Yesterday we looked at the case for memory models supporting local data-race-freedom (local DRF). In today’s post we’ll push deeper… | Continue reading
Bounding data races in space and time Dolan et al., PLDI’18 Are you happy with your programming language’s memory model? In this beautifully written paper, Dolan et al. point out some of the … | Continue reading
HHVM JIT: A profile-guided, region-based compiler for PHP and Hack Ottoni, PLDI’18 HHVM is a virtual machine for PHP and Hack (a PHP extension) which is used to power Facebook’s website among… | Continue reading