Device Tracking via Linux TCP Source Port Selection Algorithm

We describe a tracking technique for Linux devices, exploiting a new TCP source port generation mechanism recently introduced to the Linux kernel. This mechanism is based on an algorithm, standardized in RFC 6056, for boosting security by better randomizing port selection. Our techn …


@arxiv.org | 1 year ago

On-Device Training Under 256KB Memory

On-device training enables the model to adapt to new data collected from the sensors by fine-tuning a pre-trained model. However, the training memory consumption is prohibitive for IoT devices that have tiny memory resources. We propose an algorithm-system co-design framework to mak …


@arxiv.org | 1 year ago

Predicting the Future of AI with AI: prediction in a growing knowledge network

A tool that could suggest new personalized research directions and ideas by taking insights from the scientific literature could significantly accelerate the progress of science. A field that might benefit from such an approach is artificial intelligence (AI) research, where the num …


@arxiv.org | 1 year ago

It's time to replace TCP in the Datacenter

In spite of its long and successful history, TCP is a poor transport protocol for modern datacenters. Every significant element of TCP, from its stream orientation to its expectation of in-order packet delivery, is wrong for the datacenter. It is time to recognize that TCP's problem …


@arxiv.org | 1 year ago

Text to Human Motion

Natural and expressive human motion generation is the holy grail of computer animation. It is a challenging task, due to the diversity of possible motion, human perceptual sensitivity to it, and the difficulty of accurately describing it. Therefore, current generative solutions are …


@arxiv.org | 1 year ago

How Much Structure Is Needed for Quantum Speedups?

I survey, for a general scientific audience, three decades of research into which sorts of problems admit exponential speedups via quantum computers -- from the classics (like the algorithms of Simon and Shor), to the breakthrough of Yamakawa and Zhandry from April 2022. I discuss b …


@arxiv.org | 1 year ago

The LSST DESC DC2 Simulated Sky Survey

We describe the simulated sky survey underlying the second data challenge (DC2) carried out in preparation for analysis of the Vera C. Rubin Observatory Legacy Survey of Space and Time (LSST) by the LSST Dark Energy Science Collaboration (LSST DESC). Significant connections across m …


@arxiv.org | 1 year ago

Conflicting Privacy Preference Signals in the Wild (2021)

Privacy preference signals allow users to express preferences over how their personal data is processed. These signals become important in determining privacy outcomes when they reference an enforceable legal basis, as is the case with recent signals such as the Global Privacy Contr …


@arxiv.org | 1 year ago

Saving Days of ImageNet and BERT Training with Latest Weight Averaging

Training vision or language models on large datasets can take days, if not weeks. We show that averaging the weights of the k latest checkpoints, each collected at the end of an epoch, can speed up the training progression in terms of loss and accuracy by dozens of epochs, correspon …


@arxiv.org | 1 year ago

Adapting Kubernetes controllers to the edge – Control planes using WASM and WASI

Kubernetes' high resource requirements hamper its adoption in constrained environments such as the edge and fog. Its extensible control plane is a significant contributor to this, consisting of long-lived processes called "controllers" that constantly listen for state changes and us …


@arxiv.org | 1 year ago

Co-writing longform narratives with Dramatron

Language models are increasingly attracting interest from writers. However, such models lack long-range semantic coherence, limiting their usefulness for longform creative writing. We address this limitation by applying language models hierarchically, in a system we call Dramatron. …


@arxiv.org | 1 year ago

Project Lyra: Sending a Spacecraft to 1I/’Oumuamua [pdf]

The first definitely interstellar object 1I/'Oumuamua (previously A/2017 U1) observed in our solar system provides the opportunity to directly study material from other star systems. Can such objects be intercepted? The challenge of reaching the object within a reasonable timeframe …


@arxiv.org | 1 year ago

Perils of Privacy Exposure Through Reverse DNS

Given the importance of privacy, many Internet protocols are nowadays designed with privacy in mind (e.g., using TLS for confidentiality). Foreseeing all privacy issues at the time of protocol design is, however, challenging and may become near impossible when interaction out of pro …


@arxiv.org | 1 year ago

Make-a-Video: Text-to-Video Generation Without Text-Video Data

We propose Make-A-Video -- an approach for directly translating the tremendous recent progress in Text-to-Image (T2I) generation to Text-to-Video (T2V). Our intuition is simple: learn what the world looks like and how it is described from paired text-image data, and learn how the wo …


@arxiv.org | 1 year ago

Provably efficient ML for Quantum many-body problems

Classical machine learning (ML) provides a potentially powerful approach to solving challenging quantum many-body problems in physics and chemistry. However, the advantages of ML over more traditional methods have not been firmly established. In this work, we prove that classical ML …


@arxiv.org | 1 year ago

Journey of Migrating Millions of Queries on the Cloud

Treasure Data is processing millions of distributed SQL queries every day on the cloud. Upgrading the query engine service at this scale is challenging because we need to migrate all of the production queries of the customers to a new version while preserving the correctness and per …


@arxiv.org | 1 year ago

Deep Learning the Functional Renormalization Group

We perform a data-driven dimensionality reduction of the scale-dependent 4-point vertex function characterizing the functional Renormalization Group (fRG) flow for the widely studied two-dimensional $t - t'$ Hubbard model on the square lattice. We demonstrate that a deep learning ar …


@arxiv.org | 1 year ago

Building Flexible, Low-Cost Wireless Access Networks with Magma

Billions of people remain without Internet access due to availability or affordability of service. In this paper, we present Magma, an open and flexible system for building low-cost wireless access networks. Magma aims to connect users where operator economics are difficult due to i …


@arxiv.org | 1 year ago

Catoptric Light Can Be Dangerous: Physical-World Attack by Natural Phenomenon

Deep neural networks (DNNs) have achieved great success in many tasks. Therefore, it is crucial to evaluate the robustness of advanced DNNs. The traditional methods use stickers as physical perturbations to fool the classifiers, which is difficult to achieve stealthiness and there e …


@arxiv.org | 1 year ago

Towards Faithful Model Explanation in NLP: A Survey

End-to-end neural NLP architectures are notoriously difficult to understand, which gives rise to numerous efforts towards model explainability in recent years. An essential principle of model explanation is Faithfulness, i.e., an explanation should accurately represent the reasoning …


@arxiv.org | 1 year ago

Katara: Synthesizing CRDTs with Verified Lifting

Conflict-free replicated data types (CRDTs) are a promising tool for designing scalable, coordination-free distributed systems. However, constructing correct CRDTs is difficult, posing a challenge for even seasoned developers. As a result, CRDT development is still largely the domai …


@arxiv.org | 1 year ago

Distributed Execution Indexing

This work-in-progress report presents both the design and partial evaluation of distributed execution indexing, a technique for microservice applications that precisely identifies dynamic instances of inter-service remote procedure calls (RPCs). Such an indexing scheme is critical f …


@arxiv.org | 1 year ago

Is Rust C++-fast? Benchmarking system languages on everyday routines

Rust is a relatively new system programming language that has been experiencing a rapid adoption in the past 10 years. Rust incorporates a memory ownership model enforced at a compile time. Since this model involves zero runtime overhead, programs written in Rust are not only memory …


@arxiv.org | 1 year ago

Using Inertial Sensors for Position and Orientation Estimation

In recent years, MEMS inertial sensors (3D accelerometers and 3D gyroscopes) have become widely available due to their small size and low cost. Inertial sensor measurements are obtained at high sampling rates and can be integrated to obtain position and orientation information. Thes …


@arxiv.org | 1 year ago

Textual Screen Peeking via Eyeglass Reflections in Video Conferencing

Using mathematical modeling and human subjects experiments, this research explores the extent to which emerging webcams might leak recognizable textual and graphical information gleaming from eyeglass reflections captured by webcams. The primary goal of our work is to measure, compu …


@arxiv.org | 1 year ago

Human-level Atari 200x faster

The task of building general agents that perform well over a wide range of tasks has been an important goal in reinforcement learning since its inception. The problem has been subject of research of a large body of work, with performance frequently measured by observing scores over …


@arxiv.org | 1 year ago

Memory Tagging: A Memory Efficient Design

ARM recently introduced a security feature called Memory Tagging Extension or MTE, which is designed to defend against common memory safety vulnerabilities, such as buffer overflow and use after free. In this paper, we examine three aspects of MTE. First, we survey how modern softwa …


@arxiv.org | 1 year ago

IStandWithPutin vs. IStandWithUkraine: The interactions of bots and humans

The 2022 Russian invasion of Ukraine emphasises the role social media plays in modern-day warfare, with conflict occurring in both the physical and information environments. There is a large body of work on identifying malicious cyber-activity, but less focusing on the effect this a …


@arxiv.org | 1 year ago

Digital Traces of Brain Drain: Developers During the Russian Invasion of Ukraine

The Russian invasion of Ukraine has sparked renewed interest in the phenomenon of brain drain: the exodus of human capital out of countries. Yet quantifying brain drain, especially in real time during crisis situations, remains difficult. This hinders our ability to understand its d …


@arxiv.org | 1 year ago

Extended Wigner's friend and internal consistency of standard quantum mechanics

The extended Wigner's friend problem deals with two Observers each measuring a sealed laboratory in which a friend is making a quantum measurement. We investigate this problem by relying on the basic rules of quantum mechanics as exposed by Feynman in the well-known "Feynman Lecture …


@arxiv.org | 1 year ago

Teardown and feasibility study of IronKey, most secure USB Flash drive (2021)

There are many solutions for protecting user data on USB Flash drives. However, the family of IronKey devices was designed with the highest security expectations. They are definitely standing above others by being certified to FIPS 140-2 Level 3 and also claimed as certified by NATO …


@arxiv.org | 1 year ago

Fairness in Ranking: A Survey

In the past few years, there has been much work on incorporating fairness requirements into algorithmic rankers, with contributions coming from the data management, algorithms, information retrieval, and recommender systems communities. In this survey we give a systematic overview o …


@arxiv.org | 1 year ago

How to Write Beautiful Process-and-Data-Science Papers

After 25 years of PhD supervision, the author noted typical recurring problems that make papers look sloppy, difficult to read, and incoherent. The goal is not to write a paper for the sake of writing a paper, but to convey a valuable message that is clear and precise. The goal is t …


@arxiv.org | 1 year ago

Language Models Can Teach Themselves to Program Better

This work shows how one can use large-scale language models (LMs) to synthesize programming problems with verified solutions, in the form of programming puzzles, which can then in turn be used to fine-tune those same models, improving their performance. This work builds on two recen …


@arxiv.org | 1 year ago

Metamaterial-based model of the Alcubierre warp drive

Electromagnetic metamaterials are capable of emulating many exotic space-time geometries, such as black holes, rotating cosmic strings, and the big bang singularity. Here we present a metamaterial-based model of the Alcubierre warp drive, and study its limitations due to available r …


@arxiv.org | 1 year ago

On the Paradox of Learning to Reason from Data

Logical reasoning is needed in a wide range of NLP tasks. Can a BERT model be trained end-to-end to solve logical reasoning problems presented in natural language? We attempt to answer this question in a confined problem space where there exists a set of parameters that perfectly si …


@arxiv.org | 1 year ago

The dangers of non-empirical confirmation (2016)

In the book "String Theory and the Scientific Method", Richard Dawid describes a few of the many non-empirical arguments that motivate theoretical physicists' confidence in a theory, taking string theory as case study. I argue that excessive reliance on non-empirical evidence compro …


@arxiv.org | 1 year ago

Anomalies in Physical Cosmology

The $Λ$CDM cosmology passes demanding tests that establish it as a good approximation to reality. The theory is incomplete, of course, and open issues are being examined in active research programs. I offer a review of less widely discussed anomalies that might also point to hints t …


@arxiv.org | 1 year ago

Design and Evaluation of IPFS: A Storage Layer for the Decentralized Web

Recent years have witnessed growing consolidation of web operations. For example, the majority of web traffic now originates from a few organizations, and even micro-websites often choose to host on large pre-existing cloud infrastructures. In response to this, the "Decentralized We …


@arxiv.org | 1 year ago

Exploring GPU Stream-Aware Message Passing Using Triggered Operations

Modern heterogeneous supercomputing systems are comprised of compute blades that offer CPUs and GPUs. On such systems, it is essential to move data efficiently between these different compute engines across a high-speed network. While current generation scientific applications and s …


@arxiv.org | 1 year ago

A Variational Quantum Attack for AES-Like Symmetric Cryptography

We propose a variational quantum attack algorithm (VQAA) for classical AES-like symmetric cryptography, as exemplified by the simplified-data encryption standard (S-DES). In the VQAA, the known ciphertext is encoded as the ground state of a Hamiltonian that is constructed through a reg …


@arxiv.org | 1 year ago

Confusion in the Church-Turing Thesis

The Church-Turing Thesis confuses numerical computations with symbolic computations. In particular, any model of computability in which equality is not definable, such as the lambda-models underpinning higher-order programming languages, is not equivalent to the Turing model. Howeve …


@arxiv.org | 1 year ago

Programmable large-scale simulation of bosonic transport in optical

Photonic simulators using synthetic frequency dimensions have enabled flexible experimental analogues of condensed-matter systems, realizing phenomena that are impractical to observe in real-space systems. However, to date such photonic simulators have been limited to small systems …


@arxiv.org | 1 year ago

Understanding User Awareness and Behaviors Concerning Encrypted DNS Settings

Recent developments to encrypt the Domain Name System (DNS) have resulted in major browser and operating system vendors deploying encrypted DNS functionality, often enabling various configurations and settings by default. In many cases, default encrypted DNS settings have implicatio …


@arxiv.org | 1 year ago

A direct empirical proof of the existence of dark matter

We present new weak lensing observations of 1E0657-558 (z=0.296), a unique cluster merger, that enable a direct detection of dark matter, independent of assumptions regarding the nature of the gravitational force law. Due to the collision of two clusters, the dissipationless stellar …


@arxiv.org | 1 year ago

The Rise of GitHub in Scholarly Publications

The definition of scholarly content has expanded to include the data and source code that contribute to a publication. While major archiving efforts to preserve conventional scholarly content, typically in PDFs (e.g., LOCKSS, CLOCKSS, Portico), are underway, no analogous effort has …


@arxiv.org | 1 year ago

Mary Kenneth Keller: First US PhD in Computer Science [pdf]

The first two doctoral-level degrees in Computer Science in the US were awarded in June 1965. This paper discusses one of the degree recipients, Sister Mary Kenneth Keller, BVM.


@arxiv.org | 1 year ago

Meaning without reference in large language models

The widespread success of large language models (LLMs) has been met with skepticism that they possess anything like human concepts or meanings. Contrary to claims that LLMs possess no meaning whatsoever, we argue that they likely capture important aspects of meaning, and moreover wo …


@arxiv.org | 1 year ago