The Significance Filter, the Winner's Curse and the Need to Shrink

The "significance filter" refers to focusing exclusively on statistically significant results. Since frequentist properties such as unbiasedness and coverage are valid only before the data have... | Continue reading | 3 hours ago

Avoiding Side Effects by Considering Future Tasks

Designing reward functions is difficult: the designer has to specify what to do (what it means to complete the task) as well as what not to do (side effects that should be avoided while completing... | Continue reading | 7 hours ago

AI ethics: race and gender – Timnit Gebru (2019)

From massive face-recognition-based surveillance and machine-learning-based decision systems predicting crime recidivism rates, to the move towards automated health diagnostic systems, artificial... | Continue reading | 1 day ago

Every natural number is the sum of (at most) forty-nine palindromes

It is shown that the set of decimal palindromes is an additive basis for the natural numbers. Specifically, we prove that every natural number can be expressed as the sum of forty-nine (possibly... | Continue reading | 1 day ago

Every Model Learned by Gradient Descent Is Approximately a Kernel Machine

Deep learning's successes are often attributed to its ability to automatically discover new representations of the data, rather than relying on handcrafted features like other learning methods. We... | Continue reading | 2 days ago

Who Is Debugging the Debuggers? Exposing Debug Info. Bugs in Optimized Binaries

Despite the advancements in software testing, bugs still plague deployed software and result in crashes in production. When debugging issues -- sometimes caused by "heisenbugs" -- there is the... | Continue reading | 2 days ago

DNS spoofing has more than doubled in less than seven years

DNS is important in nearly all interactions on the Internet. All large DNS operators use IP anycast, announcing servers in BGP from multiple physical locations to reduce client latency and provide... | Continue reading | 3 days ago

Measuring and Preventing Supply Chain Attacks on Package Managers

Package managers have become a vital part of the modern software development process. They allow developers to reuse third-party code, share their own code, minimize their codebase, and simplify... | Continue reading | 5 days ago

Centrifugal melt spinning for manufacture of N95 filtering facepiece respirators

The COVID-19 pandemic has caused a global shortage of personal protective equipment. While existing supply chains are struggling to meet the surge in demand, the limited supply of N95 filtering... | Continue reading | 6 days ago

Coding Guidelines for Prolog (2009)

Coding standards and good practices are fundamental to a disciplined approach to software projects, whatever programming languages they employ. Prolog programming can benefit from such an... | Continue reading | 6 days ago

Equivariant Neural Rendering

We propose a framework for learning neural scene representations directly from images, without 3D supervision. Our key insight is that 3D structure can be imposed by ensuring that the learned... | Continue reading | 6 days ago

The Relevance of Classic Fuzz Testing: Have We Solved This One?

As fuzz testing has passed its 30th anniversary, and in the face of the incredible progress in fuzz testing techniques and tools, the question arises if the classic, basic fuzz technique is still... | Continue reading | 7 days ago

Indirection Stream Semantic Reg. Arch. for Efficient Sparse-Dense Linear Algebra

Sparse-dense linear algebra is crucial in many domains, but challenging to handle efficiently on CPUs, GPUs, and accelerators alike; multiplications with sparse formats like CSR and CSF require... | Continue reading | 7 days ago

An Introduction to Geometric Algebra

This is an introduction to geometric algebra, an alternative to traditional vector algebra that expands on it in two ways: 1. In addition to scalars and vectors, it defines new objects... | Continue reading | 7 days ago

Piuma: Programmable Integrated Unified Memory Architecture

High performance large scale graph analytics is essential to timely analyze relationships in big data sets. Conventional processor architectures suffer from inefficient resource usage and bad... | Continue reading | 8 days ago

A Modern Compiler for the French Tax Code

In France, income tax is computed from taxpayers' individual returns, using an algorithm that is authored, designed and maintained by the French Public Finances Directorate (DGFiP). This algorithm... | Continue reading | 9 days ago

First Observational Tests of Eternal Inflation

The eternal inflation scenario predicts that our observable universe resides inside a single bubble embedded in a vast inflating multiverse. We present the first observational tests of eternal... | Continue reading | 10 days ago

GuessTheMusic: Song Identification from EEG Response

The music signal comprises of different features like rhythm, timbre, melody, harmony. Its impact on the human brain has been an exciting research topic for the past several decades.... | Continue reading | 10 days ago

A Theoretical Computer Science Perspective on Consciousness [pdf]

Continue reading | 10 days ago

Invariant Variation Problems Emmy Noether, M. A. Tavel[Trans]

The problems in variation here concerned are such as to admit a continuous group (in Lie's sense); the conclusions that emerge from the corresponding differential equations find their most general... | Continue reading | 12 days ago

Game Plan: What AI Can Do for Football, and What Football Can Do for AI

The rapid progress in artificial intelligence (AI) and machine learning has opened unprecedented analytics possibilities in various team and individual sports, including baseball, basketball, and... | Continue reading | 12 days ago

A Novel Framework for Explaining Machine Learning Using Shapley Values

A number of techniques have been proposed to explain a machine learning model's prediction by attributing it to the corresponding input features. Popular among these are techniques that apply the... | Continue reading | 12 days ago

Redistricting Algorithms

Why not have a computer just draw a map? This is something you hear a lot when people talk about gerrymandering, and it's easy to think at first that this could solve redistricting altogether. But... | Continue reading | 12 days ago

A Stochastic Derivation of Classical and Quantum Mechanics

We derive the classical Hamilton-Jacobi equation from first principles as the natural description for smooth stochastic processes when one neglects stochastic velocity fluctuations. The... | Continue reading | 12 days ago

Survey of System Architectures and Techniques for FPGA Virtualization

FPGA accelerators are gaining increasing attention in both cloud and edge computing because of their hardware flexibility, high computational throughput, and low power consumption. However, the... | Continue reading | 12 days ago

Challenges in Deploying Machine Learning: A Survey of Case Studies

In recent years, machine learning has received increased interest both as an academic research field and as a solution for real-world business problems. However, the deployment of machine learning... | Continue reading | 13 days ago

Neural Abstract Reasoner

Abstract reasoning and logic inference are difficult problems for neural networks, yet essential to their applicability in highly structured domains. In this work we demonstrate that a well known... | Continue reading | 14 days ago

Screen Gleaning: A Screen Reading Tempest Attack on Mobile Devices

We introduce screen gleaning, a TEMPEST attack in which the screen of a mobile device is read without a visual line of sight, revealing sensitive information displayed on the phone screen. The... | Continue reading | 14 days ago

ZORB: A Derivative-Free Backpropagation Algorithm for Neural Networks

Gradient descent and backpropagation have enabled neural networks to achieve remarkable results in many real-world applications. Despite ongoing success, training a neural network with gradient... | Continue reading | 14 days ago

A Comprehensive Formal Security Analysis of OAuth 2.0 (2016)

The OAuth 2.0 protocol is one of the most widely deployed authorization/single sign-on (SSO) protocols and also serves as the foundation for the new SSO standard OpenID Connect. Despite the... | Continue reading | 14 days ago

The Semantics of Rank Polymorphism

Iverson's APL and its descendants (such as J, K and FISh) are examples of the family of "rank-polymorphic" programming languages. The principal control mechanism of such languages is the general... | Continue reading | 15 days ago

FSD50K: An Open Dataset of Human-Labeled Sound Events

Most existing datasets for sound event recognition (SER) are relatively small and/or domain-specific, with the exception of AudioSet, based on a massive amount of audio tracks from YouTube videos... | Continue reading | 15 days ago

Sketch and Scale: Geo-Distributed TSNE and UMAP

Running machine learning analytics over geographically distributed datasets is a rapidly arising problem in the world of data management policies ensuring privacy and data security. Visualizing... | Continue reading | 16 days ago

The Usability of Ownership

Ownership is the concept of tracking aliases and mutations to data, useful for both memory safety and system design. The Rust programming language implements ownership via the borrow checker, a... | Continue reading | 16 days ago

Improving seasonal forecast using probabilistic deep learning

The path toward realizing the potential of seasonal forecasting and its socioeconomic benefits depends heavily on improving general circulation model based dynamical forecasting systems. To... | Continue reading | 19 days ago

Gamification Affects Software Developers, a Thesis on GitHub Streaks [pdf]

We examine how the behavior of software developers changes in response to removing gamification elements from GitHub, an online platform for collaborative programming and software development. We... | Continue reading | 20 days ago

Jelly Bean World: An Environment for Never-Ending Learning

Machine learning has shown growing success in recent years. However, current machine learning systems are highly specialized, trained for particular problems or domains, and typically on a single... | Continue reading | 21 days ago

Graph Kernels: State-of-the-Art and Future Challenges

Graph-structured data are an integral part of many application domains, including chemoinformatics, computational biology, neuroimaging, and social network analysis. Over the last two decades,... | Continue reading | 22 days ago

A Neural Scaling Law from the Dimension of the Data Manifold

When data is plentiful, the loss achieved by well-trained neural networks scales as a power-law $L \propto N^{-\alpha}$ in the number of network parameters $N$. This empirical scaling law holds... | Continue reading | 22 days ago

Current Challenges and New Directions in Sentiment Analysis Research

Sentiment analysis as a field has come a long way since it was first introduced as a task nearly 20 years ago. It has widespread commercial applications in various domains like marketing, risk... | Continue reading | 22 days ago

Learning Autocompletion from Real-World Datasets

Code completion is a popular software development tool integrated into all major IDEs. Many neural language models have achieved promising results in completion suggestion prediction on synthetic... | Continue reading | 23 days ago

The fractal dimension of the Appalachian Trail

The Appalachian Trail (AT) is a 2193-mile-long hiking trail in the eastern United States. The trail has many bends and turns at different length scales, which gives it a nontrivial fractal... | Continue reading | 24 days ago

Large-scale multilingual audio visual dubbing

We describe a system for large-scale audiovisual translation and dubbing, which translates videos from one language to another. The source language's speech content is transcribed to text,... | Continue reading | 25 days ago

Training EfficientNets at Scale: 83% ImageNet Top-1 Accuracy in One Hour

EfficientNets are a family of state-of-the-art image classification models based on efficiently scaled convolutional neural networks. Currently, EfficientNets can take on the order of days to... | Continue reading | 25 days ago

On generalization problems of ML models in real-world applications

ML models often exhibit unexpectedly poor behavior when they are deployed in real-world domains. We identify underspecification as a key reason for these failures. An ML pipeline is underspecified... | Continue reading | 25 days ago

Runtime vs. Scheduler: Analyzing Dask’s Overheads

Dask is a distributed task framework which is commonly used by data scientists to parallelize Python code on computing clusters with little programming effort. It uses a sophisticated... | Continue reading | 25 days ago

StealthDB: A Scalable Encrypted Database with Full SQL Query Support [pdf]

Encrypted database systems provide a great method for protecting sensitive data in untrusted infrastructures. These systems are built using either special-purpose cryptographic algorithms that... | Continue reading | 25 days ago

Restoration of Fragmentary Babylonian Texts Using Recurrent Neural Networks

The main source of information regarding ancient Mesopotamian history and culture are clay cuneiform tablets. Despite being an invaluable resource, many tablets are fragmented leading to missing... | Continue reading | 1 month ago