This paper presents a methodology for using LLVM-based tools to tune the DCA++ (dynamical cluster approximation) application that targets the new ARM A64FX processor. The goal is to describe the changes required for the new architecture and generate efficient single instruction/multip …
Neural network training and validation rely on the availability of large high-quality datasets. However, in many cases only incomplete datasets are available, particularly in health care applications, where each patient typically undergoes different clinical procedures or can drop o …
Python has become the de facto language for scientific computing. Programming in Python is highly productive, mainly due to its rich science-oriented software ecosystem built around the NumPy module. As a result, the demand for Python support in High Performance Computing (HPC) has …
Securing a secret master key is a non-trivial task; we even argue it is impossible to fully secure it. Hence, we must make it as difficult as possible for any powerful adversary to steal or use the key. We introduce the reader to interesting cryptography which is starting to get more …
Reinforcement learning (RL) is typically concerned with estimating single-step policies or single-step models, leveraging the Markov property to factorize the problem in time. However, we can also view RL as a sequence modeling problem, with the goal being to predict a sequence of a …
Python has become a popular programming language because of its excellent programmability. Many modern software packages utilize Python for high-level algorithm design and depend on native libraries written in C/C++/Fortran for efficient computation kernels. Interaction between Pyth …
Latest ARM processors are approaching the computational power of x86 architectures while consuming much less energy. Consequently, supply follows demand with Amazon EC2, Equinix Metal and Microsoft Azure offering ARM-based instances, while Oracle Cloud Infrastructure is about to add …
We design multi-horizon forecasting models for limit order book (LOB) data by using deep learning techniques. Unlike standard structures where a single prediction is made, we adopt encoder-decoder models with sequence-to-sequence and Attention mechanisms, to generate a forecasting p …
This paper gives a brief description of the author's database of integer sequences, now over 35 years old, together with a selection of a few of the most interesting sequences in the table. Many unsolved problems are mentioned.
Open-source, Decentralized Online Social Networks (DOSNs) are emerging as alternatives to the popular yet centralized and profit-driven platforms like Facebook or Twitter. In DOSNs, users can set up their own server, or instance, while they can actually interact with users of other …
In this essay, I argue that mathematics is a natural science---just like physics, chemistry, or biology---and that this can explain the alleged "unreasonable" effectiveness of mathematics in the physical sciences. The main challenge for this view is to explain how mathematical theor …
Neural language models can be successfully trained on source code, leading to applications such as code completion. However, their versatile autoregressive self-supervision objective overlooks important global sequence-level features that are present in the data such as syntactic co …
We introduce PathQuery, a graph query language developed to scale with Google's query and data volumes as well as its internal developer community. PathQuery supports flexible and declarative semantics. We have found that this enables query developers to think in a naturally "graphy …
The last-generation video conferencing software allows users to utilize a virtual background to conceal their personal environment due to privacy concerns, especially in official meetings with other employers. On the other hand, users may want to fool people in the meeting by cons …
With the proliferation of the digital data economy, digital data is considered the crude oil of the twenty-first century, and its value is increasing. Keeping pace with this trend, the model of data market trading between data providers and data consumers is starting to emerge a …
Data poisoning has been proposed as a compelling defense against facial recognition models trained on Web-scraped pictures. By perturbing the images they post online, users can fool models into misclassifying future (unperturbed) pictures. We demonstrate that this strategy provides …
An important challenge in reinforcement learning is training agents that can solve a wide variety of tasks. If tasks depend on each other (e.g. needing to learn to walk before learning to run), curriculum learning can speed up learning by focusing on the next best task to learn. We …
Time domain science forms an increasing fraction of astronomical programs at many facilities. Synoptic and targeted observing modes of transient, varying, and moving sources rely on precise clocks to provide the underlying time tags. Often precision is mistaken for accuracy, or the …
State-of-the-art models in natural language processing rely on separate rigid subword tokenization algorithms, which limit their generalization ability and adaptation to new settings. In this paper, we propose a new model inductive bias that learns a subword tokenization end-to-end …
Fault-tolerant quantum computers offer the promise of dramatically improving machine learning through speed-ups in computation or improved model scalability. In the near-term, however, the benefits of quantum machine learning are not so clear. Understanding expressibility and traina …
In this letter, we address the issue of scalable and timely dissemination of information in resource-constrained IoT networks. The scalability is addressed by adopting a publish-subscribe architecture. To address the timely dissemination, we propose an HTTP/3 (H3) publish-subscribe s …
The suffix array, describing the lexicographic order of suffixes of a given text, is the central data structure in string algorithms. The suffix array of a length-$n$ text uses $Θ(n \log n)$ bits, which is prohibitive in many applications. To address this, Grossi and Vitter [STOC 20 …
Shannon's mathematical theory of communication defines fundamental limits on how much information can be transmitted between the different components of any man-made or biological system. This paper is an informal but rigorous introduction to the main ideas implicit in Shannon's the …
Hypergraph expanders are hypergraphs with surprising, non-intuitive expansion properties. In a recent paper, the first author gave a simple construction, which can be randomized, of $3$-uniform hypergraph expanders with polylogarithmic degree. We generalize this construction, giving …
Adaptation of end-to-end speech recognition systems to new tasks is known to be challenging. A number of solutions have been proposed which apply external language models with various fusion methods, possibly with a combination of two-pass decoding. Also, TTS systems have been used to …
OpenML is an online platform for open science collaboration in machine learning, used to share datasets and results of machine learning experiments. In this paper we introduce OpenML-Python, a client API for Python, opening up the OpenML platform for a wide range of Python-based too …
We present a large-scale bibliometric analysis of gender differences in scientific careers, covering all scientific disciplines and a large number of countries worldwide. We take a longitudinal perspective in which we trace the publication careers of almost six million male and fema …
In this article we show why flying and rotating beer mats, CDs, or other flat disks will eventually flip in the air and end up flying with backspin, thus making them unusable as frisbees. The crucial effect responsible for the flipping is found to be the lift attacking not in the c …
Recent advances in scanning tunneling and transmission electron microscopies (STM and STEM) have allowed routine generation of large volumes of imaging data containing information on the structure and functionality of materials. The experimental data sets contain signatures of long- …
We present a real-time neural radiance caching method for path-traced global illumination. Our system is designed to handle fully dynamic scenes, and makes no assumptions about the lighting, geometry, and materials. The data-driven nature of our approach sidesteps many difficulties …
Most modern deep learning-based multi-view 3D reconstruction techniques use RNNs or fusion modules to combine information from multiple images after encoding them. These two separate steps have loose connections and do not consider all available information while encoding each view. …
Multiplying matrices is among the most fundamental and compute-intensive operations in machine learning. Consequently, there has been significant work on efficiently approximating matrix multiplies. We introduce a learning-based algorithm for this task that greatly outperforms exist …
Forty years ago, Richard Feynman proposed harnessing quantum physics to build a more powerful kind of computer. Realizing Feynman's vision is one of the grand challenges facing 21st century science and technology. In this article, we'll recall Feynman's contribution that launched th …
Tabular datasets are the last "unconquered castle" for deep learning, with traditional ML methods like Gradient-Boosted Decision Trees still performing strongly even against recent specialized neural architectures. In this paper, we hypothesize that the key to boosting the performan …
Blind face restoration (BFR) from severely degraded face images in the wild is a very challenging problem. Due to the highly ill-posed nature of the problem and the complex unknown degradation, directly training a deep neural network (DNN) usually cannot lead to acceptable results. Existing ge …
Third-party tracking allows companies to collect users' behavioural data and track their activity across digital devices. This can put deep insights into users' private lives into the hands of strangers, and often happens without users' awareness or explicit consent. EU and UK data …
Deep learning has been used to demonstrate end-to-end neural network learning for autonomous vehicle control from raw sensory input. While LiDAR sensors provide reliably accurate information, existing end-to-end driving solutions are mainly based on cameras since processing 3D data …
Social media platforms such as Facebook and Twitter are used for social activism purposes, and TikTok is no different. We conducted 9 qualitative semi-structured interviews with social activists who recently posted their videos on TikTok to understand. This study presents an initial …
The use of video surveillance in public spaces -- both by government agencies and by private citizens -- has attracted considerable attention in recent years, particularly in light of rapid...
Word embedding is a Natural Language Processing (NLP) technique that automatically maps words from a vocabulary to vectors of real numbers in an embedding space. It has been widely used in recent...
What is the computational model behind a Transformer? Where recurrent neural networks have direct parallels in finite state machines, allowing clear discussion and thought around architecture...
We propose an efficient framework, called Simple Swap (SimSwap), aiming for generalized and high fidelity face swapping. In contrast to previous approaches that either lack the ability to...
Attention-based neural networks such as the Vision Transformer (ViT) have recently attained state-of-the-art results on many computer vision benchmarks. Scale is a primary ingredient in attaining...
The advent of the transformer has sparked a quick growth in the size of language models, far outpacing hardware improvements. (Dense) transformers are expected to reach the trillion-parameter...
To explore basin geometry in high-dimensional dynamical systems, we consider a ring of identical Kuramoto oscillators. Many attractors are known to coexist in this system; each is a twisted...
Inpainting is a learned interpolation technique that is based on generative modeling and used to populate masked or missing pieces in an image; it has wide applications in picture editing and...