We present measurements and simulations of semiconductor-superconductor heterostructure devices that are consistent with the observation of topological superconductivity and Majorana zero modes. The devices are fabricated from high-mobility two-dimensional electron gases in which qu …
One concern with the rise of large language models lies with their potential for significant harm, particularly from pretraining on biased, obscene, copyrighted, and private information. Emerging ethical approaches have attempted to filter pretraining material, but such approaches h …
This paper introduces a new type of attack on isolated, air-gapped workstations. Although air-gap computers have no wireless connectivity, we show that attackers can use the SATA cable as a wireless antenna to transfer radio signals at the 6 GHz frequency band. The Serial ATA (SATA) …
We present a unified method, termed Unicorn, that can simultaneously solve four tracking problems (SOT, MOT, VOS, MOTS) with a single network using the same model parameters. Due to the fragmented definitions of the object tracking problem itself, most existing trackers are develope …
This document characterizes the actual science performance of the James Webb Space Telescope (JWST), as known on 12 July 2022. Following six months of commissioning to prepare JWST for science operations, the observatory is now fully capable of achieving the discoveries for which it …
Real-time analytics systems employ hybrid data layouts in which data are stored in different formats throughout their lifecycle. Recent data are stored in a row-oriented format to serve OLTP workloads and support high insert rates, while older data are transformed to a column-orient …
We present XMem, a video object segmentation architecture for long videos with unified feature memory stores inspired by the Atkinson-Shiffrin memory model. Prior work on video object segmentation typically only uses one type of feature memory. For videos longer than a minute, a sin …
YOLOv7 surpasses all known object detectors in both speed and accuracy in the range from 5 FPS to 160 FPS and has the highest accuracy 56.8% AP among all known real-time object detectors with 30 FPS or higher on GPU V100. YOLOv7-E6 object detector (56 FPS V100, 55.9% AP) outperforms …
Reliable generalization lies at the heart of safe ML and AI. However, understanding when and how neural networks generalize remains one of the most important unsolved problems in the field. In this work, we conduct an extensive empirical study (2200 models, 16 tasks) to investigate …
We present Generalized Histogram Thresholding (GHT), a simple, fast, and effective technique for histogram-based image thresholding. GHT works by performing approximate maximum a posteriori estimation of a mixture of Gaussians with appropriate priors. We demonstrate that GHT subsume …
We show that ResNets converge, in the infinite depth limit, to a generalization of image registration algorithms. In this generalization, images are replaced by abstractions (ideas) living in high dimensional RKHS spaces, and material points are replaced by data points. Whereas comp …
As demonstrated by GPT-3 and T5, transformers grow in capability as parameter spaces become larger and larger. However, for tasks that require a large amount of knowledge, non-parametric memory allows models to grow dramatically with a sub-linear increase in computational cost and G …
In machine learning (ML), researchers and engineers seem to be at odds. System implementers would prefer models to be declarative, with detailed type information and semantic restrictions that allow models to be optimised, rearranged and parallelised. Yet practitioners show an overw …
A proof is one of the most important concepts of mathematics. However, there is a striking difference between how a proof is defined in theory and how it is used in practice. This puts the unique status of mathematics as exact science into peril. Now may be the time to reconcile the …
If humanity is ever to consider substantial, long-term colonization of Mars, the resources needed are going to be extensive. For a long-term human presence on Mars to be established, serious thought would need to be given to terraforming the planet. One major requirement for such te …
We study whether language models can evaluate the validity of their own claims and predict which questions they will be able to answer correctly. We first show that larger models are well-calibrated on diverse multiple choice and true/false questions when they are provided in the ri …
In order to address increasing demands of real-world applications, the research for knowledge-intensive NLP (KI-NLP) should advance by capturing the challenges of a truly open-domain environment: web-scale knowledge, lack of structure, inconsistent quality and noise. To this end, we …
Ten years into the revival of deep networks and artificial intelligence, we propose a theoretical framework that sheds light on understanding deep networks within a bigger picture of Intelligence in general. We introduce two fundamental principles, Parsimony and Self-consistency, th …
This paper presents a novel nearest neighbor search algorithm achieving TPU (Google Tensor Processing Unit) peak performance, outperforming state-of-the-art GPU algorithms with similar level of recall. The design of the proposed algorithm is motivated by an accurate accelerator perf …
Random number generation is an important task in a wide variety of critical applications including cryptographic algorithms, scientific simulations, and industrial testing tools. True Random Number Generators (TRNGs) produce truly random data by sampling a physical entropy source th …
Constrained random test generation is one of the most widely adopted methods for generating stimuli for simulation-based verification. Randomness leads to test diversity, but tests tend to repeatedly exercise the same design logic. Constraints are written (typically manually) to bia …
Solitons in space-time capable of transporting time-like observers at superluminal speeds have long been tied to violations of the weak, strong, and dominant energy conditions of general relativity. The negative-energy sources required for these solitons must be created through ene …
The application of zero-shot learning in computer vision has been revolutionized by the use of image-text matching models. The most notable example, CLIP, has been widely used for both zero-shot classification and guiding generative models with a text prompt. However, the zero-shot …
We describe some recent computer investigations with the `Constraint Logic Propagation over Finite Domains' -- CLP(FD) -- library in the Prolog programming environment to search for new simple Lie algebras over the field $\GF(2)$ of $2$ elements. Motivated by a paper of Grishkov et. …
This paper studies the design of B-tree that can take full advantage of modern storage hardware with built-in transparent compression. Recent years have witnessed significant interest in applying log-structured merge tree (LSM-tree) as an alternative to B-tree. The current consensus …
In this essay, I present the advantages and, I dare say, the beauty of programming in a language with set-theoretic types, that is, types that include union, intersection, and negation type connectives. I show by several examples how set-theoretic types are necessary to type some co …
Loosely inspired by the somewhat fanciful notion of detecting an arbitrarily advanced alien civilization, we consider a general-relativistic thin-shell Dyson mega-sphere completely enclosing a central star-like object, and perform a full general-relativistic analysis using the Israe …
Blockchain systems come with a promise of decentralization that often stumbles on a roadblock when key decisions about modifying the software codebase need to be made. This is attested by the fact that both of the two major cryptocurrencies, Bitcoin and Ethereum, have undergone hard …
The possibility of achieving quantum communication using photons across interstellar distances is examined. For this, different factors are considered that could induce decoherence of photons, including the gravitational field of astrophysical bodies, the particle content in the int …
Given a small training data set and a learning algorithm, how much more data is necessary to reach a target validation or test performance? This question is of critical importance in applications such as autonomous driving or medical imaging where collecting data is expensive and ti …
A vast amount of location information exists in unstructured texts, such as social media posts, news stories, scientific articles, web pages, travel blogs, and historical archives. Geoparsing refers to the process of recognizing location references from texts and identifying their g …
While large language models a la BERT are used ubiquitously in NLP, pretraining them is considered a luxury that only a few well-funded industry labs can afford. How can one train such models with a more modest budget? We present a recipe for pretraining a masked language model in 2 …
Speculative attacks are still an active threat today that, even if initially focused on the x86 platform, reach across all modern hardware architectures. RISC-V is a newly proposed open instruction set architecture that has seen traction from both the industry and academia in recent …
Automatic program synthesis is a long-lasting dream in software engineering. Recently, a promising Deep Learning (DL) based solution, called Copilot, has been proposed by Open AI and Microsoft as an industrial product. Although some studies evaluate the correctness of Copilot soluti …
We present an analytical proof assisted by computer calculations for the dynamical stability of the eight main planets and Pluto for the next 100,000 years. It means that the semi-major axes of the planets will not change significantly during this period. Also the eccentricities and …
It is believed that random quantum circuits are difficult to simulate classically. These have been used to demonstrate quantum supremacy: the execution of a computational task on a quantum computer that is infeasible for any classical computer. The task underlying the assertion of q …
Widely observed neural scaling laws, in which error falls off as a power of the training set size, model size, or both, have driven substantial performance improvements in deep learning. However, these improvements through scaling alone require considerable costs in compute and ener …
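To make the phrase "error falls off as a power of the training set size" concrete, here is a minimal Python sketch that fits error ≈ a * size**(-alpha) by least squares in log-log space. The function name `fit_power_law` and the sample numbers are illustrative assumptions and do not come from the paper.

```python
import math


def fit_power_law(sizes, errors):
    """Fit error ≈ a * size**(-alpha) by ordinary least squares on logs.

    Hypothetical helper for illustration; returns the pair (a, alpha).
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(e) for e in errors]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Slope of the best-fit line through the (log size, log error) points.
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    # log(error) = log(a) - alpha * log(size), so alpha is the negated slope.
    return math.exp(intercept), -slope


# Synthetic data generated from an exact power law (made-up values).
sizes = [100, 1_000, 10_000]
errors = [2.0 * n ** -0.5 for n in sizes]
a, alpha = fit_power_law(sizes, errors)
```

On a log-log plot a power law is a straight line, which is why a linear fit on the logarithms recovers the exponent.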
Context: The advancements in machine learning techniques have encouraged researchers to apply these techniques to a myriad of software engineering tasks that use source code analysis such as testing and vulnerabilities detection. A large number of studies poses challenges to the com …
In this paper we propose to study generalization of neural networks on small algorithmically generated datasets. In this setting, questions about data efficiency, memorization, generalization, and speed of learning can be studied in great detail. In some situations we show that neur …
Assistance robots have gained widespread attention in various industries such as logistics and human assistance. The tasks of guiding or following a human in a crowded environment such as airports or train stations to carry weight or goods is still an open problem. In these use case …
We introduce DeepNash, an autonomous agent capable of learning to play the imperfect information game Stratego from scratch, up to a human expert level. Stratego is one of the few iconic board games that Artificial Intelligence (AI) has not yet mastered. This popular game has an eno …
WebAssembly (Wasm) is a compact, well-specified bytecode format that offers a portable compilation target with near-native execution speed. The bytecode format was specifically designed to be fast to parse, validate, and compile, positioning itself as a portable alternative to nativ …
Generating expressive and contextually appropriate prosody remains a challenge for modern text-to-speech (TTS) systems. This is particularly evident for long, multi-sentence inputs. In this paper, we examine simple extensions to a Transformer-based FastSpeech-like system, with the g …
What is the most effective way to grill food? Timing is everything, since only one surface is exposed to heat at a given time. Should we flip only once, or many times? We present a simple model of cooking by flipping, and some interesting observations emerge. The rate of cooking dep …
This is a collection of (mostly) pen-and-paper exercises in machine learning. The exercises are on the following topics: linear algebra, optimisation, directed graphical models, undirected graphical models, expressive power of graphical models, factor graphs and message passing, inf …
Stable and accurate electroencephalogram (EEG) signal acquisition is fundamental in non-invasive brain-computer interface (BCI) technology. Commonly used EEG acquisition system's hardware and software are usually closed-source. Its inability to flexible expansion and secondary devel …
Several election districts in the US have recently moved to ranked-choice voting (RCV) to decide the results of local elections. RCV allows voters to rank their choices, and the results are computed in rounds, eliminating one candidate at a time. RCV ensures fairer elections and has …
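The round-by-round count described in the last item can be sketched in a few lines of Python. This is a minimal illustration of instant-runoff counting, not the method studied in the paper: the function name `instant_runoff`, the alphabetical tie-break for elimination, and the sample ballots are all assumptions made for the example.

```python
from collections import Counter


def instant_runoff(ballots):
    """Return the winner of a ranked-choice (instant-runoff) count.

    Each ballot is a list of candidate names, most preferred first.
    Hypothetical helper for illustration only.
    """
    remaining = {name for ballot in ballots for name in ballot}
    while remaining:
        # Count every ballot for its highest-ranked candidate still in the race.
        tally = Counter({name: 0 for name in remaining})
        for ballot in ballots:
            top = next((name for name in ballot if name in remaining), None)
            if top is not None:
                tally[top] += 1
        total = sum(tally.values())
        leader = max(tally, key=tally.get)
        # Stop on a strict majority, or when only one candidate is left.
        if tally[leader] * 2 > total or len(remaining) == 1:
            return leader
        # Eliminate the candidate with the fewest votes.
        # Assumption: ties are broken alphabetically; real rules vary.
        loser = min(tally, key=lambda name: (tally[name], name))
        remaining.discard(loser)
    return None


# Made-up ballots: A leads the first round, but C's voters prefer B.
ballots = [["A", "B", "C"]] * 4 + [["B", "C", "A"]] * 3 + [["C", "B", "A"]] * 2
winner = instant_runoff(ballots)
```

In this example no candidate has a majority in the first round (A 4, B 3, C 2), so C is eliminated and its two ballots transfer to B, who then wins 5 to 4.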