NDP Challenges Gaussian Processes to Describe Rich Distributions over Functions

While researchers have traditionally employed Gaussian processes (GP) for specifying prior and posterior distributions over functions, this approach becomes computationally expensive when scaled, is limited by the expressivity of its covariance function, and struggles with adapti … | Continue reading


@syncedreview.com | 1 year ago

A ‘Simple Trick’ for Reducing Transformers’ (Self-)Attention Memory Requirements

A Google Research team has proposed a novel method for dramatically reducing transformers’ (self-)attention memory requirements. This “trick,” which they believe had been simply overlooked by the machine learning community, addresses the concerning quadratic time and space comple … | Continue reading


@syncedreview.com | 2 years ago

Algorithms That Understand the World Through Actions

The idiom “actions speak louder than words” first appeared in print almost 300 years ago. A new study echoes this view, arguing that combining self-supervised and offline reinforcement learning (RL) could lead to a new class of algorithms that understand the world through actions … | Continue reading


@syncedreview.com | 2 years ago

PolyViT: Universal Transformer for Image, Video, and Audio Classification

The original 2017 transformer model was designed for natural language processing (NLP), where it achieved SOTA results. Its performance intrigued machine learning researchers, who have since successfully adapted the attention-based architecture to perception tasks in other modali … | Continue reading


@syncedreview.com | 2 years ago

Warsaw U, OpenAI and Google’s Hourglass Hierarchical Transformer Model

As machine learning models become larger and more powerful, researchers are increasingly seeking ways to reduce their huge computational appetites and improve efficiency. Nowhere is this more evident than with transformer architectures, whose superior capabilities in handling l … | Continue reading


@syncedreview.com | 2 years ago

PASS, an ImageNet Replacement for Self-Supervised Pretraining

Baseline datasets such as ImageNet have played an important role in pretraining models in the computer vision field. These datasets however often include images that have technical, ethical and legal shortcomings. Furthermore, current state-of-the-art pretrained models use unsupe … | Continue reading


@syncedreview.com | 2 years ago

Google’s Zero-Label Language Learning Competitive with Supervised Learning

While contemporary deep learning models continue to achieve outstanding results across a wide range of tasks, these models are known to have huge data appetites. The emergence of large-scale pretrained language models such as OpenAI’s GPT-3 has helped reduce the need for task-sp … | Continue reading


@syncedreview.com | 2 years ago

DeepMind’s Perceiver IO: A General Architecture for a Variety of Ins and Outs

Human beings perceive and understand the world by processing high-dimensional inputs from modalities as diverse as vision, audio, touch, proprioception, etc. Yet most machine learning models rely on modality-specific architectures, dealing with a stereotyped set of inputs and out … | Continue reading


@syncedreview.com | 2 years ago

Baidu’s ERNIE 3.0 Framework Surpasses Human Performance on SuperGLUE Benchmark

Baidu has released ERNIE 3.0, a framework for pretraining knowledge-enhanced language models that integrates both auto-encoder networks and auto-regressive networks. The novel approach achieves state-of-the-art results on Chinese and English language understanding and generation … | Continue reading


@syncedreview.com | 2 years ago

MIT, Allen AI and Microsoft Open-Source a Suite of AI Programming Puzzles

Programming competition problems are pervasive in the AI community. They can be used to evaluate programmers’ abilities to solve artificial tasks as well as to test the limits of state-of-the-art algorithms. A research team from MIT, Allen Institute for AI and Microsoft Research r … | Continue reading


@syncedreview.com | 2 years ago

Toward a Token-Free Future: Google Proposes Byte-to-Byte Transformers for NLP

A new Google Research study proposes modifying the standard transformer architecture to process byte sequences in natural language processing (NLP). The researchers show that in terms of parameter count, training FLOPs and inference speed, their proposed byte-level models can be … | Continue reading


@syncedreview.com | 2 years ago

Cornell and NTT’s Physical Neural Nets Enable Arbitrary Physical System Training

Deep neural networks (DNNs) already provide the best solutions for many complex problems in image recognition, speech recognition, and natural language processing. Now, DNNs are entering the physical arena. DNNs and physical processes share numerous structural similarities, such … | Continue reading


@syncedreview.com | 2 years ago

ETH Zürich Identifies Priors That Boost Bayesian Deep Learning Models

It’s well known across the machine learning community that choosing the right prior — an initial belief regarding an event, expressed as a probability distribution — is crucial for Bayesian inference. Many recent Bayesian deep learning models, however, resort to established but un … | Continue reading


@syncedreview.com | 2 years ago

Facebook Transfer Learning Method Boosts Code Autocompletion Accuracy by 50%+

Autocompletion, where an application predicts the next item in a text input, has become a convenient and widely used tool in contemporary messaging and other writing tasks. It is also one of the most important features of an integrated development environment (IDE) for computer p … | Continue reading


@syncedreview.com | 2 years ago

An Erlangen Programme to Establish the Geometric Foundations of Deep Learning

A recently published 156-page paper from a team led by Imperial College Professor and Twitter Chief Scientist Michael Bronstein aims to geometrically unify CNN, GNN, LSTM and Transformer architectures from a perspective of symmetry and invariance to build an "Erlangen Programme" … | Continue reading


@syncedreview.com | 2 years ago

Lexical Semantic Influence Network Analyzes 19th Century Abolitionist Newspapers

Researchers from Google, Georgia Tech and Emory University examine the semantic changes in abolitionist newspapers and explore whether the papers exhibited leadership or were followers, providing an overall picture of their network of semantic influence. | Continue reading


@syncedreview.com | 2 years ago

Biologically Inspired Optimizer Boosts FCNN and SNN Training

In the almost 80 years since they were first envisioned, compute-fueled artificial neural networks (ANNs) have made tremendous progress toward replicating the function of the mammalian brains that inspired them. Although today’s systems have surpassed humans in many tasks, fundam … | Continue reading


@syncedreview.com | 3 years ago

Identify 'Knowledge Neurons' in Pretrained Transformers, Enabling Fact Editing

Large-scale pretrained transformers learn from corpora containing oceans of factual knowledge, and are surprisingly good at recalling this knowledge without any fine-tuning. In a new paper, a team from Microsoft Research and Peking University peeps into pretrained transformers, … | Continue reading


@syncedreview.com | 3 years ago

Efficient Large-Scale Language Model Training on GPU Clusters

Large-scale transformer-based language models have produced substantial gains in the field of natural language processing (NLP). Training such models however is challenging, for two reasons: No single GPU has enough memory to accommodate parameter totals which have grown exponent … | Continue reading


@syncedreview.com | 3 years ago

DeepMind 'Podracer' TPU-Based Reinforcement Learning Frameworks

A new DeepMind paper introduces two architectures designed for the efficient use of Tensor Processing Units (TPUs) in reinforcement learning (RL) research at scale. Deep learning (DL) frameworks such as TensorFlow, PyTorch and JAX enable easy, rapid model prototyping while also op … | Continue reading


@syncedreview.com | 3 years ago

Google Brain and NYU Guidelines Address ‘Broken’ NLU Benchmarking

A new Google Brain and New York University study argues that the current evaluation techniques for natural language understanding (NLU) tasks are broken, and proposes guidelines designed to produce better NLU benchmarks. Contemporary NLU studies tend to focus on improving results … | Continue reading


@syncedreview.com | 3 years ago

China’s GPT-3? Wu Dao 1.0

The Beijing Academy of Artificial Intelligence (BAAI) releases Wu Dao 1.0, China’s first large-scale pretraining model. | Continue reading


@syncedreview.com | 3 years ago

Oxford’s Novel Image Compression Method COIN: Better Than JPEG at Low Bitrates

University of Oxford researchers propose COIN, a novel image compression method that stores the weights of an MLP overfitted to an image and outperforms JPEG at low bitrates even without entropy coding. | Continue reading
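The idea described in the teaser is compact enough to sketch: overfit a small coordinate-based MLP to a single image and store only its weights as the compressed representation, with decoding amounting to evaluating the MLP on the pixel grid. The JAX snippet below is a minimal illustration of that idea, not the authors' code; the network sizes, learning rate, step count, and the random "image" are arbitrary assumptions made for the sketch.

```python
# Minimal sketch of the COIN idea (illustrative, not the paper's implementation):
# fit an MLP mapping (x, y) coordinates to RGB values, then keep only its weights.
import jax
import jax.numpy as jnp

def init_mlp(key, sizes):
    """Initialise weights for a fully connected network with the given layer sizes."""
    params = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        key, sub = jax.random.split(key)
        w = jax.random.normal(sub, (n_in, n_out)) * jnp.sqrt(2.0 / n_in)
        params.append((w, jnp.zeros(n_out)))
    return params

def mlp(params, coords):
    """Map (x, y) coordinates in [0, 1]^2 to RGB values in [0, 1]."""
    h = coords
    for w, b in params[:-1]:
        h = jnp.tanh(h @ w + b)  # the paper uses SIREN-style sine activations; tanh keeps the sketch simple
    w, b = params[-1]
    return jax.nn.sigmoid(h @ w + b)

def loss(params, coords, pixels):
    return jnp.mean((mlp(params, coords) - pixels) ** 2)

@jax.jit
def step(params, coords, pixels, lr=1e-3):
    """One gradient-descent step overfitting the MLP to the target image."""
    grads = jax.grad(loss)(params, coords, pixels)
    return [(w - lr * gw, b - lr * gb) for (w, b), (gw, gb) in zip(params, grads)]

# "Compress" a toy 32x32 image: the stored representation is just `params`.
key = jax.random.PRNGKey(0)
h_res, w_res = 32, 32
ys, xs = jnp.meshgrid(jnp.linspace(0, 1, h_res), jnp.linspace(0, 1, w_res), indexing="ij")
coords = jnp.stack([xs.ravel(), ys.ravel()], axis=-1)   # (1024, 2) pixel coordinates
pixels = jax.random.uniform(key, (h_res * w_res, 3))    # random stand-in for a real image
params = init_mlp(key, [2, 64, 64, 3])
for _ in range(1000):
    params = step(params, coords, pixels)
```

In this framing the bitrate is set by how many weights are stored and at what precision, which is why the method can compete at low bitrates even without entropy coding.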


@syncedreview.com | 3 years ago

BENDR for BCI: Learning from Unlabelled EEG Data

University of Toronto researchers propose a BERT-inspired training approach as a self-supervised pretraining step to enable deep neural networks to leverage newly and publicly available massive EEG (electroencephalography) datasets for downstream brain-computer-interface (BCI) ap … | Continue reading


@syncedreview.com | 3 years ago

UC Berkeley and Google’s BoTNet Applies Self-Attention to CV Bottlenecks

Researchers from UC Berkeley and Google Research have introduced BoTNet, a "conceptually simple yet powerful" backbone architecture that boosts performance on computer vision (CV) tasks such as image classification, object detection and instance segmentation. | Continue reading


@syncedreview.com | 3 years ago

Google Brain’s Switch Transformer Language Model Packs 1.6T Parameters

Google Brain’s Switch Transformer language model packs a whopping 1.6 trillion parameters while effectively controlling computational cost. The model achieved a 4x pretraining speedup over a strongly tuned T5-XXL baseline. | Continue reading


@syncedreview.com | 3 years ago

Max Planck Inst and Facebook Model Performs Human Re-Rendering from Single Image

In a new paper, researchers from Max Planck Institute for Informatics and Facebook Reality Labs propose an end-to-end trainable method that enables re-rendering of humans from one single image. | Continue reading


@syncedreview.com | 3 years ago

Geoff Hinton's Latest Research

Turing Award winner Geoffrey Everest Hinton spoke about his recent research, his quest to understand learning in the brain and why he sees unsupervised learning as the future of AI. | Continue reading


@syncedreview.com | 3 years ago

Unsupervised Capsule Architecture for 3D Point Clouds

In the new paper Canonical Capsules: Unsupervised Capsules in Canonical Pose, Turing Award Honoree Dr. Geoffrey Hinton and a team of researchers propose an architecture for unsupervised learning wi… | Continue reading


@syncedreview.com | 3 years ago

NeurIPS 2020 – Teaching Transformers New Tricks

This year, 22 Transformer-related research papers were accepted by NeurIPS, the world’s most prestigious machine learning conference. Synced has selected ten of these works to showcase the la… | Continue reading


@syncedreview.com | 3 years ago

This Pizza Does Not Exist

The new AI-powered Multi-Ingredient Pizza Generator (MPG) can deliver all these mouth-watering pies and many more. | Continue reading


@syncedreview.com | 3 years ago

Princeton Student’s AI Model Generates Chinese Landscape Paintings

A Princeton student designed a GAN framework for Chinese landscape painting generation that is so effective most humans can’t distinguish its works from the real thing. | Continue reading


@syncedreview.com | 3 years ago

Amazon Alexa AI's 'Language Model Is All You Need' Explores NLU as QA

Amazon Alexa AI paper asks whether NLU problems could be mapped to question-answering (QA) problems using transfer learning. | Continue reading


@syncedreview.com | 3 years ago

DeepMind Proposes Graph-Theoretic Investigation of Multiplayer Games Landscape

DeepMind and Universidade de Lisboa introduce a graph-based toolkit for analyzing and comparing multiplayer games. | Continue reading


@syncedreview.com | 3 years ago

Cornell and Facebook AI Simplified Graph Learning Approach Outperforms SOTA GNNs

Cornell and Facebook AI’s “Correct and Smooth” high-accuracy graph learning method is fast to train and outperforms big Graph Neural Network (GNN) models. | Continue reading


@syncedreview.com | 3 years ago

Google, OpenAI and DeepMind: Behaviour Priors Can Boost RL and Generalization

In a new paper, researchers from Google, OpenAI, and DeepMind introduce “behaviour priors,” a framework designed to capture common movement and interaction patterns that are shared acro… | Continue reading


@syncedreview.com | 3 years ago

Facebook AI Says Class Selectivity in Neurons May Impair DNN Performance

Facebook AI says DNNs can perform well without class-specific neurons, and that overreliance on intuition-based methods for understanding DNNs can be misleading. | Continue reading


@syncedreview.com | 3 years ago

DeepMind Introduces Algorithms for Causal Reasoning in Probability Trees

Probability trees may have been around for decades, but they have received little attention from the AI and ML community. | Continue reading


@syncedreview.com | 3 years ago

AI Halloween Avatars: StyleGAN2 Generator Reveals Your Inner Zombie

Now, just in time for costume season, another indie developer has taken facial image transfer tech to the opposite end of the cuteness spectrum, building a zombie generator. | Continue reading


@syncedreview.com | 3 years ago

Google ‘mT5’ Achieves SOTA Performance on Multilingual Benchmarks

Google recently introduced mT5, a multilingual variant of its “Text-to-Text Transfer Transformer” (T5), pretrained on a new Common Crawl-based dataset covering 101 languages. | Continue reading


@syncedreview.com | 3 years ago

Google’s Efforts to Earn Public Trust Through ML Fairness and Responsible AI

Trust in AI systems is becoming, if not already, the biggest barrier for enterprises as they start to move from exploring AI, piloting, or doing proof-of-concept work into depl… | Continue reading


@syncedreview.com | 3 years ago

Google Brain Sets New Semi-Supervised Learning SOTA in Speech Recognition

Google Brain has improved the SOTA on the LibriSpeech automatic speech recognition task, achieving word error rates of 1.4 percent / 2.6 percent. | Continue reading


@syncedreview.com | 3 years ago

Preferred Networks’ ChainerRL Joins PyTorch Ecosystem as ‘PFRL’

Japanese AI startup Preferred Networks (PFN) is moving ChainerRL to the PyTorch ecosystem. | Continue reading


@syncedreview.com | 3 years ago

ICLR 2021 Submission | ‘Lambda Networks’ Achieve SOTA Accuracy, Save Memory

ICLR 2021 submission proposes LambdaNetworks, a transformer-specific method that reduces costs of modeling long-range interactions for CV and other applications. | Continue reading


@syncedreview.com | 3 years ago

Facebook AI Model Directly Translates 100 Languages Without Using English Data

Facebook AI open-sourced a multilingual machine translation (MMT) model that translates between any pair of 100 languages without relying on English data. | Continue reading


@syncedreview.com | 3 years ago

UK Researchers Say AI Needs More Animal Sense

The researchers identify just how much AI research might benefit from the field of animal cognition. | Continue reading


@syncedreview.com | 3 years ago

MIT, TU Wien and IST Austria Brain-Based AI Self-Drives with Just a Few Neurons

An international research team is suggesting AI might become even more efficient and reliable if it learns to think more like worms. | Continue reading


@syncedreview.com | 3 years ago

Facebook and CMU Open Catalyst Project Applies AI to Renewable Energy Storage

A new Facebook AI and CMU renewable energy storage project could enable labs to perform days of electrocatalyst screening and calculations in just seconds. | Continue reading


@syncedreview.com | 3 years ago