While researchers have traditionally employed Gaussian processes (GP) for specifying prior and posterior distributions over functions, this approach becomes computationally expensive when scaled, is limited by the expressivity of its covariance function, and struggles with adapti … | Continue reading
A Google Research team has proposed a novel method for dramatically reducing transformers’ (self-)attention memory requirements. This “trick,” which they believe had been simply overlooked by the machine learning community, addresses the concerning quadratic time and space comple … | Continue reading
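The teaser above describes reducing self-attention's quadratic memory cost. As a hedged illustration (not necessarily the paper's exact trick), one well-known way to do this is to process keys and values in chunks while maintaining an online softmax, so the full score matrix is never materialized; the function names below are mine:

```python
import numpy as np

def full_attention(q, k, v):
    # Standard attention: materializes the full (n, n) score matrix.
    s = q @ k.T / np.sqrt(q.shape[-1])
    w = np.exp(s - s.max(axis=-1, keepdims=True))
    return (w / w.sum(axis=-1, keepdims=True)) @ v

def chunked_attention(q, k, v, chunk=4):
    # Processes keys/values in chunks, keeping only running statistics
    # (max, normalizer, weighted value sum), so peak memory is
    # O(n * chunk) rather than O(n^2).
    n, d = q.shape
    m = np.full((n, 1), -np.inf)       # running max of scores
    z = np.zeros((n, 1))               # running softmax normalizer
    acc = np.zeros((n, v.shape[-1]))   # running weighted value sum
    for i in range(0, k.shape[0], chunk):
        s = q @ k[i:i + chunk].T / np.sqrt(d)        # (n, chunk) scores
        m_new = np.maximum(m, s.max(axis=-1, keepdims=True))
        scale = np.exp(m - m_new)                    # rescale old stats
        p = np.exp(s - m_new)
        z = z * scale + p.sum(axis=-1, keepdims=True)
        acc = acc * scale + p @ v[i:i + chunk]
        m = m_new
    return acc / z
```

Both functions return identical outputs up to floating-point error; only the peak memory differs.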
The idiom “actions speak louder than words” first appeared in print almost 300 years ago. A new study echoes this view, arguing that combining self-supervised and offline reinforcement learning (RL) could lead to a new class of algorithms that understand the world through actions … | Continue reading
The original 2017 transformer model was designed for natural language processing (NLP), where it achieved SOTA results. Its performance intrigued machine learning researchers, who have since successfully adapted the attention-based architecture to perception tasks in other modali … | Continue reading
As machine learning models become larger and more powerful, researchers are increasingly seeking ways to reduce their huge computational appetites and improve efficiency. Nowhere is this evidenced more than with transformer architectures, whose superior capabilities in handling l … | Continue reading
Baseline datasets such as ImageNet have played an important role in pretraining models in the computer vision field. These datasets, however, often include images with technical, ethical and legal shortcomings. Furthermore, current state-of-the-art pretrained models use unsupe … | Continue reading
While contemporary deep learning models continue to achieve outstanding results across a wide range of tasks, these models are known to have huge data appetites. The emergence of large-scale pretrained language models such as OpenAI’s GPT-3 has helped reduce the need for task-sp … | Continue reading
Human beings perceive and understand the world by processing high-dimensional inputs from modalities as diverse as vision, audio, touch, proprioception, etc. Yet most machine learning models rely on modality-specific architectures, dealing with a stereotyped set of inputs and out … | Continue reading
Baidu has released ERNIE 3.0, a framework for pretraining knowledge-enhanced language models that integrates both auto-encoder networks and auto-regressive networks. The novel approach achieves state-of-the-art results on Chinese and English language understanding and generation … | Continue reading
Programming competition problems are pervasive in the AI community. They can be used to evaluate programmers’ abilities to solve artificial tasks as well as to test the limits of state-of-the-art algorithms. A research team from MIT, Allen Institute for AI and Microsoft Research r … | Continue reading
A new Google Research study proposes modifying the standard transformer architecture to process byte sequences in natural language processing (NLP). The researchers show that in terms of parameter count, training FLOPs and inference speed, their proposed byte-level models can be … | Continue reading
Deep neural networks (DNNs) already provide the best solutions for many complex problems in image recognition, speech recognition, and natural language processing. Now, DNNs are entering the physical arena. DNNs and physical processes share numerous structural similarities, such … | Continue reading
It’s well known across the machine learning community that choosing the right prior — an initial belief regarding an event, expressed as a probability distribution — is crucial for Bayesian inference. Many recent Bayesian deep learning models however resort to established but un … | Continue reading
Autocompletion, where an application predicts the next item in a text input, has become a convenient and widely used tool in contemporary messaging and other writing tasks. It is also one of the most important features of an integrated development environment (IDE) for computer p … | Continue reading
A recently published 156-page paper from a team led by Imperial College Professor and Twitter Chief Scientist Michael Bronstein aims to geometrically unify CNN, GNN, LSTM and Transformer architectures from a perspective of symmetry and invariance to build an "Erlangen Programme" … | Continue reading
Researchers from Google, Georgia Tech and Emory University examine the semantic changes in abolitionist newspapers and explore whether the papers exhibited leadership or were followers, providing an overall picture of their network of semantic influence. | Continue reading
In the almost 80 years since they were first envisioned, compute-fueled artificial neural networks (ANNs) have made tremendous progress toward replicating the function of the mammalian brains that inspired them. Although today’s systems have surpassed humans in many tasks, fundam … | Continue reading
Large-scale pretrained transformers learn from corpora containing oceans of factual knowledge, and are surprisingly good at recalling this knowledge without any fine-tuning. In a new paper, a team from Microsoft Research and Peking University peeps into pretrained transformers, … | Continue reading
Large-scale transformer-based language models have produced substantial gains in the field of natural language processing (NLP). Training such models however is challenging, for two reasons: No single GPU has enough memory to accommodate parameter totals which have grown exponent … | Continue reading
A new DeepMind paper introduces two architectures designed for the efficient use of Tensor Processing Units (TPUs) in reinforcement learning (RL) research at scale. Deep learning (DL) frameworks such as TensorFlow, PyTorch and JAX enable easy, rapid model prototyping while also op … | Continue reading
A new Google Brain and New York University study argues that the current evaluation techniques for natural language understanding (NLU) tasks are broken, and proposes guidelines designed to produce better NLU benchmarks. Contemporary NLU studies tend to focus on improving results … | Continue reading
The Beijing Academy of Artificial Intelligence (BAAI) releases Wu Dao 1.0, China’s first large-scale pretraining model. | Continue reading
University of Oxford researchers propose COIN, a novel image compression method that stores the weights of an MLP overfitted to an image and outperforms JPEG at low bitrates even without entropy coding. | Continue reading
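COIN's core idea, as the teaser states, is to overfit a small network to a single image and store its weights as the compressed representation. The sketch below illustrates that idea only loosely: it swaps the paper's gradient-trained MLP for random Fourier features fit by least squares, so it runs in one shot; all names here are illustrative, not the paper's code:

```python
import numpy as np

def fit_image(img, n_feats=256, scale=10.0, seed=0):
    # Overfit a coordinate -> pixel model to one image. COIN trains a
    # small MLP by gradient descent; this sketch uses random Fourier
    # features plus a closed-form least-squares fit instead.
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w]
    coords = np.stack([ys.ravel() / h, xs.ravel() / w], axis=1)  # in [0, 1)
    rng = np.random.default_rng(seed)
    B = rng.normal(scale=scale, size=(2, n_feats))               # random projection
    feats = np.concatenate([np.sin(coords @ B), np.cos(coords @ B)], axis=1)
    weights, *_ = np.linalg.lstsq(feats, img.ravel(), rcond=None)
    return B, weights  # the "compressed" image is (B, weights)

def decode(B, weights, h, w):
    # "Decompression": re-evaluate the fitted function at every pixel.
    ys, xs = np.mgrid[0:h, 0:w]
    coords = np.stack([ys.ravel() / h, xs.ravel() / w], axis=1)
    feats = np.concatenate([np.sin(coords @ B), np.cos(coords @ B)], axis=1)
    return (feats @ weights).reshape(h, w)
```

Note that real compression requires the stored weights to be smaller than the raw pixels (plus entropy coding in practice); this toy fit makes no such guarantee and only demonstrates the fit-then-re-evaluate loop.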
University of Toronto researchers propose a BERT-inspired training approach as a self-supervised pretraining step to enable deep neural networks to leverage newly and publicly available massive EEG (electroencephalography) datasets for downstream brain-computer-interface (BCI) ap … | Continue reading
Researchers from UC Berkeley and Google Research have introduced BoTNet, a "conceptually simple yet powerful" backbone architecture that boosts performance on computer vision (CV) tasks such as image classification, object detection and instance segmentation. … | Continue reading
Google Brain’s Switch Transformer language model packs a whopping 1.6 trillion parameters while effectively controlling computational cost. The model achieved a 4x pretraining speedup over a strongly tuned T5-XXL baseline. | Continue reading
In a new paper, researchers from Max Planck Institute for Informatics and Facebook Reality Labs propose an end-to-end trainable method that enables re-rendering of humans from one single image. | Continue reading
Turing Award winner Geoffrey Everest Hinton spoke about his recent research, his quest to understand learning in the brain and why he sees unsupervised learning as the future of AI. | Continue reading
In the new paper Canonical Capsules: Unsupervised Capsules in Canonical Pose, Turing Award Honoree Dr. Geoffrey Hinton and a team of researchers propose an architecture for unsupervised learning wi… | Continue reading
This year, 22 Transformer-related research papers were accepted by NeurIPS, the world’s most prestigious machine learning conference. Synced has selected ten of these works to showcase the la… | Continue reading
The new AI-powered Multi-Ingredient Pizza Generator (MPG) can deliver all these mouth-watering pies and many more. | Continue reading
A Princeton student designed a GAN framework for Chinese landscape painting generation that is so effective most humans can’t distinguish its works from the real thing. | Continue reading
Amazon Alexa AI paper asks whether NLU problems could be mapped to question-answering (QA) problems using transfer learning. | Continue reading
DeepMind and Universidade de Lisboa introduce graph-based toolkit for analyzing and comparing multiplayer games | Continue reading
Cornell and Facebook AI “Correct and Smooth” high-accuracy graph learning method is fast to train and outperforms big Graph Neural Network (GNN) models | Continue reading
In a new paper, researchers from Google, OpenAI, and DeepMind introduce “behaviour priors,” a framework designed to capture common movement and interaction patterns that are shared acro… | Continue reading
Facebook AI says DNNs can perform well without class specific neurons and overreliance on intuition-based methods for understanding DNNs can be misleading. | Continue reading
Probability trees may have been around for decades, but they have received little attention from the AI and ML community. | Continue reading
Now, just in time for costume season, another indie developer has taken facial image transfer tech to the opposite end of the cuteness spectrum, building a zombie generator. | Continue reading
Google recently introduced mT5, a multilingual variant of its “Text-to-Text Transfer Transformer” (T5), pretrained on a new Common Crawl-based dataset covering 101 languages. | Continue reading
Trust in AI systems is becoming, if not already, the biggest barrier for enterprises as they start to move from exploring AI, piloting, or doing proof-of-concept work into depl… | Continue reading
Google Brain has improved the SOTA on the LibriSpeech automatic speech recognition task, achieving word error rates of 1.4 percent and 2.6 percent. | Continue reading
Japanese AI startup Preferred Networks (PFN) is moving ChainerRL to the PyTorch ecosystem. | Continue reading
ICLR 2021 submission proposes LambdaNetworks, a transformer-specific method that reduces costs of modeling long-range interactions for CV and other applications. | Continue reading
Facebook AI open-sourced a multilingual machine translation (MMT) model that translates between any pair of 100 languages without relying on English data. | Continue reading
The researchers identify just how much AI research might benefit from the field of animal cognition. | Continue reading
An international research team is suggesting AI might become even more efficient and reliable if it learns to think more like worms. | Continue reading
A new Facebook AI and CMU renewable energy storage project could enable labs to perform days of electrocatalyst screening and calculations in just seconds. | Continue reading