Metablog

The role of containers on MLOps and model production

Container technology has changed the way data science gets done. The original container use case for data science focused on what I call “environment management”. Configuring software environments is a constant chore, especially in the open source software space, the space in wh … | Continue reading

@blog.dominodatalab.com | 3 years ago

Snowflake and Domino: Better Together

Introduction: Arming data science teams with the access and capabilities needed to establish a two-way flow of information is one critical challenge many organizations face when it comes to unlocking value from their modeling efforts. Part of this challenge is that many organiza … | Continue reading

@blog.dominodatalab.com | 3 years ago

Faster data exploration in Jupyter through Lux

Notebooks have become one of the primary tools for many data scientists. They offer a clear way to collaborate with others throughout the process of data exploration, feature engineering, and model fitting and, by following some clear best practices, can also become living … | Continue reading

@blog.dominodatalab.com | 3 years ago

Analyzing Large P Small N Data

Guest post by Bill Shannon, Co-Founder and Managing Partner of BioRankings. Introduction: High-throughput screening technologies have been developed to measure all the molecules of interest in a sample in a single experiment (e.g., the entire genome, the amounts of metabolites, the … | Continue reading

@blog.dominodatalab.com | 3 years ago

Performing non-compartmental analysis with Julia and Pumas AI

When analysing pharmacokinetic data to determine the degree of exposure of a drug and associated pharmacokinetic parameters (e.g., clearance, elimination half-life, maximum observed concentration (Cmax), time where the maximum concentration was observed (Tmax)), Non-Compartmental Analysis … | Continue reading

@blog.dominodatalab.com | 3 years ago

Density-Based Clustering

Original content by Manojit Nandi – Updated by Josh Poduska. Cluster analysis is an important problem in data analysis. Data scientists use clustering to identify malfunctioning servers, group genes with similar expression patterns, and support various other applications. There a … | Continue reading

@blog.dominodatalab.com | 3 years ago

Bringing ML to agriculture: Transforming a millennia-old industry

Guest post by Jeff Melching from The Climate Corporation. At The Climate Corporation, we aim to help farmers better understand their operations and make better decisions to increase their crop yields in a sustainable way. We’ve developed a model-driven software platform, called Cl … | Continue reading

@blog.dominodatalab.com | 3 years ago

Why models fail to deliver value and what you can do about it

Building models requires a lot of time and effort. Data scientists can spend weeks just trying to find, capture and transform data into decent features for models, not to mention many cycles of training, tuning, and tweaking models so they’re performant. Yet despite all this hard … | Continue reading

@blog.dominodatalab.com | 3 years ago

Evaluating Ray: Distributed Python for Scalability

Dean Wampler provides a distilled overview of Ray, an open source system for scaling Python systems from single machines to large clusters. If you are interested in additional insights, register for the upcoming Ray Summit. Introduction: This post is for people making technology d … | Continue reading