Machine Learning Product Management: Lessons Learned

This Domino Data Science Field Note covers Pete Skomoroch’s recent Strata London talk. It focuses on his ML product management insights and lessons learned. If you are interested in hearing more practical insights on ML or AI product management, then consider attending Pete’s upc … | Continue reading


@blog.dominodatalab.com | 5 years ago

Announcing Domino 3.4: Furthering Collaboration with Activity Feed

Our last release, Domino 3.3, saw the addition of two major capabilities: Datasets and Experiment Manager. “Datasets”, a high-performance, revisioned data store, offers data scientists the flexibility they need to make use of large data resources when developing models. And “Experi … | Continue reading


@blog.dominodatalab.com | 5 years ago

Addressing Irreproducibility in the Wild

This Domino Data Science Field Note provides highlights and excerpted slides from Chloe Mawer’s “The Ingredients of a Reproducible Machine Learning Model” talk at a recent WiMLDS meetup. Mawer is a Principal Data Scientist at Lineage Logistics as well as an Adjunct Lecturer at No … | Continue reading


@blog.dominodatalab.com | 5 years ago

Can Data Science Help Us Make Sense of the Mueller Report?

This blog post provides insights on how to apply Natural Language Processing (NLP) techniques. A complementary Domino project is available. The Mueller Report, officially known as the Report on the Investigation into Russian Interference in the 2016 Presidentia … | Continue reading


@blog.dominodatalab.com | 5 years ago

Using Data Science to Make Sense of the Mueller Report

This blog post provides insights on how to apply Natural Language Processing (NLP) techniques. A complementary Domino project is available. The Mueller Report, officially known as the Report on the Investigation into Russian Interference in the 2016 Presidentia … | Continue reading


@blog.dominodatalab.com | 5 years ago

Machine Learning in Production: Software Architecture

Continue reading


@blog.dominodatalab.com | 5 years ago

Comparing the Functionality of Open Source Natural Language Processing Libraries

Continue reading


@blog.dominodatalab.com | 5 years ago

Domino 3.3: Datasets and Experiment Manager

Continue reading


@blog.dominodatalab.com | 5 years ago

Data Science Themes and Conferences per Pacoid

In Paco Nathan's latest column, he explores the role of curiosity in data science work as well as Rev 2, an upcoming summit for data science leaders. | Continue reading


@blog.dominodatalab.com | 5 years ago

Reflections on the Data Science Platform Market

Before we get too far into 2019, I wanted to take a brief moment to reflect on some of the changes we’ve seen in the market. In 2018 we saw the | Continue reading


@blog.dominodatalab.com | 5 years ago

Model Interpretability with TCAV (Testing with Concept Activation Vectors)

This Domino Data Science Field Note provides very distilled insights and excerpts from Been Kim’s recent MLConf 2018 talk and research about Testing with C | Continue reading


@blog.dominodatalab.com | 5 years ago

Using SHAP and LIME

This blog post provides insights on how to use the SHAP and LIME Python libraries in practice and how to interpret their output, helping readers prepare to | Continue reading


@blog.dominodatalab.com | 5 years ago

Creating Multi-Language Pipelines with Apache Spark

In this guest post, Holden Karau, Apache Spark Committer, provides insights on how to create multi-language pipelines with Apache Spark and avoid rewriting | Continue reading


@blog.dominodatalab.com | 5 years ago

Data Science vs. Engineering: Tension Points

This blog post provides highlights and a full written transcript from the panel, “Data Science Versus Engineering: Does It Really Have To Be This Way?” wit | Continue reading


@blog.dominodatalab.com | 5 years ago

Visual Logic Authoring vs. Code (2016)

At some point in their careers, almost every data scientist has written code to perform a series of steps, and thought, “It would be great if I could build | Continue reading


@blog.dominodatalab.com | 5 years ago

Using Bayesian Methods to Clean Up Human Labels

Derrick Higgins, AmFam Data Science & Analytics, discusses how Bayesian methods can be applied to improve the quality of annotated training sets. Sessi | Continue reading


@blog.dominodatalab.com | 5 years ago

SHAP and LIME Python Libraries

This blog post provides a brief technical introduction to the SHAP and LIME Python libraries, followed by code and output to highlight a few pros and cons | Continue reading


@blog.dominodatalab.com | 5 years ago

Making PySpark Work with SpaCy: Overcoming Serialization Errors

In this guest post, Holden Karau, Apache Spark Committer, provides insights on how to use spaCy to process text data. Karau is a Developer Advocate at Goog | Continue reading


@blog.dominodatalab.com | 5 years ago

Collaboration Between Data Science and Data Engineering: True or False?

This blog post includes candid insights about addressing tension points that arise when people collaborate on developing and deploying models. Domino’s Hea | Continue reading


@blog.dominodatalab.com | 5 years ago

Growing Data Scientists into Manager Roles

In this post, Ricky Chachra, Research Science Manager at Lyft, provides insight for companies looking to home-grow their promising individual contributors | Continue reading


@blog.dominodatalab.com | 5 years ago

Domino 3.0: New Features and User Experiences to Help the World Run on Models

This blog post introduces new Domino 3.0 features. | Continue reading


@blog.dominodatalab.com | 5 years ago

Justified Algorithmic Forgiveness

Research, insights, and implications to consider when developing predictive risk assessment models | Continue reading


@blog.dominodatalab.com | 5 years ago

Themes and Conferences per Pacoid

Covers themes of data science for accountability, how reinforcement learning challenges assumptions, and surprises within AI and economics. | Continue reading


@blog.dominodatalab.com | 5 years ago

Trust in LIME: Yes, No, Maybe So?

Brief Overview of LIME (Local Interpretable Model-Agnostic Explanations) | Continue reading


@blog.dominodatalab.com | 5 years ago

Item Response Theory in R for Survey Analysis

In this guest blog post, Derrick Higgins, of American Family Insurance, covers item response theory (IRT) and how data scientists can apply it within a pro | Continue reading


@blog.dominodatalab.com | 5 years ago

Data Science Themes and Conferences

Introduction: New Monthly Series! Welcome to a new monthly series! I’ll summarize highlights from recent industry conferences, new open source projects, in | Continue reading


@blog.dominodatalab.com | 5 years ago

Make Machine Learning Interpretability More Rigorous

This Domino Data Science Field Note covers a proposed definition of machine learning interpretability, why interpretability matters, and the arguments for | Continue reading


@blog.dominodatalab.com | 5 years ago

Learn from the Reproducibility Crisis in Science

Key highlights from Clare Gollnick’s talk, “The limits of inference: what data scientists can learn from the reproducibility crisis in science”, are covere | Continue reading


@blog.dominodatalab.com | 5 years ago

Feature Engineering Techniques

This Domino Field Note provides highlights and excerpted slides from Amanda Casari’s “Feature Engineering for Machine Learning” talk at QCon Sao Paulo. Cas | Continue reading


@blog.dominodatalab.com | 5 years ago

Feature Engineering: A Framework and Techniques

Continue reading


@blog.dominodatalab.com | 5 years ago

Simple Worrying Stats Problems

In this guest post, Sean Owen writes about three data situations that provide ambiguous results and how causation helps clarify the interpretation of da | Continue reading


@blog.dominodatalab.com | 5 years ago

Data Science Use Cases

In this post, Don Miner covers how to identify, evaluate, prioritize, and pick which data science problems to work on next. Don is a cofounder of Miner | Continue reading


@blog.dominodatalab.com | 5 years ago

Data Science is more than Machine Learning

This Domino Data Science Field Note provides highlights and video clips from Addhyan Pandey’s Domino Data Pop-Up talk, “Leveraging Data Science in the Auto | Continue reading


@blog.dominodatalab.com | 5 years ago

Classify all the Things

Derrick Higgins of American Family Insurance presented a talk, “Classify all the Things (with multiple labels): The most common type of modeling task no on | Continue reading


@blog.dominodatalab.com | 5 years ago

Avoiding a Data Science Hype Bubble

In this post, Josh Poduska, Chief Data Scientist at Domino Data Lab, advocates for a common taxonomy of terms within the data science industry. The propose | Continue reading


@blog.dominodatalab.com | 5 years ago

On the Importance of Community-Led Open Source

Wes McKinney, Director of Ursa Labs and creator of the pandas project, presented the keynote, "Advancing Data Science Through Open Source" at Rev. McKinney's k | Continue reading


@blog.dominodatalab.com | 5 years ago

Model Evaluation

This Domino Data Science Field Note provides some highlights of Alice Zheng’s report, "Evaluating Machine Learning Models", including evaluation metrics fo | Continue reading


@blog.dominodatalab.com | 5 years ago

Data Science Models Build on Each Other

Alex Leeds presented “Building Up Local Models of Customers” at a Domino Data Science Popup. Leeds discussed how the Squarespace data science team built m | Continue reading


@blog.dominodatalab.com | 5 years ago

Humans in the Loop (2017)

This guest blog post from Paco Nathan dives into how people and machines collaborating to perform work is real and not science fiction. Paco Natha | Continue reading


@blog.dominodatalab.com | 5 years ago

Ingesting Kate Crawford’s “The Trouble with Bias”

Kate Crawford discussed bias at a recent SF-based City Arts and Lectures talk and a recording of the discussion will be broadcast, May 6th, on KQED and loc | Continue reading


@blog.dominodatalab.com | 6 years ago