Editor's Note: This post is part of a series based on the research conducted in District Data Labs' NLP Research Lab. Make sure to check out NLP Research Lab Part 1: Distributed Representations.
Chances are, if you’ve been working in Natural Language Processing (NLP) or machine learning, you’ve heard of the class of approaches called . . .
How I Learned To Stop Worrying And Love Word Embeddings
Editor's Note: This post is part of a series based on the research conducted in District Data Labs' NLP Research Lab.
This post is about Distributed Representations, a concept that is foundational not only to the understanding of data processing in machine learning, but also to the understanding of information processing and storage . . .
Visualizing Text with Python
In this article, we explore two extremely powerful ways to visualize text: word bubbles and word networks. These two visualizations are replacing word clouds as the de facto text visualization of choice because they are simple to create, understandable, and provide deep and valuable at-a-glance insights. In this post, we will examine how to . . .
Overview of our Talk, Tutorial, Posters, and Sprints
Last week, a group of us from District Data Labs flew to Portland, Oregon, to attend PyCon, the largest annual gathering for the Python community. We had a talk, a tutorial, and two posters accepted to the conference, and we also hosted development sprints for several open source projects. With this blog post, we are putting everything . . .
Visual Evaluation and Parameter Tuning
Welcome back! In this final installment of Visual Diagnostics for More Informed Machine Learning, we'll close the loop on visualization tools for navigating the different phases of the machine learning workflow. Recall that we are framing the workflow in terms of the . . .
PyCon 2016 Tutorial on Sunday May 29, 2016 at 9am
This post points you to the resources you need to prepare for the NLP tutorial at PyCon this coming weekend! If you have any questions, please contact us using the directions at the end of the post.
In this tutorial, we will explore the features of the NLTK library for text processing in order to build . . .
Demystifying Model Selection
Note: Before starting Part 2, be sure to read Part 1!
When it comes to machine learning, ultimately the most important picture to have is the big picture. Discussions of (i.e. arguments about) machine learning are usually about which model is the best. Whether it's logistic regression, random forests, Bayesian methods, support vector . . .