Tags: Personal Tools and Resources Statistics IPython Data Analysis Book Video Ice Cream Crimes Curriculum Think Stats Terms Distribution Exposition Inferential Stats Visualization Ethics Data Expert Theory Bayesian Research Paper Future DataSci for Good Machine Learning TensorFlow Data Points

### Data Points: Books

##### 2017-06-19

In this installment of Data Points, I include several books that have been instrumental in shaping my career and life in general.

### Cargo Cult Science

##### 2017-06-02

In his 1974 Caltech Commencement Address, Richard Feynman speaks on scientific integrity, which equally applies to practitioners of data science.

### Data Points: Publications

##### 2017-04-17

I'm starting a new monthly series called Data Points where, instead of taking a deep dive into a topic, I touch on several external sources that speak to me, some of which will hopefully have an impact on you as well.

### A New Job Direction

##### 2017-04-03

This post includes a few details about a new job I've accepted, and what that means for this blog.

### TensorFlow Dev Summit 2017, Part 2

##### 2017-03-20

This post is Part 2 of my survey of the 2017 TensorFlow Dev Summit. In this round of videos, there are some impressive applications of TensorFlow.

### TensorFlow Dev Summit 2017, Part 1

##### 2017-03-06

After the TensorFlow introduction in my last post, I thought it would be interesting to take stock of where the community currently stands, based on its 2017 developer summit.

### Intro to TensorFlow

##### 2017-02-20

This week we dip our toes into Google's open source machine learning library and get our hands dirty with an example program. Wait, so are we washing our feet or dirtying our hands? Oh, never mind.

### Medicine Effectiveness Experiment

##### 2017-01-30

Two weeks ago I posted an experiment, and in this post I'm going to analyze the results.

### A Quick Experiment

##### 2017-01-16

For the first time on the blog, I'm actively collecting experimental data which I'll analyze in a future post.

### The Year Ahead

##### 2017-01-02

I'm excited for 2017 for a number of reasons. At the top of that list are a few new opportunities I'll update you on here.

### The Santy Claus Problem

##### 2016-12-19

In this seasonal take on the classic Monty Hall problem, we'll look at yet another paradoxical result of probability in action.

### Finals

##### 2016-12-05

As I enter the last week of my first semester of the OMSCS program, I have a little perspective to offer.

### Improving a Car Theft Visualization

##### 2016-11-21

I ran across a visualization the other day that I thought could use some improving.

### Data Visualization at Netflix, Part 2

##### 2016-11-07

In this post, I continue with the Netflix interviews from a previous post to get a deeper look at how data analysis and visualization come into play at the company.

### The Thinking Eye

##### 2016-10-24

Edward Tufte gave a talk at Art Center College of Design on some of his work that hasn't been published yet. Tufte is an incredible visionary, and this talk doesn't disappoint.

### Data in Science

##### 2016-10-03

Data Science and the traditional sciences are similar in so many ways, and communication between the two is vital to both communities.

### Lessons from my Online Masters

##### 2016-09-19

Taking grad courses online isn't what I was expecting, but a few weeks in, I think I already have some advice for anyone considering it.

### Data Visualization at Netflix

##### 2016-09-05

A look into how data science is used at Netflix, a company that relies heavily on data-driven functionality.

### OMSCS: First Semester

##### 2016-08-29

My first semester of grad school has officially started. In this update, I'll talk about my experience so far, and how it might affect this blog.

### Crime Fighting Algorithms

##### 2016-08-22

Data and machine learning have the potential to save human lives in a variety of contexts, but in every such instance, ethical concerns are raised as well.

### Cases of Deanonymization

##### 2016-08-15

The mantra of big data is 'more is more', but this sentiment must be tempered with a respect for privacy, in my opinion. In this post, I'll look at some cases where identities were exposed not by malice but by lack of rigor.

### Entropy and the Central Limit Theorem

##### 2016-08-08

The Central Limit Theorem is extremely fundamental to statistics, but it's so fundamental that it pops up in other places, like physics, too.

### Practicing Algorithms

##### 2016-08-01

Algorithmic problem-solving skills are crucial to data science and, as such, deserve constant sharpening.

### Simpson's Paradox

##### 2016-06-27

People say 'You can make statistics say anything', but that's only true if you don't know how to spot the warning signs of bad statistics.

### Local Data Projects

##### 2016-06-20

Lately, I've been attending local meetups for civically minded data science projects. The one I attended last week had amazing projects and presenters.

### Notes from 'The Visual Display of Quantitative Information'

##### 2016-06-13

I've been studying the incomparable works on visualization by Edward Tufte, and I'm sharing my notes here along with some general self-study tips.

### Data in Civic Tech

##### 2016-06-06

The best way to enter into the world of Data Science is to practice Data Science. A great way to get involved (and to make a real difference in the world) is to join the civic tech community in your area.

### The Intelligence Age

##### 2016-05-30

Human progress can often be grouped into phases, from the Stone Age to the Iron Age to the Information Age and beyond. We may be on the brink of another technological revolution powered by AI.

### Data Ethics and Privacy

##### 2016-05-23

This post continues the concepts from previous data privacy posts, focusing on the perspective of Eleanor Saitta, Etsy's new Security Architect.

### Experimenting with Bayes

##### 2016-05-16

The statistics covered in previous posts have mostly been Frequentist. This post goes into the basics of Bayesian statistics with a look at experimental design.

### Bias in Supervised Machine Learning

##### 2016-05-09

Algorithmic bias can pop up in unexpected places if you don't safeguard against it.

### Numeric Calculation Workflow

##### 2016-05-02

A Data Scientist is often concerned with optimization problems, so when I find a great workflow for getting a task done efficiently, I immediately want to incorporate it into my process.

### Online Masters in Computer Science

##### 2016-04-25

As I wrap up my Udacity Data Analyst Nano Degree, I look forward to my next steps as a Data Journeyman.

### U.S. Legislative Process Visualization

##### 2016-04-18

A visualization project I did for my Udacity nano degree program.

### Jen Christiansen's Four Visualization Lenses

##### 2016-04-11

The senior graphics editor at Scientific American magazine, Jen Christiansen, has four rules of thumb for when a data visualization is appropriate.

### Newsletters

##### 2016-04-04

I like to stay up to date on data science news, and a great way to do so is through newsletters. I have a couple of tips for managing them.

### HemoVis: A Visualization Case Study

##### 2016-03-28

My visualization posts so far have covered a lot on the theory side of things. Armed with that background of theory, we can appreciate a very inspiring case where data visualization actually saved lives.

### Preattentive Processing

##### 2016-03-21

Tying together concepts from the past three posts, the principles of preattentive processing give your visualization that extra punch that can make communication with your audience more effective.

### Color

##### 2016-03-14

Color is one of the most misused visual encodings, so I'm dedicating an entire post to its dos and don'ts.

### Visual Encodings

##### 2016-03-07

Visual encodings are the building blocks of data visualizations, so before we go any further with visualization posts, we need to go over them.

### Data Visualization

##### 2016-02-29

This introductory post to data visualization will be the first of a several-week series on the subject.

### Transparency Versus Privacy

##### 2016-02-21

As proponents of the data revolution will often say, more data is always better. But is this actually the case?

### Uncertainty in Data Science

##### 2016-02-15

What is the role of uncertainty in data science? It definitely needs to be part of the equation. Um...probably.

### Being Scientifically Minded

##### 2016-02-08

I take an introspective look at what the scientist part of data scientist really means in terms of one's personal worldview.

### Correlation Nation

##### 2015-11-23

'Correlation does not imply causation' is a data science mantra, but in this post I take a look at another problem with reports of correlations.

### Pivot

##### 2015-11-16

After many weeks of not posting and much consideration, I'm taking this blog in a new direction.

### Looking Ahead

##### 2015-07-06

I'm making a personal shift in my data science studies, and we're making a curriculum shift as we move on to explore Bayesian statistics.

### Set Theory and Infinity

##### 2015-06-15

Taking a quick break from our statistics curriculum, let's dive into the world of theoretical set theory. Don't worry, our previously scheduled programming will continue next week when we start tackling Bayesian Statistics.

### The Illusion of Causality Analysis

##### 2015-06-01

We previously looked at a paper that showed a concept called the illusion of causality. Now that we have the tools to check the results of that paper, we're going to do just that.

### ANOVA

##### 2015-05-25

This week we wrap up our Inferential Statistics course with a look at Analysis of Variance (ANOVA), a very common technique to test the relationship of outcomes among multiple groups.

### Chi-Squared Tests

##### 2015-05-18

We've done some hypothesis tests for normal distributions (and t distributions when appropriate). Now we'll look at how we can use the chi-squared distribution to perform hypothesis tests on other distributions.

### Comparing Populations

##### 2015-05-11

We've looked at confidence intervals and hypothesis tests by comparing a sample to an entire population (with well defined parameters), but what changes if you replace that population with another sample?

### The Button: An Alternate Timeline

##### 2015-04-26

A simple April Fool's Day stunt turned into a fascinating social experiment, but was derailed weeks into it by technical issues. I'll offer an analysis that will hopefully bring closure to those who were invested in the experiment.

### Hypothesis Testing

##### 2015-04-13

We will look at hypothesis testing by way of an example problem.

### Confidence Intervals

##### 2015-04-06

A look at the concept of confidence intervals (as opposed to point estimators) by way of several examples, including one that introduces a new distribution, the t distribution.

### Basic Discrete Distributions

##### 2015-03-30

A quick look at some common discrete distributions.

### The Illusion of Causality

##### 2015-03-24

Incorrectly recognizing a relationship as causal is so hard-wired into the human psyche that experts have given it a name: causal illusion.

### Inferential Statistics

##### 2015-03-16

Moving on from *Think Stats*, we'll apply many of its probability concepts by looking into the Inferential Statistics curriculum on Khan Academy.

### Correlation

##### 2015-03-09

This week we wrap up our studies on the *Think Stats* book with the subject of its final chapter, correlation.

### The German Tank Problem

##### 2015-03-02

With the concept of estimators in hand, we'll take on an actual wartime usage of the concept.

### Estimation

##### 2015-02-23

The idea of estimating a distribution's parameters is often glossed over, but it's important to know the difference between an estimated parameter and a true parameter.

### Wikipedia With Academic Styling

##### 2015-02-16

Wikipedia is an unrivaled source of information, but who says you can't have brains and looks?

### Texas Sharpshooter Fallacy

##### 2015-02-09

Not all misrepresentations of data are of malicious intent. Sometimes a misrepresentation arises from a lack of due diligence.

### The Distribution Within

##### 2015-02-02

If you've been reading along and are convinced of the value of modeling your data with a well-defined distribution, then understanding how to know which distribution is a good fit for your data is the important next step.

### Basic Continuous Distributions

##### 2015-01-26

Examining well-known probability distributions will give us a lot to chew on in our path to understanding data. We'll start off with four common continuous distributions.

### Why Data Science Will Power the Future?

##### 2014-11-13

Udacity's Co-Founder and CEO, Sebastian Thrun, and VP Engineering and Data Science, Nitin Sharma, address some aspects of this question.

### Probability and Statistics Terms

##### 2014-11-06

Starting out with *Think Stats* will require us to cover the preliminary definitions that the book covers.

### Think Stats and the Data Journeyman Curriculum

##### 2014-10-29

I'm starting a Curriculum tag to track a linear and (hopefully) fairly complete progression of data science skills and knowledge, starting with a statistics refresher book.

### Misrepresented Healthcare Poll Data

##### 2014-10-11

When I found myself in the weeds of the comment section of a politically charged opinion piece, I couldn't stand by and let some misleading data representations slide.

### Chocolate Consumption and Cognitive Function

##### 2014-10-03

Dr. Messerli takes some liberties in his paper comparing chocolate consumption to cognitive abilities and reaches a dubious conclusion.

### Finding Data Sets

##### 2014-09-28

Finding good (and preferably free) data sets online can be challenging, but hopefully this list will help you jump-start your next data project.

### A Failed Experiment

##### 2014-09-23

Trying to build a case for the claim "Ice Cream Sales and Violent Crime Rates are positively correlated" proves to be harder than I expected.

### Ice Cream Crimes

##### 2014-09-16

I'm kicking off a new series of posts to point out incorrectly applied data analysis techniques.

### Thinking With Data

##### 2014-09-12

In this post I'll take a look at Max Shron's book *Thinking With Data: How to Turn Information into Insights*.

### Birthday Paradox

##### 2014-09-06

In honor of my birthday weekend, here's a look at one of the first unintuitive statistics results I ever encountered, which just so happens to deal with birthdays.

### Data Tools

##### 2014-08-31

An overview of the tools that are used throughout this blog.

### Data Journeyman

##### 2014-08-28

In this prefatory post, I will answer several starting questions.

Who am I? What will this blog cover? What is a Data Journeyman, anyway?