EXPLORATORY ANALYSIS – WHEN TO CHOOSE R, PYTHON, TABLEAU OR A COMBINATION

Not all data analysis tools are created equal. Recently, I started looking into data sets to compete in Go Code Colorado (check it out if you live in CO). The problem with such diversity in data sets is finding a way to quickly visualize the data and do exploratory analysis. While tools like Tableau make data visualization […]

Streaming in Python ...

This work is supported by Continuum Analytics, and the Data Driven Discovery Initiative from the Moore Foundation. This blogpost is about experimental software. The project may change or be abandoned without warning. You should not depend on anything within this blogpost. This week I built a small streaming library for Python. This was originally an exercise to help me […]

On Taking Things Too Seriously: Holiday Edition

For some reason Atlanta got a pretty significant amount of snow yesterday, and because of that I’ve been mostly stuck at home. When faced with that kind of time on hand, sometimes I spend too much time on things that don’t really matter all that much. Recently, I’ve been fascinated with rating systems (see a […]

Ripyr: Sampled Metrics on Datasets Using Python’s Asyncio

Today I’d like to introduce a little Python library I’ve toyed around with here and there for the past year or so, ripyr. Originally it was written just as an excuse to try out some newer features in modern Python: asyncio and type hinting. The whole package is type hinted, which turned out to be […]

2 Ways to Implement Multinomial Logistic Regression in Python

Logistic regression is one of the most popular supervised classification algorithms. It is mostly used for solving binary classification problems, and many people believe the myth that it is only useful for binary classification. That is not true: logistic regression can also be used to solve multi-class classification problems. So in this article, you are going to […]

What is knyfe?

Knyfe is a Python utility for rapid exploration of datasets. Use it when you have some kind of dataset and you want to get a feel for how it is composed, run some simple tests on it, or prepare it for further processing. The great thing about knyfe is that you don’t have to know […]

Classifying segmented strokes as characters – Part 3 of an XKCD font saga

In part two of my XKCD font saga I was able to separate strokes from the XKCD handwriting dataset into many smaller images. I also handled the easier cases of merging some of the strokes back together – I particularly focussed on “dotty” or “liney” type glyphs, such as i, !, % and =. Now […]

Segment, extract, and combine features of an image with SciPy and scikit-image – Part 2 of an XKCD font saga

In part one of my XKCD font saga I gave some background on the XKCD handwriting dataset, and took an initial look at image segmentation in order to extract the individual strokes from the scanned image. In this installment, I will apply the technique from part one, and attempt to merge together strokes to […]

Python as a way of thinking

This article contains supporting material for this blog post at Scientific American. The thesis of the post is that modern programming languages (like Python) are qualitatively different from the first generation (like FORTRAN and C), in ways that make them effective tools for teaching, learning, exploring, and thinking. I presented a longer version of this argument […]