Word embeddings in 2017: Trends and future directions

Table of contents: Subword-level embeddings; OOV handling; Evaluation; Multi-sense embeddings; Beyond words as points; Phrases and multi-word expressions; Bias; Temporal dimension; Lack of theoretical understanding; Task and domain-specific embeddings; Embeddings for multiple languages; Embeddings based on other contexts. The word2vec method based on skip-gram with negative sampling (Mikolov et al., 2013) [49] was published in […]

Using The Newspaper Library to Scrape News Articles

This post is the first of a two-part series in which we apply NLP techniques to analyze articles about big data, data science, and AI. If you are tired of the hassles of web scraping, then this post might be just for you. I occasionally scrape news articles from the web for NLP/data science projects, such […]

Multi-Task Learning Objectives for Natural Language Processing

In a previous blog post, I discussed how multi-task learning (MTL) can be used to improve the performance of a model by leveraging a related task. Multi-task learning consists of two main components: a) the architecture used for learning and b) the auxiliary task(s) that are trained jointly. Both facets still have a lot of room […]

Enhancing Customer Experience with Natural Language Processing

Processing language into actionable components is the future of communication. "If you talk to a man in a language he understands, that goes to his head. If you talk to him in his language, that goes to his heart." (Nelson Mandela) I would venture to guess that most people had their first encounter with […]

Applying Deep Learning to natural language processing

Language is the medium that humans use for conversing. Giving machines the ability to learn human language through natural language processing has given rise to several new products and possibilities that were not previously imaginable. Natural language processing (NLP) is one of the most important technologies of the information age. Understanding complex language utterances […]

If I loved Natural Language Processing less, I might be able to talk about it more

In my last post, I did some natural language processing and sentiment analysis for Jane Austen’s most well-known novel, Pride and Prejudice. It was just so much fun that I wanted to extend some of that work and compare across her body of writing. I decided to make an R package for her texts, for […]

Stupid word games

Today, Jeroen Ooms announced the appearance on CRAN of an R package for language detection, wrapping the "CLD2" compact language detector. Obviously, given a tool like that on a holiday long weekend, my first reaction was to try to confuse it. Two fun games to play with a language detector: Find an obviously English sentence (ideally […]

You Must Allow Me To Tell You How Ardently I Admire and Love Natural Language Processing

It is a truth universally acknowledged that sentiment analysis is super fun, and Pride and Prejudice is probably my very favorite book in all of literature, so let’s do some Jane Austen natural language processing. Project Gutenberg makes e-texts available for many, many books, including Pride and Prejudice which is available here. I am using […]

Introduction to Natural Language Processing with NLTK

Hello all and welcome to the second of the series, NLP with NLTK. The first of the series can be found here, in case you missed it. In this article we will talk about basic NLP concepts and use NLTK to implement them. Contents: Corpus; Tokenization/Segmentation; Frequency Distribution; Conditional Frequency Distribution; Normalization; Zipf’s law […]