Data Science for Good, Part 2

Introduction: This is the second of a three-article series about Data Science for Good. This article introduces people, organizations, and projects that use data science for good. The first article explained what the idea of data science is about and how you can get involved in it. The third and last article will discuss resources […]

CatBoost: Yandex’s machine learning algorithm is available free of charge

Russia’s Internet giant Yandex has launched CatBoost, an open source machine learning service. The algorithm has already been integrated by the European Organization for Nuclear Research to analyze data from the Large Hadron Collider, the world’s most sophisticated experimental facility. Machine learning helps make decisions by analyzing data and can be used in many different […]

Open Source is one of the engines of the world’s economy and culture. Its next iteration will be bigger.

Once upon a time, the very concept of Open Source was absurd, and only its proponents ever thought it could be other than marginal. Important software could only be built and supported by sophisticated businesses, an expensive industrial component whose blueprints, the source code, were extremely valuable. But Open Source won. It became clear, to […]

2017 ODSC West Data Science Award: pandas

The ODSC team was delighted to present the second Outstanding Data Science Project Award to ‘Pandas’ at ODSC West on November 3rd. Why ODSC gives these awards… Most data scientists/developers use an open source language, tool, software or platform daily. All of these resources are available because of their contributors, who have dedicated countless […]

An Open Source Triple Feature

Editor’s note: The following three experts shared their industry insight at OpenSec2017. Jen Andre, founder and CEO of Komand. At Komand, Jen empowers security teams to focus on efficient incident response and decision making by offering the automation of manual tasks, and a space to share this automation and knowhow with the wider security […]

Supporting Users in Open Source

What are the social expectations of open source developers to help users understand their projects? What are the social expectations of users when asking for help? As part of developing Dask, an open source library with growing adoption, I directly interact with users over GitHub issues for bug reports, StackOverflow for usage questions, a mailing […]

Quantifying Productivity

I’m always on the lookout for interesting datasets to collect, analyze and interpret. And what better dataset to collect/analyze than the meta-dataset of my own activity collecting/analyzing other datasets? How much time do I *really* spend working per day? How do I spend most of that time? What makes me productive? These are all relatively […]

Open Source, the Work of Thousands

When I was helping to develop the MVS/XA mainframe operating system at IBM in the 1980s, we had a disciplined process for software development. We knew that a bug fixed in requirements was a hundred times cheaper than if we repaired it after it was out in the field. So we were careful and diligent, […]

Praveen Seluka – Is Apache Spark Ready for the Cloud?

Abstract: Since it was open sourced in 2010, Apache Spark has grown to become one of the largest open source communities in big data, with over 200 contributors from more than 50 organizations. Apache Spark, unlike MapReduce, is all about performing sophisticated analytics at lightning-fast speed. According to stats on Apache.org, Spark can “run […]