Monthly Archive: November 2014

The Long Tail

http://archive.wired.com/wired/archive/12.10/tail_pr.html In 1988, a British mountain climber named Joe Simpson wrote a book called Touching the Void, a harrowing account of near death in the Peruvian Andes. It got good reviews but, only a modest success, it was soon forgotten. … Continue reading

Posted in Uncategorized | Leave a comment

Scalable Machine Learning With Apache Spark and MLBase

http://www.glassbeam.com/scalable-machine-learning-apache-spark-mlbase/

Distributed Machine Learning using MLbase

In this talk, Ameet Talwalkar and Evan Sparks describe their efforts, as part of the MLbase project, to develop a distributed machine learning platform on top of Spark.

Boss Competence and Worker Well-being

http://www.andrewoswald.com/docs/NovArtzGoodallOswald2014.pdf This study offers some of the first formal evidence that a boss’s technical competence is the single strongest predictor of a worker’s well-being. Sapienti sat.

K-means Clustering in Spark with Categorical Variables

A useful trick for incorporating categorical variables into k-means clustering in Spark is to encode those variables as boolean indicators. In statistics, a boolean indicator (also known as a dummy variable, indicator variable, categorical variable, or binary variable) is one that … Continue reading
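
As a minimal sketch of the indicator encoding described above, the plain-Python snippet below expands one categorical column into 0/1 columns so that every row becomes a purely numeric vector of the kind k-means expects. It deliberately avoids the Spark API; the helper name `one_hot_encode` and the sample rows are hypothetical, chosen only for illustration.

```python
def one_hot_encode(rows, cat_index):
    """Replace the categorical value at cat_index with 0/1 indicator columns.

    Hypothetical helper for illustration; not part of Spark or MLlib.
    """
    # Fix a stable column order for the categories seen in the data.
    categories = sorted({row[cat_index] for row in rows})
    encoded = []
    for row in rows:
        # Keep the numeric columns as floats, dropping the categorical one.
        numeric = [float(v) for i, v in enumerate(row) if i != cat_index]
        # One indicator per category: 1.0 if the row has it, else 0.0.
        indicators = [1.0 if row[cat_index] == c else 0.0 for c in categories]
        encoded.append(numeric + indicators)
    return categories, encoded


# Made-up sample rows: (age, income, colour)
rows = [(25, 40000, "red"), (31, 52000, "blue"), (47, 61000, "red")]
categories, vectors = one_hot_encode(rows, cat_index=2)
```

Vectors built this way could then be handed to a k-means implementation. In practice the numeric columns usually need rescaling first, otherwise large values such as income are likely to dominate the 0/1 indicators in the distance computation.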

Hortonworks has filed for an IPO

Hortonworks, a Silicon Valley-based open-source platform for storing and analyzing big data, this afternoon filed for a $100 million IPO, becoming the first Hadoop company to do so. It plans to trade on the Nasdaq under ticker symbol HDP. Hadoop … Continue reading

Mahalanobis Clustering

A new clustering algorithm, Mahalanobis clustering, is proposed as an improvement on traditional K-means clustering. In order to mathematically identify clusters in a data set, it is usually necessary to first define a measure of similarity or proximity which will … Continue reading
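
The excerpt stops before describing the proposed algorithm, so the following is only a sketch of the standard Mahalanobis distance as one example of a proximity measure, not the method from the linked post. It assumes 2-D points and a known 2x2 covariance matrix; the function name `mahalanobis_2d` and the sample values are hypothetical.

```python
import math


def mahalanobis_2d(x, mean, cov):
    """Mahalanobis distance of a 2-D point x from mean under 2x2 covariance cov."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    # Closed-form inverse of a 2x2 matrix.
    inv = ((d / det, -b / det), (-c / det, a / det))
    dx = (x[0] - mean[0], x[1] - mean[1])
    # Quadratic form dx^T * inv * dx.
    q = (dx[0] * (inv[0][0] * dx[0] + inv[0][1] * dx[1])
         + dx[1] * (inv[1][0] * dx[0] + inv[1][1] * dx[1]))
    return math.sqrt(q)


# Made-up cluster: wide along the first axis, narrow along the second.
mean = (0.0, 0.0)
cov = ((4.0, 0.0), (0.0, 1.0))
d_along_wide = mahalanobis_2d((2.0, 0.0), mean, cov)
d_along_narrow = mahalanobis_2d((0.0, 2.0), mean, cov)
```

Both sample points lie at Euclidean distance 2 from the mean, yet the point along the wide axis comes out closer under the Mahalanobis distance, because the measure scales each direction by the spread of the cluster in that direction.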
