Data Mining

Data mining blogs

Finding the closest pair in data using PROC MODECLUS

May 9, 2013

    UPDATE: Rick Wicklin kindly shared a visualization of the output that makes the results easier to interpret. Thanks. Here is the code, to be run after my code below. Note that it is designed for K=2.   proc iml...

Metric driven Agile for Big Data

April 20, 2013

Working in Bing Local Search brings together a number of interesting challenges. Firstly, we are in a moderately sized organization, which means that our org chart has some rough similarities to our high-level system architecture. This means that we...

CFP: the 11th Australasian Data Mining Conference (AusDM 2013), submission due 15 July

April 3, 2013

The 11th Australasian Data Mining Conference (AusDM 2013)
Canberra, Australia, 13-15 November 2013
http://ausdm13.togaware.com
Join us on LinkedIn: http://www.linkedin.com/groups/AusDM-4907891

Data mining, the art and science of intelligent analysis of (usually large) data sets for meaningful (and previously unknown) …

Large Scale Linear Mixed Model

March 26, 2013

(Update at the end.) Bob at r4stats.com claimed that a linear mixed model with over 5 million observations and 2 million levels of random effects was fit using the lme4 package in R. I am always interested in large scale mi...

Call for participation: DMApps 2013 – an International Workshop on Data Mining Applications in Industry and Government

March 10, 2013

Call for participation: DMApps 2013 – an International Workshop on Data Mining Applications in Industry and Government, held in conjunction with PAKDD 2013, Gold Coast, Australia, April 14, 2013. Website: http://dmapps2013.rdatamining.com To attend the workshop, you need to register for PAKDD 2013 …

Poor man’s HPQLIM?

February 27, 2013

The Tobit model is a type of censored regression and is one of the most important regression models you will encounter in business. Amemiya (1984) classified Tobit models into 5 categories, and interested readers can refer to the SAS online documentation for details. In SAS...

O Knowledge Graph, Where Art Thou?

February 11, 2013

The web search community, in recent months and years, has heard quite a bit about the 'knowledge graph'. The basic concept is reasonably straightforward - instead of a graph of pages, we propose a graph of knowledge where the nodes...

Participation and Observation in Search

February 9, 2013

The early days of web search were essentially about observation. The web search engine observed the web (documents, links and user behaviours) and then delivered results based on those observations. In recent years we have started to see more of...

Better Beaches

January 27, 2013

Having recently returned from a trip to Kauai where I used my beach search engine with middling success, I've now got a few updates out on the site. Firstly, there is a full map showing either all the beaches in...