Posts Tagged ‘Big Data’

Data Wrangling at Scale

November 15, 2017

Just wrote a new R article: “Data Wrangling at Scale” (using Dirk Eddelbuettel’s tint template). Please check it out.

Update on coordinatized or fluid data

November 13, 2017

We have just released a major update of the cdata R package to CRAN. If you work with R and data, now is the time to check out the cdata package. Among the changes in the 0.5.* version of the cdata package: all coordinatized data or fluid data operations are now in the cdata package (no …

Another day, another fake data scandal… This time, it’s Twitter

October 26, 2017

Kaiser Fung, author of Numbersense and founder of Principal Analytics Prep, sees a crisis of confidence in self-reported non-financial metrics by technology upstarts. He calls for third-party auditing.

Some Announcements

October 24, 2017

Dr. Nina Zumel will be presenting “Myths of Data Science: Things you Should and Should Not Believe” on Sunday, October 29, 2017, 10:00 AM to 12:30 PM, at the She Talks Data Meetup (Bay Area). ODSC West 2017 is coming up soon. It is our favorite conference and we will be giving both a workshop and …

Analysts must reckon with the fake data menace

October 9, 2017

Kaiser Fung, founder of Principal Analytics Prep, comments on the fake data and fraud problem in digital advertising, and calls on data scientists and analysts to rise to the challenge.

Q&A with NBA Hackathon winner

October 2, 2017

After the NBA Hackathon (see report here), I caught up with the winning team in the business analytics competition, DataBucket, composed of Barbara Zhan and Harold Li.

Junkcharts: Congratulations on winning the business analytics competition at the NBA Hackathon. As a judge, I was very impressed by how much work you were able to do in 24 hours. Did you sleep or did you work all the way through?

DataBucket:…

My advice on dplyr::mutate()

September 22, 2017

There are substantial differences between ad-hoc analyses (be they machine learning research, data science contests, or other demonstrations) and production-worthy systems. Roughly: ad-hoc analyses have to be correct only at the moment they are run (and often once they are correct, that is the last time they are run; obviously the idea of reproducible …

Uber data collection makes news again

August 31, 2017

Kaiser Fung, founder of Junk Charts and Principal Analytics Prep, the next-gen data analytics bootcamp, discusses ethical issues concerning Uber's collection of user data from smartphone apps.

Gelman digested read

August 16, 2017

It's hard to keep up with Andrew Gelman, so let me point to some interesting recent posts from his blog.

Readings on philosophy of statistics (link): Andrew has a bunch of links to (mostly his own) writings about deep statistical issues. Science is about understanding how the world works, which involves questions of cause and effect, and randomness and unexplained variability. Data that can be observed are almost never sufficient…

Did web scraping just receive a legal boost?

August 15, 2017

Kaiser Fung, founder of Principal Analytics Prep and author of Numbersense, discusses a recent legal ruling against LinkedIn's technologies that restrict web scraping.
