Blog Archives

A quick introduction to Apache Spark for statisticians

February 8, 2017

Apache Spark is a Scala library for analysing "big data". It can be used for analysing huge (internet-scale) datasets distributed across large clusters of machines. The analysis can be anything from the computation of simple descriptive statistics associated with the datasets, through to rather sophisticated machine learning pipelines involving data pre-processing, transformation, nonlinear model …

Books on Scala for statistical computing and data science

December 22, 2016

People regularly ask me about books and other resources for getting started with Scala for statistical computing and data science. This post will focus on books, but it’s worth briefly noting that there are a number of other resources available, on-line and otherwise, that are also worth considering. I particularly like the Coursera course …

Scala for Data Science [book review]

December 22, 2016

This post will review the book: Scala for Data Science, Bugnion, Packt, 2016. Disclaimer: This book review has not been solicited by the publisher (or anyone else) in any way. I purchased the review copy of this book myself. I have not received any benefit from the writing of this review. On this blog …

Working with SBML using Scala

December 17, 2016

The Systems Biology Markup Language (SBML) is an XML-based format for the representation and exchange of biochemical network models. SBML is supported by most systems biology modelling tools, allowing a model to be exported in SBML from one tool and then read into another. Because it offers a standard way of representing biochemical networks …

A scalable particle filter in Scala

July 22, 2016

Many modern algorithms in computational Bayesian statistics have at their heart a particle filter or some other sequential Monte Carlo (SMC) procedure. In this blog I’ve discussed particle MCMC algorithms, which use a particle filter in the inner loop in order to compute a (noisy, unbiased) estimate of the marginal likelihood of the data. These …

First steps with monads in Scala

April 15, 2016

In the previous post I gave a quick introduction to some important concepts in functional programming, such as HOFs, closures, currying and partial application, and hopefully gave some insight into why these concepts might be useful in the context of scientific computing. Another concept that is very important in modern functional programming is that …

HOFs, closures, partial application and currying to solve the function environment problem in Scala

November 16, 2015

Functional programming (FP) is a programming style that emphasises the use of referentially transparent pure functions and immutable data structures. Higher order functions (HOFs) tend to be used extensively to enable a clean functional programming style. A HOF is just a function that either takes a function as an argument or returns a function. …

Data frames and tables in Scala

August 21, 2015

To statisticians and data scientists used to working in R, the concept of a data frame is one of the most natural and basic starting points for statistical computing and data analysis. It always surprises me that data frames aren’t a core concept in most programming languages’ standard libraries, since they are essentially a …

Calling R from Scala sbt projects using rscala

August 15, 2015

In the previous post I showed how the rscala package (which has replaced the jvmr package) can be used to call Scala code from within R. In this post I will show how to call R from Scala code. I have previously described how to do this using jvmr. This post is really just …
