(This article was originally published at Statisfaction » Statistics, and syndicated at StatsBlogs.)

Hey,

With Robin Ryder we wrote a paper titled *The Wang-Landau Algorithm Reaches the Flat Histogram in Finite Time*, and it has been accepted in the *Annals of Applied Probability* (arXiv preprint here). I’m especially happy about it since it was the last remaining unpublished chapter of my PhD thesis. In this post I’ll try to explain, on a simple example, what we proved.

Suppose you are given a target density $\pi$, and you wish to sample from it BUT with the additional constraint that you want some parts of the state space to be favoured and others to be penalized. For instance, you are given a univariate Gaussian density. It is symmetric, so if you sample directly from it, on average you will get half of your samples in the negative values and half in the positive values. Suppose that you would like to have 75% of your samples in the negative values instead. This kind of constraint can be interesting if you want to explore multimodal distributions, in which case you might want to penalize already-visited regions of the state space, in order to explore yet-unvisited regions (see this old post for instance, which was about a related paper now accepted in JCGS).

In the case of a Gaussian distribution, it is easy because you can compute the required quantities exactly. For instance, partition the state space into two regions $\mathcal{X}_1 = (-\infty, 0]$ and $\mathcal{X}_2 = (0, +\infty)$, and say you want a proportion $\theta_i$ of samples in each bin (say $\theta_1 = 0.75$ and $\theta_2 = 0.25$). You can compute for each bin

$$\psi_i = \int_{\mathcal{X}_i} \pi(x)\,\mathrm{d}x,$$

and then if you define the “biased” density as follows (up to a multiplicative constant):

$$\tilde{\pi}(x) \propto \sum_{i=1}^{2} \frac{\theta_i}{\psi_i}\, \pi(x)\, \mathbb{1}_{\mathcal{X}_i}(x),$$

then you can check that the biased density $\tilde{\pi}$ has a mass of $\theta_1$ on bin $\mathcal{X}_1$ and $\theta_2$ on bin $\mathcal{X}_2$, i.e. $\int_{\mathcal{X}_i} \tilde{\pi}(x)\,\mathrm{d}x = \theta_i$. Hence drawing from this biased density yields the desired samples. However, in general it is not possible to compute the $\psi_i$, so you need an algorithm to estimate these quantities. The Wang-Landau algorithm does this in an online manner, which means that it simultaneously produces the desired draws from $\tilde{\pi}$ and estimates the penalties $\psi_i$. It’s an iterative algorithm that you can run for as long as you want. Now we say that the Flat Histogram criterion is reached when we obtain (approximately) the desired frequencies in each bin, i.e. when the empirical proportions of samples in each bin are close to the desired ones. What we proved with Robin is that the Flat Histogram criterion is reached at a random time that has finite expectation: you’re not going to wait forever in front of your computer for the Flat Histogram criterion to be met.
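For the Gaussian example everything is available in closed form, so we can sanity-check the construction directly. Here is a small sketch in plain R (independent of any package; the 0.75/0.25 split matches the example above, and `rbiased` is just an illustrative helper name):

```r
# Desired proportions on the two bins X1 = (-inf, 0] and X2 = (0, +inf)
theta <- c(0.75, 0.25)
# Exact masses psi_i of the standard Gaussian on each bin
psi <- c(pnorm(0), 1 - pnorm(0))        # both equal to 1/2 here

# The biased density is a mixture putting weight theta_i on the Gaussian
# restricted to bin i, so we can draw from it by picking a bin with
# probability theta_i and sampling the truncated Gaussian by inversion.
rbiased <- function(n) {
  bin <- sample(1:2, n, replace = TRUE, prob = theta)
  u <- runif(n)
  ifelse(bin == 1,
         qnorm(u * pnorm(0)),                 # truncated to (-inf, 0]
         qnorm(pnorm(0) + u * (1 - pnorm(0)))) # truncated to (0, +inf)
}

set.seed(1)
x <- rbiased(1e5)
mean(x <= 0)   # close to 0.75, as desired
```

The point of the Wang-Landau algorithm is precisely to recover this behaviour when `psi` cannot be computed in closed form.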

The following plot shows the empirical proportions converging to the desired ones.
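To see what is going on under the hood, here is a stripped-down Wang-Landau iteration on the same example. This is only a sketch of the stochastic-approximation flavour of the algorithm, NOT what the PAWL package does internally; the random-walk proposal, the step-size schedule and the 0.05 tolerance are illustrative choices:

```r
# Toy Wang-Landau on the running Gaussian example.
theta  <- c(0.75, 0.25)          # desired frequencies on X1, X2
logpsi <- c(0, 0)                # running estimates of log(psi_i), up to a constant
binof  <- function(x) if (x <= 0) 1L else 2L
logpi  <- function(x) dnorm(x, log = TRUE)

set.seed(2)
niter <- 20000
x <- 0
bins <- integer(niter)
for (t in 1:niter) {
  gamma <- t^(-0.6)              # decreasing step size (illustrative schedule)
  # Metropolis step targeting pi(x) * theta_b / psi_b, with b the bin of x
  y <- x + rnorm(1)
  logratio <- (logpi(y) + log(theta[binof(y)]) - logpsi[binof(y)]) -
              (logpi(x) + log(theta[binof(x)]) - logpsi[binof(x)])
  if (log(runif(1)) < logratio) x <- y
  b <- binof(x)
  bins[t] <- b
  # stochastic-approximation update: penalize the bin just visited
  logpsi <- logpsi + gamma * ((1:2 == b) - theta)
}
prop1 <- cumsum(bins == 1) / seq_len(niter)
fh <- which(abs(prop1 - theta[1]) < 0.05)[1]  # first "Flat Histogram" time, tolerance 0.05
mean(bins == 1)                  # empirical proportion in X1, close to 0.75
```

The random time `fh` at which the running proportion first falls within the tolerance is exactly the kind of hitting time whose expectation the paper shows to be finite.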

To produce these figures you can use the PAWL package that is on CRAN. Here’s a bit of code producing both figures of this post.

library(PAWL)
library(ggplot2)   # for theme_set, ggplot, qplot
library(reshape2)  # for melt
theme_set(theme_bw())
rinit <- function(size) rep(0, size)
parameters <- list(mean = 0, sd = 1)
logdensity <- function(x, parameters){
  # truncate to (-10, 10): the log-density is -Inf outside the interval
  return(dnorm(x, parameters$mean, parameters$sd, log = TRUE) + log(x > -10 & x < 10))
}
gaussiantarget <- target(name = "gaussian", dimension = 1,
                         rinit = rinit, logdensity = logdensity,
                         parameters = parameters)
Nchains <- 1; Niterations <- 10000
proposal <- createAdaptiveRandomWalkProposal(Nchains, gaussiantarget@dimension,
                                             adaptiveproposal = FALSE)
pawlparameters <- tuningparameters(nchains = Nchains, niterations = Niterations,
                                   storeall = TRUE)
getPos <- function(points, logdensity) points
desfreq <- c(0.75, 0.25)
positionbinning <- binning(position = getPos, name = "position",
                           bins = c(-Inf, 0), desiredfreq = desfreq,
                           useLearningRate = FALSE, autobinning = FALSE)
pawlresults <- pawl(gaussiantarget, binning = positionbinning,
                    AP = pawlparameters, proposal = proposal)
chains <- ConvertResults(pawlresults)
alllocations <- positionbinning@getLocations(positionbinning@bins, pawlresults$allreaction)
locationsdf <- data.frame(cbind(cumsum(alllocations == 1) / 1:(Niterations+1),
                                cumsum(alllocations == 2) / 1:(Niterations+1),
                                1:(Niterations+1)))
names(locationsdf) <- c("proportion1", "proportion2", "iterations")
meltedlocations <- melt(data = locationsdf, id = c("iterations"))
g <- ggplot(data = meltedlocations, aes(x = iterations, y = value, colour = variable))
g <- g + geom_line() + theme(legend.position = "none")
g <- g + scale_colour_manual(values = c("red", "red"))
g <- g + geom_hline(yintercept = positionbinning@desiredfreq, col = "black", linetype = 3)
g <- g + xlab("iterations") + ylab("Proportions of visits to each bin")
print(g)
ghist <- qplot(chains[,1], geom = "histogram", binwidth = 0.3) +
  geom_vline(xintercept = 0, colour = "red", size = 3)
ghist <- ghist + xlab("X")
print(ghist)
