It is possible to calculate the mean and the standard deviation of a serie of double values while iterating only once on the whole set of data. This approach is thus also suitable for streaming data; it could update the mean and standard deviation on-the-fly.

Here I present an implementation of the algorithm shown by Mark Hoemmen in his paper Computing the standard deviation efficiently as a static Java method. The values are provided by an `Iterator`

. Note that in Java 5+ the iterator could directly force the objects to implement the interface. Here I have chosen to delegate the extraction of the double value to an extractor object (I’m using it on Java 1.4).

[sourcecode lang=”java”]

/**

* A method to calculate the mean and standard deviation of a series

* of double. The objects are provided by an {@link Iterator} and the

* double value is extracted using the passed extractor.

* It skips <code>NaN</code> values.<br>

*

* Returns a <code>double[3]</code> where <br>

* [0] is the number of values used in the mean<br>

* [1] contains the mean value of the series<br>

* [2] contains the standard deviation (sigma) of the complete

* population<br>

* Returns <code>NaN</code> for the mean and std dev if no valid value

* could be found in the series.<br>

*

* Algorithm taken from "Computing the standard deviation efficiently"

* by Mark Hoemmen

* (http://www.cs.berkeley.edu/~mhoemmen/cs194/Tutorials/variance.pdf)

*

* @return the number of values, mean and std-dev of the serie

*

*/

public static double[] getMeanAndStdDev(Iterator it, DoubleExtractor e){

while(it.hasNext()){// while initial value is NaN, try next

double xk = e.doubleValue(it.next());

if(Double.isNaN(xk)){

continue;

}

int k = 1;

double Mk = xk;

double Qk = 0;

while(it.hasNext()){

xk = e.doubleValue(it.next());

if(Double.isNaN(xk)){

continue;

}

k++;

double d = xk-Mk; // is actually xk – Mk-1,

// as Mk was not yet updated

Qk += (k-1)*d*d/k;

Mk += d/k;

}

return new double[]{k,Mk,Math.sqrt(Qk/k)};

}

return new double[]{0,Double.NaN,Double.NaN};

}

[/sourcecode]

using the following interface

[sourcecode lang=”java”]

/**

* An object which can return a double representation of passed objects.

*

* @author Philipp Buluschek

*

*/

public interface DoubleExtractor {

/**

* @return the double representation of the passed object.

*/

public double doubleValue(Object o);

}

[/sourcecode]