Archive

Posts Tagged ‘code’

Calculating the mean and standard deviation in one pass

October 27th, 2009

It is possible to calculate the mean and the standard deviation of a serie of double values while iterating only once on the whole set of data. This approach is thus also suitable for streaming data; it could update the mean and standard deviation on-the-fly.

Here I present an implementation of the algorithm shown by Mark Hoemmen in his paper Computing the standard deviation efficiently as a static Java method. The values are provided by an Iterator. Note that in Java 5+ the iterator could directly force the objects to implement the interface. Here I have chosen to delegate the extraction of the double value to an extractor object (I’m using it on Java 1.4).

	/**
	 * A method to calculate the mean and standard deviation of a series 
	 * of double. The objects are provided by an {@link Iterator} and the 
	 * double value is extracted using the passed extractor.
	 * It skips <code>NaN</code> values.<br>
	 * 
	 * Returns a <code>double[3]</code> where <br>
	 * [0] is the number of values used in the mean<br>
	 * [1] contains the mean value of the series<br>
	 * [2] contains the standard deviation (sigma) of the complete 
	 * population<br>
	 * Returns <code>NaN</code> for the mean and std dev if no valid value 
	 * could be found in the series.<br>
	 * 
	 * Algorithm taken from "Computing the standard deviation efficiently" 
	 * by Mark Hoemmen 
	 * (http://www.cs.berkeley.edu/~mhoemmen/cs194/Tutorials/variance.pdf)
	 * 
	 * @return the number of values, mean and std-dev of the serie
	 * 
	 */
	public static double[] getMeanAndStdDev(Iterator it, DoubleExtractor e){
		while(it.hasNext()){// while initial value is NaN, try next
			double xk = e.doubleValue(it.next());
			if(Double.isNaN(xk)){ 
				continue;
			}
			int k = 1;
			double Mk = xk;
			double Qk = 0;
			while(it.hasNext()){
				xk = e.doubleValue(it.next());
				if(Double.isNaN(xk)){
					continue;
				}
				k++;
				double d = xk-Mk; // is actually xk - Mk-1, 
				                  // as Mk was not yet updated
				Qk += (k-1)*d*d/k;
				Mk += d/k;
			}
			return new double[]{k,Mk,Math.sqrt(Qk/k)};
		} 
		return new double[]{0,Double.NaN,Double.NaN};
	}

using the following interface

/**
 * An object which can return a double representation of passed objects.
 * 
 * @author Philipp Buluschek
 *
 */
public interface DoubleExtractor {
	
	/**
	 * @return the double representation of the passed object.
	 */
	public double doubleValue(Object o);
}

Uncategorized , , ,

Getting the number of dimensions of a multidimensional array in Java

January 21st, 2009

Based on my previous post about cloning multidimensional arrays, I have come up with a short static function to get the number of dimensions which a multidimensional array has.

Note that this method only returns the number of dimensions of the original object passed as argument. It does not expand into the contained objects themselves. Example:

Object[] a = new Object[3][0];
System.out.println(getNumberOfDimensions(a));

prints 2.

Object[] a = new Object[1];
a[0] = new float[2][5][1];
System.out.println(getNumberOfDimensions(a));

prints 1.

Here’s the code:

/** 
 * Returns the number of dimensions which a multidimensional 
 * array has as a positive integer. For example, returns 3 
 * when passed a <code>float[][][]</code> 
 * @param src the multidimensional array 
 * @return the number of dimensions of src 
 * @author Philipp Buluschek bulu@romandie.com 
 */  
public static int getNumberOfDimensions(Object[] src){  
  int dim = 1;  
  Class cl = src.getClass().getComponentType();  
  while(cl.isArray()){  
    dim++;  
    cl = cl.getComponentType(); // will not return null as we tested isArray() above  
  }  
  return dim;  
}  

java , , , ,

Cloning multidimensional arrays in Java

January 13th, 2009

In Java, calling the clone() method on an array returns a copy of the array, that is a new array containing references to the same objects as the source array. In particular, the objects themselves are not copied. As multidimensional arrays are just arrays of arrays, cloning a multidimensional array result in only the first dimension being copied, the array of the second and following dimensions are the same in the source and the destination array.

Below I present a static method to get a real clone of a multidimensional array, meaning one where all dimensions are copied, but not the objects. I use it for example, if I have a private multidimensional array in a class, and want to provide an accessor for it. To assure that the array which is returned by the accessor is totally independant of the one inside the class, all dimensions must be copied correctly (defensive copy).

Note that this implementation does not support arrays containing themselves. This will result in an infinite recursion and eventually a stack overflow.

So without further discussion, here’s the code (compatible JRE 1.3)

/** 
 * Makes a copy of all dimensions of a multidimensional 
 * array. This function does not copy the actual objects, but 
 * only the references. The behavior is the same as {@link #clone()} 
 * but extends to all dimensions of an array.<br> 
 * Note that this implementation does not handle arrays which 
 * contain themselves correctly. It results in a infinite recursion
 * ending in a {@link StackOverflowError}.
 * 
 * @param src a multidimensional array, either of {@link Object}s 
 * or of a primitive type 
 * @return a copy of src, so that each dimension is copied, but 
 * the objects are not cloned 
 * @author Philipp Buluschek bulu@romandie.com 
 */  
public static Object[] deepClone(Object[] src){
  if(src == null){
    return null;
  }
  Object[] dest = (Object[])src.clone();
  for (int i = 0; i < dest.length; i++) {
    Object e = dest[i];
    if (e != null &amp;amp;&amp;amp; e.getClass().isArray()) { 
      // if it is null or not an array, it was taken care of by the clone()
      if (e instanceof Object[]) {
        // using recursion to reach all dimensions
        dest[i] = deepClone((Object[])e);
      }
      else {
        // primitive arr
        if(e instanceof byte[])   dest[i] = ((byte[]) e).clone();
        else if(e instanceof short[])  dest[i] = ((short[]) e).clone();
        else if(e instanceof int[])    dest[i] = ((int[]) e).clone();
        else if(e instanceof long[])   dest[i] = ((long[]) e).clone();
        else if(e instanceof float[])  dest[i] = ((float[]) e).clone();
        else if(e instanceof double[]) dest[i] = ((double[]) e).clone();
        else if(e instanceof boolean[])dest[i] = ((boolean[]) e).clone();
      }
    }
  }
  return dest;
}

The same code using generics for JRE >= 1.5

/** 
 * Makes a copy of all dimensions of a multidimensional 
 * array. This function does not copy the actual objects, but 
 * only the references. The behavior is the same as {@link #clone()} 
 * but extends to all dimensions of an array.<br> 
 * Note that this implementation does not handle arrays which 
 * contain themselves correctly. It results in a infinite recursion
 * ending in a {@link StackOverflowError}.
 * 
 * @param src a multidimensional array, either of {@link Object}s 
 * or of a primitive type 
 * @return a copy of src, so that each dimension is copied, but 
 * the objects are not cloned 
 * @author Philipp Buluschek bulu@romandie.com 
 */  

@SuppressWarnings("unchecked")
public static <T> T[] deepClone(T[] src){
  if(src == null){
    return null;
  }
  T[] dest = src.clone();
  for (int i = 0; i < dest.length; i++) {
    Object e = dest[i];
    if (e != null &amp;amp;&amp;amp; e.getClass().isArray()) { 
      // if it is null or not an array, it was taken care of by the clone()
      if (e instanceof Object[]) {
        // using recursion to reach all dimensions
        dest[i] = (T)(deepClone((Object[])e));
      }
      else {
        // primitive arr
        if(e instanceof byte[])        dest[i] = (T)((byte[]) e).clone();
        else if(e instanceof short[])  dest[i] = (T)((short[]) e).clone();
        else if(e instanceof int[])    dest[i] = (T)((int[]) e).clone();
        else if(e instanceof long[])   dest[i] = (T)((long[]) e).clone();
        else if(e instanceof float[])  dest[i] = (T)((float[]) e).clone();
        else if(e instanceof double[]) dest[i] = (T)((double[]) e).clone();
        else if(e instanceof boolean[])dest[i] = (T)((boolean[]) e).clone();
      }
    }
  }
  return dest;
}

java , , ,