Class SmpBlas

java.lang.Object
cern.colt.matrix.linalg.SmpBlas
All Implemented Interfaces:
Blas

public class SmpBlas extends Object implements Blas
Parallel implementation of the Basic Linear Algebra System for symmetric multi processing boxes. Currently only a few algorithms are parallelised; the others are fully functional, but run in sequential mode. Parallelised are:
  • dgemm (matrix-matrix multiplication)
  • dgemv (matrix-vector multiplication)
  • assign(A,function) (generalized matrix scaling/transform): Strong speedup only for expensive functions like logarithm, sin, etc.
  • assign(A,B,function) (generalized matrix scaling/transform): Strong speedup only for expensive functions like pow etc.
In all cases, no or only marginal speedup is seen for small problem sizes; they are detected and the sequential algorithm is used.

Usage

Call the static method allocateBlas(int, cern.colt.matrix.linalg.Blas) at the very beginning of your program, supplying the main parameter for SmpBlas, the number of available CPUs. The method sets the public global variable SmpBlas.smpBlas to a blas using a maximum of CPUs threads, each concurrently processing matrix blocks with the given sequential blas algorithms. Normally there is no need to call allocateBlas more than once. Then use SmpBlas.smpBlas.someRoutine(...) to run someRoutine in parallel. E.g.
int cpu_s = 4;
SmpBlas.allocateBlas(cpu_s, SeqBlas.seqBlas);
...
SmpBlas.smpBlas.dgemm(...)
SmpBlas.smpBlas.dgemv(...)
Even if you don't call a blas routine yourself, it often makes sense to allocate a SmpBlas, because other matrix library routines sometimes call the blas. So if you're lucky, you get parallel performance for free.

Notes

  • Unfortunately, there is no portable means of automatically detecting the number of CPUs on a JVM, so there is no good way to automate defaults.
  • Only improves performance on boxes with > 1 CPUs and VMs with native threads.
  • Currently only improves performance when working on dense matrix types. On sparse types, performance is likely to degrade (because of the implementation of sub-range views)!
  • Implemented using Doug Lea's fast lightweight task framework (
    invalid reference
    EDU.oswego.cs.dl.util.concurrent
    ) built upon Java threads, and geared for parallel computation.
Version:
0.9, 16/04/2000
See Also:
  • FJTaskRunnerGroup
  • FJTask
  • Field Details

    • smpBlas

      public static Blas smpBlas
      The public global parallel blas; initialized via allocateBlas(int, cern.colt.matrix.linalg.Blas). Do not modify this variable via other means (it is public).
    • seqBlas

      protected Blas seqBlas
    • smp

      protected Smp smp
    • maxThreads

      protected int maxThreads
    • NN_THRESHOLD

      protected static int NN_THRESHOLD
  • Constructor Details

    • SmpBlas

      protected SmpBlas(int maxThreads, Blas seqBlas)
      Constructs a blas using a maximum of maxThreads threads; each executing the given sequential algos.
  • Method Details

    • allocateBlas

      public static void allocateBlas(int maxThreads, Blas seqBlas)
      Sets the public global variable SmpBlas.smpBlas to a blas using a maximum of maxThreads threads, each executing the given sequential algorithm; maxThreads is normally the number of CPUs. Call this method at the very beginning of your program. Normally there is no need to call this method more than once.
      Parameters:
      maxThreads - the maximum number of threads (= CPUs) to be used
      seqBlas - the sequential blas algorithms to be used on concurrently processed matrix blocks.
    • assign

      public void assign(DoubleMatrix2D A, DoubleFunction function)
      Description copied from interface: Blas
      Assigns the result of a function to each cell; x[row,col] = function(x[row,col]).
      Specified by:
      assign in interface Blas
      Parameters:
      A - the matrix to modify.
      function - a function object taking as argument the current cell's value.
      See Also:
    • assign

      public void assign(DoubleMatrix2D A, DoubleMatrix2D B, DoubleDoubleFunction function)
      Description copied from interface: Blas
      Assigns the result of a function to each cell; x[row,col] = function(x[row,col],y[row,col]).
      Specified by:
      assign in interface Blas
      Parameters:
      A - the matrix to modify.
      B - the secondary matrix to operate on.
      function - a function object taking as first argument the current cell's value of this, and as second argument the current cell's value of y,
      See Also:
    • dasum

      public double dasum(DoubleMatrix1D x)
      Description copied from interface: Blas
      Returns the sum of absolute values; |x[0]| + |x[1]| + ... . In fact equivalent to x.aggregate(cern.jet.math.Functions.plus, cern.jet.math.Functions.abs).
      Specified by:
      dasum in interface Blas
      Parameters:
      x - the first vector.
    • daxpy

      public void daxpy(double alpha, DoubleMatrix1D x, DoubleMatrix1D y)
      Description copied from interface: Blas
      Combined vector scaling; y = y + alpha*x. In fact equivalent to y.assign(x,cern.jet.math.Functions.plusMult(alpha)).
      Specified by:
      daxpy in interface Blas
      Parameters:
      alpha - a scale factor.
      x - the first source vector.
      y - the second source vector, this is also the vector where results are stored.
    • daxpy

      public void daxpy(double alpha, DoubleMatrix2D A, DoubleMatrix2D B)
      Description copied from interface: Blas
      Combined matrix scaling; B = B + alpha*A. In fact equivalent to B.assign(A,cern.jet.math.Functions.plusMult(alpha)).
      Specified by:
      daxpy in interface Blas
      Parameters:
      alpha - a scale factor.
      A - the first source matrix.
      B - the second source matrix, this is also the matrix where results are stored.
    • dcopy

      public void dcopy(DoubleMatrix1D x, DoubleMatrix1D y)
      Description copied from interface: Blas
      Vector assignment (copying); y = x. In fact equivalent to y.assign(x).
      Specified by:
      dcopy in interface Blas
      Parameters:
      x - the source vector.
      y - the destination vector.
    • dcopy

      public void dcopy(DoubleMatrix2D A, DoubleMatrix2D B)
      Description copied from interface: Blas
      Matrix assignment (copying); B = A. In fact equivalent to B.assign(A).
      Specified by:
      dcopy in interface Blas
      Parameters:
      A - the source matrix.
      B - the destination matrix.
    • ddot

      public double ddot(DoubleMatrix1D x, DoubleMatrix1D y)
      Description copied from interface: Blas
      Returns the dot product of two vectors x and y, which is Sum(x[i]*y[i]). In fact equivalent to x.zDotProduct(y).
      Specified by:
      ddot in interface Blas
      Parameters:
      x - the first vector.
      y - the second vector.
      Returns:
      the sum of products.
    • dgemm

      public void dgemm(boolean transposeA, boolean transposeB, double alpha, DoubleMatrix2D A, DoubleMatrix2D B, double beta, DoubleMatrix2D C)
      Description copied from interface: Blas
      Generalized linear algebraic matrix-matrix multiply; C = alpha*A*B + beta*C. In fact equivalent to A.zMult(B,C,alpha,beta,transposeA,transposeB). Note: Matrix shape conformance is checked after potential transpositions.
      Specified by:
      dgemm in interface Blas
      Parameters:
      transposeA - set this flag to indicate that the multiplication shall be performed on A'.
      transposeB - set this flag to indicate that the multiplication shall be performed on B'.
      alpha - a scale factor.
      A - the first source matrix.
      B - the second source matrix.
      beta - a scale factor.
      C - the third source matrix, this is also the matrix where results are stored.
    • dgemv

      public void dgemv(boolean transposeA, double alpha, DoubleMatrix2D A, DoubleMatrix1D x, double beta, DoubleMatrix1D y)
      Description copied from interface: Blas
      Generalized linear algebraic matrix-vector multiply; y = alpha*A*x + beta*y. In fact equivalent to A.zMult(x,y,alpha,beta,transposeA). Note: Matrix shape conformance is checked after potential transpositions.
      Specified by:
      dgemv in interface Blas
      Parameters:
      transposeA - set this flag to indicate that the multiplication shall be performed on A'.
      alpha - a scale factor.
      A - the source matrix.
      x - the first source vector.
      beta - a scale factor.
      y - the second source vector, this is also the vector where results are stored.
    • dger

      public void dger(double alpha, DoubleMatrix1D x, DoubleMatrix1D y, DoubleMatrix2D A)
      Description copied from interface: Blas
      Performs a rank 1 update; A = A + alpha*x*y'. Example:
      A = { {6,5}, {7,6} }, x = {1,2}, y = {3,4}, alpha = 1 -->
      A = { {9,9}, {13,14} }
      
      Specified by:
      dger in interface Blas
      Parameters:
      alpha - a scalar.
      x - an m element vector.
      y - an n element vector.
      A - an m by n matrix.
    • dnrm2

      public double dnrm2(DoubleMatrix1D x)
      Description copied from interface: Blas
      Return the 2-norm; sqrt(x[0]^2 + x[1]^2 + ...). In fact equivalent to Math.sqrt(Algebra.DEFAULT.norm2(x)).
      Specified by:
      dnrm2 in interface Blas
      Parameters:
      x - the vector.
    • drot

      public void drot(DoubleMatrix1D x, DoubleMatrix1D y, double c, double s)
      Description copied from interface: Blas
      Applies a givens plane rotation to (x,y); x = c*x + s*y; y = c*y - s*x.
      Specified by:
      drot in interface Blas
      Parameters:
      x - the first vector.
      y - the second vector.
      c - the cosine of the angle of rotation.
      s - the sine of the angle of rotation.
    • drotg

      public void drotg(double a, double b, double[] rotvec)
      Description copied from interface: Blas
      Constructs a Givens plane rotation for (a,b). Taken from the LINPACK translation from FORTRAN to Java, interface slightly modified. In the LINPACK listing DROTG is attributed to Jack Dongarra
      Specified by:
      drotg in interface Blas
      Parameters:
      a - rotational elimination parameter a.
      b - rotational elimination parameter b.
      rotvec - [] Must be at least of length 4. On output contains the values {a,b,c,s}.
    • dscal

      public void dscal(double alpha, DoubleMatrix1D x)
      Description copied from interface: Blas
      Vector scaling; x = alpha*x. In fact equivalent to x.assign(cern.jet.math.Functions.mult(alpha)).
      Specified by:
      dscal in interface Blas
      Parameters:
      alpha - a scale factor.
      x - the first vector.
    • dscal

      public void dscal(double alpha, DoubleMatrix2D A)
      Description copied from interface: Blas
      Matrix scaling; A = alpha*A. In fact equivalent to A.assign(cern.jet.math.Functions.mult(alpha)).
      Specified by:
      dscal in interface Blas
      Parameters:
      alpha - a scale factor.
      A - the matrix.
    • dswap

      public void dswap(DoubleMatrix1D x, DoubleMatrix1D y)
      Description copied from interface: Blas
      Swaps the elements of two vectors; y invalid input: '<'==> x. In fact equivalent to y.swap(x).
      Specified by:
      dswap in interface Blas
      Parameters:
      x - the first vector.
      y - the second vector.
    • dswap

      public void dswap(DoubleMatrix2D A, DoubleMatrix2D B)
      Description copied from interface: Blas
      Swaps the elements of two matrices; B invalid input: '<'==> A.
      Specified by:
      dswap in interface Blas
    • dsymv

      public void dsymv(boolean isUpperTriangular, double alpha, DoubleMatrix2D A, DoubleMatrix1D x, double beta, DoubleMatrix1D y)
      Description copied from interface: Blas
      Symmetric matrix-vector multiplication; y = alpha*A*x + beta*y. Where alpha and beta are scalars, x and y are n element vectors and A is an n by n symmetric matrix. A can be in upper or lower triangular format.
      Specified by:
      dsymv in interface Blas
      Parameters:
      isUpperTriangular - is A upper triangular or lower triangular part to be used?
      alpha - scaling factor.
      A - the source matrix.
      x - the first source vector.
      beta - scaling factor.
      y - the second vector holding source and destination.
    • dtrmv

      public void dtrmv(boolean isUpperTriangular, boolean transposeA, boolean isUnitTriangular, DoubleMatrix2D A, DoubleMatrix1D x)
      Description copied from interface: Blas
      Triangular matrix-vector multiplication; x = A*x or x = A'*x. Where x is an n element vector and A is an n by n unit, or non-unit, upper or lower triangular matrix.
      Specified by:
      dtrmv in interface Blas
      Parameters:
      isUpperTriangular - is A upper triangular or lower triangular?
      transposeA - set this flag to indicate that the multiplication shall be performed on A'.
      isUnitTriangular - true --> A is assumed to be unit triangular; false --> A is not assumed to be unit triangular
      A - the source matrix.
      x - the vector holding source and destination.
    • idamax

      public int idamax(DoubleMatrix1D x)
      Description copied from interface: Blas
      Returns the index of largest absolute value; i such that |x[i]| == max(|x[0]|,|x[1]|,...)..
      Specified by:
      idamax in interface Blas
      Parameters:
      x - the vector to search through.
      Returns:
      the index of largest absolute value (-1 if x is empty).
    • run

      protected double[] run(DoubleMatrix2D A, DoubleMatrix2D B, boolean collectResults, Matrix2DMatrix2DFunction fun)
    • run

      protected double[] run(DoubleMatrix2D A, boolean collectResults, Matrix2DMatrix2DFunction fun)
    • stats

      public void stats()
      Prints various snapshot statistics to System.out; Simply delegates to FJTaskRunnerGroup.stats().
    • xsum

      private double xsum(DoubleMatrix2D A)