Package cern.colt.matrix.linalg
Class SmpBlas
- java.lang.Object
-
- cern.colt.matrix.linalg.SmpBlas
-
- All Implemented Interfaces:
Blas
public class SmpBlas extends java.lang.Object implements Blas
Parallel implementation of the Basic Linear Algebra System for symmetric multi processing boxes. Currently only a few algorithms are parallelised; the others are fully functional, but run in sequential mode. Parallelised are:dgemm
(matrix-matrix multiplication)dgemv
(matrix-vector multiplication)assign(A,function)
(generalized matrix scaling/transform): Strong speedup only for expensive functions like logarithm, sin, etc.assign(A,B,function)
(generalized matrix scaling/transform): Strong speedup only for expensive functions like pow etc.
Usage
Call the static methodallocateBlas(int, cern.colt.matrix.linalg.Blas)
at the very beginning of your program, supplying the main parameter for SmpBlas, the number of available CPUs. The method sets the public global variable SmpBlas.smpBlas to a blas using a maximum of CPUs threads, each concurrently processing matrix blocks with the given sequential blas algorithms. Normally there is no need to call allocateBlas more than once. Then use SmpBlas.smpBlas.someRoutine(...) to run someRoutine in parallel. E.g.int cpu_s = 4; SmpBlas.allocateBlas(cpu_s, SeqBlas.seqBlas); ... SmpBlas.smpBlas.dgemm(...) SmpBlas.smpBlas.dgemv(...)
Notes
- Unfortunately, there is no portable means of automatically detecting the number of CPUs on a JVM, so there is no good way to automate defaults.
- Only improves performance on boxes with > 1 CPUs and VMs with native threads.
- Currently only improves performance when working on dense matrix types. On sparse types, performance is likely to degrade (because of the implementation of sub-range views)!
- Implemented using Doug Lea's fast lightweight task framework (
EDU.oswego.cs.dl.util.concurrent
) built upon Java threads, and geared for parallel computation.
- Version:
- 0.9, 16/04/2000
- See Also:
FJTaskRunnerGroup
,FJTask
-
-
Field Summary
Fields Modifier and Type Field Description protected int
maxThreads
protected static int
NN_THRESHOLD
protected Blas
seqBlas
protected Smp
smp
static Blas
smpBlas
The public global parallel blas; initialized viaallocateBlas(int, cern.colt.matrix.linalg.Blas)
.
-
Method Summary
All Methods Static Methods Instance Methods Concrete Methods Modifier and Type Method Description static void
allocateBlas(int maxThreads, Blas seqBlas)
Sets the public global variable SmpBlas.smpBlas to a blas using a maximum of maxThreads threads, each executing the given sequential algorithm; maxThreads is normally the number of CPUs.void
assign(DoubleMatrix2D A, DoubleFunction function)
Assigns the result of a function to each cell; x[row,col] = function(x[row,col]).void
assign(DoubleMatrix2D A, DoubleMatrix2D B, DoubleDoubleFunction function)
Assigns the result of a function to each cell; x[row,col] = function(x[row,col],y[row,col]).double
dasum(DoubleMatrix1D x)
Returns the sum of absolute values; |x[0]| + |x[1]| + ...void
daxpy(double alpha, DoubleMatrix1D x, DoubleMatrix1D y)
Combined vector scaling; y = y + alpha*x.void
daxpy(double alpha, DoubleMatrix2D A, DoubleMatrix2D B)
Combined matrix scaling; B = B + alpha*A.void
dcopy(DoubleMatrix1D x, DoubleMatrix1D y)
Vector assignment (copying); y = x.void
dcopy(DoubleMatrix2D A, DoubleMatrix2D B)
Matrix assignment (copying); B = A.double
ddot(DoubleMatrix1D x, DoubleMatrix1D y)
Returns the dot product of two vectors x and y, which is Sum(x[i]*y[i]).void
dgemm(boolean transposeA, boolean transposeB, double alpha, DoubleMatrix2D A, DoubleMatrix2D B, double beta, DoubleMatrix2D C)
Generalized linear algebraic matrix-matrix multiply; C = alpha*A*B + beta*C.void
dgemv(boolean transposeA, double alpha, DoubleMatrix2D A, DoubleMatrix1D x, double beta, DoubleMatrix1D y)
Generalized linear algebraic matrix-vector multiply; y = alpha*A*x + beta*y.void
dger(double alpha, DoubleMatrix1D x, DoubleMatrix1D y, DoubleMatrix2D A)
Performs a rank 1 update; A = A + alpha*x*y'.double
dnrm2(DoubleMatrix1D x)
Return the 2-norm; sqrt(x[0]^2 + x[1]^2 + ...).void
drot(DoubleMatrix1D x, DoubleMatrix1D y, double c, double s)
Applies a givens plane rotation to (x,y); x = c*x + s*y; y = c*y - s*x.void
drotg(double a, double b, double[] rotvec)
Constructs a Givens plane rotation for (a,b).void
dscal(double alpha, DoubleMatrix1D x)
Vector scaling; x = alpha*x.void
dscal(double alpha, DoubleMatrix2D A)
Matrix scaling; A = alpha*A.void
dswap(DoubleMatrix1D x, DoubleMatrix1D y)
Swaps the elements of two vectors; y <==> x.void
dswap(DoubleMatrix2D A, DoubleMatrix2D B)
Swaps the elements of two matrices; B <==> A.void
dsymv(boolean isUpperTriangular, double alpha, DoubleMatrix2D A, DoubleMatrix1D x, double beta, DoubleMatrix1D y)
Symmetric matrix-vector multiplication; y = alpha*A*x + beta*y.void
dtrmv(boolean isUpperTriangular, boolean transposeA, boolean isUnitTriangular, DoubleMatrix2D A, DoubleMatrix1D x)
Triangular matrix-vector multiplication; x = A*x or x = A'*x.int
idamax(DoubleMatrix1D x)
Returns the index of largest absolute value; i such that |x[i]| == max(|x[0]|,|x[1]|,...)..protected double[]
run(DoubleMatrix2D A, boolean collectResults, Matrix2DMatrix2DFunction fun)
protected double[]
run(DoubleMatrix2D A, DoubleMatrix2D B, boolean collectResults, Matrix2DMatrix2DFunction fun)
void
stats()
Prints various snapshot statistics to System.out; Simply delegates toFJTaskRunnerGroup.stats()
.private double
xsum(DoubleMatrix2D A)
-
-
-
Field Detail
-
smpBlas
public static Blas smpBlas
The public global parallel blas; initialized viaallocateBlas(int, cern.colt.matrix.linalg.Blas)
. Do not modify this variable via other means (it is public).
-
seqBlas
protected Blas seqBlas
-
smp
protected Smp smp
-
maxThreads
protected int maxThreads
-
NN_THRESHOLD
protected static int NN_THRESHOLD
-
-
Constructor Detail
-
SmpBlas
protected SmpBlas(int maxThreads, Blas seqBlas)
Constructs a blas using a maximum of maxThreads threads; each executing the given sequential algos.
-
-
Method Detail
-
allocateBlas
public static void allocateBlas(int maxThreads, Blas seqBlas)
Sets the public global variable SmpBlas.smpBlas to a blas using a maximum of maxThreads threads, each executing the given sequential algorithm; maxThreads is normally the number of CPUs. Call this method at the very beginning of your program. Normally there is no need to call this method more than once.- Parameters:
maxThreads
- the maximum number of threads (= CPUs) to be usedseqBlas
- the sequential blas algorithms to be used on concurrently processed matrix blocks.
-
assign
public void assign(DoubleMatrix2D A, DoubleFunction function)
Description copied from interface:Blas
Assigns the result of a function to each cell; x[row,col] = function(x[row,col]).
-
assign
public void assign(DoubleMatrix2D A, DoubleMatrix2D B, DoubleDoubleFunction function)
Description copied from interface:Blas
Assigns the result of a function to each cell; x[row,col] = function(x[row,col],y[row,col]).
-
dasum
public double dasum(DoubleMatrix1D x)
Description copied from interface:Blas
Returns the sum of absolute values; |x[0]| + |x[1]| + ... . In fact equivalent to x.aggregate(cern.jet.math.Functions.plus, cern.jet.math.Functions.abs).
-
daxpy
public void daxpy(double alpha, DoubleMatrix1D x, DoubleMatrix1D y)
Description copied from interface:Blas
Combined vector scaling; y = y + alpha*x. In fact equivalent to y.assign(x,cern.jet.math.Functions.plusMult(alpha)).
-
daxpy
public void daxpy(double alpha, DoubleMatrix2D A, DoubleMatrix2D B)
Description copied from interface:Blas
Combined matrix scaling; B = B + alpha*A. In fact equivalent to B.assign(A,cern.jet.math.Functions.plusMult(alpha)).
-
dcopy
public void dcopy(DoubleMatrix1D x, DoubleMatrix1D y)
Description copied from interface:Blas
Vector assignment (copying); y = x. In fact equivalent to y.assign(x).
-
dcopy
public void dcopy(DoubleMatrix2D A, DoubleMatrix2D B)
Description copied from interface:Blas
Matrix assignment (copying); B = A. In fact equivalent to B.assign(A).
-
ddot
public double ddot(DoubleMatrix1D x, DoubleMatrix1D y)
Description copied from interface:Blas
Returns the dot product of two vectors x and y, which is Sum(x[i]*y[i]). In fact equivalent to x.zDotProduct(y).
-
dgemm
public void dgemm(boolean transposeA, boolean transposeB, double alpha, DoubleMatrix2D A, DoubleMatrix2D B, double beta, DoubleMatrix2D C)
Description copied from interface:Blas
Generalized linear algebraic matrix-matrix multiply; C = alpha*A*B + beta*C. In fact equivalent to A.zMult(B,C,alpha,beta,transposeA,transposeB). Note: Matrix shape conformance is checked after potential transpositions.- Specified by:
dgemm
in interfaceBlas
- Parameters:
transposeA
- set this flag to indicate that the multiplication shall be performed on A'.transposeB
- set this flag to indicate that the multiplication shall be performed on B'.alpha
- a scale factor.A
- the first source matrix.B
- the second source matrix.beta
- a scale factor.C
- the third source matrix, this is also the matrix where results are stored.
-
dgemv
public void dgemv(boolean transposeA, double alpha, DoubleMatrix2D A, DoubleMatrix1D x, double beta, DoubleMatrix1D y)
Description copied from interface:Blas
Generalized linear algebraic matrix-vector multiply; y = alpha*A*x + beta*y. In fact equivalent to A.zMult(x,y,alpha,beta,transposeA). Note: Matrix shape conformance is checked after potential transpositions.- Specified by:
dgemv
in interfaceBlas
- Parameters:
transposeA
- set this flag to indicate that the multiplication shall be performed on A'.alpha
- a scale factor.A
- the source matrix.x
- the first source vector.beta
- a scale factor.y
- the second source vector, this is also the vector where results are stored.
-
dger
public void dger(double alpha, DoubleMatrix1D x, DoubleMatrix1D y, DoubleMatrix2D A)
Description copied from interface:Blas
Performs a rank 1 update; A = A + alpha*x*y'. Example:A = { {6,5}, {7,6} }, x = {1,2}, y = {3,4}, alpha = 1 --> A = { {9,9}, {13,14} }
-
dnrm2
public double dnrm2(DoubleMatrix1D x)
Description copied from interface:Blas
Return the 2-norm; sqrt(x[0]^2 + x[1]^2 + ...). In fact equivalent to Math.sqrt(Algebra.DEFAULT.norm2(x)).
-
drot
public void drot(DoubleMatrix1D x, DoubleMatrix1D y, double c, double s)
Description copied from interface:Blas
Applies a givens plane rotation to (x,y); x = c*x + s*y; y = c*y - s*x.
-
drotg
public void drotg(double a, double b, double[] rotvec)
Description copied from interface:Blas
Constructs a Givens plane rotation for (a,b). Taken from the LINPACK translation from FORTRAN to Java, interface slightly modified. In the LINPACK listing DROTG is attributed to Jack Dongarra
-
dscal
public void dscal(double alpha, DoubleMatrix1D x)
Description copied from interface:Blas
Vector scaling; x = alpha*x. In fact equivalent to x.assign(cern.jet.math.Functions.mult(alpha)).
-
dscal
public void dscal(double alpha, DoubleMatrix2D A)
Description copied from interface:Blas
Matrix scaling; A = alpha*A. In fact equivalent to A.assign(cern.jet.math.Functions.mult(alpha)).
-
dswap
public void dswap(DoubleMatrix1D x, DoubleMatrix1D y)
Description copied from interface:Blas
Swaps the elements of two vectors; y <==> x. In fact equivalent to y.swap(x).
-
dswap
public void dswap(DoubleMatrix2D A, DoubleMatrix2D B)
Description copied from interface:Blas
Swaps the elements of two matrices; B <==> A.
-
dsymv
public void dsymv(boolean isUpperTriangular, double alpha, DoubleMatrix2D A, DoubleMatrix1D x, double beta, DoubleMatrix1D y)
Description copied from interface:Blas
Symmetric matrix-vector multiplication; y = alpha*A*x + beta*y. Where alpha and beta are scalars, x and y are n element vectors and A is an n by n symmetric matrix. A can be in upper or lower triangular format.
-
dtrmv
public void dtrmv(boolean isUpperTriangular, boolean transposeA, boolean isUnitTriangular, DoubleMatrix2D A, DoubleMatrix1D x)
Description copied from interface:Blas
Triangular matrix-vector multiplication; x = A*x or x = A'*x. Where x is an n element vector and A is an n by n unit, or non-unit, upper or lower triangular matrix.- Specified by:
dtrmv
in interfaceBlas
- Parameters:
isUpperTriangular
- is A upper triangular or lower triangular?transposeA
- set this flag to indicate that the multiplication shall be performed on A'.isUnitTriangular
- true --> A is assumed to be unit triangular; false --> A is not assumed to be unit triangularA
- the source matrix.x
- the vector holding source and destination.
-
idamax
public int idamax(DoubleMatrix1D x)
Description copied from interface:Blas
Returns the index of largest absolute value; i such that |x[i]| == max(|x[0]|,|x[1]|,...)..
-
run
protected double[] run(DoubleMatrix2D A, DoubleMatrix2D B, boolean collectResults, Matrix2DMatrix2DFunction fun)
-
run
protected double[] run(DoubleMatrix2D A, boolean collectResults, Matrix2DMatrix2DFunction fun)
-
stats
public void stats()
Prints various snapshot statistics to System.out; Simply delegates toFJTaskRunnerGroup.stats()
.
-
xsum
private double xsum(DoubleMatrix2D A)
-
-