Class Kernel
- All Implemented Interfaces:
Cloneable
To write a new kernel, a developer extends the Kernel class and overrides the Kernel.run() method.
To execute this kernel, the developer creates a new instance of it and calls Kernel.execute(int globalSize) with a suitable 'global size'. At runtime Aparapi will attempt to convert the Kernel.run() method (and any method called directly or indirectly by Kernel.run()) into OpenCL for execution on GPU devices made available via the OpenCL platform.
Note that Kernel.run() is not called directly. Instead, the Kernel.execute(int globalSize) method will cause the overridden Kernel.run() method to be invoked once for each value in the range 0 to globalSize - 1.
On the first call to Kernel.execute(int _globalSize), Aparapi will determine the EXECUTION_MODE of the kernel.
This decision is made dynamically based on two factors:
- Whether OpenCL is available (appropriate drivers are installed and the OpenCL and Aparapi dynamic libraries are included on the system path).
- Whether the bytecode of the run() method (and every method that can be called directly or indirectly from the run() method) can be converted into OpenCL.
Below is an example Kernel that calculates the square of a set of input values.
class SquareKernel extends Kernel{
    private int values[];
    private int squares[];
    public SquareKernel(int values[]){
        this.values = values;
        squares = new int[values.length];
    }
    public void run() {
        int gid = getGlobalId();
        squares[gid] = values[gid]*values[gid];
    }
    public int[] getSquares(){
        return(squares);
    }
}
To execute this kernel, first create a new instance of it and then call execute(Range _range).
int[] values = new int[1024];
// fill values array
Range range = Range.create(values.length); // create a range of 1024 work items (0..1023)
SquareKernel kernel = new SquareKernel(values);
kernel.execute(range);
When execute(Range) returns, all the executions of Kernel.run() have completed and the results are available in the squares array.
int[] squares = kernel.getSquares();
for (int i = 0; i < values.length; i++){
System.out.printf("%4d %4d %8d\n", i, values[i], squares[i]);
}
A different approach to creating kernels that avoids extending Kernel is to write an anonymous inner class:
final int[] values = new int[1024];
// fill the values array
final int[] squares = new int[values.length];
final Range range = Range.create(values.length);
Kernel kernel = new Kernel(){
public void run() {
int gid = getGlobalId();
squares[gid] = values[gid]*values[gid];
}
};
kernel.execute(range);
for (int i = 0; i < values.length; i++){
System.out.printf("%4d %4d %8d\n", i, values[i], squares[i]);
}
- Version:
- Alpha, 21/09/2010
-
Nested Class Summary
Nested Classes
Modifier and Type  Class  Description
static @interface Kernel.Constant We can use this Annotation to 'tag' intended constant buffers.
class
static enum Kernel.EXECUTION_MODE Deprecated.
final class Kernel.KernelState This class is for internal Kernel state management
static @interface Kernel.Local We can use this Annotation to 'tag' intended local buffers.
static @interface Annotation which can be applied to either a getter (with usual java bean naming convention relative to an instance field), or to any method with void return type, which prevents both the method body and any calls to the method being emitted in the generated OpenCL.
protected static @interface This annotation is for internal use only
protected static @interface This annotation is for internal use only
static @interface Kernel.PrivateMemorySpace We can use this Annotation to 'tag' __private (unshared) array fields.
-
Field Summary
Fields
Modifier and Type  Field  Description
private static final IntBinaryOperator
private static final ValueCache<Class<?>, Map<String, Boolean>, RuntimeException>
private static final ValueCache<Class<?>, Map<String, Boolean>, RuntimeException>
private boolean autoCleanUpArrays
static final String CONSTANT_SUFFIX We can use this suffix to 'tag' intended constant buffers.
private Iterator<Kernel.EXECUTION_MODE> Deprecated.
private Kernel.EXECUTION_MODE Deprecated.
private final LinkedHashSet<Kernel.EXECUTION_MODE> Deprecated.
private KernelRunner
private Kernel.KernelState
static final String LOCAL_SUFFIX We can use this suffix to 'tag' intended local buffers.
private static final double
private static Logger
private static final ValueCache<Class<?>, Map<String, Boolean>, RuntimeException>
private static final ValueCache<Class<?>, Map<String, String>, RuntimeException> mappedMethodNamesCache
private static final IntBinaryOperator
private static final IntBinaryOperator
private static final ValueCache<Class<?>, Map<String, Boolean>, RuntimeException>
private static final IntBinaryOperator
private static final double
static final String PRIVATE_SUFFIX We can use this suffix to 'tag' __private buffers.
(package private) boolean useNullForLocalSize
private static final IntBinaryOperator
-
Constructor Summary
Constructors -
Method Summary
Modifier and Type  Method  Description
protected double abs(double _d) Delegates to either Math.abs(double) (Java) or fabs(double) (OpenCL).
protected float abs(float _f) Delegates to either Math.abs(float) (Java) or fabs(float) (OpenCL).
protected int abs(int n) Delegates to either Math.abs(int) (Java) or abs(int) (OpenCL).
protected long abs(long n) Delegates to either Math.abs(long) (Java) or abs(long) (OpenCL).
protected double acos(double a) Delegates to either Math.acos(double) (Java) or acos(double) (OpenCL).
protected float acos(float a) Delegates to either Math.acos(double) (Java) or acos(float) (OpenCL).
protected final double acospi(double a)
protected final float acospi(float a)
void addExecutionModes(Kernel.EXECUTION_MODE... platforms) Deprecated.
protected double asin(double _d) Delegates to either Math.asin(double) (Java) or asin(double) (OpenCL).
protected float asin(float _f) Delegates to either Math.asin(double) (Java) or asin(float) (OpenCL).
protected final double asinpi(double a)
protected final float asinpi(float a)
protected double atan(double _d) Delegates to either Math.atan(double) (Java) or atan(double) (OpenCL).
protected float atan(float _f) Delegates to either Math.atan(double) (Java) or atan(float) (OpenCL).
protected double atan2(double _d1, double _d2) Delegates to either Math.atan2(double, double) (Java) or atan2(double, double) (OpenCL).
protected float atan2(float _f1, float _f2) Delegates to either Math.atan2(double, double) (Java) or atan2(float, float) (OpenCL).
protected final double atan2pi(double y, double x)
protected final float atan2pi(float y, double x)
protected final double atanpi(double a)
protected final float atanpi(float a)
protected int atomicAdd(int[] _arr, int _index, int _delta) Atomically adds _delta value to _index element of array _arr (Java) or delegates to atomic_add(volatile int*, int) (OpenCL).
protected final int atomicAdd(AtomicInteger p, int val)
protected final int atomicAnd(AtomicInteger p, int val)
protected final int atomicCmpXchg(AtomicInteger p, int expectedVal, int newVal)
protected final int
protected final int
protected final int
protected final int atomicMax(AtomicInteger p, int val)
protected final int atomicMin(AtomicInteger p, int val)
protected final int atomicOr(AtomicInteger p, int val)
protected final void atomicSet(AtomicInteger p, int val)
protected final int atomicSub(AtomicInteger p, int val)
protected final int atomicXchg(AtomicInteger p, int newVal)
protected final int atomicXor(AtomicInteger p, int val)
private static <K, V, T extends Throwable> ValueCache<Class<?>, Map<K, V>, T> cacheProperty(ValueCache.ThrowingValueComputer<Class<?>, Map<K, V>, T> throwingValueComputer)
void cancelMultiPass() Invoking this method flags that once the current pass is complete execution should be abandoned.
protected final double cbrt(double a)
protected final float cbrt(float a)
protected double ceil(double _d) Delegates to either Math.ceil(double) (Java) or ceil(double) (OpenCL).
protected float ceil(float _f) Delegates to either Math.ceil(double) (Java) or ceil(float) (OpenCL).
void Frees the bulk of the resources used by this kernel, by setting array sizes in non-primitive KernelArgs to 1 (0 size is prohibited) and invoking kernel execution on a zero size range.
clone() When using a Java Thread Pool Aparapi uses clone to copy the initial instance to each thread.
protected int clz(int _i) Delegates to either Integer.numberOfLeadingZeros(int) (Java) or clz(int) (OpenCL).
protected long clz(long _l) Delegates to either Long.numberOfLeadingZeros(long) (Java) or clz(long) (OpenCL).
Force pre-compilation of the kernel for a given device, without executing it.
Force pre-compilation of the kernel for a given device, without executing it.
protected double cos(double _d) Delegates to either Math.cos(double) (Java) or cos(double) (OpenCL).
protected float cos(float _f) Delegates to either Math.cos(double) (Java) or cos(float) (OpenCL).
protected final double cosh(double x)
protected final float cosh(float x)
protected final double cospi(double a)
protected final float cospi(float a)
protected Range createRange(int _range)
private static String
void dispose() Release any resources associated with this Kernel.
execute(int _range) Start execution of _range kernels.
execute(int _range, int _passes) Start execution of _passes iterations over the _range of kernels.
Start execution of _range kernels.
Start execution of _passes iterations of _range kernels.
Start execution of globalSize kernels for the given entrypoint.
Start execution of globalSize kernels for the given entrypoint.
void executeFallbackAlgorithm(Range _range, int _passId) If hasFallbackAlgorithm() has been overridden to return true, this method should be overridden so as to apply a single pass of the kernel's logic to the entire _range.
protected double exp(double _d) Delegates to either Math.exp(double) (Java) or exp(double) (OpenCL).
protected float exp(float _f) Delegates to either Math.exp(double) (Java) or exp(float) (OpenCL).
protected final double exp10(double a)
protected final float exp10(float a)
protected final double exp2(double a)
protected final float exp2(float a)
protected final double expm1(double x)
protected final float expm1(float x)
protected double floor(double _d) Delegates to either Math.floor(double) (Java) or floor(double) (OpenCL).
protected float floor(float _f) Delegates to either Math.floor(double) (Java) or floor(float) (OpenCL).
protected double fma(double a, double b, double c) Delegates to either a*b+c (Java) or fma(double, double, double) (OpenCL).
protected float fma(float a, float b, float c) Delegates to either a*b+c (Java) or fma(float, float, float) (OpenCL).
get(boolean[] array) Enqueue a request to return this buffer from the GPU.
get(boolean[][] array) Enqueue a request to return this buffer from the GPU.
get(boolean[][][] array) Enqueue a request to return this buffer from the GPU.
get(byte[] array) Enqueue a request to return this buffer from the GPU.
get(byte[][] array) Enqueue a request to return this buffer from the GPU.
get(byte[][][] array) Enqueue a request to return this buffer from the GPU.
get(char[] array) Enqueue a request to return this buffer from the GPU.
get(char[][] array) Enqueue a request to return this buffer from the GPU.
get(char[][][] array) Enqueue a request to return this buffer from the GPU.
get(double[] array) Enqueue a request to return this buffer from the GPU.
get(double[][] array) Enqueue a request to return this buffer from the GPU.
get(double[][][] array) Enqueue a request to return this buffer from the GPU.
get(float[] array) Enqueue a request to return this buffer from the GPU.
get(float[][] array) Enqueue a request to return this buffer from the GPU.
get(float[][][] array) Enqueue a request to return this buffer from the GPU.
get(int[] array) Enqueue a request to return this buffer from the GPU.
get(int[][] array) Enqueue a request to return this buffer from the GPU.
get(int[][][] array) Enqueue a request to return this buffer from the GPU.
get(long[] array) Enqueue a request to return this buffer from the GPU.
get(long[][] array) Enqueue a request to return this buffer from the GPU.
get(long[][][] array) Enqueue a request to return this buffer from the GPU.
double Determine the total execution time of all previous Kernel.execute(range) calls for all threads that ran this kernel for the device used in the last kernel execution.
double Determine the total execution time of all produced profile reports from all threads that executed the current kernel on the specified device.
double Determine the total execution time of all previous kernel executions called from the current thread, calling this method, that executed the current kernel on the specified device.
private static String getArgumentsLetters(Method method)
private static boolean getBoolean(ValueCache<Class<?>, Map<String, Boolean>, RuntimeException> methodNamesCache, ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry)
int
double Determine the time taken to convert bytecode to OpenCL for first Kernel.execute(range) call.
int
Deprecated.
double Determine the execution time of the previous Kernel.execute(range) called from the last thread that ran and executed on the most recently used device.
protected final int getGlobalId() Determine the globalId of an executing kernel.
protected final int getGlobalId(int _dim)
protected final int getGlobalSize() Determine the value that was passed to Kernel.execute(int globalSize) method.
protected final int getGlobalSize(int _dim)
protected final int getGroupId() Determine the groupId of an executing kernel.
protected final int getGroupId(int _dim)
int[] getKernelCompileWorkGroupSize(Device device) Retrieves the specified work-group size in the compiled kernel for the specified device or intermediate language for the device.
long getKernelLocalMemSizeInUse(Device device) Retrieves the amount of local memory used in the specified device by this kernel instance.
int getKernelMaxWorkGroupSize(Device device) Retrieves the maximum work-group size allowed for this kernel when running on the specified device.
long Retrieves the minimum private memory in use per work item for this kernel instance and the specified device.
int Retrieves the preferred work-group multiple in the specified device for this kernel instance.
protected final int getLocalId() Determine the local id of an executing kernel.
protected final int getLocalId(int _dim)
protected final int getLocalSize() Determine the size of the group that an executing kernel is a member of.
protected final int getLocalSize(int _dim)
static String getMappedMethodName(ClassModel.ConstantPool.MethodReferenceEntry _methodReferenceEntry)
protected final int getNumGroups() Determine the number of groups that will be used to execute a kernel
protected final int getNumGroups(int _dim)
protected final int getPassId() Determine the passId of an executing kernel.
Get the profiling information from the last successful call to Kernel.execute().
getProfileReportCurrentThread(Device device) Retrieves the most recent complete report available for the current thread calling this method for the current kernel instance and executed on the given device.
getProfileReportLastThread(Device device) Retrieves a profile report for the last thread that executed this kernel on the given device.
private static <V, T extends Throwable> V getProperty(ValueCache<Class<?>, Map<String, V>, T> cache, ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry, V defaultValue)
private static String getReturnTypeLetter(Method meth)
final Device
protected final void Wait for all kernels in the current work group to rendezvous at this call before continuing execution. It will also enforce memory ordering, such that modifications made by each thread in the work-group, to the memory, before entering into this barrier call will be visible by all threads leaving the barrier.
boolean hasFallbackAlgorithm() False by default.
boolean Deprecated.
protected double hypot(double a, double b)
protected float hypot(float a, float b)
protected double IEEEremainder(double _d1, double _d2) Delegates to either Math.IEEEremainder(double, double) (Java) or remainder(double, double) (OpenCL).
protected float IEEEremainder(float _f1, float _f2) Delegates to either Math.IEEEremainder(double, double) (Java) or remainder(float, float) (OpenCL).
static void
boolean isAllowDevice(Device _device)
boolean
boolean
boolean For dev purposes (we should remove this for production) determine whether this Kernel uses explicit memory management
static boolean isMappedMethod(ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry)
static boolean isOpenCLDelegateMethod(ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry)
private static boolean isRelevant(Method method)
boolean
protected final void Wait for all kernels in the current work group to rendezvous at this call before continuing execution. It will also enforce memory ordering, such that modifications made by each thread in the work-group, to the memory, before entering into this barrier call will be visible by all threads leaving the barrier.
protected final void Wait for all kernels in the current work group to rendezvous at this call before continuing execution. It will also enforce memory ordering, such that modifications made by each thread in the work-group, to the memory, before entering into this barrier call will be visible by all threads leaving the barrier.
protected double log(double _d) Delegates to either Math.log(double) (Java) or log(double) (OpenCL).
protected float log(float _f) Delegates to either Math.log(double) (Java) or log(float) (OpenCL).
protected final double log10(double a)
protected final float log10(float a)
protected final double log1p(double x)
protected final float log1p(float x)
protected final double log2(double a)
protected final float log2(float a)
protected final double mad(double a, double b, double c)
protected final float mad(float a, float b, float c)
private static <A extends Annotation> ValueCache<Class<?>, Map<String, Boolean>, RuntimeException> markedWith(Class<A> annotationClass)
protected double max(double _d1, double _d2) Delegates to either Math.max(double, double) (Java) or fmax(double, double) (OpenCL).
protected float max(float _f1, float _f2) Delegates to either Math.max(float, float) (Java) or fmax(float, float) (OpenCL).
protected int max(int n1, int n2) Delegates to either Math.max(int, int) (Java) or max(int, int) (OpenCL).
protected long max(long n1, long n2) Delegates to either Math.max(long, long) (Java) or max(long, long) (OpenCL).
protected double min(double _d1, double _d2) Delegates to either Math.min(double, double) (Java) or fmin(double, double) (OpenCL).
protected float min(float _f1, float _f2) Delegates to either Math.min(float, float) (Java) or fmin(float, float) (OpenCL).
protected int min(int n1, int n2) Delegates to either Math.min(int, int) (Java) or min(int, int) (OpenCL).
protected long min(long n1, long n2) Delegates to either Math.min(long, long) (Java) or min(long, long) (OpenCL).
private float native_rsqrt(float _f)
private float native_sqrt(float _f)
protected final double nextAfter(double start, double direction)
protected final float nextAfter(float start, float direction)
protected int popcount(int _i) Delegates to either Integer.bitCount(int) (Java) or popcount(int) (OpenCL).
protected long popcount(long _i) Delegates to either Long.bitCount(long) (Java) or popcount(long) (OpenCL).
protected double pow(double _d1, double _d2) Delegates to either Math.pow(double, double) (Java) or pow(double, double) (OpenCL).
protected float pow(float _f1, float _f2) Delegates to either Math.pow(double, double) (Java) or pow(float, float) (OpenCL).
private KernelRunner
put(boolean[] array) Tag this array so that it is explicitly enqueued before the kernel is executed
put(boolean[][] array) Tag this array so that it is explicitly enqueued before the kernel is executed
put(boolean[][][] array) Tag this array so that it is explicitly enqueued before the kernel is executed
put(byte[] array) Tag this array so that it is explicitly enqueued before the kernel is executed
put(byte[][] array) Tag this array so that it is explicitly enqueued before the kernel is executed
put(byte[][][] array) Tag this array so that it is explicitly enqueued before the kernel is executed
put(char[] array) Tag this array so that it is explicitly enqueued before the kernel is executed
put(char[][] array) Tag this array so that it is explicitly enqueued before the kernel is executed
put(char[][][] array) Tag this array so that it is explicitly enqueued before the kernel is executed
put(double[] array) Tag this array so that it is explicitly enqueued before the kernel is executed
put(double[][] array) Tag this array so that it is explicitly enqueued before the kernel is executed
put(double[][][] array) Tag this array so that it is explicitly enqueued before the kernel is executed
put(float[] array) Tag this array so that it is explicitly enqueued before the kernel is executed
put(float[][] array) Tag this array so that it is explicitly enqueued before the kernel is executed
put(float[][][] array) Tag this array so that it is explicitly enqueued before the kernel is executed
put(int[] array) Tag this array so that it is explicitly enqueued before the kernel is executed
put(int[][] array) Tag this array so that it is explicitly enqueued before the kernel is executed
put(int[][][] array) Tag this array so that it is explicitly enqueued before the kernel is executed
put(long[] array) Tag this array so that it is explicitly enqueued before the kernel is executed
put(long[][] array) Tag this array so that it is explicitly enqueued before the kernel is executed
put(long[][][] array) Tag this array so that it is explicitly enqueued before the kernel is executed
void Registers a new profile report observer to receive profile reports as they're produced.
protected double rint(double _d) Delegates to either Math.rint(double) (Java) or rint(double) (OpenCL).
protected float rint(float _f) Delegates to either Math.rint(double) (Java) or rint(float) (OpenCL).
protected long round(double _d) Delegates to either Math.round(double) (Java) or round(double) (OpenCL).
protected int round(float _f) Delegates to either Math.round(float) (Java) or round(float) (OpenCL).
protected double rsqrt(double _d) Computes inverse square root using Math.sqrt(double) (Java) or delegates to rsqrt(double) (OpenCL).
protected float rsqrt(float _f) Computes inverse square root using Math.sqrt(double) (Java) or delegates to rsqrt(double) (OpenCL).
abstract void run() The entry point of a kernel.
void setAutoCleanUpArrays(boolean autoCleanUpArrays) Property which if true enables automatic calling of cleanUpArrays() following each execution.
void setExecutionMode(Kernel.EXECUTION_MODE _executionMode) Deprecated.
void setExecutionModeWithoutFallback(Kernel.EXECUTION_MODE _executionMode)
void setExplicit(boolean _explicit) For dev purposes (we should remove this for production) allow us to define that this Kernel uses explicit memory management
void Deprecated.
protected double sin(double _d) Delegates to either Math.sin(double) (Java) or sin(double) (OpenCL).
protected float sin(float _f) Delegates to either Math.sin(double) (Java) or sin(float) (OpenCL).
protected final double sinh(double x) Delegates to either Math.sinh(double) (Java) or sinh(double) (OpenCL).
protected final float sinh(float x) Delegates to either Math.sinh(double) (Java) or sinh(float) (OpenCL).
protected final double sinpi(double a) Backed by either Math.sin(double) (Java) or sinpi(double) (OpenCL).
protected final float sinpi(float a) Backed by either Math.sin(double) (Java) or sinpi(float) (OpenCL).
protected double sqrt(double _d) Delegates to either Math.sqrt(double) (Java) or sqrt(double) (OpenCL).
protected float sqrt(float _f) Delegates to either Math.sqrt(double) (Java) or sqrt(float) (OpenCL).
protected double tan(double _d) Delegates to either Math.tan(double) (Java) or tan(double) (OpenCL).
protected float tan(float _f) Delegates to either Math.tan(double) (Java) or tan(float) (OpenCL).
protected final double tanh(double x) Delegates to either Math.tanh(double) (Java) or tanh(double) (OpenCL).
protected final float tanh(float x) Delegates to either Math.tanh(float) (Java) or tanh(float) (OpenCL).
protected final double tanpi(double a) Backed by either Math.tan(double) (Java) or tanpi(double) (OpenCL).
protected final float tanpi(float a) Backed by either Math.tan(double) (Java) or tanpi(float) (OpenCL).
private static String toClassShortNameIfAny(Class<?> retClass)
protected double toDegrees(double _d) Delegates to either Math.toDegrees(double) (Java) or degrees(double) (OpenCL).
protected float toDegrees(float _f) Delegates to either Math.toDegrees(double) (Java) or degrees(float) (OpenCL).
protected double toRadians(double _d) Delegates to either Math.toRadians(double) (Java) or radians(double) (OpenCL).
protected float toRadians(float _f) Delegates to either Math.toRadians(double) (Java) or radians(float) (OpenCL).
private static String toSignature(ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry)
(package private) static String toSignature(Method method)
toString()
void Deprecated.
static boolean usesAtomic32(ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry)
static boolean usesAtomic64(ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry)
-
Field Details
-
logger
-
LOCAL_SUFFIX
We can use this suffix to 'tag' intended local buffers. So either name the buffer
int[] buffer_$local$ = new int[1024];
Or use the Annotation form
@Local int[] buffer = new int[1024];
- See Also:
-
CONSTANT_SUFFIX
We can use this suffix to 'tag' intended constant buffers. So either name the buffer
int[] buffer_$constant$ = new int[1024];
Or use the Annotation form
@Constant int[] buffer = new int[1024];
- See Also:
-
PRIVATE_SUFFIX
We can use this suffix to 'tag' __private buffers. So either name the buffer
int[] buffer_$private$32 = new int[32];
Or use the Annotation form
@PrivateMemorySpace(32) int[] buffer = new int[32];
- See Also:
-
kernelRunner
-
autoCleanUpArrays
private boolean autoCleanUpArrays -
kernelState
-
LOG_2_RECIPROCAL
private static final double LOG_2_RECIPROCAL -
PI_RECIPROCAL
private static final double PI_RECIPROCAL
- See Also:
-
minOperator
-
maxOperator
-
andOperator
-
orOperator
-
xorOperator
-
typeToLetterMap
-
useNullForLocalSize
boolean useNullForLocalSize -
executionModes
Deprecated. -
currentMode
Deprecated. -
executionMode
Deprecated. -
mappedMethodFlags
-
openCLDelegateMethodFlags
private static final ValueCache<Class<?>, Map<String, Boolean>, RuntimeException> openCLDelegateMethodFlags -
atomic32Cache
-
atomic64Cache
-
mappedMethodNamesCache
private static final ValueCache<Class<?>, Map<String, String>, RuntimeException> mappedMethodNamesCache
-
-
Constructor Details
-
Kernel
public Kernel()
-
-
Method Details
-
getGlobalId
protected final int getGlobalId()
Determine the globalId of an executing kernel.
The kernel implementation uses the globalId to determine which of the executing kernels (in the global domain space) this invocation is expected to deal with.
For example in a SquareKernel implementation:
class SquareKernel extends Kernel{
    private int values[];
    private int squares[];
    public SquareKernel(int values[]){
        this.values = values;
        squares = new int[values.length];
    }
    public void run() {
        int gid = getGlobalId();
        squares[gid] = values[gid]*values[gid];
    }
    public int[] getSquares(){
        return(squares);
    }
}
Each invocation of SquareKernel.run() retrieves its globalId by calling getGlobalId(), and then computes the value of squares[gid] for a given value of values[gid].
- Returns:
- The globalId for the Kernel being executed
- See Also:
-
getGlobalId
protected final int getGlobalId(int _dim) -
getGroupId
protected final int getGroupId()
Determine the groupId of an executing kernel.
When a Kernel.execute(int globalSize) is invoked for a particular kernel, the runtime will break the work into various 'groups'.
A kernel can use getGroupId() to determine which group it is currently dispatched to.
The following code would capture the groupId for each kernel and map it against globalId.
final int[] groupIds = new int[1024];
Kernel kernel = new Kernel(){
    public void run() {
        int gid = getGlobalId();
        groupIds[gid] = getGroupId();
    }
};
kernel.execute(groupIds.length);
for (int i = 0; i < groupIds.length; i++){
    System.out.printf("%4d %4d\n", i, groupIds[i]);
}
- Returns:
- The groupId for this Kernel being executed
- See Also:
-
getGroupId
protected final int getGroupId(int _dim) -
getPassId
protected final int getPassId()
Determine the passId of an executing kernel.
When a Kernel.execute(int globalSize, int passes) is invoked for a particular kernel, the runtime will execute the kernel over the whole range once per pass.
A kernel can use getPassId() to determine which pass it is in. This is ideal for 'reduce' type phases.
- Returns:
- The passId for this Kernel being executed
- See Also:
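The role of the pass id in a 'reduce' phase can be sketched in plain Java. This is only a model under stated assumptions: it does not use Aparapi, the class MultiPassReduceSketch and its methods are hypothetical, the nested loops stand in for what execute(globalSize, passes) is described as doing (every global id once per pass), and the pairwise summation scheme is chosen purely for illustration.

```java
/** Plain-Java model of a multi-pass reduction; hypothetical, not Aparapi code. */
final class MultiPassReduceSketch {

    /** Models execute(globalSize, passes): every gid runs once in each pass. */
    static int reduce(int[] data, int passes) {
        int globalSize = data.length;
        for (int passId = 0; passId < passes; passId++) {
            for (int gid = 0; gid < globalSize; gid++) {
                run(data, gid, passId);
            }
        }
        return data[0];
    }

    /** Stand-in for run(); a real kernel would read gid and passId from the runtime. */
    static void run(int[] data, int gid, int passId) {
        int stride = 1 << passId; // distance between the two partial sums merged in this pass
        if (gid % (2 * stride) == 0 && gid + stride < data.length) {
            data[gid] += data[gid + stride];
        }
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4, 5, 6, 7, 8};
        System.out.println(reduce(data, 3)); // three passes cover eight elements
    }
}
```

Within one pass no work item reads an element that another work item writes, so the order in which the global ids run inside a pass does not affect the result.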
-
getLocalId
protected final int getLocalId()
Determine the local id of an executing kernel.
When a Kernel.execute(int globalSize) is invoked for a particular kernel, the runtime will break the work into various 'groups'.
getLocalId() can be used to determine the relative id of the current kernel within a specific group.
The following code would capture the localId for each kernel and map it against globalId.
final int[] localIds = new int[1024];
Kernel kernel = new Kernel(){
    public void run() {
        int gid = getGlobalId();
        localIds[gid] = getLocalId();
    }
};
kernel.execute(localIds.length);
for (int i = 0; i < localIds.length; i++){
    System.out.printf("%4d %4d\n", i, localIds[i]);
}
- Returns:
- The local id for this Kernel being executed
- See Also:
-
getLocalId
protected final int getLocalId(int _dim) -
getLocalSize
protected final int getLocalSize()
Determine the size of the group that an executing kernel is a member of.
When a Kernel.execute(int globalSize) is invoked for a particular kernel, the runtime will break the work into various 'groups'.
getLocalSize() allows a kernel to determine the size of the current group.
Note groups may not all be the same size. In particular, if (global size)%(# of compute devices)!=0, the runtime can choose to dispatch kernels to groups with differing sizes.
- Returns:
- The size of the currently executing group.
- See Also:
-
getLocalSize
protected final int getLocalSize(int _dim) -
getGlobalSize
protected final int getGlobalSize()
Determine the value that was passed to the Kernel.execute(int globalSize) method.
- Returns:
- The value passed to Kernel.execute(int globalSize) causing the current execution.
- See Also:
-
getGlobalSize
protected final int getGlobalSize(int _dim) -
getNumGroups
protected final int getNumGroups()
Determine the number of groups that will be used to execute a kernel.
When Kernel.execute(int globalSize) is invoked, the runtime will split the work into multiple 'groups'.
getNumGroups() returns the total number of groups that will be used.
- Returns:
- The number of groups that kernels will be dispatched into.
- See Also:
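How the global id, group id, local id, local size and number of groups relate can be sketched with plain arithmetic. The following is a model, not Aparapi code: the class WorkItemIndexSketch and its methods are hypothetical, and it assumes a one-dimensional range with no global offset in which the local size divides the global size evenly, so that every group has the same size.

```java
/** Plain-Java model of 1D work-item indexing; hypothetical, not Aparapi code. */
final class WorkItemIndexSketch {

    /** Group that a global id falls into, assuming uniform groups of localSize items. */
    static int groupId(int globalId, int localSize) {
        return globalId / localSize;
    }

    /** Position of a global id inside its group. */
    static int localId(int globalId, int localSize) {
        return globalId % localSize;
    }

    /** Number of groups when localSize divides globalSize evenly. */
    static int numGroups(int globalSize, int localSize) {
        return globalSize / localSize;
    }

    public static void main(String[] args) {
        int globalSize = 1024;
        int localSize = 256;
        for (int gid : new int[] {0, 255, 256, 1023}) {
            System.out.printf("gid=%4d group=%d local=%3d%n",
                    gid, groupId(gid, localSize), localId(gid, localSize));
        }
        System.out.println("groups=" + numGroups(globalSize, localSize));
    }
}
```

Under these assumptions globalId = groupId * localSize + localId holds for every work item; when groups differ in size, as the getLocalSize() note allows, this simple arithmetic no longer applies.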
-
getNumGroups
protected final int getNumGroups(int _dim) -
run
public abstract void run()
The entry point of a kernel.
Every kernel must override this method.
-
hasFallbackAlgorithm
public boolean hasFallbackAlgorithm()
False by default. In the event that all preferred devices fail to execute a kernel, it is possible to supply an alternate (possibly non-parallel) execution algorithm by overriding this method to return true, and overriding executeFallbackAlgorithm(Range, int) with the alternate algorithm.
executeFallbackAlgorithm
If hasFallbackAlgorithm() has been overridden to return true, this method should be overridden so as to apply a single pass of the kernel's logic to the entire _range.
This is not normally required, as fallback to JavaDevice.THREAD_POOL will implement the algorithm in parallel. However, in the event that thread pool execution may be prohibitively slow, this method might implement a "quick and dirty" approximation to the desired result (for example, a simple box-blur as opposed to a gaussian blur in an image processing application).
cancelMultiPass
public void cancelMultiPass()
Invoking this method flags that execution should be abandoned once the current pass is complete. Due to the complexity of intercommunication between Java (or C) and executing OpenCL, this is the best we can do for general cancellation of execution at present. OpenCL 2.0 should introduce pipe mechanisms which will support mid-pass cancellation easily. Note that in the case of thread-pool/pure Java execution we could do better already, using Thread.interrupt() (and/or other means) to abandon execution mid-pass. However, at present this is not attempted.
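The pass-level cancellation described here can be pictured with a flag that is only checked between passes. The following is a minimal sketch under that assumption; `MultiPassRunner` is a hypothetical class and does not reflect how Aparapi implements cancellation internally.

```java
// Hypothetical model of pass-level cancellation; not Aparapi's implementation.
class MultiPassRunner {
    private volatile boolean cancelRequested = false;
    private int passesCompleted = 0;

    /** Flags that execution should stop once the current pass has finished. */
    void cancelMultiPass() {
        cancelRequested = true;
    }

    int getPassesCompleted() {
        return passesCompleted;
    }

    /** Runs up to 'passes' passes, checking the flag only between passes. */
    void execute(int passes, Runnable onePass) {
        for (int pass = 0; pass < passes && !cancelRequested; pass++) {
            onePass.run();      // a pass that has started always runs to completion
            passesCompleted++;
        }
    }
}
```

A cancel request issued during the third of ten passes lets that pass finish and then stops, leaving three completed passes.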
- See Also:
-
getCancelState
public int getCancelState() -
getCurrentPass
public int getCurrentPass()
- See Also:
-
isExecuting
public boolean isExecuting()
- See Also:
-
clone
When using a Java thread pool, Aparapi uses clone to copy the initial instance to each thread. If you choose to override clone() you are responsible for delegating to super.clone();
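A minimal sketch of the delegation pattern follows. `ScratchTask` is a hypothetical stand-in that does not extend Aparapi's Kernel, and the per-thread `scratch` buffer is an assumption made for illustration; whether a field should be copied or shared depends on the kernel.

```java
// Hypothetical stand-in for a kernel-like class; it does not extend Aparapi's Kernel.
class ScratchTask implements Cloneable {
    int[] scratch = new int[4];   // assumed per-thread working buffer

    @Override
    public ScratchTask clone() {
        try {
            // Delegate to super.clone() first so the runtime type and fields are copied.
            ScratchTask copy = (ScratchTask) super.clone();
            // Then give the copy its own buffer instead of sharing the original's.
            copy.scratch = scratch.clone();
            return copy;
        } catch (CloneNotSupportedException e) {
            throw new AssertionError("Cloneable is implemented, so this cannot happen", e);
        }
    }
}
```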
-
acos
protected float acos(float a)
Delegates to either Math.acos(double) (Java) or acos(float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
a - value to delegate to Math.acos(double)/acos(float)
- Returns:
Math.acos(double) cast to float/acos(float)
- See Also:
-
acos
protected double acos(double a)
Delegates to either Math.acos(double) (Java) or acos(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
a - value to delegate to Math.acos(double)/acos(double)
- Returns:
Math.acos(double)/acos(double)
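To get a feel for the precision caveat on the Java side, the sketch below compares the float path (a double result narrowed to float, as documented for the float overloads) with the double result. It only models the Java fallback; results computed on an OpenCL device are not reproduced here, and `PrecisionCheck` is a hypothetical name.

```java
// Hypothetical helper illustrating the Java-side float path; not part of Aparapi.
final class PrecisionCheck {
    /** Java-side float path as documented: the double result narrowed to float. */
    static float acosFloat(float a) {
        return (float) Math.acos(a);
    }

    /** Absolute gap between the float and double results for the same input. */
    static double gap(float a) {
        return Math.abs(acosFloat(a) - Math.acos(a));
    }

    public static void main(String[] args) {
        // Prints a small value on the order of float rounding error.
        System.out.println(gap(0.5f));
    }
}
```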
- See Also:
-
asin
protected float asin(float _f)
Delegates to either Math.asin(double) (Java) or asin(float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
_f - value to delegate to Math.asin(double)/asin(float)
- Returns:
Math.asin(double) cast to float/asin(float)
- See Also:
-
asin
protected double asin(double _d)
Delegates to either Math.asin(double) (Java) or asin(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
_d - value to delegate to Math.asin(double)/asin(double)
- Returns:
Math.asin(double)/asin(double)
- See Also:
-
atan
protected float atan(float _f)
Delegates to either Math.atan(double) (Java) or atan(float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
_f - value to delegate to Math.atan(double)/atan(float)
- Returns:
Math.atan(double) cast to float/atan(float)
- See Also:
-
atan
protected double atan(double _d)
Delegates to either Math.atan(double) (Java) or atan(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
_d - value to delegate to Math.atan(double)/atan(double)
- Returns:
Math.atan(double)/atan(double)
- See Also:
-
atan2
protected float atan2(float _f1, float _f2)
Delegates to either Math.atan2(double, double) (Java) or atan2(float, float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
_f1 - value to delegate to first argument of Math.atan2(double, double)/atan2(float, float)
_f2 - value to delegate to second argument of Math.atan2(double, double)/atan2(float, float)
- Returns:
Math.atan2(double, double) cast to float/atan2(float, float)
- See Also:
-
atan2
protected double atan2(double _d1, double _d2)
Delegates to either Math.atan2(double, double) (Java) or atan2(double, double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
_d1 - value to delegate to first argument of Math.atan2(double, double)/atan2(double, double)
_d2 - value to delegate to second argument of Math.atan2(double, double)/atan2(double, double)
- Returns:
Math.atan2(double, double)/atan2(double, double)
- See Also:
-
ceil
protected float ceil(float _f)
Delegates to either Math.ceil(double) (Java) or ceil(float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
_f - value to delegate to Math.ceil(double)/ceil(float)
- Returns:
Math.ceil(double) cast to float/ceil(float)
- See Also:
-
ceil
protected double ceil(double _d)
Delegates to either Math.ceil(double) (Java) or ceil(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
_d - value to delegate to Math.ceil(double)/ceil(double)
- Returns:
Math.ceil(double)/ceil(double)
- See Also:
-
cos
protected float cos(float _f)
Delegates to either Math.cos(double) (Java) or cos(float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
_f - value to delegate to Math.cos(double)/cos(float)
- Returns:
Math.cos(double) cast to float/cos(float)
- See Also:
-
cos
protected double cos(double _d)
Delegates to either Math.cos(double) (Java) or cos(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
_d - value to delegate to Math.cos(double)/cos(double)
- Returns:
Math.cos(double)/cos(double)
- See Also:
-
exp
protected float exp(float _f)
Delegates to either Math.exp(double) (Java) or exp(float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
_f - value to delegate to Math.exp(double)/exp(float)
- Returns:
Math.exp(double) cast to float/exp(float)
- See Also:
-
exp
protected double exp(double _d)
Delegates to either Math.exp(double) (Java) or exp(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
_d - value to delegate to Math.exp(double)/exp(double)
- Returns:
Math.exp(double)/exp(double)
- See Also:
-
abs
protected float abs(float _f)
Delegates to either Math.abs(float) (Java) or fabs(float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
_f - value to delegate to Math.abs(float)/fabs(float)
- Returns:
Math.abs(float)/fabs(float)
- See Also:
-
popcount
protected int popcount(int _i)
Delegates to either Integer.bitCount(int) (Java) or popcount(int) (OpenCL).
- Parameters:
_i - value to delegate to Integer.bitCount(int)/popcount(int)
- Returns:
Integer.bitCount(int)/popcount(int)
- See Also:
-
popcount
protected long popcount(long _i)
Delegates to either Long.bitCount(long) (Java) or popcount(long) (OpenCL).
- Parameters:
_i - value to delegate to Long.bitCount(long)/popcount(long)
- Returns:
Long.bitCount(long)/popcount(long)
- See Also:
-
clz
protected int clz(int _i)
Delegates to either Integer.numberOfLeadingZeros(int) (Java) or clz(int) (OpenCL).
- Parameters:
_i - value to delegate to Integer.numberOfLeadingZeros(int)/clz(int)
- Returns:
Integer.numberOfLeadingZeros(int)/clz(int)
- See Also:
-
clz
protected long clz(long _l)
Delegates to either Long.numberOfLeadingZeros(long) (Java) or clz(long) (OpenCL).
- Parameters:
_l - value to delegate to Long.numberOfLeadingZeros(long)/clz(long)
- Returns:
Long.numberOfLeadingZeros(long)/clz(long)
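For reference, the Java methods that popcount and clz delegate to behave as follows. This sketch calls the JDK directly rather than going through a Kernel; `BitOps` is a hypothetical class name.

```java
// Demonstrates the JDK methods named as the Java-side delegates for popcount and clz.
final class BitOps {
    public static void main(String[] args) {
        System.out.println(Integer.bitCount(0b1011));          // 3 bits set
        System.out.println(Long.bitCount(-1L));                // 64 (all bits set)
        System.out.println(Integer.numberOfLeadingZeros(1));   // 31
        System.out.println(Long.numberOfLeadingZeros(1L));     // 63
        System.out.println(Integer.numberOfLeadingZeros(0));   // 32 (no bit set)
    }
}
```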
- See Also:
-
abs
protected double abs(double _d)
Delegates to either Math.abs(double) (Java) or fabs(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
_d - value to delegate to Math.abs(double)/fabs(double)
- Returns:
Math.abs(double)/fabs(double)
- See Also:
-
abs
protected int abs(int n)
Delegates to either Math.abs(int) (Java) or abs(int) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
n - value to delegate to Math.abs(int)/abs(int)
- Returns:
Math.abs(int)/abs(int)
- See Also:
-
abs
protected long abs(long n)
Delegates to either Math.abs(long) (Java) or abs(long) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
n - value to delegate to Math.abs(long)/abs(long)
- Returns:
Math.abs(long)/abs(long)
- See Also:
-
floor
protected float floor(float _f)
Delegates to either Math.floor(double) (Java) or floor(float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
_f - value to delegate to Math.floor(double)/floor(float)
- Returns:
Math.floor(double) cast to float/floor(float)
- See Also:
-
floor
protected double floor(double _d)
Delegates to either Math.floor(double) (Java) or floor(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
_d - value to delegate to Math.floor(double)/floor(double)
- Returns:
Math.floor(double)/floor(double)
- See Also:
-
max
protected float max(float _f1, float _f2)
Delegates to either Math.max(float, float) (Java) or fmax(float, float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
_f1 - value to delegate to first argument of Math.max(float, float)/fmax(float, float)
_f2 - value to delegate to second argument of Math.max(float, float)/fmax(float, float)
- Returns:
Math.max(float, float)/fmax(float, float)
- See Also:
-
max
protected double max(double _d1, double _d2)
Delegates to either Math.max(double, double) (Java) or fmax(double, double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
_d1 - value to delegate to first argument of Math.max(double, double)/fmax(double, double)
_d2 - value to delegate to second argument of Math.max(double, double)/fmax(double, double)
- Returns:
Math.max(double, double)/fmax(double, double)
- See Also:
-
max
protected int max(int n1, int n2)
Delegates to either Math.max(int, int) (Java) or max(int, int) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
n1 - value to delegate to first argument of Math.max(int, int)/max(int, int)
n2 - value to delegate to second argument of Math.max(int, int)/max(int, int)
- Returns:
Math.max(int, int)/max(int, int)
- See Also:
-
max
protected long max(long n1, long n2)
Delegates to either Math.max(long, long) (Java) or max(long, long) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
n1 - value to delegate to first argument of Math.max(long, long)/max(long, long)
n2 - value to delegate to second argument of Math.max(long, long)/max(long, long)
- Returns:
Math.max(long, long)/max(long, long)
- See Also:
-
min
protected float min(float _f1, float _f2)
Delegates to either Math.min(float, float) (Java) or fmin(float, float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
_f1 - value to delegate to first argument of Math.min(float, float)/fmin(float, float)
_f2 - value to delegate to second argument of Math.min(float, float)/fmin(float, float)
- Returns:
Math.min(float, float)/fmin(float, float)
- See Also:
-
min
protected double min(double _d1, double _d2)
Delegates to either Math.min(double, double) (Java) or fmin(double, double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
_d1 - value to delegate to first argument of Math.min(double, double)/fmin(double, double)
_d2 - value to delegate to second argument of Math.min(double, double)/fmin(double, double)
- Returns:
Math.min(double, double)/fmin(double, double)
- See Also:
-
min
protected int min(int n1, int n2)
Delegates to either Math.min(int, int) (Java) or min(int, int) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
n1 - value to delegate to first argument of Math.min(int, int)/min(int, int)
n2 - value to delegate to second argument of Math.min(int, int)/min(int, int)
- Returns:
Math.min(int, int)/min(int, int)
- See Also:
-
min
protected long min(long n1, long n2)
Delegates to either Math.min(long, long) (Java) or min(long, long) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
n1 - value to delegate to first argument of Math.min(long, long)/min(long, long)
n2 - value to delegate to second argument of Math.min(long, long)/min(long, long)
- Returns:
Math.min(long, long)/min(long, long)
- See Also:
-
log
protected float log(float _f)
Delegates to either Math.log(double) (Java) or log(float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
_f - value to delegate to Math.log(double)/log(float)
- Returns:
Math.log(double) cast to float/log(float)
- See Also:
-
log
protected double log(double _d)
Delegates to either Math.log(double) (Java) or log(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
_d - value to delegate to Math.log(double)/log(double)
- Returns:
Math.log(double)/log(double)
- See Also:
-
pow
protected float pow(float _f1, float _f2)
Delegates to either Math.pow(double, double) (Java) or pow(float, float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
_f1 - value to delegate to first argument of Math.pow(double, double)/pow(float, float)
_f2 - value to delegate to second argument of Math.pow(double, double)/pow(float, float)
- Returns:
Math.pow(double, double) cast to float/pow(float, float)
- See Also:
-
pow
protected double pow(double _d1, double _d2)
Delegates to either Math.pow(double, double) (Java) or pow(double, double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
_d1 - value to delegate to first argument of Math.pow(double, double)/pow(double, double)
_d2 - value to delegate to second argument of Math.pow(double, double)/pow(double, double)
- Returns:
Math.pow(double, double)/pow(double, double)
- See Also:
-
IEEEremainder
protected float IEEEremainder(float _f1, float _f2)
Delegates to either Math.IEEEremainder(double, double) (Java) or remainder(float, float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
_f1 - value to delegate to first argument of Math.IEEEremainder(double, double)/remainder(float, float)
_f2 - value to delegate to second argument of Math.IEEEremainder(double, double)/remainder(float, float)
- Returns:
Math.IEEEremainder(double, double) cast to float/remainder(float, float)
- See Also:
-
IEEEremainder
protected double IEEEremainder(double _d1, double _d2)
Delegates to either Math.IEEEremainder(double, double) (Java) or remainder(double, double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
_d1 - value to delegate to first argument of Math.IEEEremainder(double, double)/remainder(double, double)
_d2 - value to delegate to second argument of Math.IEEEremainder(double, double)/remainder(double, double)
- Returns:
Math.IEEEremainder(double, double)/remainder(double, double)
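Because the IEEE remainder is easy to confuse with Java's `%` operator, a short comparison may help. The sketch uses the JDK directly; `RemainderDemo` is a hypothetical class name.

```java
// Contrasts Java's % operator with Math.IEEEremainder, the Java-side delegate named above.
final class RemainderDemo {
    public static void main(String[] args) {
        // % truncates the quotient toward zero: 5 - 1*3
        System.out.println(5.0 % 3.0);                      // 2.0
        // IEEEremainder rounds the quotient to the nearest integer: 5 - 2*3
        System.out.println(Math.IEEEremainder(5.0, 3.0));   // -1.0
    }
}
```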
- See Also:
-
toRadians
protected float toRadians(float _f)
Delegates to either Math.toRadians(double) (Java) or radians(float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
_f - value to delegate to Math.toRadians(double)/radians(float)
- Returns:
Math.toRadians(double) cast to float/radians(float)
- See Also:
-
toRadians
protected double toRadians(double _d)
Delegates to either Math.toRadians(double) (Java) or radians(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
_d - value to delegate to Math.toRadians(double)/radians(double)
- Returns:
Math.toRadians(double)/radians(double)
- See Also:
-
toDegrees
protected float toDegrees(float _f)
Delegates to either Math.toDegrees(double) (Java) or degrees(float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
_f - value to delegate to Math.toDegrees(double)/degrees(float)
- Returns:
Math.toDegrees(double) cast to float/degrees(float)
- See Also:
-
toDegrees
protected double toDegrees(double _d)
Delegates to either Math.toDegrees(double) (Java) or degrees(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
_d - value to delegate to Math.toDegrees(double)/degrees(double)
- Returns:
Math.toDegrees(double)/degrees(double)
- See Also:
-
rint
protected float rint(float _f)
Delegates to either Math.rint(double) (Java) or rint(float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
_f - value to delegate to Math.rint(double)/rint(float)
- Returns:
Math.rint(double) cast to float/rint(float)
- See Also:
-
rint
protected double rint(double _d)
Delegates to either Math.rint(double) (Java) or rint(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
_d - value to delegate to Math.rint(double)/rint(double)
- Returns:
Math.rint(double)/rint(double)
- See Also:
-
round
protected int round(float _f)
Delegates to either Math.round(float) (Java) or round(float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
_f - value to delegate to Math.round(float)/round(float)
- Returns:
Math.round(float)/round(float)
- See Also:
-
round
protected long round(double _d)
Delegates to either Math.round(double) (Java) or round(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
_d - value to delegate to Math.round(double)/round(double)
- Returns:
Math.round(double)/round(double)
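The Java-side delegates for rint and round treat halfway cases differently, which the short sketch below demonstrates using the JDK directly (`RoundingDemo` is a hypothetical class name). As an aside that is not stated in this page: OpenCL C specifies that its round rounds halfway cases away from zero, so inputs such as -2.5 are worth testing on both paths.

```java
// Shows how Math.rint and Math.round treat halfway cases on the Java side.
final class RoundingDemo {
    public static void main(String[] args) {
        System.out.println(Math.rint(2.5));    // 2.0 (ties go to the even neighbour)
        System.out.println(Math.rint(3.5));    // 4.0
        System.out.println(Math.round(2.5));   // 3   (ties go toward positive infinity)
        System.out.println(Math.round(-2.5));  // -2
    }
}
```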
- See Also:
-
sin
protected float sin(float _f)
Delegates to either Math.sin(double) (Java) or sin(float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
_f - value to delegate to Math.sin(double)/sin(float)
- Returns:
Math.sin(double) cast to float/sin(float)
- See Also:
-
sin
protected double sin(double _d)
Delegates to either Math.sin(double) (Java) or sin(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
_d - value to delegate to Math.sin(double)/sin(double)
- Returns:
Math.sin(double)/sin(double)
- See Also:
-
sqrt
protected float sqrt(float _f)
Delegates to either Math.sqrt(double) (Java) or sqrt(float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
_f - value to delegate to Math.sqrt(double)/sqrt(float)
- Returns:
Math.sqrt(double) cast to float/sqrt(float)
- See Also:
-
sqrt
protected double sqrt(double _d)
Delegates to either Math.sqrt(double) (Java) or sqrt(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
_d - value to delegate to Math.sqrt(double)/sqrt(double)
- Returns:
Math.sqrt(double)/sqrt(double)
- See Also:
-
tan
protected float tan(float _f)
Delegates to either Math.tan(double) (Java) or tan(float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
_f - value to delegate to Math.tan(double)/tan(float)
- Returns:
Math.tan(double) cast to float/tan(float)
- See Also:
-
tan
protected double tan(double _d)
Delegates to either Math.tan(double) (Java) or tan(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
_d - value to delegate to Math.tan(double)/tan(double)
- Returns:
Math.tan(double)/tan(double)
- See Also:
-
acospi
protected final double acospi(double a) -
acospi
protected final float acospi(float a) -
asinpi
protected final double asinpi(double a) -
asinpi
protected final float asinpi(float a) -
atanpi
protected final double atanpi(double a) -
atanpi
protected final float atanpi(float a) -
atan2pi
protected final double atan2pi(double y, double x) -
atan2pi
protected final float atan2pi(float y, double x) -
cbrt
protected final double cbrt(double a) -
cbrt
protected final float cbrt(float a) -
cosh
protected final double cosh(double x) -
cosh
protected final float cosh(float x) -
cospi
protected final double cospi(double a) -
cospi
protected final float cospi(float a) -
exp2
protected final double exp2(double a) -
exp2
protected final float exp2(float a) -
exp10
protected final double exp10(double a) -
exp10
protected final float exp10(float a) -
expm1
protected final double expm1(double x) -
expm1
protected final float expm1(float x) -
log2
protected final double log2(double a) -
log2
protected final float log2(float a) -
log10
protected final double log10(double a) -
log10
protected final float log10(float a) -
log1p
protected final double log1p(double x) -
log1p
protected final float log1p(float x) -
mad
protected final double mad(double a, double b, double c) -
mad
protected final float mad(float a, float b, float c) -
fma
protected float fma(float a, float b, float c)
Delegates to either a*b+c (Java) or fma(float, float, float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
a - value to delegate to first argument of fma(float, float, float)
b - value to delegate to second argument of fma(float, float, float)
c - value to delegate to third argument of fma(float, float, float)
- Returns:
- a * b + c (Java) or fma(float, float, float) (OpenCL)
- See Also:
-
fma
protected double fma(double a, double b, double c)
Delegates to either a*b+c (Java) or fma(double, double, double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
a - value to delegate to first argument of fma(double, double, double)
b - value to delegate to second argument of fma(double, double, double)
c - value to delegate to third argument of fma(double, double, double)
- Returns:
- a * b + c (Java) or fma(double, double, double) (OpenCL)
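The precision note matters for fma because a*b+c rounds twice (after the multiply and after the add), while a fused multiply-add rounds once. The sketch below illustrates this on the JVM using `Math.fma`, which requires Java 9 or later and stands in here for the fused behaviour; it is an assumption for illustration and is not how the Kernel method is implemented. `FmaDemo` is a hypothetical name.

```java
// Contrasts the unfused expression a*b+c with a fused multiply-add (Math.fma, Java 9+).
final class FmaDemo {
    static final double A = 1.0 + Math.pow(2, -30);
    static final double B = 1.0 - Math.pow(2, -30);

    /** Unfused: the exact product 1 - 2^-60 is rounded to 1.0 before the addition. */
    static double unfused() {
        return A * B + (-1.0);
    }

    /** Fused: the exact product is used and only the final result is rounded. */
    static double fused() {
        return Math.fma(A, B, -1.0);
    }

    public static void main(String[] args) {
        System.out.println(unfused());  // 0.0
        System.out.println(fused());    // -2^-60, roughly -8.67e-19
    }
}
```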
- See Also:
-
nextAfter
protected final double nextAfter(double start, double direction) -
nextAfter
protected final float nextAfter(float start, float direction) -
sinh
protected final double sinh(double x)
Delegates to either Math.sinh(double) (Java) or sinh(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
x - value to delegate to Math.sinh(double)/sinh(double)
- Returns:
Math.sinh(double)/sinh(double)
- See Also:
-
sinh
protected final float sinh(float x)
Delegates to either Math.sinh(double) (Java) or sinh(float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
x - value to delegate to Math.sinh(double)/sinh(float)
- Returns:
Math.sinh(double)/sinh(float)
- See Also:
-
sinpi
protected final double sinpi(double a)
Backed by either Math.sin(double) (Java) or sinpi(double) (OpenCL). This method is equivalent to Math.sin(a * Math.PI). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
a - value to delegate to sinpi(double) or Java equivalent
- Returns:
sinpi(double) or Java equivalent
- See Also:
-
sinpi
protected final float sinpi(float a)
Backed by either Math.sin(double) (Java) or sinpi(float) (OpenCL). This method is equivalent to Math.sin(a * Math.PI). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
a - value to delegate to sinpi(float) or Java equivalent
- Returns:
sinpi(float) or Java equivalent
- See Also:
-
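The Java equivalent given for sinpi, Math.sin(a * Math.PI), is a useful place to see the precision caveat in practice. The sketch below evaluates that expression directly; `SinPiDemo` and its `sinpi` helper are hypothetical and only mirror the documented Java-side formula.

```java
// Evaluates the documented Java-side equivalent of sinpi: sin(a * PI).
final class SinPiDemo {
    static double sinpi(double a) {
        return Math.sin(a * Math.PI);
    }

    public static void main(String[] args) {
        // Mathematically sin(pi) is 0, but Math.PI only approximates pi,
        // so the result is a tiny non-zero value rather than exactly 0.0.
        System.out.println(sinpi(1.0));
        System.out.println(sinpi(0.0));   // 0.0
    }
}
```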
tanh
protected final double tanh(double x)
Delegates to either Math.tanh(double) (Java) or tanh(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
x - value to delegate to Math.tanh(double)/tanh(double)
- Returns:
Math.tanh(double)/tanh(double)
- See Also:
-
tanh
protected final float tanh(float x)
Delegates to either Math.tanh(double) (Java) or tanh(float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
x - value to delegate to Math.tanh(double)/tanh(float)
- Returns:
Math.tanh(double)/tanh(float)
- See Also:
-
tanpi
protected final double tanpi(double a)
Backed by either Math.tan(double) (Java) or tanpi(double) (OpenCL). This method is equivalent to Math.tan(a * Math.PI). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
a - value to delegate to tanpi(double) or Java equivalent
- Returns:
tanpi(double) or Java equivalent
- See Also:
-
tanpi
protected final float tanpi(float a)
Backed by either Math.tan(double) (Java) or tanpi(float) (OpenCL). This method is equivalent to Math.tan(a * Math.PI). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
a - value to delegate to tanpi(float) or Java equivalent
- Returns:
tanpi(float) or Java equivalent
- See Also:
-
rsqrt
protected float rsqrt(float _f)
Computes inverse square root using Math.sqrt(double) (Java) or delegates to rsqrt(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
_f - value to delegate to Math.sqrt(double)/rsqrt(double)
- Returns:
( 1.0f / Math.sqrt(double) cast to float )/rsqrt(double)
- See Also:
-
rsqrt
protected double rsqrt(double _d)
Computes inverse square root using Math.sqrt(double) (Java) or delegates to rsqrt(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable.
- Parameters:
_d - value to delegate to Math.sqrt(double)/rsqrt(double)
- Returns:
( 1.0f / Math.sqrt(double) )/rsqrt(double)
- See Also:
-
native_sqrt
private float native_sqrt(float _f) -
native_rsqrt
private float native_rsqrt(float _f) -
atomicAdd
protected int atomicAdd(int[] _arr, int _index, int _delta)
Atomically adds _delta value to _index element of array _arr (Java) or delegates to atomic_add(volatile int*, int) (OpenCL).
- Parameters:
_arr - array for which an element value needs to be atomically incremented by _delta
_index - index of the _arr array that needs to be atomically incremented by _delta
_delta - value by which _index element of _arr array needs to be atomically incremented
- Returns:
- previous value of _index element of _arr array
- See Also:
-
atomicGet
-
atomicSet
-
atomicAdd
-
atomicSub
-
atomicXchg
-
atomicInc
-
atomicDec
-
atomicCmpXchg
-
atomicMin
-
atomicMax
-
atomicAnd
-
atomicOr
-
atomicXor
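The "add and return the previous value" contract documented for atomicAdd can be sketched in plain Java with `java.util.concurrent.atomic.AtomicIntegerArray`. This is an illustration only: the Kernel method takes an `int[]`, and the use of AtomicIntegerArray here is an assumption of this example, not a description of Aparapi's internals. `AtomicAddDemo` is a hypothetical name.

```java
import java.util.concurrent.atomic.AtomicIntegerArray;

// Hypothetical illustration of add-and-return-previous semantics; not Aparapi code.
final class AtomicAddDemo {
    /** Adds delta to arr[index] atomically and returns the previous value. */
    static int atomicAdd(AtomicIntegerArray arr, int index, int delta) {
        return arr.getAndAdd(index, delta);
    }

    /** Increments one counter from several threads and returns the final count. */
    static int countConcurrently(int threads, int incrementsPerThread) {
        AtomicIntegerArray counters = new AtomicIntegerArray(1);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    atomicAdd(counters, 0, 1);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counters.get(0);
    }
}
```

With a plain `counters[0]++` on an `int[]`, concurrent increments could be lost; the atomic update is what makes the final count equal threads × incrementsPerThread.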
-
localBarrier
protected final void localBarrier()
Wait for all kernels in the current work group to rendezvous at this call before continuing execution.
It will also enforce memory ordering, such that modifications made to memory by each thread in the work-group before entering this barrier call will be visible to all threads leaving the barrier.
Note1: In OpenCL this executes as barrier(CLK_LOCAL_MEM_FENCE), which behaves differently than in Java, because it only guarantees that modifications made to the local memory space are visible to all threads leaving the barrier.
Note2: OpenCL requires that all threads enter the same if blocks and iterate the same number of times in all loops (for, while, ...).
Note3: In Java, localBarrier(), globalBarrier() and localGlobalBarrier() behave identically. -
globalBarrier
protected final void globalBarrier()
Wait for all kernels in the current work group to rendezvous at this call before continuing execution.
It will also enforce memory ordering, such that modifications made to memory by each thread in the work-group before entering this barrier call will be visible to all threads leaving the barrier.
Note1: In OpenCL this executes as barrier(CLK_GLOBAL_MEM_FENCE), which behaves differently than in Java, because it only guarantees visibility of modifications made to the global memory space to all threads, in the work group, leaving the barrier.
Note2: In OpenCL it is required that all threads enter the same if blocks and iterate the same number of times in all loops (for, while, ...).
Note3: The Java version is identical for localBarrier(), globalBarrier() and localGlobalBarrier().
-
localGlobalBarrier
protected final void localGlobalBarrier()
Wait for all kernels in the current work group to rendezvous at this call before continuing execution.
It will also enforce memory ordering, such that modifications made to memory by each thread in the work-group before entering this barrier call will be visible to all threads leaving the barrier.
Note1: When in doubt, use this barrier instead of localBarrier() or globalBarrier(), despite the possible performance loss.
Note2: In OpenCL this executes as barrier(CLK_LOCAL_MEM_FENCE | CLK_GLOBAL_MEM_FENCE), which behaves the same as in Java, because it guarantees the visibility of modifications made to any of the memory spaces to all threads, in the work group, leaving the barrier.
Note3: In OpenCL it is required that all threads enter the same if blocks and iterate the same number of times in all loops (for, while, ...).
Note4: The Java version is identical for localBarrier(), globalBarrier() and localGlobalBarrier().
-
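The rendezvous and visibility guarantee described for the barrier methods can be pictured with a plain Java analogy built on java.util.concurrent.CyclicBarrier. This is a sketch of the idea only, not how Aparapi implements its barriers; the class BarrierSketch and the method rotateLeft are hypothetical names.

```java
import java.util.Arrays;
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;

public class BarrierSketch {
    // Each "work item" writes its own slot, waits at the barrier, and then
    // reads the slot written by its right-hand neighbour.
    static int[] rotateLeft(int groupSize) {
        final int[] shared = new int[groupSize];
        final int[] result = new int[groupSize];
        final CyclicBarrier barrier = new CyclicBarrier(groupSize);
        Thread[] workers = new Thread[groupSize];
        for (int i = 0; i < groupSize; i++) {
            final int localId = i;
            workers[i] = new Thread(() -> {
                shared[localId] = localId * 10;      // write before the barrier
                try {
                    barrier.await();                 // every thread must reach this call
                } catch (InterruptedException | BrokenBarrierException e) {
                    throw new IllegalStateException(e);
                }
                // Safe to read: all writes made before await() are visible now.
                result[localId] = shared[(localId + 1) % groupSize];
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(rotateLeft(4)));
    }
}
```

If one thread skipped the await() call, the others would wait forever, which mirrors the requirement that all threads follow the same control flow around a barrier.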
hypot
protected float hypot(float a, float b) -
hypot
protected double hypot(double a, double b) -
getKernelState
-
prepareKernelRunner
-
registerProfileReportObserver
Registers a new profile report observer to receive profile reports as they're produced. This is the recommended method when the client application wants a single observer to receive all the execution profiles for the current kernel instance, on all devices, over all client threads running the kernel.
Note1: A report will be generated by a thread that finishes executing a kernel. In multithreaded execution environments it is up to the observer implementation to handle thread safety.
Note2: To cancel the report subscription, set the observer to null.
- Parameters:
observer
- the observer instance that will receive the profile reports
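Since reports may arrive from several threads, an observer has to protect its own state. The sketch below shows one way to do that with atomic counters. The ReportObserver interface here is a hypothetical stand-in with a simplified signature, not Aparapi's IProfileReportObserver, and simulate is a made-up driver.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.DoubleAdder;

public class ObserverSketch {
    // Hypothetical, simplified observer contract used only for this example.
    interface ReportObserver {
        void receiveReport(String deviceName, double elapsedMs);
    }

    // Thread-safe observer: both fields tolerate concurrent updates.
    static final class TotalsObserver implements ReportObserver {
        private final AtomicInteger reports = new AtomicInteger();
        private final DoubleAdder totalMs = new DoubleAdder();

        @Override
        public void receiveReport(String deviceName, double elapsedMs) {
            reports.incrementAndGet();
            totalMs.add(elapsedMs);
        }

        int reportCount() { return reports.get(); }
        double totalMs() { return totalMs.sum(); }
    }

    // Simulates several client threads that each deliver reports.
    static TotalsObserver simulate(int threads, int reportsPerThread) {
        TotalsObserver observer = new TotalsObserver();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < reportsPerThread; i++) {
                    observer.receiveReport("device-0", 2.0);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return observer;
    }

    public static void main(String[] args) {
        TotalsObserver o = simulate(4, 250);
        System.out.println(o.reportCount() + " reports, " + o.totalMs() + " ms");
    }
}
```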
-
getProfileReportLastThread
Retrieves a profile report for the last thread that executed this kernel on the given device. A report will only be available if at least one thread executed the kernel on the device.
Note1: If the profile report is intended to be kept in memory, the object should be cloned with ProfileReport.clone().
- Parameters:
device
- the relevant device where the kernel executed
- Returns:
- the profiling report for the most recent execution
- null, if no profiling report is available for such thread
- See Also:
-
getProfileReportCurrentThread
Retrieves the most recent complete report available for the thread calling this method, for the current kernel instance as executed on the given device.
Note1: If the profile report is intended to be kept in memory, the object should be cloned with ProfileReport.clone().
Note2: If the thread didn't execute this kernel on the specified device, it will return null.
- Parameters:
device
- the relevant device where the kernel executed
- Returns:
- the profiling report for the most recent execution
- null, if no profiling report is available for such thread
- See Also:
-
getExecutionTime
public double getExecutionTime()
Determine the execution time of the previous Kernel.execute(range) called from the last thread that ran and executed on the most recently used device.
Note1: This is kept for backwards compatibility only; use of either getProfileReportLastThread(Device) or registerProfileReportObserver(IProfileReportObserver) is encouraged instead.
Note2: Calling this method is not recommended when using more than a single thread to execute the same kernel, or when running kernels on more than one device concurrently.
Note that for the first call this will include the conversion time.
- Returns:
- The time spent executing the kernel (ms)
- NaN, if no profile report is available
- See Also:
-
getConversionTime
public double getConversionTime()
Determine the time taken to convert bytecode to OpenCL for the first Kernel.execute(range) call.
Note1: This is kept for backwards compatibility only; use of either getProfileReportLastThread(Device) or registerProfileReportObserver(IProfileReportObserver) is encouraged instead.
Note2: Calling this method is not recommended when using more than a single thread to execute the same kernel, or when running kernels on more than one device concurrently.
Note that for the first call this will include the conversion time.
- Returns:
- The time spent preparing the kernel for execution using GPU
- NaN, if no profile report is available
- See Also:
-
getAccumulatedExecutionTimeCurrentThread
Determine the total execution time of all previous kernel executions, called from the thread calling this method, that executed the current kernel on the specified device.
Note1: This is the recommended method to retrieve the accumulated execution time for the current thread, even when doing multithreading for the same kernel and device.
Note that this will include the initial conversion time.
- Parameters:
device
- the device of interest where the kernel executed
- Returns:
- The total time spent executing the kernel (ms)
- NaN, if no profiling information is available
- See Also:
-
getAccumulatedExecutionTimeAllThreads
Determine the total execution time of all produced profile reports from all threads that executed the current kernel on the specified device.
Note1: This is the recommended method to retrieve the accumulated execution time, even when doing multithreading for the same kernel and device.
Note that this will include the initial conversion time.
- Parameters:
device
- the device of interest where the kernel executed
- Returns:
- The total time spent executing the kernel (ms)
- NaN, if no profiling information is available
- See Also:
-
getAccumulatedExecutionTime
public double getAccumulatedExecutionTime()
Determine the total execution time of all previous Kernel.execute(range) calls for all threads that ran this kernel for the device used in the last kernel execution.
Note1: This is kept for backwards compatibility only; use of getAccumulatedExecutionTimeAllThreads(Device) is encouraged instead.
Note2: Calling this method is not recommended when using more than a single thread to execute the same kernel on multiple devices concurrently.
Note that this will include the initial conversion time.
- Returns:
- The total time spent executing the kernel (ms)
- NaN, if no profiling information is available
- See Also:
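The timing getters above return NaN when no profiling information is available. A caller has to test for that with Double.isNaN, because NaN never compares equal to anything, including itself. A minimal sketch, where TimingSketch and describe are hypothetical names and the millisecond value stands in for whatever a timing getter returned:

```java
import java.util.Locale;

public class TimingSketch {
    // Formats a timing value, treating NaN as "nothing was measured".
    static String describe(double millis) {
        if (Double.isNaN(millis)) {
            return "no profile report available";
        }
        return String.format(Locale.ROOT, "%.2f ms", millis);
    }

    public static void main(String[] args) {
        double missing = Double.NaN;
        System.out.println(missing == Double.NaN); // false: == cannot detect NaN
        System.out.println(describe(missing));
        System.out.println(describe(12.5));
    }
}
```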
-
execute
Start execution of _range kernels.
When kernel.execute(globalSize) is invoked, Aparapi will schedule the execution of globalSize kernels. If the execution mode is GPU then the kernels will execute as OpenCL code on the GPU device. Otherwise, if the mode is JTP, the kernels will execute as a pool of Java threads on the CPU.
- Parameters:
_range
- The number of Kernels that we would like to initiate.
-
toString
-
execute
Start execution of _range kernels.
When kernel.execute(_range) is invoked, Aparapi will schedule the execution of _range kernels. If the execution mode is GPU then the kernels will execute as OpenCL code on the GPU device. Otherwise, if the mode is JTP, the kernels will execute as a pool of Java threads on the CPU.
Since adding the new Range class this method offers backward compatibility and merely defers to execute(Range.create(_range), 1).
- Parameters:
_range
- The number of Kernels that we would like to initiate.
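The delegation described above, where the int overload forwards to the Range-based overload with a single pass, can be sketched with a toy model. MiniKernel and MiniRange are hypothetical stand-ins written for this example; the loop runs sequentially, whereas Aparapi would use OpenCL or a pool of Java threads.

```java
import java.util.Arrays;

public class ExecuteSketch {
    // Hypothetical stand-in for a range; only the global size is modelled.
    static final class MiniRange {
        final int globalSize;
        private MiniRange(int globalSize) { this.globalSize = globalSize; }
        static MiniRange create(int globalSize) { return new MiniRange(globalSize); }
    }

    // Hypothetical stand-in for a kernel with the two execute overloads.
    static abstract class MiniKernel {
        private int globalId;

        protected int getGlobalId() { return globalId; }

        public abstract void run();

        // Backward-compatible overload: defer to the range-based one, one pass.
        public MiniKernel execute(int range) {
            return execute(MiniRange.create(range), 1);
        }

        // Invokes run() once per global id, for each pass, then returns this.
        public MiniKernel execute(MiniRange range, int passes) {
            for (int pass = 0; pass < passes; pass++) {
                for (int gid = 0; gid < range.globalSize; gid++) {
                    globalId = gid;
                    run();
                }
            }
            return this;
        }
    }

    static int[] squares(final int[] values) {
        final int[] out = new int[values.length];
        new MiniKernel() {
            @Override
            public void run() {
                int gid = getGlobalId();
                out[gid] = values[gid] * values[gid];
            }
        }.execute(values.length);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(squares(new int[]{1, 2, 3, 4})));
    }
}
```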
-
createRange
-
execute
Start execution of _passes iterations of _range kernels.
When kernel.execute(_range, _passes) is invoked, Aparapi will schedule the execution of _range kernels. If the execution mode is GPU then the kernels will execute as OpenCL code on the GPU device. Otherwise, if the mode is JTP, the kernels will execute as a pool of Java threads on the CPU.
- Parameters:
_passes
- The number of passes to make
- Returns:
- The Kernel instance (this) so we can chain calls to put(arr).execute(range).get(arr)
-
execute
Start execution of _passes iterations over the _range of kernels.
When kernel.execute(_range) is invoked, Aparapi will schedule the execution of _range kernels. If the execution mode is GPU then the kernels will execute as OpenCL code on the GPU device. Otherwise, if the mode is JTP, the kernels will execute as a pool of Java threads on the CPU.
Since adding the new Range class this method offers backward compatibility and merely defers to execute(Range.create(_range), 1).
- Parameters:
_range
- The number of Kernels that we would like to initiate.
-
execute
Start execution of globalSize kernels for the given entrypoint.
When kernel.execute("entrypoint", globalSize) is invoked, Aparapi will schedule the execution of globalSize kernels. If the execution mode is GPU then the kernels will execute as OpenCL code on the GPU device. Otherwise, if the mode is JTP, the kernels will execute as a pool of Java threads on the CPU.
- Parameters:
_entrypoint
- the name of the method we wish to use as the entrypoint to the kernel
- Returns:
- The Kernel instance (this) so we can chain calls to put(arr).execute(range).get(arr)
-
execute
Start execution of globalSize kernels for the given entrypoint.
When kernel.execute("entrypoint", globalSize) is invoked, Aparapi will schedule the execution of globalSize kernels. If the execution mode is GPU then the kernels will execute as OpenCL code on the GPU device. Otherwise, if the mode is JTP, the kernels will execute as a pool of Java threads on the CPU.
- Parameters:
_entrypoint
- the name of the method we wish to use as the entrypoint to the kernel
- Returns:
- The Kernel instance (this) so we can chain calls to put(arr).execute(range).get(arr)
-
compile
Force pre-compilation of the kernel for a given device, without executing it.
- Parameters:
_device
- the device for which the kernel is to be compiled
- Returns:
- the Kernel instance (this) so we can chain calls
- Throws:
CompileFailedException
- if compilation failed for some reason
-
compile
Force pre-compilation of the kernel for a given device, without executing it.
- Parameters:
_entrypoint
- the name of the method we wish to use as the entrypoint to the kernel
_device
- the device for which the kernel is to be compiled
- Returns:
- the Kernel instance (this) so we can chain calls
- Throws:
CompileFailedException
- if compilation failed for some reason
-
getKernelMinimumPrivateMemSizeInUsePerWorkItem
public long getKernelMinimumPrivateMemSizeInUsePerWorkItem(Device device) throws QueryFailedException
Retrieves the minimum private memory in use per work item for this kernel instance and the specified device.
- Parameters:
device
- the device where the kernel is intended to run
- Returns:
- the number of bytes used per work item
- Throws:
QueryFailedException
- if the query couldn't complete
-
getKernelLocalMemSizeInUse
Retrieves the amount of local memory used in the specified device by this kernel instance.
- Parameters:
device
- the device where the kernel is intended to run
- Returns:
- the number of bytes of local memory in use for the specified device and current kernel
- Throws:
QueryFailedException
- if the query couldn't complete
-
getKernelPreferredWorkGroupSizeMultiple
Retrieves the preferred work-group multiple in the specified device for this kernel instance.
- Parameters:
device
- the device where the kernel is intended to run
- Returns:
- the preferred work group multiple
- Throws:
QueryFailedException
- if the query couldn't complete
-
getKernelMaxWorkGroupSize
Retrieves the maximum work-group size allowed for this kernel when running on the specified device.
- Parameters:
device
- the device where the kernel is intended to run
- Returns:
- the maximum work-group size allowed for this kernel on the specified device
- Throws:
QueryFailedException
- if the query couldn't complete
-
getKernelCompileWorkGroupSize
Retrieves the specified work-group size in the compiled kernel for the specified device or intermediate language for the device.
- Parameters:
device
- the device where the kernel is intended to run
- Returns:
- the work-group size specified in the compiled kernel
- Throws:
QueryFailedException
- if the query couldn't complete
-
isAutoCleanUpArrays
public boolean isAutoCleanUpArrays() -
setAutoCleanUpArrays
public void setAutoCleanUpArrays(boolean autoCleanUpArrays)
Property which, if true, enables automatic calling of cleanUpArrays() following each execution.
-
cleanUpArrays
public void cleanUpArrays()
Frees the bulk of the resources used by this kernel, by setting array sizes in non-primitive KernelArgs to 1 (0 size is prohibited) and invoking kernel execution on a zero size range. Unlike dispose(), this does not prohibit further invocations of this kernel, as sundry resources such as OpenCL queues are not freed by this method.
This allows a "dormant" Kernel to remain in existence without undue strain on GPU resources, which may be strongly preferable to disposing a Kernel and recreating another one later, as creation/use of a new Kernel (specifically creation of its associated OpenCL context) is expensive.
Note that where the underlying array field is declared final, for obvious reasons it is not resized to zero.
-
dispose
public void dispose()
Release any resources associated with this Kernel.
When the execution mode is CPU or GPU, Aparapi stores some OpenCL resources in a data structure associated with the kernel instance. The dispose() method must be called to release these resources.
If execute(int _globalSize) is called after dispose() is called, the results are undefined.
-
isRunningCL
public boolean isRunningCL() -
getTargetDevice
-
isAllowDevice
- Returns:
- true by default, may be overridden to allow vetoing of a device or devices by a given Kernel instance.
-
getExecutionMode
Deprecated. See Kernel.EXECUTION_MODE
Return the current execution mode. Before a Kernel executes, this return value will be the execution mode as determined by the setting of the EXECUTION_MODE enumeration. By default, this setting is either GPU if OpenCL is available on the target system, or JTP otherwise. This default setting can be changed by calling setExecutionMode().
After a Kernel executes, the return value will be the mode in which the Kernel actually executed.
- Returns:
- The current execution mode.
- See Also:
-
setExecutionMode
Deprecated. See Kernel.EXECUTION_MODE
Set the execution mode.
This should be regarded as a request. The real mode will be determined at runtime based on the availability of OpenCL and the characteristics of the workload.
- Parameters:
_executionMode
- the requested execution mode.
- See Also:
-
setExecutionModeWithoutFallback
-
setFallbackExecutionMode
Deprecated. -
descriptorToReturnTypeLetter
-
getReturnTypeLetter
-
toClassShortNameIfAny
-
getMappedMethodName
public static String getMappedMethodName(ClassModel.ConstantPool.MethodReferenceEntry _methodReferenceEntry) -
isMappedMethod
public static boolean isMappedMethod(ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry) -
isOpenCLDelegateMethod
public static boolean isOpenCLDelegateMethod(ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry) -
usesAtomic32
public static boolean usesAtomic32(ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry) -
usesAtomic64
public static boolean usesAtomic64(ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry) -
setExplicit
public void setExplicit(boolean _explicit)
For dev purposes (we should remove this for production), allows us to define that this Kernel uses explicit memory management.
- Parameters:
_explicit
- true if we want explicit memory management
-
isExplicit
public boolean isExplicit()
For dev purposes (we should remove this for production), determines whether this Kernel uses explicit memory management.
- Returns:
- true if the kernel is using explicit memory management
-
put
Tag this array so that it is explicitly enqueued before the kernel is executed- Parameters:
array
-- Returns:
- This kernel so that we can use the 'fluent' style API
-
put
Tag this array so that it is explicitly enqueued before the kernel is executed- Parameters:
array
-- Returns:
- This kernel so that we can use the 'fluent' style API
-
put
Tag this array so that it is explicitly enqueued before the kernel is executed- Parameters:
array
-- Returns:
- This kernel so that we can use the 'fluent' style API
-
put
Tag this array so that it is explicitly enqueued before the kernel is executed- Parameters:
array
-- Returns:
- This kernel so that we can use the 'fluent' style API
-
put
Tag this array so that it is explicitly enqueued before the kernel is executed- Parameters:
array
-- Returns:
- This kernel so that we can use the 'fluent' style API
-
put
Tag this array so that it is explicitly enqueued before the kernel is executed- Parameters:
array
-- Returns:
- This kernel so that we can use the 'fluent' style API
-
put
Tag this array so that it is explicitly enqueued before the kernel is executed- Parameters:
array
-- Returns:
- This kernel so that we can use the 'fluent' style API
-
put
Tag this array so that it is explicitly enqueued before the kernel is executed- Parameters:
array
-- Returns:
- This kernel so that we can use the 'fluent' style API
-
put
Tag this array so that it is explicitly enqueued before the kernel is executed- Parameters:
array
-- Returns:
- This kernel so that we can use the 'fluent' style API
-
put
Tag this array so that it is explicitly enqueued before the kernel is executed- Parameters:
array
-- Returns:
- This kernel so that we can use the 'fluent' style API
-
put
Tag this array so that it is explicitly enqueued before the kernel is executed- Parameters:
array
-- Returns:
- This kernel so that we can use the 'fluent' style API
-
put
Tag this array so that it is explicitly enqueued before the kernel is executed- Parameters:
array
-- Returns:
- This kernel so that we can use the 'fluent' style API
-
put
Tag this array so that it is explicitly enqueued before the kernel is executed- Parameters:
array
-- Returns:
- This kernel so that we can use the 'fluent' style API
-
put
Tag this array so that it is explicitly enqueued before the kernel is executed- Parameters:
array
-- Returns:
- This kernel so that we can use the 'fluent' style API
-
put
Tag this array so that it is explicitly enqueued before the kernel is executed- Parameters:
array
-- Returns:
- This kernel so that we can use the 'fluent' style API
-
put
Tag this array so that it is explicitly enqueued before the kernel is executed- Parameters:
array
-- Returns:
- This kernel so that we can use the 'fluent' style API
-
put
Tag this array so that it is explicitly enqueued before the kernel is executed- Parameters:
array
-- Returns:
- This kernel so that we can use the 'fluent' style API
-
put
Tag this array so that it is explicitly enqueued before the kernel is executed- Parameters:
array
-- Returns:
- This kernel so that we can use the 'fluent' style API
-
put
Tag this array so that it is explicitly enqueued before the kernel is executed- Parameters:
array
-- Returns:
- This kernel so that we can use the 'fluent' style API
-
put
Tag this array so that it is explicitly enqueued before the kernel is executed- Parameters:
array
-- Returns:
- This kernel so that we can use the 'fluent' style API
-
put
Tag this array so that it is explicitly enqueued before the kernel is executed- Parameters:
array
-- Returns:
- This kernel so that we can use the 'fluent' style API
-
get
Enqueue a request to return this buffer from the GPU. This method blocks until the array is available.- Parameters:
array
-- Returns:
- This kernel so that we can use the 'fluent' style API
-
get
Enqueue a request to return this buffer from the GPU. This method blocks until the array is available.- Parameters:
array
-- Returns:
- This kernel so that we can use the 'fluent' style API
-
get
Enqueue a request to return this buffer from the GPU. This method blocks until the array is available.- Parameters:
array
-- Returns:
- This kernel so that we can use the 'fluent' style API
-
get
Enqueue a request to return this buffer from the GPU. This method blocks until the array is available.- Parameters:
array
-- Returns:
- This kernel so that we can use the 'fluent' style API
-
get
Enqueue a request to return this buffer from the GPU. This method blocks until the array is available.- Parameters:
array
-- Returns:
- This kernel so that we can use the 'fluent' style API
-
get
Enqueue a request to return this buffer from the GPU. This method blocks until the array is available.- Parameters:
array
-- Returns:
- This kernel so that we can use the 'fluent' style API
-
get
Enqueue a request to return this buffer from the GPU. This method blocks until the array is available.- Parameters:
array
-- Returns:
- This kernel so that we can use the 'fluent' style API
-
get
Enqueue a request to return this buffer from the GPU. This method blocks until the array is available.- Parameters:
array
-- Returns:
- This kernel so that we can use the 'fluent' style API
-
get
Enqueue a request to return this buffer from the GPU. This method blocks until the array is available.- Parameters:
array
-- Returns:
- This kernel so that we can use the 'fluent' style API
-
get
Enqueue a request to return this buffer from the GPU. This method blocks until the array is available.- Parameters:
array
-- Returns:
- This kernel so that we can use the 'fluent' style API
-
get
Enqueue a request to return this buffer from the GPU. This method blocks until the array is available.- Parameters:
array
-- Returns:
- This kernel so that we can use the 'fluent' style API
-
get
Enqueue a request to return this buffer from the GPU. This method blocks until the array is available.- Parameters:
array
-- Returns:
- This kernel so that we can use the 'fluent' style API
-
get
Enqueue a request to return this buffer from the GPU. This method blocks until the array is available.- Parameters:
array
-- Returns:
- This kernel so that we can use the 'fluent' style API
-
get
Enqueue a request to return this buffer from the GPU. This method blocks until the array is available.- Parameters:
array
-- Returns:
- This kernel so that we can use the 'fluent' style API
-
get
Enqueue a request to return this buffer from the GPU. This method blocks until the array is available.- Parameters:
array
-- Returns:
- This kernel so that we can use the 'fluent' style API
-
get
Enqueue a request to return this buffer from the GPU. This method blocks until the array is available.- Parameters:
array
-- Returns:
- This kernel so that we can use the 'fluent' style API
-
get
Enqueue a request to return this buffer from the GPU. This method blocks until the array is available.- Parameters:
array
-- Returns:
- This kernel so that we can use the 'fluent' style API
-
get
Enqueue a request to return this buffer from the GPU. This method blocks until the array is available.- Parameters:
array
-- Returns:
- This kernel so that we can use the 'fluent' style API
-
get
Enqueue a request to return this buffer from the GPU. This method blocks until the array is available.- Parameters:
array
-- Returns:
- This kernel so that we can use the 'fluent' style API
-
get
Enqueue a request to return this buffer from the GPU. This method blocks until the array is available.- Parameters:
array
-- Returns:
- This kernel so that we can use the 'fluent' style API
-
get
Enqueue a request to return this buffer from the GPU. This method blocks until the array is available.- Parameters:
array
-- Returns:
- This kernel so that we can use the 'fluent' style API
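The put and get descriptions, together with the put(arr).execute(range).get(arr) chaining mentioned for execute, suggest the following usage pattern under explicit memory management. The sketch is a toy model only: MiniKernel keeps a separate "device" copy of each array to mimic explicit transfers, and every name in it is hypothetical rather than Aparapi API.

```java
import java.util.Arrays;
import java.util.IdentityHashMap;
import java.util.Map;

public class ExplicitTransferSketch {
    // Hypothetical kernel that only sees arrays that were explicitly put.
    static final class MiniKernel {
        private final Map<int[], int[]> device = new IdentityHashMap<>();

        // Copy the host array to the simulated device before execution.
        MiniKernel put(int[] array) {
            device.put(array, array.clone());
            return this;
        }

        // Double the first `range` elements of every device-side buffer.
        MiniKernel execute(int range) {
            for (int[] buffer : device.values()) {
                for (int i = 0; i < Math.min(range, buffer.length); i++) {
                    buffer[i] *= 2;
                }
            }
            return this;
        }

        // Copy the device-side buffer back into the host array.
        MiniKernel get(int[] array) {
            int[] buffer = device.get(array);
            System.arraycopy(buffer, 0, array, 0, buffer.length);
            return this;
        }
    }

    static int[] doubled(int[] host) {
        new MiniKernel().put(host).execute(host.length).get(host);
        return host;
    }

    // Without get(), the host array never sees the device-side result.
    static int[] withoutGet(int[] host) {
        new MiniKernel().put(host).execute(host.length);
        return host;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(doubled(new int[]{1, 2, 3})));
        System.out.println(Arrays.toString(withoutGet(new int[]{1, 2, 3})));
    }
}
```

Each method returns the kernel itself, which is what makes the fluent chaining possible.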
-
getProfileInfo
Get the profiling information from the last successful call to Kernel.execute().
- Returns:
- A list of ProfileInfo records
-
addExecutionModes
Deprecated. See Kernel.EXECUTION_MODE.
Set the possible fallback path for execution modes. For example, setExecutionFallbackPath(GPU,CPU,JTP) will try to use the GPU; if that fails, it will fall back to OpenCL CPU, and finally it will try JTP.
-
hasNextExecutionMode
Deprecated.
- Returns:
- whether there is another execution path we can try
-
tryNextExecutionMode
Deprecated. See Kernel.EXECUTION_MODE.
Try the next execution path in the list; if there aren't any more, then give up.
-
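The deprecated fallback mechanism described by addExecutionModes, hasNextExecutionMode and tryNextExecutionMode amounts to walking an ordered list of modes until one works. The sketch below models that idea with a hypothetical Mode enum and a pick helper; the set of "usable" modes is an assumption standing in for whatever makes a mode fail at runtime.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.Set;

public class FallbackSketch {
    // Hypothetical modes mirroring the GPU, CPU, JTP example in the text.
    enum Mode { GPU, CPU, JTP }

    // Walks the fallback path in order and returns the first usable mode,
    // or null when the path is exhausted.
    static Mode pick(Set<Mode> usable, Mode... path) {
        // LinkedHashSet keeps insertion order and drops repeated entries.
        Iterator<Mode> remaining = new LinkedHashSet<>(Arrays.asList(path)).iterator();
        while (remaining.hasNext()) {          // is there another path to try?
            Mode candidate = remaining.next(); // try the next path in the list
            if (usable.contains(candidate)) {
                return candidate;
            }
        }
        return null; // no more paths: give up
    }

    public static void main(String[] args) {
        // Assume only Java thread pool execution works on this machine.
        System.out.println(pick(EnumSet.of(Mode.JTP), Mode.GPU, Mode.CPU, Mode.JTP));
    }
}
```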
getBoolean
private static boolean getBoolean(ValueCache<Class<?>, Map<String, Boolean>, RuntimeException> methodNamesCache, ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry) -
markedWith
private static <A extends Annotation> ValueCache<Class<?>, Map<String, Boolean>, RuntimeException> markedWith(Class<A> annotationClass)
-
toSignature
-
getArgumentsLetters
-
isRelevant
-
getProperty
private static <V, T extends Throwable> V getProperty(ValueCache<Class<?>, Map<String, V>, T> cache, ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry, V defaultValue) throws T
- Throws:
T
-
toSignature
private static String toSignature(ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry) -
cacheProperty
private static <K, V, T extends Throwable> ValueCache<Class<?>, Map<K, V>, T> cacheProperty(ValueCache.ThrowingValueComputer<Class<?>, Map<K, V>, T> throwingValueComputer)
-
invalidateCaches
public static void invalidateCaches()
-
EXECUTION_MODEs are used, as a more sophisticated Device preference mechanism is in place, see KernelManager.