public abstract class Kernel
extends java.lang.Object
implements java.lang.Cloneable
To write a new kernel, a developer extends the Kernel class and overrides the Kernel.run() method.
To execute this kernel, the developer creates a new instance of it and calls Kernel.execute(int globalSize) with a suitable 'global size'. At runtime Aparapi will attempt to convert the Kernel.run() method (and any method called directly or indirectly by Kernel.run()) into OpenCL for execution on GPU devices made available via the OpenCL platform.
Note that Kernel.run() is not called directly. Instead, the Kernel.execute(int globalSize) method will cause the overridden Kernel.run() method to be invoked once for each value in the range 0 to globalSize - 1.
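To make the execution model concrete, the sketch below mimics it in plain Java with no Aparapi dependency. MiniKernel and MiniSquareDemo are hypothetical stand-ins (not Aparapi classes), and the loop is strictly sequential, whereas Aparapi may run the invocations in parallel on a device.

```java
import java.util.Arrays;

// Hypothetical stand-in for the Kernel contract; NOT an Aparapi class.
abstract class MiniKernel {
    private int globalId; // id of the invocation that is currently running

    // Mirrors the idea of getGlobalId(): which work item is this?
    protected final int getGlobalId() {
        return globalId;
    }

    // User code overrides run(); user code never calls it directly.
    public abstract void run();

    // Invokes run() once for each id in 0 .. globalSize-1 (sequentially here).
    public final MiniKernel execute(int globalSize) {
        for (int id = 0; id < globalSize; id++) {
            globalId = id;
            run();
        }
        return this;
    }
}

final class MiniSquareDemo {
    // Squares every element by running one "work item" per index.
    static int[] squares(final int[] values) {
        final int[] out = new int[values.length];
        new MiniKernel() {
            @Override
            public void run() {
                int gid = getGlobalId();
                out[gid] = values[gid] * values[gid];
            }
        }.execute(values.length);
        return out;
    }

    public static void main(String[] args) {
        // prints [1, 4, 9, 16]
        System.out.println(Arrays.toString(squares(new int[] {1, 2, 3, 4})));
    }
}
```

Each call to run() sees a different id, which is why the body indexes its arrays with getGlobalId() rather than looping itself.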
On the first call to Kernel.execute(int _globalSize), Aparapi will determine the EXECUTION_MODE of the kernel.
This decision is made dynamically based on two factors:
1. Whether OpenCL is available on the target system.
2. Whether the run() method (and every method that can be called directly or indirectly from the run() method) can be converted into OpenCL.
Below is an example Kernel that calculates the square of a set of input values.
    class SquareKernel extends Kernel {
        private int values[];
        private int squares[];

        public SquareKernel(int values[]) {
            this.values = values;
            squares = new int[values.length];
        }

        // Invoked once per work item; each invocation handles one element.
        public void run() {
            int gid = getGlobalId();
            squares[gid] = values[gid] * values[gid];
        }

        public int[] getSquares() {
            return (squares);
        }
    }
To execute this kernel, first create a new instance of it and then call execute(Range _range).
    int[] values = new int[1024];
    // fill values array
    Range range = Range.create(values.length); // create a range 0..1024
    SquareKernel kernel = new SquareKernel(values);
    kernel.execute(range);
When execute(Range) returns, all the executions of Kernel.run() have completed and the results are available in the squares array.
    int[] squares = kernel.getSquares();
    for (int i = 0; i < values.length; i++) {
        System.out.printf("%4d %4d %8d\n", i, values[i], squares[i]);
    }
A different approach to creating kernels that avoids extending Kernel is to write an anonymous inner class:
    final int[] values = new int[1024];
    // fill the values array
    final int[] squares = new int[values.length];
    final Range range = Range.create(values.length);

    Kernel kernel = new Kernel() {
        public void run() {
            int gid = getGlobalId();
            squares[gid] = values[gid] * values[gid];
        }
    };
    kernel.execute(range);
    for (int i = 0; i < values.length; i++) {
        System.out.printf("%4d %4d %8d\n", i, values[i], squares[i]);
    }
Modifier and Type | Class and Description |
---|---|
static interface |
Kernel.Constant
We can use this Annotation to 'tag' intended constant buffers.
|
class |
Kernel.Entry |
static class |
Kernel.EXECUTION_MODE
Deprecated.
It is no longer recommended that EXECUTION_MODEs are used, as a more sophisticated Device preference mechanism is in place, see KernelManager. Though setExecutionMode(EXECUTION_MODE) is still honored, the default EXECUTION_MODE is now Kernel.EXECUTION_MODE.AUTO, which indicates that the KernelManager will determine execution behaviours.
The execution mode ENUM enumerates the possible modes of executing a kernel. One can request a mode of execution using the values below, and query a kernel after it first executes to determine how it executed. Aparapi supports 5 execution modes. The legacy default is GPU; as noted above, the default is now AUTO.
To request that a kernel is executed in a specific mode, call setExecutionMode(EXECUTION_MODE) before the kernel first executes:
    int[] values = new int[1024];
    // fill values array
    SquareKernel kernel = new SquareKernel(values);
    kernel.setExecutionMode(Kernel.EXECUTION_MODE.JTP);
    kernel.execute(values.length);
Alternatively, the execution mode can be set through a system property when the application is launched:
    java -classpath ....;codegen.jar -Dcom.codegen.executionMode=GPU MyApplication
Generally setting the execution mode is not recommended (it is best to let Aparapi decide automatically) but the option provides a way to compare a kernel's performance under multiple execution modes.
|
class |
Kernel.KernelState
This class is for internal Kernel state management
|
static interface |
Kernel.Local
We can use this Annotation to 'tag' intended local buffers.
|
static interface |
Kernel.NoCL
Annotation which can be applied to either a getter (with usual java bean naming convention relative to an instance field), or to any method
with void return type, which prevents both the method body and any calls to the method being emitted in the generated OpenCL.
|
protected static interface |
Kernel.OpenCLDelegate
This annotation is for internal use only
|
protected static interface |
Kernel.OpenCLMapping
This annotation is for internal use only
|
static interface |
Kernel.PrivateMemorySpace
We can use this Annotation to 'tag' __private (unshared) array fields.
|
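The fallback idea behind the deprecated EXECUTION_MODE API (try a preferred mode, then fall back to the next one) can be pictured with a small plain-Java sketch. The DemoMode enum, the FallbackChain class and the availability predicate are illustrative assumptions; they are not Aparapi types and do not reflect how KernelManager selects devices.

```java
import java.util.LinkedHashSet;
import java.util.function.Predicate;

// Hypothetical mode names, for illustration only.
enum DemoMode { GPU, CPU, JTP, SEQ }

final class FallbackChain {
    // LinkedHashSet keeps insertion order and ignores duplicates.
    private final LinkedHashSet<DemoMode> modes = new LinkedHashSet<>();

    // Appends modes in order of preference.
    FallbackChain add(DemoMode... preferred) {
        for (DemoMode m : preferred) {
            modes.add(m);
        }
        return this;
    }

    // Returns the first mode accepted by 'usable', or null if every mode is rejected.
    DemoMode select(Predicate<DemoMode> usable) {
        for (DemoMode candidate : modes) {
            if (usable.test(candidate)) {
                return candidate;
            }
        }
        return null;
    }
}
```

For example, `new FallbackChain().add(DemoMode.GPU, DemoMode.CPU, DemoMode.JTP).select(m -> m != DemoMode.GPU)` yields `DemoMode.CPU`, modelling a system on which the GPU path is unavailable.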
Modifier and Type | Field and Description |
---|---|
private static java.util.function.IntBinaryOperator |
andOperator |
private static ValueCache<java.lang.Class<?>,java.util.Map<java.lang.String,java.lang.Boolean>,java.lang.RuntimeException> |
atomic32Cache |
private static ValueCache<java.lang.Class<?>,java.util.Map<java.lang.String,java.lang.Boolean>,java.lang.RuntimeException> |
atomic64Cache |
private boolean |
autoCleanUpArrays |
static java.lang.String |
CONSTANT_SUFFIX
We can use this suffix to 'tag' intended constant buffers.
|
private java.util.Iterator<Kernel.EXECUTION_MODE> |
currentMode
Deprecated.
|
private Kernel.EXECUTION_MODE |
executionMode
Deprecated.
|
private java.util.LinkedHashSet<Kernel.EXECUTION_MODE> |
executionModes
Deprecated.
|
private KernelRunner |
kernelRunner |
private Kernel.KernelState |
kernelState |
static java.lang.String |
LOCAL_SUFFIX
We can use this suffix to 'tag' intended local buffers.
|
private static double |
LOG_2_RECIPROCAL |
private static java.util.logging.Logger |
logger |
private static ValueCache<java.lang.Class<?>,java.util.Map<java.lang.String,java.lang.Boolean>,java.lang.RuntimeException> |
mappedMethodFlags |
private static ValueCache<java.lang.Class<?>,java.util.Map<java.lang.String,java.lang.String>,java.lang.RuntimeException> |
mappedMethodNamesCache |
private static java.util.function.IntBinaryOperator |
maxOperator |
private static java.util.function.IntBinaryOperator |
minOperator |
private static ValueCache<java.lang.Class<?>,java.util.Map<java.lang.String,java.lang.Boolean>,java.lang.RuntimeException> |
openCLDelegateMethodFlags |
private static java.util.function.IntBinaryOperator |
orOperator |
private static double |
PI_RECIPROCAL |
static java.lang.String |
PRIVATE_SUFFIX
We can use this suffix to 'tag' __private buffers.
|
(package private) static java.util.Map<java.lang.String,java.lang.String> |
typeToLetterMap |
(package private) boolean |
useNullForLocalSize |
private static java.util.function.IntBinaryOperator |
xorOperator |
Constructor and Description |
---|
Kernel() |
Modifier and Type | Method and Description |
---|---|
protected double |
abs(double _d)
Delegates to either
Math.abs(double) (Java) or fabs(double) (OpenCL). |
protected float |
abs(float _f)
Delegates to either
Math.abs(float) (Java) or fabs(float) (OpenCL). |
protected int |
abs(int n)
Delegates to either
Math.abs(int) (Java) or abs(int) (OpenCL). |
protected long |
abs(long n)
Delegates to either
Math.abs(long) (Java) or abs(long) (OpenCL). |
protected double |
acos(double a)
Delegates to either
Math.acos(double) (Java) or acos(double) (OpenCL). |
protected float |
acos(float a)
Delegates to either
Math.acos(double) (Java) or acos(float) (OpenCL). |
protected double |
acospi(double a) |
protected float |
acospi(float a) |
void |
addExecutionModes(Kernel.EXECUTION_MODE... platforms)
Deprecated.
See
Kernel.EXECUTION_MODE .
Sets a possible fallback path for execution modes. For example, addExecutionModes(GPU, CPU, JTP) will try the GPU first; if that fails it will fall back to OpenCL CPU, and finally it will try JTP. |
protected double |
asin(double _d)
Delegates to either
Math.asin(double) (Java) or asin(double) (OpenCL). |
protected float |
asin(float _f)
Delegates to either
Math.asin(double) (Java) or asin(float) (OpenCL). |
protected double |
asinpi(double a) |
protected float |
asinpi(float a) |
protected double |
atan(double _d)
Delegates to either
Math.atan(double) (Java) or atan(double) (OpenCL). |
protected float |
atan(float _f)
Delegates to either
Math.atan(double) (Java) or atan(float) (OpenCL). |
protected double |
atan2(double _d1,
double _d2)
Delegates to either
Math.atan2(double, double) (Java) or atan2(double, double) (OpenCL). |
protected float |
atan2(float _f1,
float _f2)
Delegates to either
Math.atan2(double, double) (Java) or atan2(float, float) (OpenCL). |
protected double |
atan2pi(double y,
double x) |
protected float |
atan2pi(float y,
double x) |
protected double |
atanpi(double a) |
protected float |
atanpi(float a) |
protected int |
atomicAdd(java.util.concurrent.atomic.AtomicInteger p,
int val) |
protected int |
atomicAdd(int[] _arr,
int _index,
int _delta)
Atomically adds
_delta value to _index element of array _arr (Java) or delegates to atomic_add(volatile int*, int) (OpenCL). |
protected int |
atomicAnd(java.util.concurrent.atomic.AtomicInteger p,
int val) |
protected int |
atomicCmpXchg(java.util.concurrent.atomic.AtomicInteger p,
int expectedVal,
int newVal) |
protected int |
atomicDec(java.util.concurrent.atomic.AtomicInteger p) |
protected int |
atomicGet(java.util.concurrent.atomic.AtomicInteger p) |
protected int |
atomicInc(java.util.concurrent.atomic.AtomicInteger p) |
protected int |
atomicMax(java.util.concurrent.atomic.AtomicInteger p,
int val) |
protected int |
atomicMin(java.util.concurrent.atomic.AtomicInteger p,
int val) |
protected int |
atomicOr(java.util.concurrent.atomic.AtomicInteger p,
int val) |
protected void |
atomicSet(java.util.concurrent.atomic.AtomicInteger p,
int val) |
protected int |
atomicSub(java.util.concurrent.atomic.AtomicInteger p,
int val) |
protected int |
atomicXchg(java.util.concurrent.atomic.AtomicInteger p,
int newVal) |
protected int |
atomicXor(java.util.concurrent.atomic.AtomicInteger p,
int val) |
private static <K,V,T extends java.lang.Throwable> |
cacheProperty(ValueCache.ThrowingValueComputer<java.lang.Class<?>,java.util.Map<K,V>,T> throwingValueComputer) |
void |
cancelMultiPass()
Invoking this method flags that once the current pass is complete execution should be abandoned.
|
protected double |
cbrt(double a) |
protected float |
cbrt(float a) |
protected double |
ceil(double _d)
Delegates to either
Math.ceil(double) (Java) or ceil(double) (OpenCL). |
protected float |
ceil(float _f)
Delegates to either
Math.ceil(double) (Java) or ceil(float) (OpenCL). |
void |
cleanUpArrays()
Frees the bulk of the resources used by this kernel, by setting array sizes in non-primitive
KernelArgs to 1 (0 size is prohibited) and invoking kernel
execution on a zero size range. |
Kernel |
clone()
When using a Java Thread Pool Aparapi uses clone to copy the initial instance to each thread.
|
protected int |
clz(int _i)
Delegates to either
Integer.numberOfLeadingZeros(int) (Java) or clz(int) (OpenCL). |
protected long |
clz(long _l)
Delegates to either
Long.numberOfLeadingZeros(long) (Java) or clz(long) (OpenCL). |
Kernel |
compile(Device _device)
Force pre-compilation of the kernel for a given device, without executing it.
|
Kernel |
compile(java.lang.String _entrypoint,
Device _device)
Force pre-compilation of the kernel for a given device, without executing it.
|
protected double |
cos(double _d)
Delegates to either
Math.cos(double) (Java) or cos(double) (OpenCL). |
protected float |
cos(float _f)
Delegates to either
Math.cos(double) (Java) or cos(float) (OpenCL). |
protected double |
cosh(double x) |
protected float |
cosh(float x) |
protected double |
cospi(double a) |
protected float |
cospi(float a) |
protected Range |
createRange(int _range) |
private static java.lang.String |
descriptorToReturnTypeLetter(java.lang.String desc) |
void |
dispose()
Release any resources associated with this Kernel.
|
Kernel |
execute(int _range)
Start execution of
_range kernels. |
Kernel |
execute(int _range,
int _passes)
Start execution of
_passes iterations over the _range of kernels. |
Kernel |
execute(Range _range)
Start execution of
_range kernels. |
Kernel |
execute(Range _range,
int _passes)
Start execution of
_passes iterations of _range kernels. |
Kernel |
execute(java.lang.String _entrypoint,
Range _range)
Start execution of
_range kernels for the given entrypoint. |
Kernel |
execute(java.lang.String _entrypoint,
Range _range,
int _passes)
Start execution of
_passes iterations of _range kernels for the given entrypoint. |
void |
executeFallbackAlgorithm(Range _range,
int _passId)
If
hasFallbackAlgorithm() has been overridden to return true, this method should be overridden so as to
apply a single pass of the kernel's logic to the entire _range. |
protected double |
exp(double _d)
Delegates to either
Math.exp(double) (Java) or exp(double) (OpenCL). |
protected float |
exp(float _f)
Delegates to either
Math.exp(double) (Java) or exp(float) (OpenCL). |
protected double |
exp10(double a) |
protected float |
exp10(float a) |
protected double |
exp2(double a) |
protected float |
exp2(float a) |
protected double |
expm1(double x) |
protected float |
expm1(float x) |
protected double |
floor(double _d)
Delegates to either
Math.floor(double) (Java) or floor(double) (OpenCL). |
protected float |
floor(float _f)
Delegates to either
Math.floor(double) (Java) or floor(float) (OpenCL). |
protected double |
fma(double a,
double b,
double c)
Delegates to either a*b+c (Java) or
fma(double, double, double) (OpenCL). |
protected float |
fma(float a,
float b,
float c)
Delegates to either a*b+c (Java) or
fma(float, float, float) (OpenCL). |
Kernel |
get(boolean[] array)
Enqueue a request to return this buffer from the GPU.
|
Kernel |
get(boolean[][] array)
Enqueue a request to return this buffer from the GPU.
|
Kernel |
get(boolean[][][] array)
Enqueue a request to return this buffer from the GPU.
|
Kernel |
get(byte[] array)
Enqueue a request to return this buffer from the GPU.
|
Kernel |
get(byte[][] array)
Enqueue a request to return this buffer from the GPU.
|
Kernel |
get(byte[][][] array)
Enqueue a request to return this buffer from the GPU.
|
Kernel |
get(char[] array)
Enqueue a request to return this buffer from the GPU.
|
Kernel |
get(char[][] array)
Enqueue a request to return this buffer from the GPU.
|
Kernel |
get(char[][][] array)
Enqueue a request to return this buffer from the GPU.
|
Kernel |
get(double[] array)
Enqueue a request to return this buffer from the GPU.
|
Kernel |
get(double[][] array)
Enqueue a request to return this buffer from the GPU.
|
Kernel |
get(double[][][] array)
Enqueue a request to return this buffer from the GPU.
|
Kernel |
get(float[] array)
Enqueue a request to return this buffer from the GPU.
|
Kernel |
get(float[][] array)
Enqueue a request to return this buffer from the GPU.
|
Kernel |
get(float[][][] array)
Enqueue a request to return this buffer from the GPU.
|
Kernel |
get(int[] array)
Enqueue a request to return this buffer from the GPU.
|
Kernel |
get(int[][] array)
Enqueue a request to return this buffer from the GPU.
|
Kernel |
get(int[][][] array)
Enqueue a request to return this buffer from the GPU.
|
Kernel |
get(long[] array)
Enqueue a request to return this buffer from the GPU.
|
Kernel |
get(long[][] array)
Enqueue a request to return this buffer from the GPU.
|
Kernel |
get(long[][][] array)
Enqueue a request to return this buffer from the GPU.
|
double |
getAccumulatedExecutionTime()
Determine the total execution time of all previous Kernel.execute(range) calls for all threads
that ran this kernel for the device used in the last kernel execution.
|
double |
getAccumulatedExecutionTimeAllThreads(Device device)
Determine the total execution time of all produced profile reports from all threads that executed the
current kernel on the specified device.
|
double |
getAccumulatedExecutionTimeCurrentThread(Device device)
Determine the total execution time of all previous kernel executions called from the current thread,
calling this method, that executed the current kernel on the specified device.
|
private static java.lang.String |
getArgumentsLetters(java.lang.reflect.Method method) |
private static boolean |
getBoolean(ValueCache<java.lang.Class<?>,java.util.Map<java.lang.String,java.lang.Boolean>,java.lang.RuntimeException> methodNamesCache,
ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry) |
int |
getCancelState() |
double |
getConversionTime()
Determine the time taken to convert bytecode to OpenCL for first Kernel.execute(range) call.
|
int |
getCurrentPass() |
Kernel.EXECUTION_MODE |
getExecutionMode()
Deprecated.
See
Kernel.EXECUTION_MODE
Return the current execution mode. Before a Kernel executes, this return value will be the execution mode as determined by the setting of the EXECUTION_MODE enumeration. By default, this setting is either GPU if OpenCL is available on the target system, or JTP otherwise. This default setting can be changed by calling setExecutionMode(). After a Kernel executes, the return value will be the mode in which the Kernel actually executed. |
double |
getExecutionTime()
Determine the execution time of the previous Kernel.execute(range) called from the last thread that ran and
executed on the most recently used device.
|
protected int |
getGlobalId()
Determine the globalId of an executing kernel.
|
protected int |
getGlobalId(int _dim) |
protected int |
getGlobalSize()
Determine the value that was passed to
Kernel.execute(int globalSize) method. |
protected int |
getGlobalSize(int _dim) |
protected int |
getGroupId()
Determine the groupId of an executing kernel.
|
protected int |
getGroupId(int _dim) |
int[] |
getKernelCompileWorkGroupSize(Device device)
Retrieves the specified work-group size in the compiled kernel for the specified device or intermediate language for the device.
|
long |
getKernelLocalMemSizeInUse(Device device)
Retrieves the amount of local memory used in the specified device by this kernel instance.
|
int |
getKernelMaxWorkGroupSize(Device device)
Retrieves the maximum work-group size allowed for this kernel when running on the specified device.
|
long |
getKernelMinimumPrivateMemSizeInUsePerWorkItem(Device device)
Retrieves the minimum private memory in use per work item for this kernel instance and
the specified device.
|
int |
getKernelPreferredWorkGroupSizeMultiple(Device device)
Retrieves the preferred work-group multiple in the specified device for this kernel instance.
|
Kernel.KernelState |
getKernelState() |
protected int |
getLocalId()
Determine the local id of an executing kernel.
|
protected int |
getLocalId(int _dim) |
protected int |
getLocalSize()
Determine the size of the group that an executing kernel is a member of.
|
protected int |
getLocalSize(int _dim) |
static java.lang.String |
getMappedMethodName(ClassModel.ConstantPool.MethodReferenceEntry _methodReferenceEntry) |
protected int |
getNumGroups()
Determine the number of groups that will be used to execute a kernel.
|
protected int |
getNumGroups(int _dim) |
protected int |
getPassId()
Determine the passId of an executing kernel.
|
java.util.List<ProfileInfo> |
getProfileInfo()
Get the profiling information from the last successful call to Kernel.execute().
|
java.lang.ref.WeakReference<ProfileReport> |
getProfileReportCurrentThread(Device device)
Retrieves the most recent complete report available for the current thread calling this method for
the current kernel instance and executed on the given device.
|
java.lang.ref.WeakReference<ProfileReport> |
getProfileReportLastThread(Device device)
Retrieves a profile report for the last thread that executed this kernel on the given device.
|
private static <V,T extends java.lang.Throwable> |
getProperty(ValueCache<java.lang.Class<?>,java.util.Map<java.lang.String,V>,T> cache,
ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry,
V defaultValue) |
private static java.lang.String |
getReturnTypeLetter(java.lang.reflect.Method meth) |
Device |
getTargetDevice() |
protected void |
globalBarrier()
Wait for all kernels in the current work group to rendezvous at this call before continuing execution.
It will also enforce memory ordering, such that modifications made by each thread in the work-group, to the memory, before entering into this barrier call will be visible by all threads leaving the barrier. |
boolean |
hasFallbackAlgorithm()
False by default.
|
boolean |
hasNextExecutionMode()
Deprecated.
|
protected double |
hypot(double a,
double b) |
protected float |
hypot(float a,
float b) |
protected double |
IEEEremainder(double _d1,
double _d2)
Delegates to either
Math.IEEEremainder(double, double) (Java) or remainder(double, double) (OpenCL). |
protected float |
IEEEremainder(float _f1,
float _f2)
Delegates to either
Math.IEEEremainder(double, double) (Java) or remainder(float, float) (OpenCL). |
static void |
invalidateCaches() |
boolean |
isAllowDevice(Device _device) |
boolean |
isAutoCleanUpArrays() |
boolean |
isExecuting() |
boolean |
isExplicit()
For dev purposes (we should remove this for production) determine whether this Kernel uses explicit memory management
|
static boolean |
isMappedMethod(ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry) |
static boolean |
isOpenCLDelegateMethod(ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry) |
private static boolean |
isRelevant(java.lang.reflect.Method method) |
boolean |
isRunningCL() |
protected void |
localBarrier()
Wait for all kernels in the current work group to rendezvous at this call before continuing execution.
It will also enforce memory ordering, such that modifications made by each thread in the work-group, to the memory, before entering into this barrier call will be visible by all threads leaving the barrier. |
protected void |
localGlobalBarrier()
Wait for all kernels in the current work group to rendezvous at this call before continuing execution.
It will also enforce memory ordering, such that modifications made by each thread in the work-group, to the memory, before entering into this barrier call will be visible by all threads leaving the barrier. |
protected double |
log(double _d)
Delegates to either
Math.log(double) (Java) or log(double) (OpenCL). |
protected float |
log(float _f)
Delegates to either
Math.log(double) (Java) or log(float) (OpenCL). |
protected double |
log10(double a) |
protected float |
log10(float a) |
protected double |
log1p(double x) |
protected float |
log1p(float x) |
protected double |
log2(double a) |
protected float |
log2(float a) |
protected double |
mad(double a,
double b,
double c) |
protected float |
mad(float a,
float b,
float c) |
private static <A extends java.lang.annotation.Annotation> |
markedWith(java.lang.Class<A> annotationClass) |
protected double |
max(double _d1,
double _d2)
Delegates to either
Math.max(double, double) (Java) or fmax(double, double) (OpenCL). |
protected float |
max(float _f1,
float _f2)
Delegates to either
Math.max(float, float) (Java) or fmax(float, float) (OpenCL). |
protected int |
max(int n1,
int n2)
Delegates to either
Math.max(int, int) (Java) or max(int, int) (OpenCL). |
protected long |
max(long n1,
long n2)
Delegates to either
Math.max(long, long) (Java) or max(long, long) (OpenCL). |
protected double |
min(double _d1,
double _d2)
Delegates to either
Math.min(double, double) (Java) or fmin(double, double) (OpenCL). |
protected float |
min(float _f1,
float _f2)
Delegates to either
Math.min(float, float) (Java) or fmin(float, float) (OpenCL). |
protected int |
min(int n1,
int n2)
Delegates to either
Math.min(int, int) (Java) or min(int, int) (OpenCL). |
protected long |
min(long n1,
long n2)
Delegates to either
Math.min(long, long) (Java) or min(long, long) (OpenCL). |
private float |
native_rsqrt(float _f) |
private float |
native_sqrt(float _f) |
protected double |
nextAfter(double start,
double direction) |
protected float |
nextAfter(float start,
float direction) |
protected int |
popcount(int _i)
Delegates to either
Integer.bitCount(int) (Java) or popcount(int) (OpenCL). |
protected long |
popcount(long _i)
Delegates to either
Long.bitCount(long) (Java) or popcount(long) (OpenCL). |
protected double |
pow(double _d1,
double _d2)
Delegates to either
Math.pow(double, double) (Java) or pow(double, double) (OpenCL). |
protected float |
pow(float _f1,
float _f2)
Delegates to either
Math.pow(double, double) (Java) or pow(float, float) (OpenCL). |
private KernelRunner |
prepareKernelRunner() |
Kernel |
put(boolean[] array)
Tag this array so that it is explicitly enqueued before the kernel is executed
|
Kernel |
put(boolean[][] array)
Tag this array so that it is explicitly enqueued before the kernel is executed
|
Kernel |
put(boolean[][][] array)
Tag this array so that it is explicitly enqueued before the kernel is executed
|
Kernel |
put(byte[] array)
Tag this array so that it is explicitly enqueued before the kernel is executed
|
Kernel |
put(byte[][] array)
Tag this array so that it is explicitly enqueued before the kernel is executed
|
Kernel |
put(byte[][][] array)
Tag this array so that it is explicitly enqueued before the kernel is executed
|
Kernel |
put(char[] array)
Tag this array so that it is explicitly enqueued before the kernel is executed
|
Kernel |
put(char[][] array)
Tag this array so that it is explicitly enqueued before the kernel is executed
|
Kernel |
put(char[][][] array)
Tag this array so that it is explicitly enqueued before the kernel is executed
|
Kernel |
put(double[] array)
Tag this array so that it is explicitly enqueued before the kernel is executed
|
Kernel |
put(double[][] array)
Tag this array so that it is explicitly enqueued before the kernel is executed
|
Kernel |
put(double[][][] array)
Tag this array so that it is explicitly enqueued before the kernel is executed
|
Kernel |
put(float[] array)
Tag this array so that it is explicitly enqueued before the kernel is executed
|
Kernel |
put(float[][] array)
Tag this array so that it is explicitly enqueued before the kernel is executed
|
Kernel |
put(float[][][] array)
Tag this array so that it is explicitly enqueued before the kernel is executed
|
Kernel |
put(int[] array)
Tag this array so that it is explicitly enqueued before the kernel is executed
|
Kernel |
put(int[][] array)
Tag this array so that it is explicitly enqueued before the kernel is executed
|
Kernel |
put(int[][][] array)
Tag this array so that it is explicitly enqueued before the kernel is executed
|
Kernel |
put(long[] array)
Tag this array so that it is explicitly enqueued before the kernel is executed
|
Kernel |
put(long[][] array)
Tag this array so that it is explicitly enqueued before the kernel is executed
|
Kernel |
put(long[][][] array)
Tag this array so that it is explicitly enqueued before the kernel is executed
|
void |
registerProfileReportObserver(IProfileReportObserver observer)
Registers a new profile report observer to receive profile reports as they're produced.
|
protected double |
rint(double _d)
Delegates to either
Math.rint(double) (Java) or rint(double) (OpenCL). |
protected float |
rint(float _f)
Delegates to either
Math.rint(double) (Java) or rint(float) (OpenCL). |
protected long |
round(double _d)
Delegates to either
Math.round(double) (Java) or round(double) (OpenCL). |
protected int |
round(float _f)
Delegates to either
Math.round(float) (Java) or round(float) (OpenCL). |
protected double |
rsqrt(double _d)
Computes inverse square root using
Math.sqrt(double) (Java) or delegates to rsqrt(double) (OpenCL). |
protected float |
rsqrt(float _f)
Computes inverse square root using
Math.sqrt(double) (Java) or delegates to rsqrt(float) (OpenCL). |
abstract void |
run()
The entry point of a kernel.
|
void |
setAutoCleanUpArrays(boolean autoCleanUpArrays)
Property which if true enables automatic calling of
cleanUpArrays() following each execution. |
void |
setExecutionMode(Kernel.EXECUTION_MODE _executionMode)
Deprecated.
See
Kernel.EXECUTION_MODE
Set the execution mode. This should be regarded as a request. The real mode will be determined at runtime based on the availability of OpenCL and the characteristics of the workload. |
void |
setExecutionModeWithoutFallback(Kernel.EXECUTION_MODE _executionMode) |
void |
setExplicit(boolean _explicit)
For dev purposes (we should remove this for production) allow us to define that this Kernel uses explicit memory management
|
void |
setFallbackExecutionMode()
Deprecated.
|
protected double |
sin(double _d)
Delegates to either
Math.sin(double) (Java) or sin(double) (OpenCL). |
protected float |
sin(float _f)
Delegates to either
Math.sin(double) (Java) or sin(float) (OpenCL). |
protected double |
sinh(double x)
Delegates to either
Math.sinh(double) (Java) or sinh(double) (OpenCL). |
protected float |
sinh(float x)
Delegates to either
Math.sinh(double) (Java) or sinh(float) (OpenCL). |
protected double |
sinpi(double a)
Backed by either
Math.sin(double) (Java) or sinpi(double) (OpenCL). |
protected float |
sinpi(float a)
Backed by either
Math.sin(double) (Java) or sinpi(float) (OpenCL). |
protected double |
sqrt(double _d)
Delegates to either
Math.sqrt(double) (Java) or sqrt(double) (OpenCL). |
protected float |
sqrt(float _f)
Delegates to either
Math.sqrt(double) (Java) or sqrt(float) (OpenCL). |
protected double |
tan(double _d)
Delegates to either
Math.tan(double) (Java) or tan(double) (OpenCL). |
protected float |
tan(float _f)
Delegates to either
Math.tan(double) (Java) or tan(float) (OpenCL). |
protected double |
tanh(double x)
Delegates to either
Math.tanh(double) (Java) or tanh(double) (OpenCL). |
protected float |
tanh(float x)
Delegates to either
Math.tanh(double) (Java) or tanh(float) (OpenCL). |
protected double |
tanpi(double a)
Backed by either
Math.tan(double) (Java) or tanpi(double) (OpenCL). |
protected float |
tanpi(float a)
Backed by either
Math.tan(double) (Java) or tanpi(float) (OpenCL). |
private static java.lang.String |
toClassShortNameIfAny(java.lang.Class<?> retClass) |
protected double |
toDegrees(double _d)
Delegates to either
Math.toDegrees(double) (Java) or degrees(double) (OpenCL). |
protected float |
toDegrees(float _f)
Delegates to either
Math.toDegrees(double) (Java) or degrees(float) (OpenCL). |
protected double |
toRadians(double _d)
Delegates to either
Math.toRadians(double) (Java) or radians(double) (OpenCL). |
protected float |
toRadians(float _f)
Delegates to either
Math.toRadians(double) (Java) or radians(float) (OpenCL). |
private static java.lang.String |
toSignature(ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry) |
(package private) static java.lang.String |
toSignature(java.lang.reflect.Method method) |
java.lang.String |
toString() |
void |
tryNextExecutionMode()
Deprecated.
See
Kernel.EXECUTION_MODE .
Try the next execution path in the list; if there are no more, give up. |
static boolean |
usesAtomic32(ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry) |
static boolean |
usesAtomic64(ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry) |
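Many of the methods summarised above pair a Java implementation with an OpenCL built-in. The sketch below shows plain-Java expressions that match the documented Java side for a handful of them. JavaSideMath is a hypothetical helper, and the bodies are assumptions based on the summaries (and on the OpenCL definitions of rsqrt and sinpi), not a copy of Aparapi's source.

```java
// Hypothetical helper, not part of Aparapi.
final class JavaSideMath {
    private JavaSideMath() {}

    // clz: number of leading zero bits.
    static int clz(int i) {
        return Integer.numberOfLeadingZeros(i);
    }

    // popcount: number of set bits.
    static int popcount(int i) {
        return Integer.bitCount(i);
    }

    // rsqrt: inverse square root computed from Math.sqrt.
    static double rsqrt(double d) {
        return 1.0 / Math.sqrt(d);
    }

    // fma as summarised above: a plain multiply followed by an add (two roundings).
    static double fma(double a, double b, double c) {
        return a * b + c;
    }

    // sinpi: sin(pi * a), evaluated with Math.sin.
    static double sinpi(double a) {
        return Math.sin(Math.PI * a);
    }
}
```

Because the Java expression for fma rounds twice while a fused multiply-add rounds once, results may differ in the last bit depending on where the kernel runs.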
private static java.util.logging.Logger logger
public static final java.lang.String LOCAL_SUFFIX
We can use this suffix to 'tag' intended local buffers. So either name the buffer
int[] buffer_$local$ = new int[1024];
Or use the Annotation form
@Local int[] buffer = new int[1024];
public static final java.lang.String CONSTANT_SUFFIX
We can use this suffix to 'tag' intended constant buffers. So either name the buffer
int[] buffer_$constant$ = new int[1024];
Or use the Annotation form
@Constant int[] buffer = new int[1024];
public static final java.lang.String PRIVATE_SUFFIX
We can use this suffix to 'tag' __private buffers. So either name the buffer
int[] buffer_$private$32 = new int[32];
Or use the Annotation form
@PrivateMemorySpace(32) int[] buffer = new int[32];
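The three naming conventions above can be summarised by a small classifier. This is only a sketch of the convention as it appears in the examples: BufferTagDemo, its Space enum and the regular expression are hypothetical, and Aparapi's own handling of these names may differ.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Aparapi.
final class BufferTagDemo {
    enum Space { GLOBAL, LOCAL, CONSTANT, PRIVATE }

    // Suffix spellings taken from the field-name examples above.
    private static final String LOCAL = "_$local$";
    private static final String CONSTANT = "_$constant$";
    // A private buffer name carries its element count after the suffix.
    private static final Pattern PRIVATE = Pattern.compile(".*_\\$private\\$(\\d+)");

    static Space classify(String fieldName) {
        if (fieldName.endsWith(LOCAL)) {
            return Space.LOCAL;
        }
        if (fieldName.endsWith(CONSTANT)) {
            return Space.CONSTANT;
        }
        if (PRIVATE.matcher(fieldName).matches()) {
            return Space.PRIVATE;
        }
        // Assumption: an untagged buffer is treated as an ordinary (global) buffer.
        return Space.GLOBAL;
    }

    // Returns the size encoded in a private buffer name, or -1 if the name is not tagged private.
    static int privateSize(String fieldName) {
        Matcher m = PRIVATE.matcher(fieldName);
        return m.matches() ? Integer.parseInt(m.group(1)) : -1;
    }
}
```

With this sketch, `classify("buffer_$private$32")` returns `PRIVATE` and `privateSize("buffer_$private$32")` returns 32, matching the size given to @PrivateMemorySpace(32) in the annotation form.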
private KernelRunner kernelRunner
private boolean autoCleanUpArrays
private Kernel.KernelState kernelState
private static final double LOG_2_RECIPROCAL
private static final double PI_RECIPROCAL
private static final java.util.function.IntBinaryOperator minOperator
private static final java.util.function.IntBinaryOperator maxOperator
private static final java.util.function.IntBinaryOperator andOperator
private static final java.util.function.IntBinaryOperator orOperator
private static final java.util.function.IntBinaryOperator xorOperator
static final java.util.Map<java.lang.String,java.lang.String> typeToLetterMap
boolean useNullForLocalSize
@Deprecated private final java.util.LinkedHashSet<Kernel.EXECUTION_MODE> executionModes
See Kernel.EXECUTION_MODE.
@Deprecated private java.util.Iterator<Kernel.EXECUTION_MODE> currentMode
See Kernel.EXECUTION_MODE.
@Deprecated private Kernel.EXECUTION_MODE executionMode
See Kernel.EXECUTION_MODE.
private static final ValueCache<java.lang.Class<?>,java.util.Map<java.lang.String,java.lang.Boolean>,java.lang.RuntimeException> mappedMethodFlags
private static final ValueCache<java.lang.Class<?>,java.util.Map<java.lang.String,java.lang.Boolean>,java.lang.RuntimeException> openCLDelegateMethodFlags
private static final ValueCache<java.lang.Class<?>,java.util.Map<java.lang.String,java.lang.Boolean>,java.lang.RuntimeException> atomic32Cache
private static final ValueCache<java.lang.Class<?>,java.util.Map<java.lang.String,java.lang.Boolean>,java.lang.RuntimeException> atomic64Cache
private static final ValueCache<java.lang.Class<?>,java.util.Map<java.lang.String,java.lang.String>,java.lang.RuntimeException> mappedMethodNamesCache
protected final int getGlobalId()
The kernel implementation uses the globalId to determine which of the executing kernels (in the global domain space) this invocation is expected to deal with.
For example in a SquareKernel
implementation:
class SquareKernel extends Kernel{ private int values[]; private int squares[]; public SquareKernel(int values[]){ this.values = values; squares = new int[values.length]; } public void run() { int gid = getGlobalId(); squares[gid] = values[gid]*values[gid]; } public int[] getSquares(){ return(squares); } }
Each invocation of SquareKernel.run()
retrieves its globalId by calling getGlobalId()
, and then computes the value of squares[gid]
for a given value of values[gid]
.
getLocalId()
,
getGroupId()
,
getGlobalSize()
,
getNumGroups()
,
getLocalSize()
protected final int getGlobalId(int _dim)
protected final int getGroupId()
When Kernel.execute(int globalSize)
is invoked for a particular kernel, the runtime will break the work into various 'groups'.
A kernel can use getGroupId()
to determine which group it is currently dispatched to.
The following code would capture the groupId for each kernel and map it against globalId.
final int[] groupIds = new int[1024]; Kernel kernel = new Kernel(){ public void run() { int gid = getGlobalId(); groupIds[gid] = getGroupId(); } }; kernel.execute(groupIds.length); for (int i=0; i< groupIds.length; i++){ System.out.printf("%4d %4d\n", i, groupIds[i]); }
getLocalId()
,
getGlobalId()
,
getGlobalSize()
,
getNumGroups()
,
getLocalSize()
protected final int getGroupId(int _dim)
protected final int getPassId()
When Kernel.execute(int globalSize, int passes)
is invoked for a particular kernel, the runtime will execute the kernel over the whole range once per pass.
A kernel can use getPassId()
to determine which pass is currently executing. This is ideal for 'reduce'-type phases.
getLocalId()
,
getGlobalId()
,
getGlobalSize()
,
getNumGroups()
,
getLocalSize()
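To make the role of the pass id concrete, here is a minimal plain-Java sketch of a multi-pass tree reduction. It does not use any Aparapi classes: the outer loop stands in for the pass counter that getPassId() would report, and the inner loop stands in for the per-work-item calls to run(). The class name PassIdSketch, the method reduce, and the power-of-two length requirement are assumptions made for this illustration only.

```java
/** Plain-Java stand-in for a multi-pass kernel; no Aparapi classes are used. */
final class PassIdSketch {
    /**
     * Sums data in place with a tree reduction.
     * data.length is assumed to be a power of two.
     */
    static int reduce(int[] data) {
        int n = data.length;
        int passes = Integer.numberOfTrailingZeros(n); // log2(n) passes
        for (int passId = 0; passId < passes; passId++) {   // what getPassId() would return
            int stride = 1 << passId;
            for (int gid = 0; gid < n; gid++) {             // what getGlobalId() would return
                // Only every (2 * stride)-th work item is active in this pass,
                // so no two active items touch the same element.
                if (gid % (2 * stride) == 0) {
                    data[gid] += data[gid + stride];
                }
            }
        }
        return data[0];
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4, 5, 6, 7, 8};
        System.out.println(reduce(data)); // prints 36
    }
}
```

Each pass halves the number of active work items, which is why the pass id (rather than the global id alone) is needed to decide what a given invocation should do.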
protected final int getLocalId()
When Kernel.execute(int globalSize)
is invoked for a particular kernel, the runtime will break the work into
various 'groups'.
getLocalId()
can be used to determine the relative id of the current kernel within a specific group.
The following code would capture the localId for each kernel and map it against globalId.
final int[] localIds = new int[1024]; Kernel kernel = new Kernel(){ public void run() { int gid = getGlobalId(); localIds[gid] = getLocalId(); } }; kernel.execute(localIds.length); for (int i=0; i< localIds.length; i++){ System.out.printf("%4d %4d\n", i, localIds[i]); }
getGroupId()
,
getGlobalId()
,
getGlobalSize()
,
getNumGroups()
,
getLocalSize()
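As a rough illustration of how the global, group and local ids relate, the plain-Java sketch below assumes a one-dimensional range, a zero global offset, and groups of equal size. Under those assumptions the usual OpenCL relationship is globalId = groupId * localSize + localId. The class and method names are hypothetical and no Aparapi classes are involved.

```java
/** Hypothetical helper showing the 1D id arithmetic; not part of Aparapi. */
final class WorkItemIdsSketch {
    /** Group that a work item belongs to, assuming uniform group size. */
    static int groupId(int globalId, int localSize) {
        return globalId / localSize;
    }

    /** Position of a work item inside its group. */
    static int localId(int globalId, int localSize) {
        return globalId % localSize;
    }

    /** Inverse mapping: rebuild the global id from group and local ids. */
    static int globalId(int groupId, int localId, int localSize) {
        return groupId * localSize + localId;
    }

    public static void main(String[] args) {
        int localSize = 256;
        int gid = 700;
        int group = groupId(gid, localSize);
        int local = localId(gid, localSize);
        System.out.printf("%d -> group %d, local %d%n", gid, group, local); // prints "700 -> group 2, local 188"
    }
}
```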
protected final int getLocalId(int _dim)
protected final int getLocalSize()
When Kernel.execute(int globalSize)
is invoked for a particular kernel, the runtime will break the work into
various 'groups'. getLocalSize()
allows a kernel to determine the size of the current group.
Note that groups may not all be the same size. In particular, if (global size)%(# of compute devices)!=0
, the runtime can choose to dispatch kernels to
groups with differing sizes.
getGroupId()
,
getGlobalId()
,
getGlobalSize()
,
getNumGroups()
,
getLocalSize()
protected final int getLocalSize(int _dim)
protected final int getGlobalSize()
Returns the value that was passed to the Kernel.execute(int globalSize)
method, that is, the global size causing the current execution.
getGroupId()
,
getGlobalId()
,
getNumGroups()
,
getLocalSize()
protected final int getGlobalSize(int _dim)
protected final int getNumGroups()
When Kernel.execute(int globalSize)
is invoked, the runtime will split the work into
multiple 'groups'. getNumGroups()
returns the total number of groups that will be used.
getGroupId()
,
getGlobalId()
,
getGlobalSize()
,
getNumGroups()
,
getLocalSize()
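The sketch below shows one way the number of groups can follow from the global and local sizes. It is an illustration only: it assumes a one-dimensional range whose local size divides the global size evenly, and the pickLocalSize heuristic is hypothetical rather than a description of how the runtime actually chooses a group size.

```java
/** Hypothetical group-count arithmetic; not taken from the Aparapi runtime. */
final class NumGroupsSketch {
    /** Largest size not exceeding maxLocalSize that divides globalSize evenly. */
    static int pickLocalSize(int globalSize, int maxLocalSize) {
        for (int s = Math.min(globalSize, maxLocalSize); s > 1; s--) {
            if (globalSize % s == 0) {
                return s;
            }
        }
        return 1;
    }

    /** Number of groups when every group has exactly localSize items. */
    static int numGroups(int globalSize, int localSize) {
        return globalSize / localSize;
    }

    public static void main(String[] args) {
        int local = pickLocalSize(1000, 256);
        System.out.println(local);                  // prints 250
        System.out.println(numGroups(1000, local)); // prints 4
    }
}
```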
protected final int getNumGroups(int _dim)
public abstract void run()
Every kernel must override this method.
public boolean hasFallbackAlgorithm()
executeFallbackAlgorithm(Range, int)
with the alternate
algorithm.public void executeFallbackAlgorithm(Range _range, int _passId)
hasFallbackAlgorithm()
has been overriden to return true, this method should be overriden so as to
apply a single pass of the kernel's logic to the entire _range.
This is not normally required, as fallback to JavaDevice.THREAD_POOL
will implement the algorithm in parallel. However
in the event that thread pool execution may be prohibitively slow, this method might implement a "quick and dirty" approximation
to the desired result (for example, a simple box-blur as opposed to a gaussian blur in an image processing application).
public void cancelMultiPass()
Note that in the case of thread-pool/pure java execution we could do better already, using Thread.interrupt() (and/or other means) to abandon execution mid-pass. However at present this is not attempted.
public int getCancelState()
public int getCurrentPass()
KernelRunner.getCurrentPass()
public boolean isExecuting()
KernelRunner.isExecuting()
public Kernel clone()
If you choose to override clone()
you are responsible for delegating to super.clone();
clone
in class java.lang.Object
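The delegation requirement can be sketched in plain Java. ScaleTask below is a hypothetical stand-in for a Kernel subclass and only mimics the Cloneable contract; copying the buffer after the call to super.clone() is a design choice of this sketch, not something the source prescribes.

```java
/** Illustrates delegating to super.clone() before adjusting the copy. */
final class CloneSketch {
    /** Hypothetical stand-in for a Kernel subclass. */
    static class ScaleTask implements Cloneable {
        int[] buffer = new int[4];

        @Override
        public ScaleTask clone() {
            try {
                // Delegate first so that the superclass state is copied correctly.
                ScaleTask copy = (ScaleTask) super.clone();
                // Then give the copy its own array instead of sharing the original's.
                copy.buffer = buffer.clone();
                return copy;
            } catch (CloneNotSupportedException e) {
                // Cannot happen here because ScaleTask implements Cloneable.
                throw new AssertionError(e);
            }
        }
    }

    public static void main(String[] args) {
        ScaleTask original = new ScaleTask();
        ScaleTask copy = original.clone();
        copy.buffer[0] = 7;
        System.out.println(original.buffer[0]); // prints 0
    }
}
```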
protected float acos(float a)
Delegates to either Math.acos(double) (Java) or acos(float) (OpenCL).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
a - value to delegate to Math.acos(double)/acos(float)
Math.acos(double) cast to float/acos(float)
Math.acos(double)
,
acos(float)
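The Java-side behaviour described above (compute in double, then cast to float) can be illustrated with a few lines of plain Java. This is a sketch of the pattern only, with a hypothetical class and method name; it is not the library's source, and it says nothing about what a particular OpenCL device returns.

```java
/** Shows the "Math.acos(double) cast to float" pattern for float overloads. */
final class FloatDelegateSketch {
    /** Hypothetical float wrapper: widen, compute in double, narrow the result. */
    static float acosJava(float a) {
        return (float) Math.acos(a);
    }

    public static void main(String[] args) {
        double wide = Math.acos(0.5);   // full double precision
        float narrow = acosJava(0.5f);  // same value rounded to float
        System.out.println(wide);
        System.out.println(narrow);
        // The float result carries roughly 7 significant decimal digits,
        // so the two values differ by a small rounding error.
        System.out.println(Math.abs(wide - narrow));
    }
}
```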
protected double acos(double a)
Delegates to either Math.acos(double) (Java) or acos(double) (OpenCL).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
a - value to delegate to Math.acos(double)/acos(double)
Math.acos(double)/acos(double)
Math.acos(double)
,
acos(double)
protected float asin(float _f)
Delegates to either Math.asin(double) (Java) or asin(float) (OpenCL).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
_f - value to delegate to Math.asin(double)/asin(float)
Math.asin(double) cast to float/asin(float)
Math.asin(double)
,
asin(float)
protected double asin(double _d)
Delegates to either Math.asin(double) (Java) or asin(double) (OpenCL).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
_d - value to delegate to Math.asin(double)/asin(double)
Math.asin(double)/asin(double)
Math.asin(double)
,
asin(double)
protected float atan(float _f)
Delegates to either Math.atan(double) (Java) or atan(float) (OpenCL).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
_f - value to delegate to Math.atan(double)/atan(float)
Math.atan(double) cast to float/atan(float)
Math.atan(double)
,
atan(float)
protected double atan(double _d)
Delegates to either Math.atan(double) (Java) or atan(double) (OpenCL).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
_d - value to delegate to Math.atan(double)/atan(double)
Math.atan(double)/atan(double)
Math.atan(double)
,
atan(double)
protected float atan2(float _f1, float _f2)
Delegates to either Math.atan2(double, double) (Java) or atan2(float, float) (OpenCL).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
_f1 - value to delegate to first argument of Math.atan2(double, double)/atan2(float, float)
_f2 - value to delegate to second argument of Math.atan2(double, double)/atan2(float, float)
Math.atan2(double, double) cast to float/atan2(float, float)
Math.atan2(double, double)
,
atan2(float, float)
protected double atan2(double _d1, double _d2)
Delegates to either Math.atan2(double, double) (Java) or atan2(double, double) (OpenCL).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
_d1 - value to delegate to first argument of Math.atan2(double, double)/atan2(double, double)
_d2 - value to delegate to second argument of Math.atan2(double, double)/atan2(double, double)
Math.atan2(double, double)/atan2(double, double)
Math.atan2(double, double)
,
atan2(double, double)
protected float ceil(float _f)
Delegates to either Math.ceil(double) (Java) or ceil(float) (OpenCL).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
_f - value to delegate to Math.ceil(double)/ceil(float)
Math.ceil(double) cast to float/ceil(float)
Math.ceil(double)
,
ceil(float)
protected double ceil(double _d)
Delegates to either Math.ceil(double) (Java) or ceil(double) (OpenCL).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
_d - value to delegate to Math.ceil(double)/ceil(double)
Math.ceil(double)/ceil(double)
Math.ceil(double)
,
ceil(double)
protected float cos(float _f)
Delegates to either Math.cos(double) (Java) or cos(float) (OpenCL).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
_f - value to delegate to Math.cos(double)/cos(float)
Math.cos(double) cast to float/cos(float)
Math.cos(double)
,
cos(float)
protected double cos(double _d)
Delegates to either Math.cos(double) (Java) or cos(double) (OpenCL).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
_d - value to delegate to Math.cos(double)/cos(double)
Math.cos(double)/cos(double)
Math.cos(double)
,
cos(double)
protected float exp(float _f)
Delegates to either Math.exp(double) (Java) or exp(float) (OpenCL).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
_f - value to delegate to Math.exp(double)/exp(float)
Math.exp(double) cast to float/exp(float)
Math.exp(double)
,
exp(float)
protected double exp(double _d)
Delegates to either Math.exp(double) (Java) or exp(double) (OpenCL).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
_d - value to delegate to Math.exp(double)/exp(double)
Math.exp(double)/exp(double)
Math.exp(double)
,
exp(double)
protected float abs(float _f)
Delegates to either Math.abs(float) (Java) or fabs(float) (OpenCL).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
_f - value to delegate to Math.abs(float)/fabs(float)
Math.abs(float)/fabs(float)
Math.abs(float)
,
fabs(float)
protected int popcount(int _i)
Delegates to either Integer.bitCount(int) (Java) or popcount(int) (OpenCL).
_i - value to delegate to Integer.bitCount(int)/popcount(int)
Integer.bitCount(int)/popcount(int)
Integer.bitCount(int)
,
popcount(int)
protected long popcount(long _i)
Delegates to either Long.bitCount(long) (Java) or popcount(long) (OpenCL).
_i - value to delegate to Long.bitCount(long)/popcount(long)
Long.bitCount(long)/popcount(long)
Long.bitCount(long)
,
popcount(long)
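For reference, the Java-side methods named for the bit-counting operations are standard JDK calls and behave as follows. The sketch also shows Integer.numberOfLeadingZeros and Long.numberOfLeadingZeros, which this page lists as the Java counterparts of clz. The class name is made up for the example.

```java
/** Demonstrates the JDK methods that the bit-counting operations map to on the Java side. */
final class BitCountSketch {
    public static void main(String[] args) {
        // Population count: number of one-bits in the two's complement representation.
        System.out.println(Integer.bitCount(0b1011)); // prints 3
        System.out.println(Long.bitCount(-1L));       // prints 64

        // Leading-zero count: zero bits above the highest one-bit.
        System.out.println(Integer.numberOfLeadingZeros(1)); // prints 31
        System.out.println(Long.numberOfLeadingZeros(1L));   // prints 63
        System.out.println(Integer.numberOfLeadingZeros(0)); // prints 32
    }
}
```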
protected int clz(int _i)
Delegates to either Integer.numberOfLeadingZeros(int) (Java) or clz(int) (OpenCL).
protected long clz(long _l)
Delegates to either Long.numberOfLeadingZeros(long) (Java) or clz(long) (OpenCL).
protected double abs(double _d)
Delegates to either Math.abs(double) (Java) or fabs(double) (OpenCL).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
_d - value to delegate to Math.abs(double)/fabs(double)
Math.abs(double)/fabs(double)
Math.abs(double)
,
fabs(double)
protected int abs(int n)
Delegates to either Math.abs(int) (Java) or abs(int) (OpenCL).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
protected long abs(long n)
Delegates to either Math.abs(long) (Java) or abs(long) (OpenCL).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
protected float floor(float _f)
Delegates to either Math.floor(double) (Java) or floor(float) (OpenCL).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
_f - value to delegate to Math.floor(double)/floor(float)
Math.floor(double) cast to float/floor(float)
Math.floor(double)
,
floor(float)
protected double floor(double _d)
Delegates to either Math.floor(double) (Java) or floor(double) (OpenCL).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
_d - value to delegate to Math.floor(double)/floor(double)
Math.floor(double)/floor(double)
Math.floor(double)
,
floor(double)
protected float max(float _f1, float _f2)
Delegates to either Math.max(float, float) (Java) or fmax(float, float) (OpenCL).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
_f1 - value to delegate to first argument of Math.max(float, float)/fmax(float, float)
_f2 - value to delegate to second argument of Math.max(float, float)/fmax(float, float)
Math.max(float, float)/fmax(float, float)
Math.max(float, float)
,
fmax(float, float)
protected double max(double _d1, double _d2)
Delegates to either Math.max(double, double) (Java) or fmax(double, double) (OpenCL).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
_d1 - value to delegate to first argument of Math.max(double, double)/fmax(double, double)
_d2 - value to delegate to second argument of Math.max(double, double)/fmax(double, double)
Math.max(double, double)/fmax(double, double)
Math.max(double, double)
,
fmax(double, double)
protected int max(int n1, int n2)
Delegates to either Math.max(int, int) (Java) or max(int, int) (OpenCL).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
n1 - value to delegate to first argument of Math.max(int, int)/max(int, int)
n2 - value to delegate to second argument of Math.max(int, int)/max(int, int)
Math.max(int, int)/max(int, int)
Math.max(int, int)
,
max(int, int)
protected long max(long n1, long n2)
Delegates to either Math.max(long, long) (Java) or max(long, long) (OpenCL).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
n1 - value to delegate to first argument of Math.max(long, long)/max(long, long)
n2 - value to delegate to second argument of Math.max(long, long)/max(long, long)
Math.max(long, long)/max(long, long)
Math.max(long, long)
,
max(long, long)
protected float min(float _f1, float _f2)
Delegates to either Math.min(float, float) (Java) or fmin(float, float) (OpenCL).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
_f1 - value to delegate to first argument of Math.min(float, float)/fmin(float, float)
_f2 - value to delegate to second argument of Math.min(float, float)/fmin(float, float)
Math.min(float, float)/fmin(float, float)
Math.min(float, float)
,
fmin(float, float)
protected double min(double _d1, double _d2)
Delegates to either Math.min(double, double) (Java) or fmin(double, double) (OpenCL).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
_d1 - value to delegate to first argument of Math.min(double, double)/fmin(double, double)
_d2 - value to delegate to second argument of Math.min(double, double)/fmin(double, double)
Math.min(double, double)/fmin(double, double)
Math.min(double, double)
,
fmin(double, double)
protected int min(int n1, int n2)
Delegates to either Math.min(int, int) (Java) or min(int, int) (OpenCL).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
n1 - value to delegate to first argument of Math.min(int, int)/min(int, int)
n2 - value to delegate to second argument of Math.min(int, int)/min(int, int)
Math.min(int, int)/min(int, int)
Math.min(int, int)
,
min(int, int)
protected long min(long n1, long n2)
Delegates to either Math.min(long, long) (Java) or min(long, long) (OpenCL).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
n1 - value to delegate to first argument of Math.min(long, long)/min(long, long)
n2 - value to delegate to second argument of Math.min(long, long)/min(long, long)
Math.min(long, long)/min(long, long)
Math.min(long, long)
,
min(long, long)
protected float log(float _f)
Delegates to either Math.log(double) (Java) or log(float) (OpenCL).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
_f - value to delegate to Math.log(double)/log(float)
Math.log(double) cast to float/log(float)
Math.log(double)
,
log(float)
protected double log(double _d)
Delegates to either Math.log(double) (Java) or log(double) (OpenCL).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
_d - value to delegate to Math.log(double)/log(double)
Math.log(double)/log(double)
Math.log(double)
,
log(double)
protected float pow(float _f1, float _f2)
Delegates to either Math.pow(double, double) (Java) or pow(float, float) (OpenCL).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
_f1 - value to delegate to first argument of Math.pow(double, double)/pow(float, float)
_f2 - value to delegate to second argument of Math.pow(double, double)/pow(float, float)
Math.pow(double, double) cast to float/pow(float, float)
Math.pow(double, double)
,
pow(float, float)
protected double pow(double _d1, double _d2)
Delegates to either Math.pow(double, double) (Java) or pow(double, double) (OpenCL).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
_d1 - value to delegate to first argument of Math.pow(double, double)/pow(double, double)
_d2 - value to delegate to second argument of Math.pow(double, double)/pow(double, double)
Math.pow(double, double)/pow(double, double)
Math.pow(double, double)
,
pow(double, double)
protected float IEEEremainder(float _f1, float _f2)
Delegates to either Math.IEEEremainder(double, double) (Java) or remainder(float, float) (OpenCL).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
_f1 - value to delegate to first argument of Math.IEEEremainder(double, double)/remainder(float, float)
_f2 - value to delegate to second argument of Math.IEEEremainder(double, double)/remainder(float, float)
Math.IEEEremainder(double, double) cast to float/remainder(float, float)
Math.IEEEremainder(double, double)
,
remainder(float, float)
protected double IEEEremainder(double _d1, double _d2)
Delegates to either Math.IEEEremainder(double, double) (Java) or remainder(double, double) (OpenCL).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
_d1 - value to delegate to first argument of Math.IEEEremainder(double, double)/remainder(double, double)
_d2 - value to delegate to second argument of Math.IEEEremainder(double, double)/remainder(double, double)
Math.IEEEremainder(double, double)/remainder(double, double)
Math.IEEEremainder(double, double)
,
remainder(double, double)
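Because the IEEE remainder is easy to confuse with Java's % operator, a short plain-Java comparison may help. Math.IEEEremainder rounds the quotient to the nearest integer (ties go to the even integer), whereas % truncates toward zero. The class name below is invented for the example.

```java
/** Contrasts Math.IEEEremainder with the % operator. */
final class RemainderSketch {
    public static void main(String[] args) {
        // 5 / 3 is about 1.67, which rounds to 2, so the result is 5 - 2 * 3.
        System.out.println(Math.IEEEremainder(5.0, 3.0)); // prints -1.0
        // % truncates the quotient to 1, so the result is 5 - 1 * 3.
        System.out.println(5.0 % 3.0);                    // prints 2.0
        // 7 / 2 is exactly 3.5; the tie goes to the even quotient 4.
        System.out.println(Math.IEEEremainder(7.0, 2.0)); // prints -1.0
    }
}
```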
protected float toRadians(float _f)
Delegates to either Math.toRadians(double) (Java) or radians(float) (OpenCL).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
_f - value to delegate to Math.toRadians(double)/radians(float)
Math.toRadians(double) cast to float/radians(float)
Math.toRadians(double)
,
radians(float)
protected double toRadians(double _d)
Delegates to either Math.toRadians(double) (Java) or radians(double) (OpenCL).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
_d - value to delegate to Math.toRadians(double)/radians(double)
Math.toRadians(double)/radians(double)
Math.toRadians(double)
,
radians(double)
protected float toDegrees(float _f)
Delegates to either Math.toDegrees(double) (Java) or degrees(float) (OpenCL).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
_f - value to delegate to Math.toDegrees(double)/degrees(float)
Math.toDegrees(double) cast to float/degrees(float)
Math.toDegrees(double)
,
degrees(float)
protected double toDegrees(double _d)
Delegates to either Math.toDegrees(double) (Java) or degrees(double) (OpenCL).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
_d - value to delegate to Math.toDegrees(double)/degrees(double)
Math.toDegrees(double)/degrees(double)
Math.toDegrees(double)
,
degrees(double)
protected float rint(float _f)
Delegates to either Math.rint(double) (Java) or rint(float) (OpenCL).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
_f - value to delegate to Math.rint(double)/rint(float)
Math.rint(double) cast to float/rint(float)
Math.rint(double)
,
rint(float)
protected double rint(double _d)
Delegates to either Math.rint(double) (Java) or rint(double) (OpenCL).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
_d - value to delegate to Math.rint(double)/rint(double)
Math.rint(double)/rint(double)
Math.rint(double)
,
rint(double)
protected int round(float _f)
Delegates to either Math.round(float) (Java) or round(float) (OpenCL).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
_f - value to delegate to Math.round(float)/round(float)
Math.round(float)/round(float)
Math.round(float)
,
round(float)
protected long round(double _d)
Delegates to either Math.round(double) (Java) or round(double) (OpenCL).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
_d - value to delegate to Math.round(double)/round(double)
Math.round(double)/round(double)
Math.round(double)
,
round(double)
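The two Java rounding methods named above treat halfway cases differently, which is worth keeping in mind when comparing results across execution paths. The plain-Java sketch below shows only the Java side. As far as the OpenCL C specification describes it, round sends halfway cases away from zero, so a value such as -2.5 may come out differently on a device; treat that as something to verify on your own target rather than a guarantee. The class name is hypothetical.

```java
/** Shows how Math.rint and Math.round handle halfway cases in Java. */
final class RoundingSketch {
    public static void main(String[] args) {
        // rint: nearest integer as a double, ties go to the even value.
        System.out.println(Math.rint(2.5));   // prints 2.0
        System.out.println(Math.rint(3.5));   // prints 4.0

        // round: nearest long, ties go toward positive infinity.
        System.out.println(Math.round(2.5));  // prints 3
        System.out.println(Math.round(-2.5)); // prints -2

        // The float overload returns an int.
        System.out.println(Math.round(2.5f)); // prints 3
    }
}
```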
protected float sin(float _f)
Delegates to either Math.sin(double) (Java) or sin(float) (OpenCL).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
_f - value to delegate to Math.sin(double)/sin(float)
Math.sin(double) cast to float/sin(float)
Math.sin(double)
,
sin(float)
protected double sin(double _d)
Delegates to either Math.sin(double) (Java) or sin(double) (OpenCL).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
_d - value to delegate to Math.sin(double)/sin(double)
Math.sin(double)/sin(double)
Math.sin(double)
,
sin(double)
protected float sqrt(float _f)
Delegates to either Math.sqrt(double) (Java) or sqrt(float) (OpenCL).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
_f - value to delegate to Math.sqrt(double)/sqrt(float)
Math.sqrt(double) cast to float/sqrt(float)
Math.sqrt(double)
,
sqrt(float)
protected double sqrt(double _d)
Delegates to either Math.sqrt(double) (Java) or sqrt(double) (OpenCL).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
_d - value to delegate to Math.sqrt(double)/sqrt(double)
Math.sqrt(double)/sqrt(double)
Math.sqrt(double)
,
sqrt(double)
protected float tan(float _f)
Delegates to either Math.tan(double) (Java) or tan(float) (OpenCL).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
_f - value to delegate to Math.tan(double)/tan(float)
Math.tan(double) cast to float/tan(float)
Math.tan(double)
,
tan(float)
protected double tan(double _d)
Delegates to either Math.tan(double) (Java) or tan(double) (OpenCL).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
_d - value to delegate to Math.tan(double)/tan(double)
Math.tan(double)/tan(double)
Math.tan(double)
,
tan(double)
protected final double acospi(double a)
protected final float acospi(float a)
protected final double asinpi(double a)
protected final float asinpi(float a)
protected final double atanpi(double a)
protected final float atanpi(float a)
protected final double atan2pi(double y, double x)
protected final float atan2pi(float y, double x)
protected final double cbrt(double a)
protected final float cbrt(float a)
protected final double cosh(double x)
protected final float cosh(float x)
protected final double cospi(double a)
protected final float cospi(float a)
protected final double exp2(double a)
protected final float exp2(float a)
protected final double exp10(double a)
protected final float exp10(float a)
protected final double expm1(double x)
protected final float expm1(float x)
protected final double log2(double a)
protected final float log2(float a)
protected final double log10(double a)
protected final float log10(float a)
protected final double log1p(double x)
protected final float log1p(float x)
protected final double mad(double a, double b, double c)
protected final float mad(float a, float b, float c)
protected float fma(float a, float b, float c)
fma(float, float, float)
(OpenCL).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
a - value to delegate to first argument of fma(float, float, float)
b - value to delegate to second argument of fma(float, float, float)
c - value to delegate to third argument of fma(float, float, float)
fma(float, float, float)
fma(float, float, float)
protected double fma(double a, double b, double c)
fma(double, double, double)
(OpenCL).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
a - value to delegate to first argument of fma(double, double, double)
b - value to delegate to second argument of fma(double, double, double)
c - value to delegate to third argument of fma(double, double, double)
fma(double, double, double)
fma(double, double, double)
protected final double nextAfter(double start, double direction)
protected final float nextAfter(float start, float direction)
protected final double sinh(double x)
Delegates to either Math.sinh(double) (Java) or sinh(double) (OpenCL).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
x - value to delegate to Math.sinh(double)/sinh(double)
Math.sinh(double)/sinh(double)
Math.sinh(double)
,
sinh(double)
protected final float sinh(float x)
Delegates to either Math.sinh(double) (Java) or sinh(float) (OpenCL).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
x - value to delegate to Math.sinh(double)/sinh(float)
Math.sinh(double)/sinh(float)
Math.sinh(double)
,
sinh(float)
protected final double sinpi(double a)
Backed by either Math.sin(double) (Java) or sinpi(double) (OpenCL).
This method is equivalent to Math.sin(a * Math.PI).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
a - value to delegate to sinpi(double) or the Java equivalent
sinpi(double) or the Java equivalent
Math.sin(double)
,
sinpi(double)
protected final float sinpi(float a)
Backed by either Math.sin(double) (Java) or sinpi(float) (OpenCL).
This method is equivalent to Math.sin(a * Math.PI).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
a - value to delegate to sinpi(float) or the Java equivalent
sinpi(float) or the Java equivalent
Math.sin(double)
,
sinpi(float)
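The stated equivalence with Math.sin(a * Math.PI) can be tried directly in plain Java. The sketch below is only an illustration of that formula, with a hypothetical class name; it also shows why precision deserves attention, since multiplying by the double approximation of pi means the result at a = 1.0 is very small but not exactly zero.

```java
/** Evaluates the Java-side formula given for sinpi. */
final class SinPiSketch {
    /** Hypothetical helper mirroring Math.sin(a * Math.PI). */
    static double sinpi(double a) {
        return Math.sin(a * Math.PI);
    }

    public static void main(String[] args) {
        System.out.println(sinpi(0.5)); // sine of pi/2
        System.out.println(sinpi(1.0)); // tiny non-zero value, because Math.PI is not exactly pi
    }
}
```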
protected final double tanh(double x)
Delegates to either Math.tanh(double) (Java) or tanh(double) (OpenCL).
Users should note the differences in precision between Java's and OpenCL's implementations of arithmetic functions to determine whether the difference in precision is acceptable.
x - value to delegate to Math.tanh(double)/tanh(double)
Math.tanh(double)/tanh(double)
Math.tanh(double)
,
tanh(double)
protected final float tanh(float x)
Delegates to either Math.tanh(double) (Java) or tanh(float) (OpenCL).
Users should note the differences in precision between the Java and OpenCL implementations of arithmetic functions and determine whether the difference is acceptable.
x - value to delegate to Math.tanh(double)/tanh(float)
Returns: Math.tanh(double)/tanh(float)
See also: Math.tanh(double), tanh(float)
protected final double tanpi(double a)
Delegates to either Math.tan(double) (Java) or tanpi(double) (OpenCL).
This method is equivalent to Math.tan(a * Math.PI).
Users should note the differences in precision between the Java and OpenCL implementations of arithmetic functions and determine whether the difference is acceptable.
a - value to delegate to tanpi(double) or Java equivalent
Returns: tanpi(double) or Java equivalent
See also: Math.tan(double), tanpi(double)
protected final float tanpi(float a)
Delegates to either Math.tan(double) (Java) or tanpi(float) (OpenCL).
This method is equivalent to Math.tan(a * Math.PI).
Users should note the differences in precision between the Java and OpenCL implementations of arithmetic functions and determine whether the difference is acceptable.
a - value to delegate to tanpi(float) or Java equivalent
Returns: tanpi(float) or Java equivalent
See also: Math.tan(double), tanpi(float)
protected float rsqrt(float _f)
Computes the inverse square root using Math.sqrt(double) (Java) or delegates to rsqrt(double) (OpenCL).
Users should note the differences in precision between the Java and OpenCL implementations of arithmetic functions and determine whether the difference is acceptable.
_f - value to delegate to Math.sqrt(double)/rsqrt(double)
Returns: ( 1.0f / Math.sqrt(double) cast to float )/rsqrt(double)
See also: Math.sqrt(double), rsqrt(double)
protected double rsqrt(double _d)
Computes the inverse square root using Math.sqrt(double) (Java) or delegates to rsqrt(double) (OpenCL).
Users should note the differences in precision between the Java and OpenCL implementations of arithmetic functions and determine whether the difference is acceptable.
_d - value to delegate to Math.sqrt(double)/rsqrt(double)
Returns: ( 1.0f / Math.sqrt(double) )/rsqrt(double)
See also: Math.sqrt(double), rsqrt(double)
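The return expressions documented for the two rsqrt overloads can be sketched in plain Java as follows. The class name RsqrtSketch is hypothetical; the bodies follow the documented expressions and are not claimed to be the library's exact source.

```java
final class RsqrtSketch {
    // Documented Java path for the double overload: 1 / Math.sqrt(d)
    static double rsqrt(double d) {
        return 1.0 / Math.sqrt(d);
    }

    // Documented Java path for the float overload: 1.0f / Math.sqrt(f), cast to float
    static float rsqrt(float f) {
        return (float) (1.0f / Math.sqrt(f));
    }

    public static void main(String[] args) {
        System.out.println(rsqrt(4.0));   // reciprocal of sqrt(4)
        System.out.println(rsqrt(16.0f)); // reciprocal of sqrt(16)
        System.out.println(rsqrt(0.0));   // division by zero yields Infinity for doubles
    }
}
```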
private float native_sqrt(float _f)
private float native_rsqrt(float _f)
protected int atomicAdd(int[] _arr, int _index, int _delta)
Atomically adds the _delta value to the _index element of array _arr (Java) or delegates to atomic_add(volatile int*, int) (OpenCL).
_arr - array for which an element value needs to be atomically incremented by _delta
_index - index of the _arr array that needs to be atomically incremented by _delta
_delta - value by which the _index element of the _arr array needs to be atomically incremented
Returns: the previous value of the _index element of the _arr array
See also: atomic_add(volatile int*, int)
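The add-and-return-previous behaviour can be illustrated with a plain-Java sketch. This is an assumption-laden stand-in, not Aparapi's implementation: the class AtomicAddSketch and the helper countInParallel are hypothetical, and a synchronized block is used here simply as one way to make the read-modify-write step atomic.

```java
final class AtomicAddSketch {
    // Adds delta to arr[index] as one indivisible step and returns the previous value.
    static int atomicAdd(int[] arr, int index, int delta) {
        synchronized (arr) {
            int previous = arr[index];
            arr[index] = previous + delta;
            return previous;
        }
    }

    // Hypothetical helper: several threads increment one shared counter element.
    static int countInParallel(int threads, int incrementsPerThread) {
        final int[] counter = new int[1];
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    atomicAdd(counter, 0, 1);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join(); // wait so that all increments are visible
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter[0];
    }

    public static void main(String[] args) {
        System.out.println(countInParallel(4, 10_000)); // no increments are lost
    }
}
```

Without the atomic step, concurrent increments of the same element could overwrite each other and the final count would be unpredictable.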
protected final int atomicGet(java.util.concurrent.atomic.AtomicInteger p)
protected final void atomicSet(java.util.concurrent.atomic.AtomicInteger p, int val)
protected final int atomicAdd(java.util.concurrent.atomic.AtomicInteger p, int val)
protected final int atomicSub(java.util.concurrent.atomic.AtomicInteger p, int val)
protected final int atomicXchg(java.util.concurrent.atomic.AtomicInteger p, int newVal)
protected final int atomicInc(java.util.concurrent.atomic.AtomicInteger p)
protected final int atomicDec(java.util.concurrent.atomic.AtomicInteger p)
protected final int atomicCmpXchg(java.util.concurrent.atomic.AtomicInteger p, int expectedVal, int newVal)
protected final int atomicMin(java.util.concurrent.atomic.AtomicInteger p, int val)
protected final int atomicMax(java.util.concurrent.atomic.AtomicInteger p, int val)
protected final int atomicAnd(java.util.concurrent.atomic.AtomicInteger p, int val)
protected final int atomicOr(java.util.concurrent.atomic.AtomicInteger p, int val)
protected final int atomicXor(java.util.concurrent.atomic.AtomicInteger p, int val)
protected final void localBarrier()
protected final void globalBarrier()
protected final void localGlobalBarrier()
protected float hypot(float a, float b)
protected double hypot(double a, double b)
public Kernel.KernelState getKernelState()
private KernelRunner prepareKernelRunner()
public void registerProfileReportObserver(IProfileReportObserver observer)
null
value.
observer - the observer instance that will receive the profile reports
public java.lang.ref.WeakReference<ProfileReport> getProfileReportLastThread(Device device)
ProfileReport.clone()
device - the relevant device where the kernel executed
See also: getProfileReportCurrentThread(Device), registerProfileReportObserver(IProfileReportObserver), getAccumulatedExecutionTimeAllThreads(Device), getExecutionTimeLastThread(), getConversionTimeLastThread()
public java.lang.ref.WeakReference<ProfileReport> getProfileReportCurrentThread(Device device)
ProfileReport.clone()
device - the relevant device where the kernel executed
See also: getProfileReportLastThread(Device), registerProfileReportObserver(IProfileReportObserver), getExecutionTimeCurrentThread(Device), getConversionTimeCurrentThread(Device), getAccumulatedExecutionTimeAllThreads(Device)
public double getExecutionTime()
Using getProfileReportLastThread(Device) or registerProfileReportObserver(IProfileReportObserver) is encouraged instead.
See also: getProfileReportCurrentThread(Device), registerProfileReportObserver(IProfileReportObserver), getAccumulatedExecutionTimeAllThreads(Device), getConversionTime(), getAccumulatedExecutionTime()
public double getConversionTime()
Using getProfileReportLastThread(Device) or registerProfileReportObserver(IProfileReportObserver) is encouraged instead.
See also: getProfileReportCurrentThread(Device), registerProfileReportObserver(IProfileReportObserver), getAccumulatedExecutionTimeAllThreads(Device), getAccumulatedExecutionTime(), getExecutionTime()
public double getAccumulatedExecutionTimeCurrentThread(Device device)
device - the device of interest where the kernel executed
See also: getProfileReportCurrentThread(Device), getProfileReportLastThread(Device), registerProfileReportObserver(IProfileReportObserver), getAccumulatedExecutionTimeAllThreads(Device)
public double getAccumulatedExecutionTimeAllThreads(Device device)
device - the device of interest where the kernel executed
See also: getProfileReportCurrentThread(Device), getProfileReportLastThread(Device), registerProfileReportObserver(IProfileReportObserver), getAccumulatedExecutionTimeCurrentThread(Device)
public double getAccumulatedExecutionTime()
Using getAccumulatedExecutionTimeAllThreads(Device) is encouraged instead.
See also: getAccumulatedExecutionTime(Device), getProfileReport(Device), registerProfileReportObserver(IProfileReportObserver), getExecutionTime(), getConversionTime()
public Kernel execute(Range _range)
Starts execution of _range kernels.
When kernel.execute(_range) is invoked, Aparapi will schedule the execution of _range kernels. If the execution mode is GPU, the kernels will execute as OpenCL code on the GPU device. Otherwise, if the mode is JTP, the kernels will execute as a pool of Java threads on the CPU.
_range - The number of Kernels that we would like to initiate.
public java.lang.String toString()
toString in class java.lang.Object
public Kernel execute(int _range)
Starts execution of _range kernels.
When kernel.execute(_range) is invoked, Aparapi will schedule the execution of _range kernels. If the execution mode is GPU, the kernels will execute as OpenCL code on the GPU device. Otherwise, if the mode is JTP, the kernels will execute as a pool of Java threads on the CPU.
Since the introduction of the Range class, this method is retained for backward compatibility and merely defers to execute(Range.create(_range), 1).
_range - The number of Kernels that we would like to initiate.
protected Range createRange(int _range)
public Kernel execute(Range _range, int _passes)
Starts execution of _passes iterations of _range kernels.
When kernel.execute(_range, _passes) is invoked, Aparapi will schedule the execution of _range kernels. If the execution mode is GPU, the kernels will execute as OpenCL code on the GPU device. Otherwise, if the mode is JTP, the kernels will execute as a pool of Java threads on the CPU.
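To make the relationship between a range and a number of passes concrete, the following plain-Java sketch emulates running a body once per id in the range, repeated for each pass, on a pool of Java threads. It is only an analogy for the JTP description: PassesSketch, Body and the execute method below are hypothetical names, the sketch does not call Aparapi, and it assumes that one pass finishes before the next begins.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

final class PassesSketch {
    /** Hypothetical stand-in for the kernel body; receives the global id and the pass index. */
    interface Body {
        void run(int globalId, int passId);
    }

    // Runs body once per id in [0, range), repeated for `passes` passes.
    static void execute(int range, int passes, Body body) {
        for (int pass = 0; pass < passes; pass++) {
            final int currentPass = pass;
            // The parallel forEach returns only after every id of this pass has run
            IntStream.range(0, range).parallel().forEach(gid -> body.run(gid, currentPass));
        }
    }

    public static void main(String[] args) {
        int[] data = new int[8];
        // Each id touches only its own element, so no synchronization is needed
        execute(data.length, 3, (gid, pass) -> data[gid] += 1);
        System.out.println(Arrays.toString(data)); // each element incremented once per pass
    }
}
```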
_passes - The number of passes to make
public Kernel execute(int _range, int _passes)
Starts execution of _passes iterations over the _range of kernels.
When kernel.execute(_range, _passes) is invoked, Aparapi will schedule the execution of _range kernels. If the execution mode is GPU, the kernels will execute as OpenCL code on the GPU device. Otherwise, if the mode is JTP, the kernels will execute as a pool of Java threads on the CPU.
Since the introduction of the Range class, this method is retained for backward compatibility and merely defers to execute(Range.create(_range), _passes).
_range - The number of Kernels that we would like to initiate.
public Kernel execute(java.lang.String _entrypoint, Range _range)
Starts execution of globalSize kernels for the given entrypoint.
When kernel.execute("entrypoint", globalSize) is invoked, Aparapi will schedule the execution of globalSize kernels. If the execution mode is GPU, the kernels will execute as OpenCL code on the GPU device. Otherwise, if the mode is JTP, the kernels will execute as a pool of Java threads on the CPU.
_entrypoint - the name of the method we wish to use as the entrypoint to the kernel
public Kernel execute(java.lang.String _entrypoint, Range _range, int _passes)
Starts execution of globalSize kernels for the given entrypoint.
When kernel.execute("entrypoint", globalSize) is invoked, Aparapi will schedule the execution of globalSize kernels. If the execution mode is GPU, the kernels will execute as OpenCL code on the GPU device. Otherwise, if the mode is JTP, the kernels will execute as a pool of Java threads on the CPU.
_entrypoint - the name of the method we wish to use as the entrypoint to the kernel
public Kernel compile(Device _device) throws CompileFailedException
_device - the device for which the kernel is to be compiled
CompileFailedException - if compilation failed for some reason
public Kernel compile(java.lang.String _entrypoint, Device _device) throws CompileFailedException
_entrypoint - the name of the method we wish to use as the entrypoint to the kernel
_device - the device for which the kernel is to be compiled
CompileFailedException - if compilation failed for some reason
public long getKernelMinimumPrivateMemSizeInUsePerWorkItem(Device device) throws QueryFailedException
device - the device where the kernel is intended to run
QueryFailedException - if the query couldn't complete
public long getKernelLocalMemSizeInUse(Device device) throws QueryFailedException
device - the device where the kernel is intended to run
QueryFailedException - if the query couldn't complete
public int getKernelPreferredWorkGroupSizeMultiple(Device device) throws QueryFailedException
device - the device where the kernel is intended to run
QueryFailedException - if the query couldn't complete
public int getKernelMaxWorkGroupSize(Device device) throws QueryFailedException
device - the device where the kernel is intended to run
QueryFailedException - if the query couldn't complete
public int[] getKernelCompileWorkGroupSize(Device device) throws QueryFailedException
device - the device where the kernel is intended to run
QueryFailedException - if the query couldn't complete
public boolean isAutoCleanUpArrays()
public void setAutoCleanUpArrays(boolean autoCleanUpArrays)
If true, cleanUpArrays() is called automatically following each execution.
public void cleanUpArrays()
Releases resources by resizing the arrays in KernelArgs to 1 (0 size is prohibited) and invoking kernel execution on a zero size range. Unlike dispose(), this does not prohibit further invocations of this kernel, as sundry resources such as OpenCL queues are not freed by this method.
This allows a "dormant" Kernel to remain in existence without undue strain on GPU resources, which may be strongly preferable to disposing of a Kernel and recreating another one later, as creation/use of a new Kernel (specifically creation of its associated OpenCL context) is expensive.
Note that where the underlying array field is declared final, it cannot be reassigned and is therefore not resized.
public void dispose()
When the execution mode is CPU or GPU, Aparapi stores some OpenCL resources in a data structure associated with the kernel instance. The dispose() method must be called to release these resources.
If execute(int _globalSize) is called after dispose() is called, the results are undefined.
public boolean isRunningCL()
public final Device getTargetDevice()
public boolean isAllowDevice(Device _device)
@Deprecated public Kernel.EXECUTION_MODE getExecutionMode()
Kernel.EXECUTION_MODE
Return the current execution mode. Before a Kernel executes, this return value will be the execution mode as determined by the setting of the EXECUTION_MODE enumeration. By default, this setting is either GPU if OpenCL is available on the target system, or JTP otherwise. This default setting can be changed by calling setExecutionMode().
After a Kernel executes, the return value will be the mode in which the Kernel actually executed.
See also: setExecutionMode(EXECUTION_MODE)
@Deprecated public void setExecutionMode(Kernel.EXECUTION_MODE _executionMode)
Kernel.EXECUTION_MODE
Set the execution mode.
This should be regarded as a request. The real mode will be determined at runtime based on the availability of OpenCL and the characteristics of the workload.
_executionMode - the requested execution mode.
See also: getExecutionMode()
public void setExecutionModeWithoutFallback(Kernel.EXECUTION_MODE _executionMode)
@Deprecated public void setFallbackExecutionMode()
Kernel.EXECUTION_MODE
private static java.lang.String descriptorToReturnTypeLetter(java.lang.String desc)
private static java.lang.String getReturnTypeLetter(java.lang.reflect.Method meth)
private static java.lang.String toClassShortNameIfAny(java.lang.Class<?> retClass)
public static java.lang.String getMappedMethodName(ClassModel.ConstantPool.MethodReferenceEntry _methodReferenceEntry)
public static boolean isMappedMethod(ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry)
public static boolean isOpenCLDelegateMethod(ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry)
public static boolean usesAtomic32(ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry)
public static boolean usesAtomic64(ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry)
public void setExplicit(boolean _explicit)
_explicit - (true if we want explicit memory management)
public boolean isExplicit()
public Kernel put(long[] array)
array -
public Kernel put(long[][] array)
array -
public Kernel put(long[][][] array)
array -
public Kernel put(double[] array)
array -
public Kernel put(double[][] array)
array -
public Kernel put(double[][][] array)
array -
public Kernel put(float[] array)
array -
public Kernel put(float[][] array)
array -
public Kernel put(float[][][] array)
array -
public Kernel put(int[] array)
array -
public Kernel put(int[][] array)
array -
public Kernel put(int[][][] array)
array -
public Kernel put(byte[] array)
array -
public Kernel put(byte[][] array)
array -
public Kernel put(byte[][][] array)
array -
public Kernel put(char[] array)
array -
public Kernel put(char[][] array)
array -
public Kernel put(char[][][] array)
array -
public Kernel put(boolean[] array)
array -
public Kernel put(boolean[][] array)
array -
public Kernel put(boolean[][][] array)
array -
public Kernel get(long[] array)
array -
public Kernel get(long[][] array)
array -
public Kernel get(long[][][] array)
array -
public Kernel get(double[] array)
array -
public Kernel get(double[][] array)
array -
public Kernel get(double[][][] array)
array -
public Kernel get(float[] array)
array -
public Kernel get(float[][] array)
array -
public Kernel get(float[][][] array)
array -
public Kernel get(int[] array)
array -
public Kernel get(int[][] array)
array -
public Kernel get(int[][][] array)
array -
public Kernel get(byte[] array)
array -
public Kernel get(byte[][] array)
array -
public Kernel get(byte[][][] array)
array -
public Kernel get(char[] array)
array -
public Kernel get(char[][] array)
array -
public Kernel get(char[][][] array)
array -
public Kernel get(boolean[] array)
array -
public Kernel get(boolean[][] array)
array -
public Kernel get(boolean[][][] array)
array -
public java.util.List<ProfileInfo> getProfileInfo()
@Deprecated public void addExecutionModes(Kernel.EXECUTION_MODE... platforms)
Kernel.EXECUTION_MODE
.
Sets the possible fallback path for execution modes. For example, setExecutionFallbackPath(GPU,CPU,JTP) will try to use the GPU; if that fails, it will fall back to OpenCL CPU, and finally it will try JTP.
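The fallback idea, trying each mode in order until one works, can be sketched in plain Java. Everything named here is hypothetical and for illustration only: the class FallbackSketch, its Mode enum (a stand-in for Kernel.EXECUTION_MODE) and the canRun predicate are assumptions, and the sketch does not reproduce how Aparapi decides that a mode has failed.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

final class FallbackSketch {
    /** Hypothetical stand-in for the execution modes named in the text. */
    enum Mode { GPU, CPU, JTP }

    // Walks the path in order and returns the first mode the predicate accepts.
    static Optional<Mode> firstUsable(List<Mode> path, Predicate<Mode> canRun) {
        Iterator<Mode> remaining = path.iterator();
        while (remaining.hasNext()) {          // is there another mode left to try?
            Mode candidate = remaining.next(); // move on to the next mode in the path
            if (canRun.test(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();               // no modes left: give up
    }

    public static void main(String[] args) {
        List<Mode> path = Arrays.asList(Mode.GPU, Mode.CPU, Mode.JTP);
        // Pretend the GPU is unavailable, so the next entry in the path is chosen
        System.out.println(firstUsable(path, mode -> mode != Mode.GPU));
    }
}
```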
@Deprecated public boolean hasNextExecutionMode()
Kernel.EXECUTION_MODE.
@Deprecated public void tryNextExecutionMode()
Kernel.EXECUTION_MODE.
Tries the next execution path in the list; if there are no more, gives up.
private static boolean getBoolean(ValueCache<java.lang.Class<?>,java.util.Map<java.lang.String,java.lang.Boolean>,java.lang.RuntimeException> methodNamesCache, ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry)
private static <A extends java.lang.annotation.Annotation> ValueCache<java.lang.Class<?>,java.util.Map<java.lang.String,java.lang.Boolean>,java.lang.RuntimeException> markedWith(java.lang.Class<A> annotationClass)
static java.lang.String toSignature(java.lang.reflect.Method method)
private static java.lang.String getArgumentsLetters(java.lang.reflect.Method method)
private static boolean isRelevant(java.lang.reflect.Method method)
private static <V,T extends java.lang.Throwable> V getProperty(ValueCache<java.lang.Class<?>,java.util.Map<java.lang.String,V>,T> cache, ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry, V defaultValue) throws T extends java.lang.Throwable
T extends java.lang.Throwable
private static java.lang.String toSignature(ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry)
private static <K,V,T extends java.lang.Throwable> ValueCache<java.lang.Class<?>,java.util.Map<K,V>,T> cacheProperty(ValueCache.ThrowingValueComputer<java.lang.Class<?>,java.util.Map<K,V>,T> throwingValueComputer)
public static void invalidateCaches()