public class KernelRunner extends KernelRunnerJNI
Executes Kernel implementations. KernelRunner is the real workhorse for Aparapi. Each Kernel instance creates a single KernelRunner to encapsulate state and to help coordinate interactions between the Kernel and its execution logic.
KernelRunner is created lazily as a result of calling Kernel.execute(). At this time the ExecutionMode is consulted to determine the default requested mode. This dictates how the KernelRunner will attempt to execute the Kernel.
See Also: Kernel.execute(int _globalSize)
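The lazy, one-runner-per-kernel relationship described above can be pictured with a minimal sketch. `SketchKernel` and `SketchRunner` are hypothetical stand-ins invented for illustration, not Aparapi classes; the real Kernel and KernelRunner hold far more state and also consult the ExecutionMode.

```java
import java.util.Arrays;

// Hypothetical stand-in for KernelRunner: holds execution state for one kernel.
class SketchRunner {
    private final SketchKernel kernel;

    SketchRunner(SketchKernel kernel) {
        this.kernel = kernel;
    }

    SketchKernel execute(int globalSize) {
        // Sequential execution only; the real runner chooses a mode first.
        for (int id = 0; id < globalSize; id++) {
            kernel.run(id);
        }
        return kernel;
    }
}

// Hypothetical stand-in for Kernel: creates its runner on first execute().
abstract class SketchKernel {
    private SketchRunner runner; // stays null until the first execute() call

    abstract void run(int globalId);

    synchronized SketchKernel execute(int globalSize) {
        if (runner == null) {
            runner = new SketchRunner(this); // created lazily, exactly once
        }
        return runner.execute(globalSize);
    }

    synchronized boolean hasRunner() {
        return runner != null;
    }
}

class LazyRunnerDemo {
    static int[] squares(int n) {
        int[] out = new int[n];
        SketchKernel kernel = new SketchKernel() {
            @Override
            void run(int id) {
                out[id] = id * id;
            }
        };
        kernel.execute(n);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(squares(4)));
    }
}
```

Deferring construction keeps a kernel that is never executed from allocating any runner state.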
Modifier and Type | Class and Description |
---|---|
private static class | KernelRunner.ExecutionSettings |
private static class | KernelRunner.FJSafeBarrier |
private class | KernelRunner.ThreadDiedHandler |
private static interface | KernelRunner.ThreadIdSetter |
Modifier and Type | Field and Description |
---|---|
private int | argc |
private KernelArg[] | args |
static boolean | BINARY_CACHING_DISABLED |
static int | CANCEL_STATUS_FALSE |
static int | CANCEL_STATUS_TRUE |
private java.util.Set<java.lang.String> | capabilitiesSet |
private static java.lang.String | CODE_GEN_ERROR_MARKER |
private Entrypoint | entryPoint |
private boolean | executing |
private boolean | explicit |
private KernelRunner.ThreadDiedHandler | handler |
private java.nio.ByteBuffer | inBufferRemote: A direct ByteBuffer used for asynchronous intercommunication between Java and JNI C code. |
private java.nio.IntBuffer | inBufferRemoteInt |
private boolean | isFallBack |
private long | jniContextHandle |
private Kernel | kernel |
private java.util.Hashtable<Device,java.lang.Boolean> | kernelIsCompiledForDeviceHash |
private java.util.Hashtable<Device,java.lang.Boolean> | kernelNeverExecutedForDeviceHash |
private static java.util.logging.Logger | logger |
private static java.util.concurrent.ForkJoinPool.ForkJoinWorkerThreadFactory | lowPriorityThreadFactory |
private static int | MINIMUM_ARRAY_SIZE |
private static java.util.HashMap<java.lang.Class<? extends Kernel>,java.lang.String> | openCLCache |
private java.nio.ByteBuffer | outBufferRemote: A direct ByteBuffer used for asynchronous intercommunication between Java and JNI C code. |
private java.nio.IntBuffer | outBufferRemoteInt |
static int | PASS_ID_COMPLETED_EXECUTION |
static int | PASS_ID_PREPARING_EXECUTION |
private int | passId |
private java.util.Set<java.lang.Object> | puts |
private static java.util.LinkedHashSet<java.lang.String> | seenBinaryKeys |
private java.util.concurrent.ForkJoinPool | threadPool |
private boolean | usesOopConversion |
Fields inherited from class KernelRunnerJNI: ARG_APARAPI_BUFFER, ARG_ARRAY, ARG_ARRAYLENGTH, ARG_BOOLEAN, ARG_BYTE, ARG_CHAR, ARG_CONSTANT, ARG_DOUBLE, ARG_EXPLICIT, ARG_EXPLICIT_WRITE, ARG_FLOAT, ARG_GLOBAL, ARG_INT, ARG_LOCAL, ARG_LONG, ARG_OBJ_ARRAY_STRUCT, ARG_PRIMITIVE, ARG_READ, ARG_SHORT, ARG_STATIC, ARG_WRITE, JNI_FLAG_USE_ACC, JNI_FLAG_USE_GPU
Constructor and Description |
---|
KernelRunner(Kernel _kernel): Create a KernelRunner for a specific Kernel instance. |
Modifier and Type | Method and Description |
---|---|
boolean | allocateArrayBufferIfFirstTimeOrArrayChanged(KernelArg arg, java.lang.Object newRef, int objArraySize, int totalStructSize, int totalBufferSize): Helper method that manages the memory allocation for storing the kernel argument data, so that the data can be exchanged between the host and the OpenCL device. |
void | cancelMultiPass() |
void | cleanUpArrays() |
private void | clearCancelMultiPass() |
Kernel | compile(java.lang.String _entrypoint, Device device) |
private java.lang.String | describeDevice() |
void | dispose(): Kernel.dispose() delegates to KernelRunner.dispose(), which delegates to disposeJNI() to actually close JNI data structures. |
Kernel | execute(java.lang.String _entrypoint, Range _range, int _passes) |
private Kernel | executeInternalInner(KernelRunner.ExecutionSettings _settings, Device aparapiDevice, boolean compileOnly) |
private Kernel | executeInternalOuter(KernelRunner.ExecutionSettings _settings) |
protected void | executeJava(KernelRunner.ExecutionSettings _settings, Device device): Execute using a Java thread pool, or sequentially, or using an alternative algorithm, usually as a result of failing to compile or execute OpenCL. |
private Kernel | executeOpenCL(Device device, KernelRunner.ExecutionSettings _settings) |
private void | extractAtomicIntegerConversionBuffer(KernelArg arg) |
private void | extractOopConversionBuffer(KernelArg arg) |
private Kernel | fallBackByExecutionMode(KernelRunner.ExecutionSettings _settings) |
private Kernel | fallBackToNextDevice(Device device, KernelRunner.ExecutionSettings _settings, java.lang.Exception _exception) |
private Kernel | fallBackToNextDevice(Device device, KernelRunner.ExecutionSettings _settings, java.lang.Exception _exception, boolean _silently) |
private Kernel | fallBackToNextDevice(Device device, KernelRunner.ExecutionSettings _settings, java.lang.String _reason) |
void | get(java.lang.Object array): Enqueue a request to return this array from the GPU. |
int | getCancelState() |
private ClassModel | getClassModelFromArg(KernelArg arg, java.lang.Class<?> arrayClass): Helper method to retrieve the class model from a kernel argument. |
int | getCurrentPass(): Returns the index of the current pass, or one of two special constants with negative values to indicate special progress states. |
private int | getCurrentPassLocal() |
protected int | getCurrentPassRemote() |
int[] | getKernelCompileWorkGroupSize(Device device) |
long | getKernelLocalMemSizeInUse(Device device) |
int | getKernelMaxWorkGroupSize(Device device) |
long | getKernelMinimumPrivateMemSizeInUsePerWorkItem(Device device) |
int | getKernelPreferredWorkGroupSizeMultiple(Device device) |
private int | getPrimitiveSize(int type) |
java.util.List<ProfileInfo> | getProfileInfo() |
(package private) boolean | has3DImageWritesSupport() |
(package private) boolean | hasByteAddressableStoreSupport() |
(package private) boolean | hasFP16Support() |
(package private) boolean | hasFP64Support() |
(package private) boolean | hasGlobalInt32BaseAtomicsSupport() |
(package private) boolean | hasGlobalInt32ExtendedAtomicsSupport() |
(package private) boolean | hasGLSharingSupport() |
(package private) boolean | hasInt64BaseAtomicsSupport() |
(package private) boolean | hasInt64ExtendedAtomicsSupport() |
(package private) boolean | hasLocalInt32BaseAtomicsSupport() |
(package private) boolean | hasLocalInt32ExtendedAtomicsSupport() |
(package private) boolean | hasSelectFPRoundingModeSupport() |
private boolean | isDeviceCompatible(Device device) |
boolean | isExecuting(): True while any of the execute() methods are in progress. |
boolean | isExplicit() |
private void | maybeReportProfile(KernelRunner.ExecutionSettings _settings) |
private boolean | prepareAtomicIntegerConversionBuffer(KernelArg arg) |
private boolean | prepareOopConversionBuffer(KernelArg arg) |
void | put(java.lang.Object array): Tag this array so that it is explicitly enqueued before the kernel is executed. |
private void | recreateRange(KernelRunner.ExecutionSettings _settings) |
private void | restoreObjects() |
void | setExplicit(boolean _explicit) |
private void | setMultiArrayType(KernelArg arg, java.lang.Class<?> type) |
java.lang.String | toString() |
private boolean | updateKernelArrayRefs() |
Methods inherited from class KernelRunnerJNI: buildProgramJNI, disposeJNI, getExtensionsJNI, getJNI, getKernelCompileWorkGroupSizeJNI, getKernelLocalMemSizeInUseJNI, getKernelMaxWorkGroupSizeJNI, getKernelMinimumPrivateMemSizeInUsePerWorkItemJNI, getKernelPreferredWorkGroupSizeMultipleJNI, getProfileInfoJNI, initJNI, runKernelJNI, setArgsJNI
public static boolean BINARY_CACHING_DISABLED
private static final int MINIMUM_ARRAY_SIZE
public static final int PASS_ID_PREPARING_EXECUTION
See Also: getCurrentPass(), Constant Field Values
public static final int PASS_ID_COMPLETED_EXECUTION
See Also: getCurrentPass(), Constant Field Values
public static final int CANCEL_STATUS_FALSE
public static final int CANCEL_STATUS_TRUE
private static final java.lang.String CODE_GEN_ERROR_MARKER
private static java.util.logging.Logger logger
private long jniContextHandle
private final Kernel kernel
private Entrypoint entryPoint
private int argc
private volatile boolean executing
private volatile int passId
private final java.nio.ByteBuffer inBufferRemote
At present this is a 4 byte buffer to be interpreted as an int[1], used for passing from java to C a single integer interpreted as a cancellation indicator.
private final java.nio.IntBuffer inBufferRemoteInt
private final java.nio.ByteBuffer outBufferRemote
At present this is a 4 byte buffer to be interpreted as an int[1], used for passing from C to Java a single integer interpreted as the current pass id.
private final java.nio.IntBuffer outBufferRemoteInt
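As a rough illustration of the two 4-byte channels described above, the sketch below pairs a direct ByteBuffer with an IntBuffer view over the same memory. `RemoteChannelSketch` is a hypothetical class, the native byte order is an assumption, the constant values 0 and 1 are assumed for illustration, and `nativeSideWritesPass` merely stands in for the JNI C code.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.IntBuffer;

// Hypothetical model of the two 4-byte channels; not the Aparapi implementation.
class RemoteChannelSketch {
    // Assumed values for illustration; the source does not list them.
    static final int CANCEL_STATUS_FALSE = 0;
    static final int CANCEL_STATUS_TRUE = 1;

    // Java -> C: a single int read by native code as a cancellation indicator.
    private final ByteBuffer inBufferRemote =
            ByteBuffer.allocateDirect(4).order(ByteOrder.nativeOrder());
    private final IntBuffer inBufferRemoteInt = inBufferRemote.asIntBuffer();

    // C -> Java: a single int written by native code as the current pass id.
    private final ByteBuffer outBufferRemote =
            ByteBuffer.allocateDirect(4).order(ByteOrder.nativeOrder());
    private final IntBuffer outBufferRemoteInt = outBufferRemote.asIntBuffer();

    void cancelMultiPass() {
        inBufferRemoteInt.put(0, CANCEL_STATUS_TRUE);
    }

    void clearCancelMultiPass() {
        inBufferRemoteInt.put(0, CANCEL_STATUS_FALSE);
    }

    int getCancelState() {
        return inBufferRemoteInt.get(0);
    }

    // Stands in for the native side writing into the shared memory.
    void nativeSideWritesPass(int passId) {
        outBufferRemote.putInt(0, passId);
    }

    int getCurrentPassRemote() {
        return outBufferRemoteInt.get(0);
    }
}
```

The IntBuffer is only a view, so a write through either buffer is visible through the other without copying.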
private boolean isFallBack
private static final java.util.concurrent.ForkJoinPool.ForkJoinWorkerThreadFactory lowPriorityThreadFactory
private final KernelRunner.ThreadDiedHandler handler
private final java.util.concurrent.ForkJoinPool threadPool
private static java.util.HashMap<java.lang.Class<? extends Kernel>,java.lang.String> openCLCache
private static java.util.LinkedHashSet<java.lang.String> seenBinaryKeys
private final java.util.Hashtable<Device,java.lang.Boolean> kernelIsCompiledForDeviceHash
private final java.util.Hashtable<Device,java.lang.Boolean> kernelNeverExecutedForDeviceHash
private java.util.Set<java.lang.String> capabilitiesSet
private KernelArg[] args
private boolean usesOopConversion
private final java.util.Set<java.lang.Object> puts
private boolean explicit
public KernelRunner(Kernel _kernel)
Create a KernelRunner for a specific Kernel instance.
Parameters:
_kernel -
public void cleanUpArrays()
See Kernel.cleanUpArrays().
public void dispose()
Kernel.dispose() delegates to KernelRunner.dispose(), which delegates to disposeJNI() to actually close JNI data structures.
See Also: KernelRunnerJNI.disposeJNI(long)
public long getKernelMinimumPrivateMemSizeInUsePerWorkItem(Device device) throws QueryFailedException
Throws: QueryFailedException
public long getKernelLocalMemSizeInUse(Device device) throws QueryFailedException
Throws: QueryFailedException
public int getKernelPreferredWorkGroupSizeMultiple(Device device) throws QueryFailedException
Throws: QueryFailedException
public int getKernelMaxWorkGroupSize(Device device) throws QueryFailedException
Throws: QueryFailedException
public int[] getKernelCompileWorkGroupSize(Device device) throws QueryFailedException
Throws: QueryFailedException
boolean hasFP64Support()
boolean hasSelectFPRoundingModeSupport()
boolean hasGlobalInt32BaseAtomicsSupport()
boolean hasGlobalInt32ExtendedAtomicsSupport()
boolean hasLocalInt32BaseAtomicsSupport()
boolean hasLocalInt32ExtendedAtomicsSupport()
boolean hasInt64BaseAtomicsSupport()
boolean hasInt64ExtendedAtomicsSupport()
boolean has3DImageWritesSupport()
boolean hasByteAddressableStoreSupport()
boolean hasFP16Support()
boolean hasGLSharingSupport()
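One plausible way to back these predicates is a lookup in a set of OpenCL extension names, which would fit the capabilitiesSet field and the inherited getExtensionsJNI method. The source does not say how the set is populated, so the whitespace-separated input string and the class `CapabilitySketch` are assumptions; the extension names themselves are the standard Khronos identifiers.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: capability checks as lookups in a set of extension names.
class CapabilitySketch {
    private final Set<String> capabilitiesSet;

    // Assumes the device reports its extensions as one whitespace-separated string.
    CapabilitySketch(String extensions) {
        capabilitiesSet = new HashSet<>(Arrays.asList(extensions.trim().split("\\s+")));
    }

    boolean hasFP64Support() {
        return capabilitiesSet.contains("cl_khr_fp64");
    }

    boolean hasFP16Support() {
        return capabilitiesSet.contains("cl_khr_fp16");
    }

    boolean hasGlobalInt32BaseAtomicsSupport() {
        return capabilitiesSet.contains("cl_khr_global_int32_base_atomics");
    }

    boolean hasByteAddressableStoreSupport() {
        return capabilitiesSet.contains("cl_khr_byte_addressable_store");
    }

    boolean hasGLSharingSupport() {
        return capabilitiesSet.contains("cl_khr_gl_sharing");
    }
}
```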
protected void executeJava(KernelRunner.ExecutionSettings _settings, Device device)
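The method summary describes executeJava as running the kernel on a Java thread pool or sequentially. A minimal sketch of those two strategies follows, assuming one task per global id and a low-priority worker factory in the spirit of the lowPriorityThreadFactory field. `JavaFallbackSketch` and its method names are hypothetical; the real method also handles ranges, barriers and passes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;
import java.util.concurrent.ForkJoinWorkerThread;
import java.util.function.IntConsumer;

// Hypothetical sketch of Java fallback execution; not the Aparapi implementation.
class JavaFallbackSketch {
    // Worker threads yield to the rest of the application by running at minimum priority.
    static final ForkJoinPool.ForkJoinWorkerThreadFactory LOW_PRIORITY = pool -> {
        ForkJoinWorkerThread thread =
                ForkJoinPool.defaultForkJoinWorkerThreadFactory.newThread(pool);
        thread.setPriority(Thread.MIN_PRIORITY);
        return thread;
    };

    // Sequential strategy: one loop iteration per global id.
    static void runSequential(int globalSize, IntConsumer body) {
        for (int id = 0; id < globalSize; id++) {
            body.accept(id);
        }
    }

    // Thread pool strategy: one task per global id, joined before returning.
    static void runThreadPool(int globalSize, IntConsumer body) {
        ForkJoinPool pool = new ForkJoinPool(
                Runtime.getRuntime().availableProcessors(), LOW_PRIORITY, null, false);
        try {
            List<ForkJoinTask<?>> tasks = new ArrayList<>();
            for (int id = 0; id < globalSize; id++) {
                final int globalId = id;
                tasks.add(pool.submit(() -> body.accept(globalId)));
            }
            for (ForkJoinTask<?> task : tasks) {
                task.join(); // join also makes the task's writes visible here
            }
        } finally {
            pool.shutdown();
        }
    }
}
```

Both strategies call the same body, so a kernel whose work items are independent produces the same result either way.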
private ClassModel getClassModelFromArg(KernelArg arg, java.lang.Class<?> arrayClass)
Helper method to retrieve the class model from a kernel argument.
Parameters:
arg - the kernel argument
arrayClass - the array Java class for the argument
public boolean allocateArrayBufferIfFirstTimeOrArrayChanged(KernelArg arg, java.lang.Object newRef, int objArraySize, int totalStructSize, int totalBufferSize)
Helper method that manages the memory allocation for storing the kernel argument data, so that the data can be exchanged between the host and the OpenCL device.
Parameters:
arg - the kernel argument
newRef - the actual Java data instance
objArraySize - the number of elements in the Java array
totalStructSize - the size of each target array element
totalBufferSize - the total buffer size including memory alignment
private boolean prepareOopConversionBuffer(KernelArg arg) throws AparapiException
Parameters:
arg -
Throws: AparapiException
private void extractOopConversionBuffer(KernelArg arg) throws AparapiException
Throws: AparapiException
private void restoreObjects() throws AparapiException
Throws: AparapiException
private boolean prepareAtomicIntegerConversionBuffer(KernelArg arg) throws AparapiException
Throws: AparapiException
private void extractAtomicIntegerConversionBuffer(KernelArg arg) throws AparapiException
Throws: AparapiException
private boolean updateKernelArrayRefs() throws AparapiException
Throws: AparapiException
private Kernel executeOpenCL(Device device, KernelRunner.ExecutionSettings _settings) throws AparapiException
Throws: AparapiException
private Kernel fallBackByExecutionMode(KernelRunner.ExecutionSettings _settings)
private void recreateRange(KernelRunner.ExecutionSettings _settings)
private Kernel fallBackToNextDevice(Device device, KernelRunner.ExecutionSettings _settings, java.lang.String _reason)
private Kernel fallBackToNextDevice(Device device, KernelRunner.ExecutionSettings _settings, java.lang.Exception _exception)
private Kernel fallBackToNextDevice(Device device, KernelRunner.ExecutionSettings _settings, java.lang.Exception _exception, boolean _silently)
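The fallBackToNextDevice overloads are undocumented here, but their names suggest a chain in which a failed device hands the work to the next candidate. The sketch below shows that general pattern only. `FallbackSketch`, the string device names and the final `"JAVA"` fallback are all hypothetical and make no claim about the order or conditions Aparapi actually uses.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch of a device fallback chain; not the Aparapi implementation.
class FallbackSketch {
    final List<String> log = new ArrayList<>();

    // Tries each candidate in order and returns the first that succeeds,
    // or "JAVA" when every candidate fails.
    String executeWithFallback(List<String> candidates, Predicate<String> tryExecute) {
        for (String device : candidates) {
            try {
                if (tryExecute.test(device)) {
                    return device;
                }
                log.add(device + " failed: execution returned false");
            } catch (RuntimeException e) {
                // Record the reason, then fall back to the next candidate.
                log.add(device + " failed: " + e.getMessage());
            }
        }
        return "JAVA";
    }

    public static void main(String[] args) {
        FallbackSketch sketch = new FallbackSketch();
        String used = sketch.executeWithFallback(
                Arrays.asList("GPU", "CPU"), device -> device.equals("CPU"));
        System.out.println(used + " " + sketch.log);
    }
}
```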
public Kernel compile(java.lang.String _entrypoint, Device device) throws CompileFailedException
Throws: CompileFailedException
private Kernel executeInternalOuter(KernelRunner.ExecutionSettings _settings)
private Kernel executeInternalInner(KernelRunner.ExecutionSettings _settings, Device aparapiDevice, boolean compileOnly) throws CompileFailedException
Throws: CompileFailedException
public java.lang.String toString()
Overrides: toString in class java.lang.Object
private java.lang.String describeDevice()
private void maybeReportProfile(KernelRunner.ExecutionSettings _settings)
private boolean isDeviceCompatible(Device device)
public int getCancelState()
public void cancelMultiPass()
private void clearCancelMultiPass()
public int getCurrentPass()
Returns the index of the current pass, or one of two special constants with negative values to indicate special progress states: PASS_ID_PREPARING_EXECUTION to indicate that the Kernel has started executing but not reached the initial pass, or PASS_ID_COMPLETED_EXECUTION to indicate that execution is complete (possibly due to early termination via cancelMultiPass()), i.e. the Kernel is idle. PASS_ID_COMPLETED_EXECUTION is also returned before the first execution has been invoked.
This can be used, for instance, to update a visual progress bar.
See Also: execute(String, Range, int)
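The pass-id states described above can be traced with a small model. `MultiPassSketch` is hypothetical, the sentinel values -2 and -1 are assumed for illustration (the text only says they are negative), and the loop runs on a single thread, whereas a real progress bar would poll getCurrentPass() from another thread.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of multi-pass progress reporting; not the Aparapi implementation.
class MultiPassSketch {
    // Assumed sentinel values; the documentation only says they are negative.
    static final int PASS_ID_PREPARING_EXECUTION = -2;
    static final int PASS_ID_COMPLETED_EXECUTION = -1;

    // Idle before the first execution, so the completed sentinel is the initial value.
    private volatile int passId = PASS_ID_COMPLETED_EXECUTION;
    private volatile boolean cancelRequested;

    int getCurrentPass() {
        return passId;
    }

    void cancelMultiPass() {
        cancelRequested = true;
    }

    // Runs up to `passes` passes and records what getCurrentPass() reports at each
    // stage. A negative cancelAfterPass means the run is never cancelled.
    List<Integer> execute(int passes, int cancelAfterPass) {
        List<Integer> observed = new ArrayList<>();
        cancelRequested = false;
        passId = PASS_ID_PREPARING_EXECUTION;
        observed.add(getCurrentPass());
        for (int pass = 0; pass < passes && !cancelRequested; pass++) {
            passId = pass;
            observed.add(getCurrentPass());
            if (pass == cancelAfterPass) {
                cancelMultiPass(); // early termination takes effect before the next pass
            }
        }
        passId = PASS_ID_COMPLETED_EXECUTION;
        observed.add(getCurrentPass());
        return observed;
    }
}
```

A caller driving a progress bar would treat any non-negative value as "pass n of N" and either sentinel as "not currently in a pass".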
public boolean isExecuting()
True while any of the execute() methods are in progress.
protected int getCurrentPassRemote()
private int getCurrentPassLocal()
private int getPrimitiveSize(int type)
private void setMultiArrayType(KernelArg arg, java.lang.Class<?> type) throws AparapiException
Throws: AparapiException
public void get(java.lang.Object array)
Enqueue a request to return this array from the GPU. Kernel.get(type []) calls will delegate to this call.
Parameters:
array - It is assumed that this parameter is indeed an array (of int, float, short etc).
See Also: Kernel.get(int[] arr), Kernel.get(float[] arr), Kernel.get(double[] arr), Kernel.get(long[] arr), Kernel.get(char[] arr), Kernel.get(boolean[] arr)
public java.util.List<ProfileInfo> getProfileInfo()
public void put(java.lang.Object array)
Tag this array so that it is explicitly enqueued before the kernel is executed. Kernel.put(type []) calls will delegate to this call.
Parameters:
array - It is assumed that this parameter is indeed an array (of int, float, short etc).
See Also: Kernel.put(int[] arr), Kernel.put(float[] arr), Kernel.put(double[] arr), Kernel.put(long[] arr), Kernel.put(char[] arr), Kernel.put(boolean[] arr)
public void setExplicit(boolean _explicit)
public boolean isExplicit()
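The interplay of setExplicit, put and the puts field can be sketched as a small bookkeeping class. This is an assumption-laden illustration: `ExplicitTransferSketch`, `needsTransfer` and `kernelExecuted` are hypothetical names, and the rule that implicit mode transfers every array is an assumption about explicit buffer management rather than something stated on this page.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical bookkeeping for explicit transfers; not the Aparapi implementation.
class ExplicitTransferSketch {
    private boolean explicit;
    // Arrays use identity-based equals/hashCode, so this tracks array instances.
    private final Set<Object> puts = new HashSet<>();

    void setExplicit(boolean explicit) {
        this.explicit = explicit;
    }

    boolean isExplicit() {
        return explicit;
    }

    // Tags an array so it is sent to the device before the next execution.
    void put(Object array) {
        if (array == null || !array.getClass().isArray()) {
            throw new IllegalArgumentException("expected an array");
        }
        if (explicit) {
            puts.add(array);
        }
    }

    // Assumed rule: implicit mode transfers everything, explicit mode only tagged arrays.
    boolean needsTransfer(Object array) {
        return !explicit || puts.contains(array);
    }

    // Assumed: tags apply to one execution and are cleared afterwards.
    void kernelExecuted() {
        puts.clear();
    }
}
```

Tracking by instance rather than by content means two arrays with equal elements are tagged independently.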