Package org.ojalgo.data.batch
Class BatchNode.Builder<T>
java.lang.Object
org.ojalgo.data.batch.BatchNode.Builder<T>
-
Field Summary
Fields
Modifier and Type, Field
private final File myDirectory
private ToIntFunction<T> myDistributor
private ExecutorService myExecutor
private int myFragmentation
private final DataInterpreter<T> myInterpreter
private int myParallelism
private int myQueueCapacity
-
Constructor Summary
Constructors -
Method Summary
Modifier and Type, Method, Description
build()
distributor(ToIntFunction<T> distributor): The default is to distribute randomly.
executor(ExecutorService executor)
fragmentation(int fragmentation): The number of underlying files/shards.
(package private) ToIntFunction<T> getDistributor()
(package private) int getFragmentation(): The total number of files/shards.
(package private) DataInterpreter<T> getInterpreter()
(package private) String getName()
(package private) IntSupplier getParallelism(): Will always be a power of 2.
(package private) ProcessingService getProcessor()
(package private) int getQueueCapacity()
(package private) ShardedFile getShardedFile()
parallelism(int parallelism)
parallelism(IntSupplier parallelism): How many worker threads should process data in parallel?
queue(int capacity): When reading data from or writing data to disk, the data is temporarily queued.
-
Field Details
-
myDirectory
-
myDistributor
-
myExecutor
-
myFragmentation
private int myFragmentation -
myInterpreter
-
myParallelism
private int myParallelism -
myQueueCapacity
private int myQueueCapacity
-
-
Constructor Details
-
Builder
Builder(File directory, DataInterpreter<T> interpreter)
-
-
Method Details
-
build
-
distributor
The default is to distribute randomly. Most likely you want to distribute based on some property of the item/type: extract that property and get its hash code. All items with the same value for that property then end up in the same shard, which you can exploit when processing the data. -
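The property-based distribution described above can be sketched with nothing but the JDK. The `Order` type, the `BY_CUSTOMER` function and the `shardIndex` helper are hypothetical names introduced for illustration. How `BatchNode` actually maps the distributor's `int` to a shard is not stated on this page, so the bit mask below is an assumption.

```java
import java.util.function.ToIntFunction;

final class DistributorSketch {

    /** Hypothetical item type, not part of ojAlgo. */
    static final class Order {
        final String customerId;
        final double amount;

        Order(String customerId, double amount) {
            this.customerId = customerId;
            this.amount = amount;
        }
    }

    /** Distribute on one property: all orders of one customer yield the same int. */
    static final ToIntFunction<Order> BY_CUSTOMER = order -> order.customerId.hashCode();

    /**
     * Assumed mapping from the distributor's value to a shard index.
     * Requires shardCount to be a power of 2; the mask also handles negative hash codes.
     */
    static int shardIndex(ToIntFunction<Order> distributor, Order item, int shardCount) {
        return distributor.applyAsInt(item) & (shardCount - 1);
    }

    public static void main(String[] args) {
        Order[] orders = { new Order("alice", 10.0), new Order("bob", 5.0), new Order("alice", 7.5) };
        for (Order order : orders) {
            System.out.println(order.customerId + " -> shard " + shardIndex(BY_CUSTOMER, order, 8));
        }
    }
}
```

A function of this shape is what would be passed to `distributor(ToIntFunction<T>)`. Because both "alice" orders share a shard, a later per-customer aggregation only has to look at one shard at a time.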
executor
-
fragmentation
The number of underlying files/shards. Increasing the fragmentation (the number of shards) typically reduces memory requirements when processing. The value set here is only an indication of the desired order of magnitude. The exact number of shards actually used is a derived property. -
parallelism
- See Also:
-
parallelism
How many worker threads should process data in parallel? -
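The two `parallelism` overloads take either a fixed `int` or an `IntSupplier`. The sketch below only illustrates the difference between those two argument kinds using the JDK. The helper names `fixed` and `cores` are hypothetical, and whether the builder treats the `int` overload as a constant supplier internally is an assumption.

```java
import java.util.function.IntSupplier;

final class ParallelismSketch {

    /** A fixed thread count expressed as a supplier (assumed equivalent of the int overload). */
    static IntSupplier fixed(int parallelism) {
        return () -> parallelism;
    }

    /** A supplier evaluated each time it is asked; here it follows the machine's core count. */
    static IntSupplier cores() {
        return () -> Runtime.getRuntime().availableProcessors();
    }

    public static void main(String[] args) {
        System.out.println("fixed: " + fixed(4).getAsInt());
        System.out.println("cores: " + cores().getAsInt());
    }
}
```

A supplier defers the decision until the value is actually needed, which is useful when the same configuration code runs on machines with different core counts.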
queue
When reading data from or writing data to disk, the data is temporarily queued. This specifies the maximum total number of items kept in the queues. -
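To illustrate what a capacity limit on such a queue means, here is a minimal JDK-only sketch of a bounded producer/consumer hand-off. It is not ojAlgo's implementation: the class and method names are hypothetical, and it uses a single `ArrayBlockingQueue`, whereas the text above speaks of a total across several queues.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

final class BoundedQueueSketch {

    /** Pushes 'count' items through a queue of the given capacity; returns how many were consumed. */
    static int transfer(int capacity, int count) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(capacity);
        int[] received = { 0 };

        // Consumer: stands in for the thread that writes queued items to disk.
        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < count; i++) {
                    queue.take();
                    received[0]++;
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        try {
            for (int i = 0; i < count; i++) {
                queue.put(i); // blocks while the queue already holds 'capacity' items
            }
            consumer.join(); // join makes the consumer's count visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted during transfer", e);
        }
        return received[0];
    }

    public static void main(String[] args) {
        System.out.println("consumed " + transfer(4, 100) + " items");
    }
}
```

A small capacity bounds memory use at the cost of more blocking on the producing side; a larger one smooths out bursts but keeps more items in memory.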
getDistributor
ToIntFunction<T> getDistributor() -
getFragmentation
int getFragmentation()
The total number of files/shards. Will always be a power of 2 as well as a multiple of getParallelism(). -
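The page does not say how the shard count is derived from the requested fragmentation. One way to satisfy both stated constraints (a power of 2, and a multiple of the parallelism) is sketched below. The method name `deriveShardCount` and the choice to round down are assumptions, not ojAlgo's documented behaviour.

```java
final class FragmentationSketch {

    /**
     * Derives a shard count that is a power of 2 and a multiple of 'parallelism'.
     * Assumes 'parallelism' is itself a power of 2, as the getter's description states.
     */
    static int deriveShardCount(int requested, int parallelism) {
        if (parallelism < 1 || Integer.bitCount(parallelism) != 1) {
            throw new IllegalArgumentException("parallelism must be a power of 2");
        }
        // Round the request down to a power of 2 (assumed rounding direction).
        int rounded = Integer.highestOneBit(Math.max(1, requested));
        // Of two powers of 2, the larger is always a multiple of the smaller.
        return Math.max(rounded, parallelism);
    }

    public static void main(String[] args) {
        System.out.println(deriveShardCount(100, 8)); // 64
        System.out.println(deriveShardCount(3, 8));   // 8
    }
}
```

This matches the earlier remark that the value passed to `fragmentation(int)` only indicates an order of magnitude: a request for 100 shards yields 64 under these assumptions.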
getInterpreter
DataInterpreter<T> getInterpreter() -
getName
String getName() -
getParallelism
IntSupplier getParallelism()
Will always be a power of 2. -
getProcessor
ProcessingService getProcessor() -
getQueueCapacity
int getQueueCapacity() -
getShardedFile
ShardedFile getShardedFile()
-