Class BatchNode.Builder<T>

  • Enclosing class:
    BatchNode<T>

    public static final class BatchNode.Builder<T>
    extends java.lang.Object
    • Field Detail

      • myDirectory

        private final java.io.File myDirectory
      • myDistributor

        private java.util.function.ToIntFunction<T> myDistributor
      • myExecutor

        private java.util.concurrent.ExecutorService myExecutor
      • myFragmentation

        private int myFragmentation
      • myParallelism

        private int myParallelism
      • myQueueCapacity

        private int myQueueCapacity
    • Constructor Detail

    • Method Detail

      • distributor

        public BatchNode.Builder<T> distributor(java.util.function.ToIntFunction<T> distributor)
        By default, items are distributed randomly. Most likely you want to distribute based on some property of the item: extract that property and use its hash code. All items with the same value for that property then end up in the same shard, which you can exploit when processing the data.
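As an illustration of distributing on a property, the following minimal sketch builds a ToIntFunction that hashes one field of an item. The Order type, the BY_CUSTOMER constant and the shardOf helper are hypothetical names introduced only for this example. How BatchNode maps the returned int to a shard is not specified here, so the modulo step is an assumption.

```java
import java.util.function.ToIntFunction;

public class DistributorSketch {

    // Hypothetical item type, used only for illustration.
    static final class Order {
        final String customerId;
        final double amount;

        Order(String customerId, double amount) {
            this.customerId = customerId;
            this.amount = amount;
        }
    }

    // Distribute on the customer id: extract the property and take its hash code.
    static final ToIntFunction<Order> BY_CUSTOMER = order -> order.customerId.hashCode();

    // Assumption: the int is mapped to a shard index by a non-negative modulo.
    // The mapping actually used by BatchNode may differ.
    static int shardOf(Order order, int numberOfShards) {
        return Math.floorMod(BY_CUSTOMER.applyAsInt(order), numberOfShards);
    }

    public static void main(String[] args) {
        Order first = new Order("C-17", 10.0);
        Order second = new Order("C-17", 250.0);
        // Same customer id, so both orders map to the same shard.
        System.out.println(shardOf(first, 64) == shardOf(second, 64)); // prints "true"
    }
}
```

A function such as BY_CUSTOMER is what would be passed to distributor(...). Because all orders of one customer share a shard, per-customer aggregation can then be done one shard at a time.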
      • executor

        public BatchNode.Builder<T> executor(java.util.concurrent.ExecutorService executor)
      • fragmentation

        public BatchNode.Builder<T> fragmentation(int fragmentation)
        The number of underlying files/shards. Increasing the fragmentation (the number of shards) typically reduces memory requirements when processing. The value set here only indicates the desired order of magnitude; the exact number of shards actually used is a derived property.
      • parallelism

        public BatchNode.Builder<T> parallelism(java.util.function.IntSupplier parallelism)
        The number of worker threads that should process data in parallel.
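Since parallelism is given as an IntSupplier, the thread count can be computed lazily. The sketch below shows one possible supplier based on the number of available processors, rounded down to a power of 2. The names ParallelismSketch, floorPowerOfTwo and AVAILABLE_CORES are hypothetical, and rounding down is an assumption made for illustration; how the builder itself adjusts the supplied value is not specified here.

```java
import java.util.function.IntSupplier;

public class ParallelismSketch {

    // Largest power of 2 that does not exceed the given value (at least 1).
    static int floorPowerOfTwo(int value) {
        return Integer.highestOneBit(Math.max(1, value));
    }

    // Evaluated each time the supplier is asked, not when it is created.
    static final IntSupplier AVAILABLE_CORES =
            () -> floorPowerOfTwo(Runtime.getRuntime().availableProcessors());

    public static void main(String[] args) {
        System.out.println(floorPowerOfTwo(6)); // prints "4"
        System.out.println(floorPowerOfTwo(8)); // prints "8"
    }
}
```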
      • queue

        public BatchNode.Builder<T> queue(int capacity)
        When data is read from or written to disk, it is temporarily queued. This specifies the maximum total number of items kept in the queues.
      • getDistributor

        java.util.function.ToIntFunction<T> getDistributor()
      • getFragmentation

        int getFragmentation()
        The total number of files/shards. This will always be a power of 2 as well as a multiple of getParallelism().
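One way a shard count satisfying both constraints could be derived from a requested value is sketched below. This is not the builder's actual algorithm, which is not shown here. FragmentationSketch, nextPowerOfTwo and deriveFragmentation are hypothetical names, and the sketch assumes inputs no larger than 2^30 so that the shift cannot overflow.

```java
public class FragmentationSketch {

    // Smallest power of 2 that is greater than or equal to the given value.
    // Assumes value <= 2^30, so the left shift does not overflow.
    static int nextPowerOfTwo(int value) {
        if (value <= 1) {
            return 1;
        }
        int highest = Integer.highestOneBit(value);
        return highest == value ? value : highest << 1;
    }

    // Two powers of 2 always divide one another in one direction, so taking
    // the larger of the two yields a multiple of the (rounded) parallelism.
    static int deriveFragmentation(int requested, int parallelism) {
        return Math.max(nextPowerOfTwo(requested), nextPowerOfTwo(parallelism));
    }

    public static void main(String[] args) {
        System.out.println(deriveFragmentation(100, 4)); // prints "128"
        System.out.println(deriveFragmentation(3, 8));   // prints "8"
    }
}
```

A request of 100 shards with 4 workers gives 128 under these assumptions, which matches the description of the requested value as an order-of-magnitude indication rather than an exact count.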
      • getName

        java.lang.String getName()
      • getParallelism

        java.util.function.IntSupplier getParallelism()
        Will always be a power of 2.
      • getQueueCapacity

        int getQueueCapacity()