All Implemented Interfaces: Processor<T>
Direct Known Subclasses: BatchedColumnProcessor
public abstract class AbstractBatchedColumnProcessor<T extends Context> extends java.lang.Object implements Processor<T>
A Processor implementation that stores values of columns in batches. Use this implementation in favor of AbstractColumnProcessor when processing large inputs to avoid running out of memory.
Values parsed in each row are split into columns of Strings. Each column has its own list of values.
During parsing, the batchProcessed(int) method is invoked each time a given number of rows has been processed.
The user can access the lists of values parsed for all columns using the methods getColumnValuesAsList(), getColumnValuesAsMapOfIndexes() and getColumnValuesAsMapOfNames().
After batchProcessed(int) is invoked, all values are discarded and the next batch of column values is accumulated.
This process repeats until there are no more rows in the input.
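The batch lifecycle described above can be sketched in plain Java. This is a minimal illustration, not the library's implementation: the class `BatchLifecycleSketch`, its `log` field and its `run` helper are hypothetical names, rows are assumed to have the same number of values, and the sketch assumes that a trailing partial batch is still delivered when the input ends.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the batching behaviour described above; not the library's code.
class BatchLifecycleSketch {
    private final int rowsPerBatch;
    private final List<List<String>> columns = new ArrayList<>();
    private final List<String> log = new ArrayList<>(); // what each batch callback observed
    private int rowsInBatch = 0;
    private int batchesProcessed = 0;

    BatchLifecycleSketch(int rowsPerBatch) {
        this.rowsPerBatch = rowsPerBatch;
    }

    // Split one row into per-column lists, then fire the callback once the batch is full.
    void rowProcessed(String[] row) {
        for (int i = 0; i < row.length; i++) {
            if (columns.size() <= i) {
                columns.add(new ArrayList<>());
            }
            columns.get(i).add(row[i]);
        }
        rowsInBatch++;
        if (rowsInBatch >= rowsPerBatch) {
            flush();
        }
    }

    // Assumption: a trailing partial batch is still delivered when the input ends.
    void processEnded() {
        if (rowsInBatch > 0) {
            flush();
        }
    }

    int getBatchesProcessed() {
        return batchesProcessed;
    }

    // Plays the role of batchProcessed(int): inspect the columns, then discard them.
    private void flush() {
        log.add(rowsInBatch + " rows -> " + columns);
        columns.clear();
        rowsInBatch = 0;
        batchesProcessed++;
    }

    // Feeds all rows through the sketch and returns what each batch saw.
    static List<String> run(int rowsPerBatch, String[][] rows) {
        BatchLifecycleSketch p = new BatchLifecycleSketch(rowsPerBatch);
        for (String[] row : rows) {
            p.rowProcessed(row);
        }
        p.processEnded();
        return p.log;
    }

    public static void main(String[] args) {
        String[][] rows = {{"a", "1"}, {"b", "2"}, {"c", "3"}};
        run(2, rows).forEach(System.out::println);
    }
}
```

Because the column lists are cleared after every callback, memory use is bounded by the batch size rather than by the size of the input, which is the reason for preferring the batched variant on large inputs.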
See Also: AbstractParser, BatchedColumnReader, Processor
Constructor | Description |
---|---|
AbstractBatchedColumnProcessor(int rowsPerBatch) | Constructs a batched column processor configured to invoke the batchProcessed(int) method after a given number of rows has been processed. |
Modifier and Type | Method | Description |
---|---|---|
abstract void | batchProcessed(int rowsInThisBatch) | |
int | getBatchesProcessed() | |
java.util.List<java.lang.String> | getColumn(int columnIndex) | |
java.util.List<java.lang.String> | getColumn(java.lang.String columnName) | |
java.util.List<java.util.List<java.lang.String>> | getColumnValuesAsList() | |
java.util.Map<java.lang.Integer,java.util.List<java.lang.String>> | getColumnValuesAsMapOfIndexes() | |
java.util.Map<java.lang.String,java.util.List<java.lang.String>> | getColumnValuesAsMapOfNames() | |
java.lang.String[] | getHeaders() | |
int | getRowsPerBatch() | |
void | processEnded(T context) | This method will be invoked by the parser once, after the parsing process stopped and all resources were closed. |
void | processStarted(T context) | This method will be invoked by the parser once, when it is ready to start processing the input. |
void | putColumnValuesInMapOfIndexes(java.util.Map<java.lang.Integer,java.util.List<java.lang.String>> map) | |
void | putColumnValuesInMapOfNames(java.util.Map<java.lang.String,java.util.List<java.lang.String>> map) | |
void | rowProcessed(java.lang.String[] row, T context) | Invoked by the parser after all values of a valid record have been processed. |
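The list, index-keyed and name-keyed accessors in the table above expose the same column data in three shapes. The following sketch shows how those shapes relate. It is an illustration only: `ColumnViewsSketch`, `byIndex` and `byName` are hypothetical names, and treating missing headers as an error is an assumption of this sketch, not documented behaviour.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper showing how the three views relate; not the library's code.
class ColumnViewsSketch {

    // Index view: column position -> values of that column.
    static Map<Integer, List<String>> byIndex(List<List<String>> columns) {
        Map<Integer, List<String>> out = new LinkedHashMap<>();
        for (int i = 0; i < columns.size(); i++) {
            out.put(i, columns.get(i));
        }
        return out;
    }

    // Name view: header -> values of that column.
    // Assumption: missing headers are treated as an error in this sketch.
    static Map<String, List<String>> byName(String[] headers, List<List<String>> columns) {
        if (headers == null || headers.length < columns.size()) {
            throw new IllegalStateException("Not enough headers to name every column");
        }
        Map<String, List<String>> out = new LinkedHashMap<>();
        for (int i = 0; i < columns.size(); i++) {
            out.put(headers[i], columns.get(i));
        }
        return out;
    }

    public static void main(String[] args) {
        // The list view is simply one list of values per column.
        List<List<String>> columns = Arrays.asList(
                Arrays.asList("a", "b"),
                Arrays.asList("1", "2"));
        System.out.println(byIndex(columns));
        System.out.println(byName(new String[] {"letter", "digit"}, columns));
    }
}
```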
public AbstractBatchedColumnProcessor(int rowsPerBatch)
Constructs a batched column processor configured to invoke the batchProcessed(int) method after a given number of rows has been processed.
Parameters:
rowsPerBatch - the number of rows to process in each batch.
public void processStarted(T context)
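Description copied from interface: Processor
This method will be invoked by the parser once, when it is ready to start processing the input.
Specified by:
processStarted in interface Processor<T extends Context>
Parameters:
context - A contextual object with information and controls over the current state of the parsing process
public void rowProcessed(java.lang.String[] row, T context)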
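Description copied from interface: Processor
Invoked by the parser after all values of a valid record have been processed.
Specified by:
rowProcessed in interface Processor<T extends Context>
Parameters:
row - the data extracted by the parser for an individual record. Note that whether empty lines are skipped is controlled by CommonSettings.setSkipEmptyLines(boolean), and that comment processing can be disabled by setting Format.setComment(char) to '\0'
context - A contextual object with information and controls over the current state of the parsing process
public void processEnded(T context)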
Description copied from interface: Processor
This method will be invoked by the parser once, after the parsing process stopped and all resources were closed.
It will always be called by the parser: in case of errors, if the end of the input is reached, or if the user stopped the process manually using Context.stop().
Specified by:
processEnded in interface Processor<T extends Context>
Parameters:
context - A contextual object with information and controls over the state of the parsing process
public final java.lang.String[] getHeaders()
public final java.util.List<java.util.List<java.lang.String>> getColumnValuesAsList()
public final void putColumnValuesInMapOfNames(java.util.Map<java.lang.String,java.util.List<java.lang.String>> map)
public final void putColumnValuesInMapOfIndexes(java.util.Map<java.lang.Integer,java.util.List<java.lang.String>> map)
public final java.util.Map<java.lang.String,java.util.List<java.lang.String>> getColumnValuesAsMapOfNames()
public final java.util.Map<java.lang.Integer,java.util.List<java.lang.String>> getColumnValuesAsMapOfIndexes()
public java.util.List<java.lang.String> getColumn(java.lang.String columnName)
public java.util.List<java.lang.String> getColumn(int columnIndex)
public int getRowsPerBatch()
public int getBatchesProcessed()
public abstract void batchProcessed(int rowsInThisBatch)