GUC Parameters
This section introduces PG-Strom's configuration parameters.
Enables/disables a particular feature
Parameter | Type | Default | Description |
---|---|---|---|
pg_strom.enabled | bool | on | Enables/disables all PG-Strom features at once. |
pg_strom.enable_gpuscan | bool | on | Enables/disables GpuScan. |
pg_strom.enable_gpuhashjoin | bool | on | Enables/disables GpuJoin by HashJoin. |
pg_strom.enable_gpunestloop | bool | on | Enables/disables GpuJoin by NestLoop. |
pg_strom.enable_gpupreagg | bool | on | Enables/disables GpuPreAgg. |
pg_strom.enable_brin | bool | on | Enables/disables BRIN index support on table scans. |
pg_strom.enable_partitionwise_gpujoin | bool | on | Controls whether GpuJoin is pushed down to the partition children. Available only on PostgreSQL v10 or later. |
pg_strom.enable_partitionwise_gpupreagg | bool | on | Controls whether GpuPreAgg is pushed down to the partition children. Available only on PostgreSQL v10 or later. |
pg_strom.pullup_outer_scan | bool | on | Enables/disables pulling up a full-table scan that sits directly below GpuPreAgg/GpuJoin, to reduce data transfer between CPU/RAM and GPU. |
pg_strom.pullup_outer_join | bool | on | Enables/disables pulling up a table join when GpuJoin sits directly below GpuPreAgg, to reduce data transfer between CPU/RAM and GPU. |
pg_strom.enable_numeric_aggfuncs | bool | on | Enables/disables support for aggregate functions that take the numeric data type. |
pg_strom.cpu_fallback | bool | off | Controls whether CPU fallback operations are actually run when a GPU program returns "CPU ReCheck Error". |
pg_strom.regression_test_mode | bool | off | Disables EXPLAIN output that depends on the execution platform, such as the GPU model name. This avoids false positives in regression tests; users usually do not need to touch this setting. |
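The switches above are typically used to compare query plans with and without a given PG-Strom feature. The following is a minimal sketch, assuming these parameters can be changed per session with `SET`; the table name `t` is hypothetical and stands in for one of your own tables.

```sql
-- Hypothetical table "t"; substitute a table from your own database.

-- Plan using only PostgreSQL's built-in operators.
SET pg_strom.enabled = off;
EXPLAIN SELECT count(*) FROM t;

-- Re-enable PG-Strom, but switch off GpuPreAgg alone.
SET pg_strom.enabled = on;
SET pg_strom.enable_gpupreagg = off;
EXPLAIN SELECT count(*) FROM t;

-- Return the session to the configured default.
RESET pg_strom.enable_gpupreagg;
```

Comparing the two `EXPLAIN` outputs shows which plan nodes each switch affects.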
Optimizer Configuration
Parameter | Type | Default | Description |
---|---|---|---|
pg_strom.chunk_size | int | 65534kB | Size of the data blocks processed by a single GPU kernel invocation. It used to be configurable, but tuning it brought little benefit, so it is fixed at about 64MB in the current version. |
pg_strom.gpu_setup_cost | real | 4000 | Cost value for initialization of a GPU device. |
pg_strom.gpu_dma_cost | real | 10 | Cost value for DMA transfer over the PCIe bus per data chunk (64MB). |
pg_strom.gpu_operator_cost | real | 0.00015 | Cost value to process an expression on the GPU. If it is set larger than cpu_operator_cost, PG-Strom will never be chosen, regardless of table size. |
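The relationship between `pg_strom.gpu_operator_cost` and PostgreSQL's own `cpu_operator_cost` can be checked directly. This is a sketch only, assuming the cost parameters can be changed per session with `SET`; the value `8000` is an arbitrary illustration, not a recommendation.

```sql
-- Compare the GPU and CPU per-operator costs used by the planner.
SHOW pg_strom.gpu_operator_cost;   -- 0.00015 by default, per the table above
SHOW cpu_operator_cost;            -- PostgreSQL's default is 0.0025

-- Illustrative only: a higher setup cost makes the planner less likely
-- to pick GPU plans for small tables.
SET pg_strom.gpu_setup_cost = 8000;

-- Return to the configured default.
RESET pg_strom.gpu_setup_cost;
```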
Executor Configuration
Parameter | Type | Default | Description |
---|---|---|---|
pg_strom.global_max_async_tasks | int | 160 | Number of asynchronous tasks PG-Strom can submit to the GPU's execution queue across the whole system. |
pg_strom.local_max_async_tasks | int | 8 | Number of asynchronous tasks PG-Strom can submit to the GPU's execution queue per process. When combined with CPU parallel query, this limit applies to each background worker, so the batch job as a whole may run more than pg_strom.local_max_async_tasks asynchronous tasks in parallel. |
pg_strom.max_number_of_gpucontext | int | auto | Specifies the number of GpuContext structures, the internal data structure that abstracts a GPU device. Usually there is no need to raise the initial value. |
SSD-to-GPU Direct Configuration
Parameter | Type | Default | Description |
---|---|---|---|
pg_strom.nvme_strom_enabled | bool | on | Enables/disables the SSD-to-GPU Direct SQL mechanism. |
pg_strom.nvme_strom_threshold | int | auto | Controls the table-size threshold above which the SSD-to-GPU Direct SQL mechanism is invoked. |
pg_strom.nvme_distance_map | string | NULL | Manually configures the closest GPU for each NVMe-SSD. Usually this is configured automatically from the PCIe bus topology information in sysfs. |
Arrow_Fdw Configuration
Parameter | Type | Default | Description |
---|---|---|---|
arrow_fdw.enabled | bool | on | Turns Arrow_Fdw on/off by adjusting its estimated cost. Note that Foreign Scan (Arrow_Fdw) is the only way to scan Arrow files when GpuScan is not able to run on them. |
arrow_fdw.metadata_cache_size | int | 128MB | Size of the shared memory used to cache metadata of Arrow files. Changing this parameter requires a restart. |
arrow_fdw.record_batch_size | int | 256MB | RecordBatch size threshold when writing to an Arrow_Fdw foreign table. When the total buffer size exceeds this value, Arrow_Fdw writes the buffer out to the Apache Arrow file, even if the INSERT command has not completed yet. |
Configuration of GPU code generation and build
Parameter | Type | Default | Description |
---|---|---|---|
pg_strom.program_cache_size | int | 256MB | Amount of shared memory used to cache GPU programs that have already been built. Changing this parameter requires a restart. |
pg_strom.num_program_builders | int | 2 | Number of background workers that build GPU programs asynchronously. Changing this parameter requires a restart. |
pg_strom.debug_jit_compile_options | bool | off | Controls whether debug options (line numbers and symbol information) are included in JIT compilation of GPU programs. This is valuable for analyzing complicated bugs with a GPU core dump, but should not be enabled in daily use because it degrades performance. |
pg_strom.debug_kernel_source | bool | off | If enabled, the EXPLAIN VERBOSE command also prints the file paths of the GPU programs written out. |
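Several parameters on this page take effect only after a restart. One way to see which ones is PostgreSQL's standard `pg_settings` view, where a `context` of `postmaster` marks a restart-only parameter. The sketch below assumes PG-Strom is already loaded so that its parameters are registered, and that the size parameters accept unit suffixes such as `MB`; the value `512MB` is purely illustrative.

```sql
-- List PG-Strom parameters with their current values and change context.
-- context = 'postmaster' means the value can only change at server start.
SELECT name, setting, unit, context
  FROM pg_settings
 WHERE name LIKE 'pg_strom.%'
 ORDER BY name;

-- Illustrative only: persist a new value to postgresql.auto.conf.
-- For a restart-only parameter, it takes effect after the next restart.
ALTER SYSTEM SET pg_strom.program_cache_size = '512MB';
```

Editing `postgresql.conf` directly achieves the same result; `ALTER SYSTEM` is used here only to keep the example in SQL.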
GPU Device Configuration
Parameter | Type | Default | Description |
---|---|---|---|
pg_strom.cuda_visible_devices | string | '' | Comma-separated list of GPU device numbers, used when only particular GPUs should be recognized at PostgreSQL startup. It is equivalent to the environment variable CUDA_VISIBLE_DEVICES. |
pg_strom.gpu_memory_segment_size | int | 512MB | Specifies the amount of device memory allocated per CUDA API call. A larger value reduces the overhead of API calls, but uses device memory less efficiently. |
pg_strom.max_num_preserved_gpu_memory | int | 2048 | Upper limit on the number of preserved GPU device memory segments. Usually there is no need to change the default value. |
System Shared Memory Configuration
Parameter | Type | Default | Description |
---|---|---|---|
shmbuf.segment_size | int | 256MB | |
shmbuf.num_logical_segments | int | auto | The default logical segment size is twice the system's physical memory size. |