Data Types
PG-Strom supports the following data types for use on GPU devices.
Numeric types
SQL data types | Internal format | Length | Memo |
---|---|---|---|
smallint | short | 2 bytes | |
integer | int | 4 bytes | |
bigint | long | 8 bytes | |
float2 | short | 2 bytes | Half-precision data type; an extra data type provided by PG-Strom |
real | float | 4 bytes | |
float | double | 8 bytes | |
numeric | int128 | variable length | Mapped to a 128-bit fixed-point internal format |
Note
When the GPU processes values of the numeric data type, they are converted to an internal 128-bit fixed-point format for implementation reasons. (This layout is identical to the Decimal type in Apache Arrow.)
The conversion to and from the internal format is transparent. However, PG-Strom cannot convert numeric datums with a large number of digits, so it falls back to CPU processing for them. Supplying numeric data with a large number of digits to the GPU device may therefore cause a slowdown.
To avoid this problem, turn off the GUC option pg_strom.enable_numeric_type so that expressions involving numeric data types are not run on GPU devices.
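The digit limit mentioned in this note can be illustrated with a small sketch. It assumes, as a simplification, that a value is convertible exactly when its unscaled integer fits in a signed 128-bit integer, which is the layout Apache Arrow uses for Decimal. The precise limit PG-Strom applies is not stated here, and `to_int128_fixed` is a hypothetical helper name, not part of PG-Strom.

```python
from decimal import Decimal
from typing import Optional

# Bounds of a signed 128-bit integer (assumed container for the unscaled value).
INT128_MAX = (1 << 127) - 1
INT128_MIN = -(1 << 127)


def to_int128_fixed(value: Decimal, scale: int) -> Optional[int]:
    """Return value * 10**scale as an integer if it is exact and fits in
    128 bits, otherwise None (hypothetical helper for illustration only)."""
    if not value.is_finite():
        return None  # NaN and Infinity have no fixed-point representation
    sign, digits, exponent = value.as_tuple()
    coefficient = int("".join(map(str, digits)))
    shift = exponent + scale
    if shift >= 0:
        unscaled = coefficient * 10 ** shift
    else:
        # The value has more fractional digits than the scale can hold.
        unscaled, remainder = divmod(coefficient, 10 ** -shift)
        if remainder:
            return None
    if sign:
        unscaled = -unscaled
    if INT128_MIN <= unscaled <= INT128_MAX:
        return unscaled
    return None


if __name__ == "__main__":
    for text, scale in [("123.45", 2), ("-1.5", 3), ("1e40", 0)]:
        print(text, scale, to_int128_fixed(Decimal(text), scale))
```

Under this assumption, inputs for which the helper returns `None` correspond to the values that cannot be handled in the fixed-point format and would need the CPU fallback described above.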
Note
Even though GPUs support half-precision floating-point numbers in hardware, CPUs (x86_64 processors) do not support them yet. So when the CPU processes float2 values, it converts them to float or double for calculation. Unlike the GPU, the CPU therefore gains no calculation performance advantage from float2. The type is instead a way to save storage and memory capacity in machine-learning and statistical-analytics workloads.
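The storage trade-off can be seen with a short sketch using the IEEE 754 half-precision format from Python's standard `struct` module (format character `e`). This only illustrates the encoding size and the precision loss; it is not PG-Strom code, and `roundtrip_half` is a hypothetical helper name.

```python
import struct

# Encoded sizes of the three IEEE 754 formats, in bytes.
HALF_SIZE = struct.calcsize("<e")    # half precision
SINGLE_SIZE = struct.calcsize("<f")  # single precision
DOUBLE_SIZE = struct.calcsize("<d")  # double precision


def roundtrip_half(x: float) -> float:
    """Encode x as half precision, then widen it back to a Python float,
    similar in spirit to a CPU widening float2 before calculating."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


if __name__ == "__main__":
    print(HALF_SIZE, SINGLE_SIZE, DOUBLE_SIZE)
    values = [0.5, 1.5, 0.1]
    packed = struct.pack(f"<{len(values)}e", *values)
    print(len(packed), "bytes for", len(values), "half-precision values")
    for v in values:
        print(v, "->", roundtrip_half(v))
```

Values such as 0.5 and 1.5 survive the round trip exactly, while 0.1 comes back slightly changed, which is the precision cost paid for the smaller footprint.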
Built-in date and time types
SQL data types | Internal format | Length | Memo |
---|---|---|---|
date | DateADT | 4 bytes | |
time | TimeADT | 8 bytes | |
timetz | TimeTzADT | 12 bytes | |
timestamp | Timestamp | 8 bytes | |
timestamptz | TimestampTz | 8 bytes | |
interval | Interval | 16 bytes | |
Built-in variable length types
SQL data types | Internal format | Length | Memo |
---|---|---|---|
bpchar | varlena * | variable length | |
varchar | varlena * | variable length | |
bytea | varlena * | variable length | |
text | varlena * | variable length | |
Built-in unstructured data types
SQL data types | Internal format | Length | Memo |
---|---|---|---|
jsonb | varlena * | variable length | |
Note
Keep the following two points in mind when the GPU processes jsonb data.
1. jsonb is not a performance-efficient data type: unreferenced attributes must also be loaded from storage onto the GPU, so I/O bandwidth tends to be consumed by unneeded data.
2. When the length of a jsonb value exceeds the TOAST threshold, the entire value is written out to the TOAST table. The GPU cannot process such values, so inefficient CPU-fallback operations are invoked.
For the second problem, you can increase the table's storage option toast_tuple_target to raise the TOAST threshold.
Built-in miscellaneous types
SQL data types | Internal format | Length | Memo |
---|---|---|---|
boolean | cl_bool | 1 byte | |
money | cl_long | 8 bytes | |
uuid | pg_uuid | 16 bytes | |
macaddr | macaddr | 6 bytes | |
inet | inet_struct | 7 bytes or 19 bytes | |
cidr | inet_struct | 7 bytes or 19 bytes | |
Built-in range data types
SQL data types | Internal format | Length | Memo |
---|---|---|---|
int4range | __int4range | 14 bytes | |
int8range | __int8range | 22 bytes | |
tsrange | __tsrange | 22 bytes | |
tstzrange | __tstzrange | 22 bytes | |
daterange | __daterange | 14 bytes | |