Using Arrow Table Columns with Shaders
Apache Arrow tables store data in typed columns. luma.gl shaders consume typed vertex
attributes. The @luma.gl/arrow helpers connect those two models by deriving a
BufferLayout from an Arrow table and a shader ShaderLayout.
Apache Arrow Preliminaries
Apache Arrow has a rich type system that can represent a wide variety of binary data columns. A subset of these column types can be used directly as GPU vertex attribute data, so such Arrow columns can be uploaded to the GPU without a per-value conversion.
Apache Arrow supports primitive types like Float32, Uint32, and Uint8 that
describe the value stored in each row. It also supports fixed-length vectors of
these types with FixedSizeList. These scalar and fixed-length vector types map
directly to the memory layouts used by GPU vertex attributes.
Arrow also supports variable-length List columns. These are useful for data
such as polygons and paths, but they do not map directly to a single vertex
attribute without an additional conversion step.
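The difference comes from how list columns are laid out. An Arrow List column stores a flat values buffer plus an offsets buffer with one more entry than there are rows, so each row may own a different number of values. The sketch below uses plain typed arrays rather than the apache-arrow API, and the function names are hypothetical, chosen only for illustration:

```typescript
// Arrow-style variable-length list: `offsets` has rowCount + 1 entries,
// and row i owns values[offsets[i] .. offsets[i + 1]).
function getListRowLengths(offsets: Int32Array): number[] {
  const lengths: number[] = [];
  for (let row = 0; row + 1 < offsets.length; row++) {
    lengths.push(offsets[row + 1] - offsets[row]);
  }
  return lengths;
}

// Returns the shared row length if every row has the same length, else null.
// Only the first case could be treated like a fixed-size vertex attribute.
function getFixedRowLength(offsets: Int32Array): number | null {
  const lengths = getListRowLengths(offsets);
  if (lengths.length === 0) {
    return null;
  }
  return lengths.every(length => length === lengths[0]) ? lengths[0] : null;
}

// Three paths with 2, 3, and 1 vertices: the rows differ in length.
const pathOffsets = new Int32Array([0, 2, 5, 6]);
const pathRowLengths = getListRowLengths(pathOffsets);
const pathFixedLength = getFixedRowLength(pathOffsets);
```

Because the row lengths differ, no single stride describes the column, which is why such data needs to be expanded or otherwise converted before it can feed a vertex attribute.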
Shader and Buffer Layout Preliminaries
luma.gl provides separate descriptions of shader attributes and the buffers that
provide data for those attributes. The key observation here is that shaders only
work with four vertex attribute scalar types (f32, f16, i32, and u32),
and there is some flexibility in what binary buffer layouts can feed these
declarations.
- ShaderLayout describes what the shader can accept, such as vec4<f32>.
- BufferLayout describes how the current table column is stored in memory, such as float32x4, float16x4, or unorm8x4.
This means the shader does not need to be written specifically for every memory representation. A shader attribute is declared using the type used in the shader source code:
const shaderLayout = {
  attributes: [{name: 'colors', location: 0, type: 'vec4<f32>'}],
  bindings: []
};
Then different Arrow table schemas can use different buffer layouts. For the
vec4<f32> described in the shader layout, the following buffer formats are all
accepted by the GPU.
| Arrow column type | Buffer layout format | Notes |
|---|---|---|
| FixedSizeList<Float32, 4> | float32x4 | |
| FixedSizeList<Float16, 4> | float16x4 | Shader sees f32 |
| FixedSizeList<Int16, 4> | snorm16x4 | Shader sees f32, normalized to [-1.0, 1.0] |
| FixedSizeList<Uint16, 4> | unorm16x4 | Shader sees f32, normalized to [0.0, 1.0] |
| FixedSizeList<Int8, 4> | snorm8x4 | Shader sees f32, normalized to [-1.0, 1.0] |
| FixedSizeList<Uint8, 4> | unorm8x4 | Shader sees f32, normalized to [0.0, 1.0] |
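As a worked illustration of the normalized formats, the following sketch reproduces on the CPU the standard unorm and snorm conversions that the GPU applies when it reads such attributes. The helper names are hypothetical and are not part of @luma.gl/arrow:

```typescript
// unorm: an unsigned integer c with b bits maps to c / (2^b - 1), in [0.0, 1.0].
function unormToFloat(value: number, bits: 8 | 16): number {
  return value / (2 ** bits - 1);
}

// snorm: a signed integer c with b bits maps to max(c / (2^(b-1) - 1), -1.0),
// so the most negative integer clamps to exactly -1.0.
function snormToFloat(value: number, bits: 8 | 16): number {
  return Math.max(value / (2 ** (bits - 1) - 1), -1.0);
}

// One row of a FixedSizeList<Uint8, 4> color column, as the shader would see it.
const rgbaRow = new Uint8Array([255, 128, 0, 255]);
const shaderColor = Array.from(rgbaRow, component => unormToFloat(component, 8));
```

Here the first and last components become 1.0, the third becomes 0.0, and 128 becomes 128 / 255, slightly above one half.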
Creating a BufferLayout
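Use getArrowBufferLayout() with an Arrow table. Columns whose names match shader
attribute names need no extra configuration; use arrowPaths to map a shader
attribute name to a differently named or nested Arrow column:
import {Model} from '@luma.gl/engine';
import {getArrowBufferLayout} from '@luma.gl/arrow';
// Map the instanceColors shader attribute to the nested properties.color column.
const bufferLayout = getArrowBufferLayout(shaderLayout, {
  arrowTable: table,
  arrowPaths: {
    instanceColors: 'properties.color'
  }
});
const model = new Model(device, {
  vs,
  fs,
  shaderLayout,
  bufferLayout,
  vertexCount
});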
You can also provide Arrow vectors directly. In this mode, object keys are shader attribute names:
const bufferLayout = getArrowBufferLayout(shaderLayout, {
  arrowVectors: {
    instanceColors: table.getChild('properties').getChild('color')
  }
});
The generated layouts use shader attribute names as buffer names:
[
{name: 'instanceColors', format: 'unorm8x4'}
]
Arrow GPU Objects
ArrowGPUVector and ArrowGPUTable are GPU-side representations derived from
Apache Arrow data. Arrow vectors and tables are construction inputs; the GPU
objects do not retain references to those sources after extracting the buffer
data and metadata they need.
An ArrowGPUTable owns GPU buffers and a GPU-facing Arrow Schema for the
selected shader attributes. Field names, types, nullability, and metadata live in
arrowGPUTable.schema.fields. An ArrowGPUVector references one GPU buffer and
uses Arrow's type system to describe it through type, length, and stride.
Vectors created from Arrow data own their generated buffers. Vectors wrapping
existing buffers are non-owning unless ownsBuffer is supplied.
In-place operations may transfer ownership to a new vector view over the same
buffer, so the returned vector becomes responsible for destroying the buffer.
import {ArrowGPUTable, ArrowGPUVector} from '@luma.gl/arrow';
const arrowGPUTable = new ArrowGPUTable(device, table, {shaderLayout});
// The schema describes the selected GPU columns, not necessarily the full table.
const [colorField] = arrowGPUTable.schema.fields;
// Each GPU vector has a buffer plus Arrow-derived type/shape metadata.
const colorVector: ArrowGPUVector = arrowGPUTable.gpuVectors.instanceColors;
ArrowModel is the convenience wrapper that combines ArrowGPUTable with
Model. It accepts an Arrow table as an update source and replaces the GPU
representation when setProps({arrowTable}) is called.
Mesh Arrow Geometry
ArrowGeometry converts Mesh Arrow tables into GPUGeometry. This is intended
for loaders.gl-compatible mesh and point-cloud tables that use glTF-style column
names such as POSITION, NORMAL, COLOR_0, and TEXCOORD_0.
import {ArrowGeometry, ArrowModel, type ArrowMeshTable} from '@luma.gl/arrow';
const geometry = new ArrowGeometry(device, {
  arrowMesh,
  interleaved: true
});
const model = new ArrowModel(device, {
  vs,
  fs,
  shaderLayout,
  arrowMesh
});
The local ArrowMeshTable type is structural and does not add a dependency on
loaders.gl. It intentionally mirrors loaders.gl MeshArrowTable: a wrapper with
shape: 'arrow-table', topology, optional top-level indices, and raw
data: arrow.Table.
Mesh Arrow tables use one row per vertex. Vertex attributes are scalar numeric
or FixedSizeList<numeric, 1 | 2 | 3 | 4> columns. ArrowGeometry normalizes
common glTF semantics to luma.gl shader attribute names:
| Mesh Arrow column | Shader attribute |
|---|---|
| POSITION | positions |
| NORMAL | normals |
| COLOR_0 | colors |
| TEXCOORD_0 | texCoords |
| TEXCOORD_1 | texCoords1 |
Unknown column names are preserved unless arrowPaths maps a shader attribute
name to a specific Arrow column name.
Indexed Mesh Arrow tables follow the loaders.gl convention: a lowercase
indices: List<Int32> column stores the full primitive index list in row 0,
and remaining vertex rows are null. ArrowGeometry uploads those indices as a
separate GPU index buffer. If the wrapper has a top-level indices accessor and
the Arrow table has no indices column, that accessor is used as a fallback.
By default, ArrowGeometry packs all selected vertex attributes into one
interleaved vertex buffer and keeps indices separate. Pass interleaved: false
to upload one vertex buffer per attribute.
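To make the interleaved layout concrete, here is a minimal CPU-side sketch that packs a float32x3 position column and a unorm8x4 color column into one buffer with a 16-byte stride. It is an illustration of interleaving in general, not ArrowGeometry's actual packing code, and the function name is hypothetical:

```typescript
// Packs per-vertex positions (3 x float32) and colors (4 x uint8) so that
// each vertex occupies one contiguous 16-byte record:
//   bytes 0..11  position xyz, bytes 12..15  color rgba
function interleavePositionsAndColors(
  positions: Float32Array,
  colors: Uint8Array
): {data: ArrayBuffer; byteStride: number} {
  const vertexCount = positions.length / 3;
  if (!Number.isInteger(vertexCount) || colors.length !== vertexCount * 4) {
    throw new Error('Columns must describe the same number of vertices');
  }
  const byteStride = 3 * 4 + 4 * 1;
  const data = new ArrayBuffer(vertexCount * byteStride);
  const view = new DataView(data);
  for (let vertex = 0; vertex < vertexCount; vertex++) {
    const base = vertex * byteStride;
    for (let c = 0; c < 3; c++) {
      // GPU buffers are read as little-endian on the platforms WebGPU/WebGL target.
      view.setFloat32(base + c * 4, positions[vertex * 3 + c], true);
    }
    for (let c = 0; c < 4; c++) {
      view.setUint8(base + 12 + c, colors[vertex * 4 + c]);
    }
  }
  return {data, byteStride};
}

const interleavedExample = interleavePositionsAndColors(
  new Float32Array([1, 2, 3, 4, 5, 6]),
  new Uint8Array([255, 0, 0, 255, 0, 255, 0, 255])
);
```

With one buffer per attribute, the same data would instead be uploaded as two separate buffers with strides of 12 and 4 bytes.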
StreamingArrowGPUTable uses the same shader attribute selection model but keeps
DynamicBuffer attributes that can grow as record batches arrive. Use it when
the table is append-only and model attribute objects should remain stable across
buffer reallocations.
import {StreamingArrowGPUTable} from '@luma.gl/arrow';
const streamingTable = new StreamingArrowGPUTable({
  device,
  schema,
  shaderLayout
});
model.setBufferLayout(streamingTable.bufferLayout);
model.setAttributes(streamingTable.attributes);
streamingTable.appendRecordBatch(recordBatch);
model.setInstanceCount(streamingTable.numRows);
The constructor can also consume synchronous record batch iterators immediately, or async record batch iterators when a schema is provided:
const streamingTable = new StreamingArrowGPUTable({
  device,
  schema,
  asyncRecordBatches,
  shaderLayout
});
await streamingTable.ready;
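The idea of an attribute object that stays stable while its storage is reallocated can be sketched on the CPU. The class below is a hypothetical stand-in that doubles its capacity when an appended batch does not fit; it is not how DynamicBuffer is implemented, only an illustration of why callers can keep holding the same object across growth:

```typescript
// A growable column whose identity never changes, even though the
// backing storage is replaced whenever capacity runs out.
class GrowableFloat32Column {
  private data = new Float32Array(4);
  length = 0;
  reallocationCount = 0;

  append(batch: Float32Array): void {
    const required = this.length + batch.length;
    if (required > this.data.length) {
      let capacity = this.data.length;
      while (capacity < required) {
        capacity *= 2;
      }
      // Allocate larger storage and copy the rows appended so far.
      const grown = new Float32Array(capacity);
      grown.set(this.data.subarray(0, this.length));
      this.data = grown;
      this.reallocationCount++;
    }
    this.data.set(batch, this.length);
    this.length = required;
  }

  toArray(): Float32Array {
    return this.data.slice(0, this.length);
  }
}

const growableColumn = new GrowableFloat32Column();
const heldReference = growableColumn; // what a model would keep holding
growableColumn.append(new Float32Array([1, 2, 3])); // fits in the initial capacity
growableColumn.append(new Float32Array([4, 5, 6, 7, 8])); // forces one reallocation
```

After the second append, heldReference still refers to the same object and sees all eight values, even though the underlying storage was replaced once.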
Planning Table Buffer Groups
TableBufferPlanner is a lower-level helper for applications that already have
column descriptors and need to decide how those columns should consume GPU
buffer bindings. It does not upload buffers, interleave data, or bind storage
buffers. It only returns a plan that describes allocation groups, column
mappings, and which columns should be represented by planner-owned packed or
storage buffers.
Use it when a table has more columns than the target device can expose as separate vertex buffers, or when row-geometry data may later be read from WebGPU storage buffers instead of expanded into per-vertex attributes.
import {TableBufferPlanner} from '@luma.gl/arrow';
const plan = TableBufferPlanner.getAllocationPlan({
  device,
  modelInfo: {isInstanced: true},
  generateConstantAttributes: device.type === 'webgpu',
  columns: [
    {
      id: 'positions',
      byteStride: 8,
      byteLength: 8 * 4,
      rowCount: 4,
      stepMode: 'vertex',
      supportsPackedBuffer: true
    },
    {
      id: 'instancePositions',
      byteStride: 12,
      byteLength: 12 * table.numRows,
      rowCount: table.numRows,
      stepMode: 'instance',
      isPosition: true,
      supportsPackedBuffer: true,
      priority: 'high'
    },
    {
      id: 'instanceColors',
      byteStride: 4,
      byteLength: 4 * table.numRows,
      rowCount: table.numRows,
      stepMode: 'instance',
      supportsPackedBuffer: true
    }
  ]
});
The planner supports two modes:
- table-with-shared-geometry: one reusable geometry is drawn once for each table row. Vertex-rate columns describe the shared geometry; table columns are usually instance-rate attributes.
- table-with-row-geometries: each table row expands into its own generated vertices, such as paths or polygons. Constants are planned as a one-row instance-rate group.
The returned plan.groups describe physical allocation groups such as
separate-attribute-column, interleaved-attribute-columns,
position-attribute-columns, and interleaved-constant-attribute-columns.
plan.mappingsByColumnId maps each source column to shader-visible attribute
names and group ids. plan.packedColumnIds identifies columns that callers may
pack into planner-owned vertex buffers.
When useStorageBuffers is enabled, WebGPU row-geometry data columns may be
assigned to separate-storage-column or stacked-storage-columns groups. This
is planner output only in the current arrow module; callers still need their own
storage-buffer upload and shader binding path. Storage planning observes
maxStorageBuffersPerShaderStage, maxStorageBufferBindingSize, and uses
256-byte alignment for stacked column offsets.
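The 256-byte alignment rule for stacked columns amounts to rounding each column's start offset up to the next multiple of 256. A minimal sketch of that arithmetic follows; the function names and the returned shape are hypothetical and do not reflect the planner's actual output format:

```typescript
const STACKED_OFFSET_ALIGNMENT = 256;

// Rounds byteOffset up to the next multiple of alignment.
function alignTo(byteOffset: number, alignment: number): number {
  return Math.ceil(byteOffset / alignment) * alignment;
}

// Places columns one after another, aligning the start of each column.
function getStackedOffsets(
  byteLengths: number[],
  alignment: number = STACKED_OFFSET_ALIGNMENT
): {offsets: number[]; totalByteLength: number} {
  const offsets: number[] = [];
  let cursor = 0;
  for (const byteLength of byteLengths) {
    cursor = alignTo(cursor, alignment);
    offsets.push(cursor);
    cursor += byteLength;
  }
  return {offsets, totalByteLength: cursor};
}

// Columns of 100, 300, and 256 bytes start at offsets 0, 256, and 768.
const stackedExample = getStackedOffsets([100, 300, 256]);
```

The padding between columns (156 bytes after the first, 212 after the second in this example) is the cost of letting each column be bound at its own offset within a shared buffer.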
Supported Shader Types
Arrow scalar numeric columns map to scalar shader attributes. Arrow
FixedSizeList<numeric, 2 | 3 | 4> columns map to vector shader attributes.
| Shader attribute type | Portable Arrow columns |
|---|---|
| f32 | Float32, Float16, Int8, Uint8, Int16, Uint16 |
| vec2<f32> | FixedSizeList<Float32, 2>, FixedSizeList<Float16, 2>, FixedSizeList<Int8, 2>, FixedSizeList<Uint8, 2>, FixedSizeList<Int16, 2>, FixedSizeList<Uint16, 2> |
| vec3<f32> | FixedSizeList<Float32, 3> |
| vec4<f32> | FixedSizeList<Float32, 4>, FixedSizeList<Float16, 4>, FixedSizeList<Int8, 4>, FixedSizeList<Uint8, 4>, FixedSizeList<Int16, 4>, FixedSizeList<Uint16, 4> |
| f16 | Float16, Int8, Uint8, Int16, Uint16 |
| vec2<f16> | FixedSizeList<Float16, 2>, FixedSizeList<Int8, 2>, FixedSizeList<Uint8, 2>, FixedSizeList<Int16, 2>, FixedSizeList<Uint16, 2> |
| vec3<f16> | None in portable WebGPU layouts |
| vec4<f16> | FixedSizeList<Float16, 4>, FixedSizeList<Int8, 4>, FixedSizeList<Uint8, 4>, FixedSizeList<Int16, 4>, FixedSizeList<Uint16, 4> |
| i32 | Int8, Int16, Int32 |
| vec2<i32> | FixedSizeList<Int8, 2>, FixedSizeList<Int16, 2>, FixedSizeList<Int32, 2> |
| vec3<i32> | FixedSizeList<Int32, 3> |
| vec4<i32> | FixedSizeList<Int8, 4>, FixedSizeList<Int16, 4>, FixedSizeList<Int32, 4> |
| u32 | Uint8, Uint16, Uint32 |
| vec2<u32> | FixedSizeList<Uint8, 2>, FixedSizeList<Uint16, 2>, FixedSizeList<Uint32, 2> |
| vec3<u32> | FixedSizeList<Uint32, 3> |
| vec4<u32> | FixedSizeList<Uint8, 4>, FixedSizeList<Uint16, 4>, FixedSizeList<Uint32, 4> |
Component counts must match. For example, FixedSizeList<Uint8, 4> can feed
vec4<f32>, but not vec3<f32>.
For f32 and f16 shader attributes, integer Arrow columns are read through
normalized vertex formats (snorm* for signed integers and unorm* for unsigned
integers).
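The rules in this section can be summarized as a small lookup. The sketch below is a hypothetical re-statement of the table, not the logic getArrowBufferLayout() actually runs; it returns a WebGPU-style vertex format name for a portable pairing, or null when the pairing is not listed above:

```typescript
type ArrowNumericType =
  | 'Float32' | 'Float16' | 'Int8' | 'Uint8' | 'Int16' | 'Uint16' | 'Int32' | 'Uint32';
type ShaderScalarType = 'f32' | 'f16' | 'i32' | 'u32';
type FormatTable = Partial<Record<ArrowNumericType, string>>;

// Float shader attributes read integer columns through normalized formats.
const FLOAT_FORMATS: FormatTable = {
  Float32: 'float32', Float16: 'float16',
  Int8: 'snorm8', Uint8: 'unorm8', Int16: 'snorm16', Uint16: 'unorm16'
};
const SINT_FORMATS: FormatTable = {Int8: 'sint8', Int16: 'sint16', Int32: 'sint32'};
const UINT_FORMATS: FormatTable = {Uint8: 'uint8', Uint16: 'uint16', Uint32: 'uint32'};

function getVertexFormatSketch(
  shaderScalar: ShaderScalarType,
  arrowType: ArrowNumericType,
  components: 1 | 2 | 3 | 4
): string | null {
  let base: string | undefined;
  if (shaderScalar === 'f32' || shaderScalar === 'f16') {
    base = FLOAT_FORMATS[arrowType];
    // The table above lists no Float32 columns for f16 attributes.
    if (shaderScalar === 'f16' && arrowType === 'Float32') {
      base = undefined;
    }
  } else if (shaderScalar === 'i32') {
    base = SINT_FORMATS[arrowType];
  } else {
    base = UINT_FORMATS[arrowType];
  }
  if (!base) {
    return null;
  }
  // Three-component formats are only portable for 32-bit components.
  if (components === 3 && !base.endsWith('32')) {
    return null;
  }
  return components === 1 ? base : `${base}x${components}`;
}

const colorFormat = getVertexFormatSketch('f32', 'Uint8', 4); // 'unorm8x4'
const rejectedFormat = getVertexFormatSketch('f32', 'Uint8', 3); // null
```

The component count is passed separately because it must match the shader attribute exactly; the lookup never widens or narrows a vector.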
WebGPU Portability
WebGPU does not support every vertex format that WebGL can read. In particular,
3-component 8-bit and 16-bit integer-backed columns are not portable. By default,
getArrowBufferLayout() rejects those mappings with an error.
Shaders that declare f16 attributes have an additional capability requirement.
Before creating a WebGPU device, check adapter.features.has('shader-f16'), request
that feature when creating the device, and include enable f16; in WGSL. Without
that feature, WebGPU rejects shader modules that use f16 types.
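The capability check can be sketched as a small helper that decides which features to request. The helper name and the AdapterLike type are hypothetical; the commented lines use the raw WebGPU API (navigator.gpu.requestAdapter() and adapter.requestDevice()), and how a luma.gl device forwards feature requests is not shown here:

```typescript
// Structural stand-in for the part of GPUAdapter this sketch needs.
type AdapterLike = {features: {has(name: string): boolean}};

// Returns the feature names to pass as requiredFeatures, or throws when
// the shader needs f16 but the adapter cannot provide it.
function getRequiredFeaturesForF16(adapter: AdapterLike, shaderUsesF16: boolean): string[] {
  if (!shaderUsesF16) {
    return [];
  }
  if (!adapter.features.has('shader-f16')) {
    throw new Error('This adapter does not support shader-f16');
  }
  return ['shader-f16'];
}

// In a browser with WebGPU, the result would be used roughly like this:
//   const adapter = await navigator.gpu.requestAdapter();
//   const requiredFeatures = getRequiredFeaturesForF16(adapter, true);
//   const device = await adapter.requestDevice({requiredFeatures});
// and the WGSL source would begin with: enable f16;

// A Set has the same has() method, so it can stand in for adapter.features.
const f16Features = getRequiredFeaturesForF16({features: new Set(['shader-f16'])}, true);
```

Applications that cannot rely on the feature may prefer to fall back to f32 attributes rather than fail, since the same Arrow columns can feed either type.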
For WebGL-only use cases, opt in to WebGL-only formats:
const bufferLayout = getArrowBufferLayout(shaderLayout, {
  arrowTable: table,
  allowWebGLOnlyFormats: true
});
For portable WebGPU layouts, prefer Float32 for vec3<f32> attributes or pad
8-bit and 16-bit vector data to four components.
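Padding can be done once on the CPU before the column is built. The sketch below widens tightly packed 3-component Uint8 rows (for example RGB colors) to 4 components so they can be stored as FixedSizeList<Uint8, 4>. The helper name and the default fill value of 255 are assumptions made for illustration:

```typescript
// Expands rows of 3 x uint8 into rows of 4 x uint8, writing `fill`
// into the added fourth component of every row.
function padVec3ToVec4(values: Uint8Array, fill: number = 255): Uint8Array {
  if (values.length % 3 !== 0) {
    throw new Error('Input length must be a multiple of 3');
  }
  const rowCount = values.length / 3;
  const padded = new Uint8Array(rowCount * 4);
  for (let row = 0; row < rowCount; row++) {
    padded[row * 4 + 0] = values[row * 3 + 0];
    padded[row * 4 + 1] = values[row * 3 + 1];
    padded[row * 4 + 2] = values[row * 3 + 2];
    padded[row * 4 + 3] = fill;
  }
  return padded;
}

const paddedColors = padVec3ToVec4(new Uint8Array([10, 20, 30, 40, 50, 60]));
```

The shader attribute then needs to be declared with four components as well, since component counts must match between the column and the attribute.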