The WebGPU context
The input buffer or buffers. Providing only a buffer here is the same as providing it like this: { inputDataBuffer: buffer }.
inputDataBuffer: The input data on which the scan will be performed (WARNING: workgroup-wise! See the
class description for more details).
The output/reduced sum buffer will be based on the contents of this buffer.
If 'inputMaxBuffer' is NOT provided, the output/reduced max buffer will also be
based on the contents of this buffer.
inputMaxBuffer: If provided, the output/reduced max buffer will be based on the contents of this
buffer. Otherwise, it will be based on 'inputDataBuffer'.
reducedSumBuffer: If provided, the output/reduced sum values will be stored here.
Otherwise, a new buffer will be created to store these values.
reducedMaxBuffer: If provided, the output/reduced max values will be stored here.
Otherwise, a new buffer will be created to store these values.
The output buffers and offsets (useful for debugging)
Scan (prefix-sum) shader that computes the prefix-sum of a given buffer IN PLACE. This means that the values of the given/input buffer will be REPLACED by their scan. Additionally, this shader also finds the max (largest value) in the input buffer.
WARNING: The scan (and also the largest value) is calculated only PER WORKGROUP and NOT for the entire buffer. Example: If the input buffer has 30 elements and each workgroup has 10 elements, the output scan buffer will look like this (in JavaScript-inspired pseudocode):
- [...scan(0,9), ...scan(10,19), ...scan(20,29)]
- where scan() gives an array with the prefix sum of the elements in the given range
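The workgroup-wise behaviour can be illustrated on the CPU. The following is only a rough sketch, not the shader: the function name `workgroupWiseScan` is hypothetical, and it assumes an exclusive scan (each element is replaced by the sum of the elements before it in its workgroup), which the description above does not specify.

```typescript
// CPU reference sketch of a workgroup-wise scan (NOT the actual shader).
// Assumption: the scan is exclusive; the documentation does not say which variant is used.
function workgroupWiseScan(data: number[], workgroupSize: number): number[] {
  const out = data.slice();
  for (let start = 0; start < out.length; start += workgroupSize) {
    const end = Math.min(start + workgroupSize, out.length);
    let running = 0; // restarts at 0 for every workgroup
    for (let i = start; i < end; i++) {
      const value = out[i];
      out[i] = running; // sum of the elements before i within this workgroup
      running += value;
    }
  }
  return out;
}

const sample = [1, 2, 3, 4, 5, 6];
const scanned = workgroupWiseScan(sample, 3);
console.log(scanned); // [0, 1, 3, 0, 4, 9]
```

Note how the running sum restarts at the fourth element: each workgroup is scanned independently of the others.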
This shader will also output reduced SUM and MAX buffers. In the same example with 30 elements and workgroup size of 10, they will both have 3 elements and will look like this:
- Sum buffer: [sum(0,9), sum(10,19), sum(20,29)]
- where sum() gives the sum of the elements in the given range
- Max buffer: [max(0,9), max(10,19), max(20,29)]
- where max() gives the max of the elements in the given range

In order to get the full scan/prefix-sum of the input array, further iterations of this shader are required, followed by additions (see WgCompAddShader). One more iteration of the example with 30 elements and workgroup size of 10 will look like this:
- Input buffer: same as reduced Sum buffer shown above: [sum(0,9), sum(10,19), sum(20,29)] <- now only 3 elements, but workgroup size of 10 remains the same
- Sum buffer: [sum(0,29)] <- now only 1 element
- Max buffer: [max(0,29)] <- now only 1 element

The max buffer now contains the largest value in the entire 30-element input buffer. The sum buffer now contains the sum of all values in the entire 30-element input buffer.

To finish the scan, we now need to work our way up and add the sum (see WgCompAddShader and WgScanAlgorithm):
- 1-element Sum buffer can be discarded
- 3-element Sum buffer is added into the 30-element scan like so:
    add(
        input = [...scan(0,9), ...scan(10,19), ...scan(20,29)], <- length 30
        addend = [sum(0,9), sum(10,19), sum(20,29)], <- length 3
        blockSize = 10 <- same as workgroup size
    );

After this step, the scan will be completed: it will be calculated for the entire input buffer. Additionally, the sum of the entire input buffer will be available in the final 1-element Sum buffer, and the max value of the entire input buffer will also be available in the final 1-element Max buffer.

This entire pipeline/algorithm is implemented in WgScanAlgorithm.
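The multi-pass procedure described here (scan per workgroup, reduce, scan the reduced sums, then add back) can be sketched on the CPU as follows. This is an illustration under stated assumptions, not the WgScanAlgorithm implementation: the names `scanPass`, `addBlockSums` and `scanAll` are hypothetical, the scan is assumed to be exclusive, and the reduced Sum buffer is assumed to be scanned in place on the next pass, so the addend holds the total of all preceding blocks.

```typescript
interface ReducedBuffers { sums: number[]; maxes: number[]; }

// One pass: scans `data` in place per workgroup and returns per-workgroup sum and max.
// Assumption: exclusive scan. `maxInput` plays the role of 'inputMaxBuffer'.
function scanPass(data: number[], maxInput: number[], workgroupSize: number): ReducedBuffers {
  const sums: number[] = [];
  const maxes: number[] = [];
  for (let start = 0; start < data.length; start += workgroupSize) {
    const end = Math.min(start + workgroupSize, data.length);
    let running = 0;
    let largest = -Infinity;
    for (let i = start; i < end; i++) {
      const value = data[i];
      largest = Math.max(largest, maxInput[i]); // read before data[i] is overwritten
      data[i] = running;
      running += value;
    }
    sums.push(running);
    maxes.push(largest);
  }
  return { sums, maxes };
}

// Adds addend[b] to every element of block b (the role WgCompAddShader is described as playing).
function addBlockSums(input: number[], addend: number[], blockSize: number): void {
  for (let i = 0; i < input.length; i++) {
    input[i] += addend[Math.floor(i / blockSize)];
  }
}

// Repeats passes until one sum/max remains, then adds the scanned sums back on the way up.
function scanAll(data: number[], maxInput: number[], workgroupSize: number): { sum: number; max: number } {
  if (workgroupSize < 2) throw new Error("workgroupSize must be at least 2");
  const { sums, maxes } = scanPass(data, maxInput, workgroupSize);
  if (sums.length <= 1) {
    return { sum: sums[0] ?? 0, max: maxes[0] ?? -Infinity };
  }
  const totals = scanAll(sums, maxes, workgroupSize); // `sums` is now scanned in place
  addBlockSums(data, sums, workgroupSize);
  return totals;
}

const values = Array.from({ length: 30 }, (_, i) => i + 1); // 1, 2, ..., 30
const result = scanAll(values, values, 10);
console.log(result);     // { sum: 465, max: 30 }
console.log(values[29]); // 435, the sum of 1..29
```

On the first pass the data buffer doubles as the max input; on later passes the reduced Max buffer of the previous pass is passed instead, which is the case the optional 'inputMaxBuffer' parameter appears to cover.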
Based on this article: https://developer.nvidia.com/gpugems/gpugems3/part-vi-gpu-computing/chapter-39-parallel-prefix-sum-scan-cuda