General Purpose Float Calculations via GPU / GLSL / Shaders in Processing.org

Update: These encoding methods look promising but I've still not got a 100% solution.

I've had some success using the GPU to do general-purpose calculations in Processing.  Processing now supports shaders, although the documentation is sparse.  This tutorial is worth looking at, and I found the Book of Shaders very useful too, although it is still a work in progress.

Concept

Before we go anywhere, there are already some drawbacks.  This example seems to break with (very) large numbers, perhaps because Processing stores RGBA colours within 32 bits.  This method also drops the fractional component.  Therefore, to retain fractional precision (such as working with numbers in the range 0:1), you need to multiply your numbers up (by 10, or 100, etc.) to move the decimal point, and consistently divide them down again after the GPU operation.  Also, it doesn't support negative numbers.  Just a few drawbacks!
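The multiply-up/divide-down trick can be sketched in plain Java (the class and method names, and the scale factor of 100, are my own illustrative choices, not part of the Processing API):

```java
public class FixedPointScale {
    // The encoding pipeline only preserves whole numbers, so scale
    // fractional values up before encoding and down after decoding.
    static final float SCALE = 100.0f; // keeps 2 decimal places

    static float toGpu(float value) {
        // Shift the decimal point right before sending; the floor()
        // mimics the fraction loss in the encode step.
        return (float) Math.floor(value * SCALE);
    }

    static float fromGpu(float value) {
        // Shift the decimal point back after reading the result out.
        return value / SCALE;
    }

    public static void main(String[] args) {
        float original = 0.75f;            // a value in the range 0:1
        float sent = toGpu(original);      // 75.0 survives the integer-only round trip
        System.out.println(fromGpu(sent)); // prints 0.75
    }
}
```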

The meat of the problem is how to get data back from a GPU.  The GPU is designed to write images to a screen buffer.  It is intended as a fast one-way process, where the GPU crunches some numbers and writes out to the screen.  If we can subvert that process we can make use of the parallel and hardware optimised maths processors in the GPU.

The code I provide below is just my initial foray; it is by no means comprehensive.  I'm still trying to figure it out myself.  Debugging GLSL code is hard because the GPU, as a peripheral, does not pass any errors back.  It does seem to fail gracefully (i.e., at worst, it renders nothing to the screen), but it leaves you wondering which recent change in your code made it go to a black screen.  To debug, you basically have to inspect variable values with 'if' statements, and change the colour of the screen to represent true or false values.  Detective work!

I'm doing a lot of image processing in my research at the moment.  To develop this code I've worked from the MaskImage shader example within Processing - it would be worth checking that out.  It uses a 2D texture shader written as just a frag shader, so it is hopefully simple enough to understand.

The key requirements for me were to be able to send specific values to the shaders, and for the shaders to then send the answers from calculations back.  I think it is best to imagine the shaders as an idiot swarm of processors.  They all have to run exactly the same code, and you have no control over which perform when.  The shaders will complete a whole image's worth of calculations, but they can't intercommunicate or send information back while the operation happens.  That is the big constraint of GPU programming as I understand it.  It is an uncontrolled one-way process.  There are probably lots of tricks possible with branching code to get shaders to do different things, but you'd still have no control over timing.

For our complete general calculation process to happen:

1. We predetermine the calculation operation of the shaders.
2. We set up the information available to the shaders (beforehand, because it is a one-way process).
3. We then have the shaders perform their individual calculations on the image buffer.
4. The shaders store their results by modifying pixels within a returned image buffer.

Therefore, within Processing we capture the resultant image from the shaders, and then inspect each pixel for that shader's answer.  I hope this makes some sense.  I am working on the assumption that there are as many shaders in operation as there are pixels in the returned image buffer.

The core of this given code, then, is to squash (encode) data into the pixel data of a source image, and decode it back again.  In Processing an image pixel is the color datatype, which stores Alpha, Red, Green and Blue as 32 bits of information (AAAAAAAARRRRRRRRGGGGGGGGBBBBBBBB, see here).  Processing stores each of R, G, B and A as a number between 0 and 255.  GLSL, as I understand it, stores and reads these values as numbers between 0 and 1.  This difference is the most noteworthy of my troubles.  I've not been able to use the alpha channel; it seems that PGraphics represents alpha with just one bit (on, off).  Here is a diagram:

If you are doing some form of image processing you may not really need to encode data into a source image to begin with.  You just pass the GPU your source image, and you are only interested in decoding results back out of the returned image buffer.  However, sending data both ways is a good test, since the GPU doesn't normally send useful numerical data back.  If we can get information to flow both ways, it can become a debug method in more complex use.  As a side note to the above diagram: if you encode data into an image buffer, it is unlikely you'll get anything that resembles a sensible visual image.  The screenshot shows the screen buffer rendering such encoded data.
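To make the 32-bit pixel layout concrete, here is a plain-Java sketch of unpacking one packed pixel into its four 0:255 channels, and the divide-by-255 mapping that GLSL applies (the example colour value is arbitrary):

```java
public class PixelLayout {
    public static void main(String[] args) {
        // One packed pixel: AAAAAAAARRRRRRRRGGGGGGGGBBBBBBBB
        int pixel = 0xFF2080C0; // alpha 255, red 0x20, green 0x80, blue 0xC0

        // Processing-style channels, each in the range 0..255.
        int a = (pixel >> 24) & 0xFF; // 255
        int r = (pixel >> 16) & 0xFF; // 32
        int g = (pixel >>  8) & 0xFF; // 128
        int b =  pixel        & 0xFF; // 192

        // GLSL sees the same channels mapped into the 0..1 range.
        float rGlsl = r / 255.0f;

        System.out.println(a + " " + r + " " + g + " " + b); // 255 32 128 192
        System.out.println(rGlsl); // roughly 0.125
    }
}
```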

First Test Example

I found a GLSL example of encoding a float into a vec3 datatype, here.  So I rewrote this to encode a float into color() in Processing, to give that two-way functionality.  I found that the example used the value of 256 for its encoding operations, but this caused error artefacts in my results.  The value of 255 works consistently for me.  This requires a bit of investigation, and may have something to do with the very-large-number problem.
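To see the base-255 scheme in action before the full sketch, here is the same arithmetic in plain Java, mirroring the encodeFloatToColor()/decodeColorToFloat() functions that appear in the sketch below (the class name and int-array return are my own, outside the Processing API):

```java
public class Base255Codec {
    // Mirror of the Processing encode: split f into three
    // base-255 "digits", stored as blue, green, red.
    static int[] encode(float f) {
        int b = (int) Math.floor(f / 255.0 / 255.0);
        int g = (int) Math.floor((f - b * 255.0 * 255.0) / 255.0);
        int r = (int) Math.floor(f - b * 255.0 * 255.0 - g * 255.0);
        return new int[] { r, g, b };
    }

    // Mirror of the Processing decode: recombine the digits.
    static float decode(int r, int g, int b) {
        return r + g * 255.0f + b * 255.0f * 255.0f;
    }

    public static void main(String[] args) {
        int[] c = encode(1234.5f);
        // 1234.5 -> b = 0, g = floor(1234.5/255) = 4, r = floor(1234.5 - 1020) = 214
        System.out.println(c[0] + "," + c[1] + "," + c[2]); // 214,4,0
        // The fractional .5 is lost to floor(): we get 1234 back, not 1234.5.
        System.out.println(decode(c[0], c[1], c[2])); // 1234.0
    }
}
```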

We then need the corresponding functions within the GLSL shader code.  In general, the code for a shader looks complete: it has a main() and it can have its own support functions.  It runs independently of Processing, executing on each of the GPU's shader units.  You have to store the GLSL shader code within the data folder of your Processing sketch, or else reference its file path directly when you initialise the shader in Processing.  This will make more sense when we look at the whole code.

It is important to note that the GLSL-side decode has an extra * 255 operation, because RGBA values are stored in GLSL between 0 and 1.  We bring that 0:1 value up to a 0:255 value to be consistent with the Processing encode and decode operations.  Processing has more user-friendly documentation available, so I decided to make the GLSL side consistent with Processing, rather than make Processing consistent with GLSL.

Here is a simple test program and the complementary GLSL shader code.  In this code we set up a PImage buffer, write two distinct float values to it, draw it to a PGraphics buffer using the shader, and then inspect the values returned.  Note, we use a PGraphics buffer, rather than drawing directly to the screen, because a PGraphics buffer will later let us use shaders outside of Processing's draw-loop framework.  In the GLSL code, the shader simply does a decode and then an encode, so we can see if the routine is working by comparing the sent and received values:

Processing code:

// Processing Code
PShader calcShader;     // our GLSL fragment shader
PGraphics shaderOutput; // image buffer for shader results
PImage data_set;        // input image buffer for shaders

void setup() {

  // Create a window
  size( 800, 800, P2D );

  // Create an image to store our data.
  // You could think of this as a 2D array in
  // computer memory that we are going to pass
  // to the GPU. We just have to encode data
  // in to the Red, Green, Blue, Alpha (RGBA)
  // format per pixel.
  data_set = createImage( 800, 800, RGB );

  // Create a graphics buffer for the shader to
  // write to, because we want to catch the output
  // data from the GPU rather than render to screen.
  // We can read this image (array) for per-pixel results.
  shaderOutput = createGraphics( 800, 800, P2D );

  // Initialise our shader.
  // calc.glsl is another piece of code written to
  // run on the GPU shaders. It needs to be in
  // the /data/ folder of your sketch.
  calcShader = loadShader( "calc.glsl" );

  // Populate our data_set image with test data for
  // this example.
  int x, y;

  for( y = 0; y < height; y++ ) {

    // Set a value to pixels of the left half of the image.
    for( x = 0; x < width/2; x++ ) {
      data_set.set( x, y, encodeFloatToColor( 1234.5 ) );
    }

    // Set a value to pixels of the right half of the image.
    for( x = width/2; x < width; x++ ) {
      data_set.set( x, y, encodeFloatToColor( 9876.5 ) );
    }
  }

  // Finalise image buffer.
  data_set.updatePixels();

  // The data_set image itself is passed to the shader
  // as its texture when we draw it in draw() below.
}

void draw() {

  // Ready our PGraphics buffer and activate the shader.
  shaderOutput.beginDraw();
  shaderOutput.shader( calcShader );

  // Draw our data image; the shader processes every pixel.
  shaderOutput.image( data_set, 0, 0 );

  // Finalise PGraphics buffer.
  shaderOutput.endDraw();

  // Draw buffer to screen, just to see.
  image( shaderOutput, 0, 0 );

  // For some reason, PGraphics.get(x,y) was not working
  // for me; transferring the whole buffer to a PImage works.
  PImage img = shaderOutput.get();

  println( "Top left sent: " + decodeColorToFloat( data_set.get( 0, 0 ) ) );
  println( "Top left recv: " + decodeColorToFloat( img.get( 0, 0 ) ) );

  println( "Btm right sent: " + decodeColorToFloat( data_set.get( data_set.width-1, data_set.height-1 ) ) );
  println( "Btm right recv: " + decodeColorToFloat( img.get( data_set.width-1, data_set.height-1 ) ) );

  // Nothing more to do.
  exit();
}

// http://stackoverflow.com/questions/6893302/decode-rgb-value-to-single-float-without-bit-shift-in-glsl
//
color encodeFloatToColor( float f ) {
  float r, g, b;
  b = floor( f / 255.0 / 255.0 );
  g = floor( ( f - ( b * 255.0 * 255.0 ) ) / 255.0 );
  r = floor( f - ( b * 255.0 * 255.0 ) - ( g * 255.0 ) );

  // Processing RGB works in the range 0:255,
  // so no need for /255, unlike the GLSL encode.
  return color( r, g, b, 255 );
}

float decodeColorToFloat( color c ) {
  float r, g, b;
  r = red( c );
  g = green( c );
  b = blue( c );

  float result;
  result = r + ( g * 255 ) + ( b * 255 * 255 );
  return result;
}

GLSL Code (save into your sketch data folder as calc.glsl):

//
// this help thread gave the pack/unpack code:
// http://stackoverflow.com/questions/6893302/decode-rgb-value-to-single-float-without-bit-shift-in-glsl
//

#ifdef GL_ES
precision highp float; // stack exchange thread advises highp for frag shaders
precision highp int;
#endif

uniform sampler2D texture;     // set by Processing
varying vec4 vertTexCoord;     // autoset by GLSL

vec3 encodeFloatToVec3( const in float f ) {
  vec3 color;
  color.b = floor( f / 255.0 / 255.0 );
  color.g = floor( ( f - ( color.b * 255.0 * 255.0 ) ) / 255.0 );
  color.r = floor( f - ( color.b * 255.0 * 255.0 ) - ( color.g * 255.0 ) );

  // color.rgb are in the range 0:255, but the GLSL
  // buffer needs values in the 0:1 range.
  // Divide through by 255.
  return ( color / 255.0 );
}

// This function has an extra *255 for RGB compared
// to the Processing sketch because GLSL operates RGB
// in the 0:1 range, Processing in 0:255.
float decodeVec3ToFloat( vec3 color ) {
  float result;
  result  = float( color.r * 255.0 );
  result += float( color.g * 255.0 * 255.0 );
  result += float( color.b * 255.0 * 255.0 * 255.0 );
  return result;
}

void main() {

  // Read this pixel's value by using the texture
  // set by Processing and this fragment's vertTexCoord.
  vec3 texColor = texture2D( texture, vertTexCoord.xy ).rgb;

  // Decode the pixel to a float.
  float in_value = decodeVec3ToFloat( texColor );

  // Re-encode the value back to an RGB value.
  vec3 out_value = encodeFloatToVec3( in_value );

  // Write it back to the screen.
  gl_FragColor.rgb = out_value;

  // If alpha is set to zero, nothing is drawn,
  // and Processing reads the colour as (0,0,0).
  // Make sure alpha is 1.0 (i.e., fully opaque).
  gl_FragColor.a = 1.0;
}

This is not perfect.  Here are a few interesting ways to break the encode and decode algorithm.  Instead of using encodeFloatToColor(), you can place a colour into the data_set image buffer directly.  If you use color( 255, 0, 0, 255 ), color( 0, 255, 0, 255 ) or color( 0, 0, 255, 255 ), the round trip does not come back properly.  I haven't figured out a solution yet.  That means there are at least 3 distinct colour values which won't encode and decode properly, and perhaps more.
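Using the same plain-Java mirror of the codec, you can see what goes wrong with pure red: the decode is fine, but re-encoding does not reproduce the original channel values, which suggests a base-255 "digit" can only represent 0..254 and a raw channel value of 255 carries over into the next digit (class and method names here are mine, for illustration):

```java
public class RoundTripBreak {
    // Mirror of the Processing encodeFloatToColor().
    static int[] encode(float f) {
        int b = (int) Math.floor(f / 255.0 / 255.0);
        int g = (int) Math.floor((f - b * 255.0 * 255.0) / 255.0);
        int r = (int) Math.floor(f - b * 255.0 * 255.0 - g * 255.0);
        return new int[] { r, g, b };
    }

    // Mirror of the Processing decodeColorToFloat().
    static float decode(int r, int g, int b) {
        return r + g * 255.0f + b * 255.0f * 255.0f;
    }

    public static void main(String[] args) {
        // Pure red placed in the buffer directly: (r,g,b) = (255,0,0).
        float v = decode(255, 0, 0); // 255.0

        // Re-encoding 255 carries into the green digit:
        // b = 0, g = floor(255/255) = 1, r = 255 - 255 = 0.
        int[] c = encode(v);
        System.out.println(c[0] + "," + c[1] + "," + c[2]); // 0,1,0 -- not 255,0,0
    }
}
```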

Second Test Example

For this example we will actually perform a calculation with the shaders, to see if we get a sensible result back.  We will set the shader with a reference coordinate, and then ask each shader to calculate its respective distance from it.  This is something I need to do, although I will be doing it in colour-space models on a source image, rather than just pixel coordinates.  Let's keep this simple, though.  To do this, we introduce another variable into the GLSL code that we can set from Processing.  We do this with 'uniform vec2 testCoord;'.  This has to be a uniform, because each shader gets the same read-only value from Processing.  Using the code below, you should see rings of red which follow the mouse in real time:

Processing Code:

PShader distShader;     // our GLSL fragment shader
PGraphics shaderOutput; // image buffer for shader results
PImage data_set;        // input image buffer for shaders

void setup() {

  // Create a window
  size( 800, 800, P2D );

  // Create an image to store our data.
  // You could think of this as a 2D array in
  // computer memory that we are going to pass
  // to the GPU. We just have to encode data
  // in to the Red, Green, Blue, Alpha (RGBA)
  // format per pixel.
  data_set = createImage( 800, 800, RGB );

  // Create a graphics buffer for the shader to
  // write to, because we want to catch the output
  // data from the GPU rather than render to screen.
  // We can read this image (array) for per-pixel results.
  shaderOutput = createGraphics( 800, 800, P2D );

  // Initialise our shader.
  // dist.glsl is another piece of code written to
  // run on the GPU shaders. It needs to be in
  // the /data/ folder of your sketch.
  distShader = loadShader( "dist.glsl" );

  // Set the initial reference as the centre of the
  // screen and/or image buffer. Cast to float so the
  // uniform is set as a vec2, not an ivec2.
  distShader.set( "testCoord", float( width/2 ), float( height/2 ) );
}

void draw() {

  // If the mouse is sensible, change the reference point.
  if( mouseX >= 0 && mouseY >= 0 ) {

    // GLSL buffer Y axis is inverted relative to Processing!
    distShader.set( "testCoord", float( mouseX ), float( height - mouseY ) );
  }

  // Ready our PGraphics buffer and activate the shader.
  shaderOutput.beginDraw();
  shaderOutput.shader( distShader );

  // Draw an image.
  // Even though we haven't defined a dataset,
  // we need to give the shaders something to
  // render so they action their calculations.
  shaderOutput.image( data_set, 0, 0 );

  // Finalise PGraphics buffer.
  shaderOutput.endDraw();

  // Draw buffer to screen, just to see.
  image( shaderOutput, 0, 0 );
}

GLSL Code (save into sketch data folder as dist.glsl):

//
// this help thread gave the pack/unpack code:
// http://stackoverflow.com/questions/6893302/decode-rgb-value-to-single-float-without-bit-shift-in-glsl
//

#ifdef GL_ES
precision highp float; // stack exchange thread advises highp for frag shaders
precision highp int;
#endif

varying vec4 vertTexCoord;
uniform vec2 testCoord;

vec3 encodeFloatToVec3( const in float f ) {
  vec3 color;
  color.b = floor( f / 255.0 / 255.0 );
  color.g = floor( ( f - ( color.b * 255.0 * 255.0 ) ) / 255.0 );
  color.r = floor( f - ( color.b * 255.0 * 255.0 ) - ( color.g * 255.0 ) );

  // color.rgb are in the range 0:255, but the GLSL
  // buffer needs values in the 0:1 range.
  // Divide through by 255.
  return ( color / 255.0 );
}

// This function has an extra *255 for RGB compared
// to the Processing sketch because GLSL operates RGB
// in the 0:1 range, Processing in 0:255.
float decodeVec3ToFloat( vec3 color ) {
  float result;
  result  = float( color.r * 255.0 );
  result += float( color.g * 255.0 * 255.0 );
  result += float( color.b * 255.0 * 255.0 * 255.0 );
  return result;
}

void main() {

  // Calculate distance from this fragment to the reference point.
  float d = distance( gl_FragCoord.xy, testCoord.xy );

  // Encode result.
  vec3 out_value = encodeFloatToVec3( d );

  // Write as pixel.
  gl_FragColor.rgb = out_value;

  // If alpha is set to zero, nothing is drawn,
  // and Processing reads the colour as (0,0,0).
  gl_FragColor.a = 1.0;
}

I see this type of output.  The pattern makes sense: the distance increases radially away from the mouse cursor, giving a patterned pixel encoding:

Test Example 3

Another way to inspect our results is to get Processing to write out an ASCII file that we can read into OpenSCAD.  It is laborious to inspect every pixel of an image, so instead we can do it visually, using OpenSCAD's surface() command to draw each decoded pixel value as a height map.

Processing code:

PShader distShader;     // our GLSL fragment shader
PGraphics shaderOutput; // image buffer for shader results
PImage data_set;        // input image buffer for shaders

void setup() {

  // Create a window
  size( 800, 800, P2D );

  // Create an image to store our data.
  // You could think of this as a 2D array in
  // computer memory that we are going to pass
  // to the GPU. We just have to encode data
  // in to the Red, Green, Blue, Alpha (RGBA)
  // format per pixel.
  data_set = createImage( 800, 800, RGB );

  // Create a graphics buffer for the shader to
  // write to, because we want to catch the output
  // data from the GPU rather than render to screen.
  // We can read this image (array) for per-pixel results.
  shaderOutput = createGraphics( 800, 800, P2D );

  // Initialise our shader.
  // dist.glsl is another piece of code written to
  // run on the GPU shaders. It needs to be in
  // the /data/ folder of your sketch.
  distShader = loadShader( "dist.glsl" );

  // Set the initial reference as the centre of the
  // screen and/or image buffer.
  distShader.set( "testCoord", float( width/2 ), float( height/2 ) );
}

void draw() {

  // Ready our PGraphics buffer and activate the shader.
  shaderOutput.beginDraw();
  shaderOutput.shader( distShader );

  // Draw an image.
  // Even though we haven't defined a dataset,
  // we need to give the shaders something to
  // render so they action their calculations.
  shaderOutput.image( data_set, 0, 0 );

  // Finalise PGraphics buffer.
  shaderOutput.endDraw();

  // Draw buffer to screen, just to see.
  image( shaderOutput, 0, 0 );

  // Write out data to an ASCII file.
  PImage img = shaderOutput.get();
  int x, y;
  PrintWriter ascii_file = createWriter( "shader_output.dat" );

  // Decode each pixel, writing out to an OpenSCAD dat file
  // for surface rendering.
  for( y = 0; y < img.height; y++ ) {
    for( x = 0; x < img.width; x++ ) {
      color c = img.get( x, y );
      float value = decodeColorToFloat( c );

      // Use nf() to remove the decimal place.
      // OpenSCAD wants blank space between values.
      ascii_file.print( nf( value, 0, 0 ) + " " );
    }

    // Carriage return.
    ascii_file.print( "\n" );
  }

  // Make sure data is written properly.
  ascii_file.flush();
  ascii_file.close();

  // End sketch.
  exit();
}

// http://stackoverflow.com/questions/6893302/decode-rgb-value-to-single-float-without-bit-shift-in-glsl
//
color encodeFloatToColor( float f ) {
  float r, g, b;
  b = floor( f / 255.0 / 255.0 );
  g = floor( ( f - ( b * 255.0 * 255.0 ) ) / 255.0 );
  r = floor( f - ( b * 255.0 * 255.0 ) - ( g * 255.0 ) );

  // Processing RGB works in the range 0:255,
  // so no need for /255, unlike the GLSL encode.
  return color( r, g, b, 255 );
}

float decodeColorToFloat( color c ) {
  float r, g, b;
  r = red( c );
  g = green( c );
  b = blue( c );

  float result;
  result = r + ( g * 255 ) + ( b * 255 * 255 );
  return result;
}