Graphics Lab @ Korea University (cgvr.korea.ac.kr)
Using Vertex Shader in DirectX 8.1
강 신 진
2002. 02. 05
Overview
• What is Vertex Programming?
• Vertex Programming Assembly Language
• Sample Programs
• Instruction Set
What is Vertex Programming?
Part 1
Traditional Rendering Pipeline
[Figure: traditional graphics pipeline. Stages in order: transform & lighting → setup / rasterizer → texture blending → frame-buffer anti-aliasing]
Vertex Shader Rendering Pipeline
[Figure: the same pipeline with a Vertex Program stage alongside transform & lighting, followed by setup / rasterizer → texture blending → frame-buffer anti-aliasing]
SetVertexShader() switches from standard T&L mode to Vertex Program mode.
What is Possible?
• Complete control of transform and lighting HW
• Complex vertex operations accelerated in HW
• Custom vertex lighting
• Custom skinning and blending
• Custom texture coordinate generation
• Custom texture matrix operations
• Custom vertex computations of your choice
• Offloading vertex computations frees up CPU
What is Possible?
Custom transform, lighting, and skinning
[Figures: Directional Light; Bump Point Lighting; Keyframe Interpolation]
Vertex Programming: Conceptual Overview
• Vertex Attributes: 16x4 registers
• Vertex Program: 128 instructions
• Program Parameters: 96x4 registers, read-only
• Temporary Registers: 12x4 registers, read/write-able
• Vertex Output
Vertex Programming: Conceptual Overview
Register set                   Names                 Access
Vertex Attribute Registers     v[0] v[1] … v[15]     r
Program Parameter Registers    c[0] c[1] … c[95]     r
Temporary Registers            R0 R1 … R10 R11       r/w
Vertex Result Registers        oPos, oD0             w
Address Register               A0.x
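The register layout above can be mirrored on the CPU as a small data structure, which is handy when reasoning about what a shader may read or write. This is only a sketch: `Vec4`, `VertexShaderRegisters`, and `temp` are hypothetical names for illustration, not part of the DirectX API.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// One 4-component float register (x, y, z, w).
struct Vec4 { float x, y, z, w; };

// Hypothetical CPU-side model of the register sets listed on the slide.
struct VertexShaderRegisters {
    std::array<Vec4, 16> v{};  // vertex attribute registers (read-only to the program)
    std::array<Vec4, 96> c{};  // program parameter registers (read-only to the program)
    std::array<Vec4, 12> r{};  // temporary registers (read/write)
    Vec4 oPos{};               // result register: position (write-only)
    Vec4 oD0{};                // result register: diffuse color (write-only)
    int  a0x = 0;              // address register A0.x
};

// Bounds-checked access to temporary register R<i>.
inline Vec4& temp(VertexShaderRegisters& regs, std::size_t i) {
    assert(i < regs.r.size());
    return regs.r[i];
}
```

Only two of the result registers (oPos, oD0) are modelled here because those are the ones the slide names.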
Sample Code
Position & Constant Color
reg c0 = (0,0.5,1.0,2.0)
reg c4-7 = WorldViewProj matrix
reg c8 = constant color
reg v0 = position ( 4x1 vector )
reg v5 = diffuse color
const char SimpleVertexShader0[] =
    "vs.1.0              // Shader version 1.0\n"
    "m4x4 oPos, v0, c4   // emit projected position\n"
    "mov oD0, c8         // Diffuse color = c8\n";
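To make the `m4x4 oPos, v0, c4` line concrete, the sketch below emulates it on the CPU as four 4-component dot products against consecutive constant registers (c4 to c7), which is the same arithmetic the later samples spell out with four `dp4` instructions. `Vec4`, `dp4`, and `m4x4` are hypothetical helper names, and the matrix is assumed to be stored one row per constant register.

```cpp
#include <array>

struct Vec4 { float x, y, z, w; };

// dp4: four-component dot product of two registers.
inline float dp4(const Vec4& a, const Vec4& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z + a.w * b.w;
}

// m4x4 dest, src, cN: one dp4 per output component, using registers cN..cN+3.
inline Vec4 m4x4(const Vec4& src, const std::array<Vec4, 4>& rows) {
    return { dp4(src, rows[0]),    // dest.x
             dp4(src, rows[1]),    // dest.y
             dp4(src, rows[2]),    // dest.z
             dp4(src, rows[3]) };  // dest.w
}
```

With rows holding a translation by (2, 3, 4), the position (1, 1, 1, 1) maps to (3, 4, 5, 1).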
What is Vertex Programming?
Vertex Program
• Assembly language interface to T&L unit
• GPU instruction set to perform all vertex math
• Reads an untransformed, unlit vertex
• Creates a transformed vertex
• Optionally:
    • Lights a vertex
    • Creates texture coordinates
    • Creates fog coordinates
    • Creates point sizes
What is Vertex Programming?
Vertex Program
• Does not create or delete vertices
    • 1 vertex in and 1 vertex out
• No topological information provided
    • No edge, face, nor neighboring vertex info
• Dynamically loadable
Vertex Programming Assembly Language
Part 2
Assembly Language Format
Instruction Format:
Opcode dst, [-]s0 [,[-]s1 [,[-]s2]]; #comment
    Opcode        instruction name
    dst           destination register
    s0, s1, s2    source registers 0, 1, and 2
Example:
MOV r1, r2
    (R1.xyzw receives R2.xyzw)
Assembly Example
Simple Example:
MOV R1, R2;

before:
       x      y      z      w
R1    0.0    0.0    0.0    0.0
R2    7.0    3.0    6.0    2.0

after:
       x      y      z      w
R1    7.0    3.0    6.0    2.0
R2    7.0    3.0    6.0    2.0
Assembly Example
Source registers can be negated:
MOV R1, -R2;

before:
       x      y      z      w
R1    0.0    0.0    0.0    0.0
R2    7.0    3.0    6.0    2.0

after:
       x      y      z      w
R1   -7.0   -3.0   -6.0   -2.0
R2    7.0    3.0    6.0    2.0
Masking
Destination register can mask which components are written to:
    R1       write all components
    R1.x     write only x component
    R1.xw    write only x, w components
Masking
Destination register masking:
MOV R1.xw, -R2;

before:
       x      y      z      w
R1    0.0    0.0    0.0    0.0
R2    7.0    3.0    6.0    2.0

after:
       x      y      z      w
R1   -7.0    0.0    0.0   -2.0
R2    7.0    3.0    6.0    2.0
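The three MOV variants shown so far (plain copy, negated source, masked destination) can be emulated in a few lines. This is a sketch under assumptions: `Vec4`, `WriteMask`, `negate`, and `mov` are hypothetical names chosen for illustration, and the bit values of the mask are arbitrary.

```cpp
struct Vec4 { float x, y, z, w; };

// One bit per destination component; XYZW writes all four.
enum WriteMask : unsigned { X = 1, Y = 2, Z = 4, W = 8, XYZW = 15 };

// Source negation, as in "-R2".
inline Vec4 negate(const Vec4& s) { return { -s.x, -s.y, -s.z, -s.w }; }

// MOV dst.mask, src: only the components selected by the mask are written.
inline void mov(Vec4& dst, const Vec4& src, unsigned mask = XYZW) {
    if (mask & X) dst.x = src.x;
    if (mask & Y) dst.y = src.y;
    if (mask & Z) dst.z = src.z;
    if (mask & W) dst.w = src.w;
}
```

Calling `mov(R1, negate(R2), X | W)` with R1 = (0, 0, 0, 0) and R2 = (7, 3, 6, 2) reproduces the table above: R1 becomes (-7, 0, 0, -2) and R2 is unchanged.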
All Instructions
There are 17 instructions in total:
• ARL • MOV • MUL • ADD • MAD • RCP
• RSQ • DP3 • DP4 • DST • MIN • MAX
• SLT • SGE • EXP • LOG • LIT
Sample Program
Part 3
Vertex Shader Framework in DX 8.1
Step 1: Declare the vertex data
Step 2: Design the shader functionality
Step 3: Check for vertex shader support
Step 4: Declare the shader registers
Step 5: Create the shader
Step 6: Render the output pixels
Step 1
Declare the vertex data
struct CUSTOMVERTEX
{
    FLOAT x, y, z;        // untransformed position
    DWORD diffuseColor;   // packed diffuse color
};
#define D3DFVF_CUSTOMVERTEX (D3DFVF_XYZ|D3DFVF_DIFFUSE)
CUSTOMVERTEX g_Vertices[] =
{
    { -1.0f, -1.0f, 0.0f, 0xffff0000 },
    { +1.0f, -1.0f, 0.0f, 0xff00ff00 },
    { +1.0f, +1.0f, 0.0f, 0xff0000ff },
    { -1.0f, +1.0f, 0.0f, 0xffffff00 },
};
Basic Instruction (Step 2)
DP3: Three-Component Dot Product
Function:
Computes the three-component (x,y,z) dot product of two source vectors and replicates the result across the destination register.
Syntax:
DP3 dest, src0, src1;
Basic Instruction (Step 2)
DP3 Example:
DP3 R1, R6, R6;

before:
       x      y      z      w
R6    3.0    2.0    1.0    1.0
R1    0.0    0.0    0.0    0.0

after:
       x      y      z      w
R6    3.0    2.0    1.0    1.0
R1   14.0   14.0   14.0   14.0
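The DP3 example can be checked with a short CPU-side emulation: only x, y, and z enter the sum, and the scalar result is replicated into every destination component. `Vec4` and `dp3` are hypothetical names used for this sketch.

```cpp
struct Vec4 { float x, y, z, w; };

// DP3 dest, src0, src1: three-component dot product, replicated across dest.
inline Vec4 dp3(const Vec4& a, const Vec4& b) {
    const float d = a.x * b.x + a.y * b.y + a.z * b.z;  // w is ignored
    return { d, d, d, d };
}
```

For R6 = (3, 2, 1, 1), `dp3(R6, R6)` gives 3*3 + 2*2 + 1*1 = 14 in all four components, matching the table.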
Sample Code – 1 (Step 2)
Position & Diffuse Color
reg c0 = (0,0.5,1.0,2.0)
reg c4-7 = WorldViewProj matrix
reg c8 = constant color
reg v0 = position ( 4x1 vector )
reg v5 = diffuse color
const char SimpleVertexShader1[] =
    "vs.1.0                // Shader version 1.0\n"
    "dp4 oPos.x, v0, c4    // emit projected x position\n"
    "dp4 oPos.y, v0, c5    // emit projected y position\n"
    "dp4 oPos.z, v0, c6    // emit projected z position\n"
    "dp4 oPos.w, v0, c7    // emit projected w position\n"
    "mov oD0, v5           // Diffuse color = vertex color\n";
Sample Code – 2 (Step 2)
Position & Texture
reg c0 = (0,0.5,1.0,2.0)
reg c4-7 = WorldViewProj matrix
reg c8 = constant color
reg v0 = position ( 4x1 vector )
reg v5 = diffuse color
reg v7 = texcoords ( 2x1 vector )
const char SimpleVertexShader2[] =
    "vs.1.0                // Shader version 1.0\n"
    "dp4 oPos.x, v0, c4    // emit projected x position\n"
    "dp4 oPos.y, v0, c5    // emit projected y position\n"
    "dp4 oPos.z, v0, c6    // emit projected z position\n"
    "dp4 oPos.w, v0, c7    // emit projected w position\n"
    "mov oT0.xy, v7        // copy texcoords\n";
Sample Code – 3 (Step 2)
Position & Lighting
reg c0 = (0,0.5,1.0,2.0)
reg c4-7 = WorldViewProj matrix
reg c8 = constant color
reg v0 = position ( 4x1 vector )
reg v5 = diffuse color
reg v7 = texcoords ( 2x1 vector )
const char SimpleVertexShader3[] =
    "vs.1.0                // Shader version 1.0\n"
    "dp4 oPos.x, v0, c4    // emit projected x position\n"
    "dp4 oPos.y, v0, c5    // emit projected y position\n"
    "dp4 oPos.z, v0, c6    // emit projected z position\n"
    "dp4 oPos.w, v0, c7    // emit projected w position\n"
    "dp3 r0.x, v3, c12     // N dot L in world space\n"
    "mul oD0, r0.x, v5     // Calculate color intensity\n"
    "mov oT0.xy, v7        // copy texcoords\n";
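The lighting part of this shader is just two instructions, and a CPU-side sketch makes the arithmetic explicit. It assumes, based on the "N dot L" comment, that v3 holds the vertex normal and c12 the light direction; neither is listed among the register annotations above. `Vec4`, `dot3`, and `diffuseLighting` are hypothetical names.

```cpp
struct Vec4 { float x, y, z, w; };

// Three-component dot product (what dp3 computes).
inline float dot3(const Vec4& a, const Vec4& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Emulates:  dp3 r0.x, v3, c12   and   mul oD0, r0.x, v5
inline Vec4 diffuseLighting(const Vec4& normal,       // assumed v3
                            const Vec4& lightDir,     // assumed c12
                            const Vec4& vertexColor)  // v5
{
    const float nDotL = dot3(normal, lightDir);       // r0.x
    // r0.x is used as a scalar, so every color channel is scaled by N dot L.
    return { nDotL * vertexColor.x, nDotL * vertexColor.y,
             nDotL * vertexColor.z, nDotL * vertexColor.w };
}
```

Like the shader, this sketch does not clamp N dot L, so a vertex facing away from the light yields a negative product.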
Step 3
Check for vertex shader support
D3DCAPS8 caps;
m_pd3dDevice->GetDeviceCaps(&caps);
if( D3DSHADER_VERSION_MAJOR( caps.VertexShaderVersion ) < 1 )
    return E_FAIL;
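The check above relies on how the shader version is packed into a single DWORD. The sketch below restates that packing with locally defined helpers so it compiles without the DirectX headers. The encoding (0xFFFE in the high word, major version in bits 8 to 15, minor version in bits 0 to 7) is an assumption about the DirectX 8 header macros and should be verified against the SDK; `vsVersion`, `versionMajor`, `versionMinor`, and `supportsVertexShader` are hypothetical names.

```cpp
#include <cstdint>

// Assumed packing of a vertex shader version number.
constexpr std::uint32_t vsVersion(unsigned major, unsigned minor) {
    return 0xFFFE0000u | (major << 8) | minor;
}

// Extract the major / minor parts from a packed version.
constexpr unsigned versionMajor(std::uint32_t v) { return (v >> 8) & 0xFFu; }
constexpr unsigned versionMinor(std::uint32_t v) { return v & 0xFFu; }

// True if the reported capability is at least the requested version.
constexpr bool supportsVertexShader(std::uint32_t reported,
                                    unsigned major, unsigned minor) {
    return reported >= vsVersion(major, minor);
}
```

Comparing the whole packed value, rather than only the major part, also distinguishes minor versions such as 1.0 and 1.1.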
Step 4
Declare the shader registers
DWORD dwDecl[] =
{
    D3DVSD_STREAM(0),
    D3DVSD_REG( D3DVSDE_POSITION, D3DVSDT_FLOAT3 ),
    D3DVSD_REG( D3DVSDE_DIFFUSE, D3DVSDT_D3DCOLOR ),
    D3DVSD_END()
};
Step 5
Create the shader
TCHAR strPath[512];
LPD3DXBUFFER pCode;
DXUtil_FindMediaFile( strPath, _T("VertexShader.vsh") );
D3DXAssembleShaderFromFile( strPath, 0, NULL, &pCode, NULL );
m_pd3dDevice->CreateVertexShader( dwDecl, (DWORD*)pCode->GetBufferPointer(), &m_hVertexShader, 0 );
pCode->Release();
Step 6
Render the output pixels
m_pd3dDevice->SetVertexShaderConstant( 0, &mat, 4 );     // matrix into c0..c3
float color[4] = {0,1,0,0};
m_pd3dDevice->SetVertexShaderConstant( 4, &color, 1 );   // color into c4
m_pd3dDevice->SetStreamSource( 0, m_pQuadVB, sizeof(CUSTOMVERTEX) );
m_pd3dDevice->SetVertexShader( m_hVertexShader );
m_pd3dDevice->DrawPrimitive( D3DPT_TRIANGLEFAN, 0, 2 );  // 2 triangles
Demo Program
Performance
For optimal performance
• Be clever
• Exploit vector parallelism (e.g. replace 4 scalar adds with a vector add)
• Swizzle and negate away (no performance penalty for doing so)
• Use LIT and DST effectively
Summary – Vertex Programs
Increased programmability
• Customizable engine for transform, lighting, texture coordinate generation, and more.
• Facilitates setup for per-fragment shading.
• Allows animation/deformation through key-frame interpolation and skinning.
Accelerated in Future Generation GPUs
• Offloads CPU tasks to GPU yielding higher performance.
The Instruction Set
Appendix
RCP
RCP: Reciprocal
Function:
Inverts the value of the source and replicates the result across the destination register.
Syntax:
RCP dest, src0.C;
where ‘C’ is x, y, z, or w
RCP Example:
RCP R1, R2.w;

before:
       x      y      z      w
R2    7.0    3.0    6.0    2.0
R1    0.0    0.0    0.0    0.0

after:
       x      y      z      w
R2    7.0    3.0    6.0    2.0
R1    0.5    0.5    0.5    0.5
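A minimal emulation of RCP, assuming the scalar source component has already been selected (here R2.w): the reciprocal is computed once and replicated. `Vec4` and `rcp` are hypothetical names, and the sketch does not model how the hardware treats a zero source.

```cpp
struct Vec4 { float x, y, z, w; };

// RCP dest, src0.C: reciprocal of one scalar, replicated across dest.
inline Vec4 rcp(float s) {
    const float r = 1.0f / s;
    return { r, r, r, r };
}
```

`rcp(R2.w)` with R2.w = 2.0 yields (0.5, 0.5, 0.5, 0.5), as in the table.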
RSQ
RSQ: Reciprocal Square Root
Function:
Computes the inverse square root of the absolute value of the source scalar and replicates the result across the destination register.
Syntax:
RSQ dest, src0.C;
where ‘C’ is x, y, z, or w
RSQ Example:
RSQ R1.x, R5.x;

before:
       x      y      z      w
R5   -4.0    3.0    7.0    9.0
R1    0.0    0.0    0.0    0.0

after:
       x      y      z      w
R5   -4.0    3.0    7.0    9.0
R1    0.5    0.0    0.0    0.0
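The RSQ example combines two behaviours described on the slides: the absolute value is taken before the square root, and the `.x` write mask leaves the other destination components untouched. The sketch below follows that description; `Vec4` and `rsq` are hypothetical names.

```cpp
#include <cmath>

struct Vec4 { float x, y, z, w; };

// RSQ on one scalar: 1 / sqrt(|s|), following the slide's definition.
inline float rsq(float s) {
    return 1.0f / std::sqrt(std::fabs(s));
}
```

Writing only the x component, `R1.x = rsq(R5.x)` with R5.x = -4.0 gives 1 / sqrt(4) = 0.5 while R1.y, R1.z, and R1.w keep their previous values.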
SLT
SLT: Set On Less Than
Function:
Performs a component-wise assignment of either 1.0 or 0.0: 1.0 is assigned if the value of the first source is less than the value of the second; otherwise, 0.0 is assigned.
Syntax:
SLT dest, src0, src1;
SLT Example:
SLT R1, R2, R3;

before:
       x      y      z      w
R3    2.0    2.1    5.0    7.0
R2    7.0    3.0    6.0    2.0
R1    0.0    0.0    0.0    0.0

after:
       x      y      z      w
R3    2.0    2.1    5.0    7.0
R2    7.0    3.0    6.0    2.0
R1    0.0    0.0    0.0    1.0
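SLT is the instruction set's substitute for branching, so it helps to see the comparison spelled out per component. In this sketch `Vec4` and `slt` are hypothetical names.

```cpp
struct Vec4 { float x, y, z, w; };

// SLT dest, src0, src1: per component, 1.0 if src0 < src1, else 0.0.
inline Vec4 slt(const Vec4& a, const Vec4& b) {
    return { a.x < b.x ? 1.0f : 0.0f,
             a.y < b.y ? 1.0f : 0.0f,
             a.z < b.z ? 1.0f : 0.0f,
             a.w < b.w ? 1.0f : 0.0f };
}
```

With R2 = (7, 3, 6, 2) and R3 = (2, 2.1, 5, 7), only the w comparison (2 < 7) holds, so `slt(R2, R3)` is (0, 0, 0, 1). Multiplying by such a 0/1 result is a common way to select between values without a branch.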
LIT
LIT: Light Coefficients
Function:
Computes ambient, diffuse, and specular lighting coefficients from a diffuse dot product, a specular dot product, and a specular power.
Assumes:
src0.x = diffuse dot product (N • L)
src0.y = specular dot product (N • H)
src0.w = power (m)
LIT
LIT: Light Coefficients
Syntax:
LIT dest, src0
Result:
dest.x = 1.0 (ambient coeff.)
dest.y = CLAMP(src0.x, 0, 1) = CLAMP(N • L, 0, 1) (diffuse coeff.)
dest.z = (see next slide…) (specular coeff.)
dest.w = 1.0
LIT
LIT: Light Coefficients
Result: (Recall: src0.x = N • L)
if ( src0.x > 0.0 )
    dest.z = MAX(src0.y, 0) ^ ECLAMP(src0.w, -128, 128) = MAX(N • H, 0) ^ m, where m in (-128, 128)
otherwise,
    dest.z = 0.0
(dest.z is specular coeff. as defined by OpenGL)
LIT Example:
LIT R1, R7;

before:
       x      y      z      w
R7    0.3    0.8    0.0    2.0
R1    0.0    0.0    0.0    0.0

after:
       x      y      z      w
R7    0.3    0.8    0.0    2.0
R1    1.0    0.3    0.64   1.0
      (ambient) (diffuse) (specular)
(Good to 8+ bits)
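The LIT rules from the preceding slides can be put together in one function. This is a sketch under stated assumptions: it uses an ordinary inclusive clamp for the power where the slide writes ECLAMP with an open interval, it computes in full float precision rather than the "8+ bits" the hardware promises, and `Vec4` and `lit` are hypothetical names.

```cpp
#include <algorithm>
#include <cmath>

struct Vec4 { float x, y, z, w; };

// LIT dest, src0 with src0.x = N.L, src0.y = N.H, src0.w = power m.
inline Vec4 lit(const Vec4& s) {
    // Simplification: inclusive clamp of the power to [-128, 128].
    const float m = std::min(std::max(s.w, -128.0f), 128.0f);
    Vec4 d{ 1.0f,                                   // ambient coefficient
            std::min(std::max(s.x, 0.0f), 1.0f),    // diffuse: CLAMP(N.L, 0, 1)
            0.0f,                                   // specular, filled in below
            1.0f };
    if (s.x > 0.0f) {
        d.z = std::pow(std::max(s.y, 0.0f), m);     // MAX(N.H, 0) ^ m
    }
    return d;
}
```

For R7 = (0.3, 0.8, 0.0, 2.0) this gives ambient 1.0, diffuse 0.3, and specular 0.8 squared, which is approximately 0.64, in line with the table.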