Grass

Desc:

Approach:

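GrassBendingRTPrePass:
Uses the TrailRenderer component to render automatically fading motion trails into a render texture (RT), which serves as the data source for grass interaction

InstancedIndirectGrassPosDefine:
Defines the world-space positions where grass is spawned

UpdateAllInstanceTransformBufferIfNeeded:
Updates the base data needed to spawn grass
For example: the center of the area where grass is spawned
the bounding box
the partitioning of the grass into cells
All of this is passed to the Shader and the ComputeShader

GeometryUtility.CalculateFrustumPlanes:
Computes the Camera's six frustum planes for the configured draw distance

GeometryUtility.TestPlanesAABB:
Builds a Bounds for each grass cell and tests that AABB against the frustum planes; only cells that pass are added to the visible cell list and enter the render flow

CullingComputeShader.Dispatch:
For the visible cell indices, runs the ComputeShader to cull every blade of grass inside those cells

ComputeBuffer.CopyCount:
Copies the number of blades that survived ComputeShader culling into the instance count slot of argsBuffer (the second uint, at byte offset 4)

Graphics.DrawMeshInstancedIndirect:
Draws using the grass vertex data, the render material, and argsBuffer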
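
To make the argsBuffer step concrete, here is a minimal sketch of the five-uint argument layout that Graphics.DrawMeshInstancedIndirect reads, and of the byte offset at which the instance count sits. The class and member names (IndirectArgs, Build, InstanceCountByteOffset) are hypothetical and not taken from the project; the code is plain C# so it can be read without the engine.

```csharp
using System;

// Hypothetical helper: describes the indirect-args layout, not the project's own code.
public static class IndirectArgs
{
    // Slot indices of the five uints read by Graphics.DrawMeshInstancedIndirect.
    public const int IndexCountPerInstance = 0;
    public const int InstanceCount = 1;
    public const int StartIndexLocation = 2;
    public const int BaseVertexLocation = 3;
    public const int StartInstanceLocation = 4;

    // Byte offset of the instance count slot, i.e. the dstOffsetBytes value
    // one would pass to ComputeBuffer.CopyCount.
    public const int InstanceCountByteOffset = InstanceCount * sizeof(uint);

    // Builds the initial contents of the args buffer.
    public static uint[] Build(uint indexCountPerInstance, uint startIndex, uint baseVertex)
    {
        var args = new uint[5];
        args[IndexCountPerInstance] = indexCountPerInstance;
        args[InstanceCount] = 0; // meant to be overwritten on the GPU every frame
        args[StartIndexLocation] = startIndex;
        args[BaseVertexLocation] = baseVertex;
        args[StartInstanceLocation] = 0;
        return args;
    }
}
```

In a Unity script the array would typically be uploaded with ComputeBuffer.SetData into a buffer created with ComputeBufferType.IndirectArguments, and the visible-instance count would then be written over slot 1 with CopyCount.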

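GPU Instancing

From a data-handling perspective:

1: When an object is rendered with a material whose Shader supports GPU instancing and has it enabled, Unity handles all such render objects specially and prepares the buffers they need on the GPU (vertex data buffer, material data buffer, transform matrix buffer and so on, with the per-instance data placed in constant buffers)

2: The second case is calling the GPU instancing API ourselves. Unity then only prepares the vertex buffer and the material data buffer from the arguments we pass. It provides no buffer for matrices or other custom data, so we have to pass that data through a ComputeBuffer and read it in the Shader by instanceId. For example, if we draw 1,000,000 triangles through the GPU instancing API, Unity has the GPU backend prepare a buffer that can hold 3,000,000 vertices and one material data buffer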

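API:
Provided by Graphics
The camera can filter objects by whether they fall inside its culling volume
Graphics.DrawMeshInstancedIndirect: (commonly used)
More freedom and a higher performance ceiling, with fewer restrictions, but you have to handle LOD and culling yourself
Graphics.DrawMeshInstanced:
Bound by Unity's built-in limits: a single DrawMeshInstanced call draws at most 1023 instances, and a single draw call draws at most 500 instances. In return it benefits from some of Unity's built-in infrastructure: all instances of one DrawMeshInstanced call are treated as a single culling group. Individual instances are neither culled nor sorted (sorting would help alpha-test and depth-test performance), so this infrastructure is of little practical value
Provided by CommandBuffer
Always drawn to the camera? Culling has to be done yourself (source of this claim)
DrawMeshInstanced:
Requires a Matrix4x4[] array that sets the position, rotation and scale of each instance
DrawMeshInstancedIndirect:
Supplies this data, including the instance count, through ComputeBuffers
DrawMeshInstancedProcedural:
Similar to DrawMeshInstancedIndirect, but the count is passed in through the method call

There is no difference in efficiency between the two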
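
As an illustration of the 1023-instance limit mentioned for Graphics.DrawMeshInstanced, the sketch below splits a large instance count into ranges that each fit into one call. InstanceBatching and Split are hypothetical names, and this is only the bookkeeping, with no engine calls.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: splits N instances into ranges of at most 1023.
public static class InstanceBatching
{
    public const int MaxInstancesPerCall = 1023;

    // Returns (start, count) ranges that together cover [0, totalInstances).
    public static List<(int start, int count)> Split(int totalInstances)
    {
        if (totalInstances < 0)
            throw new ArgumentOutOfRangeException(nameof(totalInstances));

        var batches = new List<(int start, int count)>();
        for (int start = 0; start < totalInstances; start += MaxInstancesPerCall)
        {
            int count = Math.Min(MaxInstancesPerCall, totalInstances - start);
            batches.Add((start, count));
        }
        return batches;
    }
}
```

Each range would correspond to one DrawMeshInstanced call over that slice of the matrix array. With DrawMeshInstancedIndirect no such splitting is needed, which is one reason the notes call it the commonly used option.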

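Problems:

GPU instancing works by uploading the vertex data of just one object to the GPU and then using the instance ID and per-instance transform matrices in the VS and PS to draw many objects. Although there is only one draw call, the rendering backend still runs the VS over that object's vertices once per instance we want to draw, so any VS work spent on instances that are completely out of view is wasted.

However, instances we draw ourselves through the GPU instancing API get none of the application-stage culling (the coarse, per-object culling that the engine normally provides), which is why GPU instancing needs its own culling work

Clipping in homogeneous clip space will eventually discard whatever falls outside the view, but doing frustum culling by hand beforehand saves a lot of vertex work. With a very large number of instances, the performance gain is substantial

Culling is done in two passes

1: The first pass is pure CPU culling. The grass is grouped into cells, and each cell is AABB-tested against the camera frustum; only cells inside the frustum are added to the list of grass cells to render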

if (GeometryUtility.TestPlanesAABB(cameraFrustumPlanes, cellBound))
{
    visibleCellIDList.Add(i);
}
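
The cell partitioning that this CPU pass depends on could look roughly like the sketch below, which maps a grass position on the XZ plane to a cell ID inside the bounding rectangle of the grass area. GrassCells, AxisToCell and CellId are hypothetical names, and the row-major ID scheme is an assumption rather than the project's confirmed layout.

```csharp
using System;

// Hypothetical helper: bins grass positions into a regular XZ grid of cells.
public static class GrassCells
{
    // Maps a coordinate in [min, max] to a cell index in [0, cellCount - 1].
    public static int AxisToCell(float value, float min, float max, int cellCount)
    {
        if (cellCount <= 0)
            throw new ArgumentOutOfRangeException(nameof(cellCount));
        if (max <= min)
            return 0; // degenerate range: everything lands in the first cell

        float t = (value - min) / (max - min);
        t = Math.Max(0f, Math.Min(1f, t)); // clamp positions outside the bounds
        int cell = (int)Math.Floor(t * cellCount);
        return Math.Min(cellCount - 1, cell); // value == max belongs to the last cell
    }

    // Row-major cell ID (assumed scheme): x index plus z index times the row width.
    public static int CellId(float x, float z,
                             float minX, float maxX, float minZ, float maxZ,
                             int cellCountX, int cellCountZ)
    {
        int xId = AxisToCell(x, minX, maxX, cellCountX);
        int zId = AxisToCell(z, minZ, maxZ, cellCountZ);
        return xId + zId * cellCountX;
    }
}
```

Once every blade is assigned a cell ID, the per-cell Bounds used by TestPlanesAABB follows from the cell's position in the grid, and the blades of one cell can be stored contiguously so that a visible cell turns into a single index range.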

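2: The second pass is pure GPU culling in a Compute Shader. Every blade of grass in the one or more cells passed in is transformed by the VP matrix into homogeneous clip space and tested by hand; only blades that pass are added to the final render list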

#pragma kernel CSMain

float4x4 _VPMatrix;
float _MaxDrawDistance;
uint _StartOffset;
StructuredBuffer<float3> _AllInstancesPosWSBuffer; //will not change until instance count change
AppendStructuredBuffer<uint> _VisibleInstancesOnlyPosWSIDBuffer; //will set counter to 0 per frame, then fill in by this compute shader

[numthreads(64,1,1)]
void CSMain (uint3 id : SV_DispatchThreadID)
{
    //posWS -> posCS
    float4 absPosCS = abs(mul(_VPMatrix,float4(_AllInstancesPosWSBuffer[id.x + _StartOffset],1.0)));

    //do culling test in clip space, result is the same as doing test in NDC space.
    //prefer clip space here because doing culling test in clip space is faster than doing culling test in NDC, because we can skip 1 division.
    //the test is using OpenGL standard projection matrix, because all matrix from unity C# is OpenGL standard
    //if instance is inside camera frustum, and is within draw distance, we append its index to _VisibleInstancesOnlyPosWSIDBuffer
    //y test allow 50% more threshold (hardcode for grass)
    //x test allow 10% more threshold (hardcode for grass)
    if (absPosCS.z <= absPosCS.w && absPosCS.y <= absPosCS.w*1.5 && absPosCS.x <= absPosCS.w*1.1 && absPosCS.w <= _MaxDrawDistance)
        _VisibleInstancesOnlyPosWSIDBuffer.Append(id.x + _StartOffset);
}
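
For checking the kernel's logic on the CPU, the sketch below mirrors its clip-space test and computes how many thread groups a Dispatch would need for a given number of blades. GrassCulling, ThreadGroupCount and IsVisible are hypothetical names; the 1.5 and 1.1 thresholds are copied from the kernel, and the clip-space position is assumed to be computed elsewhere.

```csharp
using System;

// Hypothetical CPU mirror of the culling kernel's test, for debugging only.
public static class GrassCulling
{
    public const int ThreadsPerGroup = 64; // matches [numthreads(64,1,1)]

    // Number of thread groups needed to cover instanceCount blades (ceiling division).
    public static int ThreadGroupCount(int instanceCount)
    {
        return (instanceCount + ThreadsPerGroup - 1) / ThreadsPerGroup;
    }

    // Same comparison as the kernel: x, y, z, w are the clip-space position.
    public static bool IsVisible(float x, float y, float z, float w, float maxDrawDistance)
    {
        float ax = Math.Abs(x), ay = Math.Abs(y), az = Math.Abs(z), aw = Math.Abs(w);
        return az <= aw
            && ay <= aw * 1.5f  // 50% extra margin vertically
            && ax <= aw * 1.1f  // 10% extra margin horizontally
            && aw <= maxDrawDistance;
    }
}
```

When the blade count is not a multiple of 64, the last group contains threads whose index lies past the intended range. The kernel shown here has no bounds check, so those threads read whatever follows in the position buffer; adding a guard may be worth considering.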

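The cell size therefore directly determines whether the workload is CPU heavy or GPU heavy: the larger each cell, the coarser the CPU culling, so the CPU load goes down and the GPU load goes up; smaller cells shift the load the other way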

Interaction

Trails

The core is GrassBendingRTPrePass, which uses the TrailRenderer component to render automatically fading motion trails into an RT that serves as the data source for grass interaction

Wind

//wind animation (billboard Left Right direction only sin wave)
float wind = 0;
wind += (sin(_Time.y * _WindAFrequency + perGrassPivotPosWS.x * _WindATiling.x + perGrassPivotPosWS.z * _WindATiling.y)*_WindAWrap.x+_WindAWrap.y) * _WindAIntensity; //windA
wind += (sin(_Time.y * _WindBFrequency + perGrassPivotPosWS.x * _WindBTiling.x + perGrassPivotPosWS.z * _WindBTiling.y)*_WindBWrap.x+_WindBWrap.y) * _WindBIntensity; //windB
wind += (sin(_Time.y * _WindCFrequency + perGrassPivotPosWS.x * _WindCTiling.x + perGrassPivotPosWS.z * _WindCTiling.y)*_WindCWrap.x+_WindCWrap.y) * _WindCIntensity; //windC
wind *= IN.positionOS.y; //wind only affect top region, don't affect root region
float3 windOffset = cameraTransformRightWS * wind; //swing using billboard left right direction
positionWS.xyz += windOffset;
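
The wind term is a sum of three sine layers scaled by blade height. The sketch below evaluates the same formula on the CPU, which can help when tuning the parameters. WindLayer, GrassWind and their members are hypothetical names that stand in for the _WindA*, _WindB* and _WindC* shader properties.

```csharp
using System;

// Hypothetical CPU-side description of one wind layer (e.g. the _WindA* properties).
public struct WindLayer
{
    public float Frequency;
    public float TilingX;
    public float TilingZ;
    public float WrapScale;  // corresponds to _WindAWrap.x
    public float WrapOffset; // corresponds to _WindAWrap.y
    public float Intensity;
}

public static class GrassWind
{
    // One layer: (sin(time * f + x * tx + z * tz) * wrap.x + wrap.y) * intensity.
    public static float Evaluate(WindLayer layer, float time, float pivotX, float pivotZ)
    {
        double phase = time * layer.Frequency
                     + pivotX * layer.TilingX
                     + pivotZ * layer.TilingZ;
        return ((float)Math.Sin(phase) * layer.WrapScale + layer.WrapOffset) * layer.Intensity;
    }

    // Sum of all layers, scaled by object-space height so the root stays still.
    public static float Total(WindLayer[] layers, float time,
                              float pivotX, float pivotZ, float heightOS)
    {
        float wind = 0f;
        foreach (var layer in layers)
            wind += Evaluate(layer, time, pivotX, pivotZ);
        return wind * heightOS;
    }
}
```

The result is the scalar that the shader multiplies by the camera's right vector, so a blade only sways left and right relative to the view. A height of 0 always yields 0, which is what keeps the roots planted.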

Extensions

Grass editor

Drawing the brush area

Unity Docs

void OnSceneGUI()
{
    //base
    Handles.color = Color.green;
    Handles.DrawWireDisc(grassPainter.hitPosGizmo, grassPainter.hitNormal, grassPainter.brushSize);
    Handles.color = new Color(0, 0.5f, 0, 0.4f);
    Handles.DrawSolidDisc(grassPainter.hitPosGizmo, grassPainter.hitNormal, grassPainter.brushSize);
}

Storing the grass data

Grass positions are stored as mesh vertices

if (mesh == null)
{
    mesh = new Mesh();
}
mesh.Clear();
// One vertex per painted grass position
mesh.SetVertices(positions);
indi = indicies.ToArray();
// Point topology: each index refers to a single vertex, no triangles are built
mesh.SetIndices(indi, MeshTopology.Points, 0);
mesh.SetUVs(0, length);
mesh.SetColors(colors);
mesh.SetNormals(normals);
mesh.RecalculateBounds();
filter.sharedMesh = mesh;

Drawing the grass

A compute shader calculates the vertex positions and normals of the grass to be drawn

// A triangle on the generated mesh
struct DrawTriangle
{
    float3 normalOS;
    DrawVertex vertices[3]; // The three points on the triangle
};

AppendStructuredBuffer<DrawTriangle> _DrawTriangles;

// Excerpt from the kernel body. v0, sphereDisp, wind1, distanceCutoff and
// distanceFade are not defined in this excerpt.
SourceVertex sv = _SourceVertices[id.x];

float randomisedPos = rand(sv.positionOS.xyz);

float3 worldPos = mul(_LocalToWorld, float4(sv.positionOS, 1)).xyz;

// Blades & Segments
int numBladesPerVertex = max(1, _MaxBladesPerVertex);
int numSegmentsPerBlade = max(1, _MaxSegmentsPerBlade);
int numTrianglesPerBlade = (numSegmentsPerBlade - 1) * 2 + 1;

if (numBladesPerVertex <= 0 || distanceCutoff < 0)
{
    return;
}

float3 perpendicularAngle = float3(0, 0, 1);
float3 faceNormal = cross(perpendicularAngle, sv.normalOS); // multiply GetMainLight().direction in later stage

// Set vertex color
float3 color = sv.color;

// Set grass height
_GrassHeight = sv.uv.y;
_GrassWidth = sv.uv.x; // UV.x == width multiplier (set in GrassPainter.cs)
_GrassHeight += clamp(randomisedPos * _GrassRandomHeight, 1 - _GrassRandomHeight,
    1 + _GrassRandomHeight);
_GrassWidth *= (distanceFade);

DrawVertex drawVertices[GRASS_NUM_VERTICES_PER_BLADE];

for (int j = 0; j < numBladesPerVertex; ++j)
{
    // set rotation and radius of the blades
    float3x3 facingRotationMatrix = AngleAxis3x3(
        randomisedPos * TWO_PI + j, float3(0, 1, -0.1));
    float3x3 transformationMatrix = facingRotationMatrix;
    float bladeRadius = j / (float) numBladesPerVertex;
    float offset = (1 - bladeRadius) * _BladeRadius;

    for (int i = 0; i < numSegmentsPerBlade; ++i)
    {
        // taper width, increase height
        float t = i / (float) numSegmentsPerBlade;
        float segmentHeight = _GrassHeight * t;
        float segmentWidth = _GrassWidth * (1 - t);

        // the first (0) grass segment is thinner
        segmentWidth = i == 0 ? _BottomWidth * segmentWidth : segmentWidth;

        float segmentForward = pow(abs(t), _BladeCurve) * _BladeForward;

        float3x3 transformMatrix = (i == 0) ? facingRotationMatrix : transformationMatrix;

        // First grass (0) segment does not get displaced by interactor
        float3 newPos = (i == 0) ? v0 : v0 + (sphereDisp * t) + wind1 * t;

        // Append First Vertex
        drawVertices[i * 2] = GrassVertex(newPos, segmentWidth, segmentHeight, offset, segmentForward, float2(0, t), transformMatrix, color);

        // Append Second Vertex
        drawVertices[i * 2 + 1] = GrassVertex(newPos, -segmentWidth, segmentHeight, offset, segmentForward, float2(1, t), transformMatrix, color);
    }
    // Append Top Vertex
    float3 topPosOS = v0 + sphereDisp + wind1;
    drawVertices[numSegmentsPerBlade * 2] = GrassVertex(topPosOS, 0, _GrassHeight, offset, _BladeForward, float2(0.5, 1), transformationMatrix, color);
    // Append Triangles
    for (int k = 0; k < numTrianglesPerBlade; ++k)
    {
        DrawTriangle tri = (DrawTriangle)0;
        tri.normalOS = faceNormal;
        tri.vertices[0] = drawVertices[k];
        tri.vertices[1] = drawVertices[k + 1];
        tri.vertices[2] = drawVertices[k + 2];
        _DrawTriangles.Append(tri);
    }

} // For loop - Blade
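
The vertex and triangle bookkeeping of a single blade can be checked on the CPU. The sketch below reproduces the counts used by the kernel (two vertices per segment plus one tip vertex, and (segments - 1) * 2 + 1 triangles) and lists which vertex indices each triangle reads. BladeTopology and its members are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: mirrors the per-blade index arithmetic of the kernel.
public static class BladeTopology
{
    // Two vertices per segment plus the single tip vertex.
    public static int VertexCount(int segments) => segments * 2 + 1;

    // Same formula as numTrianglesPerBlade in the kernel.
    public static int TriangleCount(int segments) => (segments - 1) * 2 + 1;

    // Triangle k reads vertices k, k + 1 and k + 2, like a triangle strip.
    public static List<(int a, int b, int c)> Triangles(int segments)
    {
        if (segments < 1)
            throw new ArgumentOutOfRangeException(nameof(segments));

        var tris = new List<(int a, int b, int c)>();
        int count = TriangleCount(segments);
        for (int k = 0; k < count; ++k)
            tris.Add((k, k + 1, k + 2));
        return tris;
    }
}
```

The highest index read is segments * 2, which is exactly the tip vertex, so GRASS_NUM_VERTICES_PER_BLADE has to be at least segments * 2 + 1 for the largest segment count in use. Consecutive triangles in this scheme alternate their winding order, so the material presumably renders both faces; that is an assumption, since the render state is not shown in these notes.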

Drawing with the shader

// A buffer containing the generated mesh
StructuredBuffer<DrawTriangle> _DrawTriangles;

// -- retrieve data generated from compute shader
v2f vert(uint vertexID : SV_VertexID)
{
    // Initialize the output struct
    v2f output = (v2f)0;

    // Get the vertex from the buffer
    // Since the buffer is structured in triangles, we need to divide the vertexID by three
    // to get the triangle, and then modulo by 3 to get the vertex on the triangle
    DrawTriangle tri = _DrawTriangles[vertexID / 3];
    DrawVertex input = tri.vertices[vertexID % 3];

    output.positionCS = TransformWorldToHClip(input.positionWS);
    output.positionWS = input.positionWS;

    float3 faceNormal = GetMainLight().direction * tri.normalOS;
    output.normalWS = TransformObjectToWorldNormal(faceNormal, true);
    float fogFactor = ComputeFogFactor(output.positionCS.z);
    output.fogFactor = fogFactor;
    output.uv = input.uv;

    output.diffuseColor = input.diffuseColor;

    return output;
}

Terrains
Painting grass
