Grass Optimization


Chunked Management (Spatial Data Structure Optimization)

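In a large scene there may be millions of grass blades or more, and running a visibility test on every blade every frame is too expensive. A spatial acceleration structure reduces the cost: the grass is partitioned into multiple AABBs, each AABB is first tested for intersection with (or containment in) the camera frustum, and only the grass inside AABBs that pass this test is then culled blade by blade.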
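The partitioning step can be sketched as a uniform grid over the XZ plane. This is a minimal illustration, not code from any of the referenced projects: `GrassGrid`, `CellIndex`, and `Build` are hypothetical names, and the sketch assumes `maxX > minX` and `maxZ > minZ`. The flat index is laid out row by row (`z * cellCountX + x`).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: buckets grass positions into a uniform XZ grid once, at load time.
public static class GrassGrid
{
    // Maps a world-space XZ position to a flat cell index (z * cellCountX + x).
    public static int CellIndex(float x, float z,
                                float minX, float maxX, float minZ, float maxZ,
                                int cellCountX, int cellCountZ)
    {
        // Normalize to [0, 1], scale to cell coordinates, then clamp so that
        // positions on the max edge land in the last cell.
        int cx = Math.Min(cellCountX - 1, (int)((x - minX) / (maxX - minX) * cellCountX));
        int cz = Math.Min(cellCountZ - 1, (int)((z - minZ) / (maxZ - minZ) * cellCountZ));
        return Math.Max(0, cz) * cellCountX + Math.Max(0, cx);
    }

    // Returns, for every cell, the indices of the positions that fall inside it.
    public static List<int>[] Build(IReadOnlyList<(float x, float z)> positions,
                                    float minX, float maxX, float minZ, float maxZ,
                                    int cellCountX, int cellCountZ)
    {
        var cells = new List<int>[cellCountX * cellCountZ];
        for (int i = 0; i < cells.Length; i++) cells[i] = new List<int>();

        for (int i = 0; i < positions.Count; i++)
        {
            var p = positions[i];
            cells[CellIndex(p.x, p.z, minX, maxX, minZ, maxZ, cellCountX, cellCountZ)].Add(i);
        }
        return cells;
    }
}
```

With the buckets built once, the per-frame work is one frustum test per cell instead of one per blade.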

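Reference: "Graphics Engine in Practice: A Management and Rendering Scheme for Massive Vegetation in Large Worlds" (图形引擎实战:一种大世界巨量植被管理与渲染方案), an article by the Sohu Changyou Engine Department (搜狐畅游引擎部) on Zhihu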

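CPU Culling

Build a BVH over the grass data, with each AABB enclosing a fixed number of blades. Whether an AABB intersects or is contained in the camera frustum decides whether its data is passed to the GPU.
Reference: UnityURP-MobileDrawMeshInstancedIndirectExample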
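One way to build such a tree is a top-down median split, sketched below. This is an assumption-laden illustration rather than the construction used by either referenced project: `BvhNode`, `BvhBuilder`, and `maxPerLeaf` are hypothetical names, the split axis is simply the longer of X and Z, and the input list is sorted in place.

```csharp
using System;
using System.Collections.Generic;
using System.Numerics;

public sealed class BvhNode
{
    public Vector3 Min, Max;       // AABB corners enclosing everything below this node
    public BvhNode Left, Right;    // both null for a leaf
    public List<Vector3> Items;    // non-null only for a leaf
}

public static class BvhBuilder
{
    // Recursively splits the points until each leaf holds at most maxPerLeaf of them.
    // Note: sorts the list passed in.
    public static BvhNode Build(List<Vector3> points, int maxPerLeaf)
    {
        if (points.Count == 0) throw new ArgumentException("points must not be empty");
        if (maxPerLeaf < 1) throw new ArgumentOutOfRangeException(nameof(maxPerLeaf));

        // Grow the node's AABB around all of its points.
        var node = new BvhNode { Min = points[0], Max = points[0] };
        foreach (var p in points)
        {
            node.Min = Vector3.Min(node.Min, p);
            node.Max = Vector3.Max(node.Max, p);
        }

        if (points.Count <= maxPerLeaf)
        {
            node.Items = points;
            return node;
        }

        // Split at the median along the longer horizontal axis.
        Vector3 size = node.Max - node.Min;
        if (size.X >= size.Z) points.Sort((a, b) => a.X.CompareTo(b.X));
        else                  points.Sort((a, b) => a.Z.CompareTo(b.Z));

        int mid = points.Count / 2;
        node.Left = Build(points.GetRange(0, mid), maxPerLeaf);
        node.Right = Build(points.GetRange(mid, points.Count - mid), maxPerLeaf);
        return node;
    }
}
```

A median split keeps the tree balanced, so the depth grows logarithmically with the number of blades.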

The repository author's approach to AABB culling

for (int i = 0; i < cellPosWSsList.Length; i++)
{
    // Build the cell's bounds: cell i sits at column (i % cellCountX), row (i / cellCountX).
    Vector3 centerPosWS = new Vector3(i % cellCountX + 0.5f, 0, i / cellCountX + 0.5f);
    centerPosWS.x = Mathf.Lerp(minX, maxX, centerPosWS.x / cellCountX);
    centerPosWS.z = Mathf.Lerp(minZ, maxZ, centerPosWS.z / cellCountZ);
    // The Z extent is derived from the Z range and cellCountZ (the X values were used here by mistake).
    Vector3 sizeWS = new Vector3(Mathf.Abs(maxX - minX) / cellCountX, 0, Mathf.Abs(maxZ - minZ) / cellCountZ);
    Bounds cellBound = new Bounds(centerPosWS, sizeWS);

    // Keep the cell only if its bounds touch the camera frustum.
    if (GeometryUtility.TestPlanesAABB(cameraFrustumPlanes, cellBound))
    {
        visibleCellIDList.Add(i);
    }
}

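First, all of the grass data is uploaded to the compute shader.
Then each AABB is tested: if it intersects the camera frustum, the grass data inside that AABB is handed to the compute shader, which culls the individual instances in clip space.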
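If the instance buffer stores the cells contiguously in cell-ID order (an assumption here, suggested by the `_StartOffset` that appears in the compute shader), the visible cells can be turned into a small number of dispatch ranges by merging neighbours. The sketch below is illustrative only: `DispatchRanges` and `cellCounts` are hypothetical names, and `visibleCellIds` is expected in ascending order, which is what a loop over `i` produces.

```csharp
using System.Collections.Generic;

public static class DispatchRanges
{
    // cellCounts[i] is the number of grass instances stored for cell i.
    // Returns (start, count) ranges into the flat instance buffer, merging
    // visible cells whose instances sit next to each other in the buffer.
    public static List<(int start, int count)> Build(int[] cellCounts,
                                                     IReadOnlyList<int> visibleCellIds)
    {
        // Prefix sum: where each cell's instances begin in the flat buffer.
        var cellStart = new int[cellCounts.Length];
        for (int i = 1; i < cellCounts.Length; i++)
            cellStart[i] = cellStart[i - 1] + cellCounts[i - 1];

        var ranges = new List<(int start, int count)>();
        foreach (int id in visibleCellIds)
        {
            if (cellCounts[id] == 0) continue; // nothing to draw in this cell

            int last = ranges.Count - 1;
            if (last >= 0 && ranges[last].start + ranges[last].count == cellStart[id])
                ranges[last] = (ranges[last].start, ranges[last].count + cellCounts[id]);
            else
                ranges.Add((cellStart[id], cellCounts[id]));
        }
        return ranges;
    }
}
```

Each resulting range would correspond to one compute dispatch, with `start` playing the role of the start offset and `count` determining the number of threads.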

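My approach to AABB culling
When painting the grass, I assign the AABB regions myself, using the alpha channel of the vertex color. Depending on whether an AABB intersects the camera frustum, its data is passed to the compute shader.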

Details

Collecting all vertex data that survives culling
BVH_AABB.cs

public List<NodeData> nodeDataList = new List<NodeData>();

// Visible node data, collected lazily: the tree is only walked when the cached list is empty.
public List<NodeData> v_nodeDataList
{
    get
    {
        if (nodeDataList == null || nodeDataList.Count == 0)
        {
            AddVisibleList(tree);
        }
        return nodeDataList;
    }
}

// Walks the whole tree and appends the data of every node whose AABB touches the frustum.
private void AddVisibleList(BVHNode node)
{
    if (node == null)
    {
        return;
    }

    // Only nodes that actually hold grass data are tested.
    if (node.nodeDataList.Count > 0 &&
        GeometryUtility.TestPlanesAABB(cameraFrustumPlanes, node.aabb.bounds))
    {
        visibleAABB.Add(node.aabb);
        nodeDataList.AddRange(node.nodeDataList);
    }

    // Children are always visited (the null check at the top handles missing children).
    AddVisibleList(node.left);
    AddVisibleList(node.right);
}
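The traversal above visits every node. If each node's AABB encloses its children (the usual BVH invariant, assumed here rather than confirmed by the source), a whole subtree can be skipped as soon as its AABB fails the frustum test. The following is a standalone sketch of that variant: `CullNode`, `BvhCulling`, `Intersects`, and `Collect` are hypothetical names, the planes are `System.Numerics.Plane` values whose normals point into the frustum, and the plane test is a simplified stand-in for `GeometryUtility.TestPlanesAABB`.

```csharp
using System.Collections.Generic;
using System.Numerics;

public sealed class CullNode
{
    public Vector3 Min, Max;                    // AABB enclosing this node and its children
    public CullNode Left, Right;
    public List<int> Items = new List<int>();   // grass indices stored at this node
}

public static class BvhCulling
{
    // Returns false only if the box lies entirely on the negative side of some plane.
    public static bool Intersects(Plane[] planes, Vector3 min, Vector3 max)
    {
        foreach (var pl in planes)
        {
            // Pick the box corner that lies farthest along the plane normal.
            var corner = new Vector3(
                pl.Normal.X >= 0 ? max.X : min.X,
                pl.Normal.Y >= 0 ? max.Y : min.Y,
                pl.Normal.Z >= 0 ? max.Z : min.Z);
            if (Plane.DotCoordinate(pl, corner) < 0) return false;
        }
        return true;
    }

    // Appends the items of every node that may be visible, pruning rejected subtrees.
    public static void Collect(CullNode node, Plane[] planes, List<int> visible)
    {
        if (node == null || !Intersects(planes, node.Min, node.Max)) return;
        visible.AddRange(node.Items);
        Collect(node.Left, planes, visible);
        Collect(node.Right, planes, visible);
    }
}
```

When most of the terrain is off screen, pruning keeps the number of AABB tests close to the number of visible nodes rather than the total node count.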

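GPU Culling

Cull the instances and append the survivors
GitHub: HiZ_grass_culling

1. Transform to clip space and cull

Reference: "Unity Shader: How to Obtain Coordinates in Each Space and What xyzw Mean" (Unity Shader 各个空间坐标的获取方式及xyzw含义), an article by 小电脑 on Zhihu
In clip space, a point inside the view volume satisfies -w <= x <= w, -w <= y <= w, and -w <= z <= w.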
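To make the inequality concrete, here is a small CPU-side sketch that projects a view-space point with an OpenGL-style perspective matrix and applies the test. It is an illustration under stated assumptions: `ClipSpace`, `Project`, and `Inside` are hypothetical names, the camera looks down the negative Z axis, and the matrix terms are written out by hand instead of using a library matrix type.

```csharp
using System;

public static class ClipSpace
{
    // The rule from the text: every component must lie within [-w, w].
    public static bool Inside(float x, float y, float z, float w) =>
        -w <= x && x <= w &&
        -w <= y && y <= w &&
        -w <= z && z <= w;

    // OpenGL-style perspective projection of a view-space point (vx, vy, vz).
    public static (float x, float y, float z, float w) Project(
        float vx, float vy, float vz,
        float fovYRadians, float aspect, float near, float far)
    {
        float f = 1f / (float)Math.Tan(fovYRadians * 0.5f);
        return (f / aspect * vx,
                f * vy,
                (far + near) / (near - far) * vz + 2f * far * near / (near - far),
                -vz); // w is the distance in front of the camera
    }
}
```

For a 90 degree field of view with near = 1 and far = 10, the point (0, 0, -5) projects to w = 5 and passes, while (6, 0, -5) fails on x and (0, 0, 5), which lies behind the camera, fails because w is negative.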

float4 absPosCS = abs(mul(_VPMatrix, float4(_AllInstancesPosWSBuffer[id.x + _StartOffset], 1.0)));

// Do the culling test in clip space; the result is the same as testing in NDC space.
// Clip space is preferred because it skips the perspective division.
// The test assumes the OpenGL projection convention, because matrices passed from Unity C# follow it.
// If the instance is inside the camera frustum and within draw distance, append it to _VisibleInstancesOnlyPosWSIDBuffer.
// The y test allows 50% extra threshold (hardcoded for grass).
// The x test allows 10% extra threshold (hardcoded for grass).
if (absPosCS.z <= absPosCS.w && absPosCS.y <= absPosCS.w * 1.5 && absPosCS.x <= absPosCS.w * 1.1 && absPosCS.w <= _MaxDrawDistance)
    _VisibleInstancesOnlyPosWSIDBuffer.Append(id.x + _StartOffset);

LOD Handling

Distant grass does not need as many vertices as nearby grass.

1. Do not render grass beyond a set distance


float3 worldPos = mul(_LocalToWorld, float4(sv.positionOS, 1)).xyz;
float distanceFromCamera = distance(worldPos, _CameraPositionWS);
float distanceCutoff = saturate(1 - saturate((distanceFromCamera - _MinFadeDist) / (_MaxFadeDist - _MinFadeDist)));

if (numBladesPerVertex <= 0 || distanceCutoff <= 0) {
    return;
}
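The cutoff is a linear ramp that stays at 1 up to `_MinFadeDist` and reaches 0 at `_MaxFadeDist`. A CPU-side sketch of the same formula makes the values easy to check; `GrassLod`, `Saturate`, and `DistanceCutoff` are hypothetical names introduced for this illustration, and the distances in the note below are example values, not settings from the source.

```csharp
using System;

public static class GrassLod
{
    // Equivalent of HLSL saturate(): clamp to [0, 1].
    static float Saturate(float v) => Math.Max(0f, Math.Min(1f, v));

    // Same expression as the shader: 1 near the camera, 0 at and beyond maxFade.
    public static float DistanceCutoff(float distance, float minFade, float maxFade) =>
        Saturate(1f - Saturate((distance - minFade) / (maxFade - minFade)));
}
```

With a fade range of 20 to 60, a blade at distance 10 gets a cutoff of 1, a blade at 40 gets 0.5, and blades at 60 or farther get 0 and are skipped by the early return.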

2. Fade the blade width so that grass does not vanish abruptly

float distanceFade = saturate(1 - saturate((distanceFromCamera - _MinFadeDist - 5) / (_MaxFadeDist - _MinFadeDist -5)));

_GrassWidth *= (distanceFade);

3. Fade the vertex count

float distanceFade = saturate(1 - saturate((distanceFromCamera - _MinFadeDist - 5) / (_MaxFadeDist - _MinFadeDist -5)));

numSegmentsPerBlade = max(1, numSegmentsPerBlade * distanceFade);
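The effect on the segment count can be checked with a CPU-side sketch of the same arithmetic. This assumes `numSegmentsPerBlade` is an integer, so the scaled value is truncated before the `max`; the source does not show its declaration. `SegmentLod` and `Segments` are hypothetical names, and the numbers in the note below are example values.

```csharp
using System;

public static class SegmentLod
{
    // Equivalent of HLSL saturate(): clamp to [0, 1].
    static float Saturate(float v) => Math.Max(0f, Math.Min(1f, v));

    // Mirrors the shader: the fade starts 5 units past minFade and ends at maxFade,
    // and at least one segment is always kept.
    public static int Segments(int baseSegments, float distance, float minFade, float maxFade)
    {
        float fade = Saturate(1f - Saturate((distance - minFade - 5f) / (maxFade - minFade - 5f)));
        return Math.Max(1, (int)(baseSegments * fade)); // assumed integer truncation
    }
}
```

With 5 base segments and a fade range of 20 to 65, a blade keeps all 5 segments up to distance 25, drops to 2 at distance 45, and bottoms out at 1 segment at distance 65.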