1
Continuous Distance-Dependent Level of Detail for
Rendering Heightmaps (CDLOD)
Filip Strugar, 11 July 2010
Abstract. This paper presents a technique for GPU-based rendering of heightmap terrains,
which is a refinement of several existing methods with some new ideas. It is similar to the terrain
clipmap approaches [Tanner et al. 98, Losasso 04], as it draws the terrain directly from the source
heightmap data. However, instead of using a set of regular nested grids, it is structured around a
quadtree of regular grids, more similar to [Ulrich 02], which provides it with better level-of-detail
distribution. The algorithm's main improvement over previous techniques is that the LOD function
is the same across the whole rendered mesh and is based on the precise three-dimensional
distance between the observer and the terrain. To accomplish this, a novel technique for handling
transition between LOD levels is used, which gives smooth and accurate results. For these reasons
the system is more predictable and reliable, with better screen-triangle distribution, cleaner
transitions between levels, and no need for stitching meshes. This also simplifies integration with
other LOD systems that are common in games and simulation applications. With regard to the
performance, it remains favourable compared to similar GPU-based approaches and works on all
graphics hardware supporting Shader Model 3.0 and above. Demo and complete source code is
available online under a free software license.
Introduction
Heightmap display and interaction are a frequent requirement of game and simulation graphics
engines. The simplest way to render a terrain is the brute force approach, in which every texel in
the source heightmap data is represented with one vertex in the regular grid of triangles. For
larger datasets this is not practical or possible: thus the need for level of detail (LOD) algorithms.
An LOD system will produce a mesh of different complexity, usually as the function of the observer
distance, so that the on-screen triangle distribution is relatively equal while using only a subset of
data and rendering resources.
Historically these algorithms were executed on the CPU; a good example is the classic
academic algorithm [Duchainea et al. 97]. However, since the GPU's raw (mostly parallel)
processing power has been improving much faster than the CPU's, in order for the whole system
to be optimally used, the terrain-rendering algorithms have changed to draw on the graphics
hardware as much as possible.
Currently, in the context of modern PC and game console architecture, there is little benefit of
having an algorithm that produces optimal triangle distribution on the CPU if it cannot provide
enough triangles for the hardware graphics pipeline or if it uses too much CPU processing power.
The API, driver, and OS layer between the CPU and GPU is also a common bottleneck, as
explained in [Wloka 03]. Therefore, even a simplistic GPU-based approach can be faster and
provide better visual results than those complex approaches executed on the CPU and formerly
considered optimal.
Administrator
附注
与CLipmap相同在直接从高度图中读取绘制地形,但并没有使用规则嵌套格网组成的集合,而是采用了类似Chunked LOD的方法,构建了四叉树规则网格,提供了更好的细节层次分配。
Administrator
附注
“Administrator”设置的“Unmarked”
Administrator
附注
算法相对于过去算法的主要改进是LOD公式在整个地形块中保持不变,并且是基于精确的三维距离计算的。为了达到这一目的,采用了一种全新的LOD层级转换控制技术,使转换更加平滑和精确。
Administrator
附注
ROAM
2
One of the first examples of this trend is the algorithm given in [de Boer 00], which, while still
essentially a CPU-based algorithm, is aimed at producing a high triangle output with less optimal
distribution but also less expensive execution [Duchainea et al. 97], thus providing better results
when running on early dedicated graphics hardware.
Later [Ulrich 02] developed one of the first completely GPU-oriented algorithms, which is still
an excellent choice for rendering terrains on modern hardware due to its good detail distribution
and optimally tessellated mesh. Its drawbacks are the lengthy pre-processing step involved,
inability to modify terrain data in real time, and a somewhat inflexible and less correct LOD
system.
A simpler heightmap-based approach is sometimes preferred; one of the most popular is the
[Asirvatham et al. 05] and its various improvements.
This paper presents a technique that builds upon the idea of using a fixed grid mesh displaced
in the vertex shader to render the terrain directly from the heightmap data source while dealing
with some of the shortcomings and complexities found in [Ulrich 02] and [Asirvatham et al. 05] by
using a quadtree-based structure and a novel, completely predictable continuous LOD system.
This technique is intended to provide an optimal way of rendering heightmap-based terrains on
graphics hardware from the Shader Model 3.0 generation and above; the technique can be
extended with hardware tessellation support to fully take advantage of the latest Shader Model
5.0 generation capabilities.
LOD function
One major drawback of the basic clipmaps [Asirvatham et al. 05] algorithm is that the level of
detail is essentially based on the two-dimensional (x; y: latitude, longitude) components of the
observer position, while ignoring the height. This results in unequal distribution of mesh
complexity and aliasing problems as, for example, when the observer is high above the mesh, the
detail level below remains much greater than required and vice versa. This is only partially
addressed in [Asirvatham et al. 05] by taking the current observer height above the terrain into
consideration and dropping higher LOD levels if needed.
On the other hand, [Ulrich 02] uses an LOD function that is the same over the whole chunk
(mesh block), providing only approximate three-dimensional distance-based LOD.
Such approximations can be limiting in scenarios frequently encountered in modern games or
simulation systems, where terrain is commonly very uneven and the observer's height and
position changes quickly, as it produces poor detail distribution or movement-induced rendering
artifacts. It also causes integration difficulties with other rendering system, some of which use
level-of-detail optimizations themselves, since the unpredictable terrain LOD causes rendering
errors such as unwanted intersection or floating of objects placed on the terrain.
The technique presented here of continuous distance-dependent level detail (CDLOD) solves
these problems by providing an LOD distribution that is a direct function of three-dimensional
distance across all rendered vertices and is thus completely predictable.
LOD transition
Another drawback of the techniques of [Ulrich 02] and [Asirvatham et al. 05] is that the
discontinuities between LOD levels require additional work to remove gaps and provide smooth
transitions. This is addressed by using additional connecting (“stitching”) strips between different
LOD levels. These strips, besides adding to the rendering cost, can cause various issues such as
artifacts when rendering shadow maps; unwanted overdraw when the terrain is rendered
transparently (which is a problem when rendering terrain water or similar effects); etc.
Administrator
附注
Geo Mipmapping
Administrator
附注
Chunked LOD
Administrator
附注
Geo CLipmapping
Administrator
高亮
3
Also, since the mesh swap between LOD levels happens between meshes of different triangle
count and shape, differences in vertex output interpolation will result in “popping” artifacts that
appear if any vertex-based effect (such as vertex lighting) is used. This also makes it a less
suitable platform for hardware tessellation.
The CDLOD technique inherently avoids these problems because the algorithm used to
transition between LOD levels completely transforms the mesh of a higher level into the lower
detailed one before the actual swap occurs. This ensures a perfectly smooth transition with no
seams or artifacts. In addition, the rendering itself is simpler and more predictable than that of
[Asirvatham et al. 05], as only one rectangular regular grid mesh is needed to render everything.
Data organization and streaming
Storing, compressing, and streaming the terrain data can be done as with other techniques,
requiring no special attention. While this topic is outside the scope of this paper, it is a necessary
part of any practical large terrain rendering system implementation. Thus an example CDLOD
implementation with full data streaming is provided with the StreamingCDLOD demo.
Algorithm implementation
Overview
CDLOD organizes the heightmap into a quadtree, which is used to select appropriate quadtree
nodes from different LOD levels at run time. The selection algorithm is performed in such a way as
to provide approximately the same amount of on-screen triangle complexity regardless of the
distance from the observer.
The actual rendering is performed by rendering areas covered by selected nodes using only
one unique grid mesh, reading the heightmap in the vertex shader, and displacing the mesh
vertices accordingly.
A more detailed mesh can be smoothly morphed into the less detailed one in the vertex shader
so that it can be seamlessly replaced by a lower resolution one when it goes out of range, and
vice versa.
The quadtree structure is generated from the input heightmap. It is of constant depth,
predetermined by memory and granularity requirements. Once created, the quadtree does not
change unless the source heightmap changes. Every node has four child nodes and contains
minimum and maximum height values for the rectangular heightmap area that it covers. A
provided example, BasicCDLOD, uses a naive explicit quadtree implementation where all required
quadtree data is contained in the node structure. The StreamingCDLOD example implements a
more advanced version in which the algorithm relies only on the (partially compressed) min/max
maps while all other data is implicit and generated during the quadtree traversal. This version
uses far less memory but is slightly more complex.
Quadtree and node selection
The first step of the rendering process is the quadtree node selection. It is performed every
time the observer moves, which usually means during every frame.
In the presented version of the algorithm, a quadtree depth level always corresponds to the
4
level of detail. This is an artificial constraint induced for simplicity reasons because it allows for the
same grid mesh with a fixed triangle count to be used to render every quadtree node. In that case
every child node has four times more mesh complexity per square area unit than its parent, since
a child node occupies a fourth of the area. This complexity difference scale factor of four is then
used as a basis for the LOD level distribution. In other words, each successive LOD level is used to
render four times as many triangles and contains four times more nodes than the previous one
(see Figure 1).
Figure 1 Quadtree and LOD layers.
LOD distances and morph areas
In order to know which nodes to select where, distances covered by each LOD layer are
precalculated before the node selection process is performed. These are calculated with the goal
of producing approximately the same average number of triangles per square unit of screen over
the whole rendered terrain.
Since the complexity difference between each successive LOD level is fixed to four by the
algorithm design, the difference in distance covered by them needs to be close to 2.0 to
accomplish relatively even screen triangle distribution, assuming that the three-dimensional
projection transform is used for rendering (due to the way projection transform works).
The array of LOD ranges is thus created (see Figure 2) with the range of each level being two
times larger than that of the previous one, and the range of the final level (largest, least detailed)
representing the required total viewing distance.
Figure 2 The distribution of six LOD ranges with their relative sizes at the top.
To provide smooth LOD transitions, the morph area is also defined, marking the range along
which a higher complexity mesh will morph into the lower one. This morph area usually covers the
last 15%-30% of every LOD range.
Quadtree traversal and node selection
Once the array of LOD ranges is calculated it is used to create a selection (subset) of nodes
representing the currently observable part of the terrain. To do this, the quadtree is traversed
recursively beginning from the most distant nodes of the lowest-detailed level, working down to
the closest, most detailed ones. This selection then contains all dynamic metadata required to
5
render the terrain.
The following C++ pseudocode illustrates a basic version of the algorithm:
01 // Beginning from the LODLevelCount and going down to 0; lodLevel 0 is the highest detailed level.
02 bool Node::LODSelect( int ranges[], int lodLevel, Frustum frustum )
03 {
04 if( !nodeBoundingBox.IntersectsSphere( ranges[ lodLevel ] ) )
05 {
06 // no node or child nodes were selected; return false so that our parent node handles our area
07 return false;
08 }
09
10 if( !FrustumIntersect(frustum) )
11 {
12 // we are out of frustum, select nothing but return true to mark this node as having been
13 // correctly handled so that our parent node does not select itself over our area
14 return true;
15 }
16
17 if( lodLevel == 0 )
18 {
19 // we are in our LOD range and we are the last LOD level
20 AddWholeNodeToSelectionList( );
21
22 return true; // we have handled the area of our node
23 }
24 else
25 {
26 // we are in range of our LOD level and we are not the last level: if we are also in range
27 // of a more detailed LOD level, then some of our child nodes will need to be selected
28 // instead in order to display a more detailed mesh.
29 if( !nodeBoundingBox.IntersectsSphere( ranges[lodLevel-1]) )
30 {
31 // we cover the required lodLevel range
32 AddWholeNodeToSelectionList( ) ;
33 }
34 else
35 {
36 // we cover the more detailed lodLevel range: some or all of our four child nodes will
37 // have to be selected instead
38 foreach( childNode )
39 {
40 if( !childNode.LODSelect( ranges, lodLevel-1, frustum ) )
41 {
42 // if a child node is out of its LOD range, we need to make sure that the area
43 // is covered by its parent
44 AddPartOfNodeToSelectionList( childNode.ParentSubArea ) ;
45 }
46 }
47 }
48 return true; // we have handled the area of our node
49 }
50 }
Selecting a node involves storing its position, size, LOD level, info on partial selection, and
other data if needed. This is saved in a temporary list later used for rendering.
Each node can be selected partially over the area of only some of its four child nodes. This is
done so that not all child nodes need to be rendered if only a few are in their LOD range, allowing
for earlier exchange between LOD levels, which increases the algorithm performance and
flexibility.
Nodes are also frustum culled while traversing the quadtree, eliminating the rendering of non-
visible nodes. When rendering a shadow map or a similar effect, the frustum cull is based on the
shadow camera, but the actual LOD selection should still be based on the main camera settings.
This is done to avoid rendering artifacts that would be caused by difference in the terrain mesh
resulting from two different LOD functions (for the shadow map camera and the main camera).
6
Figure 3 LOD selection of quadtree nodes (the frustum culled section is shaded in dark).
Since the LOD layer selection is based on the actual three-dimensional distance from the
observer, it works correctly for all terrain configurations and observer heights. This results in
correct detail complexity and better performance in various scenarios, as illustrated in Figure 3
and Figure 4.
Figure 4 LOD selection of quadtree nodes with two different observer heights
(frustum culled section shaded in dark).
7
Rendering
To render the terrain, we iterate through the selected nodes and their data is used to render
the patch of the terrain that they cover. Continuous transition between LOD levels is done by
morphing the area of each layer into a less complex neighbouring layer to achieve seamless
transition between them (see Figure 5).
Figure 5 Distribution of LOD levels and nodes (different colors represent different layers).
Rendering is then fairly straightforward: a single grid mesh of fixed dimensions is transformed
in the vertex shader to cover each selected node area in the world space, and vertex heights are
displaced using texture fetches, thus forming the representation of the particular terrain patch.
Commonly used grid-mesh dimensions are 16x16, 32x32, 64x64 or 128x128, depending on the
required output complexity. Once chosen, this grid mesh resolution is constant (i.e., equal for all
nodes) but can be changed at run time, which can come in handy for rendering the terrain at
lower resolutions for effects requiring less detail such as reflections, secondary views, low quality
shadows, etc.
Morph implementation
Using CDLOD, each vertex is morphed individually based on its own LOD metric unlike the
method in [Ulrich 02], where the morph is performed per-node (per-chunk). Each node supports
transition between two LOD layers: its own and the next larger and less complex one. Morph
operation is performed in the vertex shader in such way that every block of eight triangles from
the more detailed mesh is smoothly morphed into the corresponding block of two triangles on the
less detailed mesh by gradual enlargement of the two triangles and reduction of the remaining six
triangles into degenerate triangles that are not rasterized. This process produces smooth
transitions with no seams or T-junctions (see Figure 6 and Figure 7).
8
Figure 6 Grid mesh morph steps with morphK values of 0.0, 0.2, 0.4, 0.6, 0.9, and 1.0.
First, the approximate distance between the observer and the vertex is calculated to determine
the amount of morph required. The vertex position used to calculate this distance can be an
approximation as long as the approximating function is consistent over the whole dataset domain.
This is necessary since, in order to prevent seams, the vertices on the node's grid mesh edges
must remain exactly the same as the ones on the neighbouring nodes. In our case, x (latitude)
and y (longitude) components will always be correct as the same function is used to stretch them
to the world space, but z (height) must either be approximated or sampled from the heightmap
using a consistent filter.
The morph value, morphK in the example code, ranging between 0 (no morph, high detail
mesh) and 1 (full morph, four times fewer triangles), is used to gradually move each morph mesh
vertex towards the corresponding no-morph one. A morph vertex is defined as a grid vertex
having one or both of its grid indices (i, j) as an odd number, and its corresponding no-morph
neighbour is the one with coordinates (i – i/2, j – j/2).
Following is the HLSL code used to morph vertices:
01 // morphs input vertex uv from high to low detailed mesh position
02 // - gr