Ray / Tri-Mesh Intersection Benchmark

The purpose of this benchmark is to get some performance numbers for tracing rays through hi-res triangle meshes. This test is designed to be helpful in measuring the performance of spatial data structures like kd-trees, which are neccessary for efficient ray/hi-res-mesh intersections.

The Stanford Bunny Benchmark

This test uses the 'Stanford Bunny', available here. You might want to use my ply model loading code to load it. The bunny is 69451 triangles.

The test is as follows: Randomly generate a ray origin point on the surface of the sphere of radius 0.2 centered on the point (-0.016840, 0.110154, -0.001537), which happens to be the center of the bunny axis-aligned bounding box. Generate another point randomly on the surface of the same sphere. Form the (infinite) ray starting at the first point and passing through the second point. This is the current test ray. Trace against the model. You must get at least the following information: triangle hit and distance till intersection position. Both of these items of information are essential in any kind of ray-tracing for image generation. Repeat for as many iterations as needed to get an accurate value for the number of rays traced per second.

Right now I'm getting a fraction of about 0.104 rays that are launched that actually hit the bunny.

Generating the rays

With sufficiently efficient ray/mesh intersection code, the time spent generating the random points on the unit sphere required by the benchmark becomes a significant fraction of the total time. This problem can be minimised by requiring that the random points are generated in a particular way, with predictable performance. Therefore, the following code should be used when implementing the benchmark:

sphereunitvecpool.cpp

sphereunitvecpool.h

The code defines a class SphereUnitVecPool that provides fairly fast random points on the unit ball (ball = surface of sphere), using a pool of pre-constructed points. The code includes a random number generator, which is helpful in achieving consistent performance on different platforms.

Sample Code

Here's some sample code to do the testing:


SphereUnitVecPool vecpool;//create pool of random points

Timer testtimer;//start timer
int num_hits = 0;//number of rays that actually hit the model

for(int i=0; i<num_iters; ++i)
{
	const float RADIUS = 0.2f;//radius of sphere

	///generate ray origin///
	const PoolVec3 pool_rayorigin = vecpool.getVec();//get random point on unit ball from pool

	Vec3 rayorigin(pool_rayorigin.x, pool_rayorigin.y, pool_rayorigin.z);//convert to usual 3-vector object
	rayorigin *= RADIUS;
	rayorigin += aabb_center;//aabb_center is (-0.016840, 0.110154, -0.001537)

	///generate other point on ray path///
	const PoolVec3 pool_rayend = vecpool.getVec();//get random point on unit ball from pool
		
	Vec3 rayend(pool_rayend.x, pool_rayend.y, pool_rayend.z);//convert to usual 3-vector object
	rayend *= RADIUS;
	rayend += aabb_center;

	//form the ray object
	const Ray ray(rayorigin, normalise(rayend - rayorigin));

	//do the trace
	const float dist = geom->getDistanceUntilHit(ray);

	if(dist >= 0.0f)//if hit the model
		num_hits++;//count the hit.
}

const double timetaken = testtimer.getSecondsElapsed();

const double fraction_hit = (double)num_hits / (double)NUM_ITERS;

Debugging the benchmark

It can be kinda tricky to make sure that you are computing the benchmark correctly given that it has no visual output. So here are some checks you can make:

Limitations of the benchmark

This benchmark is designed to be simple and as a consequence has a number of limitations:

First off, optimisations can be made to accelerate the tracing of coherent rays. Coherent rays are rays in which some characteristics of the rays are the same or similar. For example, primary rays traced from a camera are coherent in the sense that their origins are identical or near to each other. See for example [1] which describes a scheme to trace 4 coherent rays at once using SIMD (single instruction, multiple data) instructions.

There are also different types of ray-shooting queries. The most general is a query which returns all intersections along the ray path. This type of query is useful for constructive solid geometry (CSG) operations. The second type of query is the type that this benchmark measures, a query which returns only the first hit along the ray path. However there is still the choice of exactly what information to return, among the possibilities of triangle hit, distance to collision point, normal of triangle hit, texture UV coords, etc. In this benchmark only a pointer to the triangle hit and the distance to the collision point is required. A third type of ray-shooting query exists, one which returns only a boolean value indicating if any triangle was hit. This type of query is useful for testing for shadowing at a point, and can be optimised quite a lot over the more general queries.

Results

As of 23 Nov 2004, with my (much better now) kd-tree code, I'm getting about 780K rays/second on a 2.8GHz Pentium 4.

Thierry Berger-Perrin reports preliminary results of 1.67M rays/sec on a 2GHz Opteron, using bounded volume hierarchies (BVH).

Samuel Rødal reports results of 846K rays/sec using a kd-tree, and 276K rays/sec using an uniform grid structure. Machine specs:
Pentium 4 2.533 GHz, 512K cache, 533 MHz FSB 512 MB DDR-333

Iñigo Quilez reports results of 203K rays/sec with an Octree, and 445K rays/sec with a kd-tree implementation, on a 1.1Ghz Pentium 4.

Štepán Hrbek reports results of 1.51M rays/sec on Athlon64 2800+ (1.8GHz) with his improved KD tree.

Other Benchmarks

Standard Procedural Databases
BART: A Benchmark for Animated Ray Tracing

References

[1] A Cross-Platform Framework for Interactive Ray Tracing, Markus Geimer and Stefan Muller.

[2] Interactive Rendering with Coherent Ray Tracing, Ingo Wald, Philipp Slusallek, Carsten Benthin, and Markus Wagner, EuroGraphics 2001.


--Nick Chapman, nickamy@paradise.net.nz, 15 Nov 2004

Big thanks to Thierry Berger-Perrin for code, comments, and concepts.

Changes

14th Sep 2005: Added new results from Štepán Hrbek

5 March 2005: Added new results from Samuel Rødal

8 Dec 2004: more results, ref to Wald's paper.

23 Nov 2004: Added new SphereUnitVecPool code, added section on benchmark limitations.


: Main

: Projekts

: Code

: Articles

: Links