The purpose of this benchmark is to get some performance numbers for tracing rays through hi-res triangle meshes. This test is designed to be helpful in measuring the performance of spatial data structures like kd-trees, which are neccessary for efficient ray/hi-res-mesh intersections.
The test is as follows: Randomly generate a ray origin point on the surface of the sphere of radius 0.2 centered on the point (-0.016840, 0.110154, -0.001537), which happens to be the center of the bunny axis-aligned bounding box. Generate another point randomly on the surface of the same sphere. Form the (infinite) ray starting at the first point and passing through the second point. This is the current test ray. Trace against the model. You must get at least the following information: triangle hit and distance till intersection position. Both of these items of information are essential in any kind of ray-tracing for image generation. Repeat for as many iterations as needed to get an accurate value for the number of rays traced per second.
Right now I'm getting a fraction of about 0.104 rays that are launched that actually hit the bunny.
The code defines a class SphereUnitVecPool that provides fairly fast random points on the unit ball (ball = surface of sphere), using a pool of pre-constructed points. The code includes a random number generator, which is helpful in achieving consistent performance on different platforms.
Here's some sample code to do the testing:
SphereUnitVecPool vecpool;//create pool of random points
Timer testtimer;//start timer
int num_hits = 0;//number of rays that actually hit the model
for(int i=0; i<num_iters; ++i)
{
const float RADIUS = 0.2f;//radius of sphere
///generate ray origin///
const PoolVec3 pool_rayorigin = vecpool.getVec();//get random point on unit ball from pool
Vec3 rayorigin(pool_rayorigin.x, pool_rayorigin.y, pool_rayorigin.z);//convert to usual 3-vector object
rayorigin *= RADIUS;
rayorigin += aabb_center;//aabb_center is (-0.016840, 0.110154, -0.001537)
///generate other point on ray path///
const PoolVec3 pool_rayend = vecpool.getVec();//get random point on unit ball from pool
Vec3 rayend(pool_rayend.x, pool_rayend.y, pool_rayend.z);//convert to usual 3-vector object
rayend *= RADIUS;
rayend += aabb_center;
//form the ray object
const Ray ray(rayorigin, normalise(rayend - rayorigin));
//do the trace
const float dist = geom->getDistanceUntilHit(ray);
if(dist >= 0.0f)//if hit the model
num_hits++;//count the hit.
}
const double timetaken = testtimer.getSecondsElapsed();
const double fraction_hit = (double)num_hits / (double)NUM_ITERS;
First off, optimisations can be made to accelerate the tracing of coherent rays. Coherent rays are rays in which some characteristics of the rays are the same or similar. For example, primary rays traced from a camera are coherent in the sense that their origins are identical or near to each other. See for example [1] which describes a scheme to trace 4 coherent rays at once using SIMD (single instruction, multiple data) instructions.
There are also different types of ray-shooting queries. The most general is a query which returns all intersections along the ray path. This type of query is useful for constructive solid geometry (CSG) operations. The second type of query is the type that this benchmark measures, a query which returns only the first hit along the ray path. However there is still the choice of exactly what information to return, among the possibilities of triangle hit, distance to collision point, normal of triangle hit, texture UV coords, etc. In this benchmark only a pointer to the triangle hit and the distance to the collision point is required. A third type of ray-shooting query exists, one which returns only a boolean value indicating if any triangle was hit. This type of query is useful for testing for shadowing at a point, and can be optimised quite a lot over the more general queries.
Thierry Berger-Perrin reports preliminary results of 1.67M rays/sec on a 2GHz Opteron, using bounded volume hierarchies (BVH).
Samuel Rødal reports results of 846K rays/sec using a kd-tree, and 276K rays/sec using an uniform grid
structure. Machine specs:
Pentium 4 2.533 GHz, 512K cache, 533 MHz FSB
512 MB DDR-333
Iñigo Quilez reports results of 203K rays/sec with an Octree, and 445K rays/sec with a kd-tree implementation, on a 1.1Ghz Pentium 4.
tepán Hrbek reports results of 1.51M rays/sec on Athlon64 2800+ (1.8GHz) with his improved KD tree.
[2] Interactive Rendering with Coherent Ray Tracing, Ingo Wald, Philipp Slusallek, Carsten Benthin, and Markus Wagner, EuroGraphics 2001.
--Nick Chapman, nickamy@paradise.net.nz, 15 Nov 2004
Big thanks to Thierry Berger-Perrin for code, comments, and concepts.
14th Sep 2005: Added new results from tepán Hrbek
5 March 2005: Added new results from Samuel Rødal
8 Dec 2004: more results, ref to Wald's paper.23 Nov 2004: Added new SphereUnitVecPool code, added section on benchmark limitations.