Review

ATi Radeon X800 XT PCIe

by John Reynolds

 

Introduction

The old adage, “if it’s not broke, don’t fix it,” comes to mind with ATi's Radeon X800 graphics boards. When ATi released the 9700 Pro in the fall of 2002 using a 150nm process, the world, and more importantly for the company, their competition, was taken by surprise. The R300 architecture, upon which the 9700 Pro was based, was a DX9-compliant part that used floating point precision throughout its rendering pipelines, a design that most of the industry didn’t think possible on the older fabrication process. Yet this is exactly what ATi accomplished, introducing the first DX9 graphics chip months ahead of competing parts, and moreover outperforming the competition once it arrived significantly later. In fact, ATi obviously felt so confident in their new architecture that they have left the core technology essentially unchanged for several refreshes throughout 2003 and for this year’s new generation, the X800 series of products. This has allowed ATi to instead focus engineering resources on a new architecture that will be powering both Microsoft’s Xbox 2 console and the company’s next generation of graphics boards for the PC.

Announced this past May, the Radeon X800 product series consists of the X800 Pro, the X800 XT, and the X800 XT PE (Platinum Edition). Based on the R420 architecture (R423 for PCIe versions), these chips are 160 million transistor parts manufactured using TSMC’s 130nm low-k process and have a 256-bit memory interface that supports DDR, GDDR2, and GDDR3. To diversify the X800 product line so that it can address multiple price points in the market, ATi has created the following board specifications:

 
Specification       X800 Pro     X800 XT      X800 XT PE
Pipelines           12 pipes     16 pipes     16 pipes
Clock speed         475 MHz      500 MHz      520 MHz
Pixel fill-rate     5.7 GP/s     8 GP/s       8.3 GP/s
Memory speed        450 MHz      500 MHz      575 MHz
Memory bandwidth    28.8 GB/s    32 GB/s      36.8 GB/s
MSRP                $400         $450         $500

The R300 was the first graphics chip on the market to feature eight pixel pipelines, and ATi has continued this parallelism by doubling the number of pipes for the R420 and increasing the vertex engine from four to six units. Yet while there are chip and memory speed variations among the X800 lineup, the most noteworthy difference is the X800 Pro’s 12 pipelines as opposed to the other boards’ 16 pipes. Essentially the same chip as the XT and XT PE but with one block, or quad, of pipelines disabled, the Pro’s configuration results in a significantly lower fill-rate. The boards themselves offer ATi's typical output options, with single VGA and DVI connections and an S-video port. And thanks to the use of TSMC’s low-k process, the X800s are single slot boards that can be installed in small form factor PCs and cooled using a copper heatsink and a fairly quiet fan (an on-die thermal probe monitors the chip’s temperature and adjusts the fan’s speed accordingly). In fact, the 256 MB of GDDR3 memory that ATi uses for these new cards does not require active cooling, so the heatsink doesn’t actually make contact with the RAM modules (four each on the front and back) on the board.
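The fill-rate and bandwidth figures in the specification table can be reproduced from the pipeline counts, clock speeds, and 256-bit memory interface. The following is a minimal sketch, assuming one pixel per pipeline per clock and double-data-rate memory (two transfers per clock); the function names are illustrative and not taken from any ATi tool.

```python
def pixel_fill_rate_gpixels(pipelines: int, core_mhz: float) -> float:
    """Theoretical fill-rate in gigapixels/s, assuming one pixel per pipe per clock."""
    return pipelines * core_mhz / 1000.0


def memory_bandwidth_gbytes(bus_bits: int, mem_mhz: float, transfers_per_clock: int = 2) -> float:
    """Theoretical bandwidth in GB/s; two transfers per clock is the DDR assumption."""
    return (bus_bits / 8) * mem_mhz * transfers_per_clock / 1000.0


# (pipelines, core clock in MHz, memory clock in MHz) from the specification table
boards = {
    "X800 Pro": (12, 475, 450),
    "X800 XT": (16, 500, 500),
    "X800 XT PE": (16, 520, 575),
}

for name, (pipes, core_mhz, mem_mhz) in boards.items():
    fill = pixel_fill_rate_gpixels(pipes, core_mhz)
    bandwidth = memory_bandwidth_gbytes(256, mem_mhz)
    print(f"{name}: {fill:.2f} GP/s, {bandwidth:.1f} GB/s")
```

Under these assumptions the X800 Pro's disabled quad leaves it with roughly 71% of the X800 XT's theoretical fill-rate (5.7 versus 8 GP/s).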

X800 Architecture Overview

As mentioned above, the Radeon X800 XT is a 16 pipeline configuration that also includes six vertex units, identical to the GeForce 6800 GT PCIe board SimHQ reviewed earlier this year. The following is a list of the features the X800 boards support:

Smartshader HD

Smoothvision HD

3Dc

Hyper Z HD

Videoshader HD

Display Features


While the pixel shader architecture of the R420 is nearly identical to that of the R300, the newer chip has been modified to improve its rendering capabilities. The number of registers and the instruction limits have both been increased, which should help performance and prevent developers from running into issues with instruction lengths while developing current and near-future engines. The X800s, however, are still Shader Model 2.0 parts and, as such, do not support features like dynamic branching and flow control, though it’s worth noting that such features are used to aid performance and not to create visual effects otherwise impossible to render without SM 3.0 support. And as with the R300 architecture, the X800s use 24-bit floating point precision throughout their rendering pipelines.

One new feature found in the X800 parts that ATi has been heavily touting is 3Dc, a compression tool for normal maps. As game developers strive to create more detailed and realistic looking environments, one of the key means of doing so has traditionally been through higher levels of geometry for game models and environments. However, even greater levels of geometry with multiple texture layers applied can still appear rather flat and unrealistic, and the level of geometry required to avoid this appearance would be beyond the means of graphics boards engineered for the consumer market, which have traditionally been designed with more of an emphasis on pixel rather than geometry rendering (in contrast to professional graphics workstations). Because of this situation, developers have begun relying on a technique known as normal mapping, which uses a special texture map that stores information dealing with how in-game light interacts with the rendered surface. This creates an illusion of greater detail than what’s actually present while taking advantage of consumer graphics boards’ emphasis on pixel processing.
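To make the idea of a normal map more concrete, the sketch below shades two texels with a simple Lambertian (N dot L) diffuse term. It is only an illustration: the RGB encoding of the normal, the sample texel values, and the lighting model are assumptions chosen for clarity, not a description of how any particular game or the X800 hardware performs the calculation.

```python
import math


def decode_normal(r: int, g: int, b: int) -> tuple[float, float, float]:
    """Map 8-bit RGB texel values (0..255) to a unit-length surface normal."""
    x, y, z = (c / 255.0 * 2.0 - 1.0 for c in (r, g, b))
    length = math.sqrt(x * x + y * y + z * z)
    return (x / length, y / length, z / length)


def diffuse_intensity(normal, light_dir) -> float:
    """Lambertian term: N dot L clamped at zero, both vectors assumed unit length."""
    n_dot_l = sum(n * l for n, l in zip(normal, light_dir))
    return max(0.0, n_dot_l)


light = (0.0, 0.0, 1.0)                # hypothetical light pointing straight at the surface
flat = decode_normal(128, 128, 255)    # texel describing an unperturbed surface
tilted = decode_normal(200, 128, 200)  # texel describing a surface leaning away

print(f"flat texel:   {diffuse_intensity(flat, light):.3f}")
print(f"tilted texel: {diffuse_intensity(tilted, light):.3f}")
```

The polygon underneath is identical in both cases; only the stored normal changes, so the tilted texel is lit less brightly and the surface appears to have relief that the geometry does not contain.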

However, for a normal map to create a model or object that is substantially more realistic looking than traditional rendering techniques allow, the level of geometry and texture detail required to generate the normal map increases the memory space needed to load the map into a graphics board’s frame buffer, which is obviously finite in space and required for other uses. 3Dc, a 4:1 compression algorithm for normal maps that’s based on the DXT5 mode of DirectX texture compression (DXTC), is what ATi hopes will enable developers to make greater use of normal mapping by allowing them either to create more detailed normal maps or to save memory space and bandwidth by applying it to non-compressed normal maps. In fact, according to ATi, numerous game developers have indicated that they plan to include 3Dc support in future games.
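As a rough illustration of what a 4:1 ratio means for memory use, the sketch below compares the footprint of an uncompressed normal map against the same map at 4:1. The 2048x2048 size and the 4 bytes per texel for the uncompressed map are hypothetical values chosen for the example, and mipmaps are ignored.

```python
def texture_bytes(width: int, height: int, bytes_per_texel: int) -> int:
    """Size of a single texture level in bytes (no mipmaps)."""
    return width * height * bytes_per_texel


COMPRESSION_RATIO = 4   # the 4:1 figure quoted for 3Dc
BYTES_PER_TEXEL = 4     # assumption: uncompressed map stored as 32-bit texels
WIDTH = HEIGHT = 2048   # hypothetical normal map dimensions

raw = texture_bytes(WIDTH, HEIGHT, BYTES_PER_TEXEL)
compressed = raw // COMPRESSION_RATIO

print(f"uncompressed: {raw / 2**20:.0f} MiB")
print(f"at 4:1:       {compressed / 2**20:.0f} MiB")
```

Seen the other way around, the memory one such uncompressed map would occupy could hold four compressed maps of the same size, or a single compressed map with twice the resolution in each dimension.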

Test System Setup

The benchmark suite that will be used to evaluate this test system is listed here. Again, unless specified otherwise all games are configured to their highest settings, and 32-bit color and trilinear texture filtering are the default baseline during testing. Also, Windows XP is configured to have Automatic Update, System Restore, and all unnecessary startup services disabled. Fraps 2.3.2 is used to record performance scores unless otherwise noted.

In addition to the latest Catalyst driver release from ATi, the Catalyst Control Center (CCC), the new .NET Framework-based replacement for the traditional control panel, was also installed and used to disable ATi's new A.I. optimizations throughout testing. The X800 series of graphics chips use a more aggressive filtering method than their predecessors, and the Catalyst driver suite now includes title-specific optimizations and shader replacements, all of which are of course intended to accelerate performance in targeted games. In all fairness, however, some of the application detections are designed to avoid known bugs or issues, such as disabling anti-aliasing for the Splinter Cell titles since the feature is incompatible with these games. Either way, ATi allows the end user to toggle A.I. settings via the CCC, and care was taken to ensure that A.I. was disabled throughout testing.

Catalyst Control Center

The X800 XT reviewed today is a native PCI Express board that can take advantage of the full up- and downstream bandwidth of the new bus. It draws power from the test system’s PSU via a 6-pin connector on the back of the board.

Benchmark Scores

The test option of High Quality represents scores with both 4x AA and 8x AF enabled.

Lock On: Modern Air Combat was tested using the MiG-29 Intercept demo. In-game settings were at their highest options, except for several features such as water and heat blur, which were set to low and turned off, respectively. The demo was run for three minutes and scores were recorded using Fraps.

LOMAC

LOMAC scales well with graphics options and resolution changes, displaying a roughly 20% frame rate loss from 4x AA and slightly over 30% from 8x AF; neither feature allows for playable frame rates at 1600x1200, however. With both features combined, the simulation is barely playable at 1024x768.

Microsoft’s Flight Simulator 2004 was tested using SimHQ’s in-house dusk flight over the city of Hong Kong, with an external camera view set behind the plane. Frame rate recording is stopped once the plane lands. FS2004 was configured with ultra high settings across its four hardware panels.

FS2004

Another simulation that scales with changes to the graphics sub-system, FS2004 incurs a much sharper performance hit from anisotropic filtering than from anti-aliasing in SimHQ’s test flight. 4x AA drops the frame rate by 20-30% across the tested resolutions, while 8x AF performance is closer to 40-50% lower than the baseline scores. And high quality isn’t a viable option at any of the tested resolutions with the title’s in-game options configured with ultra high settings, except for perhaps 1024x768.

IL-2: Forgotten Battles - Aces Expansion Pack represents SimHQ’s non-modern flight simulation test. Using OpenGL, the landscape option was set to perfect and all other graphics options were at their highest settings. Testing consisted of using the Black Death track.

IL-2 FB AEP

IL-2: FB AEP takes the smallest performance hit from anti-aliasing among the flight simulations tested for this review, losing less than 10% at 1024x768, roughly 15% at 1280x960, and almost 20% at 1600x1200. As in the past, the simulation incurs a much sharper loss from anisotropic filtering, with frame rates dropping by around 30-50% across the resolutions and leaving the game unplayable at 1600x1200. And as with the above titles, high quality isn’t a viable option at the higher resolutions.

Far Cry benchmark numbers are generated by repeated playing of the Research map, which consists of a good mix of the beach, jungle, and interior settings found throughout the game’s various levels. Fraps is used to record performance as the same path is taken through the map during each test. Both anti-aliasing and anisotropic filtering were enabled via the Catalyst Control Center and all in-game options were configured for their highest settings (water at ultra high).

FarCry

Far Cry is the most graphically demanding game in SimHQ’s benchmark suite, and is the only title tested that makes use of DX9 shaders. As such, it certainly allows the X800 XT to shine, as the above scores indicate, pushing into 3-digit frame rates at 1024x768. With anti-aliasing enabled, the performance loss ranges from roughly 15% to 30%, while the loss from anisotropic filtering is even lower, roughly 5-15%. Even high quality offers playable frame rates at 1600x1200, at which point the game is rendering some of the best visuals currently available for PC gaming.

Developed using id Software’s five-year-old Quake 3 engine, Call of Duty is the second title SimHQ uses to test OpenGL rather than the D3D API. Scores were derived from the Dawnville demo using the in-game timedemo utility to capture performance. The “com_maxfps” console command was also used to lift the default frame rate cap of 85.

Call of Duty

Based on an aging engine, Call of Duty isn’t particularly graphics intensive, though the title still looks good once AA and AF are enabled. Both features incur very similar performance losses, and the game stays in the realm of triple digit frame rates even at 1600x1200 with high quality settings.

NASCAR Racing 2003 Season was tested using SimHQ’s in-house replay, which consists of a crowded Daytona track with the camera view set to Earnhardt’s cockpit. All graphics options were placed at their highest settings.

NR2003

This driving simulation obviously scales far more closely with system processing power than with graphics processing power, essentially locking itself at 35-38 fps across all tested resolutions and settings. And with shadows enabled, NASCAR performs much slower in these tests compared to scores from SimHQ’s CPU articles, which test without shadows.

Last, SimHQ has decided to include scores from Valve’s Video Stress Test, a utility now included with the new Source engine-based version of Counter-Strike. The update of the popular online shooter is currently available only via Valve’s distribution package, Steam, but will be bundled with Half-Life 2 once the game reaches store shelves. The Video Stress Test itself is a fly-by of a relatively small custom level and is designed to show off numerous graphical effects rendered through the heavy use of various shaders.

Valve's Video Stress Test

It’s interesting to note that the Video Stress Test loses more performance from the fill-rate demands of higher resolutions than it does from either AA or AF. Anti-aliasing costs the test slightly more at higher resolutions than anisotropic filtering, roughly 5-20% across the resolutions. At high quality the frame rate is cut by some 45% across the resolutions, though if the test represents a game’s final performance based on the Source engine, 74 fps at 1600x1200 with 4x AA and 8x AF is hardly a negligible score.
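The percentage losses quoted throughout these benchmarks are relative to the baseline score at the same resolution. As a small sketch of that arithmetic, the functions below compute a percentage loss and, working backwards, the baseline implied by the 74 fps high quality score and the roughly 45% loss quoted above. Because the 45% figure is approximate, the resulting baseline is only an estimate and not a measured score; the function names are illustrative.

```python
def percent_loss(baseline_fps: float, test_fps: float) -> float:
    """Frame rate lost relative to the baseline, as a percentage."""
    return (baseline_fps - test_fps) / baseline_fps * 100.0


def implied_baseline(test_fps: float, loss_percent: float) -> float:
    """Baseline frame rate that would yield test_fps after the given percentage loss."""
    return test_fps / (1.0 - loss_percent / 100.0)


# 74 fps at 1600x1200 with 4x AA and 8x AF, after a loss of roughly 45%
estimate = implied_baseline(74.0, 45.0)
print(f"estimated baseline: {estimate:.1f} fps")
print(f"check: {percent_loss(estimate, 74.0):.1f}% loss")
```

In other words, a 45% loss that still leaves 74 fps implies a baseline somewhere in the neighborhood of 135 fps at that resolution.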

Image Quality

Since the 9700 Pro, ATi's GPUs have supported programmable multi-sampling modes of anti-aliasing. In addition, the X800 does not possess a hardware ROP (render output) limitation, allowing for more than four sub-samples to be used. While the AA modes are programmable, when enabled via the control panel a static, sparse sampling pattern is used. ATi also applies a gamma correction to the samples, which gives the X800’s AA a slight quality advantage with thin lines or along edges with a sharp color contrast. And while the X800 is capable of supporting super-sampling, ATi has chosen not to expose this mode of anti-aliasing for their products. Regardless, using Colourless’ D3D FSAA viewer the sparse sampling pattern of the X800s can be seen below:
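To show why gamma correction matters along high-contrast edges, the sketch below resolves the four sub-samples of a pixel half covered by a white line on a black background in two ways: a plain average, and an average taken in linear light. This illustrates the general idea only; the display gamma of 2.2 and the sample values are assumptions, and it is not a description of how ATi's hardware implements the correction.

```python
def resolve_naive(samples: list[float]) -> float:
    """Average the stored (gamma-encoded) sample values directly."""
    return sum(samples) / len(samples)


def resolve_gamma_correct(samples: list[float], gamma: float = 2.2) -> float:
    """Convert to linear light, average, then re-encode for display."""
    linear = [s ** gamma for s in samples]
    mean_linear = sum(linear) / len(linear)
    return mean_linear ** (1.0 / gamma)


# 4x AA pixel: two sub-samples hit the white line, two hit the black background
edge_samples = [0.0, 0.0, 1.0, 1.0]

print(f"naive resolve:         {resolve_naive(edge_samples):.3f}")
print(f"gamma-correct resolve: {resolve_gamma_correct(edge_samples):.3f}")
```

The plain average stores 0.5, which a display with a gamma of 2.2 shows as noticeably darker than half brightness; the corrected resolve stores a higher value so that the half-covered pixel actually appears half as bright, which is why thin lines and sharp color transitions tend to look smoother.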

4x AA Sampling Pattern

6x AA Sampling Pattern

Temporal AA is a feature that takes advantage of the programmable sampling patterns available in ATi hardware since the R300. By changing the sampling pattern used per frame, temporal AA increases the visual quality of the AA mode enabled to an appreciable degree. The caveat with the feature, however, is that the frame rate must stay high or the changing sampling patterns will become apparent to the human eye, resulting in a flickering along polygon edges that detracts from the AA quality. Enabling temporal AA also enables V-sync. The performance of the various anti-aliasing modes was tested using IL-2: FB AEP's "Black Death" track.
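The mechanism can be sketched in a few lines: the sampling pattern alternates from one frame to the next, so two consecutive frames together cover more sample positions than either frame alone, and V-sync caps the displayed frame rate at the refresh rate. The two 2-sample patterns and the 85 Hz refresh rate below are hypothetical values chosen for illustration, not ATi's actual patterns or the test system's settings.

```python
# Hypothetical sub-sample positions within a pixel, as (x, y) in the range 0..1
PATTERN_A = [(0.25, 0.25), (0.75, 0.75)]
PATTERN_B = [(0.75, 0.25), (0.25, 0.75)]


def pattern_for_frame(frame_index: int) -> list[tuple[float, float]]:
    """Alternate between the two patterns on even and odd frames."""
    return PATTERN_A if frame_index % 2 == 0 else PATTERN_B


def positions_over_two_frames(first_frame: int) -> set[tuple[float, float]]:
    """Distinct sample positions covered by two consecutive frames."""
    return set(pattern_for_frame(first_frame)) | set(pattern_for_frame(first_frame + 1))


def displayed_fps(render_fps: float, refresh_hz: float) -> float:
    """With V-sync enabled, the frame rate cannot exceed the refresh rate."""
    return min(render_fps, refresh_hz)


print(len(pattern_for_frame(0)), "samples per frame")
print(len(positions_over_two_frames(0)), "distinct positions over two frames")
print(displayed_fps(110.0, 85.0), "fps shown with V-sync at a hypothetical 85 Hz")
```

The eye only blends consecutive frames into the denser combined pattern when they arrive quickly enough, which is why a falling frame rate turns the alternation into visible flicker along polygon edges.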

Black Death Track

As we can see above, 6x AA obviously incurs a higher performance impact than 4x AA, though IL-2 remains fairly playable even at 1600x1200 and certainly looks amazing. Temporal AA, however, is really only useful for IL-2 at the lower resolutions when the game is configured with maximum settings, since the lower frame rates at higher resolutions will result in the shifting patterns becoming noticeable. Again, because V-sync is enabled with temporal AA, its scores are lower than the other 4x AA numbers since the frame rate is unable to go above the refresh rate; otherwise, temporal AA has no performance impact beyond regular AA modes. Visually, however, for games that can maintain high frame rates, temporal AA significantly increases the anti-aliasing quality and really has to be experienced hands-on to be truly appreciated.

According to ATi, A.I. examines a game’s textures as they’re loaded and, using an adaptive filtering algorithm, evaluates how best to filter them from a performance perspective without causing a noticeable degradation in filtering quality. The low setting for A.I. enables all title-specific optimizations and the adaptive filtering algorithm, while the high setting enables the same optimizations and increases the aggressiveness of the filtering algorithm. The filtering differences can be seen in these images, created using the Direct3D AF Tester utility:

Direct3D Anisotropic Filtering (AF) Test Results

Panels: A.I. Off (optimizations off); A.I. High (optimizations on)

IL-2: Forgotten Battles - Aces Expansion Pack was used to test the performance impact A.I. has on texture filtering with 8x anisotropic enabled.

Texture Filtering - 8x AF

Image quality differences between A.I. disabled and A.I. set at low are extremely difficult to discern, with a slight increase in texture aliasing in certain titles being the most noticeable effect. Once set to high, however, the more aggressive filtering algorithm results in a perceptible banding in textures similar to what would be expected with bilinear filtering, though not quite as apparent as true bilinear. And as the scores above demonstrate, A.I. has a strong impact on performance, with the low setting increasing test scores by 30-50% over regular 8x AF and the high setting returning the frame rate to numbers almost identical to those without anisotropic filtering enabled. Those desiring the best image quality possible for their games, however, will want to leave A.I. either turned off or set at low, though ATi should be congratulated for allowing users the choice through the CCC.

Overclocking

To see how well the Radeon X800 XT would run above its default clock settings for both the chip and onboard memory, the Catalyst Control Center’s Overdrive option was enabled and IL-2: FB AEP was again run through a variety of tests. SimHQ, however, strongly stresses that the overclocking results of one review board cannot serve as an adequate sampling by which to judge an entire product line, and asks readers to keep this in mind while considering the following benchmarks.

Overclocked

With Overdrive enabled the chip clock speed was increased to a paltry 506 MHz while the memory maintained its default of 500 MHz. With the Platinum Editions clocked a mere 20 MHz faster than the XTs, it’s no surprise that 500+ MHz appears to be the threshold for these chips. As for the memory, the Samsung modules are rated at 2ns and a frequency of 500 MHz, so it’s unsurprising that the onboard RAM would overclock poorly, if at all.
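The figures in this section are easy to check with a little arithmetic: a memory module's rated clock is the inverse of its cycle time, and the overclock is the percentage increase over the default clock. The helper names below are illustrative only.

```python
def rated_mhz_from_cycle_ns(cycle_ns: float) -> float:
    """Rated clock in MHz for a memory module with the given cycle time in ns."""
    return 1000.0 / cycle_ns


def overclock_percent(default_mhz: float, overclocked_mhz: float) -> float:
    """Clock increase over the default, as a percentage."""
    return (overclocked_mhz - default_mhz) / default_mhz * 100.0


print(f"2ns memory rating:    {rated_mhz_from_cycle_ns(2.0):.0f} MHz")
print(f"Overdrive core gain:  {overclock_percent(500.0, 506.0):.1f}%")
print(f"XT PE over XT (core): {overclock_percent(500.0, 520.0):.1f}%")
```

The 2ns modules are therefore already running at their rated 500 MHz, and Overdrive's 506 MHz core clock amounts to a gain of about 1.2%, well short of the 4% core clock advantage the Platinum Edition holds over the XT.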

Gallery

All pictures were taken at 1024x768 using 4x anti-aliasing and 8x anisotropic filtering.

Far Cry (2.25MB)
NASCAR Racing 2003 Season (2.25MB)
Rome: Total War (2.25MB)
Valve's Video Stress Test (2.25MB)

Conclusion

It appears as though ATi's confidence in the core architecture the company released with the Radeon 9700 Pro back in 2002 is a gamble that hasn’t yet backfired. The X800 series of boards has won numerous OEM contracts and been extremely popular with gamers in the retail market, though availability for the XT and PE boards has been poor throughout the summer. The Radeon X800 XT used in this review displayed no issues rendering any of the games in the benchmark suite, nor were there any stability problems with the test system while running the board. And the new Catalyst Control Center is a fairly user-friendly control panel that offers a wide host of configurable options for the end user, though in its current form it is somewhat bloated in terms of the system resources required to run it; hopefully ATi will be able to optimize the package in the near future.

Though we suspect the Radeon X800 XT’s performance is somewhat limited by the 500 MHz GDDR3 memory installed, the board still managed to blaze through many of the games included in SimHQ’s benchmark suite. This hardly comes as a surprise, considering the sheer fill-rate generated by 16 pixel pipelines running at 500 MHz, a high frequency achieved largely due to TSMC’s 130nm low-k process and ATi's choice of employing it for this generation of high-end chips. While the R420 design lacks support for certain features found in Shader Model 3.0, and for higher floating point precision rendering, it’s doubtful a game will be released within an expected lifespan for the X800 boards that significantly eclipses their capabilities. And unique features such as temporal anti-aliasing and 3Dc are worth consideration for those looking to purchase a high-end graphics card. ATi has struck a remarkable balance over the past few generations, successfully leveraging their initial DX9 architecture across several years, largely through the smart use of process technology, while freeing up the engineering resources necessary to work on a future design that has already won one major contract: Microsoft’s upcoming Xbox 2 console. Only time will tell, however, whether or not ATi's future design will likewise function as an architectural vinculum, a foundation able to bridge generations of products as the graphics industry transitions itself toward Longhorn and DirectX 10.

 


Copyright 2008, SimHQ.com. All Rights Reserved.