hpdz.net

High-Precision Deep Zoom

Technical Info - Old Main Computers

First Dedicated Fractal System - Core2 Quad

Pictures

Core2 quad fractal rendering system

Here's a shot of the unit sitting on my table.

Yes, it has a clear side panel with a fan that has multi-colored LEDs in it, and yes, the front panel also has LEDs. It's marketed as a "gaming" PC, and there was no easy alternative case option. This is actually kind of cool. It's growing on me. All future animations will be cranked out by this thing.

You can see the old Dell Precision 360 in the background, looking kind of sad and feeling a little intimidated.

Core2 quad fractal rendering system

This is the interior.

Pretty sparse, but that's what I wanted. All I need is CPU power. That's a modest 160 GB SATA hard drive there. There's also a DVD player at the top that you can just barely see.

Core2 quad fractal software screen shot

This is a screen shot of one of my benchmark tests. (I should have used PrtScr, but I had the camera in my hand and I wasn't thinking).

This is a rendering of the set in a 4x4 box centered at (0,0) with an escape count of 1000, using 384-bit high-precision arithmetic, or 12 DWORDs (see the lower-right corner), and with the Distance Estimator drawing method. Of course, the precision is set that high only for benchmark purposes; you could easily render this image with standard math precision.
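For readers unfamiliar with the terms, here is a minimal Python sketch of an exterior distance estimate, assuming "the set" is the Mandelbrot set. It uses ordinary double-precision complex numbers rather than the 384-bit arithmetic described above, and it is not the program's actual code: the function names and the bailout radius are assumptions chosen for illustration. The small helper also checks that 384 bits corresponds to 12 32-bit DWORDs.

```python
import math


def dwords_for_bits(bits, word_bits=32):
    """Number of word_bits-wide words needed to hold `bits` bits (rounded up)."""
    return -(-bits // word_bits)


def distance_estimate(c, max_iter=1000, bailout=1e6):
    """Rough exterior distance from the point c to the Mandelbrot set.

    Iterates z -> z*z + c together with the derivative dz/dc.
    Returns 0.0 if the orbit does not escape within max_iter iterations
    (the point is then treated as being inside the set).
    The bailout radius is an illustrative choice, not a value from the text.
    """
    z = 0j
    dz = 0j  # derivative of z with respect to c
    for _ in range(max_iter):
        dz = 2.0 * z * dz + 1.0  # uses the old z, so update dz first
        z = z * z + c
        r = abs(z)
        if r > bailout:
            return 2.0 * r * math.log(r) / abs(dz)
    return 0.0


if __name__ == "__main__":
    print(dwords_for_bits(384))          # words needed for 384 bits
    print(distance_estimate(0.26 + 0j))  # a point just outside the set
```

A renderer using this method would typically color each pixel by how its distance estimate compares with the pixel spacing, which is what brings out the thin filaments.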

The color palette is not set up, so the image is nearly monochrome.

This was drawn in 13 seconds, compared to about 55-56 seconds on the Precision 360.

Specifications

This is a quad-core Core2 Q6600 system running at 2.4 GHz. It has a 1066 MHz FSB and 2 GB of RAM. CPUID returns Family 6, Model 15, Type 0, Stepping 11. This is a Kentsfield 65 nm model.
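As an aside, the family, model, type and stepping values come out of the processor signature that CPUID leaf 1 returns in EAX. The sketch below decodes such a signature in Python using the standard Intel bit layout. The raw signature values are not quoted in the text: they are packed here from the fields given above, and the function names are made up for illustration.

```python
def decode_cpuid_signature(eax):
    """Split a CPUID leaf-1 EAX signature into its display fields."""
    stepping = eax & 0xF
    base_model = (eax >> 4) & 0xF
    base_family = (eax >> 8) & 0xF
    cpu_type = (eax >> 12) & 0x3
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF

    # The extended family is only added when the base family is 15.
    family = base_family
    if base_family == 0xF:
        family += ext_family

    # The extended model only applies to base families 6 and 15.
    model = base_model
    if base_family in (0x6, 0xF):
        model |= ext_model << 4

    return {"family": family, "model": model,
            "type": cpu_type, "stepping": stepping}


def pack_signature(family, model, cpu_type, stepping):
    """Hypothetical helper: rebuild a signature from small field values.

    Only valid when family and model both fit in 4 bits.
    """
    return (cpu_type << 12) | (family << 8) | (model << 4) | stepping


if __name__ == "__main__":
    core2 = pack_signature(6, 15, 0, 11)
    print(hex(core2), decode_cpuid_signature(core2))
```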

I haven't had time to fully absorb all the detailed specs about the video and audio and such, but it plays even the gigantic raw uncompressed 640x480 30 fps AVI files that my program generates with no hesitation or dropped frames. Centanimus, for example, is 3.1 GB of raw data, and this system plays that AVI file flawlessly. It nearly made me cry. (Tevaris was rendered to five separate 1GB AVI files, so I can't play the whole thing uncompressed start-to-finish).
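To put a number on "gigantic", here is a back-of-the-envelope calculation of the sustained data rate such a file demands. It assumes 24-bit RGB frames (3 bytes per pixel), no audio or container overhead, and 1 GB = 10^9 bytes. None of these details are stated above, so treat the result as an estimate only.

```python
def raw_video_bytes_per_second(width, height, fps, bytes_per_pixel=3):
    """Data rate of uncompressed video; 3 bytes per pixel is an assumption."""
    return width * height * bytes_per_pixel * fps


def playback_seconds(file_bytes, bytes_per_second):
    """How long a raw file of the given size plays at the given data rate."""
    return file_bytes / bytes_per_second


if __name__ == "__main__":
    rate = raw_video_bytes_per_second(640, 480, 30)
    print(f"data rate: {rate / 1e6:.1f} MB/s")
    # 3.1 GB is the file size quoted above; 1 GB = 1e9 bytes is assumed.
    print(f"3.1 GB lasts about {playback_seconds(3.1e9, rate):.0f} s")
```

Under these assumptions the disk, bus and video card together have to sustain roughly 27.6 MB/s for just under two minutes.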

I wish there were some way to publish this level of quality video to the internet. Perhaps a high-definition DVD....?

The system has multiple audio outputs, including a subwoofer output, but I don't have anything good enough to connect to them right now.

Performance

I have a couple of quick benchmark tests I use to check performance (like after I tweak the math functions, to verify I made them better):

First Test

My previous Pentium-IV system took 55-56 seconds to render that. The new system takes 13-14 seconds.

Second Test

My previous Pentium-IV system could render that in about 310-315 seconds. The new system takes 65 seconds.

Some less precise tests I did compared some frames of a super-deep zoom (to E140!) I'm in the early phases of planning. I'm only rendering small frames at this point, and on the old system a 200x200 image took about 25 minutes. The new system takes only around 6 minutes.

These results are, of course, using all four cores of the CPU fully. The performance gain is always a little over a factor of 4, even though the clock speed is 25% lower (2.4 GHz versus 3.2 GHz). This is consistent with Intel's promises that improved architectural design features in the Core2 processors offset the lower clock speed.
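The arithmetic behind these figures can be checked with a few lines of Python. The timings are the midpoints of the ranges quoted above (an assumption where a range was given), and the function names are illustrative only.

```python
def speedup(old_seconds, new_seconds):
    """How many times faster the new system is on the same job."""
    return old_seconds / new_seconds


def per_core_per_clock_gain(total_speedup, cores, old_ghz, new_ghz):
    """Speedup attributable to each core per clock cycle.

    Divides out the core count and the clock-speed ratio, leaving the
    part that would have to come from the architecture itself.
    """
    return total_speedup / cores / (new_ghz / old_ghz)


if __name__ == "__main__":
    # Midpoints of the quoted ranges; the exact values are assumptions.
    tests = {
        "first test": (55.5, 13.5),
        "second test": (312.5, 65.0),
        "200x200 deep-zoom frame (minutes)": (25.0, 6.0),
    }
    for name, (old, new) in tests.items():
        s = speedup(old, new)
        g = per_core_per_clock_gain(s, cores=4, old_ghz=3.2, new_ghz=2.4)
        print(f"{name}: {s:.2f}x overall, {g:.2f}x per core per clock")
```

Since 2.4 GHz is 25% below 3.2 GHz, a fourfold gain across four cores implies each core does roughly a third more work per clock cycle than the P-IV did.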

Interestingly, when I set the program to use only one thread rather than four, a more detailed set of benchmarks (millions of iterations of various operations, precisely timed at all available levels of precision) showed that a single thread performed almost exactly as the 3.2 GHz P-IV did. Yet when all four cores are working, the speed gain is always a little more than a factor of four. I believe this is due to the way the internal caches are shared: if one core triggers a cache miss, the others can keep going as long as they can get what they need from the cache. The core that triggered the miss likely caused the cache to be filled with data that another core will use soon. Just a guess.
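The general shape of such a benchmark, timing many iterations of one operation at each precision level, might look like the Python sketch below. It is only a stand-in: it uses Python's built-in big integers instead of the program's own math routines, runs on a single thread, and the precision levels, iteration count and function names are all assumptions.

```python
import random
import time


def time_multiplies(bits, iterations=20000, seed=1):
    """Average seconds per fixed-width multiply at the given precision."""
    rng = random.Random(seed)
    top_bit = 1 << (bits - 1)
    mask = (1 << bits) - 1
    a = rng.getrandbits(bits) | top_bit  # force full-width operands
    b = rng.getrandbits(bits) | top_bit
    start = time.perf_counter()
    for _ in range(iterations):
        # Truncate to `bits` bits, then keep the top bit set so the
        # operand stays full width on every pass.
        a = ((a * b) & mask) | top_bit
    elapsed = time.perf_counter() - start
    return elapsed / iterations


def run_benchmark(precisions=(64, 128, 256, 384), iterations=20000):
    """Time the multiply at each precision level (bit counts are examples)."""
    return {bits: time_multiplies(bits, iterations) for bits in precisions}


if __name__ == "__main__":
    for bits, seconds in run_benchmark().items():
        print(f"{bits:4d} bits: {seconds * 1e9:8.1f} ns per multiply")
```

Comparing machines this way isolates raw arithmetic throughput from everything else a full render involves, which is why it can disagree slightly with whole-image timings.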

I have recently had the opportunity to test the program's benchmarking on some 2.4GHz Core2 Duo systems and found that they render images exactly half as fast as my current machine (half the cores, makes sense) and achieve the same single-thread throughput as my current machine (again, makes sense, same processor family and same clock speed).

Old Computer System

CPU

I have been using a Dell Precision 360 with a 3.2 GHz Pentium IV Extreme Edition, which also works quite nicely as a space heater. All of the animations on this site prior to April 2008 were rendered with that system. I'm no expert on Pentium taxonomy, but this is the Gallatin core with 2 MB of L3 cache. The CPUID instruction returns Family 15, Model 2, Type 0, Stepping 5. This is not the same thing as the Pentium Extreme Edition (without the IV), which is a later model based on the Pentium D.

HyperThreading

This processor supports HyperThreading. I rarely enable it. I've done speed tests on my high-precision math functions with and without HT. When HyperThreading is enabled, the math functions run a bit slower, anywhere from 0.95 to 1.00 times the speed they run without it. Interestingly, when I configure the rendering engine to use two threads with HyperThreading enabled, it actually runs slower than when it's configured to use only one thread. The time for a standard drawing benchmark I use (500x500 pixels, size=4 centered at (0,0), escape count=1000, 352-bit precision) is 55 seconds with one thread and 67 seconds with two threads. Also interestingly, when HT is enabled, drawing low-magnification images with standard low-precision math (using the native floating-point arithmetic) is much faster, probably because Windows is more efficiently dividing up time between calculating and actually painting the image on the screen. At magnifications around 0.1 to 1.0, the standard test image described previously draws in well under 1 second.

In any event, I will be switching to a 2.4 GHz quad-core Core2 system soon. Very soon (it's here now! See above!)

Prior to August 2004, I was using a 1.0 GHz Pentium III. That system generated many of the still images on this site and is still in active use around the house for other purposes, but it is so slow compared to the P-IV that I don't use it anymore for drawing fractals.

Video

The video subsystem on the machine doesn't affect the rendering process much, but it does affect the quality of playback. My system came with an NVIDIA Quadro NVS 280 SD with an AGP bus. I don't have a very high-end graphics card or anything like that, since the stock video hardware seems to be doing a decent job playing all my files with minimal hesitations. The videos are all just video data, with no 3D graphics commands or anything like what a game would be doing, so as long as the data stream can get into the card fast enough, everything will work well. This is a system-wide issue, involving the main bus speed, memory speed, and hard drive speed, as well as the video card's bandwidth.

Audio

The audio hardware doesn't affect the quality or speed of the animation process much, but it does affect what kind of audio I might choose to add to a video. The audio accompanying the ZeroOne zoom is an example: the cheap speakers that originally came with the computer couldn't play the deep 120 Hz tone at all. I now have Bose Companion2 speakers, which make acceptable sound and were not too expensive. The sound hardware in the computer is just the stock equipment that came with the motherboard, which Device Manager tells me is SoundMAX Integrated Digital Audio by Analog Devices.