Older rants/news


03/16/2001

Doh! My Tripod homepage got shut down for a 'terms of service' violation. Stay tuned while I try to get this straightened out. (Hopefully my inquiries to Tripod won't go unanswered.)
On a whim, I returned to the solder-bench and tried to fix a recently acquired ATI Rage128 Pro 8MB. Here's some background information: I bought the card used on eBay, mainly as a workplace upgrade. Unfortunately, my office upgrade plans were foiled by the video-card's relatively poor sharpness. Initial attempts to deactivate the RF-filter met with limited success. I stripped the capacitors, then shorted the inductors by soldering over them. Yet the card was still blurry at 1280x1024 85Hz.
Returning to today's escapade and building on yesterday's newfound desoldering skills, I removed the inductors completely by desoldering them. Shaky hands, a long solder tip, and small devices just don't mix. The remodified-board shows evidence of bad soldering, the kind that would proudly earn an 'F' in any community college EE lab. (*pats self on back*) Once again, sound advice which I love lecturing to others, falls on my own two deaf ears.
Despite an apparent increase in entropy on the VGA card's appearance (the space next to the VGA connector), the finished board, in spite of ugly solder joints, shines in high resolution video modes.
I will probably rework the RadeonLE, to desolder the 3 inductors hidden under 3 mounds of poorly applied solder. I've already stripped the last 3 capacitors from the RadeonLE's RF-filter, a modification which failed to yield any enhanced sharpness. Gee, why did I have to 'learn' how to solder with the RadeonLE...oh well
I've also just figured out that listing my FULL name (instead of just 'First Last') prevents most search-engines (including Yahoo's own) from directly locating my pages. From now on, I'll include (Royce Liao) in the title of index.htm. My old friend David Wu tells me MS-Word97 'ruins' webpages made with real webauthoring tools, like Macromedia Dreamweaver or Adobe GoLive. Since I'm using MS-Word97 to edit Word97 originated html documents, I'm guessing these pages can't possibly be further fubar'd.

03/15/2001

I'm mostly done with testing these infernal VGA cards. As my luck would have it, I finish one VGA card, then think of another dozen things I should have tested. Oh well, no market survey is completely accurate, and I'll be sure to include a disclaimer pointing out what test methodology flaws are known to me.
One more card will be included for certain : ATI Radeon LE 32MB DDR. I bought this for myself, I'm done with using the Savage4. Its 64-bit memory bus is simply too slow for 1024x768 32bpp. The RadeonLE's memory bus is 128-bit DDR (@ 150MHz), or more than 4X faster than my Savage4.
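A rough sketch of the arithmetic behind that comparison, in Python. The function name is made up for illustration, and the Savage4 figure assumes single-data-rate memory at 150MHz; a lower Savage4 memory clock would push the ratio past 4X.

```python
def peak_bandwidth_mb_s(bus_bits, clock_mhz, transfers_per_clock=1):
    """Theoretical peak memory bandwidth in MB/s (1 MB = 10**6 bytes)."""
    return bus_bits / 8 * clock_mhz * transfers_per_clock

# RadeonLE: 128-bit bus, DDR moves two transfers per clock.
radeon_le = peak_bandwidth_mb_s(128, 150, transfers_per_clock=2)

# Savage4: 64-bit bus, ASSUMED single-data-rate at 150MHz.
savage4 = peak_bandwidth_mb_s(64, 150)

print(radeon_le, savage4, radeon_le / savage4)
```

These are peak numbers only; real throughput depends on the memory controller and access pattern.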

03/04/2001

Well time has run out on my Geforce2/MX, it needs to be returned to my boss's PC. (I removed it for, ummm, digital calibration and the R/F inductor bypass hack. Quiet.) This poses a personal problem for me, because the testdata I gathered from the MX leads to some potentially controversial statements. For example, the Geforce2/MX probably doesn't support YV12 natively. The memory allocation measurements are just too high on the MX. Well maybe the MX follows the military's policy of securing resources now, and using later (or never.)
I'm almost ready to release a 'raw dump' of my test data. The data will include measurements/observations from the following AGP hardware :
Each card was tested with the following players :
 
Testing so many cards with so many programs took a lot of time. But it smooths out any irregularities in the testdata. And it minimizes the chance of one anomalous data-point corrupting my funny study. Then again, I only tested on 1 primary platform : Celeron-400, Intel i440BX. When possible, I ran several additional tests on a Via MVP3 and AMD K6/2. (The K6/2 was clocked at 233MHz, 300MHz, and 400MHz.) The Via board would not run with the Blade3D, and suffered stability problems with the Savage3D and SiS.
I plan on testing at least one more video card, the SiS6326DVD. If possible, I will also include Intel i810-DC100. Integrated video has become standard equipment on low-end PCs, so it makes sense to benchmark an integrated-video chipset for my roundup.

02/28/2001

Starting a few weeks ago, I got bored with my Trident Blade3D and decided to compare its DVD-playback with my Savage3D and Savage4. I first ran PowerDVD 2.55, which came with my DVDROM drive. Noticing that the DVD-menu highlights and DVD-subtitle text looked blockier than usual, I downloaded and installed WinDVD trial, with the same result. Then I tried a hacked software Cinemaster2000 (for the Elsa Geforce cards), found the character-code 'TDMC' embedded in the cinmst.dll file, and proceeded to figure out which of the many VideoDecoder.PerformanceClass values referenced the Trident hardware. As luck would have it, it was 0x0E, which comes right after the Geforce hardware code (0x0D.)
And yet again, I was rewarded with ugly, opaque menu-highlights and blocky subtitle-text. But at least the Trident's DVD-accelerator performance matched the Savage4. And unlike the Savage3D, the Trident supports DVD-playback in the highest supported desktop resolutions and color-depths.
Intrigued by the paradoxical performance parity and image-quality disparity, I set out to survey DVD-performance on every AGP-video architecture. Well, every card advertised with DVD-acceleration. Over the course of 2 weeks, I wasted lots of after-work free time, just watching the same clips over and over, to get a sense of how these different video-cards compare. I collected measurements from both the Win98SE SystemMonitor (probably a pointless endeavor, as the CPU-meter is unreliable) and the DirectX control panel applet. The applet tells me how much free memory is on the video-card, from which I derive the DVD-overlay's video-memory consumption. So far, there is no correlation between memory-consumption and image-quality/performance, but a few cards allocate lots more memory than expected. For example, my Geforce2/MX allocated anywhere from 19MB to 26MB of video-memory, depending on the software player. The Trident and ATI Rage Pro are much more economical, with 2030KB.
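The derivation is just a subtraction of two free-memory readings. Here is a minimal Python sketch. The function names and the sample readings are hypothetical, and it assumes a KB is 1024 bytes and that a 720x480 frame is stored as 12 bits per pixel (planar 4:2:0).

```python
def overlay_consumption_kb(free_before_kb, free_after_kb):
    """Video memory taken by the player, from two free-memory readings."""
    return free_before_kb - free_after_kb

def frames_equivalent(consumed_kb, width=720, height=480, bits_per_pixel=12):
    """Express a consumption figure as a count of full video frames."""
    frame_kb = width * height * bits_per_pixel / 8 / 1024
    return consumed_kb / frame_kb

# Hypothetical readings: free video memory before and during playback.
consumed = overlay_consumption_kb(8192, 6162)
print(consumed, frames_equivalent(consumed))

# The 19MB low end of the Geforce2/MX range, on the same scale.
print(frames_equivalent(19 * 1024))
```

On those assumptions, 2030KB works out to roughly four frames' worth of surfaces, while 19MB is well over thirty.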
In watching the same video over and over, I began to see things I never noticed before. For example, many video cards which offer 'smooth filtering' on zoomed video, actually do a lazy job and only filter part of the video. The Trident and SiS cards only filter luminance (intensity) pixel data, and simply replicate chrominance values as needed. Overlay-Zoom behavior was most complex with the Savage4 card: RGB overlays are fully interpolated in X/Y direction. For YUV overlays, luminance data is fully interpolated, but color-data is interpolated only in the X-direction. Videos with bright clashing color patterns have a jagged appearance around the different color boundaries. But paradoxically, the Savage4's DVD-overlay was near perfect, with no evidence of bad filtering.
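To make the difference concrete, here is a one-dimensional Python sketch of the two zoom behaviors: interpolation (what the luminance data gets) versus plain replication (what the chrominance data gets on the lazier cards). It is only an illustration with made-up function names, not a model of any particular chip's scaler.

```python
def zoom_replicate(samples, factor):
    """Nearest-neighbor zoom: each sample is simply repeated."""
    return [s for s in samples for _ in range(factor)]

def zoom_linear(samples, factor):
    """Linear-interpolated zoom: new samples blend adjacent originals."""
    out = []
    last = len(samples) - 1
    for i, a in enumerate(samples):
        b = samples[min(i + 1, last)]  # clamp at the right edge
        for k in range(factor):
            out.append(a + (b - a) * k / factor)
    return out

edge = [0, 100]  # a sharp transition between two values
print(zoom_replicate(edge, 2))  # hard step stays a hard step
print(zoom_linear(edge, 2))     # step gets an in-between value
```

With replication, a sharp color boundary stays a hard step at the zoomed size, which is where the jagged look comes from.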
I plan to write up all these findings in an article which will be called 'the low-end PC-DVD roundup.'

02/02/2001

It's been a while since I updated my webpage.
One thing has been bothering me this past month, partly because I am to blame. Last month I rambled about modifying my Savage4's RF-filter, to improve its 2D-sharpness.
Apparently, a lot of people aren't happy with their video card's sharpness, either, because many people have emailed me for more pictures and more detailed instructions on the modification procedure. I appreciate the interest, but these requests put me in a bind. First of all, the typical user should NEVER EVER physically alter any hardware in his system. At best, you'll merely void your warranty; at worst you could destroy the hardware in question, and quite possibly any attached hardware (like the rest of your PC.) Second of all, the nature of these requests has been tutorial, i.e. 'step by step' instructions for identifying the RF filter, identifying the culprit components, removing them, etc. It is my opinion that the provided link (http://www.geocities.com/porotuner) offers adequate instructions.
If you need more information, then you probably should not perform the modification, period. RF-filters vary from VGA board to VGA board. Part of being 'qualified' to modify your board means a bit of independent detective work, like tracking down the correct components, knowing how to solder (don't do what I did, don't tear components from the PC-board. This risks pulling out a board trace.) This is work YOU need to do, because every card is different. (*I* am not qualified to answer these questions for YOUR board. Only the board engineer who did the PCB layout can reliably answer these questions. So please do NOT send me JPEGs of your video card.)
Don't practice classification of surface mount components or your soldering skills on expensive hardware! Don't practice any of this stuff on your PC, especially if it's your only one and you need it.
(Saying this makes me feel like a hypocrite. I practiced on PC stuff and destroyed a few items in the process. That's how I learn. I always understood and accepted the potential consequences of my 'experiments.')

12/21/2000

Yesterday I forgot to write one important note. A few months back I came across a usenet discussion where a poster claimed he had 'fixed' the Geforce's sharpness problems. Intrigued, I visited the posted link (http://www.geocities.com/porotuner). The article covered, in depth, what I had previously known about VGA RF-filters. It also went further, by offering a circuit schematic of the filter. To summarize, the complete filter-circuit uses both capacitors and inductors. I dealt with the inductors on my other VGA cards, but I ignored the capacitors. The relative amount of attenuation contributed by each component is dependent on the filter's precise design.
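For a feel of why both component types matter, here is a Python sketch of the textbook cutoff frequency of an ideal LC low-pass, f = 1 / (2*pi*sqrt(L*C)). The component values are purely hypothetical (nothing was measured on any board), and a real VGA output filter with its 75-ohm termination won't follow this idealized formula exactly.

```python
import math

def lc_cutoff_hz(inductance_h, capacitance_f):
    """Cutoff (resonant) frequency of an ideal LC low-pass filter."""
    return 1.0 / (2 * math.pi * math.sqrt(inductance_h * capacitance_f))

# HYPOTHETICAL values for one color line: 100nH inductor, 22pF capacitor.
stock = lc_cutoff_hz(100e-9, 22e-12)

# Same inductor, capacitor removed: only an ASSUMED ~2pF of stray
# capacitance is left on the line.
stripped = lc_cutoff_hz(100e-9, 2e-12)

print(stock / 1e6, stripped / 1e6)  # both in MHz
```

Under these made-up values, cutting the capacitance raises the cutoff several times over, which is the general direction of the capacitor-stripping trick.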
Today, I had other soldering stuff to do for work, so I gathered the nerve to do one last soldering job, this time on the Savage4. Soldering the three bypass wires proved just as difficult as I expected. I have very bad soldering skills; it took me nearly 20 minutes to perform what ought to be a 2-minute job for an average technician. But the results were disappointing. While I did not damage my card as I had feared (thank goodness), the modification had no effect!
Here's where my 'plan B' came in. Untouched from the initial job, the capacitors were now the focus. But without special tools (or requisite skill), I could not desolder anything. Instead, I used a pair of needle-nose pliers and simply crushed/twisted/tore the capacitors from the board. The RF-filter had a total of 6 capacitors, conveniently located next to the three inductors.
I snapped off three capacitors (one each from R, G, B signal lines.) Then I reinstalled and rebooted to check the board. Stripping the capacitors was a good call; the Windows desktop @ <1024x768 85Hz> was somewhat sharper. Finally, I removed the remaining 3 capacitors. Once again, the display improved in clarity. While I wouldn't call the end result 'razor sharp', it was actually very good - on par with an old PCI Stealth II S220 (Rendition V2100.) Now, the Trident Blade3D, Stealth II, and Stealth III (Savage4) had comparable 2D-image quality. The Savage3D stands out better, though. And if I snapped off the capacitors from the Savage3D, it might look even better!

12/20/2000

I've had a cheap Savage4 16MB AGP for several months now, and I've always been bothered by its poor 2D sharpness. Contrary to popular belief, the blurry quality of generic VGA cards has little to do with the VGA chip itself. 200MHz+ RAMDACs have been standard for some time now, even in bargain-bin VGA chips (SiS6326, TNT2/M64, Blade3D, Savage3D, etc.) The real culprit is the RF-filter found on every VGA card. The filter is designed to reduce radio emissions down to FCC compliant levels. In properly designed VGA boards, the RF-filter achieves this without sacrificing much (if any) VGA signal bandwidth. The "bargain" video cards (which I have plenty of) use cheaper components, and owing to component tolerance, attenuate too much VGA signal.
To support my claim, here's some relevant experience. More than a year ago, I performed a little modification on my Savage3D board, to disable the VGA RF-filter. The modification focused on three barrel inductors (solder-through mounted), which suppress high-frequencies on the R, G, B signal lines. As the board had more than 20 inductors, identifying the correct inductors was task #1. This proved easy; 3 inductors were arranged in parallel, next to the VGA connector. I shorted out each inductor by attaching a metal wire to both terminals of each inductor. (If I ever wanted to, I could undo the modification by yanking the 3 shorting wires.)
With the RF filter disabled, I reinstalled the Savage3D, flipped the power switch, and crossed my fingers. 1024x768 85Hz, before a blurry mess. And now? Crystal clear! 1152x864 85Hz, same story. 1280x1024 85Hz, alas this mode is a bit beyond my monitor's reach; it's still blurry due to the CRT element exceeding the pixel size. Finally I used 640x480, to compare 60Hz refresh with 160Hz refresh. The display was noticeably less sharp at 160Hz, but far better than before. In fact, the modified Savage3D's 640x480 160Hz output roughly matched the original Savage3D's 60Hz!
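Higher resolutions and refresh rates stress the filter more because the pixel clock climbs with both. A small Python sketch of that arithmetic follows. The total (visible plus blanking) line and frame sizes are quoted from memory of the VESA standard timings, so treat them as assumptions; the drivers on these cards may have used different timings.

```python
def pixel_clock_mhz(h_total, v_total, refresh_hz):
    """Pixel clock = pixels per line x lines per frame x frames per second."""
    return h_total * v_total * refresh_hz / 1e6

# (h_total, v_total, refresh) per mode, ASSUMED VESA-style timings.
modes = {
    "640x480 60Hz": (800, 525, 60),
    "1024x768 85Hz": (1376, 808, 85),
    "1280x1024 85Hz": (1728, 1072, 85),
}

for name, (h_total, v_total, refresh) in modes.items():
    print(name, round(pixel_clock_mhz(h_total, v_total, refresh), 1), "MHz")
```

The worst case for sharpness is alternating single-pixel on/off detail (like small text), whose fundamental frequency sits at half the pixel clock, so any filter that rolls off too early hits those modes first.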
I then went through my other junk video cards : Trident985AGP, S3/Trio64V+, S3/Virge, and modified them similarly. All cards experienced some degree of improvement, with the Trident card taking the "most improved appearance" award.
Returning to the subject of my blurry Savage4, by now you must be asking, why didn't I already apply the same modification to my Savage4 card? Quite simply, this Savage4 board (like other modern VGA cards) could not be modified in the same way. The RF-filter used tiny surface mount components instead of the "fat" axial inductors. Shorting out a surface mount component requires delicate soldering work. One slip of the soldering iron, and the entire board might be destroyed. I assessed the risk and decided against soldering (and quite possibly ruining) my Savage4.
Now, I've just about had it with my Savage4. It locks up in Deus Ex, Unreal Tournament, and Matlab5 Student Edition. (To be fair, newer device drivers fixed the Matlab issue.) As Homer once said to Carl and Lenny, "I've got a plan. A plan that'll fix YOU good." ("Hey, what did we .. do?") And so, I, too, have a plan for my Savage4, a plan that'll fix it good.

11/05/2000

For the past few weeks, I have been working on a simple VGAtext controller. I'm entering Verilog RTL code into Xilinx Student Edition 2.1 software, and downloading the synthesized netlist to an Xess XS40-010XL+ prototyping-board. So far, I've made steady progress : the CRT controller (scanline state machine) is working. VGAtext uses a character-cell bitmap of 8 x 16. The horizontal cellsize is fixed to 8 (changing this requires substantial modification to the CRT timer and textgrid reader.) The vertical cellsize is programmable, supporting up to 16 scanlines. The current design lacks any host I/O interface, so nothing is programmable at runtime, not even the framebuffer address (critical values are initialized at power-on/reset.) On the target hardware part (XC4010XL-3), the overall design consumes roughly 25% available CLBs and runs at >50MHz.
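For readers unfamiliar with text-mode display logic, here is a Python model of the lookup a character-cell controller performs for each pixel. It is a behavioral sketch, not the Verilog RTL: the function and parameter names are invented, the 80-column grid is an assumption, and the font layout (one byte per glyph scanline, leftmost pixel in the most significant bit) follows the usual VGA BIOS font convention.

```python
CELL_W = 8  # horizontal cell size, fixed at 8 pixels

def text_pixel(x, y, text_grid, font, cell_h=16, columns=80):
    """Return 1 if screen pixel (x, y) is foreground, else 0."""
    col, bit = divmod(x, CELL_W)        # which cell, which pixel within it
    row, glyph_line = divmod(y, cell_h) # which text row, which scanline
    char_code = text_grid[row * columns + col]
    glyph_byte = font[char_code][glyph_line]
    return (glyph_byte >> (7 - bit)) & 1  # MSB is the leftmost pixel

# Tiny demo: a made-up glyph for code 65 with only its top-left pixel lit.
demo_font = {65: [0x80] + [0x00] * 15}
demo_grid = [65] * 80  # one text row filled with that character

print([text_pixel(x, 0, demo_grid, demo_font) for x in range(16)])
```

Because the cell width is a power of two, the divide and modulo by 8 reduce to bit slicing in hardware. That may be one reason a fixed 8-pixel width is convenient, though that is a guess about the design.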
For the actual font data, I simply extracted the bitmap image from my PC's VGA BIOS using a simple Turbo C++ program. The ultimate goal is to interface VGAtext with Jan Gray's XSOC RISC CPU. It would be cool to have everything (CPU, peripherals, display, keyboard input) fit onto a single XC4010XL part!

09/07/2000

All new iDCT package (V1.2), featuring "AP922float." AP922float is a fast, high-precision iDCT, derived from Intel's Application Note AP-922. (Fast is a relative term. Any MMX iDCT implementation will always execute faster than a comparable floating-point iDCT.) Idctpk12 includes three versions of AP922float: a) a standard C-code listing (X87 FPU), which can be configured for 64-bit reference precision, b) an Intel-SSE optimized listing, and c) an AMD 3D-Now optimized listing. To use the Intel-SSE listing 'as is', you need Visual C++ 6.0 with the "Processor Pack Beta" (get it at msdn.microsoft.com) and Visual Studio Service Pack 4. The AMD 3D-Now listing was also developed under Visual C++ 6.0, but does not require Processor Pack Beta (at least I don't think so...it uses AMD's clever macro file 'amd3dx.h'.) The C-code listing should compile without any issues.
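As a point of reference for what any of these listings computes, here is a slow, direct-form 8x8 inverse DCT in Python, using double-precision floats. This is the plain textbook definition, not the AP-922 algorithm and not code from the package; fast iDCTs produce (approximately) the same output with far fewer multiplies.

```python
import math

def idct_8x8(coeffs):
    """Direct-form 2D inverse DCT of an 8x8 block (list of 8 rows)."""
    def c(k):
        # Normalization factor: 1/sqrt(2) for the DC term, 1 otherwise.
        return math.sqrt(0.5) if k == 0 else 1.0

    out = [[0.0] * 8 for _ in range(8)]
    for y in range(8):
        for x in range(8):
            total = 0.0
            for v in range(8):      # vertical frequency
                for u in range(8):  # horizontal frequency
                    total += (c(u) * c(v) * coeffs[v][u]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            out[y][x] = total / 4
    return out

# A block holding only a DC coefficient decodes to a flat block.
dc_only = [[0.0] * 8 for _ in range(8)]
dc_only[0][0] = 8.0
flat = idct_8x8(dc_only)
```

A reference like this is handy as the yardstick when checking a fast integer or SIMD version for precision.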

08/26/2000

I uploaded an updated forward-DCT package. The updates include a full C-code version of AP922 fDCT (in addition to the MMX-assembly listing.) The output of the AP922 fDCTs is now range-clipped (-2048,+2047.) And, after some thought, I removed the "pass/fail" indication from the test-program. Apparently, some people felt the test-program was a valid IEEE-1180 precision benchmark. This is *not* the case, and I apologize for any misunderstanding. The update now prints a disclaimer disavowing any pretense of any relationship with the IEEE-1180 standard.

08/16/2000

After some goofing off with the Trident Blade3D, I have concluded that it just doesn't cut it for 3D-gaming. HalfLife runs *faster* with the software renderer than Direct3D renderer (Blade3D.) In OpenGL, Halflife on the Blade3D crawls. Unreal Tournament fares a bit better. The Direct3D renderer barely edges out software rendering. And of course, the image quality of the Blade3D smokes either software-renderer.
On the plus side, hardware motion compensation seems to work even with AGP disabled. Both my Savage3D and Savage4 cards shut off motion-comp if AGP is disabled.
On the subject of image quality, the Blade3D does exhibit annoying dithering artifacts in most games. Well I think I've wasted more words than warranted on the Blade3D. In conclusion, it belongs in your mother-in-law's Celeron-300, or K6/2-350. Even my friend, Ken, avoids Trident in budget-PC upgrades for his office. Now let me never speak of this Blade3D again.

08/09/2000

Wouldn't you know it, the Trident Blade3D AGP doesn't like my FIC VA-503+ motherboard. So it looks like its permanent home *will* be my Celeron-850. The Stealth III S540's blurring at 1024x768 85Hz was the ultimate deciding factor.
After installing the Microsoft Platform SDK (April 2000), I used the directx.cpl tool to examine the Trident's Directdraw capabilities. According to the tool, system->video transfers are not accelerated (i.e. no bus-mastering.) This means most (non-AGP) AVI/MPEG players won't get hardware-assisted BITBLTs from main memory to the card. Bus-mastering is such a basic feature nowadays, I wonder whether this is just a peculiarity in the driver. I'd hate to see how poorly the Blade3D fares in PCI format, especially with Direct3D games using so many textures these days.
Theoretically an AVI/MPEG player could utilize AGP-functionality to create a 'non-local video memory' structure in system RAM. Then the Trident's AGP-DMA hardware would copy the system-RAM contents to video RAM. But very few players work like this. Though perhaps certain SoftDVD players might use this technique to squeeze extra performance. That would help explain why some motion-comp capable chipsets (S3, ATI) don't support hardware motion-comp in PCI form.

08/06/2000

Ok, I got sick of trying to use Word97 as a webpage editor. Now I'm using Frontpage Express, which is the free version of Microsoft's Frontpage. Every time I go to the computer swapmeet, I return home with some piece of garbageware. Most of the time, the garbageware goes straight into the garb...err closet, where it remains for eternity. But once in a while I get lucky and end up with a perfectly usable item. And by buying generic, I save $$$, too.
Well, today I went slumming for more garbageware. My research turned up the Trident-9880 8mb AGP. This low-cost VGA card could very well complete the shelf of garbageware inside my closet, so why not? Just a few months ago, nearly every other vendor stocked Blade3D cards, alongside the SiS6326AGP and Trio3D/2X AGP. But today the Blade3D was nowhere to be found. Nowhere, until I checked one last vendor. He had a display-box for the Jaton 107AGP, which is an AGP Blade3D configured with 8MB SDRAM. Oddly enough, convincing the vendor to sell me his last card proved to be a chore.
You'd expect vendors to fulfill any and all purchase requests, whether big or small. But this guy had a real attitude. With no more 107's in inventory, he told me to pick another card. I didn't want the SiS6326 or Trident 9750 4mb, knowing the Blade3D "blew away" its $20 competitors (so to speak.) Therefore I pressed for him to check the display-box unit. "Why do you want this card? Why don't you buy a real video card?" $!@^#! I felt like giving up and going home, but I wanted the card, and this was the last vendor.
So I defended my choice by claiming the superiority of its "special feature"...long pause...look of disbelief...then says he "...which is?" We both laughed. I guess that was pretty funny, Tridents with 'special features.' Yet he was humored enough to check the display-box : The contents? One Trident9880 with 'special feature.'
So what is this 'special feature'? "It's a secret...shadddyup!" Just kidding. There is no feature in particular that qualifies as 'special.' It's the *set* of features: hardware motion-comp, 32bpp 3D rendering, YUV planar acceleration, and dual simultaneous Directdraw overlay support. This featureset deserves the special distinction : "best of the rest : under $30 mediocre video card." I wanted it for the YUV planar acceleration.
My other primary display cards, an Atrend S3 Savage3D and a Diamond Stealth III S540 16mb OEM (Savage4 Pro+), both have hardware motion-compensation, but no YUV planar acceleration. This is rather odd, because consumer MPEGs are *always* the lowest color profile, 4:2:0. Therefore, MPEG-video acceleration (hardware motion-compensation) makes the most sense when used in conjunction with planar-YUV. The S3 Savage3D/4 support packed YUV (YUY2), which is equivalent to the color-profile 4:2:2. More cards support packed-YUV than planar-YUV, but every other card capable of motion-compensation additionally supports planar-YUV. Theoretically, the S3 could handle the display of higher quality MPEG-2 files, but due to overlay limitations the S3 has a hard enough time just accelerating 720x480 DVD-video overlays. And since 4:2:2 overlays occupy more space, they inflate video RAM requirements.
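The space difference is easy to put numbers on. A short Python sketch follows, using the standard per-pixel costs of the two FOURCC formats (YV12 averages 12 bits per pixel, YUY2 averages 16). The function name is made up, and real drivers may pad surfaces for alignment, so actual allocations can be larger.

```python
# Average bits per pixel for each overlay format.
BITS_PER_PIXEL = {
    "YV12": 12,  # planar 4:2:0, chroma subsampled 2x in both directions
    "YUY2": 16,  # packed 4:2:2, chroma subsampled 2x horizontally only
}

def overlay_bytes(width, height, fmt):
    """Unpadded size in bytes of one overlay surface."""
    return width * height * BITS_PER_PIXEL[fmt] // 8

yv12 = overlay_bytes(720, 480, "YV12")
yuy2 = overlay_bytes(720, 480, "YUY2")
print(yv12, yuy2, yuy2 / yv12)
```

So a packed 4:2:2 surface costs one third more memory than the planar 4:2:0 surface for the same DVD frame, before counting however many surfaces the player allocates.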
Back to this new video card, I installed the Trident9880 into my Celeron-850. (Don't worry, it's not the Trident's permanent home! That would be a crime.) First, I ran 3DMark 2000 v1.1, just to see if the Trident's Win95/98 drivers are DirectX7 capable...well the demo completed without incident! Some texture crackling was apparent (probably due to inaccuracy in the z-buffer calculations.) All the direct3d effects were intact.
Speedwise, the Blade3D wasn't exactly 'fast.' It was mediocre when it was introduced in late-1998, and of course it hasn't gotten any faster. Qualitatively, it checks in just a little slower than my Savage3D. The Savage3D has a slight edge in terms of memory : SGRAM clocked @ 110MHz versus the Trident's 100MHz SDRAM clock. In the Windows desktop, 2D performance was roughly the same as the Savage3D. Both cards have 64-bit wide memories, so they lag far behind the 128-bit wide memories of the Voodoo3, RivaTNT, G400, and Rage128. 2D image quality was very good for such a low-cost card. 1024x768 85Hz was crystal clear, even sharper than my Stealth III OEM! At 1280x1024 85Hz, the Blade3D's aging 170MHz RAMDAC blurred text quite a bit, though the Blade3D was marginally better than the Stealth III. The Savage3D is razor sharp at that resolution, thanks to some 'modifications' designed to defeat the board's RF filter.
Overclocking was very easy. Trident supplies a Windows-based 'set_mclk' utility. Unfortunately, it's interactive only, meaning it can't be executed as a batch-util during startup. Amazingly, MCLK worked, too. When I forced MCLK to use the Trident 9750/9850 PLL programmer, I could program the PLL all the way up to 160MHz (unstable in 3D of course.) Fooling around the Windows desktop (2D), I tried different speeds, gradually ramping up higher and higher. At 135MHz (Blade3D turbo), this Trident felt faster than my Savage3D in 2D. At 160MHz, the Trident felt faster than my Savage4, whose memory clock is set to 150MHz. No display-glitches at 160MHz, but Direct3D programs hung. Backing off to 149MHz let me play several hours of Counterstrike and Team Fortress Classic (in Direct3D only.) Even at 150MHz, speed was not up to my Savage3D. I think I can attribute the disparity to the Savage3D's trilinear filtering. The Savage3D can perform trilinear filtering 'for free', with no additional performance penalty, whereas the Blade3D needs an additional pass. Therefore in Halflife, the Savage3D is always faster than the Blade3D. The Savage4's dual texture-unit puts it even further ahead (in OpenGL only, though...Direct3D performance is pretty slow.)
Finally, I tested AVI/MPEG playback, which is the very reason I bought the Blade3D. PowerDVD 2.55 is the only softDVD-player I have, with support for Trident motion-compensation. It's hard to quantify 'performance' with DVD playback, so the best I could do was check for common deficiencies I've seen in other cards. Despite a solid feature-set, Trident Blade3D has some serious deficiencies. First, the Blade3D does not have hardware shrinking. This means shrinking a playback window below 1.0X size, forces the CPU to handle the scaling. It's slow, ugly looking, and just unacceptable for a contemporary video accelerator. ATI, S3, Matrox are all top-notch in hardware shrink/zoom support. NVidia has recently caught up with the Geforce (TNT1/2 had bad overlay scaling.) Second, the zoom quality is worse than my S3 cards. Trident boasts a nice sounding feature, "Trident Edge Recovery", but for YUV-planar overlays (YV12), the edge-recovery only functions on the luminance (Y) data. The color planes are not filtered at all, leaving visible stairstep artifacts at sharp color-transition boundaries. Combine this with a playback window smaller than 1.0X, and the appearance is even worse. And third, the Trident drivers or hardware don't multitask 2D operations well. During playback of an MPEG-1 (352x240) file, I moved other background desktop windows around, burdening the video card with not only the video-playback, but also desktop window redraws. The Savage3D/4 maintains smooth video playback, not skipping a single frame. The Trident chokes badly, dropping frames for even momentary redraws (like closing a window.) With the Trident, scrolling in Netscape/IE effectively froze video-playback. Dragging the mouse pointer across a menu-bar (causing drop-down menus to appear and disappear), likewise stalled video-playback. And finally the fourth and *worst* deficiency, the Trident just doesn't share the system bus well.
I played an MP3 file in the background, then tried to browse the web. Heavy scrolling or 2D-activity caused horrible static. This behavior brings back memories of my old Rendition Verite2x00 PCI. Overclocking the Trident alleviated the problem somewhat, but the fact remains that Trident's drivers are poorly written for a multi-peripheral environment. I can't imagine how much worse the problem would be, for a PCI Trident.
For all these limitations, the Trident still serves as a reliable backup/secondary display card. Linux support is here, and though Win2000 support isn't great (direct3d stuff crashes), 2D-acceleration works reliably.
I've noticed that Avery Lee's VirtualDub includes an AP922 based MMX-IDCT. So if you're looking for a high-speed precise iDCT, check Virtualdub's source code.

07/23/2000

MMX forward DCT based on Intel Application Note AP-922 is available for download. The AP922 forward-DCT exceeds the precision of the conventional 32-bit integer AAN forward-DCT. I've also created a tweaked version, using AMD's pmulhrw instruction. The tweaked version is for AMD CPUs with 3D-Now extensions, and is slightly more accurate than the basic MMX ap922 fdct.
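The accuracy gain from pmulhrw comes down to rounding. MMX's pmulhw keeps the high 16 bits of the 32-bit product (a truncation), while AMD's pmulhrw adds 0x8000 before taking the high 16 bits (a round to nearest). Here is a scalar Python model of one 16-bit lane of each. The sample operands are arbitrary, and this is only an illustration of the instructions' arithmetic, not code from the fDCT package.

```python
def pmulhw(a, b):
    """High 16 bits of the signed product: truncates (rounds toward -inf)."""
    return (a * b) >> 16

def pmulhrw(a, b):
    """Like pmulhw, but adds 0x8000 first: rounds to nearest."""
    return (a * b + 0x8000) >> 16

a, b = 3, 0x3000       # arbitrary operands; b acts as a fixed-point constant
exact = a * b / 65536  # the ideal (unrounded) result
print(exact, pmulhw(a, b), pmulhrw(a, b))
```

Truncation always errs in the same direction, so the error piles up as a bias across the many multiplies in a DCT. Rounding keeps the error centered near zero, which fits the tweaked version coming out slightly more accurate.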
Also, there is a bug in my AP922 MMX IDCT. (Actually, there are several things wrong with it.) If you want the *correct* code, please visit Peter Gubanov's website at http://www.elecard.com/peter

07/16/2000

It has been almost six months since I even bothered to look at my own homepage. (Actually that's not entirely true. Several gentlemen contacted me about problems with downloading files from my Lycos/Tripod homepage. Neither Netscape 4.x nor IE5.x caused problems with Tripod's funky download-redirector, so I resorted to manually emailing the requested file as an attachment.) Soon after the new millennium, I started work at a small company. After staring at a computer screen the whole day, one quickly dissociates the terms "PC" and "recreational use."
If you're here for the 3D-Now implementation of the AAN-FDCT, use the link "MPEG2AVI." Otherwise keep reading and you'll (eventually) wind up at the same place.
Almost a year ago, I unloaded a dysfunctional PC onto my high-school friend Andy. It was one of those vaunted "Super7" setups with an FIC VA-503+ and AMD K6/2-300 (set to run at 350MHz.) Nothing worked right with that system. My AGP S3 Savage3D would randomly lockup during Direct3D/OpenGL applications. Installing a PCI soundcard (Yamaha 724PCI) would hang the system very quickly. The AGP throughput was dog slow (well faster than PCI, but half as fast as a good i440LX/BX motherboard), busmaster IDE support was questionable with CPU-usage spikes during hard disk access, and worst of all, some programs ran slower than my old ASUS T2P4 + P/MMX-250! The setup was cursed, thought I. Even though it would replace my friend's aging Pentium/166, "fairly warned, be thee, says I!" is what I told Andy. (Well a few months later he tells me he's given up on it.) For myself, I dumped the Super7 platform and bought me a Celeron-366. This turned out to be not nearly as good a deal as the legendary Celeron-300A, since my 366 only made it to 458MHz, not 550 as I had hoped.
Then a few months ago I spotted an advertisement in the paper. For $99, an AMD K6/2-500 and FIC VA-503+ v1.2A (my dooomed system had been a v1.1B.) Actually, dealing with my first K6/2's idiosyncrasies had brain-damaged me to the point where I forgot why I bought it in the first place: 3D-Now support. Late last summer, after I had unloaded the junk FIC, I bought an MMX-programming book to speed up MPEG2AVI. I had wanted to try 3D-Now optimizations (although there aren't any opportunities for 3D-Now to help mpeg-video decoding), but I no longer had my K6/2. So, at risk of repeating a past mistake, I ran out and bought the $99 CPU+motherboard combo, not realizing that I needed a spare AT-case, RAM, floppy-drive, monitor, keyboard, mouse, etc. The new equipment sat unused for almost a month, until I remembered a broken AT-case sitting in the closet. (The story behind that tragedy was, well, airport baggage handling. The handling was so rough, some of the metal rivets keeping the case together popped out, leaving behind a somewhat jiggly box.)
Finally the new K6/2 was up and running, this time very stable. (Well as stable as the unreliable Savage3D would allow.) I was so jubilant I even bought a brand new Diamond Supraexpress K56/V90 speakerphone modem for it (there's another story for this later.) Performance-wise, the K6/2-500 felt slower than my Celery-458. While fast enough for software-DVD playback (well the PC-100 SDRAM and PCI-audio helped), other important applications like Half-Life were just sluggish. And now that I've upgraded to a Celeron2-566A (@850MHz) the K6/2 just feels like an old 486, relatively speaking. But it was sufficiently fast for code development. And so began my 3D-Now programming adventures. (yeah right, get a life.)

01/2000

I've always wondered how other people have so much to say, but I have nothing to say. Maybe everyone else leads a more interesting life than I do. Or maybe they aren't stuck in front of a computer every day to the point their brains go numb.
It's been six long months since I graduated from Stanford University, with a degree in Electrical Engineering. When you get out of college, the first thing you hear is "You only use 5% of what you learned." I can see where this statement comes from. After working for a few months, you quickly realize that everything you need to know could have been condensed into a few short courses. But if we weren't forced to take all those general-ed classes, many of those departments might cease to exist. So maybe that's the challenge, putting to good use everything you learn.

