I’ve spilled a whole lot of bytes over the past year on timing effects on the Commodore 64’s display. So far, all of those articles have assumed that sprites are either disabled or operating far away from us. It’s time to drop that assumption, or, more honestly, it’s time to demonstrate why it’s best to enforce that assumption.
Sprite data is loaded as needed by the VIC-II, much as character data is. It works roughly like this: during the period between lines, while the border is being drawn, 16 cycles are set aside for loading sprite data. Each sprite, from 0 to 7, is given 2 specific cycles. If that sprite will be active on the next line, the VIC-II needs to read from RAM during its two cycles.
Things get nonlinear once we factor in that it takes three cycles to take control of the bus from the CPU. That means that any time we transition from a nondisplayed sprite to a displayed sprite, the VIC needs to have spent the previous three cycles taking control. That would mean five cycles per sprite, but if consecutive or nearly consecutive sprites are activated, this cost can drop dramatically. There are three basic cases here, assuming we aren’t also on a badline:
- Just one sprite is active. This means the three previous cycles weren’t reading anything, which means we need to pay the full five cycles. (This also happens if the two sprites previous to us were inactive.)
- The sprite before us was active. The VIC already had control of the bus the previous cycle, so it simply doesn’t have to let go; we only pay two cycles.
- The sprite before us was inactive, but the sprite before that was active. In this case, the VIC had control of the bus three cycles ago, when we would have started taking control again. The VIC-II simply refuses to relinquish control and idles for two cycles, producing a total cost for this sprite of four cycles.
An interesting side effect of the last rule is that it costs exactly the same number of cycles to have sprites 0, 1, and 2 on (5+2+2) as it does to have sprites 0 and 2 on alone (5+4). The only change is that, rather than reading Sprite 1’s data during its cycles, the VIC-II simply holds on to the bus.
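The three cases above can be spelled out as a plain loop. This is only a sketch of the rules as stated here, not anyone's reference implementation: the function name `sprite_cycles` is made up for illustration, and it assumes bit n of the mask corresponds to sprite n and that we are not on a badline.

```python
def sprite_cycles(active):
    """Cycles taken from the CPU for an 8-bit sprite mask (bit n = sprite n).

    Hypothetical helper: a direct transcription of the three cases,
    assuming a non-badline.
    """
    total = 0
    for n in range(8):
        if not (active >> n) & 1:
            continue  # inactive sprites cost nothing by themselves
        prev1 = n >= 1 and (active >> (n - 1)) & 1  # sprite just before us
        prev2 = n >= 2 and (active >> (n - 2)) & 1  # sprite two before us
        if prev1:
            total += 2  # VIC already holds the bus: just the two reads
        elif prev2:
            total += 4  # VIC never let go: two idle cycles plus two reads
        else:
            total += 5  # full three-cycle takeover plus two reads
    return total


print(sprite_cycles(0b00000111))  # sprites 0, 1, 2: 5+2+2
print(sprite_cycles(0b00000101))  # sprites 0 and 2: 5+4
```

Both calls print 9, which is the side effect described above.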
It’s not tremendously difficult to write a program to simulate the timing for this: groepaz’s VIC Timer tool creates dynamic charts based on the 1996 VIC article, for instance, which will also correctly capture cases for cycles where the VIC-II is taking control but the CPU is still allowed to write values. But for rapidly computing values, I must give the nod to Vorn the Unsqueakable’s Python one-liner:
```python
def sprdelay(active): return sum(map(int, list((bin(active)[2:] + '00').replace('100', '5').replace('10', '4').replace('1','2'))))
```
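To see why the string substitutions work, it helps to watch the intermediate strings. The sketch below repeats the one-liner unchanged and adds a helper, `sprdelay_steps`, which is a name invented here for illustration. My reading is that `bin()` writes the most significant bit first, so sprite 7 is leftmost and "the sprite before us" is the character to the right; the appended `'00'` stands in for the nonexistent sprites before sprite 0.

```python
def sprdelay(active): return sum(map(int, list((bin(active)[2:] + '00').replace('100', '5').replace('10', '4').replace('1','2'))))


def sprdelay_steps(active):
    """Return the string after each substitution (illustrative helper)."""
    s = bin(active)[2:] + '00'
    steps = [s]
    # Same order as the one-liner: longest pattern first.
    for pattern, cost in (('100', '5'), ('10', '4'), ('1', '2')):
        s = s.replace(pattern, cost)
        steps.append(s)
    return steps


for mask in (0b00000101, 0b00000111, 0b11111111):
    print(f"{mask:08b}:", " -> ".join(sprdelay_steps(mask)), "=", sprdelay(mask))
```

The order of the replacements matters: `'100'` has to be consumed before `'10'`, and `'10'` before the bare `'1'`, so that each active sprite is charged for the most expensive case that applies to it.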
I’ve created a native test program that demonstrates sprite delays in realtime. Under the hood, this is just a slight modification of my earlier very simple raster stabilizer program. After stabilizing the timing, it changes the background color and then, assuming no sprites are active, restores the original color at the first column of the next line. If sprite preparation eats into the CPU’s allocation of cycles, the restoring write happens later, which visibly extends the color bar. The program shows a scale for measuring this in realtime and also implements a version of the predicted-delay formula so you can check the two against each other.
In writing this, I ran into a number of sort of interesting implementation challenges. Well, interesting if you’re me, anyway.
- If you look at the timing, we actually start getting blocked by the VIC while the previous line is still displaying. That means we need to change the color to the gauge color in the middle of the previous scanline. That’s fine, but it’s ugly; we can cover up the ugliness by filling the right side of that text line with inverted spaces that are the color of the background.
- We’re relying on a stable raster here, and as you may recall, there isn’t enough time to stabilize the raster properly in the general case on PAL when the KERNAL’s still swapped in. I could have swapped the KERNAL out here, but I instead altered the main program so that by the time you’re at a frame boundary and thus ready for an interrupt to happen, you’re in a loop of instructions that are either 3 or 4 cycles long. With that additional constraint added, the stabilization code fits easily within PAL’s space.
- We’re relying on cross-line, strictly accurate cycle counts. That means we need to adapt the system to the model of VIC we’re running on, which, for the stabilizer, at least, means we need three different interrupts.
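As a rough illustration of why the VIC model matters for cycle counting, here is a small sketch of the per-line CPU budget. The line lengths are the commonly documented figures for the three usual VIC-II variants (63 cycles for the PAL 6569, 65 for the NTSC 6567R8, 64 for the older 6567R56A); that these are the three models meant here is my assumption. The names `CYCLES_PER_LINE` and `cpu_cycles_left` are hypothetical, and the subtraction is a simplification that ignores badlines and any writes the CPU can still finish while the VIC-II is taking over the bus.

```python
# Commonly documented cycles per raster line; assumed, not taken from the
# test program itself.
CYCLES_PER_LINE = {
    "6569 (PAL)": 63,
    "6567R8 (NTSC)": 65,
    "6567R56A (old NTSC)": 64,
}


def cpu_cycles_left(cycles_per_line, sprite_cost):
    """Simple-model CPU budget for one non-badline raster line."""
    return cycles_per_line - sprite_cost


# 19 cycles is the cost of all eight sprites under the rules above (5 + 7*2).
for model, cycles in CYCLES_PER_LINE.items():
    print(f"{model}: {cpu_cycles_left(cycles, 19)} cycles left with 8 sprites")
```

A routine that is cycle-exact on one model is off by one or two cycles per line on the others, which is enough to break anything that counts across lines.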
In the end, it probably would have been easiest to just redirect the interrupt pointer to one of the three routines at the beginning and then leave it there; however, I instead decided to use this to test out the technique of offset assembly. Each routine is assembled as if it were at $C000, despite actually being somewhere in the $0800-$0A00 range. That’s an interesting topic in its own right, especially after my tour of assemblers in the previous article, so I think that deserves its own post.