Sprite placement is probably the most infamous aspect of the Atari 2600’s hardware interface. The general proposition is that you have some registers, and you place sprites by writing to a register that means “Hey, wherever the TV’s electron gun is pointed, I want a sprite right there.” Surely such a matter is reserved for only the mightiest of wizards.
Here at Bumbershoot Software we hold very little truck with wizardry. In this post we will derive a function to tame the sprite placement system: we will feed it an X coordinate and it will place a sprite at that X coordinate.
We have already used a couple of ad-hoc techniques to place sprite elements well enough for my current project, and both relied on writing the smallest possible loop in 6502 machine code:
loop: dex
bne loop
This loop is three bytes long and takes five cycles per iteration, which corresponds to fifteen pixels of horizontal space. The DEX instruction decrements the X register by 1, and the BNE loops back if the result isn’t zero. When we finished the horizontal placement code for Lights-Out, we observed that a pixel-based loop would be possible because the following loop is one byte larger but takes exactly the same amount of time:
loop: sbc #$0f
bcs loop
This loop subtracts 15 from the accumulator each iteration and loops back if it hasn’t crossed zero. And that capability is just what we need to write a generic placement algorithm:
- Wait for HBLANK.
- Add some magic constant to our target pixel value that accounts for HBLANK and setup time, so that the pixel we asked for will be the pixel we get.
- Repeatedly subtract 15 from the target pixel until the result crosses zero. (We can’t say “is negative”, because we need the pixel value to be an unsigned byte to represent target pixels 128 through 159. We’ll be checking the carry bit here, not the sign bit.)
- Reset the target graphic.
- Take the value now left in the accumulator—241-255—and turn it into a signed value between -7 and 7, suitable for use with HMOVE.
- Take that result and shift it left four places so that it will actually be in the correct bit locations for use with the HMOVE registers.
- Execute the horizontal nudge.
The left shift step there is the main reason I’m not bothering with this approach for Lights-Out: the shift instructions alone are about the size of the extra table space we need to just store coarse/fine values.
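The steps above can be sketched as plain integer arithmetic. This is a minimal Python model rather than the 6502 routine itself; the name `plan_placement` and the `magic` parameter are illustrative assumptions, and the real value of the constant is only worked out later in this post.

```python
def plan_placement(x, magic):
    """Model the coarse/fine split for target pixel x.

    Returns (iterations, nudge): how many times 15 was subtracted before
    the value crossed zero, and the fine correction -(residual + 8).
    Following the text, a positive nudge means "move left".
    """
    value = x + magic            # bias the target pixel by the magic constant
    iterations = 0
    while True:                  # subtract-then-test, like the SBC/BCS loop
        value -= 15
        iterations += 1
        if value < 0:            # crossed zero: this is where we would reset
            break
    # value is now the residual, somewhere in -15..-1
    nudge = -(value + 8)         # recenter on -8 and reverse the direction
    return iterations, nudge
```

Whatever the constant turns out to be, the residual always lands in -15..-1, so the nudge always fits in the -7..7 range.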
Let’s go through this, step by step. First we wait for HBLANK:
sta WSYNC
Then we add the magic constant, which takes four cycles:
clc
adc #magic
We don’t know what magic is yet. We’ll work it out later. The loop and placement come next:
sec
loop: sbc #15
bcs loop
sta RESP0
Every time we subtract and don’t cross zero, the carry bit stays set. Furthermore, the carry bit being set is what prevents an extra borrow from happening.
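To make the carry behavior concrete, here is a small Python model of SBC in binary mode. The helper names `sbc` and `subtract_until_borrow` are ours, and decimal mode is assumed to be off.

```python
def sbc(a, operand, carry):
    """Model 6502 SBC (binary mode) on 8-bit values.

    Computes a - operand - (1 - carry). Returns (result, carry_out):
    carry_out is 1 if no borrow was needed and 0 if the result wrapped.
    """
    diff = a - operand - (1 - carry)
    return diff & 0xFF, 1 if diff >= 0 else 0


def subtract_until_borrow(a):
    """Model SEC followed by the SBC #15 / BCS loop.

    Returns (residual_byte, iterations).
    """
    carry = 1                          # SEC
    iterations = 0
    while True:
        a, carry = sbc(a, 15, carry)   # SBC #15
        iterations += 1
        if not carry:                  # BCS falls through once carry is clear
            return a, iterations
```

With the carry set going in, each pass subtracts exactly 15; with it clear, the same instruction would subtract 16, which is the extra borrow the SEC guards against.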
The loop takes the least time if we underflow immediately. That will take six cycles (two each for the SEC, the SBC, and the untaken BCS), which on top of the four cycles for the addition leaves us at cycle 10, still well within the HBLANK period. This should never happen: the magic number should end up pushing even the X coordinate of zero out at least that far.
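The cycle arithmetic can be tallied in a few lines of Python. This is only a bookkeeping sketch: the function name is ours, cycle 0 is taken to be the start of the scanline right after the WSYNC write, and the branch is assumed not to cross a page boundary (which would cost an extra cycle).

```python
# Standard 6502 cycle counts for the instructions used so far.
CYCLES = {
    "clc": 2,
    "adc #imm": 2,
    "sec": 2,
    "sbc #imm": 2,
    "bcs (taken)": 3,
    "bcs (not taken)": 2,
}
COLOR_CLOCKS_PER_CYCLE = 3   # the TIA draws three pixels per CPU cycle
HBLANK_COLOR_CLOCKS = 68     # length of the horizontal blank


def cycle_at_reset_store(iterations):
    """CPU cycle at which STA RESP0 begins after `iterations` loop passes."""
    setup = CYCLES["clc"] + CYCLES["adc #imm"] + CYCLES["sec"]
    looping = (iterations - 1) * (CYCLES["sbc #imm"] + CYCLES["bcs (taken)"])
    last_pass = CYCLES["sbc #imm"] + CYCLES["bcs (not taken)"]
    return setup + looping + last_pass
```

Each additional pass adds five cycles, or fifteen color clocks, which is where the subtraction constant of 15 comes from.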
Now we need to convert the results. If we go back to considering the accumulator value as signed—which we can do for free because that’s how two’s-complement addition works—we now have a value between -1 and -15. If our pixel was in the rightmost point in our coarse range, it will need a nudge of 7 to the right, which corresponds to a corrected value of -7. If we are on the leftmost edge, we got a -15, and that needs to become a value of +7. Everything in between works out the same way, which also means that if we actually nailed the positioning at the coarse step, the accumulator should now have -8 in it. Between that and the fact that we need to reverse the direction of our result, we find that the value we need to compute is -(A+8).
Negation is a bit obnoxious on the 6502 because there isn’t actually a “negate” instruction. However, we can take the one’s complement of the accumulator very easily by flipping all its bits with an EOR instruction, and the two’s complement of a number is the one’s complement plus one. That means that -(A+8) becomes -A-8 becomes ~A+1-8 becomes ~A-7. Furthermore, we already know that the carry is cleared, because that was the test that let us exit the loop. This means that the conversion becomes two instructions:
eor #$ff
adc #$f9
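As a sanity check, the two-instruction conversion can be modeled in Python and compared against -(A+8) for every residual the loop can leave behind. The helper names here are ours, not part of the routine.

```python
def to_signed(byte):
    """Interpret an 8-bit value as two's complement."""
    return byte - 256 if byte >= 128 else byte


def convert_residual(a):
    """Model EOR #$FF followed by ADC #$F9 with the carry clear."""
    a ^= 0xFF                  # EOR #$FF: one's complement, ~A
    a = (a + 0xF9) & 0xFF      # ADC #$F9: add -7, with a carry-in of 0
    return a


# Every residual from 241 (-15) through 255 (-1) should map to -(A+8).
for residual in range(241, 256):
    assert to_signed(convert_residual(residual)) == -(to_signed(residual) + 8)
```

The endpoints behave as the text predicts: -15 becomes +7, -1 becomes -7, and -8 becomes 0.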
The final step is to shift it into place and do the nudge:
asl
asl
asl
asl
sta HMP0
sta WSYNC
sta HMOVE
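The four shifts exist because the horizontal motion registers only look at the upper four bits of the byte, read as a signed nybble. A quick Python sketch of the shift and of the decode the TIA effectively performs (function names are ours):

```python
def asl(a):
    """Model ASL on the accumulator: shift left one bit, keep 8 bits."""
    return (a << 1) & 0xFF


def to_hm_register(nudge_byte):
    """Four ASLs move the low nybble of the nudge into the high nybble."""
    for _ in range(4):
        nudge_byte = asl(nudge_byte)
    return nudge_byte


def hm_motion(register_value):
    """Decode the signed upper nybble; positive moves left, negative right."""
    nybble = register_value >> 4
    return nybble - 16 if nybble >= 8 else nybble
```

Since the nudge is always between -7 and 7, the bits shifted out of the top are pure sign extension and nothing of value is lost.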
One Last Improvement
There’s one other thing we really should do here, and that’s to index our writes to RESP0 and HMP0 with the X register. As we mentioned earlier, the X register is the most efficient way to index into the zero page, and all the sprites have adjacent registers for these operations. This will let us create a routine that will allow us to place any sprite, as long as we respect that the ball and missiles will be placed one to the left, or magnified players one to the right.
This also implies now that we’re setting this up as some kind of subroutine. Because of this, we don’t actually want to do the last two stores to sync and nudge; we’ll be building up multiple nudges over multiple calls and we should reserve the ability to do those all at once.
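The indexing works because the TIA lays out the reset and motion registers in matching order. The sketch below tabulates the standard TIA write addresses in Python; the `registers_for` helper is just an illustration of what `RESP0,X` and `HMP0,X` resolve to.

```python
RESP0 = 0x10   # reset strobes: RESP0, RESP1, RESM0, RESM1, RESBL
HMP0 = 0x20    # motion registers: HMP0, HMP1, HMM0, HMM1, HMBL

OBJECTS = ["Player 0", "Player 1", "Missile 0", "Missile 1", "Ball"]


def registers_for(x):
    """Addresses hit by STA RESP0,X and STA HMP0,X for object index x."""
    return RESP0 + x, HMP0 + x


for index, name in enumerate(OBJECTS):
    reset, motion = registers_for(index)
    print(f"X={index}  {name:<9}  reset=${reset:02X}  motion=${motion:02X}")
```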
Finding the Magic Number
The technique for finding the magic number is pretty simple; we guess a number, then correct based on where it goes. I start with a guess of 54 and try to place a player at location 7. I find that the actual location ends up being 33. That gets us a fixed magic number of 54-(33-7)=28, which indeed happily places our players throughout most of the screen.
But not quite all of it. It turns out that we can’t place players at pixels 0 or 1. The lowest 15-pixel range we can actually perform a reset on is centered around pixel 9: 9+28=37, and after 3 iterations that drops the value to -8. To hit pixel locations 0 or 1, we’d have to write RESP0 at pixel -6, which, since we are in HBLANK then, is actually pixel 3. There are several possible fixes for this.
- We can sleaze out. Forbid locations 0 and 1, requiring an extra nudge to push a player to those locations later.
- We can sleaze out in a slightly less obnoxious way. The player graphics wrap around perfectly, to the point that they cannot be smoothly scrolled offscreen. We can forbid locations 0 and 1 and offer locations 160 and 161 in their place.
- We can automate the previous sleaze-out, adding 160 to any value less than two. This is arguably not sleazing out at all.
- We can fiddle with our instruction timing a bit so that the first possible reset point in the scanline is between coordinates 3 and 7, inclusive, so that it is both available for resetting and so that a single HMOVE can reach pixel 0.
It turns out that last option is possible: by moving the sync operation to just before the loop, the magic number changes to 46 and this places the earliest reset point at 6 (6+46-60=-8). Also, our math ends up matching our engineering: we moved six cycles worth of instructions out before the sync, and as a result the magic number increased by 3*6=18 pixels.
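The arithmetic in this section can be replayed in Python. This is a sketch built only on the figures quoted above: the function names are ours, and `coarse_center` simply rearranges the relation "center + magic - 15 × iterations = -8" used in the text rather than modeling the TIA itself.

```python
def corrected_magic(guess, requested, observed):
    """Fix a guessed magic number using one observed placement."""
    return guess - (observed - requested)


def iterations_for(x, magic):
    """Number of subtractions of 15 before x + magic crosses zero."""
    return (x + magic) // 15 + 1


def coarse_center(x, magic):
    """Pixel that would need no nudge in the coarse slot chosen for x."""
    return 15 * iterations_for(x, magic) - magic - 8


def slots_in_hblank(magic):
    """Target pixels whose coarse reset would land left of pixel 0."""
    return [x for x in range(160) if coarse_center(x, magic) < 0]


print(corrected_magic(54, 7, 33))   # the first tuning pass
print(slots_in_hblank(28))          # targets the original timing cannot reach
print(slots_in_hblank(46))          # targets the retimed version cannot reach
```

Under this model the original magic number of 28 strands exactly pixels 0 and 1, while 46 strands none.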
Accounting For Missiles
One issue with this function so far, though, is that missiles and the ball will be placed one pixel too far to the left. We need to increase the value of the accumulator by 1 for missiles or the ball. The registers go in the order Player 0, Player 1, Missile 0, Missile 1, and Ball. That means we need to increment the accumulator if our X register is 2 or larger.
We can do this incredibly concisely. We’ve been using the carry bit to represent the result of unsigned comparisons throughout this routine, and we can exploit that here. The instruction CPX #$02 will set the carry bit if X is anything but 0 or 1 and clear it otherwise. But then we can use the carry bit as a carry bit and increment the accumulator as needed with ADC #$00. We don’t even need any branches.
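A tiny Python model of the CPX/ADC pair shows the branch-free increment. The function name is ours; CPX sets the carry when X is greater than or equal to its operand as an unsigned comparison.

```python
def adjust_for_object(a, x):
    """Model CPX #$02 followed by ADC #$00."""
    carry = 1 if x >= 2 else 0        # CPX #$02: carry set when X >= 2
    return (a + 0x00 + carry) & 0xFF  # ADC #$00: adds only the carry


for x in range(5):
    print(x, adjust_for_object(80, x))
```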
The Final Routine
Here, then, is our final, pixel-exact, multisprite placement routine for the Atari 2600:
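place_sprite:
        cpx #$02        ; carry set for missiles and ball (X >= 2)
        adc #$00        ; ...which adds one pixel for those objects
        clc
        adc #$2e        ; add the magic number (46)
        sec
        sta WSYNC       ; wait for the start of the next scanline
*       sbc #$0f        ; coarse loop: 15 pixels per pass
        bcs -
        sta RESP0,X     ; coarse reset for the selected object
        eor #$ff        ; with the ADC below: A = -(A+8), carry is clear here
        adc #$f9
        asl             ; move the nudge into the high nybble
        asl
        asl
        asl
        sta HMP0,X      ; store the nudge; the caller strobes HMOVE later
        rts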
And there we have it—sprite placement tamed in 27 bytes.
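For readers who want to check the routine without an emulator, here is a register-level Python model of it. It is a sketch under stated assumptions: `final_pixel` uses the relation "center + magic - 15 × iterations = -8" from the tuning section to stand in for the TIA, and it takes the post's word that missiles and the ball land one pixel left of where a player would.

```python
MAGIC = 0x2E


def place_sprite(a, x):
    """Step through place_sprite; returns (loop iterations, HMxx value)."""
    carry = 1 if x >= 2 else 0             # cpx #$02
    a = (a + carry) & 0xFF                 # adc #$00
    a = (a + MAGIC) & 0xFF                 # clc / adc #$2e
    carry = 1                              # sec
    iterations = 0
    while True:                            # sbc #$0f / bcs -
        diff = a - 15 - (1 - carry)
        a, carry = diff & 0xFF, int(diff >= 0)
        iterations += 1
        if not carry:
            break
    a ^= 0xFF                              # eor #$ff
    a = (a + 0xF9 + carry) & 0xFF          # adc #$f9 (carry is 0 here)
    a = (a << 4) & 0xFF                    # four asl instructions
    return iterations, a


def final_pixel(a, x):
    """Where the object ends up after HMOVE, under the assumptions above."""
    iterations, hm = place_sprite(a, x)
    nybble = hm >> 4
    motion = nybble - 16 if nybble >= 8 else nybble   # positive moves left
    center = 15 * iterations - MAGIC - 8              # assumed reset position
    offset = -1 if x >= 2 else 0                      # missiles/ball sit one left
    return center - motion + offset


assert all(final_pixel(p, 0) == p for p in range(160))   # players
assert all(final_pixel(p, 2) == p for p in range(160))   # missiles and ball
```

Within this model every target from 0 through 159 comes out exactly where it was requested, for players and for missiles alike.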
Further Improvements
There are still a couple of places where this routine could be improved:
- This routine is tuned to unmagnified players. Magnified players will end up one pixel to the right. I don’t really have a good approach to dealing with that automatically, because the NUSIZ registers are write-only. The programmer will need to know and adjust the pixels themselves.
- As written I’m batching up all the positioning values for a single call to HMOVE. It turns out that there is a common discipline where you hit HMOVE every single scanline. There are some benefits to doing this—we’ll cover this later on—but the benefits it grants will be broken if we use this function in the wrong places. We’ll need to retune it if we want to mix this with an HMOVE-every-line protocol.
But for now, I’m quite happy with this result.