A Tour of 6502 Cross-Assemblers

Recently, I decided to revisit one of my projects from over ten years ago: a tiny game for the NES that I’d written with a very early version of my assembler. I’d actually developed that assembler in the first place because the options I had available at the time didn’t work the way I wanted them to, or were excessively finicky or underpowered, or weren’t freely available, or didn’t run on my system.

The world’s gotten better since, both because I’ve been part of more communities and seen more things, and also because tools have been better developed and disseminated. I can say with some confidence that had I known about the ACME assembler back in 2002 I would never have written Ophis.

As I was restructuring and bugfixing that old project—which also involved porting it over to ACME—the Habitat project of the Museum of Art and Digital Entertainment released Chip Morningstar’s Macross and Slinky, a structured assembler/linker system. I spent a weekend helping to bash that into a shape that runs on modern systems.

So this is all quite interesting because there are a great many assemblers out there for the 6502 series of chips, and all of them have subtly different features and semantics. I seem to have moved from a world where I couldn’t get an assembler that worked the way I wanted to needing to figure out how to differentiate a plethora of slightly incompatible options.

Having worked a bit with various assemblers, these are the things I think are important when picking a cross-assembler for your project.

Dramatis Personae

I’ll be comparing the following assemblers:

  • Ophis. This is the assembler I wrote back in 2001 to teach myself Python and have developed on and off over fifteen years, since I’d never found anything that let me easily target the platforms I wanted to target. It’s what I’ve used for pretty much all my examples on this blog.
  • xa65. One of the only 6502 cross-assemblers you can reliably find in Linux repositories.
  • CA65. This is the assembler for the CC65 development suite. It is now the assembler recommended for use by the NES development community.
  • ACME, aka ACME Cross-Assembler for Multiple Environments. Has a pretty strong Commodore/Apple bias, but also is one of the few 6502 assembler syntaxes to be represented on Pastebin.
  • 64tass. Designed to mimic Turbo Assembler. This is a favorite amongst many of the folks I talk Commodore with.
  • Kick Assembler. Another common one for sample code; the C64 codebase wiki seems to like this one.
  • Macross. The development system for LucasArts’s 6502 and 68000 products. In one sense, the oldest tool here, but in another, the newest kid on the block. Its ideas of code organization make it a very low-level high-level language more than anything else, so its idea of which features are important is very different.

Scoping and Temporary Labels

When you’re writing a routine, you want to be able to have a command like jmp loop_done without having the name loop_done be globally unique in your entire program.

Of all our assemblers here, only Macross lacks this feature, and that is because every label is temporary within its file. Any label that needs to be exported must be declared extern, C-style. That still doesn’t help inside functions, but Macross feels this lack less because temp labels are usually for if/then branches or loops. In Macross you actually organize your code in block-structured if or while statements, so temporary labels are created by the assembly of those structured blocks.

Of the others, ca65 behaves the closest to modern expectations, by which I mostly mean it behaves in a manner almost identical to NASM. You can declare a block to be a “procedure”, and all labels within that procedure block are not exported. Within a procedure, you can mark a label as “local” by putting an @ in front of it, and it is accessible only between the previous and next non-local label. Also, following NASM, local labels are still accessible elsewhere via access paths. A label foo in procedure f may be referred to globally as f::foo, and local labels treat the previous “true” label as their “procedure”. 64tass does this one better, allowing blocks to be arbitrarily nested.

Ophis doesn’t allow local labels outside of a procedure-like block (which it calls a “scope”), and within them you have to distinguish global from local labels. (Local labels begin with an underscore.) Scopes may be nested arbitrarily, and when looking for a symbol Ophis searches up the scope chain.
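The scope-chain lookup described above can be sketched in a few lines of Python. This is an illustration of the idea only, not code from Ophis itself: the Scope class, its method names, and the addresses are all hypothetical, and the only rule it encodes is the one stated in the text (a leading underscore marks a label as local).

```python
class Scope:
    """One nesting level of labels; `parent` is the enclosing scope, if any."""

    def __init__(self, parent=None):
        self.parent = parent
        self.labels = {}

    def root(self):
        # Walk outward until we reach the scope with no parent.
        scope = self
        while scope.parent is not None:
            scope = scope.parent
        return scope

    def define(self, name, address):
        # Assumed rule from the text: a leading underscore marks a local
        # label; anything else is global and lands in the outermost scope.
        target = self if name.startswith("_") else self.root()
        target.labels[name] = address

    def lookup(self, name):
        # Search the innermost scope first, then each enclosing scope.
        scope = self
        while scope is not None:
            if name in scope.labels:
                return scope.labels[name]
            scope = scope.parent
        raise KeyError(f"undefined label: {name}")


top = Scope()
outer = Scope(parent=top)
inner = Scope(parent=outer)

outer.define("_loop", 0x0810)
inner.define("_loop", 0x0820)   # shadows the outer _loop
inner.define("start", 0x0801)   # no underscore, so it becomes global
```

Here inner.lookup("_loop") finds the inner definition, outer.lookup("_loop") finds the outer one, and top.lookup("start") succeeds even though start was defined two levels down. Looking up _loop from top raises KeyError, since local labels never leak outward.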

The xa65 assembler does this one better by defaulting to local labels (which turns out to be more convenient). While allowing full nesting, it also lets you export labels either to the global level or one level of nesting up.

ACME has a notion of “zones”, which work like Ophis or xa65 scopes, in that they may be nested (unlike CA65 procedures). However, subzones do not have access to the local labels defined in an enclosing zone; it’s less “subzone” and more “interrupting zone”. One neat feature of these is that zones can be named in ways that show up in error messages but nowhere else.

Kick Assembler works sort of like ACME, but no label inside a scope can escape. It provides a separate namespace feature for exposed, named values.

Summary: CA65 has the most familiar design, with some unique features here that are customary for x86 assembly programmers; 64tass has the best overall design.

Anonymous Labels

For extremely local control constructs, even generating temporary labels like .done is overkill. Many assemblers allow for anonymous labels, and then you can use labels like -- to mean “two anonymous labels before this line” or +++ to mean “three anonymous labels after this line”.

Ophis and ca65 implement this feature as stated, though they use slightly different syntax to do so. Ophis uses the asterisk to mark anonymous labels, while ca65 uses a colon. When jumping to an anonymous label, ca65 also prefixes a colon to the string of -s or +s.

Kick Assembler merges this with temporary labels, with a concept they call “Multilabels”. The target !loop-- is “Two definitions of !loop ago”.

ACME and 64tass pretend to have this, but use strings of -s or +s as a special kind of local label; when resolving one they search backwards or forwards, as needed, for the next matching string. This means that the labels themselves are also strings of - or + characters. This can make code superficially identical to Ophis/ca65 loops jump to different places. It is however internally consistent and seems to also track the behavior of some of the older assemblers. While this isn’t strictly as powerful as full anonymous labels, for everything that matters it’s good enough.
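The difference between the two resolution strategies is easier to see side by side. The Python sketch below models only the lookup rule, not any assembler’s real implementation; the function names and the use of line numbers as positions are assumptions made for the illustration.

```python
def resolve_counting(anon_positions, ref_pos, ref):
    """Counting style (as described for Ophis/ca65): all anonymous labels
    are interchangeable, and the reference says how many to skip."""
    count = len(ref)
    if ref[0] == "-":
        before = [p for p in anon_positions if p < ref_pos]
        return before[-count]           # IndexError if there are too few
    after = [p for p in anon_positions if p > ref_pos]
    return after[count - 1]


def resolve_matching(named_positions, ref_pos, ref):
    """Matching style (as described for ACME/64tass): '-' and '--' are
    distinct label names, and the nearest one with that name wins."""
    if ref[0] == "-":
        found = [p for p, name in named_positions if name == ref and p < ref_pos]
        return found[-1]                # IndexError if no such label
    found = [p for p, name in named_positions if name == ref and p > ref_pos]
    return found[0]


# Three labels on lines 10, 20 and 25, spelled "--", "-" and "-".
labels = [(10, "--"), (20, "-"), (25, "-")]
positions = [pos for pos, _ in labels]

counted = resolve_counting(positions, 30, "--")   # second label back
matched = resolve_matching(labels, 30, "--")      # nearest label named "--"
```

From line 30, the reference -- lands on line 20 under the counting rule but on line 10 under the matching rule, which is exactly the "superficially identical code, different target" trap.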

Macross and xa65 both lack this feature entirely. Macross, as it does with temporary labels, obviates the need for them with its structured-programming approach; in xa65 you just have to live without and use temporary labels.

Summary: Ophis, Kick, and CA65 do this right; ACME and 64tass fake it acceptably.

Linking and Relocating Code

Most jump commands require you to name the address you are jumping to. This means that each instruction needs to have a memory location associated with it at the time the final binary is emitted. There are three basic strategies for doing this:

  • Memory dump. When code is assembled, it’s assembled into the target chunk of memory directly, or a simulation thereof. You change where the “write cursor” is with a command like .org or by assigning the program counter as if it were a variable. It is often legal to write the same location multiple times; this has the effect of back-patching already emitted code. Once the sources are processed, a chunk of the memory is then written out as the result. ACME and Kick work this way by default; CA65 and Macross can be programmed to.
  • Linker script. Like memory dump, but you actually assign the addresses as late as possible. Code is assembled in a relocatable format, to produce object files not unlike modern development systems. A linker then patches up all the addresses and knits it into a final binary image. The linker may also be responsible for emitting binary headers that allow the OS (or emulator) to interpret the binary properly. CA65 and Macross work this way by default; ACME includes output modes that handle Commodore PRG files and Apple II binaries.
  • Simulated Program Counters. Here, the output file format reigns supreme. There is a file offset that starts at 0 and increases steadily with each byte output. There is also a virtual program counter, which increments automatically as instructions are assembled, and instructions are assembled as if the program counter had that value. Like the memory-dump scheme, the program counter is assignable; unlike it, all this assignment does is alter the virtual program counter. This is how Ophis and xa65 work; ACME and Macross can be programmed to work this way. I strongly suspect that CA65 can also do this, but I have never worked out how. Ophis’s segment system allows an unlimited number of virtual PCs for both laying out code and reserving memory in various locations; ACME and Kick have a pseudopc directive that is the best implementation I’ve seen for this when arranging code, but it doesn’t handle reserving and collating memory. Fortunately, ACME’s macro system is strong enough to handle that. Kick prefers to use “virtual” segments, where you lay them out but don’t output them.
  • All of the above. 64tass is a wild combination of all of these. It is normally in memory dump mode, but it supports offset or relocated assembly, and it includes segment definitions that primarily serve to operate as a linker script for the part of the code that’s in memory-dump mode.
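To make the contrast between the memory-dump and simulated-program-counter models concrete, here is a small Python sketch. Both classes and the program function are hypothetical and model only the bookkeeping, not any real assembler; the opcodes used ($A9 for lda immediate, $60 for rts) are standard 6502.

```python
class MemoryDump:
    """Memory-dump model: bytes land at their target address in a 64K image."""

    def __init__(self):
        self.memory = bytearray(0x10000)
        self.pc = 0
        self.low = None     # lowest address written so far
        self.high = None    # highest address written so far

    def org(self, address):
        self.pc = address   # moves the write cursor itself

    def emit(self, *data):
        for byte in data:
            self.memory[self.pc] = byte   # a second write back-patches
            self.low = self.pc if self.low is None else min(self.low, self.pc)
            self.high = self.pc if self.high is None else max(self.high, self.pc)
            self.pc += 1

    def output(self):
        if self.low is None:
            return b""
        return bytes(self.memory[self.low:self.high + 1])


class SimulatedPC:
    """Simulated-PC model: output is a byte stream; the PC is bookkeeping."""

    def __init__(self):
        self.out = bytearray()
        self.pc = 0

    def org(self, address):
        self.pc = address   # the file offset is unaffected

    def emit(self, *data):
        self.out.extend(data)
        self.pc += len(data)

    def output(self):
        return bytes(self.out)


def program(asm):
    asm.org(0x1000)
    asm.emit(0xA9, 0x01)    # lda #$01
    asm.org(0x1004)
    asm.emit(0x60)          # rts
    asm.org(0x1000)
    asm.emit(0xA9, 0x02)    # lda #$02, same address as the first lda
    return asm.output()


dumped = program(MemoryDump())      # a9 02 00 00 60
streamed = program(SimulatedPC())   # a9 01 60 a9 02
```

The same source produces five bytes either way, but the memory dump overwrites the first lda and pads the gap with zeroes, while the simulated-PC version writes every byte in source order and treats the org lines purely as a change to what addresses subsequent labels would get.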

Summary: There actually isn’t a consensus on how to do this, but I have a very strong preference for simulated program counters and free-form file output and collated data segments, which is why ACME is the only assembler that’s tempted me to switch away from Ophis.


Macros

Nobody anywhere ever uses macros the same way. If macro systems are important to you, this feature will probably not only influence but dictate your tool of choice.

  • xa65 uses a subset of the C preprocessor. This works out to textual replacement macros, with all the benefits and pitfalls we know and love.
  • Ophis defines blocks of code that take evaluated values as arguments, which it binds to temporary labels. This neatly evades most of the issues with the C preprocessor, but it does mean that we can’t, say, change the addressing mode of an instruction within a macro by prepending a # to what is supposed to be an address. A more serious omission is that Ophis entirely lacks conditional assembly and reassignable labels, so you cannot have macros that do things like “emit ten copies of this block of code with the label named I taking values from 1 to 10”.
  • ACME works like Ophis, but does support reassignable labels and conditional assembly, giving it a full macro language. It also allows labels to be imported by name, letting you define macros that define labels. (This feature is sufficient to implement Ophis-style unlimited uninitialized memory segments.)
  • CA65 uses textual replacement with full conditional assembly, for a result that’s similar to ACME but more finicky and more powerful.
  • Macross operates on the expression level. Furthermore, its macros are a complete interpreted C-like macro language, including the ability to define intermediary functions that return values while also emitting code.
  • 64tass and Kick are super-finicky, defining various flavors of macro and function and distinguishing between numeric, string, and “pseudo-instruction” invocation modes, each of which handles arguments slightly differently. The end result covers CA65’s range.
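The split between textual-replacement macros and macros that take evaluated values can be shown with a toy expander. The Python below is a sketch under loud assumptions: the function names are made up, the "macro body" is just a string with a single parameter, and Python’s own eval stands in for an assembler’s expression evaluator. None of it reflects how any of these assemblers is actually implemented.

```python
def evaluate(expr_text):
    # Stand-in for an assembler's expression evaluator (an assumption of
    # this sketch); builtins are disabled so only plain arithmetic works.
    return eval(expr_text, {"__builtins__": {}}, {})


def expand_textual(body, param, arg_text):
    """Textual macro: paste the argument's source text into the body."""
    return body.replace(param, arg_text)


def expand_evaluated(body, param, arg_text):
    """Evaluated macro: compute the argument first, then bind the value."""
    return body.replace(param, str(evaluate(arg_text)))


# The classic pitfall: operator precedence leaks into a textual expansion.
textual = expand_textual("X * 2", "X", "1 + 2")       # "1 + 2 * 2"
evaluated = expand_evaluated("X * 2", "X", "1 + 2")   # "3 * 2"

# The flip side: only textual expansion can smuggle in an addressing mode.
immediate = expand_textual("lda ARG", "ARG", "#5")    # "lda #5"
```

Evaluating the two expansions gives 5 for the textual one and 6 for the evaluated one. The last line shows the trade-off in the other direction: "#5" is not an expression with a value, so a value-binding macro has no way to accept it.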

Summary: CA65 is the most powerful and generic of the traditional macro systems; ACME is less powerful but still generally good enough and slightly easier to use. Macross is a breed apart.

Text Generation

Annoyingly, no 8-bit system actually uses ASCII. This means we need some way to get strings in your program text into formats the machine expects. Ophis, ACME, xa65, and CA65 all support some predefined target character sets. Ophis, ACME, and CA65 also allow you to define arbitrary translation tables. Ophis and CA65 also let you do this in pure text. ACME allows some bitmasking operations as you go, which is handy for obfuscation or for screen effects.
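A translation table of this kind is just a 256-entry lookup, optionally followed by a mask. The Python sketch below uses hypothetical helper names and does not mirror any assembler’s directive syntax. The one platform fact it relies on is the C64 screen-code layout, where @ is 0 and the letters A to Z are 1 to 26; the XOR mask is a generic stand-in for the bitmasking idea.

```python
def make_table(overrides):
    """Start from identity (byte in, same byte out) and apply overrides."""
    table = list(range(256))
    for char, code in overrides.items():
        table[ord(char)] = code
    return table


def encode(text, table, xor_mask=0):
    """Translate each character through the table, then apply the mask."""
    return bytes(table[ord(ch)] ^ xor_mask for ch in text)


# C64 screen codes: '@' is 0 and 'A'..'Z' are 1..26; everything else is
# left at its ASCII value in this sketch.
letters = {chr(ord("A") + i): i + 1 for i in range(26)}
screen = make_table({"@": 0, **letters})

plain = encode("HELLO", screen)                     # 08 05 0c 0c 0f
masked = encode("HELLO", screen, xor_mask=0x80)     # 88 85 8c 8c 8f
```

With the mask set to $80 every output byte has its high bit flipped, which on C64 screen codes selects the reverse-video version of each character.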


This would be the part where I’d get to say that some assembler or other is obviously the best one to use for new projects, but I actually can’t say that. Every single one of the assemblers I name here has something both good and unique in at least one of these categories, and there isn’t a consensus amongst developers about which features are the crucial ones. Hopefully, though, this tour is enough to give you food for thought about how to evaluate the various tools.

Edited 3 Feb 2016: added description of Macross’s extern functionality.