A Grimoire of 8-Bit Implementation Patterns

In theory, retro assembly language programming is pretty simple. There’s only a small number of things that an 8-bit chip can do, and each of those things is one instruction. The challenge is to break down your design and your logic into pieces small enough to represent in these small steps, and to do so without the entire design collapsing into chaos. But theory doesn’t quite translate into practice. As one grows more experienced, certain idioms and patterns start to emerge on top of the chip design’s core substrate.

There’s nothing unusual about that, of course—implementation patterns are a thing everywhere, including in 6502 and Z80 assembly code—but when you’re at this low of a level it means that the basic instructions themselves are being bent or warped in the service of something that they were not quite designed to do. That can be bewildering if you’re not entirely sure what you should be looking at.

So—below the jump, I present a catalog of seven different, common ways to bend 8-bit assembly language out of its design space. Intermediate developers may learn some useful tricks. Novice reverse engineers may find some explanations for the insane things that period code often got up to.


C64 Startup Code In Detail

Despite it being one of the primary topics on this blog (including the very first substantial post), and despite revisiting it again years later, I’ve only ever touched in passing on how to actually build a Commodore 64 program from scratch. It’s one of the simplest systems to work with, but as a result of that I’ve never given it the kind of attention that the ZX Spectrum or the Atari 800 got. Let’s fix that.


The C64 was able to load and save files off of cassette tapes, but for most of its lifespan it was expected that you would be storing data on floppy disks. These plugged into the C64 via its serial port, and, interestingly, the DOS for these floppy disks was stored on ROMs inside the disk drives themselves. As far as the main computer itself was concerned, it was just sending file requests down the serial port and then getting file data back in reply.

This is particularly nice when doing cross-development. As a rule, there is no difference whatsoever between the file you create with your tools and the file that the Commodore ultimately sees. You may use other tools to package the files into a disk or tape image, but most emulators will happily load files directly off your hard disk with no intermediate step required, and this doesn’t even require any trickery in the emulated system: the KERNAL asks for a file and gets the data back through the serial port, just as it expects.

The Commodore DOS has notions of various kinds of file on disks, but when dealing with more abstract files like these there are really only two that make sense to talk about: program (PRG) and sequential (SEQ) files.

Sequential files are exactly like files on a modern file system; a stream of bytes with no other structure imposed on it. There are only two reasons you can’t just share text files directly between a C64 and a modern computer:

  • PETSCII doesn’t map cleanly to ASCII. In particular, its mixed-case mode flips the location of the uppercase and lowercase letters.
  • The C64, like most home computers of its age, used a bare carriage return (character code 13) for its end-of-line marker, unlike Unix-like systems (which use a bare linefeed, character code 10) or Windows and Internet network protocols (which generally use a carriage return followed by a linefeed). Modern systems have conversion tools to flip between these three forms; Unix-like and Windows systems tend to call the bare-carriage-return format that the C64 uses “Macintosh line endings”, despite the fact that every Mac made in the past 20 years uses Unix line endings.

Program files are generally what we’re interested in when doing cross-development. These are the files that are loaded with BASIC’s LOAD command, and they represent a dump of a contiguous block of memory. Since there isn’t much that needs to be expressed, the PRG format is extremely simple: the first two bytes of the file are the address to load into, and the rest of the file is the data to load into memory starting at that address. The famous command

LOAD "*",8,1

loads the first file off of the disk into the location it specified, which would then put it into a position where it was ready to run.
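As a rough illustration of how little there is to the PRG format, here is a minimal C sketch that splits a PRG image already held in memory into its load address and payload. The PrgView type and prg_parse function are hypothetical names invented for this example, and the sketch assumes the address is stored low byte first, as 6502 addresses normally are.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type for this sketch: a view onto a PRG image
   that has already been read into memory. */
typedef struct {
    uint16_t       load_address;  /* address the file asks to be loaded at */
    const uint8_t *payload;       /* bytes following the 2-byte header */
    size_t         payload_size;  /* number of payload bytes */
} PrgView;

/* Splits a PRG image into header and payload.
   Returns 1 on success, 0 if the buffer cannot hold the 2-byte header. */
int prg_parse(const uint8_t *data, size_t size, PrgView *out)
{
    if (data == NULL || out == NULL || size < 2) {
        return 0;
    }
    /* Assumption: low byte first, then high byte. */
    out->load_address = (uint16_t)(data[0] | (data[1] << 8));
    out->payload      = data + 2;
    out->payload_size = size - 2;
    return 1;
}
```

A file beginning with the bytes $01 $08 would report a load address of $0801 under this reading.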

But this is not the whole story. It turns out that the load address in the file is purely advisory. The ,1 at the end of the LOAD statement is there to insist that the C64 actually respect the load address that the file specifies and copy it into place unchanged. Without it, the program will be loaded to a location of the C64’s choosing and possibly transformed along the way.

To see what it does, and why anyone would ever want to do it (much less make it the default as it in fact is), we will need to dig more deeply into how BASIC programs are stored on disk and in memory.


Building a Faster C64 Bitmap Library

One of the big advantages of home computers in the early 1980s over their console counterparts was that they usually had some form of bitmap mode—a graphics mode that grants control over each pixel individually. The alternatives generally involved character or tile graphics, where a set of predefined patterns are assembled in a grid to produce a screen. This approach is much faster and allows for more efficient use of memory, and it also meshes better with text modes. Each letter gets its own tile and the layout of tiles corresponds to a text screen. Every system we’ve looked at on this blog, with the exception of the ZX Spectrum, has some kind of text or tile mode. The Spectrum only provides a bitmap mode, and it prints text by copying letters out of its character generator ROM into the bitmap as if they were 8×8 sprites.

Sometimes the boundary between the two is fuzzy. When we did our rescue of Mandala Checkers on the ZX81, we made use of its predefined characters to produce something like a very low-resolution bitmap (a technique often called semigraphics). When experimenting with the PC’s CGA card, we warped the text mode so severely that we actually got a relatively credible 16-color graphics display out. When we looked at the graphics on the Sega Genesis, we found that it relied entirely on tile-based graphics but it also provided so many tiles that we were able to produce the equivalent of a bitmapped display with it.

The C64 has a programmable character mode like the consoles of its era—we’ve used it in the past to experiment with custom fonts—but it also provides a bitmap mode. This bitmap mode turns out to more or less be an automated version of the technique we used on the Genesis—the 320×200 screen is organized as a 40×25 grid of a thousand 8×8 cells. Each cell is specified by 8 bytes, each of which represents a single row in the 8×8 character space, much like it would in a custom character set definition.

This produces a deeply weird memory layout compared to the bitmapped framebuffers we’ve seen on the PC or the Apple II. The first byte represents pixels 0-7 of the top row, but then pixels 8-15 are byte 8, and 16-23 are byte 16, and so on up to the right edge, where pixels 312-319 are represented by byte 312. When we begin the next row of pixels, we use bytes 1, 9, 17, and so on up to 313. This continues through the 8th row of pixels; the 9th row then repeats the pattern again starting at byte 320. Even then, that’s only enough to narrow down your location to 8 pixels. The 8 bits of the byte then each correspond to one pixel, with the larger bit values representing pixels further to the left (so, the leftmost pixel is the 128 bit, and the rightmost one is the 1 bit). The Commodore 64 Programmer’s Reference Guide provides a master formula for locating a pixel’s byte in BASIC (assuming the bitmap is at $2000, which is the usual place to put it in small programs):

BYTE=8192+INT(Y/8)*320+INT(X/8)*8+(Y AND 7)

That is quite a bit of math, and it includes several multiplications that the 6502 chip will not do for us. There is also the delightful quality that properly specifying an X coordinate requires a 9-bit integer. My old bitmap library managed all this as directly as possible, and it could handle initialization and pixel control with a footprint of only 168 bytes, but it was quite slow.
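For readers who find the layout easier to follow in C than in BASIC, the same address calculation can be sketched as below. It uses the 320-byte stride between rows of cells described above and the $2000 base address; the function names are made up for this example, and this is the direct, table-free computation rather than anything optimized.

```c
#include <stdint.h>

#define BITMAP_BASE 0x2000u  /* bitmap location assumed in the text */

/* Address of the byte holding pixel (x, y) in the 320x200 bitmap mode.
   Assumes 0 <= x < 320 and 0 <= y < 200. */
uint16_t hires_byte_address(unsigned x, unsigned y)
{
    unsigned cell_row = y / 8;  /* which row of 8x8 cells */
    unsigned cell_col = x / 8;  /* which cell within that row */
    unsigned line     = y & 7;  /* which of the cell's 8 bytes */
    return (uint16_t)(BITMAP_BASE + cell_row * 320 + cell_col * 8 + line);
}

/* Bit within that byte: the leftmost pixel is the 128 bit. */
uint8_t hires_bit_mask(unsigned x)
{
    return (uint8_t)(0x80u >> (x & 7));
}
```

Setting a pixel is then a matter of ORing the mask into the byte at that address, and clearing it means ANDing with the mask's complement.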

It’s usually possible to trade space for time, though. With 64KB of RAM, we don’t have to be quite as stingy as we have been.

The Goal

I started looking into this because my old bitmap library wasn’t good enough to support the Simulated Evolution program from back in April. I had two problems with it: it was way too slow, and it wouldn’t be able to draw the displays we needed anyway.

Normally, the C64’s bitmap mode is a 320×200 display, and within each 8×8 cell we may only use two colors: a foreground and a background. This follows inevitably from the fact that we’re only using one bit for each pixel. However, we need to be able to draw three colors with complete freedom to draw the Simulated Evolution screen: white for the bugs, green for the plankton, and blue for the background. Fortunately, the C64 has an alternate mode (multicolor mode) where the resolution drops to 160×200 but this means that we now have two bits of data per pixel. This lets us do what we wish and also lets us preconfigure the whole screen to just use the colors we want.

Unfortunately, it also makes setting any given pixel rather trickier. We need to set two bits, but depending on the color we wish to set, we may have to set the pixels to different values, and that means performing more complicated bitmasking operations. As part of our operation we’d like to pay as little extra cost as possible when doing this.

The Strategy

The work that’s costing us the most time is the multiplication and bit rotation operations that let us actually find the correct offset into our bitmap. The most expensive part of that is the Y coordinate; most of the X coordinate math ends up cancelling out neatly and turning into simple bitmasking operations. (As an added bonus, now that we’re working with a 160×200 screen, our X coordinates fit into an 8-bit register again, and for those operations that need to treat it as a 9-bit value, the carry bit was custom-built for this very purpose!)

The most direct way to make an attack on the multiplications and bitmasking operations that we need for the Y coordinate is simply to not do them. With 200 possible pixel rows, we can simply precompute the start of each row of pixels and store them in RAM. That turns a lot of math into a single 16-bit table lookup. Similarly, we may create some tables to make it easier to translate the residual bit of the X coordinate into the bitmasks we’ll need to use to get our multicolor results.

Our tables definitely multiply the size of our code, though: we’ve expanded our memory footprint from 168 bytes to 703. On the plus side, the actual program size on disk is only 300 bytes or so, so we haven’t quite doubled our program size when it comes to loading time.
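The table-driven idea can be sketched in C as follows. This is only an illustration of the strategy, not the library's actual 6502 code: the bitmap is a plain 8000-byte array standing in for screen memory, and the names mc_init_tables, mc_plot, row_start, and pixel_mask are invented for this example.

```c
#include <stdint.h>

#define MC_WIDTH  160
#define MC_HEIGHT 200

uint8_t  bitmap[8000];          /* stand-in for the C64's bitmap memory */
uint16_t row_start[MC_HEIGHT];  /* precomputed offset of each pixel row */

/* One mask per pixel position within a byte; the leftmost pixel
   occupies the two highest bits. */
const uint8_t pixel_mask[4] = {0xC0, 0x30, 0x0C, 0x03};

/* Do the Y-coordinate math once, up front. */
void mc_init_tables(void)
{
    for (unsigned y = 0; y < MC_HEIGHT; ++y) {
        row_start[y] = (uint16_t)((y >> 3) * 320 + (y & 7));
    }
}

/* Set multicolor pixel (x, y) to a 2-bit color value (0-3).
   Out-of-range coordinates are ignored. */
void mc_plot(unsigned x, unsigned y, unsigned color)
{
    if (x >= MC_WIDTH || y >= MC_HEIGHT) {
        return;
    }
    /* Four pixels per byte, eight bytes per cell: (x / 4) * 8. */
    unsigned offset  = row_start[y] + ((x & ~3u) << 1);
    uint8_t  mask    = pixel_mask[x & 3];
    /* Multiplying by 0x55 copies the 2-bit color into all four slots. */
    uint8_t  pattern = (uint8_t)((color & 3) * 0x55);
    bitmap[offset] = (uint8_t)((bitmap[offset] & ~mask) | (pattern & mask));
}
```

The row table alone accounts for 400 bytes (200 entries of 16 bits each), which gives a feel for where a footprint increase of this kind comes from.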


DirectX 9: A Sudden, but Inevitable, Betrayal

This DX9 pixmap library I’ve created seems useful and convenient. Perhaps DX9 isn’t so bad after all! Alas, this past week has provided two new opportunities for it to cause me grief.

Alpha Strike

The StretchRect function that is the heart of the pixmap engine seems like it is much more general than the single use we’re making of it. We’re creating a single collection of pixels and then scaling it onto part of the screen—why should we stop there? “Scale a collection of pixels onto part of the screen” is, after all, the core operation of any 2D sprite or tile-based game engine. We could build a far more sophisticated 2D graphics system out of these same parts, right?

Unfortunately, this is not the case. For a simple example, look at what happens if I load my old Bumbershoot software texture and try to display it on a blue background:

The black rectangles on the sides are not supposed to be there. The image is 256×256, as a nod to the restrictions on GPUs at the time that image was first created, but the image itself is not square. It got around that by including transparent regions along both edges. As is clear, though, these regions are not transparent!

With the code as I’ve put it on GitHub, there’s a very obvious reason for this: my surface formats have no alpha channels. But even after correcting the format to properly include an alpha channel, the display remains unchanged. StretchRect actually copies the alpha values from the source surface into place on the target surface instead of performing the blending operation that we would actually want to do.

It turns out that StretchRect is suitable only for exactly one of my problems: the problem of scaling a completed picture to an arbitrary window size. It is in fact more general than just this, but it doesn’t generalize in the direction of sprite engines. Its more general form stems from the fact that IDirect3DSurface9 is also the type for offscreen render targets, which means that we may render a 3D scene to a surface and then use StretchRect to scale the result into place in a way that grants us similar control over scaling and aspect ratio.

If I want to build a full 2D system where the GPU handles most of the scaling, rotation, and other pixel-blit-level operations, I’m pretty much obligated to hand things over to the texturing system, which will mean building a system that looks rather a lot like the OpenGL systems I’ve built repeatedly in the past—some fixed vertices, a set of static textures to represent the sprites, and a handful of other textures whose contents are updated every frame for the more dynamic parts of the display.

Normally, these days, I use SDL2 to do that. On Windows, its default renderer is even DX9-based; it identifies as direct3d, while the more modern DX11-based renderer is named direct3d11. So… is this the best of both worlds?

Well. If it were, it wouldn’t be part of this article, now would it?

Locked Out

Late last year I wrote about porting The Ur-Quan Masters to SDL2. Since then, I’ve had a somewhat bumpy road with it as it was tested on various machines and installation scenarios, which is a major reason I haven’t had anything new to report on that front, even though it’s been eight months.

One particularly nasty thing that came out of that testing, though, was that trying to mess with the default renderer generally produced bad effects: even systems that notionally supported DirectX 11 out of the box would fail to run the SDL2 DX11 renderer, and systems that claimed to support OpenGL would do much worse with it than they would with OpenGL ES, or vice versa. This mostly isn’t a problem (just leave the defaults alone, right?), but one of my test machines would crash immediately after creating the window if it used the default renderer, which turned out to be the DX9 one.


Making Use Of Our New DX9 Pixmap Library

After four posts, we’ve finally got a fully functional abstraction layer over DirectX 9 that lets us treat our window as a rectangle of pixels. Let’s do something with it. Normally this would involve describing some new system that I’d written…

#include <stdlib.h>

#include <windows.h>
#include <tchar.h>

#include "dx9pixmap.h"
#include "CCA.h"

… but you know what? I already made a version of the Cyclic Cellular Automaton in portable C back when I needed something to fuel the Unfiltered Cocoa project. We’ve got it for Mac; let’s get it for Windows.


DirectX 9: A Working Pixmap Module

After last time, I’m now confident that my basic approach is enough to let me write small C programs that interact with the screen in ways not unlike the old programs I used to write for DOS, or the 8-bit machines that had bitmap displays. It’s time to bundle it up with some reasonable API design and let it handle the ground-floor work while I focus on the simpler parts.

The Header File

The first step is to define the header file, which will determine how much of the system will be visible to and editable by other parts of our program.

A Brief Digression: What Even Are Header Files?

Most modern languages like C#, Java, or Python have some kind of module or package system where naming a library is enough to tell the compiler how to import it. At the other end of the scale, the sample programs I’ve presented here in the past in BASIC or assembly language were, at some level, completely self-contained programs within a single file. If a source file were ever chopped up into multiple parts, they would invariably contain a set of include instructions to assemble them into a single master source file before any actual translation occurred.

C (and descendant languages like Objective-C and C++) do not impose any structure on your source tree beyond the level of individual files, but it is traditional to build each source file on its own and then assemble the resulting compiler products (the object code) into the final binary. (The fancy term for this is separate compilation.) However, C has a problem when it wants to do this; it has no way of knowing what functions exist in other files or libraries, nor does it know how to call them even if it knows they exist. C and its descendants have a very strong rule that you can’t directly use any data types that weren’t defined in your file, and a somewhat weaker rule that you can’t use any function that hasn’t been either already defined or at least declared to exist.

What this means is that the way C files get combined is that each file that defines functions has a parallel file that declares all the functions that they want other files in the project to know about, and then each file uses the #include directive to incorporate all those declarations into themselves at the top (or “head”—hence, header file). Since functions need to have the types they work on defined in advance, those also generally go here.

Many early-1970s programming languages like Pascal, or more modern functional languages, include a notion of interface files that more formally capture the function of header files as C uses them, but C relies on nothing more complicated than textual inclusion to do its work.
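To make the split between declarations and definitions concrete, here is a small sketch that shows both halves in one listing. The counter.h and counter.c file names, the Counter type, and its two functions are all hypothetical, invented purely for illustration; in a real project the two halves would be separate files, with counter.c and every client file pulling in the header via #include.

```c
/* ---- What would live in a hypothetical "counter.h" ---- */

/* Types the functions work on must be defined before they are used. */
typedef struct Counter {
    int value;
} Counter;

/* Declarations: enough for other files to call these correctly,
   even though the function bodies live elsewhere. */
void counter_reset(Counter *c);
int  counter_bump(Counter *c, int amount);

/* ---- What would live in a hypothetical "counter.c" ---- */

/* Definitions: the actual code, compiled once into object code. */
void counter_reset(Counter *c)
{
    c->value = 0;
}

int counter_bump(Counter *c, int amount)
{
    c->value += amount;
    return c->value;
}
```

Because inclusion is purely textual, pasting the header's contents at the top of a file, as done here, is exactly what the preprocessor would do.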


Aspect-corrected Image Scaling with DirectX 9

Continuing our long tradition with OpenGL, Cairo, and SDL, it’s time to bring DirectX 9 up to speed with these other technologies so that our pixel buffers will be displayed in a consistent way no matter how large or small the window is made to be. This will also let us maximize the window or take it fullscreen without interfering with anything.

Our StretchRect-based system is a hybrid of the OpenGL and Cairo ones—we are dealing with an ordinary GUI window in Win32, and like Cairo, those are measured with (0,0) in the upper-left corner and with the width of the viewport measured in pixels. This means that, like Cairo, we will need to scale and move the image rectangle to make sure it fits. However, unlike Cairo, we aren’t necessarily transforming the image itself. What we need to actually provide are a pair of coordinates in the viewport space that describe where the image goes. This translates more naturally to the OpenGL system’s vertex shader.

Happily, though, we do not have to actually do this with matrix transforms like we needed to with Cairo, or with vertex shaders like in OpenGL. Instead, we can merely fill in a RECT structure. The core formulae are still similar, though:

  • Start by assuming we’ll fill the whole window, so the scaling factors Sx and Sy are both 1, while the offsets Ox and Oy are both 0.
  • Assuming the window viewport is Vw by Vh, and the image dimensions are Iw by Ih, we compute the aspect ratios of the viewport Av=Vw/Vh and image Ai=Iw/Ih.
  • If Av<Ai, then the width of the image will be the width of the entire window, and the scaling factor of the image height will be Sy=Av/Ai and offset Oy=Vh*(1-Sy)/2. Note that this scaling factor is applied to the screen and not the image; we’re determining what fraction of the window’s height will actually be used.
  • The flip case, where Ai<Av, has the height of the image be the height of the entire window, and the scaling factor for width of the image will be Sx=Ai/Av and offset Ox=Vw*(1-Sx)/2.
  • If Ai=Av, the image exactly fits in the window and the scaling factors stay 1 and the offsets stay 0.

This lets us set the two corners of our rectangle:

  • (x1, y1) = (Ox, Oy)
  • (x2, y2) = (Ox + Vw * Sx, Oy + Vh * Sy)
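The steps above translate fairly directly into C. The sketch below uses a stand-in FitRect structure with the same four fields as a Win32 RECT so that it compiles anywhere; FitRect and fit_image are names invented for this example, and the code assumes all four dimensions are positive.

```c
/* Stand-in for the Win32 RECT structure, so the sketch is portable. */
typedef struct {
    long left, top, right, bottom;
} FitRect;

/* Compute where an iw x ih image should land inside a vw x vh viewport
   so that it is as large as possible without distorting its aspect ratio.
   Assumes all four dimensions are greater than zero. */
FitRect fit_image(int vw, int vh, int iw, int ih)
{
    double sx = 1.0, sy = 1.0;  /* fraction of the window actually used */
    double ox = 0.0, oy = 0.0;  /* offset of the upper-left corner */
    double av = (double)vw / vh;  /* viewport aspect ratio */
    double ai = (double)iw / ih;  /* image aspect ratio */
    FitRect r;

    if (av < ai) {
        /* Image fills the width; bars above and below. */
        sy = av / ai;
        oy = vh * (1.0 - sy) / 2.0;
    } else if (ai < av) {
        /* Image fills the height; bars left and right. */
        sx = ai / av;
        ox = vw * (1.0 - sx) / 2.0;
    }

    /* Round to the nearest pixel; all values are non-negative here. */
    r.left   = (long)(ox + 0.5);
    r.top    = (long)(oy + 0.5);
    r.right  = (long)(ox + vw * sx + 0.5);
    r.bottom = (long)(oy + vh * sy + 0.5);
    return r;
}
```

For a 320×200 image in an 800×600 window, this leaves a 50-pixel bar above and below an 800×500 image area.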

And if this was all there was to it, handling image scaling would be easier in DirectX than in any other system we’ve messed with except maybe SDL2. Unfortunately, it’s not quite so simple.

Windows and Backbuffers

As you may recall from last time, when we didn’t do anything special to handle resizing, our image still wound up scaling itself to fit the edges of the window we were in. This is because the size of the window has no particular relationship to the size of the DirectX rendering target. This target, the backbuffer, has its size set when we create or reset our IDirect3DDevice9 object, and then it stays fixed. That backbuffer is then scaled, without aspect correction, to the window as its size changes. Worse, that also means that if we do a 3D render to a render target, the image quality will not improve as we make the window larger. If we want to actually render to a larger area as the window grows, or to an area whose aspect ratio matches the window, we will need to update the backbuffer as the window size changes.

This does, however, get irritating, because it is only permissible to reset the device when it’s not holding any resources. This rule applies at all times, not merely when the device has been lost for external reasons. That means that we’ll need to keep an eye on the WM_SIZE message and make sure that we destroy our GPU resources before recreating them if we detect that the window has resized.

This is, in my opinion, the point where you can no longer reasonably build applications by the seat of your pants. There’s enough that needs to be managed, and managed consistently, that I’ll need to split out all the surface management code into its own module. That will come next time.

DirectX 9: Drawing Some Pixels and Living to Tell the Tale

My overall goal here is to produce some fairly self-contained libraries that will let me use DirectX 9 as a DOS-like pixel map, without having to worry much at the application level about the fact that I’m really running in a window or on a modern GPU. Step one will simply be to get DirectX to do anything at all, and ideally also get a proof-of-concept that pixel-level control is even reasonably possible.

My first application simply draws a two-dimensional gradient across the window. The Y dimension slowly shifts from red to blue, and the X dimension shifts from black to green. It’s also doing this, not with a Gouraud-shaded polygon, but by actually computing the color for its pixels, so with this simple program we are well on our way.
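Independent of the Direct3D plumbing, the per-pixel computation can be sketched on its own. The code below is not the application's actual source: the function names are invented, the 0x00RRGGBB packing is an assumption (it matches a 32-bit X8R8G8B8-style layout), and the exact ramp formulas are one plausible reading of the description above.

```c
#include <stdint.h>

/* Color of pixel (x, y) in a width x height gradient, packed as
   0x00RRGGBB (assumed layout). Y runs red to blue, X black to green. */
uint32_t gradient_pixel(int x, int y, int width, int height)
{
    uint32_t green = (width  > 1) ? (uint32_t)(x * 255 / (width  - 1)) : 0;
    uint32_t blue  = (height > 1) ? (uint32_t)(y * 255 / (height - 1)) : 0;
    uint32_t red   = 255 - blue;
    return (red << 16) | (green << 8) | blue;
}

/* Fill a tightly packed width x height buffer with the gradient. */
void fill_gradient(uint32_t *pixels, int width, int height)
{
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            pixels[y * width + x] = gradient_pixel(x, y, width, height);
        }
    }
}
```

A real locked surface may have padding at the end of each row, so production code would step by the surface's pitch rather than assuming rows are tightly packed as this sketch does.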

Let’s work through it.

Win32 Preliminaries

Any DirectX application is also a Windows application. Before we can really get down to doing anything fun, we’ll need to start by building up an application shell that does all the basic work that a Windows application does. We’ve done this before, working in assembly language. This time we’ll be doing it in C.

C is in some sense the “native language” for the Win32 APIs, so this should be about as simple as it gets. If you compare the work we do here to the earlier article, though, you’ll see that there’s very little drop in complexity.


DirectX 9: The Newest Old Thing

My first experiences with computer graphics were experimenting with the high-resolution graphics commands in BASIC on the IBM PC and Apple II. These offered fairly intuitive access to the screen as a rectangular array of pixels, each of which could be set one of a number of colors. The PC also allowed a limited amount of palette-swapping.

My first experiences with 3D graphics, on the other hand, involved learning OpenGL as a grad student in the early 2000s, at a time when the latest version was still 1.5 and shaders were a thing Windows programmers put together in weird pretend assembly languages. I ended up teaching myself the basics of modern graphics programming years later (thanks in large part to Jason McKesson’s excellent tutorial), but for the most part this remains a specialized skill that I appreciate but have no real expertise in.

One of the interesting things about the history of graphics programming, though, is the huge break that occurred around 2007. It is at this point that GPUs stop presenting themselves to programmers as graphical accelerators capable of combining image data in increasingly complex ways, and more as a very large number of extremely parallel processing units that happen to be good at the things you need to do to render 3D graphics. The breakpoint happened in Direct3D with the release of Direct3D 10 in Vista, and became ubiquitous with D3D11 in Win7. In OpenGL, the much-delayed OpenGL 3.0 wiped out almost the entire API that I learned in grad school, with 3.1 adding it back again as a “compatibility profile” that needed to be specifically requested. This contemporary vision of graphics programming, where shaders are the fundamental unit of rendering computation, is basically what I needed to command when using OpenGL to render a properly-scaled pixel display within an arbitrary window.

I never really bothered with DirectX, at least not directly. OpenGL was supported everywhere, and for quite a long span of time Windows even had the best OpenGL support of the three major desktop OSes—so why bother with something that won’t work appreciably better and restricts you to Windows?

After a few years of Bumbershoot projects, there are some pretty good answers to that question now, at least for DirectX 9.0c:

  • It’s often interesting to see different approaches taken to what is essentially the same problem. I’m pretty familiar with OpenGL 2.1 now; playing with its contemporary DX9c would be an opportunity to explore new conveniences and new pitfalls side by side.
  • It is a natural extension of my project last year, where I treated the Windows platform with the same regard and priorities that I treat retro platforms.
  • DX9 was the final form of Microsoft’s approach to fixed-pipeline graphical rendering. Here in 2020, there are basically three versions of Direct3D that it could make sense to target, depending on what exactly you are doing. DX11 is the default nowadays, corresponding to recent versions of OpenGL or OpenGL ES; DX12 is available but only recommended for those who are trying to get even closer to the hardware because they know how to squeeze extra performance out (comparable to Apple’s Metal and Khronos’s Vulkan projects), and DX9 is the API that gives you that turn-of-the-century fixed-function pipeline if for some reason it is actually what you want (comparable to OpenGL 2.1). DX9 is the apex of an abandoned technological line, that was not itself abandoned. That’s interesting in its own right.
  • Messing with DX9 arguably counts as retrocoding these days. The usual definition I’ve seen in retrogaming forums and the like places “retro” as “two or more generations behind current”. The final version of DX9 was released in 2004, as part of Windows XP Service Pack 2; with DX12 and DX11 both extant, and fully five editions of Windows post-XP, this would seem to count. On the other hand, a variation of DirectX 9 was also the graphical API that powered the Xbox 360, which is only one generation behind (at least until the Xbox Series X is released later this year).

The Project

My general goal for this project is to get enough of a handle on DX9 that I can use it in ways broadly similar to the ways I’ve used SDL2’s rendering library, or the subset of OpenGL that I’ve deployed in projects like UQM and VICE. Once that’s been done, I may elect to push on into simple 3D meshes and compare the fixed-pipeline facilities with the OpenGL 1.2 code that I first learned, and then wrap up with some experiments with its first steps towards a shader architecture (inviting comparisons with 2.1).

In part to keep things as apples-to-apples a comparison as I can manage, a lot of the artifacts I’ll be producing along the way will be conversions of programs I’ve already written for other platforms. I will also be restricting myself to C for this project, avoiding both C++ (the traditional language for programming DirectX) and raw assembly code (which I often work in here). The ideal result for me would be to end up with an extremely simple library that I can trivially link against basically anything and which will let me write small, relatively efficient graphics programs on Windows.

What I Know Going In

Not much, but not nothing:

  • Direct3D libraries are semi-secretly COM objects, but this fact is usually hidden from developers. COM has an extremely strong C++ bias, so my decision to stick to C is certain to end up costing me at least something.
  • Versions of DirectX before 10 tended to be super-crashy unless you were very careful, because they allowed the OS to yank away the GPU resources you were using whenever it wanted, and not all programs were prepared to deal with that. In particular, this would happen like clockwork if you were running some fullscreen application and some taskbar application decided to pop up a notification.
  • Direct3D’s default Y axis and polygon winding are backwards from OpenGL.
  • At least some of my 2D operations are going to be much simpler because Direct3D 9 includes StretchRect() as a primitive.
  • There was once a library called D3DX that provided a lot of backup mathematical assistance for 3D work, very similar to the GLU library. Unfortunately, much like GLU, D3DX ended up being deprecated and removed. I have a vague sense that in D3D11 and later, much of this functionality was replaced by powerful inline functions defined in C++ header files external to D3D itself.

I’m far enough in now that I know that some of these things were only half-truths or qualified truths, but we’ll see those things when we come to them.

Up Next

The first thing I’ll need to do is convert my Windows application shell from WinLights from assembly language into C, and then expand that into a generic set of routines that can manage and display a pixel buffer. That would be sufficient power to let me port over pretty much all of the graphical but non-8-bit programs I’ve written on this blog over the years.

Restoring Traditional OOP to Go

The Go programming language is a relative newcomer, as such things are measured, and it’s a bit of an odd beast. My academic background included a great deal of programming language theory and comparison, and this has definitely colored my approach to the language. In particular, Go has always struck me as a language that looks to the past with a mixture of nostalgia and disdain: nostalgia, as its fundamental abstractions remind me more of 1970s systems languages and 1980s “end user” languages than current industry standards, and disdain for the trends in software engineering that grew out of them and came to dominate the field by the mid-1990s. It then had to operate in a world informed by and deeply embedded within those intervening 25 years of history, and it’s not always the cleanest fit.

The end result of all of this is that I do not particularly like Go, as a language. It has always struck me as loudly and pointedly casting aside the blinkers and excesses of modern object-oriented design, but then reinventing it very poorly. I have written extensively in the past about how object-oriented and functional programming styles can be imported into much simpler languages. Can we do the same thing moving forward, to provide a reasonable simulacrum of OOP behavior that Go’s design philosophy neglected or rejected?

Yes! Mostly.

Let’s start, though, by looking through what facilities the language provides us and how they map to the three main features of object-orientation:

  • ENCAPSULATION: Go programs are organized into packages. Only identifiers that start with a capital letter will be visible outside the package.
  • POLYMORPHISM: Go allows developers to define interfaces to which structures may conform. Interfaces may be used as type specifiers in their own right, which may be exploited in several ways to give polymorphic behavior.
  • INHERITANCE: Interfaces are additive, and may be defined to incorporate other interfaces within them. Any data type conforming to the outer interface will also conform to the types within it.

Well, OK then. We’ve hit all the points, right? What are we missing? Why have I been claiming that Go is neglecting OOP principles?

  • Data structures within a package can’t hide their members from other code within the package. In practice, this is fine: it’s still more enforcement than, say, Python has, and opaque-outside-of-the-package types are common and useful as-is.
  • Interfaces are maximally inclusive. If you need to restrict what data types may be used as part of an interface, you can’t rely on their lack of declarations. This is also rarely a problem in practice. In those cases where you do need to do this, the standard solution is to add dummy methods to your interface that serve only to identify intentional implementations. (See os.Signal for an example of this.)
  • Interfaces cannot have fields. If you wish to provide polymorphism over partially shared data, you will have to use accessor methods. This would normally be only mildly problematic, but this compounds our final issue…
  • Go has no notion of implementation inheritance. All data structure types are completely distinct. If you want ten types to behave identically when a shared method is called upon them, you must implement ten identical methods. The “type embedding” facilities can save you the need to type out the boilerplate by hand, but this doesn’t provide a complete solution.

This last is the real problem. As we’ll see, everything else is just a speed bump. Let’s get those out of the way first, and then we can take on the problem of implementation inheritance.
