Objective-C for the AVR, part 2

This is part two in an occasional series about my AVR Objective-C runtime. See also my other entries on the topic (1, 3, 4).

My previous entry culminated in the successful dynamic dispatch of an Objective-C message. But there are a few other cases of messaging that I didn't cover. Here, I'll discuss how I implemented them in my runtime.

Messaging nil

In Objective-C, sending a message to nil is a legal operation, and the result of the message (if not of type void) is a value whose bit representation is all zeros: for example, a null pointer for pointer types, 0 for integer types, 0.0 for floating-point types, false for _Bool, and so on.

In theory this could be handled by the compiler. It could insert a check at each message-send and bypass the messaging routine if the receiver is nil. But this could prove extremely costly in terms of code size for message-send-heavy code, increasing memory pressure and reducing the effectiveness of the instruction cache. (In theory, certain checks could be elided where the compiler can prove the receiver is non-nil—messages to self, messages to class objects, or other situations provable with static analysis. But many callsites would still need to be checked.)

So in most cases, the nil handling is done by the runtime. For avr-objc, that means it needs to happen in objc_msgSend.

It's easy enough to check whether the first argument is nil: check r24 and r25, the registers in which a two-byte id would be passed according to the calling convention. Here's what that looks like:

    ; self <- r25:r24
    ; _cmd <- r23:r22

    tst r24
    brne __msgSend_nonzero
    tst r25
    brne __msgSend_nonzero

    ; TODO: handle nil self

    ; ...

What goes in the TODO? When self is nil, the routine needs to return the appropriate "zero" value based on the signature of the method being called. But since self is nil there's no way to know what class of object the sender expects to be messaging.

Even _cmd—which is assumed to be always valid—is no help here. After all, there might be two unrelated classes, one of which declares - (uint8_t)size and another which declares - (double)size. So whatever code goes here has to work for any of the return type possibilities.

The calling convention tells us that:

Return values with a size of 1 byte up to and including a size of 8 bytes will be returned in registers. Return values whose size is outside that range will be returned in memory.

Conveniently, most of our common C types—_Bool, all of the integer types, all of the floating-point types, all pointer types—fit within 8 bytes. It also reminds us that:

R18–R27, R30, R31 [...] are call clobbered. An ordinary function may use them without restoring the contents.

Put together, this means that the message dispatcher can set r18 through r25 to zero regardless of what the caller expects. This works out nicely: no matter what size the expected value is (up to 8 bytes), the caller will get the right zero value. If the expected return type is smaller than 8 bytes, the caller will simply "see" the extra zeros as clobbered registers—no harm done.

So the nil branch simply looks like:

; self == nil, so return zeros
ldi r18, 0
ldi r19, 0
ldi r20, 0
ldi r21, 0
ldi r22, 0
ldi r23, 0
ldi r24, 0
ldi r25, 0
ret             ; hand the all-zero result back to the caller
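
To see why one all-zero register block serves every return type, here is a minimal host-side sketch in C. It is a model, not part of the runtime: `ret_regs`, `nil_return`, `read_return`, and the `nil_as_*` helpers are hypothetical names, and the placement rule (odd sizes round up to an even number of registers, and the value ends at r25) is my reading of the avr-gcc convention.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the return registers: bytes[0] = r18 ... bytes[7] = r25. */
typedef struct { uint8_t bytes[8]; } ret_regs;

/* What the nil branch does: zero every register a return value could occupy. */
ret_regs nil_return(void) {
    ret_regs regs;
    memset(regs.bytes, 0, sizeof regs.bytes);
    return regs;
}

/* Read a size-byte return value the way a caller would.
 * Assumption: odd sizes round up to an even register count, and the value
 * ends at r25 (so 1 byte sits in r24, 2 bytes in r25:r24, and so on). */
void read_return(const ret_regs *regs, void *out, size_t size) {
    assert(size >= 1 && size <= sizeof regs->bytes);
    size_t rounded = (size + 1) & ~(size_t)1;
    memcpy(out, regs->bytes + (sizeof regs->bytes - rounded), size);
}

/* The same zeroed registers, interpreted as three different return types. */
uint8_t nil_as_u8(void) {
    ret_regs r = nil_return();
    uint8_t v;
    read_return(&r, &v, sizeof v);
    return v;
}

double nil_as_double(void) {
    ret_regs r = nil_return();
    double v;
    read_return(&r, &v, sizeof v);
    return v;
}

void *nil_as_pointer(void) {
    ret_regs r = nil_return();
    void *v;
    read_return(&r, &v, sizeof v);
    return v;
}
```

Whichever width the caller expects, it only ever reads zeros; the registers it doesn't read are ones it already has to treat as clobbered.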

Struct return

What about values larger than 8 bytes? In C (and Objective-C), a function (or method) can return certain aggregate types (structs and unions, though not arrays), weird implementation-defined integer types like __uint128_t, plus one other case that I'll cover below. The calling convention states that, for values of these types:

the caller will allocate stack space and pass the address as implicit first pointer argument to the callee. The callee will put the return value into the space provided by the caller.

So these values are returned "in memory", and since they are of unknown size, there's nothing the dispatcher can do to create the return value for the nil case. Write too few bytes through the pointer, and the caller will see garbage; write too many, and you've smashed the stack.

So for these situations, the compiler—which statically knows the return type—needs to insert code for a nil check. And we can see empirically that it does:[1]

$ avr-gcc -S -o - -Os -fnext-runtime -fobjc-abi-version=2 -fno-objc-sjlj-exceptions -fobjc-nilcheck -xobjective-c -
struct point { double x, y, z; };

@interface View3D
- (struct point)origin;
+ (void)printPoint:(struct point)p;
@end

View3D *view;

int main(void) {
    struct point p = [view origin];
    [View3D printPoint:p];
}

    ; ...
    lds r22,view          ; load `view` global
    lds r23,view+1
    mov r17,r29           ; copy the frame pointer (Y) into r17:r16
    mov r16,r28
    subi r16,-1           ; add 1, so r17:r16 points at the return buffer
    sbci r17,-1
    cp r22,__zero_reg__   ; is `view == nil` ?
    cpc r23,__zero_reg__
    breq .L2              ; if so, branch off to `.L2`
    lds r30,msgref        ; otherwise, call `objc_msgSend_fixup`
    lds r31,msgref+1
    ldi r20,lo8(msgref)
    ldi r21,hi8(msgref)
    mov r25,r17
    mov r24,r16
    icall                 ; indirect call through Z
.L3:                      ; come back from either messaging path
    ; ...                 ; do rest of the function
.L2:                      ; messaging path for `view == nil`
    ldi r24,lo8(24)       ; loop through stack memory, storing 24 bytes of zeros
    mov r31,r17
    mov r30,r16
0:                        ; top of the zero-fill loop
    st Z+,__zero_reg__
    dec r24
    brne 0b
    rjmp .L3              ; return to the common path at `.L3`
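
Read back into C, the compiler's output amounts to something like the sketch below. This is an illustration of the shape of the generated code, not what GCC literally produces: `view3d`, `view3d_origin_lowered`, and `send_origin` are hypothetical names, and the direct call stands in for the real trip through the dispatcher.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct point { double x, y, z; };

/* Hypothetical stand-in for a View3D instance; the real object layout differs. */
typedef struct view3d { struct point origin; } view3d;

/* Lowered form of -[View3D origin]: the return buffer arrives as a hidden
 * first argument, ahead of self. */
void view3d_origin_lowered(struct point *ret, const view3d *self) {
    assert(ret != NULL);
    *ret = self->origin;
}

/* Rough equivalent of the callsite `[view origin]` with -fobjc-nilcheck. */
struct point send_origin(const view3d *view) {
    struct point p;
    if (view != NULL) {
        view3d_origin_lowered(&p, view);  /* normal path: dispatch the message */
    } else {
        memset(&p, 0, sizeof p);          /* the `.L2` path: zero-fill the buffer */
    }
    return p;
}
```

Only the callsite knows `sizeof p`, which is why the zero-fill has to live there rather than in the dispatcher.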

Struct return: a bigger concern

However, this brings up a much more important point. If the caller is passing a pointer to a return value buffer as the first argument, then the message dispatcher is going to treat that pointer as self and try to dispatch through it. That's completely wrong, and unless we're extremely lucky, it will crash the program.

Enter objc_msgSend_stret, a second dispatcher, designed to handle this very problem.[2] (Although stret stands for "struct return", it's used for any method whose return value is returned in memory. Structs that fit within the 8-byte register limit in fact don't use stret messaging.)

Luckily, all of the hard work of objc_msgSend is already split out into class_lookupMethod. objc_msgSend_stret simply needs to look at r23:r22 for self (and r21:r20 for _cmd) and then follow the same procedure otherwise. And since stret messages are nil-checked by the compiler, objc_msgSend_stret can skip the nil check we just added to objc_msgSend proper.

stret messaging is orthogonal to fixup messaging, described in my previous entry, so (yes) I also had to write an objc_msgSend_stret_fixup. I took a little time to refresh my knowledge of GNU assembler macros to make the code a bit more maintainable.

I won't claim to be an expert there, but I'm pretty content with what I ended up with. Here's a macro that adds a wrapper around objc_msgSend_foo to create objc_msgSend_foo_fixup:

.macro FIXUPVARIANT messenger stret
.globl \messenger\()_fixup
\messenger\()_fixup:
    ; X <- message_struct
    .ifb \stret
        movw X, r22
    .else
        movw X, r20
    .endif

    ; X <- &message_struct._cmd
    adiw X, 2

    ; _cmd <- *X
    .ifb \stret
        ld r22, X+
        ld r23, X+
    .else
        ld r20, X+
        ld r21, X+
    .endif

    jmp \messenger
.endm

FIXUPVARIANT objc_msgSend_stret STRET

Complex types

One questionable thing that C (and therefore Objective-C) supports is a set of complex types: _Complex float, _Complex double, and _Complex long double. Each is represented by two instances of its associated "real" type, one instance for the real axis, and one for the imaginary axis. So you can imagine _Complex float as looking like float[2], and so on.

double and long double are 64 bits wide in my environment, so _Complex double and _Complex long double are 16 bytes each and do not fit within the 8-byte limit. GCC correctly uses the "stret" calling convention for them, passing a stack buffer for the return value in r25:r24.
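
The size arithmetic is easy to check on a host compiler. The sketch below assumes a host where float is 4 bytes and double is 8 (which matches my AVR configuration for double); `returned_in_memory` and the two wrappers are hypothetical helpers that simply apply the 8-byte rule quoted earlier.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Register-return limit, taken from the calling convention quoted above. */
enum { MAX_REGISTER_RETURN_BYTES = 8 };

/* Hypothetical helper: is a value of this size returned in memory? */
bool returned_in_memory(size_t size) {
    assert(size >= 1);
    return size > MAX_REGISTER_RETURN_BYTES;
}

/* A complex value is laid out like a two-element array of its real type. */
bool complex_float_returned_in_memory(void) {
    return returned_in_memory(sizeof(float _Complex));
}

bool complex_double_returned_in_memory(void) {
    return returned_in_memory(sizeof(double _Complex));
}
```

With those sizes, _Complex float squeaks in at exactly 8 bytes and comes back in registers, while _Complex double needs the memory path.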

Unfortunately, it then proceeds to call the non-stret messenger. So that's a bug! Luckily, no one uses this feature, so it doesn't really matter.

Messaging super

The code I wrote in my last entry, in class_lookupMethod, handled the case where a class Derived inherits a method from superclass Base without overriding it. If that method is called on an instance of Derived, the runtime will automatically dispatch it to the method implementation from Base.

A class can also override a method and explicitly call into the inherited implementation, using a [super ...] message.

super messages are different from normal messages, in that the runtime can't simply start at the object's class when looking for the implementation. Instead, the search must start from the class whose implementation contains the callsite. If we're executing inside -[Derived overriddenMethod], then a call to [super overriddenMethod] should start looking in the superclass of Derived (in this case, Base).

The dispatcher has no reliable way of knowing the identity of its caller,[3] so the caller provides that information explicitly, in the form of a struct objc_super:

struct objc_super {
    id receiver;
    Class super_class; /* super_class is the first class to search */
};

When the [super ...] message is ultimately dispatched to the implementation from Base, the arguments need to be in the same places as for an ordinary (i.e., non-super) message. (Otherwise, there would have to be two implementations of each of Base's methods—one for ordinary messages, and one for super messages.)

For that reason, the struct objc_super is passed into the dispatcher by pointer, in place of self. A special dispatcher routine—objc_msgSendSuper—knows how to handle the struct. Because class_lookupMethod takes a class, not an object, we can simply direct it to search for a method starting at super_class rather than receiver->_isa.

After class_lookupMethod resolves an IMP, objc_msgSendSuper swaps receiver into r25:r24—the spot for the first argument—before jumping to the IMP.
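
Here is a small C model of the whole arrangement, to make the two starting points concrete. It is a sketch under heavy simplifications, not my runtime's code: selectors are plain C strings, every method has the same signature, `msgSend` and `msgSendSuper` are hypothetical stand-ins for the real assembly dispatchers, and this `class_lookupMethod` is a toy version of the real one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct objc_class *Class;
typedef struct objc_object { Class isa; } *id;
typedef const char *SEL;                /* simplification: selectors as C strings */
typedef int (*IMP)(id self, SEL _cmd);  /* simplification: one method signature */

struct method { SEL name; IMP imp; };

struct objc_class {
    Class super_class;
    const struct method *methods;
    size_t method_count;
};

struct objc_super {
    id receiver;
    Class super_class; /* super_class is the first class to search */
};

/* Toy lookup: walk the class chain from cls until the selector matches. */
IMP class_lookupMethod(Class cls, SEL sel) {
    for (; cls != NULL; cls = cls->super_class)
        for (size_t i = 0; i < cls->method_count; i++)
            if (strcmp(cls->methods[i].name, sel) == 0)
                return cls->methods[i].imp;
    return NULL;
}

/* Ordinary send: the search starts at the receiver's own class. */
int msgSend(id self, SEL sel) {
    if (self == NULL) return 0;                    /* messaging nil */
    IMP imp = class_lookupMethod(self->isa, sel);
    return imp != NULL ? imp(self, sel) : 0;       /* simplification: ignore misses */
}

/* Super send: the search starts at sup->super_class, but the IMP still
 * receives the real receiver as its first argument. */
int msgSendSuper(struct objc_super *sup, SEL sel) {
    IMP imp = class_lookupMethod(sup->super_class, sel);
    return imp != NULL ? imp(sup->receiver, sel) : 0;
}

extern struct objc_class Base;

int base_value(id self, SEL _cmd) { (void)self; (void)_cmd; return 1; }

/* -[Derived value] overrides Base's method and calls [super value]. */
int derived_value(id self, SEL _cmd) {
    struct objc_super sup = { self, &Base };
    return 10 + msgSendSuper(&sup, _cmd);
}

const struct method base_methods[] = { { "value", base_value } };
const struct method derived_methods[] = { { "value", derived_value } };
struct objc_class Base = { NULL, base_methods, 1 };
struct objc_class Derived = { &Base, derived_methods, 1 };
```

Note that `derived_value` names `Base` statically. That's the whole point: if the search started from the receiver's isa, a [super value] inside Derived would just find derived_value again and recurse forever.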

Again, since fixup and stret are orthogonal to super messaging, there are actually four super message routines to accommodate the various possibilities. (In the stret cases, the struct objc_super is passed in r23:r22, which is again where self would normally reside.)

And that does it for messaging.

In my next entry, I'll discuss load-time parsing of class and category metadata.

  1. Well, it does if I first fix this frontend bug↩︎

  2. Greg Parker has a good description here, including details of how it works on somewhat less esoteric architectures. ↩︎

  3. Yeah, the dispatcher could look at the return address saved on the stack to deduce its caller's identity. But that return address might not reflect the actual caller if tail-call optimization is enabled. And resolving a symbol from an address is nontrivial in an environment without symbol tables. It would also be costly, and, anyway, it would be more code I'd have to write! ↩︎