A Look Under the Hood of objc_msgSend()

When I began to learn Objective-C, I often heard people talk about "sending messages". I was curious: come on, it's just calling a method.

Then I recalled having the same feeling when functions became methods as I started to touch C++.

To support polymorphism, the C++ object model gives every object of a polymorphic class a pointer to a vtable, which stores the addresses of the class's virtual methods. This gives C++ a little dynamic behavior.

But that dynamism is settled once the code is compiled to machine code, unlike Ruby or Python, which have a virtual machine to interpret the behavior of a program dynamically. You might know this as duck typing.
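To make the vtable idea concrete, here is a rough sketch in plain C of what a C++ compiler conceptually generates. All the names (Shape, ShapeVTable, make_rect and so on) are made up for illustration, and real compilers use an ABI-specific layout, so treat it as a model rather than the actual implementation.

```c
/* One table of function pointers per class (not per object). */
typedef struct Shape Shape;

typedef struct {
    int (*area)(const Shape *self);    /* slot 0 */
    int (*sides)(const Shape *self);   /* slot 1 */
} ShapeVTable;

/* Every object only carries a pointer to its class's table. */
struct Shape {
    const ShapeVTable *vptr;
    int w, h;
};

static int rect_area(const Shape *s)  { return s->w * s->h; }
static int rect_sides(const Shape *s) { (void)s; return 4; }
static int tri_area(const Shape *s)   { return s->w * s->h / 2; }
static int tri_sides(const Shape *s)  { (void)s; return 3; }

static const ShapeVTable RECT_VTABLE = { rect_area, rect_sides };
static const ShapeVTable TRI_VTABLE  = { tri_area,  tri_sides  };

Shape make_rect(int w, int h) { Shape s = { &RECT_VTABLE, w, h }; return s; }
Shape make_tri(int w, int h)  { Shape s = { &TRI_VTABLE,  w, h }; return s; }

/* A "virtual call": the slot is fixed at compile time,
 * only the table behind vptr varies at run time. */
int shape_area(const Shape *s)  { return s->vptr->area(s); }
int shape_sides(const Shape *s) { return s->vptr->sides(s); }
```

Calling shape_area on a value from make_rect or make_tri runs a different function through the same slot. The caller still has to know at compile time that the slot exists, which is exactly the limit that message sending removes.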

Like C++, Objective-C is compiled to machine code and supports polymorphism. But it goes a level further and supports duck typing too: you can send a message to any object, even nil.

What does it do to make it possible?

A Simple Program

Let's do some tests by first creating a simple program.

Example.h

#import <Foundation/Foundation.h>

@interface Example: NSObject
- (void)doSomething: (int)index;
@end

Example.m

#import "Example.h"

@implementation Example
- (void)doSomething: (int)index
{
}

@end

int main (int argc, const char * argv[]) {
    Example *p;
    [p doSomething:10];
    return 0;
}

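A class Example is declared, then a message is sent through a pointer p of that type. Note that p is never initialized here. That is enough to inspect the generated code, but a real program would create the instance first, for example with [[Example alloc] init].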

Original Shape

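Generate the assembly with clang -S Example.m, which produces Example.s.

The result is a little complicated. Generally speaking, it does three things.

First, it sets up the necessary information: the receiver goes into %rdi, the selector into %rsi and the argument 10 into %edx. Then it makes a call to _objc_msgSend. Finally, it reloads main's own return value 0 into %eax, which was spilled to a local slot before the call, and returns.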

Ltmp9:
.cfi_def_cfa_register %rbp
subq $32, %rsp
movl $0, %eax
movl $10, %edx
movl $0, -4(%rbp)
movl %edi, -8(%rbp)
movq %rsi, -16(%rbp)
movq -24(%rbp), %rsi
movq L_OBJC_SELECTOR_REFERENCES_(%rip), %rcx
movq %rsi, %rdi
movq %rcx, %rsi
movl %eax, -28(%rbp) ## 4-byte Spill
callq _objc_msgSend
movl -28(%rbp), %eax ## 4-byte Reload
addq $32, %rsp
popq %rbp
ret
.cfi_endproc

Now we can jump into the objc runtime source code to see what happens down the rabbit hole.

The Runtime

objc_msgSend is called for every message send, which might happen millions of times just by booting the Mac OS X system. Obviously it needs to be finely tuned, so it is no surprise that it is written in assembly.

Although it is assembly, it is fairly readable thanks to good naming conventions and explanatory comments.

runtime/Messengers.subproj/objc-msg-x86_64.s

ENTRY	_objc_msgSend
DW_START _objc_msgSend

NilTest NORMAL

GetIsaFast NORMAL // r11 = self->isa
CacheLookup NORMAL, _objc_msgSend // r11=method, eq set (nonstret fwd)
jmp *method_imp(%r11) // goto *imp

NilTestSupport NORMAL

GetIsaSupport NORMAL

First, the NilTest macro checks whether the receiver of the message is nil. If it is, objc_msgSend returns nil right away, which is why we can send messages to nil.

Then GetIsaFast loads the receiver's isa pointer.

Every objc object has a member named isa that points to its class. The class is the blueprint of the object: it holds everything the objc runtime needs to inspect an object, see what its class is, and check whether it responds to a given message.
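These first two steps can be modeled in a few lines of plain C. This is only a sketch: ToyClass, ToyObject and the toy_ functions are hypothetical names, and the single hard-wired method stands in for whatever the later lookup would find.

```c
#include <stddef.h>

typedef struct ToyClass {
    const char *name;
} ToyClass;

/* As in the real runtime, the class pointer is the object's first field. */
typedef struct ToyObject {
    const ToyClass *isa;
    int value;
} ToyObject;

/* Hypothetical stand-in for the method that would eventually be found. */
static long toy_get_value(ToyObject *self) { return self->value; }

/* Returns the receiver's class, or NULL for a nil receiver. */
const ToyClass *toy_get_class(const ToyObject *self)
{
    return self ? self->isa : NULL;
}

long toy_send_value(ToyObject *self)
{
    if (self == NULL)                  /* NilTest: a message to nil returns 0 */
        return 0;
    const ToyClass *cls = self->isa;   /* GetIsaFast: load self->isa */
    (void)cls;                         /* a real dispatcher would now search cls */
    return toy_get_value(self);
}
```

Because the nil check comes before anything touches self->isa, toy_send_value(NULL) simply yields 0 instead of crashing.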

Finally, CacheLookup searches the class's method cache for the method that implements the selector, and objc_msgSend jumps to that implementation to finish the work.

Since objc is an object-oriented language, an object may inherit hundreds of methods, but only some of them are called frequently. Walking through all of the methods every time a message is sent would be inefficient, hence the cache.
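The idea of such a cache can be sketched as a small hash table keyed by the selector. The code below is a toy model in plain C, not the runtime's actual cache layout: ToySel, ToyImp, ToyCache, CACHE_SLOTS and the hash function are all assumptions made for illustration. It only relies on the fact that selectors are unique pointers, so they can be compared with ==.

```c
#include <stddef.h>
#include <stdint.h>

typedef const char *ToySel;                     /* stand-in for SEL, compared by pointer */
typedef int (*ToyImp)(void *self, ToySel sel);  /* stand-in for IMP */

#define CACHE_SLOTS 8                           /* must be a power of two */

typedef struct {
    ToySel sel;                                 /* NULL marks an empty slot */
    ToyImp imp;
} ToyCacheEntry;

typedef struct {
    ToyCacheEntry slots[CACHE_SLOTS];
} ToyCache;

/* Hypothetical method body, only here so there is something to cache. */
int toy_do_something(void *self, ToySel sel) { (void)self; (void)sel; return 10; }

static size_t slot_for(ToySel sel)
{
    return ((uintptr_t)sel >> 3) & (CACHE_SLOTS - 1);
}

/* Fast path: returns the cached implementation, or NULL on a cache miss. */
ToyImp toy_cache_lookup(const ToyCache *cache, ToySel sel)
{
    size_t start = slot_for(sel);
    for (size_t i = 0; i < CACHE_SLOTS; i++) {
        const ToyCacheEntry *e = &cache->slots[(start + i) & (CACHE_SLOTS - 1)];
        if (e->sel == sel)  return e->imp;      /* hit */
        if (e->sel == NULL) return NULL;        /* empty slot ends the probe: miss */
    }
    return NULL;                                /* table full, selector absent */
}

/* Records sel -> imp; returns 1 on success, 0 if the table is full. */
int toy_cache_fill(ToyCache *cache, ToySel sel, ToyImp imp)
{
    size_t start = slot_for(sel);
    for (size_t i = 0; i < CACHE_SLOTS; i++) {
        ToyCacheEntry *e = &cache->slots[(start + i) & (CACHE_SLOTS - 1)];
        if (e->sel == NULL || e->sel == sel) {
            e->sel = sel;
            e->imp = imp;
            return 1;
        }
    }
    return 0;
}
```

A hit costs one hash and usually a single comparison, no matter how many methods the class inherits. A NULL result is the toy equivalent of a cache miss.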

If the objc runtime fails to find the selector in the cache, it jumps to LCacheMiss to handle the cache miss.

There, MethodTableLookup takes over and looks the selector up starting from the receiver's isa.

LCacheMiss:
DW_MISS _objc_msgSend
GetIsa NORMAL // r11 = self->isa
MethodTableLookup %a1, %a2, _objc_msgSend // r11 = IMP
cmp %r11, %r11 // set eq (nonstret) for forwarding
jmp *%r11 // goto *imp

DW_END _objc_msgSend, 1, 1
END_ENTRY _objc_msgSend

The MethodTableLookup macro simply hands the work over to a function in the runtime.

.macro MethodTableLookup
...

call __class_lookupMethodAndLoadCache3
...
.endmacro

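Again, the responsibility is passed on, this time to lookUpMethod. Pay attention to how the parameters are assigned: initialize is YES and cache is NO, because the dispatcher has already tried the cache.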

// runtime/objc-class.mm
/***********************************************************************
* _class_lookupMethodAndLoadCache.
* Method lookup for dispatchers ONLY. OTHER CODE SHOULD USE lookUpMethod().
* This lookup avoids optimistic cache scan because the dispatcher
* already tried that.
**********************************************************************/
IMP _class_lookupMethodAndLoadCache3(id obj, SEL sel, Class cls)
{
return lookUpMethod(cls, sel, YES/*initialize*/, NO/*cache*/, obj);
}

Finally, lookUpMethod does the actual lookup and refills the cache.

// runtime/objc-class.mm
/***********************************************************************
* lookUpMethod.
* The standard method lookup.
* initialize==NO tries to avoid +initialize (but sometimes fails)
* cache==NO skips optimistic unlocked lookup (but uses cache elsewhere)
* Most callers should use initialize==YES and cache==YES.
* inst is an instance of cls or a subclass thereof, or nil if none is known.
* If cls is an un-initialized metaclass then a non-nil inst is faster.
* May return _objc_msgForward_internal. IMPs destined for external use
* must be converted to _objc_msgForward or _objc_msgForward_stret.
**********************************************************************/
IMP lookUpMethod(Class cls, SEL sel, BOOL initialize, BOOL cache, id inst)
{
...
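Since the body of lookUpMethod is elided above, here is a toy model in plain C of what such a slow path does in principle: search the class and its superclasses, fall back to a forwarding handler, and remember the answer. It is not the runtime's code. The Toy types, toy_forward, the naive linear cache and the slow_lookups counter are all assumptions made for illustration.

```c
#include <stddef.h>

typedef const char *ToySel;                     /* stand-in for SEL, compared by pointer */
typedef int (*ToyImp)(void *self, ToySel sel);  /* stand-in for IMP */

typedef struct { ToySel sel; ToyImp imp; } ToyMethod;

#define TOY_CACHE_SIZE 4

typedef struct ToyClass {
    const struct ToyClass *super;       /* NULL for the root class */
    const ToyMethod *methods;           /* methods defined by this class only */
    size_t method_count;
    ToyMethod cache[TOY_CACHE_SIZE];    /* deliberately naive linear cache */
    size_t cache_used;
    unsigned slow_lookups;              /* counts trips through the slow path */
} ToyClass;

/* Hypothetical selectors and method bodies for the example. */
const char TOY_SEL_DESCRIPTION[]  = "description";
const char TOY_SEL_DO_SOMETHING[] = "doSomething:";
const char TOY_SEL_UNKNOWN[]      = "unknown";
int toy_description(void *self, ToySel sel)  { (void)self; (void)sel; return 1; }
int toy_do_something(void *self, ToySel sel) { (void)self; (void)sel; return 10; }

/* Returned when no class in the chain implements the selector. */
int toy_forward(void *self, ToySel sel) { (void)self; (void)sel; return -1; }

/* Slow path: search cls and its superclasses, then remember the answer in cls. */
static ToyImp toy_lookup_method(ToyClass *cls, ToySel sel)
{
    ToyImp found = NULL;
    cls->slow_lookups++;
    for (const ToyClass *c = cls; c != NULL && found == NULL; c = c->super) {
        for (size_t i = 0; i < c->method_count; i++) {
            if (c->methods[i].sel == sel) { found = c->methods[i].imp; break; }
        }
    }
    if (found == NULL)
        found = toy_forward;            /* nobody implements sel */
    if (cls->cache_used < TOY_CACHE_SIZE) {
        cls->cache[cls->cache_used].sel = sel;
        cls->cache[cls->cache_used].imp = found;
        cls->cache_used++;
    }
    return found;
}

/* Fast path first, slow path only on a miss. */
ToyImp toy_imp_for(ToyClass *cls, ToySel sel)
{
    for (size_t i = 0; i < cls->cache_used; i++)
        if (cls->cache[i].sel == sel) return cls->cache[i].imp;
    return toy_lookup_method(cls, sel);
}
```

Asking a subclass for an inherited selector twice goes through toy_lookup_method only once. The second request is answered from the subclass's own cache, which is the point of refilling it.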

If you want to dig deeper, the objc runtime source can be downloaded at this place.


Finally, where does the objc runtime live in the Mac OS X system?

/usr/lib/libobjc.A.dylib

Why am I so sure?

Because I lost my mind and removed the runtime to see what would happen.

First, Terminal stopped working, then apps became unresponsive, no new app could be opened, and I couldn't even reboot the system ...

Now I know deeply how important the objc runtime is to Mac OS X. Thankfully, booting into recovery gives you a terminal to use, so I was lucky.