I have been writing a macOS userspace driver for a CAN bus adapter. The device uses a custom USB protocol — bulk endpoints, vendor control transfers, a command language with opcodes and terminators. The Linux kernel has an open-source driver for the same hardware, so porting it should have been straightforward. I read the kernel source, mapped the protocol structures, implemented the init sequence in Rust, and… nothing. The device’s status LED stayed dark. No frames received.
The closed-source macOS driver from the vendor, on the other hand, worked perfectly. Plug in, initialize, green LED, frames flowing. Same device, same bus, same cable.
Obviously the vendor’s driver was doing something different. But what?
On Linux, you would fire up Wireshark with usbmon, or cat one of the per-bus files under /sys/kernel/debug/usb/usbmon/ (such as 0u), and have a complete packet-level trace within seconds. On Windows, there is USBPcap. On macOS, there is… nothing comparable.
Let me be specific about the options I considered:
Hardware USB analyzers. The gold standard. A device like a Total Phase Beagle or an Ellisys sits between the host and the device and captures every transaction at the electrical level. They work on any OS, they are completely transparent, and they cost between 1,000 and 10,000 euros. I did not have one, and buying one to debug a single driver port felt disproportionate.
Wireshark. Supports USB capture on Linux (via usbmon) and Windows (via USBPcap). There is also a macOS XHC20 capture path, but on current macOS it is not the same convenient “start capture, see packets” workflow. In practice it still runs into the same security wall: recent Wireshark guidance points at disabling System Integrity Protection for USB capture on Catalina and later. That put it in the same bucket as dtrace for me: interesting, but not a workflow I wanted for a driver port.
Apple’s Instruments. Has a “USB” trace template, but it records high-level IOKit events — device attach, configuration changes, interface claims. Not bulk transfer payloads. Useless for protocol-level debugging.
The macOS log subsystem. You can enable debug logging for com.apple.usb and com.apple.iokit, but what you get is plugin loading messages, device matching events, and power management transitions. No pipe data.
dtrace. Powerful, kernel-level, could theoretically trace IOKit calls. But on modern macOS, dtrace requires disabling System Integrity Protection — which means rebooting into Recovery Mode, running csrutil disable, rebooting again, doing your trace, then rebooting into Recovery again to re-enable SIP. Even then, many probes are restricted or unavailable in recent macOS versions. Apple has been tightening the screws on dtrace with every release, and it is unclear how long the remaining functionality will survive. I did not want to build my workflow on a tool that requires disabling a core security feature and might stop working entirely next year.
Vendor library debug logging. The library advertises a LOG_USB parameter. I enabled it. No log files were written. Presumably compiled out in the release build.
None of these were viable. I needed another approach.
The vendor library uses the classic IOKit plugin mechanism: it calls IOCreatePlugInInterfaceForService to get an IOCFPlugInInterface, then QueryInterface to get an IOUSBInterfaceInterface. From there, it calls methods like WritePipe, ReadPipe, and ControlRequest through a COM-style vtable.
These vtable calls are indirect — function pointers in a struct, not linked symbols. You cannot interpose the vtable entries just by injecting a library with DYLD_INSERT_LIBRARIES. You also cannot easily patch the vtable, because it lives in read-only memory within the IOUSBLib plugin bundle.
But here is the thing: the synchronous methods I cared about are wrappers around the IOKit user-client call path. Internally, they end up at IOConnectCallMethod — a regular C function exported from IOKit.framework. That function can be interposed.
macOS’s dynamic linker honors an environment variable called DYLD_INSERT_LIBRARIES, which lets you inject a shared library into a process at startup. Combined with the __DATA,__interpose section, you can redirect calls to any dynamically linked function to your own implementation.
The idea is simple:
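1. Write a .dylib that defines an interposed replacement for IOConnectCallMethod.
2. Set DYLD_INSERT_LIBRARIES=./your_lib.dylib before launching the target.
3. In the replacement, log the arguments, call the original function, and log the result.

The tricky part is step 3.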
In many interposition examples, you call the original function via dlsym(RTLD_NEXT, "function_name"). Apple documents exactly that pattern, and for ordinary dependent-library interposition it is the right first thing to try.
For this logger, I wanted something more deterministic. I was interposing a system framework function from an injected library, inside a process that loaded system IOKit plugins and shared-cache code. I had already hit the classic failure mode: resolve what looks like the original, call it, and end up back in the logger. Infinite recursion. Stack overflow. Crash.
I tried a few workarounds:
- dyld_info exports: possible in theory, but brittle across macOS updates, shared-cache rebuilds, and chained fixups.
- vm_protect gymnastics: much more invasive than necessary.

The key insight is that IOConnectCallMethod and IOConnectCallAsyncMethod are two different exported symbols that both end up in the same IOUserClient external-method machinery. If you only interpose IOConnectCallMethod, you can call through to IOConnectCallAsyncMethod with MACH_PORT_NULL as the wake port. With no async wake port, IOKit takes the synchronous method path, and the dynamic linker routes the call to the real IOConnectCallAsyncMethod, because that symbol was never interposed.
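    #include <IOKit/IOKitLib.h>

    /* Interposing macro as found in Apple's dyld-interposing.h: emits a
       (replacement, replacee) pair into the __DATA,__interpose section. */
    #define DYLD_INTERPOSE(_replacement, _replacee) \
        __attribute__((used)) static struct { \
            const void *replacement; \
            const void *replacee; \
        } _interpose_##_replacee \
        __attribute__((section("__DATA,__interpose"))) = { \
            (const void *)(unsigned long)&_replacement, \
            (const void *)(unsigned long)&_replacee };

    /* Logging helper, defined elsewhere in the logger. */
    void hex_dump(const char *tag, const void *buf, size_t len);

    kern_return_t my_IOConnectCallMethod(
        mach_port_t c, uint32_t sel,
        const uint64_t *si, uint32_t sic,
        const void *is, size_t isc,
        uint64_t *so, uint32_t *soc,
        void *os, size_t *osc)
    {
        // Log the outbound data
        if (isc > 0 && is) hex_dump("OUT", is, isc);

        // Call the real implementation via the non-interposed sibling.
        // MACH_PORT_NULL keeps this on the synchronous method path.
        kern_return_t kr = IOConnectCallAsyncMethod(
            c, sel, MACH_PORT_NULL,
            NULL, 0, si, sic, is, isc,
            so, soc, os, osc);

        // Log the inbound response
        if (osc && *osc > 0 && os) hex_dump("IN", os, *osc);
        return kr;
    }

    DYLD_INTERPOSE(my_IOConnectCallMethod, IOConnectCallMethod)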
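The logger calls a hex_dump helper that is not shown here. A minimal sketch of what such a helper could look like, assuming output goes to stderr at 16 bytes per row. The row layout and the hex_line function are illustrative assumptions, not the format of the actual logger:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format up to 16 bytes as "xx xx xx ..." into dst.
   Returns the number of characters written (excluding the NUL). */
size_t hex_line(char *dst, size_t cap, const unsigned char *p, size_t n)
{
    size_t used = 0;
    if (cap) dst[0] = '\0';
    for (size_t i = 0; i < n && i < 16; i++) {
        int w = snprintf(dst + used, cap - used, i ? " %02x" : "%02x", p[i]);
        if (w < 0 || (size_t)w >= cap - used) break; /* out of space: stop */
        used += (size_t)w;
    }
    return used;
}

/* Sketch of hex_dump: a "TAG (N bytes)" header, then one row per 16 bytes,
   each prefixed with its offset into the buffer. */
void hex_dump(const char *tag, const void *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    const unsigned char *p = buf;
    char line[16 * 3 + 1];

    fprintf(stderr, "%s (%zu bytes)\n", tag, len);
    for (size_t off = 0; off < len; off += 16) {
        size_t n = len - off < 16 ? len - off : 16;
        hex_line(line, sizeof line, p + off, n);
        fprintf(stderr, "  %04zx  %s\n", off, line);
    }
}
```

Writing to stderr keeps the trace separate from whatever the target process prints on stdout, so the two can be redirected independently.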
Build it, inject it, run the vendor’s driver — and the USB traffic that this library sends through IOUSBLib flows through your logger. Control transfers, bulk writes, bulk reads, all of it. No SIP bypass needed. No kernel extension. No hardware USB analyzer. Just 60 lines of C.
The trace immediately showed why my driver did not work. Two differences jumped out:
Transfer size. The device operates at USB Full Speed (12 Mbps). Every command I sent was in a 16-byte bulk transfer — the command payload plus a terminator, tightly packed. The vendor’s driver sent every command as a 512-byte transfer request, padded with zeros. That is not a single USB packet on Full Speed; the host controller splits it into endpoint-sized packets. But at the transfer level, the device clearly expected a fixed-size command block. The firmware silently discarded the short writes.
Terminator format. I was using 8 bytes of 0xFF as the end-of-command marker, following what I thought the Linux kernel source was doing. The vendor used a 2-byte marker: a specific opcode value (0x03FF), which is the maximum valid opcode in the protocol — essentially a “no-op, stop processing” sentinel. The 8-byte version was never part of the protocol at all; I had misread the kernel constant.
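To make the transfer-size difference concrete, here is a small sketch of the packet arithmetic. It assumes a 64-byte maximum packet size, which is the largest a Full Speed bulk endpoint can have; the actual endpoint size of this device is not stated here, and the function names are made up for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Number of data packets needed for one bulk transfer of `len` bytes on an
   endpoint with max packet size `mps`. A zero-length transfer still takes
   one (empty) packet. */
size_t bulk_packet_count(size_t len, size_t mps)
{
    assert(mps > 0);
    if (len == 0) return 1;
    return (len + mps - 1) / mps; /* round up */
}

/* A transfer ends in a short packet when its length is zero or not a
   multiple of the max packet size. */
int ends_with_short_packet(size_t len, size_t mps)
{
    assert(mps > 0);
    return len == 0 || len % mps != 0;
}
```

Under that assumption, my 16-byte write goes out as a single short packet, while the vendor's 512-byte write goes out as eight full 64-byte packets. Both are valid USB transfers, which is why the bus itself reported no error.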
Neither issue produced an error. The device accepted the USB transfers, returned success codes, and maintained its connection. It just did not execute the commands. Silent failure — the worst kind.
Once I knew the real wire format, the fix was trivial: build a 512-byte command buffer, pad the unused bytes with zeros, and use the correct 2-byte terminator. The device came alive immediately — green LED, frames flowing, exactly as with the vendor driver.
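A sketch of that fix in C, to match the logger. The block size and the terminator value come from the trace; the function name, the little-endian byte order of the terminator, and its position directly after the payload are assumptions for illustration and should be checked against a real capture:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CMD_BLOCK_SIZE 512u    /* fixed transfer size seen in the trace */
#define CMD_TERMINATOR 0x03FFu /* 2-byte end-of-command opcode */

/* Fill `block` with payload + terminator, zero-padded to CMD_BLOCK_SIZE.
   Returns 0 on success, -1 if payload and terminator do not fit. */
int build_command_block(uint8_t block[CMD_BLOCK_SIZE],
                        const uint8_t *payload, size_t payload_len)
{
    assert(block != NULL);
    if (payload_len > CMD_BLOCK_SIZE - 2) return -1;

    memset(block, 0, CMD_BLOCK_SIZE); /* zero padding */
    if (payload_len) memcpy(block, payload, payload_len);

    /* ASSUMPTION: terminator is stored little-endian. */
    block[payload_len]     = (uint8_t)(CMD_TERMINATOR & 0xFF);
    block[payload_len + 1] = (uint8_t)(CMD_TERMINATOR >> 8);
    return 0;
}
```

The important part is the length passed to the write call afterwards: always the full 512 bytes of the block, never the payload length.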
The Linux kernel driver was not wrong. It works perfectly — in the kernel. The kernel’s USB stack submits URBs to the host controller, which handles packetization, short packets, and endpoint details. It will not invent protocol padding for you, though; somewhere in the Linux driver path, the submitted command buffer length was already correct. I missed that because I focused on the command payload and not on the transfer length.
In userspace, with libusb or IOKit, that length is right in your face. If you pass 16 bytes to WritePipe, the device gets a 16-byte transfer. The details that looked like transport trivia suddenly become part of the protocol.
I also learned that “read the open-source driver and reimplement” is not always sufficient. The open-source driver operates in a different environment with different abstractions, and the protocol’s wire-level requirements may be invisible at the source code level. Sometimes you need to look at the actual bytes on the wire — even if that means reverse engineering the one driver that already works.
The DYLD interposition technique is now a permanent tool in my debugging repertoire. It works on the kinds of binaries I usually control — Homebrew-installed tools, local helpers, custom test harnesses — as long as dyld interposition is allowed. SIP-hardened system binaries are excluded. It is not a bus analyzer, and it will not see kernel clients or code paths that bypass the symbol you interpose. But for user-space libraries that talk to IOKit user clients, it can turn an opaque driver into a very useful byte stream. And unlike dtrace, it does not require disabling System Integrity Protection.
Sixty lines of C. That is what stood between “the device does not work” and “here is every byte the working driver sends.” Not bad for an afternoon.