Bytecode Format

The StarLang compiler turns .star source code into a self-contained image: a single byte array that carries everything the VM needs to run (string constants, event types, native requirements, and code). The VM (star_vm_load) parses it directly; nothing lives outside the blob. This is what lets a host-compiled image be flashed into an embedded target (e.g. ESP32) and run with no extra tables. See star-lang embed.

File Structure

All multi-byte integers are big-endian.

Offset  Size    Description
0x00    4       Magic: "STAR"
0x04    1       Version: 0x03
0x05    1       Flags (bit0: 1=release, 0=debug — informational)
0x06    2       Constant count (u16)
...     ...     Constants (see below)
...     2       Event count (u16)
...     ...     Event types (see below)
...     ...     Native requirements (NREQ section, see below)
...     4       Code size (u32)
...     M       Bytecode

Magic Number

Every image starts with STAR (0x53 0x54 0x41 0x52), followed by the version byte 0x03. Version 0x03 added the NREQ section; older 0x02 images are not loaded.

Flags

The flag byte records whether the image was compiled in debug or release mode:

Mode     bit0    console.log
debug    0       included
release  1       stripped

In debug mode, console.log calls are emitted into the bytecode. In release mode the compiler strips them entirely — zero runtime cost.

The flag is informational only. star_vm_load reads it but does not change runtime behavior based on it. In particular, it is unrelated to the VM's runtime-debugger flag (vm->debug_mode), which is set by the DAP debugger, not by the image.
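
As an illustration of the fixed prefix described above (magic, version, flags, constant count), here is a minimal sketch of a header check. DemoHeader, demo_parse_header, and the DEMO_* codes are hypothetical names for this example, not the VM's real API. Rejecting every version other than 0x03 is an assumption; the text only states that 0x02 images are not loaded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical result codes for this sketch (not the VM's error enum). */
enum { DEMO_OK = 0, DEMO_ERR_TRUNCATED, DEMO_ERR_BAD_MAGIC, DEMO_ERR_BAD_VERSION };

typedef struct {
    uint8_t  version;        /* byte at offset 0x04 */
    int      is_release;     /* bit0 of the flags byte at 0x05 (informational) */
    uint16_t constant_count; /* big-endian u16 at offset 0x06 */
} DemoHeader;

/* Read a big-endian u16 from two consecutive bytes. */
static uint16_t demo_read_u16(const uint8_t *p) {
    return (uint16_t)(((uint16_t)p[0] << 8) | p[1]);
}

/* Validate the first 8 bytes of an image and decode them into *out. */
static int demo_parse_header(const uint8_t *img, size_t len, DemoHeader *out) {
    assert(img != NULL && out != NULL);
    if (len < 8) return DEMO_ERR_TRUNCATED;
    if (memcmp(img, "STAR", 4) != 0) return DEMO_ERR_BAD_MAGIC;
    if (img[4] != 0x03) return DEMO_ERR_BAD_VERSION; /* assumption: only 0x03 accepted */
    out->version = img[4];
    out->is_release = (img[5] & 0x01) != 0;
    out->constant_count = demo_read_u16(img + 6);
    return DEMO_OK;
}
```

The decoded is_release value is only reported to the caller; consistent with the text, nothing in the sketch branches on it.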

Constant Pool

A u16 constant count, then that many string constants. Each entry:

Offset  Size    Description
0x00    2       String length (u16)
0x02    N       String data (UTF-8)

The VM points its constant pool directly into the image buffer (no copy), so an image embedded as static const in flash works unchanged. The OP_CONST opcode loads a value by u16 index.
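
The zero-copy layout can be sketched as a walk that records pointer-and-length views into the image instead of copying strings. DemoStr and demo_read_constants are hypothetical names, and the fixed-capacity output array is a simplification for the example; the real loader's bookkeeping may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A view into the image buffer: no copy, and not NUL-terminated. */
typedef struct {
    const uint8_t *data;
    uint16_t       len;
} DemoStr;

/* Read a big-endian u16 from two consecutive bytes. */
static uint16_t demo_read_u16(const uint8_t *p) {
    return (uint16_t)(((uint16_t)p[0] << 8) | p[1]);
}

/* Parse the u16 count at img[*pos] and that many length-prefixed strings.
 * On success, advances *pos past the pool and returns 0.
 * Returns -1 if the image is truncated or the pool exceeds cap entries. */
static int demo_read_constants(const uint8_t *img, size_t len, size_t *pos,
                               DemoStr *views, size_t cap, uint16_t *count_out) {
    assert(img != NULL && pos != NULL && views != NULL && count_out != NULL);
    size_t p = *pos;
    if (p > len || len - p < 2) return -1;
    uint16_t count = demo_read_u16(img + p);
    p += 2;
    if (count > cap) return -1;
    for (uint16_t i = 0; i < count; i++) {
        if (len - p < 2) return -1;          /* room for the length field */
        uint16_t n = demo_read_u16(img + p);
        p += 2;
        if (len - p < n) return -1;          /* room for the string bytes */
        views[i].data = img + p;             /* points into the image itself */
        views[i].len = n;
        p += n;
    }
    *pos = p;
    *count_out = count;
    return 0;
}
```

Because each view aliases the image, the buffer must outlive the views, which holds trivially when the image is a static const array in flash.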

Event Types

A u16 event count, then that many event-type entries. The event id is the implicit array index (0, 1, 2, …). Each entry:

Offset  Size    Description
0x00    4       Signature hash (u32)
0x04    1       Name length (u8)
0x05    N       Name (UTF-8, not null-terminated)

star_vm_load registers each event type as it parses, so callers no longer reconstruct event tables separately.
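
A minimal sketch of walking this section follows, where the array index doubles as the event id. DemoEvent and demo_read_events are hypothetical names; how star_vm_load actually stores registered event types is not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t       sig_hash; /* signature hash, big-endian u32 in the image */
    const uint8_t *name;     /* points into the image, not NUL-terminated */
    uint8_t        name_len;
} DemoEvent;

static uint16_t demo_read_u16(const uint8_t *p) {
    return (uint16_t)(((uint16_t)p[0] << 8) | p[1]);
}

static uint32_t demo_read_u32(const uint8_t *p) {
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Parse the u16 event count at img[*pos] and that many entries.
 * events[i] describes event id i. Returns 0 on success, -1 on
 * truncation or if the count exceeds cap. */
static int demo_read_events(const uint8_t *img, size_t len, size_t *pos,
                            DemoEvent *events, size_t cap, uint16_t *count_out) {
    assert(img != NULL && pos != NULL && events != NULL && count_out != NULL);
    size_t p = *pos;
    if (p > len || len - p < 2) return -1;
    uint16_t count = demo_read_u16(img + p);
    p += 2;
    if (count > cap) return -1;
    for (uint16_t i = 0; i < count; i++) {
        if (len - p < 5) return -1;       /* 4-byte hash + 1-byte name length */
        events[i].sig_hash = demo_read_u32(img + p);
        uint8_t n = img[p + 4];
        p += 5;
        if (len - p < n) return -1;       /* room for the name bytes */
        events[i].name = img + p;
        events[i].name_len = n;
        p += n;
    }
    *pos = p;
    *count_out = count;
    return 0;
}
```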

Native Requirements (NREQ)

The NREQ section lists every native function the image needs, by content-addressed contract hash rather than by registration order. This is what lets a host-compiled image resolve against a separately built device firmware: the two binaries never have to agree on the order natives are registered, only on the frozen hash recipe.

Offset  Size    Description
0x00    4       Magic: "NREQ" (0x4E 0x52 0x45 0x51)
0x04    2       NREQ version (u16) — currently 1
0x06    2       Requirement count (u16)
...     ...     Requirements (count entries, 12 bytes each)

Each requirement is 12 bytes:

Offset  Size    Description
0x00    8       Contract hash (u64)
0x08    2       ABI generation (u16) — diagnostic only
0x0A    2       Flags (u16) — must be 0 in v1
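
The two tables above can be read with a sketch like the following. DemoReq, demo_read_nreq, and the DEMO_NREQ_* codes are hypothetical names for illustration. Rejecting any NREQ version other than 1 is an assumption; the table only says the current version is 1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical result codes for this sketch. */
enum { DEMO_NREQ_OK = 0, DEMO_NREQ_TRUNCATED, DEMO_NREQ_BAD_MAGIC,
       DEMO_NREQ_BAD_VERSION, DEMO_NREQ_BAD_FLAGS, DEMO_NREQ_TOO_MANY };

typedef struct {
    uint64_t contract_hash;  /* what the loader matches on */
    uint16_t abi_generation; /* diagnostic only */
} DemoReq;

static uint16_t demo_read_u16(const uint8_t *p) {
    return (uint16_t)(((uint16_t)p[0] << 8) | p[1]);
}

static uint64_t demo_read_u64(const uint8_t *p) {
    uint64_t v = 0;
    for (int i = 0; i < 8; i++) v = (v << 8) | p[i]; /* big-endian */
    return v;
}

/* Parse the NREQ section starting at img[*pos]; reqs[i] is native_req_id i. */
static int demo_read_nreq(const uint8_t *img, size_t len, size_t *pos,
                          DemoReq *reqs, size_t cap, uint16_t *count_out) {
    assert(img != NULL && pos != NULL && reqs != NULL && count_out != NULL);
    size_t p = *pos;
    if (p > len || len - p < 8) return DEMO_NREQ_TRUNCATED;
    if (memcmp(img + p, "NREQ", 4) != 0) return DEMO_NREQ_BAD_MAGIC;
    if (demo_read_u16(img + p + 4) != 1) return DEMO_NREQ_BAD_VERSION; /* assumption */
    uint16_t count = demo_read_u16(img + p + 6);
    p += 8;
    if (count > cap) return DEMO_NREQ_TOO_MANY;
    if ((len - p) / 12 < count) return DEMO_NREQ_TRUNCATED;
    for (uint16_t i = 0; i < count; i++, p += 12) {
        if (demo_read_u16(img + p + 10) != 0) return DEMO_NREQ_BAD_FLAGS; /* must be 0 in v1 */
        reqs[i].contract_hash = demo_read_u64(img + p);
        reqs[i].abi_generation = demo_read_u16(img + p + 8);
    }
    *pos = p;
    *count_out = count;
    return DEMO_NREQ_OK;
}
```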

The contract hash is 64-bit FNV-1a over the canonical byte stream module \0 symbol \0 param_count param_codes[] return_code abi_lo abi_hi, where the type codes are the frozen StarAbiType values (independent of the compiler's internal StarType enum, so that enum can be reordered freely). The ABI generation is folded into the hash, so a native that keeps its signature but changes behavior can deliberately break the contract by bumping its ABI generation. The standalone ABI field in the entry is therefore diagnostic only: the loader matches on the hash alone.
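
To make the recipe concrete, here is a sketch of the hash using the standard 64-bit FNV-1a constants. Several details are assumptions, since the text does not give field widths: param_count, each type code, and return_code are taken to be one byte each, and abi_lo/abi_hi are taken to be the low and high bytes of the 16-bit ABI generation. The function names and any type-code values passed in are hypothetical; dev-vm/star_native_abi.h remains the authority.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_FNV64_OFFSET 0xcbf29ce484222325ULL /* standard FNV-1a 64 offset basis */
#define DEMO_FNV64_PRIME  0x100000001b3ULL      /* standard FNV 64-bit prime */

/* Fold n bytes into a running FNV-1a state: XOR the byte, then multiply. */
static uint64_t demo_fnv1a64(uint64_t h, const uint8_t *p, size_t n) {
    for (size_t i = 0; i < n; i++) {
        h ^= p[i];
        h *= DEMO_FNV64_PRIME;
    }
    return h;
}

/* Hash: module \0 symbol \0 param_count param_codes[] return_code abi_lo abi_hi.
 * Assumption: one byte per count/code field. */
static uint64_t demo_contract_hash(const char *module, const char *symbol,
                                   const uint8_t *param_codes, uint8_t param_count,
                                   uint8_t return_code, uint16_t abi_gen) {
    assert(module != NULL && symbol != NULL);
    uint64_t h = DEMO_FNV64_OFFSET;
    /* strlen + 1 also feeds the terminating NUL, which is the \0 separator. */
    h = demo_fnv1a64(h, (const uint8_t *)module, strlen(module) + 1);
    h = demo_fnv1a64(h, (const uint8_t *)symbol, strlen(symbol) + 1);
    h = demo_fnv1a64(h, &param_count, 1);
    if (param_count > 0) h = demo_fnv1a64(h, param_codes, param_count);
    h = demo_fnv1a64(h, &return_code, 1);
    const uint8_t abi[2] = { (uint8_t)(abi_gen & 0xFF), (uint8_t)(abi_gen >> 8) };
    return demo_fnv1a64(h, abi, 2);
}
```

Because the ABI generation bytes are hashed last, two natives that agree on module, symbol, and signature but differ in generation produce different hashes, which is the contract-breaking mechanism the paragraph above describes.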

OP_NATIVE_CALL's operand is an image-local native_req_id (an index into this section), not a runtime slot. At load, star_vm_load resolves each requirement against the natives the runtime registered (each of which carries the same contract hash) and builds a native_req_id → slot map. If any requirement is unresolved, the load fails with VM_ERR_MISSING_NATIVE and lists the offending hashes in error_detail — all missing natives are collected, not just the first. A non-zero flags field fails the load with VM_ERR_BAD_IMAGE_FORMAT.
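
The resolution step can be sketched as below. demo_resolve is a hypothetical name, and the linear scan over registered natives is chosen for brevity; the real loader's lookup strategy and the formatting of error_detail are not specified here. The point illustrated is that every unresolved hash is collected before the load is failed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Build the native_req_id -> slot map.
 *   req_hashes[r]    contract hash of image requirement r
 *   native_hashes[s] contract hash of the native registered in runtime slot s
 *   slot_map[r]      receives the matching slot, or -1 if unresolved
 *   missing[]        receives every unresolved hash (capacity >= req_count)
 * Returns the number of unresolved requirements; a loader would fail the
 * load if this is non-zero. */
static size_t demo_resolve(const uint64_t *req_hashes, size_t req_count,
                           const uint64_t *native_hashes, size_t native_count,
                           int32_t *slot_map, uint64_t *missing) {
    assert(slot_map != NULL && missing != NULL);
    size_t n_missing = 0;
    for (size_t r = 0; r < req_count; r++) {
        slot_map[r] = -1;
        for (size_t s = 0; s < native_count; s++) {
            if (native_hashes[s] == req_hashes[r]) {
                slot_map[r] = (int32_t)s;
                break;
            }
        }
        /* Keep going after a miss so all offending hashes get reported. */
        if (slot_map[r] < 0) missing[n_missing++] = req_hashes[r];
    }
    return n_missing;
}
```

With such a map in place, a native call with operand r dispatches to slot_map[r], so the image never depends on the order in which the firmware registered its natives.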

This is why a device firmware can register only a subset of natives (e.g. the compute libs math/time/json/queue/crypto, omitting host I/O like http/net): an image that imports an unregistered native simply fails at load with VM_ERR_MISSING_NATIVE. The hash recipe is part of the image ABI, not the compiler implementation; see dev-vm/star_native_abi.h.

star-lang deploy turns this into an up-front check: the device advertises the native modules it registered (a <<SLD CAPS …>> line, re-emitted on a QUERY-CAPS frame), and deploy refuses to send an image that needs a module the firmware lacks, naming the missing module rather than letting the load fail with VM_ERR_MISSING_NATIVE. star-lang flash writes a firmware that registers the full device-capable set, so the check passes for any image whose requirements are a subset of that set.

Debug Info

Source-mapping data (filenames, function addresses, offset→line tables, variable names) is not stored in the image — not even in debug builds. A release image therefore carries no debug information at all. The DAP debugger reconstructs this mapping host-side from the compiler at launch time; it is never shipped to the target.