Instructions

Now that you understand sBPF's registers and memory regions, let's examine the instructions that manipulate them.

Instructions are the fundamental operations your program performs—adding numbers, loading from memory, or jumping to different locations.

What are Instructions?

Instructions are your program's basic building blocks. Think of them as commands that tell the processor exactly what to do:

add64 r1, r2: "Add the values in registers r1 and r2, store result in r1"
ldxdw r0, [r10 - 8]: "Load 8 bytes from stack memory into register r0"
jeq r1, 42, +3: "If r1 equals 42, jump forward 3 instructions"

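Each instruction performs exactly one operation and occupies a fixed 8-byte slot, so the VM can decode it without first working out its length. The lddw instruction in the reference table is the one exception: it spans two consecutive slots to carry a 64-bit immediate.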

sBPF instructions work with different data sizes:

byte = 8 bits (1 byte)
halfword = 16 bits (2 bytes)
word = 32 bits (4 bytes)
doubleword = 64 bits (8 bytes)

Most sBPF operations use 64-bit values (doublewords) because registers are 64 bits wide, but you can load and store smaller sizes when the data calls for it.
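To make the four sizes concrete, here is a minimal Python sketch of how a load of each width could fill a 64-bit register. It assumes little-endian byte order and zero-extension of narrow loads, which is the standard eBPF behavior; the `load` helper and the flat `memory` buffer are hypothetical stand-ins, not part of any real VM.

```python
# Widths in bytes for the four load mnemonics used in this lesson.
LOAD_WIDTHS = {"ldxb": 1, "ldxh": 2, "ldxw": 4, "ldxdw": 8}

def load(memory: bytes, address: int, mnemonic: str) -> int:
    """Read a little-endian value and zero-extend it to 64 bits (assumed behavior)."""
    width = LOAD_WIDTHS[mnemonic]
    chunk = memory[address:address + width]
    if len(chunk) != width:
        raise ValueError("read past the end of memory")
    return int.from_bytes(chunk, "little")

memory = bytes([0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88])
for name in LOAD_WIDTHS:
    # Each load starts at the same address but consumes a different number of bytes.
    print(f"{name:<5} -> {load(memory, 0, name):#018x}")
```

The same eight bytes yield four different register values, which is why the mnemonic has to carry the access size.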

Instruction Categories and Format

When you compile Rust, C, or assembly code, the toolchain emits a stream of fixed-width, 8-byte instructions packed into your ELF's .text section.

Each instruction follows a consistent structure that the VM can decode in a single pass:

text

   1 byte    4 bits   4 bits     2 bytes         4 bytes
┌──────────┬────────┬────────┬──────────────┬──────────────────┐
│  opcode  │  dst   │  src   │   offset     │      imm         │
└──────────┴────────┴────────┴──────────────┴──────────────────┘

opcode: Defines the operation. In the classic encoding, the low 3 bits select the instruction class (load, store, 32-bit arithmetic, 64-bit arithmetic, jump; call and exit belong to the jump class), while the upper 5 bits select the exact variant within that class (add vs. multiply, register vs. immediate source, access size).
dst: The destination register number (r0–r10). It receives arithmetic results and loaded values. For stores it holds the base address, and for conditional jumps it is the first value compared.
src: The source register providing input. For two-operand arithmetic (add r1, r2), it supplies the second value. For loads, it provides the base address. For immediate variants (add r1, 10), the src field is unused, and the opcode itself signals that the second operand comes from imm.
offset: A signed 16-bit integer that modifies instruction behavior. For loads, it's added to the source register to reach [src + offset]; for stores, it's added to the destination register to reach [dst + offset]. For jumps, it's a relative branch target measured in instructions, counted from the instruction after the jump.
imm: The 32-bit immediate value field. Arithmetic operations use it for constants (add r1, 42), CALL uses it for syscall numbers (sol_log = 16), and memory operations may treat it as an absolute pointer.
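The field layout can be sketched as a small decoder. This is an illustrative Python version, not the VM's actual code: the `Instruction` type and `decode` function are hypothetical names, and it assumes the conventional little-endian eBPF packing in which dst occupies the low nibble of the register byte and src the high nibble.

```python
import struct
from typing import NamedTuple

class Instruction(NamedTuple):
    opcode: int
    dst: int
    src: int
    offset: int
    imm: int

def decode(slot: bytes) -> Instruction:
    """Split one 8-byte slot into its five fields."""
    if len(slot) != 8:
        raise ValueError("an instruction slot is exactly 8 bytes")
    # "<" = little-endian, B = u8 opcode, B = u8 register byte,
    # h = signed 16-bit offset, i = signed 32-bit immediate.
    opcode, regs, offset, imm = struct.unpack("<BBhi", slot)
    return Instruction(opcode, regs & 0x0F, regs >> 4, offset, imm)

# ldxdw r0, [r10 - 8], using the classic eBPF opcode byte 0x79 (an assumption).
print(decode(bytes.fromhex("79a0f8ff00000000")))
```

Note how the offset bytes `f8 ff` decode to -8: the field is signed, which is what lets stack accesses reach below r10.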

Instruction Categories

Different instruction types use these fields in specific ways:

Data Movement: Move values between registers and memory:

sbpf

mov64 r1, 42           // Put immediate value 42 into r1
                       // opcode=move_imm, dst=1, src=unused, imm=42

ldxdw r0, [r10 - 8]    // Load 8 bytes from stack into r0  
                       // opcode=load64, dst=0, src=10, offset=-8, imm=unused

stxdw [r1 + 16], r0    // Store r0 to memory at [r1 + 16]
                       // opcode=store64, dst=1, src=0, offset=16, imm=unused

Arithmetic: Perform mathematical operations:

sbpf

add64 r1, r2           // r1 = r1 + r2
                       // opcode=add_reg, dst=1, src=2, offset=unused, imm=unused

add64 r1, 100          // r1 = r1 + 100  
                       // opcode=add_imm, dst=1, src=unused, offset=unused, imm=100

Control Flow: Change execution sequence:

sbpf

ja +5                  // Jump forward 5 instructions unconditionally
                       // opcode=jump, dst=unused, src=unused, offset=5, imm=unused

jeq r1, r2, +3         // If r1 == r2, jump forward 3 instructions
                       // opcode=jump_eq_reg, dst=1, src=2, offset=3, imm=unused

jeq r1, 42, +3         // If r1 == 42, jump forward 3 instructions  
                       // opcode=jump_eq_imm, dst=1, src=unused, offset=3, imm=42
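As a rough cross-check of the field assignments in the comments above, the following Python sketch packs each example into its 8 bytes. The `encode` helper is hypothetical, and the opcode bytes are the classic eBPF values, which is an assumption: newer sBPF versions renumber some opcodes, so treat the hex as illustrative.

```python
import struct

def encode(opcode: int, dst: int = 0, src: int = 0, offset: int = 0, imm: int = 0) -> bytes:
    """Pack one 8-byte slot; dst sits in the low nibble, src in the high nibble."""
    return struct.pack("<BBhi", opcode, (src << 4) | dst, offset, imm)

# Opcode bytes are classic eBPF values (assumed, see the note above).
examples = {
    "mov64 r1, 42":        encode(0xB7, dst=1, imm=42),
    "ldxdw r0, [r10 - 8]": encode(0x79, dst=0, src=10, offset=-8),
    "stxdw [r1 + 16], r0": encode(0x7B, dst=1, src=0, offset=16),
    "add64 r1, r2":        encode(0x0F, dst=1, src=2),
    "add64 r1, 100":       encode(0x07, dst=1, imm=100),
    "ja +5":               encode(0x05, offset=5),
    "jeq r1, r2, +3":      encode(0x1D, dst=1, src=2, offset=3),
    "jeq r1, 42, +3":      encode(0x15, dst=1, imm=42, offset=3),
}
for text, raw in examples.items():
    print(f"{text:<22} {raw.hex(' ')}")
```

Unused fields are simply left at zero, which is why every instruction still fills the full 8 bytes.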

Opcode Encoding

The opcode encoding captures multiple pieces of information beyond just the operation type:

Instruction class: Arithmetic, memory, jump, call, etc.
Operation size: 32-bit vs 64-bit operations
Source type: Register vs immediate value
Specific operation: Add vs subtract, load vs store, etc.

This creates distinct opcodes for instruction variants. For example, add64 r1, r2 (register source) uses a different opcode than add64 r1, 42 (immediate source). Similarly, add64 and add32 have different opcodes for different operation sizes.
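One way to see how these pieces share a single byte is to split an arithmetic or jump opcode into its parts. The sketch below assumes the classic eBPF bit layout (class in the low 3 bits, source flag in bit 3, operation in the top 4 bits); `split_opcode` and the lookup table are hypothetical names, and load/store opcodes use the upper bits differently, so they are not covered here.

```python
# Class values from the classic eBPF encoding (assumed).
CLASS_NAMES = {0x04: "ALU32", 0x05: "JMP", 0x07: "ALU64"}

def split_opcode(opcode: int) -> dict:
    """Break an arithmetic or jump opcode byte into class, source type, and operation."""
    return {
        "class": CLASS_NAMES.get(opcode & 0x07, "other"),
        "source": "register" if opcode & 0x08 else "immediate",
        "operation": opcode >> 4,
    }

for op in (0x07, 0x0F, 0x04, 0x1D):
    print(f"{op:#04x} -> {split_opcode(op)}")
```

Here 0x07 and 0x0F differ only in the source bit, and 0x07 and 0x04 differ only in the class bits, which matches the add64 immediate, add64 register, and add32 immediate variants described above.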

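Division and modulo further distinguish between signed and unsigned variants. Unsigned division (udiv64, listed as div64 in the reference table below) treats operands as values from 0 to 2^64 − 1 (about 18.4 quintillion), while sdiv64 treats them as two's-complement values from −2^63 to 2^63 − 1 (roughly ±9.2 quintillion).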
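The difference is easiest to see when the same bit pattern is divided both ways. This Python sketch models 64-bit registers as masked integers; the helper names are hypothetical, and it assumes signed division truncates toward zero, as it does in C and Rust. Division by zero is not modeled.

```python
MASK64 = (1 << 64) - 1

def to_signed(value: int) -> int:
    """Reinterpret a 64-bit pattern as a two's-complement signed integer."""
    value &= MASK64
    return value - (1 << 64) if value >> 63 else value

def unsigned_div(dst: int, src: int) -> int:
    return (dst & MASK64) // (src & MASK64)

def signed_div(dst: int, src: int) -> int:
    a, b = to_signed(dst), to_signed(src)
    quotient = abs(a) // abs(b)          # divide magnitudes, then restore the sign
    if (a < 0) != (b < 0):
        quotient = -quotient
    return quotient & MASK64             # back to a 64-bit register pattern

minus_ten = -10 & MASK64                 # the bit pattern 0xFFFFFFFFFFFFFFF6
print(f"unsigned: {unsigned_div(minus_ten, 2):#018x}")
print(f"signed:   {signed_div(minus_ten, 2):#018x} ({to_signed(signed_div(minus_ten, 2))})")
```

The unsigned result is a huge positive number, while the signed result is the bit pattern for -5, even though both started from identical register contents.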

Instruction Execution

The opcode determines how the VM interprets the remaining fields.

When the VM encounters add64 r1, r2, it reads the opcode and recognizes this as a 64-bit arithmetic operation using two registers:

The dst field indicates the result goes into r1, the src field specifies r2 as the second operand, and the offset and immediate fields are ignored.

For add64 r1, 42, the opcode changes to indicate an immediate operation. Now dst still points to r1, but src becomes meaningless, and the immediate field provides the second operand (42).

Memory operations combine multiple fields meaningfully:

For ldxdw r1, [r2+8], the opcode indicates a 64-bit memory load, dst receives the loaded value, src provides the base address, and offset (8) is added to create the final address r2 + 8.

Control flow instructions follow the same pattern:

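When you write jeq r1, r2, +5, the opcode encodes a conditional jump comparing two registers. If r1 equals r2, the VM adds the offset (5) to the program counter, which has already advanced past the jump, so execution skips the next 5 instructions. Otherwise it continues with the next instruction.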

Across all of these cases the 8-byte layout never changes; only the opcode decides which fields are read. That removes the need for complex addressing modes or special-case encodings.
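The dispatch described in this section can be sketched as a toy interpreter loop. This is a deliberately tiny Python model, not the real VM: it handles only six opcodes (classic eBPF opcode bytes, assumed), uses a flat `bytearray` in place of mapped memory regions, and does no verification. The `ins` and `run` names are hypothetical.

```python
import struct

MASK64 = (1 << 64) - 1

def ins(opcode, dst=0, src=0, offset=0, imm=0):
    """Pack one 8-byte slot (dst in the low nibble, src in the high nibble)."""
    return struct.pack("<BBhi", opcode, (src << 4) | dst, offset, imm)

def run(program: bytes, memory: bytearray) -> list:
    """Execute a toy instruction subset and return registers r0..r10."""
    regs = [0] * 11
    pc = 0
    while pc < len(program) // 8:
        opcode, regbyte, offset, imm = struct.unpack_from("<BBhi", program, pc * 8)
        dst, src = regbyte & 0x0F, regbyte >> 4
        pc += 1                                   # jump offsets count from the next slot
        if opcode == 0xB7:                        # mov64 dst, imm
            regs[dst] = imm & MASK64
        elif opcode == 0x07:                      # add64 dst, imm (src ignored)
            regs[dst] = (regs[dst] + imm) & MASK64
        elif opcode == 0x0F:                      # add64 dst, src (imm ignored)
            regs[dst] = (regs[dst] + regs[src]) & MASK64
        elif opcode == 0x79:                      # ldxdw dst, [src + off]
            addr = regs[src] + offset
            regs[dst] = int.from_bytes(memory[addr:addr + 8], "little")
        elif opcode == 0x1D:                      # jeq dst, src, off
            if regs[dst] == regs[src]:
                pc += offset
        elif opcode == 0x95:                      # exit
            break
        else:
            raise ValueError(f"opcode {opcode:#04x} not modeled in this sketch")
    return regs

program = b"".join([
    ins(0xB7, dst=1, imm=40),           # mov64 r1, 40
    ins(0x07, dst=1, imm=2),            # add64 r1, 2
    ins(0xB7, dst=2, imm=42),           # mov64 r2, 42
    ins(0x1D, dst=1, src=2, offset=1),  # jeq r1, r2, +1 (skips the next slot)
    ins(0xB7, dst=0, imm=1),            # mov64 r0, 1 (skipped)
    ins(0x79, dst=3, src=4, offset=8),  # ldxdw r3, [r4 + 8]
    ins(0x0F, dst=0, src=3),            # add64 r0, r3
    ins(0x95),                          # exit
])
memory = bytearray(16)
memory[8:16] = (100).to_bytes(8, "little")
regs = run(program, memory)
print(regs[:4])
```

Every branch of the `if` chain reads the same five decoded fields and simply ignores the ones its opcode does not use.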

Function Calls and Syscalls

sBPF's call mechanism evolved across versions for better clarity and security. Until sBPF v3, call imm served dual purposes: the immediate value determined whether you were calling an internal function or invoking a syscall.

The runtime distinguished between these based on the immediate value range, with syscall numbers typically being small positive integers like 16 for sol_log.

From sBPF v3 onwards, the instructions separated for explicit behavior. call off now handles internal function calls using relative offsets, while syscall imm explicitly invokes runtime functions. This separation makes bytecode intentions clear and enables better verification.

Indirect calls through callx also evolved. Earlier versions encoded the target register in the immediate field, but from v2 onwards, it's encoded in the source register field for consistency with the general instruction format.
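The callx change described above amounts to reading the target register number from a different field depending on the version. A minimal Python sketch of that rule follows; it restates only what this section says, and representing versions as plain integers along with the `callx_target_register` name are assumptions made for illustration.

```python
def callx_target_register(version: int, src: int, imm: int) -> int:
    """Pick the field that names the register holding the call target.

    Follows the rule stated in this section: imm before v2, src from v2 onwards.
    """
    return src if version >= 2 else imm

# The same logical instruction, "callx r3", in both encodings:
print(callx_target_register(version=1, src=0, imm=3))
print(callx_target_register(version=2, src=3, imm=0))
```

Both calls name r3 as the register to jump through; only the field that carries the number differs.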

Opcodes Reference Table

Memory Load Operations

opcode	Mnemonic	Description
lddw	`lddw dst, imm`	Load 64-bit immediate (first slot)
lddw	`lddw dst, imm`	Load 64-bit immediate (second slot)
ldxw	`ldxw dst, [src + off]`	Load word from memory
ldxh	`ldxh dst, [src + off]`	Load halfword from memory
ldxb	`ldxb dst, [src + off]`	Load byte from memory
ldxdw	`ldxdw dst, [src + off]`	Load doubleword from memory
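The two lddw rows reflect that a 64-bit immediate cannot fit in one 32-bit imm field. A hedged Python sketch of the classic eBPF layout follows: the first slot carries opcode 0x18 and the low 32 bits, the second slot carries a zero opcode and the high 32 bits. The function names are hypothetical, and newer sBPF versions may handle 64-bit constants differently.

```python
import struct

def encode_lddw(dst: int, value: int) -> bytes:
    """Pack a 64-bit immediate load into two consecutive 8-byte slots."""
    value &= (1 << 64) - 1
    low, high = value & 0xFFFFFFFF, value >> 32
    first = struct.pack("<BBhI", 0x18, dst, 0, low)    # opcode, dst, offset, imm (low half)
    second = struct.pack("<BBhI", 0x00, 0, 0, high)    # placeholder slot, imm (high half)
    return first + second

def decode_lddw(raw: bytes) -> tuple:
    """Return (dst, value) from a 16-byte lddw encoding."""
    opcode, regs, _, low = struct.unpack_from("<BBhI", raw, 0)
    _, _, _, high = struct.unpack_from("<BBhI", raw, 8)
    if opcode != 0x18:
        raise ValueError("not an lddw instruction")
    return regs & 0x0F, (high << 32) | low

raw = encode_lddw(1, 0x1122334455667788)
print(raw.hex(" "))
print(decode_lddw(raw))
```

Because lddw consumes two slots, anything that counts instructions, such as a jump offset, has to count it as two.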

Memory Store Operations

opcode	Mnemonic	Description
stw	`stw [dst + off], imm`	Store word immediate
sth	`sth [dst + off], imm`	Store halfword immediate
stb	`stb [dst + off], imm`	Store byte immediate
stdw	`stdw [dst + off], imm`	Store doubleword immediate
stxw	`stxw [dst + off], src`	Store word from register
stxh	`stxh [dst + off], src`	Store halfword from register
stxb	`stxb [dst + off], src`	Store byte from register
stxdw	`stxdw [dst + off], src`	Store doubleword from register

Arithmetic Operations (64-bit)

opcode	Mnemonic	Description
add64	`add64 dst, imm`	Add immediate
add64	`add64 dst, src`	Add register
sub64	`sub64 dst, imm`	Subtract immediate
sub64	`sub64 dst, src`	Subtract register
mul64	`mul64 dst, imm`	Multiply immediate
mul64	`mul64 dst, src`	Multiply register
div64	`div64 dst, imm`	Divide immediate (unsigned)
div64	`div64 dst, src`	Divide register (unsigned)
sdiv64	`sdiv64 dst, imm`	Divide immediate (signed)
sdiv64	`sdiv64 dst, src`	Divide register (signed)
mod64	`mod64 dst, imm`	Modulo immediate (unsigned)
mod64	`mod64 dst, src`	Modulo register (unsigned)
smod64	`smod64 dst, imm`	Modulo immediate (signed)
smod64	`smod64 dst, src`	Modulo register (signed)
neg64	`neg64 dst`	Negate

Arithmetic Operations (32-bit)

opcode	Mnemonic	Description
add32	`add32 dst, imm`	Add immediate (32-bit)
add32	`add32 dst, src`	Add register (32-bit)
sub32	`sub32 dst, imm`	Subtract immediate (32-bit)
sub32	`sub32 dst, src`	Subtract register (32-bit)
mul32	`mul32 dst, imm`	Multiply immediate (32-bit)
mul32	`mul32 dst, src`	Multiply register (32-bit)
div32	`div32 dst, imm`	Divide immediate (unsigned 32-bit)
div32	`div32 dst, src`	Divide register (unsigned 32-bit)
sdiv32	`sdiv32 dst, imm`	Divide immediate (signed 32-bit)
sdiv32	`sdiv32 dst, src`	Divide register (signed 32-bit)
mod32	`mod32 dst, imm`	Modulo immediate (unsigned 32-bit)
mod32	`mod32 dst, src`	Modulo register (unsigned 32-bit)
smod32	`smod32 dst, imm`	Modulo immediate (signed 32-bit)
smod32	`smod32 dst, src`	Modulo register (signed 32-bit)

Logical Operations (64-bit)

opcode	Mnemonic	Description
or64	`or64 dst, imm`	Bitwise OR immediate
or64	`or64 dst, src`	Bitwise OR register
and64	`and64 dst, imm`	Bitwise AND immediate
and64	`and64 dst, src`	Bitwise AND register
lsh64	`lsh64 dst, imm`	Left shift immediate
lsh64	`lsh64 dst, src`	Left shift register
rsh64	`rsh64 dst, imm`	Right shift immediate
rsh64	`rsh64 dst, src`	Right shift register
xor64	`xor64 dst, imm`	Bitwise XOR immediate
xor64	`xor64 dst, src`	Bitwise XOR register
mov64	`mov64 dst, imm`	Move immediate
mov64	`mov64 dst, src`	Move register
arsh64	`arsh64 dst, imm`	Arithmetic right shift imm
arsh64	`arsh64 dst, src`	Arithmetic right shift reg
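The two right shifts in this table differ only in what fills the vacated high bits. The Python sketch below models that on 64-bit patterns; the function names mirror the mnemonics but are illustrative, and masking the shift amount to 6 bits is an assumption borrowed from common eBPF implementations.

```python
MASK64 = (1 << 64) - 1

def rsh64(value: int, shift: int) -> int:
    """Logical right shift: vacated high bits are filled with zeros."""
    return (value & MASK64) >> (shift & 63)

def arsh64(value: int, shift: int) -> int:
    """Arithmetic right shift: vacated high bits copy the sign bit."""
    value &= MASK64
    signed = value - (1 << 64) if value >> 63 else value
    return (signed >> (shift & 63)) & MASK64

top_bit = 0x8000000000000000
print(f"rsh64:  {rsh64(top_bit, 4):#018x}")
print(f"arsh64: {arsh64(top_bit, 4):#018x}")
```

Use rsh64 for unsigned quantities and bit fields, and arsh64 when the register holds a signed value that should stay negative after the shift.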

Logical Operations (32-bit)

opcode	Mnemonic	Description
or32	`or32 dst, imm`	Bitwise OR immediate (32-bit)
or32	`or32 dst, src`	Bitwise OR register (32-bit)
and32	`and32 dst, imm`	Bitwise AND immediate (32-bit)
and32	`and32 dst, src`	Bitwise AND register (32-bit)
lsh32	`lsh32 dst, imm`	Left shift immediate (32-bit)
lsh32	`lsh32 dst, src`	Left shift register (32-bit)
rsh32	`rsh32 dst, imm`	Right shift immediate (32-bit)
rsh32	`rsh32 dst, src`	Right shift register (32-bit)
xor32	`xor32 dst, imm`	Bitwise XOR immediate (32-bit)
xor32	`xor32 dst, src`	Bitwise XOR register (32-bit)
mov32	`mov32 dst, imm`	Move immediate (32-bit)
mov32	`mov32 dst, src`	Move register (32-bit)
arsh32	`arsh32 dst, imm`	Arith right shift imm (32-bit)
arsh32	`arsh32 dst, src`	Arith right shift reg (32-bit)

Control Flow Operations

opcode	Mnemonic	Description
ja	`ja off`	Unconditional jump (jump 0 = jump to next)
jeq	`jeq dst, imm, off`	Jump if equal to immediate
jeq	`jeq dst, src, off`	Jump if equal to register
jgt	`jgt dst, imm, off`	Jump if greater than immediate (unsigned)
jgt	`jgt dst, src, off`	Jump if greater than register (unsigned)
jge	`jge dst, imm, off`	Jump if greater or equal immediate (unsigned)
jge	`jge dst, src, off`	Jump if greater or equal register (unsigned)
jset	`jset dst, imm, off`	Jump if dst & imm is non-zero (immediate mask)
jset	`jset dst, src, off`	Jump if dst & src is non-zero (register mask)
jne	`jne dst, imm, off`	Jump if not equal to immediate
jne	`jne dst, src, off`	Jump if not equal to register
jsgt	`jsgt dst, imm, off`	Jump if greater than immediate (signed)
jsgt	`jsgt dst, src, off`	Jump if greater than register (signed)
jsge	`jsge dst, imm, off`	Jump if greater or equal immediate (signed)
jsge	`jsge dst, src, off`	Jump if greater or equal register (signed)
jlt	`jlt dst, imm, off`	Jump if less than immediate (unsigned)
jlt	`jlt dst, src, off`	Jump if less than register (unsigned)
jle	`jle dst, imm, off`	Jump if less or equal immediate (unsigned)
jle	`jle dst, src, off`	Jump if less or equal register (unsigned)
jslt	`jslt dst, imm, off`	Jump if less than immediate (signed)
jslt	`jslt dst, src, off`	Jump if less than register (signed)
jsle	`jsle dst, imm, off`	Jump if less or equal immediate (signed)
jsle	`jsle dst, src, off`	Jump if less or equal register (signed)
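The signed and unsigned jump variants can disagree on the same register contents. The Python sketch below evaluates only the conditions, not the jump itself; the function names mirror the mnemonics but are illustrative helpers operating on 64-bit patterns.

```python
MASK64 = (1 << 64) - 1

def to_signed(value: int) -> int:
    """Reinterpret a 64-bit pattern as a two's-complement signed integer."""
    value &= MASK64
    return value - (1 << 64) if value >> 63 else value

def jgt(dst: int, src: int) -> bool:
    """Unsigned greater-than, as used by jgt."""
    return (dst & MASK64) > (src & MASK64)

def jsgt(dst: int, src: int) -> bool:
    """Signed greater-than, as used by jsgt."""
    return to_signed(dst) > to_signed(src)

def jset(dst: int, mask: int) -> bool:
    """True when dst and mask share at least one set bit."""
    return (dst & mask & MASK64) != 0

r1 = -1 & MASK64  # all 64 bits set
print("jgt  r1, 1 taken:", jgt(r1, 1))
print("jsgt r1, 1 taken:", jsgt(r1, 1))
print("jset 0b1010, 0b0010 taken:", jset(0b1010, 0b0010))
```

With r1 holding all ones, the unsigned comparison sees the largest possible value while the signed comparison sees -1, so picking the wrong variant silently inverts a bounds check.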

Function Call Operations

opcode	Mnemonic	Description
call	`call imm` or `syscall imm`	Call function or syscall
callx	`callx imm`	Indirect call (target register in imm field before v2, in src field from v2)
exit	`exit` or `return`	Return from function

Byte Swap Operations

opcode	Mnemonic	Description
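be16	`be16 dst`	Convert low 16 bits to big-endian (byte swap), clear upper bits
be32	`be32 dst`	Convert low 32 bits to big-endian (byte swap), clear upper bits
be64	`be64 dst`	Convert all 64 bits to big-endian (byte swap)
le16	`le16 dst`	Keep low 16 bits, clear the rest (no swap needed)
le32	`le32 dst`	Keep low 32 bits, clear the rest (no swap needed)
le64	`le64 dst`	No-op (64-bit)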
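Why the le variants reduce to a mask or a no-op becomes clearer with a small model. This Python sketch assumes the VM is little-endian, so converting to little-endian only truncates while converting to big-endian reverses bytes; `be` and `le` are hypothetical helpers taking the width in bits.

```python
def be(value: int, bits: int) -> int:
    """Truncate to `bits`, then reverse the byte order (little-endian VM assumed)."""
    low = value & ((1 << bits) - 1)
    return int.from_bytes(low.to_bytes(bits // 8, "little"), "big")

def le(value: int, bits: int) -> int:
    """On a little-endian VM the bytes are already in order, so only truncate."""
    return value & ((1 << bits) - 1)

r1 = 0x1122334455667788
for bits in (16, 32, 64):
    print(f"be{bits}: {be(r1, bits):#018x}   le{bits}: {le(r1, bits):#018x}")
```

The be forms are what you would reach for when reading network-order or other big-endian fields out of account data.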
