Instructions
Now that you understand sBPF's registers and memory regions, let's examine the instructions that manipulate them.
Instructions are the fundamental operations your program performs—adding numbers, loading from memory, or jumping to different locations.
What are Instructions?
Instructions are your program's basic building blocks. Think of them as commands that tell the processor exactly what to do:
- add64 r1, r2: "Add the values in registers r1 and r2, store the result in r1"
- ldxdw r0, [r10 - 8]: "Load 8 bytes from stack memory into register r0"
- jeq r1, 42, +3: "If r1 equals 42, jump forward 3 instructions"
Each instruction performs exactly one operation and is encoded in a fixed 8-byte slot, which keeps decoding simple for the VM. The one exception is lddw, which spans two slots so it can carry a 64-bit immediate.
sBPF instructions work with different data sizes:
- byte = 8 bits (1 byte)
- halfword = 16 bits (2 bytes)
- word = 32 bits (4 bytes)
- doubleword = 64 bits (8 bytes)
Most sBPF operations use 64-bit values (doublewords) because registers are 64 bits wide, but you can load and store smaller sizes when that is more efficient.
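To make the four sizes concrete, the sketch below reads the same region of memory at each width using Python's struct module. It assumes little-endian byte order, which is how sBPF lays out multi-byte values; the buffer contents are made up for illustration.

```python
import struct

# Eight bytes of hypothetical memory, lowest address first.
memory = bytes([0x88, 0x77, 0x66, 0x55, 0x44, 0x33, 0x22, 0x11])

# struct format codes: B = 1 byte, H = 2 bytes, I = 4 bytes, Q = 8 bytes.
# "<" selects little-endian, so the byte at the lowest address is the
# least significant one.
byte_value = struct.unpack_from("<B", memory, 0)[0]
halfword_value = struct.unpack_from("<H", memory, 0)[0]
word_value = struct.unpack_from("<I", memory, 0)[0]
doubleword_value = struct.unpack_from("<Q", memory, 0)[0]

print(hex(byte_value))        # 0x88
print(hex(halfword_value))    # 0x7788
print(hex(word_value))        # 0x55667788
print(hex(doubleword_value))  # 0x1122334455667788
```

Each wider read starts at the same address and simply pulls in more of the following bytes.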
Instruction Categories and Format
When you compile Rust, C, or assembly code, the toolchain emits a stream of fixed-width, 8-byte instructions packed into your ELF's .text section.
Each instruction follows a consistent structure that the VM can decode in a single pass:
   1 byte    4 bits   4 bits     2 bytes          4 bytes
┌──────────┬────────┬────────┬──────────────┬──────────────────┐
│  opcode  │  dst   │  src   │    offset    │       imm        │
└──────────┴────────┴────────┴──────────────┴──────────────────┘
- opcode: Defines the operation type. The low 3 bits select the instruction class (load, store, arithmetic, jump), while the upper 5 bits specify the exact variant: the operation (add, multiply, jump-if-equal), the operand source, or the access size.
- dst: The destination register number (r0–r10) where results are stored: arithmetic results, loaded values, or helper function returns.
- src: The source register providing input. For two-operand arithmetic (add r1, r2), it supplies the second value. For memory loads, it provides the base address. For immediate variants (add r1, 10), the field is unused and a flag bit in the opcode selects the immediate form.
- offset: A small signed integer that modifies instruction behavior. For loads, it's added to the source address to reach [src + offset]; for stores, it's added to the destination address to reach [dst + offset]. For jumps, it's a relative branch target measured in instructions.
- imm: The immediate value field. Arithmetic operations use it for constants (add r1, 42), call uses it for syscall numbers (sol_log = 16), and memory operations may treat it as an absolute pointer.
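As a rough illustration of the layout above, here is a minimal Python sketch that splits one 8-byte slot into its five fields. The Instruction type and decode function are hypothetical names for this example, not part of any sBPF toolchain. The sketch assumes the little-endian encoding, in which dst occupies the low nibble and src the high nibble of the second byte.

```python
import struct
from typing import NamedTuple


class Instruction(NamedTuple):
    opcode: int
    dst: int
    src: int
    offset: int
    imm: int


def decode(raw: bytes) -> Instruction:
    """Split one 8-byte instruction slot into its five fields."""
    if len(raw) != 8:
        raise ValueError("an instruction slot is exactly 8 bytes")
    # "<" = little-endian with no padding; B = opcode, B = packed registers,
    # h = signed 16-bit offset, i = signed 32-bit immediate.
    opcode, regs, offset, imm = struct.unpack("<BBhi", raw)
    dst = regs & 0x0F         # low nibble of the register byte
    src = (regs >> 4) & 0x0F  # high nibble of the register byte
    return Instruction(opcode, dst, src, offset, imm)


# ldxdw r0, [r10 - 8], assuming the classic eBPF opcode 0x79 for ldxdw.
insn = decode(bytes([0x79, 0xA0, 0xF8, 0xFF, 0x00, 0x00, 0x00, 0x00]))
print(insn)  # Instruction(opcode=121, dst=0, src=10, offset=-8, imm=0)
```

Note that offset and imm are decoded as signed values, which is why the two bytes F8 FF come back as -8.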
Instruction Categories
Different instruction types use these fields in specific ways:
- Data Movement: Move values between registers and memory:
mov64 r1, 42 # Put immediate value 42 into r1
# opcode=move_imm, dst=1, src=unused, imm=42
ldxdw r0, [r10 - 8] # Load 8 bytes from stack into r0
# opcode=load64, dst=0, src=10, offset=-8, imm=unused
stxdw [r1 + 16], r0 # Store r0 to memory at [r1 + 16]
# opcode=store64, dst=1, src=0, offset=16, imm=unused
- Arithmetic: Perform mathematical operations:
add64 r1, r2 # r1 = r1 + r2
# opcode=add_reg, dst=1, src=2, offset=unused, imm=unused
add64 r1, 100 # r1 = r1 + 100
# opcode=add_imm, dst=1, src=unused, offset=unused, imm=100
- Control Flow: Change execution sequence:
ja +5 # Jump forward 5 instructions unconditionally
# opcode=jump, dst=unused, src=unused, offset=5, imm=unused
jeq r1, r2, +3 # If r1 == r2, jump forward 3 instructions
# opcode=jump_eq_reg, dst=1, src=2, offset=3, imm=unused
jeq r1, 42, +3 # If r1 == 42, jump forward 3 instructions
# opcode=jump_eq_imm, dst=1, src=unused, offset=3, imm=42
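The field assignments in these examples can be turned into actual bytes. The sketch below packs each example into its 8-byte slot. The encode helper is a hypothetical name, and the opcode constants are the classic eBPF values; treat them as an assumption, since some sBPF versions renumber certain instructions.

```python
import struct

# Classic eBPF opcode values (assumed to apply to the sBPF version in use).
MOV64_IMM = 0xB7
LDXDW = 0x79
STXDW = 0x7B
ADD64_REG = 0x0F
ADD64_IMM = 0x07
JA = 0x05
JEQ_REG = 0x1D
JEQ_IMM = 0x15


def encode(opcode: int, dst: int = 0, src: int = 0,
           offset: int = 0, imm: int = 0) -> bytes:
    """Pack the five fields into one little-endian 8-byte slot."""
    regs = (src << 4) | dst  # dst in the low nibble, src in the high nibble
    return struct.pack("<BBhi", opcode, regs, offset, imm)


program = [
    ("mov64 r1, 42", encode(MOV64_IMM, dst=1, imm=42)),
    ("ldxdw r0, [r10 - 8]", encode(LDXDW, dst=0, src=10, offset=-8)),
    ("stxdw [r1 + 16], r0", encode(STXDW, dst=1, src=0, offset=16)),
    ("add64 r1, r2", encode(ADD64_REG, dst=1, src=2)),
    ("add64 r1, 100", encode(ADD64_IMM, dst=1, imm=100)),
    ("ja +5", encode(JA, offset=5)),
    ("jeq r1, r2, +3", encode(JEQ_REG, dst=1, src=2, offset=3)),
    ("jeq r1, 42, +3", encode(JEQ_IMM, dst=1, offset=3, imm=42)),
]

for text, raw in program:
    print(f"{text:<22} {raw.hex(' ')}")
```

Unused fields are simply left at zero, so every instruction still occupies a full slot.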
Opcode Encoding
The opcode encoding captures multiple pieces of information beyond just the operation type:
- Instruction class: Arithmetic, memory, jump, call, etc.
- Operation size: 32-bit vs 64-bit operations
- Source type: Register vs immediate value
- Specific operation: Add vs subtract, load vs store, etc.
This creates distinct opcodes for instruction variants. For example, add64 r1, r2 (register source) uses a different opcode than add64 r1, 42 (immediate source). Similarly, add64 and add32 have different opcodes for different operation sizes.
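To see how one byte can carry all of this, the sketch below pulls an arithmetic opcode apart. It assumes the classic eBPF bit layout: the instruction class in the three least significant bits, a register/immediate flag in bit 3, and the operation in the high four bits. The function name and lookup tables are illustrative and cover only a few arithmetic operations.

```python
# Instruction classes relevant to arithmetic (classic eBPF values, assumed).
CLASS_NAMES = {0x04: "alu32", 0x07: "alu64"}

# A small subset of arithmetic operations, keyed by the high four bits.
ALU_OPS = {0x00: "add", 0x10: "sub", 0x20: "mul"}


def split_alu_opcode(opcode: int) -> tuple[str, str, str]:
    """Return (class, source, operation) for an arithmetic opcode byte."""
    insn_class = CLASS_NAMES.get(opcode & 0x07, "other")
    source = "reg" if opcode & 0x08 else "imm"
    operation = ALU_OPS.get(opcode & 0xF0, "unknown")
    return insn_class, source, operation


for opcode in (0x07, 0x0F, 0x04, 0x0C, 0x1F):
    print(f"0x{opcode:02x} -> {split_alu_opcode(opcode)}")
# 0x07 -> ('alu64', 'imm', 'add')
# 0x0f -> ('alu64', 'reg', 'add')
# 0x04 -> ('alu32', 'imm', 'add')
# 0x0c -> ('alu32', 'reg', 'add')
# 0x1f -> ('alu64', 'reg', 'sub')
```

The four add variants differ only in the class bits and the source flag, which is exactly the distinction between add64 and add32, and between register and immediate sources.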
Arithmetic operations further distinguish between signed and unsigned variants. udiv64 treats values as unsigned (0 to 2^64 - 1, roughly 18 quintillion), while sdiv64 handles signed values (-2^63 to 2^63 - 1, roughly -9 quintillion to +9 quintillion).
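The difference matters because the same 64 bits mean different numbers under each interpretation. The following sketch models both divisions on a register holding the bit pattern of -10. The helper names are hypothetical, the divisor is assumed to be non-zero, and signed division is assumed to truncate toward zero.

```python
MASK64 = (1 << 64) - 1


def to_signed64(value: int) -> int:
    """Reinterpret a 64-bit pattern as a two's-complement signed integer."""
    value &= MASK64
    return value - (1 << 64) if value >> 63 else value


def unsigned_div64(a: int, b: int) -> int:
    """Divide two registers as unsigned values (b assumed non-zero)."""
    return (a & MASK64) // (b & MASK64)


def signed_div64(a: int, b: int) -> int:
    """Divide two registers as signed values, truncating toward zero."""
    x, y = to_signed64(a), to_signed64(b)
    quotient = abs(x) // abs(y)
    if (x < 0) != (y < 0):
        quotient = -quotient
    return quotient & MASK64  # store the result back as a 64-bit pattern


r1 = 0xFFFFFFFFFFFFFFF6  # -10 when signed, 18446744073709551606 when unsigned

print(hex(unsigned_div64(r1, 2)))           # 0x7ffffffffffffffb
print(hex(signed_div64(r1, 2)))             # 0xfffffffffffffffb
print(to_signed64(signed_div64(r1, 2)))     # -5
```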
Instruction Execution
The opcode determines how the VM interprets the remaining fields.
When the VM encounters add64 r1, r2, it reads the opcode and recognizes this as a 64-bit arithmetic operation using two registers: the dst field indicates the result goes into r1, the src field specifies r2 as the second operand, and the offset and immediate fields are ignored.
For add64 r1, 42, the opcode changes to indicate an immediate operation. dst still points to r1, but src is unused, and the immediate field provides the second operand (42).
Memory operations combine multiple fields meaningfully:
For ldxdw r1, [r2+8], the opcode indicates a 64-bit memory load, dst receives the loaded value, src provides the base address, and offset (8) is added to create the final address r2 + 8.
Control flow instructions follow the same pattern:
When you write jeq r1, r2, +5, the opcode encodes a conditional jump comparing two registers. If r1 equals r2, the VM adds the offset (5) to the program counter, jumping forward 5 instructions.
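The walkthrough above can be sketched as a tiny interpreter loop. This is a simplified model, not the real VM: memory is a flat bytearray indexed directly by address, only five opcodes are handled, the opcode constants are the classic eBPF values, and jump offsets are assumed to be relative to the instruction after the jump. The step and encode names are hypothetical.

```python
import struct

MASK64 = (1 << 64) - 1
ADD64_IMM, ADD64_REG, LDXDW, JEQ_IMM, JEQ_REG = 0x07, 0x0F, 0x79, 0x15, 0x1D


def encode(opcode, dst=0, src=0, offset=0, imm=0):
    """Pack the five fields into one 8-byte slot."""
    return struct.pack("<BBhi", opcode, (src << 4) | dst, offset, imm)


def step(regs, memory, pc, raw):
    """Execute one instruction and return the next program counter."""
    opcode, reg_byte, offset, imm = struct.unpack("<BBhi", raw)
    dst, src = reg_byte & 0x0F, reg_byte >> 4
    next_pc = pc + 1
    if opcode == ADD64_REG:    # dst and src used; offset and imm ignored
        regs[dst] = (regs[dst] + regs[src]) & MASK64
    elif opcode == ADD64_IMM:  # src ignored; imm is the second operand
        regs[dst] = (regs[dst] + imm) & MASK64
    elif opcode == LDXDW:      # address = base register + offset
        addr = regs[src] + offset
        regs[dst] = int.from_bytes(memory[addr:addr + 8], "little")
    elif opcode == JEQ_REG:
        if regs[dst] == regs[src]:
            next_pc += offset
    elif opcode == JEQ_IMM:
        if regs[dst] == (imm & MASK64):  # imm sign-extended to 64 bits
            next_pc += offset
    else:
        raise NotImplementedError(f"opcode 0x{opcode:02x}")
    return next_pc


regs = [0] * 11
regs[1], regs[2] = 40, 2
memory = bytearray(64)
memory[8:16] = (1234).to_bytes(8, "little")

program = [
    encode(ADD64_REG, dst=1, src=2),           # r1 = 40 + 2
    encode(JEQ_IMM, dst=1, offset=1, imm=42),  # taken: skips the next slot
    encode(ADD64_IMM, dst=1, imm=100),         # skipped
    encode(LDXDW, dst=3, src=2, offset=6),     # r3 = memory[2 + 6 .. 2 + 14]
]

pc = 0
while pc < len(program):
    pc = step(regs, memory, pc, program[pc])

print(regs[1], regs[3])  # 42 1234
```

The same four fields are decoded for every instruction; the opcode alone decides which of them the handler actually reads.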
Function Calls and Syscalls
sBPF's call mechanism evolved across versions for better clarity and security. Until sBPF v3, call imm served dual purposes: the immediate value determined whether you were calling an internal function or invoking a syscall.
The runtime distinguished between these based on the immediate value range, with syscall numbers typically being small positive integers like 16 for sol_log.
From sBPF v3 onwards, the two uses were split into separate instructions with explicit behavior. call off now handles internal function calls using relative offsets, while syscall imm explicitly invokes runtime functions. This separation makes the intent of the bytecode clear and enables better verification.
Indirect calls through callx also evolved. Earlier versions encoded the target register in the immediate field, but from v2 onwards, it's encoded in the source register field for consistency with the general instruction format.
Opcodes Reference Table
Memory Load Operations
opcode | Mnemonic | Description |
---|---|---|
lddw | lddw dst, imm | Load 64-bit immediate (first slot) |
lddw | lddw dst, imm | Load 64-bit immediate (second slot) |
ldxw | ldxw dst, [src + off] | Load word from memory |
ldxh | ldxh dst, [src + off] | Load halfword from memory |
ldxb | ldxb dst, [src + off] | Load byte from memory |
ldxdw | ldxdw dst, [src + off] | Load doubleword from memory |
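
The two lddw rows reflect that a 64-bit immediate does not fit in one 32-bit imm field. A minimal sketch of how the value is split, assuming the classic eBPF convention: the first slot's imm holds the low 32 bits, the second slot's imm holds the high 32 bits, and the second slot's other fields are zero. The function names are hypothetical.

```python
import struct

LDDW = 0x18  # classic eBPF opcode for the first lddw slot (assumed)


def encode_lddw(dst: int, value: int) -> bytes:
    """Encode lddw dst, value as two consecutive 8-byte slots."""
    value &= (1 << 64) - 1
    low = value & 0xFFFFFFFF
    high = value >> 32
    first = struct.pack("<BBhI", LDDW, dst, 0, low)
    second = struct.pack("<BBhI", 0, 0, 0, high)
    return first + second


def decode_lddw_immediate(raw: bytes) -> int:
    """Recombine the two 32-bit halves into the 64-bit immediate."""
    low = struct.unpack_from("<I", raw, 4)[0]    # imm of the first slot
    high = struct.unpack_from("<I", raw, 12)[0]  # imm of the second slot
    return (high << 32) | low


encoded = encode_lddw(1, 0x1122334455667788)
print(encoded.hex(" "))
# 18 01 00 00 88 77 66 55 00 00 00 00 44 33 22 11
print(hex(decode_lddw_immediate(encoded)))
# 0x1122334455667788
```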
Memory Store Operations
opcode | Mnemonic | Description |
---|---|---|
stw | stw [dst + off], imm | Store word immediate |
sth | sth [dst + off], imm | Store halfword immediate |
stb | stb [dst + off], imm | Store byte immediate |
stdw | stdw [dst + off], imm | Store doubleword immediate |
stxw | stxw [dst + off], src | Store word from register |
stxh | stxh [dst + off], src | Store halfword from register |
stxb | stxb [dst + off], src | Store byte from register |
stxdw | stxdw [dst + off], src | Store doubleword from register |
Arithmetic Operations (64-bit)
opcode | Mnemonic | Description |
---|---|---|
add64 | add64 dst, imm | Add immediate |
add64 | add64 dst, src | Add register |
sub64 | sub64 dst, imm | Subtract immediate |
sub64 | sub64 dst, src | Subtract register |
mul64 | mul64 dst, imm | Multiply immediate |
mul64 | mul64 dst, src | Multiply register |
div64 | div64 dst, imm | Divide immediate (unsigned) |
div64 | div64 dst, src | Divide register (unsigned) |
sdiv64 | sdiv64 dst, imm | Divide immediate (signed) |
sdiv64 | sdiv64 dst, src | Divide register (signed) |
mod64 | mod64 dst, imm | Modulo immediate (unsigned) |
mod64 | mod64 dst, src | Modulo register (unsigned) |
smod64 | smod64 dst, imm | Modulo immediate (signed) |
smod64 | smod64 dst, src | Modulo register (signed) |
neg64 | neg64 dst | Negate |
Arithmetic Operations (32-bit)
opcode | Mnemonic | Description |
---|---|---|
add32 | add32 dst, imm | Add immediate (32-bit) |
add32 | add32 dst, src | Add register (32-bit) |
sub32 | sub32 dst, imm | Subtract immediate (32-bit) |
sub32 | sub32 dst, src | Subtract register (32-bit) |
mul32 | mul32 dst, imm | Multiply immediate (32-bit) |
mul32 | mul32 dst, src | Multiply register (32-bit) |
div32 | div32 dst, imm | Divide immediate (32-bit) |
div32 | div32 dst, src | Divide register (32-bit) |
sdiv32 | sdiv32 dst, imm | Divide immediate (signed 32-bit) |
sdiv32 | sdiv32 dst, src | Divide register (signed 32-bit) |
mod32 | mod32 dst, imm | Modulo immediate (32-bit) |
mod32 | mod32 dst, src | Modulo register (32-bit) |
smod32 | smod32 dst, imm | Modulo immediate (signed 32-bit) |
smod32 | smod32 dst, src | Modulo register (signed 32-bit) |
Logical Operations (64-bit)
opcode | Mnemonic | Description |
---|---|---|
or64 | or64 dst, imm | Bitwise OR immediate |
or64 | or64 dst, src | Bitwise OR register |
and64 | and64 dst, imm | Bitwise AND immediate |
and64 | and64 dst, src | Bitwise AND register |
lsh64 | lsh64 dst, imm | Left shift immediate |
lsh64 | lsh64 dst, src | Left shift register |
rsh64 | rsh64 dst, imm | Right shift immediate |
rsh64 | rsh64 dst, src | Right shift register |
xor64 | xor64 dst, imm | Bitwise XOR immediate |
xor64 | xor64 dst, src | Bitwise XOR register |
mov64 | mov64 dst, imm | Move immediate |
mov64 | mov64 dst, src | Move register |
arsh64 | arsh64 dst, imm | Arithmetic right shift imm |
arsh64 | arsh64 dst, src | Arithmetic right shift reg |
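
The two right shifts in this table differ only in what fills the vacated high bits: rsh64 inserts zeros, while arsh64 copies the sign bit. The sketch below models both on a 64-bit pattern. Masking the shift amount to 6 bits is an assumption of this sketch rather than something stated above.

```python
MASK64 = (1 << 64) - 1


def rsh64(value: int, shift: int) -> int:
    """Logical right shift: high bits are filled with zeros."""
    return (value & MASK64) >> (shift & 63)


def arsh64(value: int, shift: int) -> int:
    """Arithmetic right shift: high bits are filled with the sign bit."""
    value &= MASK64
    signed = value - (1 << 64) if value >> 63 else value
    return (signed >> (shift & 63)) & MASK64


r1 = 0xFFFFFFFFFFFFFF00  # -256 when interpreted as signed

print(hex(rsh64(r1, 4)))   # 0xffffffffffffff0
print(hex(arsh64(r1, 4)))  # 0xfffffffffffffff0
```

For non-negative values the two instructions give the same result; they diverge only when the top bit is set.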
Logical Operations (32-bit)
opcode | Mnemonic | Description |
---|---|---|
or32 | or32 dst, imm | Bitwise OR immediate (32-bit) |
or32 | or32 dst, src | Bitwise OR register (32-bit) |
and32 | and32 dst, imm | Bitwise AND immediate (32-bit) |
and32 | and32 dst, src | Bitwise AND register (32-bit) |
lsh32 | lsh32 dst, imm | Left shift immediate (32-bit) |
lsh32 | lsh32 dst, src | Left shift register (32-bit) |
rsh32 | rsh32 dst, imm | Right shift immediate (32-bit) |
rsh32 | rsh32 dst, src | Right shift register (32-bit) |
xor32 | xor32 dst, imm | Bitwise XOR immediate (32-bit) |
xor32 | xor32 dst, src | Bitwise XOR register (32-bit) |
mov32 | mov32 dst, imm | Move immediate (32-bit) |
mov32 | mov32 dst, src | Move register (32-bit) |
arsh32 | arsh32 dst, imm | Arith right shift imm (32-bit) |
arsh32 | arsh32 dst, src | Arith right shift reg (32-bit) |
Control Flow Operations
opcode | Mnemonic | Description |
---|---|---|
ja | ja off | Unconditional jump (jump 0 = jump to next) |
jeq | jeq dst, imm, off | Jump if equal to immediate |
jeq | jeq dst, src, off | Jump if equal to register |
jgt | jgt dst, imm, off | Jump if greater than immediate (unsigned) |
jgt | jgt dst, src, off | Jump if greater than register (unsigned) |
jge | jge dst, imm, off | Jump if greater or equal immediate (unsigned) |
jge | jge dst, src, off | Jump if greater or equal register (unsigned) |
jset | jset dst, imm, off | Jump if bit set (immediate mask) |
jset | jset dst, src, off | Jump if bit set (register mask) |
jne | jne dst, imm, off | Jump if not equal to immediate |
jne | jne dst, src, off | Jump if not equal to register |
jsgt | jsgt dst, imm, off | Jump if greater than immediate (signed) |
jsgt | jsgt dst, src, off | Jump if greater than register (signed) |
jsge | jsge dst, imm, off | Jump if greater or equal immediate (signed) |
jsge | jsge dst, src, off | Jump if greater or equal register (signed) |
jlt | jlt dst, imm, off | Jump if less than immediate (unsigned) |
jlt | jlt dst, src, off | Jump if less than register (unsigned) |
jle | jle dst, imm, off | Jump if less or equal immediate (unsigned) |
jle | jle dst, src, off | Jump if less or equal register (unsigned) |
jslt | jslt dst, imm, off | Jump if less than immediate (signed) |
jslt | jslt dst, src, off | Jump if less than register (signed) |
jsle | jsle dst, imm, off | Jump if less or equal immediate (signed) |
jsle | jsle dst, src, off | Jump if less or equal register (signed) |
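
The signed and unsigned jump variants can disagree on the same register contents. A small sketch of the comparison logic behind jgt and jsgt; the Python function names simply mirror the mnemonics and return whether the branch would be taken.

```python
MASK64 = (1 << 64) - 1


def to_signed64(value: int) -> int:
    """Reinterpret a 64-bit pattern as a two's-complement signed integer."""
    value &= MASK64
    return value - (1 << 64) if value >> 63 else value


def jgt(dst: int, operand: int) -> bool:
    """Unsigned greater-than, as used by jgt."""
    return (dst & MASK64) > (operand & MASK64)


def jsgt(dst: int, operand: int) -> bool:
    """Signed greater-than, as used by jsgt."""
    return to_signed64(dst) > to_signed64(operand)


r1 = 0xFFFFFFFFFFFFFFFF  # largest unsigned value, or -1 when signed

print(jgt(r1, 1))   # True: 18446744073709551615 > 1
print(jsgt(r1, 1))  # False: -1 is not greater than 1
```

Picking the wrong variant for a length or index check is a common source of bugs, since a negative value looks enormous to the unsigned comparison.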
Function Call Operations
opcode | Mnemonic | Description |
---|---|---|
call | call imm or syscall imm | Call function or syscall |
callx | callx imm | Indirect call (register in imm field) |
exit | exit or return | Return from function |
Byte Swap Operations
opcode | Mnemonic | Description |
---|---|---|
be | be dst, imm | Byte swap (16, 32, or 64 bit) |
le | le dst, imm | Little endian convert (deprecated) |
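
On a little-endian machine, converting to big-endian amounts to reversing the bytes of the low 16, 32, or 64 bits of the register. The sketch below models be under that assumption, and also assumes the bits above the chosen width are cleared. The function name mirrors the mnemonic.

```python
def be(value: int, bits: int) -> int:
    """Model be dst, bits: byte-swap the low `bits` bits of a register."""
    if bits not in (16, 32, 64):
        raise ValueError("width must be 16, 32, or 64")
    nbytes = bits // 8
    truncated = value & ((1 << bits) - 1)  # keep only the low `bits` bits
    # Write the value out little-endian, then read the same bytes back
    # as big-endian, which reverses their order.
    return int.from_bytes(truncated.to_bytes(nbytes, "little"), "big")


print(hex(be(0x1234, 16)))              # 0x3412
print(hex(be(0x11223344, 32)))          # 0x44332211
print(hex(be(0x1122334455667788, 64)))  # 0x8877665544332211
```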