BIR Bytecode Format

Overview

BIR (BarraCUDA Intermediate Representation) is the on-chain bytecode format for GPUCoin smart contracts. It is a target-independent SSA-form intermediate representation produced by the BarraCUDA compiler.

Structure

A BIR module (bir_module_t) is a flat C struct with fixed-size arrays:

bir_module_t {
    num_funcs, num_blocks, num_insts, num_consts, num_globals   // Number of used entries in each table
    funcs[BIR_MAX_FUNCS]       // Function table
    blocks[BIR_MAX_BLOCKS]     // Basic block table
    insts[BIR_MAX_INSTS]       // Instruction array (SSA)
    consts[BIR_MAX_CONSTS]     // Constant pool
    globals[BIR_MAX_GLOBALS]   // Global variable table
}

Key Properties

  • Zero pointers: All references use indices, not pointers, so the entire struct can be serialized and deserialized with a single memcpy.
  • Fixed size: Every table has a compile-time capacity. The instruction array holds at most BIR_MAX_INSTS = 262,144 instructions.
  • Target-independent: BIR does not encode any GPU-specific details. The AMD GPU backend (amdgpu_compile) handles target lowering.
  • SSA form: Each instruction defines exactly one value, referenced by instruction index.
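To make the "zero pointers" property concrete, here is a minimal sketch of an index-linked flat struct. Everything in it (ToyModule, ToyFunc, ToyBlock, block_of, and the tiny capacities) is a hypothetical stand-in, not the real bir_module_t layout.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical, scaled-down capacities; the real BIR_MAX_* values are much larger.
constexpr std::uint32_t kMaxFuncs  = 4;
constexpr std::uint32_t kMaxBlocks = 16;

struct ToyBlock {
    std::uint32_t first_inst;   // index into an instruction array, not a pointer
    std::uint32_t num_insts;
};

struct ToyFunc {
    std::uint32_t first_block;  // index into ToyModule::blocks, not a pointer
    std::uint32_t num_blocks;
};

struct ToyModule {
    std::uint32_t num_funcs;
    std::uint32_t num_blocks;
    ToyFunc  funcs[kMaxFuncs];
    ToyBlock blocks[kMaxBlocks];
};

// With no pointers and no constructors the struct is plain data,
// so copying its bytes yields an equivalent module.
static_assert(std::is_trivially_copyable<ToyModule>::value,
              "ToyModule must be safe to copy with memcpy");
static_assert(std::is_standard_layout<ToyModule>::value,
              "ToyModule must have a predictable layout");

// Resolve the i-th block of a function using indices only.
inline const ToyBlock& block_of(const ToyModule& m, std::uint32_t func, std::uint32_t i) {
    assert(func < m.num_funcs);
    assert(i < m.funcs[func].num_blocks);
    return m.blocks[m.funcs[func].first_block + i];
}
```

Because every link is an offset into a table inside the struct itself, a byte-for-byte copy stays valid wherever it is placed in memory.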

Instruction Format

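Each BIR instruction (bir_inst_t) is 32 bytes. The main fields are shown below; they account for 20 bytes, and the rest of the layout is not covered here: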

bir_inst_t {
    op        : uint16_t   // Opcode (0-143)
    type      : uint8_t    // Result type
    subop     : uint8_t    // Sub-operation (e.g., comparison predicate)
    operands[4] : uint32_t // SSA value references or constants
}

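Operands reference values using a tagged format, decoded with two macros (here op is an operand value, not the opcode field):

  • BIR_VAL_IS_CONST(op): Checks whether the operand refers to a constant
  • BIR_VAL_INDEX(op): Returns the instruction or constant index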
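The bit layout behind these macros is not given here. As an illustration only, the sketch below assumes bit 31 marks a constant-pool reference and the low 31 bits hold the index. The helper names (make_inst_ref, make_const_ref, val_is_const, val_index) are hypothetical, and the real encoding may differ.

```cpp
#include <cstdint>

// ASSUMPTION: bit 31 tags a constant-pool reference; the real BIR macros may
// use a different layout.
constexpr std::uint32_t kConstTag  = 0x80000000u;
constexpr std::uint32_t kIndexMask = 0x7FFFFFFFu;

// Build an operand that refers to the result of instruction `i`.
constexpr std::uint32_t make_inst_ref(std::uint32_t i) {
    return i & kIndexMask;
}

// Build an operand that refers to entry `i` of the constant pool.
constexpr std::uint32_t make_const_ref(std::uint32_t i) {
    return (i & kIndexMask) | kConstTag;
}

// Counterpart of BIR_VAL_IS_CONST under the assumed layout.
constexpr bool val_is_const(std::uint32_t v) {
    return (v & kConstTag) != 0;
}

// Counterpart of BIR_VAL_INDEX under the assumed layout.
constexpr std::uint32_t val_index(std::uint32_t v) {
    return v & kIndexMask;
}

static_assert(val_is_const(make_const_ref(7)), "constant refs carry the tag");
static_assert(!val_is_const(make_inst_ref(7)), "instruction refs do not");
```

Packing the tag into the operand word keeps each operand at 32 bits, so an instruction can mix SSA references and constants without a separate flag field.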

Opcode Categories

| Range | Category           | Count | Examples                                 |
|-------|--------------------|-------|------------------------------------------|
| 0-7   | Integer arithmetic | 8     | ADD, SUB, MUL, SDIV, UDIV, SREM, UREM    |
| 8-14  | Float arithmetic   | 7     | FADD, FSUB, FMUL, FDIV, FREM             |
| 15-19 | Bitwise            | 5     | AND, OR, XOR, SHL, LSHR, ASHR            |
| 20-21 | Comparison         | 2     | ICMP, FCMP                               |
| 22-31 | Conversion         | 10    | TRUNC, ZEXT, SEXT, FPTOUI, etc.          |
| 32-36 | Memory             | 5     | ALLOCA, LOAD, STORE, GEP, SHARED_ALLOC   |
| 37-42 | Control flow       | 6     | BR, BR_COND, SWITCH, RET, PHI, SELECT    |
| 43-46 | Thread model       | 4     | THREAD_ID, BLOCK_ID, BLOCK_DIM, GRID_DIM |
| 47-48 | Barriers           | 2     | BARRIER, BARRIER_GROUP                   |
| 49-55 | Atomics            | 7     | ATOMIC_ADD, ATOMIC_SUB, ATOMIC_CAS, etc. |
| 56-60 | Warp ops           | 5     | SHFL, BALLOT, VOTE_ANY, VOTE_ALL         |
| ...   | ...                | ...   | ...                                      |

Total: 144 opcodes (BIR_OP_COUNT)
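A tool such as a disassembler or validator could classify an opcode by looking it up in the listed ranges. The sketch below hard-codes only the ranges given in the table. The OpCategory enum and op_category function are hypothetical names, and opcodes 61-143, which the table elides, map to Unknown.

```cpp
#include <cstdint>

// Hypothetical category enum mirroring the rows of the opcode table.
enum class OpCategory {
    IntArith, FloatArith, Bitwise, Comparison, Conversion, Memory,
    ControlFlow, ThreadModel, Barriers, Atomics, WarpOps, Unknown
};

struct OpRange {
    std::uint16_t first;   // first opcode in the range (inclusive)
    std::uint16_t last;    // last opcode in the range (inclusive)
    OpCategory category;
};

// Ranges copied from the table; the elided rows are deliberately absent.
constexpr OpRange kOpRanges[] = {
    {0, 7, OpCategory::IntArith},     {8, 14, OpCategory::FloatArith},
    {15, 19, OpCategory::Bitwise},    {20, 21, OpCategory::Comparison},
    {22, 31, OpCategory::Conversion}, {32, 36, OpCategory::Memory},
    {37, 42, OpCategory::ControlFlow},{43, 46, OpCategory::ThreadModel},
    {47, 48, OpCategory::Barriers},   {49, 55, OpCategory::Atomics},
    {56, 60, OpCategory::WarpOps},
};

// Return the category for an opcode, or Unknown if no listed range covers it.
inline OpCategory op_category(std::uint16_t op) {
    for (const OpRange& r : kOpRanges) {
        if (op >= r.first && op <= r.last) {
            return r.category;
        }
    }
    return OpCategory::Unknown;
}
```

A linear scan is enough for eleven ranges. A production decoder would more likely switch on named opcode constants.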

Serialization

BIR modules are stored on-chain as raw bytes:

// Serialize: just copy the flat struct
auto data = compiler::serialize_bir(bir_module);

// Deserialize: copy back into struct
auto bir = compiler::deserialize_bir(data);

This works because bir_module_t contains no pointers: all references are array indices.
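The internals of serialize_bir and deserialize_bir are not shown here. A memcpy-based round trip could look like the following sketch, where MiniModule, to_bytes, from_bytes, and counts_in_range are all hypothetical names introduced for illustration.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical miniature module; the real bir_module_t is far larger.
struct MiniModule {
    std::uint32_t num_insts;
    std::uint32_t insts[8];
};

// Copy the raw bytes of a plain-data struct into a byte vector.
template <typename T>
std::vector<std::uint8_t> to_bytes(const T& value) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy serialization requires plain data");
    std::vector<std::uint8_t> out(sizeof(T));
    std::memcpy(out.data(), &value, sizeof(T));
    return out;
}

// Copy bytes back into a struct; reject input of the wrong size.
template <typename T>
bool from_bytes(const std::vector<std::uint8_t>& data, T& out) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy deserialization requires plain data");
    if (data.size() != sizeof(T)) {
        return false;
    }
    std::memcpy(&out, data.data(), sizeof(T));
    return true;
}

// Extra check a deserializer would likely want for untrusted input:
// the stored count must not exceed the table capacity.
inline bool counts_in_range(const MiniModule& m) {
    return m.num_insts <= sizeof(m.insts) / sizeof(m.insts[0]);
}
```

Raw struct bytes depend on the byte order and padding of the machine that wrote them, so this approach assumes every reader uses the same struct layout. The size and count checks matter because a byte copy accepts any input of the right length.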

Execution Model

A kernel can be executed in two ways:

  1. On GPU (fast path): BIR is compiled to .hsaco via amdgpu_compile + amdgpu_regalloc + amdgpu_emit_elf, then dispatched to the GPU
  2. In interpreter (verification): BIR instructions are executed sequentially in software, with each thread simulated independently
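The interpreter path can be pictured as a loop that evaluates each instruction in order and stores its result at the instruction's own index, once per simulated thread. The sketch below is a toy with three made-up opcodes and no control flow. Op, Operand, Inst, run_thread, and run_all are hypothetical and do not reflect the real interpreter or the bir_inst_t encoding.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical three-opcode subset; real BIR has 144 opcodes.
enum class Op : std::uint16_t { ThreadId, Add, Mul };

struct Operand {
    bool is_const;        // true: index into consts; false: index of an earlier instruction
    std::uint32_t index;
};

struct Inst {
    Op op;
    Operand a;            // ignored by ThreadId
    Operand b;            // ignored by ThreadId
};

// Run a straight-line program for one simulated thread and return the value
// defined by the last instruction (0 for an empty program).
inline std::int64_t run_thread(const std::vector<Inst>& insts,
                               const std::vector<std::int64_t>& consts,
                               std::uint32_t thread_id) {
    std::vector<std::int64_t> values(insts.size(), 0);  // one SSA value per instruction
    auto read = [&](const Operand& o) {
        return o.is_const ? consts[o.index] : values[o.index];
    };
    for (std::size_t i = 0; i < insts.size(); ++i) {
        const Inst& in = insts[i];
        switch (in.op) {
            case Op::ThreadId: values[i] = thread_id; break;
            case Op::Add:      values[i] = read(in.a) + read(in.b); break;
            case Op::Mul:      values[i] = read(in.a) * read(in.b); break;
        }
    }
    return values.empty() ? 0 : values.back();
}

// Simulate every thread independently, one after another.
inline std::vector<std::int64_t> run_all(const std::vector<Inst>& insts,
                                         const std::vector<std::int64_t>& consts,
                                         std::uint32_t num_threads) {
    std::vector<std::int64_t> results;
    results.reserve(num_threads);
    for (std::uint32_t t = 0; t < num_threads; ++t) {
        results.push_back(run_thread(insts, consts, t));
    }
    return results;
}
```

Each thread gets its own value array, which is what makes the simulated threads independent. A real interpreter would also need to handle branches, memory, barriers, and the remaining opcodes.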

The grid/block dimensions from the ExecuteKernel transaction determine how many virtual threads are spawned:

  • Total threads = grid_x * grid_y * grid_z * block_x * block_y * block_z
  • Each thread has a unique combination of THREAD_ID and BLOCK_ID values
  • Shared memory is per-block (64 KB)
  • Global memory is shared across all threads
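The thread count formula and the per-thread IDs can be sketched as follows. The Dim3 and ThreadCoords types and the helper functions are hypothetical. The x-fastest ordering used to turn a linear index into coordinates is an assumption borrowed from common GPU conventions, not something specified here.

```cpp
#include <cstdint>

struct Dim3 {
    std::uint32_t x, y, z;
};

// Number of elements in a 3D extent, computed in 64 bits.
inline std::uint64_t volume(Dim3 d) {
    return static_cast<std::uint64_t>(d.x) * d.y * d.z;
}

// Total threads = grid_x * grid_y * grid_z * block_x * block_y * block_z.
// Assumes the product fits in 64 bits.
inline std::uint64_t total_threads(Dim3 grid, Dim3 block) {
    return volume(grid) * volume(block);
}

// Convert a linear index into 3D coordinates, x varying fastest (assumed order).
inline Dim3 unflatten(std::uint64_t i, Dim3 d) {
    Dim3 r;
    r.x = static_cast<std::uint32_t>(i % d.x);
    r.y = static_cast<std::uint32_t>((i / d.x) % d.y);
    r.z = static_cast<std::uint32_t>(i / (static_cast<std::uint64_t>(d.x) * d.y));
    return r;
}

struct ThreadCoords {
    Dim3 block_id;    // position of the block within the grid
    Dim3 thread_id;   // position of the thread within its block
};

// Map a linear virtual-thread number to its BLOCK_ID and THREAD_ID.
inline ThreadCoords coords_of(std::uint64_t linear, Dim3 grid, Dim3 block) {
    const std::uint64_t per_block = volume(block);
    ThreadCoords c;
    c.block_id  = unflatten(linear / per_block, grid);
    c.thread_id = unflatten(linear % per_block, block);
    return c;
}
```

For example, a grid of (2, 1, 1) with blocks of (4, 2, 1) gives 16 virtual threads. Under the assumed ordering, linear thread 13 is thread (1, 1, 0) of block (1, 0, 0).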