BIR (BarraCUDA Intermediate Representation) is the on-chain bytecode format for GPUCoin smart contracts. It is a target-independent SSA-form intermediate representation produced by the BarraCUDA compiler.
A BIR module (bir_module_t) is a flat C struct with fixed-size arrays:
bir_module_t {
    num_funcs, num_blocks, num_insts, num_consts, num_globals
    funcs[BIR_MAX_FUNCS]       // Function table
    blocks[BIR_MAX_BLOCKS]     // Basic block table
    insts[BIR_MAX_INSTS]       // Instruction array (SSA)
    consts[BIR_MAX_CONSTS]     // Constant pool
    globals[BIR_MAX_GLOBALS]   // Global variable table
}
- Zero pointers: All references use indices, not pointers. This means the entire struct can be serialized/deserialized via memcpy.
- Fixed size: Bounded by BIR_MAX_INSTS = 262,144 instructions.
- Target-independent: BIR doesn't encode any GPU-specific details. The AMD GPU backend (amdgpu_compile) handles target lowering.
- SSA form: Each instruction defines exactly one value, referenced by instruction index.
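To make the index-based design concrete, here is a minimal sketch using scaled-down stand-in types. `mini_inst_t`, `mini_module_t`, `mini_emit`, `MINI_MAX_INSTS` and `MINI_INVALID` are hypothetical names invented for this illustration. They are not the real BIR definitions, and the real structs hold more fields. The sketch shows two of the properties above: a new instruction's array index acts as its SSA value id, and emission fails cleanly once the fixed-size table is full.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical, scaled-down stand-ins for bir_inst_t / bir_module_t.
constexpr uint32_t MINI_MAX_INSTS = 8;            // the real bound is BIR_MAX_INSTS
constexpr uint32_t MINI_INVALID   = 0xFFFFFFFFu;  // assumed sentinel for "table is full"

struct mini_inst_t {
    uint16_t op;
    uint32_t operands[2];  // indices of other instructions, never pointers
};

struct mini_module_t {
    uint32_t    num_insts;
    mini_inst_t insts[MINI_MAX_INSTS];
};

// With no pointers inside, the struct stays trivially copyable.
static_assert(std::is_trivially_copyable<mini_module_t>::value,
              "mini_module_t must stay trivially copyable");

// Appends one instruction and returns its index, which doubles as its SSA value id.
inline uint32_t mini_emit(mini_module_t& m, uint16_t op, uint32_t a, uint32_t b) {
    assert(m.num_insts <= MINI_MAX_INSTS);  // a larger count means the module is corrupt
    if (m.num_insts == MINI_MAX_INSTS) return MINI_INVALID;
    const uint32_t idx = m.num_insts++;
    m.insts[idx].op = op;
    m.insts[idx].operands[0] = a;
    m.insts[idx].operands[1] = b;
    return idx;
}
```

Because later instructions refer to earlier ones only by index, copying the struct to another address (or another machine with the same layout) leaves every reference valid.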
Each BIR instruction (bir_inst_t) is 32 bytes; the fields shown below account for 20 of them:
bir_inst_t {
    op          : uint16_t   // Opcode (0-143)
    type        : uint8_t    // Result type
    subop       : uint8_t    // Sub-operation (e.g., comparison predicate)
    operands[4] : uint32_t   // SSA value references or constants
}
Values are referenced using a tagged format:
- BIR_VAL_IS_CONST(op): Check if operand is a constant
- BIR_VAL_INDEX(op): Get the instruction/constant index
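The exact bit layout of the tag is not given here, so the following is only a sketch of one plausible encoding. It assumes the top bit of the 32-bit operand marks a constant and the low 31 bits hold the index. The `tv_*` helpers and `TV_CONST_TAG` are hypothetical names that mirror what BIR_VAL_IS_CONST and BIR_VAL_INDEX are described as doing. They are not the real macros.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: the top bit flags a constant-pool reference.
constexpr uint32_t TV_CONST_TAG = 0x80000000u;

// Build an operand that refers to an instruction result.
inline uint32_t tv_make_inst_ref(uint32_t index) {
    assert((index & TV_CONST_TAG) == 0);  // index must fit in 31 bits
    return index;
}

// Build an operand that refers to a constant-pool entry.
inline uint32_t tv_make_const_ref(uint32_t index) {
    assert((index & TV_CONST_TAG) == 0);
    return index | TV_CONST_TAG;
}

inline bool     tv_is_const(uint32_t v) { return (v & TV_CONST_TAG) != 0; }
inline uint32_t tv_index(uint32_t v)    { return v & ~TV_CONST_TAG; }

// Resolve an operand against a constant pool and a table of computed results.
// The caller is responsible for passing tables large enough for the index.
inline int64_t tv_resolve(uint32_t v, const int64_t* consts, const int64_t* results) {
    const uint32_t i = tv_index(v);
    return tv_is_const(v) ? consts[i] : results[i];
}
```

Packing the tag into the operand word keeps each operand at 4 bytes and avoids a separate "kind" field per operand.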
| Range | Category | Count | Examples |
|---|---|---|---|
| 0-7 | Integer arithmetic | 8 | ADD, SUB, MUL, SDIV, UDIV, SREM, UREM |
| 8-14 | Float arithmetic | 7 | FADD, FSUB, FMUL, FDIV, FREM |
| 15-19 | Bitwise | 5 | AND, OR, XOR, SHL, LSHR, ASHR |
| 20-21 | Comparison | 2 | ICMP, FCMP |
| 22-31 | Conversion | 10 | TRUNC, ZEXT, SEXT, FPTOUI, etc. |
| 32-36 | Memory | 5 | ALLOCA, LOAD, STORE, GEP, SHARED_ALLOC |
| 37-42 | Control flow | 6 | BR, BR_COND, SWITCH, RET, PHI, SELECT |
| 43-46 | Thread model | 4 | THREAD_ID, BLOCK_ID, BLOCK_DIM, GRID_DIM |
| 47-48 | Barriers | 2 | BARRIER, BARRIER_GROUP |
| 49-55 | Atomics | 7 | ATOMIC_ADD, ATOMIC_SUB, ATOMIC_CAS, etc. |
| 56-60 | Warp ops | 5 | SHFL, BALLOT, VOTE_ANY, VOTE_ALL |
| ... | ... | ... | ... |
Total: 144 opcodes (BIR_OP_COUNT)
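As an illustration of how the ranges in the table could be used, the sketch below maps an opcode number to its category. The ranges are copied from the table at face value, even though the Bitwise row lists six names for a five-opcode range. Opcodes 61-143 are not broken down above, so they are reported as unlisted. The names `opcat_t`, `opcat_range_t`, `OPCAT_RANGES`, `OPCAT_OP_COUNT` and `opcat_lookup` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

enum class opcat_t {
    IntegerArithmetic, FloatArithmetic, Bitwise, Comparison, Conversion, Memory,
    ControlFlow, ThreadModel, Barriers, Atomics, WarpOps, Unlisted
};

struct opcat_range_t {
    uint16_t first;  // first opcode in the category (inclusive)
    uint16_t last;   // last opcode in the category (inclusive)
    opcat_t  category;
};

// Ranges taken directly from the opcode table.
static const opcat_range_t OPCAT_RANGES[] = {
    { 0,  7, opcat_t::IntegerArithmetic},
    { 8, 14, opcat_t::FloatArithmetic},
    {15, 19, opcat_t::Bitwise},
    {20, 21, opcat_t::Comparison},
    {22, 31, opcat_t::Conversion},
    {32, 36, opcat_t::Memory},
    {37, 42, opcat_t::ControlFlow},
    {43, 46, opcat_t::ThreadModel},
    {47, 48, opcat_t::Barriers},
    {49, 55, opcat_t::Atomics},
    {56, 60, opcat_t::WarpOps},
};

constexpr uint16_t OPCAT_OP_COUNT = 144;  // mirrors BIR_OP_COUNT

// Returns the category for a valid opcode, or Unlisted for opcodes 61-143.
inline opcat_t opcat_lookup(uint16_t op) {
    assert(op < OPCAT_OP_COUNT);
    for (const opcat_range_t& r : OPCAT_RANGES) {
        if (op >= r.first && op <= r.last) return r.category;
    }
    return opcat_t::Unlisted;
}
```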
BIR modules are stored on-chain as raw bytes:
// Serialize: just copy the flat struct
auto data = compiler::serialize_bir(bir_module);
// Deserialize: copy back into struct
auto bir = compiler::deserialize_bir(data);
This works because bir_module_t contains no pointers; all references are array indices.
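The signatures of `compiler::serialize_bir` and `compiler::deserialize_bir` are not shown here, so the sketch below does not try to reproduce them. It illustrates the general memcpy round trip on a small stand-in struct, plus the kind of length and count checks a deserializer might apply to untrusted bytes. `ser_module_t`, `ser_serialize`, `ser_deserialize` and `SER_MAX_INSTS` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

constexpr uint32_t SER_MAX_INSTS = 4;  // stand-in for BIR_MAX_INSTS

// Hypothetical pointer-free module: a count plus a fixed-size array.
struct ser_module_t {
    uint32_t num_insts;
    uint32_t insts[SER_MAX_INSTS];
};
static_assert(std::is_trivially_copyable<ser_module_t>::value,
              "memcpy serialization needs a trivially copyable type");

// Serialize: copy the struct's bytes into a buffer.
inline std::vector<uint8_t> ser_serialize(const ser_module_t& m) {
    assert(m.num_insts <= SER_MAX_INSTS);  // internal invariant of a module we built
    std::vector<uint8_t> bytes(sizeof(ser_module_t));
    std::memcpy(bytes.data(), &m, sizeof(ser_module_t));
    return bytes;
}

// Deserialize: reject wrong-sized input and out-of-range counts, then copy back.
inline bool ser_deserialize(const std::vector<uint8_t>& bytes, ser_module_t& out) {
    if (bytes.size() != sizeof(ser_module_t)) return false;
    ser_module_t tmp;
    std::memcpy(&tmp, bytes.data(), sizeof(ser_module_t));
    if (tmp.num_insts > SER_MAX_INSTS) return false;  // untrusted data, so no assert here
    out = tmp;
    return true;
}
```

A raw memory image like this is only portable between builds that agree on struct layout, padding and byte order, so any scheme of this kind relies on all participants sharing those conventions.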
When a kernel is executed:
- On GPU (fast path): BIR is compiled to .hsaco via amdgpu_compile + amdgpu_regalloc + amdgpu_emit_elf, then dispatched to the GPU
- In interpreter (verification): BIR instructions are executed sequentially in software, with each thread simulated independently
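The interpreter path can be pictured with a deliberately tiny sketch: a straight-line program of SSA instructions is evaluated once per simulated thread, one thread after another. Everything here is hypothetical (`sim_op_t`, `sim_inst_t`, `sim_run_thread`, `sim_run_all`), with only four made-up opcodes and no control flow, memory or barriers, so it is much simpler than a real BIR interpreter would need to be.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical mini instruction set, not the real BIR opcodes.
enum sim_op_t : uint16_t { SIM_THREAD_ID, SIM_CONST, SIM_ADD, SIM_MUL };

struct sim_inst_t {
    sim_op_t op;
    uint32_t a;  // SIM_CONST: literal value; SIM_ADD/SIM_MUL: index of an earlier instruction
    uint32_t b;  // SIM_ADD/SIM_MUL: index of an earlier instruction
};

// Executes the program sequentially for one simulated thread and
// returns the value defined by the last instruction.
inline int64_t sim_run_thread(const std::vector<sim_inst_t>& prog, uint32_t thread_id) {
    assert(!prog.empty());
    std::vector<int64_t> values(prog.size());  // one SSA value per instruction
    for (std::size_t i = 0; i < prog.size(); ++i) {
        const sim_inst_t& in = prog[i];
        switch (in.op) {
            case SIM_THREAD_ID: values[i] = thread_id; break;
            case SIM_CONST:     values[i] = in.a; break;
            case SIM_ADD:
                assert(in.a < i && in.b < i);  // operands must already be defined
                values[i] = values[in.a] + values[in.b];
                break;
            case SIM_MUL:
                assert(in.a < i && in.b < i);
                values[i] = values[in.a] * values[in.b];
                break;
        }
    }
    return values.back();
}

// Simulates each thread independently, in order.
inline std::vector<int64_t> sim_run_all(const std::vector<sim_inst_t>& prog, uint32_t num_threads) {
    std::vector<int64_t> out;
    out.reserve(num_threads);
    for (uint32_t t = 0; t < num_threads; ++t) out.push_back(sim_run_thread(prog, t));
    return out;
}
```

For example, the program `v0 = thread_id; v1 = 3; v2 = v0 * v1; v3 = v2 + v1` yields `3 * thread_id + 3` for each thread.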
The grid/block dimensions from the ExecuteKernel transaction determine how many virtual threads are spawned:
- Total threads = grid_x * grid_y * grid_z * block_x * block_y * block_z
- Each thread has a unique combination of THREAD_ID and BLOCK_ID values
- Shared memory is per-block (64 KB)
- Global memory is shared across all threads
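The thread-count formula and the uniqueness of each thread's ids can be sketched as follows. `dim_xyz_t`, `dim_coords_t`, `dim_total_threads` and `dim_coords` are hypothetical names, and the x-fastest ordering used to turn a linear thread number into 3D ids is an assumption borrowed from the usual CUDA-style convention, not something stated above.

```cpp
#include <cassert>
#include <cstdint>

struct dim_xyz_t { uint32_t x, y, z; };

struct dim_coords_t {
    dim_xyz_t block_id;   // position of the block within the grid
    dim_xyz_t thread_id;  // position of the thread within its block
};

// Total threads = grid_x * grid_y * grid_z * block_x * block_y * block_z.
// Computed in 64 bits; extremely large dimensions could still overflow.
inline uint64_t dim_total_threads(dim_xyz_t grid, dim_xyz_t block) {
    return uint64_t(grid.x) * grid.y * grid.z * block.x * block.y * block.z;
}

// Maps a linear thread number to its (BLOCK_ID, THREAD_ID) pair.
// Assumption: x varies fastest, then y, then z.
inline dim_coords_t dim_coords(uint64_t linear, dim_xyz_t grid, dim_xyz_t block) {
    assert(linear < dim_total_threads(grid, block));
    const uint64_t per_block = uint64_t(block.x) * block.y * block.z;
    const uint64_t b = linear / per_block;  // linear block number
    const uint64_t t = linear % per_block;  // linear thread number inside the block
    dim_coords_t c;
    c.thread_id = { uint32_t(t % block.x),
                    uint32_t((t / block.x) % block.y),
                    uint32_t(t / (uint64_t(block.x) * block.y)) };
    c.block_id  = { uint32_t(b % grid.x),
                    uint32_t((b / grid.x) % grid.y),
                    uint32_t(b / (uint64_t(grid.x) * grid.y)) };
    return c;
}
```

With a grid of (2, 1, 1) and blocks of (4, 2, 1), there are 16 threads; linear thread 13 lands in block (1, 0, 0) as thread (1, 1, 0).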