Rotation Compiler

How a JSON rotation becomes native code or an interpreted plan, why both share one lowerer, and the dense buffer they read

7 min read

When the loop asks the handler "what now?", the answer comes from a rotation: a priority list of actions with conditions, authored as JSON and compiled once at bootstrap. At runtime the handler calls evaluate(&mut buffer, now_secs) and gets back a single EvalResult: cast this spell, wait this long, pool to this resource level. That is the contract. Everything below is how it is honoured fast and identically across two backends.
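As a rough illustration of that contract, the sketch below shows how a handler might turn an EvalResult into its next step. The struct mirrors the definition given under "The EvalResult ABI" further down. The numeric values of the KIND_* constants, the NextStep enum, and the interpret function are assumptions made for this example, not the engine's actual API.

```rust
// Assumed values: the source names the five kinds but not their numbers.
pub const KIND_NONE: u8 = 0;
pub const KIND_CAST: u8 = 1;
pub const KIND_WAIT: u8 = 2;
pub const KIND_POOL: u8 = 3;
pub const KIND_USE_ITEM: u8 = 4;

#[derive(Debug, Clone, Copy, PartialEq)]
#[repr(C)]
pub struct EvalResult {
    pub kind: u8,
    pub spell_id: u32,
    pub wait_time: f32,
}

/// Hypothetical handler-side view of a decision.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum NextStep {
    Idle,
    Cast { spell_id: u32 },
    Wait { seconds: f32 },
    PoolTo { level: f32 },
    UseItem { gear_slot: u32 },
}

/// Reads the payload fields according to `kind`, since their meaning depends on it.
pub fn interpret(result: EvalResult) -> NextStep {
    match result.kind {
        KIND_CAST => NextStep::Cast { spell_id: result.spell_id },
        KIND_WAIT => NextStep::Wait { seconds: result.wait_time },
        KIND_POOL => NextStep::PoolTo { level: result.wait_time },
        KIND_USE_ITEM => NextStep::UseItem { gear_slot: result.spell_id },
        // KIND_NONE and any unknown kind: nothing to do this tick.
        _ => NextStep::Idle,
    }
}
```

The handler never needs to know which backend produced the result; it only sees the three fields.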

This page expands the Engine box of the system-context diagram (see Architecture) along a different axis than the event system: the rotation execution path rather than the clock.

Two backends, one lowerer

There are two ways to evaluate a rotation, and the engine ships both. A JIT backend compiles the rotation to native machine code through LLVM (via the inkwell crate). An interpreter backend runs the same logic without codegen. They are selected at compile time by the jit feature, with one runtime override:

  • Native builds with the jit feature use the JIT. The published figure for a JIT evaluation is on the order of 1.5 ns, because the rotation collapses to straight-line native code reading a flat buffer.
  • WASM has no LLVM, so browser builds always use the interpreter. This is not a fallback we are ashamed of. It is the only option in the sandbox, and it keeps the in-browser preview honest about the same rotation logic.
  • Attaching a decision-trace sink forces the interpreter even on native, because the JIT cannot record per-decision traces. set_decision_trace on a JIT engine recompiles it through the interpreter on the spot.
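The three rules above reduce to a small decision function. This is a minimal sketch, not the engine's code: BackendKind and select_backend are hypothetical names, and the two compile-time facts are passed in as booleans so the logic can be exercised directly. In a real build they would more plausibly come from cfg!(feature = "jit") and cfg!(target_arch = "wasm32").

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BackendKind {
    Jit,
    Interpreter,
}

/// Picks a backend from the inputs the text names.
/// `jit_feature` and `is_wasm` stand in for compile-time configuration;
/// `trace_attached` is the one runtime override.
pub fn select_backend(jit_feature: bool, is_wasm: bool, trace_attached: bool) -> BackendKind {
    if jit_feature && !is_wasm && !trace_attached {
        BackendKind::Jit
    } else {
        // No LLVM available, or per-decision traces were requested.
        BackendKind::Interpreter
    }
}
```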

The thing that makes two backends maintainable is that they are not two implementations. Both call the same lower::lower_rotation. The lowerer is generic over a RotationBackend trait whose associated types are the backend's notion of a boolean, integer, and float:

rust
pub trait RotationBackend {
    type Bool: Copy;
    type Int: Copy;
    type Float: Copy;

    // Primitive operations (load, compare, add, branch, return) are elided in this excerpt.
}

The trait's methods are the primitive operations the lowerer composes: load a field, compare, add, branch, return a cast. The JIT implements those primitives by emitting LLVM IR; the interpreter implements them by computing values directly. The priority-list logic, which condition gates which action and what counts as a terminator, is written exactly once. A cross-backend parity test asserts the two produce identical results.

The interpreter is worth being precise about because it is not what the name suggests. It is not a separate AST walker. Its evaluate constructs an InterpBackend over the buffer and runs the very same lower_rotation:

rust
let mut backend = InterpBackend::new(buffer, now_secs);
lower::lower_rotation(
    &mut backend,
    &self.rotation,
    &self.schema,
    &self.resolver,
    &self.table,
);
backend.finish()

The only difference from the JIT is that the primitive ops compute instead of emit. "Interpreter" here means "the lowerer driven eagerly," not "a second engine."
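To make "one lowerer, two backends" concrete, here is a toy version of the pattern. It is a sketch under heavy simplification: the Backend trait, lower_condition, Eager, and Emitter are all invented for this example, the state is a slice of f64 indexed by position rather than a byte buffer, and the emitting backend produces strings where the real JIT would produce LLVM IR.

```rust
/// Toy stand-in for the backend trait: two associated types, three primitives.
pub trait Backend {
    type Bool: Copy;
    type Float: Copy;
    fn const_float(&mut self, value: f64) -> Self::Float;
    fn load_float(&mut self, index: usize) -> Self::Float;
    fn less_than(&mut self, a: Self::Float, b: Self::Float) -> Self::Bool;
}

/// Written once: "is the float at `index` below `threshold`?"
pub fn lower_condition<B: Backend>(b: &mut B, index: usize, threshold: f64) -> B::Bool {
    let value = b.load_float(index);
    let limit = b.const_float(threshold);
    b.less_than(value, limit)
}

/// Eager backend: every primitive computes a real value immediately.
pub struct Eager<'a> {
    pub state: &'a [f64],
}

impl Backend for Eager<'_> {
    type Bool = bool;
    type Float = f64;
    fn const_float(&mut self, value: f64) -> f64 { value }
    fn load_float(&mut self, index: usize) -> f64 { self.state[index] }
    fn less_than(&mut self, a: f64, b: f64) -> bool { a < b }
}

/// Emitting backend: every primitive appends a pseudo-instruction and
/// returns the id of the value it defined.
#[derive(Default)]
pub struct Emitter {
    pub code: Vec<String>,
}

impl Emitter {
    fn push(&mut self, text: String) -> usize {
        self.code.push(text);
        self.code.len() - 1
    }
}

impl Backend for Emitter {
    type Bool = usize;
    type Float = usize;
    fn const_float(&mut self, value: f64) -> usize { self.push(format!("const {value:?}")) }
    fn load_float(&mut self, index: usize) -> usize { self.push(format!("load [{index}]")) }
    fn less_than(&mut self, a: usize, b: usize) -> usize { self.push(format!("lt %{a}, %{b}")) }
}
```

Running lower_condition over Eager yields a bool right away; running it over Emitter yields a three-instruction program that computes the same bool later. The condition logic itself exists in one place, which is the property the parity test relies on.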

Figure 14
Rotation Compilation (JIT vs Interpreter)
Expands the build-handler box of the simulation pipeline: JSON to parse_and_validate to lower::prepare to the shared lower_rotation, lowered through either the LLVM JIT backend or the interpreter to one EvalResult; the WASM build and decision-trace mode use the interpreter.

This figure expands the Engine box of the system-context diagram (see Architecture). The pipeline:

  • parse_and_validate is serde plus a validation pass over the action tree.
  • lower::prepare builds the DescriptorTable, registers every field the rotation reads and every user variable into a SchemaBuilder, validates that referenced resources exist, and returns the ContextSchema that fixes the buffer layout.
  • The decision node picks a backend, but both arrows converge on the same lower_rotation. The JIT arm then verifies the module and creates an execution engine at OptimizationLevel::Aggressive, grabbing the raw function pointer; the interpreter arm just stores the rotation and table.
  • At runtime, evaluate(buffer, now) either calls the native function (which returns a packed u64 decoded into an EvalResult) or runs the interpreter to finish().

A note on the road not taken: the obvious alternative JIT backend in the Rust ecosystem is Cranelift, which is simpler to embed and compiles faster. The engine uses LLVM through inkwell instead. It is slower to compile but better at optimising the kind of branch-heavy, read-only numeric code a rotation lowers to, and the rotation is compiled once per sim and then evaluated millions of times, so compile time is amortised to nothing.

The EvalResult ABI

EvalResult is the value the rotation returns and the handler acts on. The in-memory form is a #[repr(C)] 12-byte triple, with a const_assert_eq! pinning the size at 12:

rust
#[derive(Debug, Clone, Copy, PartialEq)]
#[repr(C)]
pub struct EvalResult {
    pub kind: u8,
    /// Spell ID for `KIND_CAST`, or gear-slot repr for `KIND_USE_ITEM`.
    pub spell_id: u32,
    /// Wait seconds for `KIND_WAIT`, pool target for `KIND_POOL`.
    pub wait_time: f32,
}

The kind byte is one of five constants: KIND_NONE, KIND_CAST, KIND_WAIT, KIND_POOL, KIND_USE_ITEM. The spell_id and wait_time fields mean different things per kind, which is why their doc comments hedge: for a use-item result spell_id carries the GearSlot repr instead of a spell id, and for a pool result wait_time is the target level.
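The 12-byte figure follows from repr(C) rules: the u8 sits at offset 0, three padding bytes bring the u32 to offset 4, and the f32 lands at offset 8. The sketch below pins those numbers at compile time using only the standard library. The source uses const_assert_eq! for the size; the plain const assertions here are a stand-in for that macro, and the per-field offset checks are an addition for illustration.

```rust
use std::mem::{offset_of, size_of};

#[derive(Debug, Clone, Copy, PartialEq)]
#[repr(C)]
pub struct EvalResult {
    pub kind: u8,
    pub spell_id: u32,
    pub wait_time: f32,
}

// Each of these fails the build, not the run, if the layout drifts.
const _: () = assert!(size_of::<EvalResult>() == 12);
const _: () = assert!(offset_of!(EvalResult, kind) == 0);
const _: () = assert!(offset_of!(EvalResult, spell_id) == 4);
const _: () = assert!(offset_of!(EvalResult, wait_time) == 8);

/// Returns (offset of kind, offset of spell_id, offset of wait_time, total size).
pub fn eval_result_layout() -> (usize, usize, usize, usize) {
    (
        offset_of!(EvalResult, kind),
        offset_of!(EvalResult, spell_id),
        offset_of!(EvalResult, wait_time),
        size_of::<EvalResult>(),
    )
}
```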

The JIT does not return that struct directly. A native function returns a scalar, so the rotation function returns a u64 with the same three fields bit-packed: [kind:8][spell_id:24][wait_time:32]. pack_eval_result builds it; decode_eval_result is the inverse, called right after the JIT call:

rust
pub fn decode_eval_result(packed: u64) -> (u8, u32, f32) {
    // #t(block: lossy_cast, magic_numbers) packed u64 bit extraction.
    let kind = (packed >> KIND_SHIFT) as u8;
    let spell_id = ((packed >> SPELL_ID_SHIFT) & SPELL_ID_MASK) as u32;
    let wait_time = f32::from_bits(packed as u32);
    (kind, spell_id, wait_time)
}

Note that the packed spell id is 24 bits, narrower than the struct's u32, so it can carry ids only up to 16,777,215. That is fine for live spell ids, and it is the only place the two representations differ. This packing lives in buffer-contract, the crate that holds the ABI both the lowerer and the backends agree on.
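For reference, a packing function consistent with that decoder could look like the sketch below. The constant values are derived from the [kind:8][spell_id:24][wait_time:32] layout rather than copied from the crate, and the signature of pack_eval_result is an assumption; only its name and role appear in the source.

```rust
// Assumed values, derived from the documented bit layout.
pub const KIND_SHIFT: u32 = 56;
pub const SPELL_ID_SHIFT: u32 = 32;
pub const SPELL_ID_MASK: u64 = 0x00FF_FFFF;

/// Packs the three fields into one u64. Spell ids wider than 24 bits are
/// truncated by the mask so they cannot spill into the kind byte.
pub fn pack_eval_result(kind: u8, spell_id: u32, wait_time: f32) -> u64 {
    ((kind as u64) << KIND_SHIFT)
        | (((spell_id as u64) & SPELL_ID_MASK) << SPELL_ID_SHIFT)
        | (wait_time.to_bits() as u64)
}

/// Inverse of `pack_eval_result`, same shape as the decoder shown above.
pub fn decode_eval_result(packed: u64) -> (u8, u32, f32) {
    let kind = (packed >> KIND_SHIFT) as u8;
    let spell_id = ((packed >> SPELL_ID_SHIFT) & SPELL_ID_MASK) as u32;
    // The low 32 bits are the raw IEEE-754 bits of the f32.
    let wait_time = f32::from_bits(packed as u32);
    (kind, spell_id, wait_time)
}
```

A round trip preserves any spell id below 2^24 and the exact bit pattern of wait_time, including negative values.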

The dense buffer

The rotation reads game state, and how that state is laid out is the difference between a 1.5 ns evaluation and a slow one. State lives in a DenseBuffer: one contiguous Vec<SlotChunk>, where SlotChunk is a repr(align(8)) 8-byte newtype, so the allocation is guaranteed to be 8-byte-aligned, which a plain Vec<u8> does not guarantee. The buffer is viewed as raw bytes and divided into slots, read and written through raw pointer casts. There are no hash lookups during evaluation, no boxing, and no indirection. The rotation function is handed a *mut u8 and a known set of byte offsets, and it loads f64s and i32s straight out.
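A minimal sketch of that idea follows. The type names DenseBuffer and SlotChunk come from the text, but every method here (with_bytes, read_f64, write_i32, and so on) is hypothetical, and the bounds and alignment checks are written as plain assertions for clarity rather than to match the engine's accessor macros.

```rust
/// 8 bytes with 8-byte alignment, so a Vec of these is always 8-aligned.
#[derive(Clone, Copy, Default)]
#[repr(align(8))]
pub struct SlotChunk([u8; 8]);

pub struct DenseBuffer {
    chunks: Vec<SlotChunk>,
}

impl DenseBuffer {
    /// Allocates zeroed storage, rounded up to a whole number of chunks.
    pub fn with_bytes(len: usize) -> Self {
        Self { chunks: vec![SlotChunk::default(); len.div_ceil(8)] }
    }

    pub fn len_bytes(&self) -> usize { self.chunks.len() * 8 }
    pub fn as_ptr(&self) -> *const u8 { self.chunks.as_ptr().cast() }
    pub fn as_mut_ptr(&mut self) -> *mut u8 { self.chunks.as_mut_ptr().cast() }

    pub fn write_f64(&mut self, offset: usize, value: f64) {
        assert!(offset % 8 == 0 && offset + 8 <= self.len_bytes(), "bad f64 offset");
        // SAFETY: in bounds, and 8-aligned because the base is 8-aligned
        // and the offset is a multiple of 8.
        unsafe { self.as_mut_ptr().add(offset).cast::<f64>().write(value) }
    }

    pub fn read_f64(&self, offset: usize) -> f64 {
        assert!(offset % 8 == 0 && offset + 8 <= self.len_bytes(), "bad f64 offset");
        // SAFETY: same argument as in `write_f64`.
        unsafe { self.as_ptr().add(offset).cast::<f64>().read() }
    }

    pub fn write_i32(&mut self, offset: usize, value: i32) {
        assert!(offset % 4 == 0 && offset + 4 <= self.len_bytes(), "bad i32 offset");
        // SAFETY: in bounds and 4-aligned by the assertion above.
        unsafe { self.as_mut_ptr().add(offset).cast::<i32>().write(value) }
    }

    pub fn read_i32(&self, offset: usize) -> i32 {
        assert!(offset % 4 == 0 && offset + 4 <= self.len_bytes(), "bad i32 offset");
        // SAFETY: same argument as in `write_i32`.
        unsafe { self.as_ptr().add(offset).cast::<i32>().read() }
    }
}
```

A read is one add and one load from a known offset, which is the shape straight-line native code can consume directly.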

Figure 15
DenseBuffer Slot Model
Expands the DenseBuffer box of the rotation compiler: singleton slots (player, combat, pet) and keyed maps (cooldown, aura, resource, spell, ...) as repr(C) 8-aligned structs in one flat byte buffer, indexed by the FieldDescriptor inventory.

This figure expands the DenseBuffer box of the rotation-compile figure above. The model has three families of slot:

  • Singletons, one each: player, combat, pet. The player slot holds GCD end, cast/channel end, haste, crit, mastery, attack power, level, and the boolean state flags (moving, alive, in combat, stealthed).
  • Keyed maps, many of each, indexed by an integer or string key: cooldowns and spells and history by SpellIdx, auras by AuraKey, resources by ResourceType, units by role string, swings by hand. Each key maps to a byte offset into the same buffer.
  • Standalone, the user's own rotation variables, defaulted on reset.

Every slot is a repr(C) struct generated by a define_slot! macro that also emits its field offsets, its size, and a set of FieldDescriptors collected through the inventory crate. A FieldDescriptor records (domain, name) -> (field_offset, eval_kind, field_type). That descriptor table is exactly what lower::prepare consults to turn a rotation's Read { field: "cooldown.fireball.remaining" } into a concrete byte load with the right EvalKind.
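The lookup that turns a field path into a byte load can be sketched as below. This is an assumption-laden illustration: the FieldDescriptor fields follow the (domain, name) -> (field_offset, eval_kind, field_type) shape from the text, but FieldType, ResolvedRead, resolve_read, the domain.key.name path split, and the linear search are all invented here. The slot's base offset for a given key is taken as an input because the keyed-map lookup is outside this example.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EvalKind {
    Direct,
    TimestampRemaining,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FieldType {
    F64,
    I32,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct FieldDescriptor {
    pub domain: &'static str,
    pub name: &'static str,
    pub field_offset: usize,
    pub eval_kind: EvalKind,
    pub field_type: FieldType,
}

/// What the lowerer needs to emit a load: where, how wide, and how to evaluate.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ResolvedRead {
    pub byte_offset: usize,
    pub eval_kind: EvalKind,
    pub field_type: FieldType,
}

/// Resolves a path such as "cooldown.fireball.remaining" against the table.
/// `slot_base` is the byte offset of the slot that the key maps to.
pub fn resolve_read(
    descriptors: &[FieldDescriptor],
    path: &str,
    slot_base: usize,
) -> Option<ResolvedRead> {
    let mut parts = path.splitn(3, '.');
    let domain = parts.next()?;
    let _key = parts.next()?; // already resolved to `slot_base` by the caller
    let name = parts.next()?;
    let found = descriptors
        .iter()
        .find(|d| d.domain == domain && d.name == name)?;
    Some(ResolvedRead {
        byte_offset: slot_base + found.field_offset,
        eval_kind: found.eval_kind,
        field_type: found.field_type,
    })
}
```

All of this runs once, before compilation; the evaluated rotation only ever sees the resulting byte offsets.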

EvalKind is where the buffer's expressiveness lives. One stored field can expose several named rotation expressions with different evaluation semantics:

rust
pub enum EvalKind {
    Direct,
    TimestampReady,
    TimestampRemaining,
    TimestampActive,
    TimestampElapsed,
    TimestampInactive,
    /// Field `ready_at` (f64 +0); reads `current_charges` (i32 +16), `max_charges` (i32 +20).
    CooldownReady,
    /// Field `expires_at` (f64 +0); reads `base_duration` (f64 +8); active iff `remaining < 0.3 * base_duration`.
    AuraRefreshable,
    PositiveFloat,
    /// Field `current` (f64 +0), `max` (+8).
    ResourceDeficit,
    /// Field `current` (f64 +0), `max` (+8).
    ResourcePct,
    /// Field `current` (f64 +0), `max` (+8).
    ResourceDeficitPct,
    /// Field `current` (+0), `max` (+8), `regen` (+16).
    ResourceTimeToMax,
    /// Field `health` (+0), `max_health` (+8).
    UnitHealthPct,
    /// Field `health` (+0), `max_health` (+8).
    UnitHealthDeficit,
    /// Cross-slot spell usability; extra offsets resolved at JIT compile time via a side-table.
    SpellUsable,
}

An aura slot stores a single expires_at deadline, but the rotation can ask for is_active (TimestampActive), remaining (TimestampRemaining), elapsed (TimestampElapsed), or is_refreshable (AuraRefreshable, true when remaining < 0.3 * base_duration). A resource slot stores current, max, and regen, and the rotation reads deficit, pct, deficit_pct, or time_to_max off them. The deadline-and-now arithmetic happens at evaluation; the buffer stores only the raw state.
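The arithmetic behind a few of these kinds is small enough to write out. The sketch below is one plausible reading, not the engine's definitions: the 0.3 refresh threshold comes from the text, while clamping remaining time at zero, expressing pct on a 0 to 100 scale, and returning infinity when regen is not positive are assumptions made for this example.

```rust
/// Seconds until `deadline`, clamped at zero once it has passed (assumed clamp).
pub fn timestamp_remaining(deadline: f64, now: f64) -> f64 {
    (deadline - now).max(0.0)
}

/// True while the deadline is still in the future.
pub fn timestamp_active(deadline: f64, now: f64) -> bool {
    deadline > now
}

/// True when less than 30% of the base duration is left, as described in the text.
pub fn aura_refreshable(expires_at: f64, base_duration: f64, now: f64) -> bool {
    timestamp_remaining(expires_at, now) < 0.3 * base_duration
}

/// How far the resource is below its maximum.
pub fn resource_deficit(current: f64, max: f64) -> f64 {
    max - current
}

/// Fill level on a 0 to 100 scale (assumed scale); zero when `max` is not positive.
pub fn resource_pct(current: f64, max: f64) -> f64 {
    if max > 0.0 { 100.0 * current / max } else { 0.0 }
}

/// Seconds until full at the current regen rate; infinite if nothing regenerates.
pub fn resource_time_to_max(current: f64, max: f64, regen: f64) -> f64 {
    if regen > 0.0 {
        (resource_deficit(current, max) / regen).max(0.0)
    } else {
        f64::INFINITY
    }
}
```

Each function takes only raw stored fields plus now, which is why one expires_at value can back several named expressions.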

Two correctness guards keep this honest. At compile time, every slot's declared layout is checked against its actual repr(C) layout by assert_repr_c_layout, which recomputes offsets, size, and alignment from the field list, asserts they match, and rejects any field whose alignment exceeds the slot's 8-byte alignment. At runtime, the slot accessor macros assert pointer alignment before the cast. A misaligned dereference is undefined behaviour in any build, and debug_assert! is compiled out of release builds, so the macro uses a plain assert! that fires in release builds too. The buffer is fast because it is flat and unsafe; it is correct because the contract is verified at the boundary.
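The compile-time guard amounts to recomputing the repr(C) layout from a list of field sizes and alignments and comparing it with what the compiler actually produced. The sketch below shows that recomputation as an ordinary function. It is not assert_repr_c_layout itself: DemoSlot, ComputedLayout, repr_c_layout, and demo_slot_matches are hypothetical, and the check runs at call time here rather than in a const context.

```rust
use std::mem::{offset_of, size_of};

/// Maximum field alignment a slot may require, per the text.
pub const SLOT_ALIGN: usize = 8;

/// Illustrative slot, not one of the engine's.
#[repr(C)]
pub struct DemoSlot {
    pub deadline: f64,
    pub charges: i32,
    pub flag: u8,
}

#[derive(Debug, PartialEq, Eq)]
pub struct ComputedLayout {
    pub offsets: Vec<usize>,
    pub size: usize,
    pub align: usize,
}

/// Applies the repr(C) rules to `(size, align)` pairs given in declaration order.
pub fn repr_c_layout(fields: &[(usize, usize)]) -> Result<ComputedLayout, String> {
    let mut offsets = Vec::with_capacity(fields.len());
    let mut cursor = 0usize;
    let mut align = 1usize;
    for (i, &(size, field_align)) in fields.iter().enumerate() {
        if field_align > SLOT_ALIGN {
            return Err(format!("field {i} needs alignment {field_align}"));
        }
        // Pad up to the field's alignment, place it, then advance past it.
        cursor = cursor.next_multiple_of(field_align);
        offsets.push(cursor);
        cursor += size;
        align = align.max(field_align);
    }
    // Trailing padding rounds the size up to the struct's own alignment.
    Ok(ComputedLayout { offsets, size: cursor.next_multiple_of(align), align })
}

/// Compares the recomputed layout with the compiler's layout of `DemoSlot`.
pub fn demo_slot_matches() -> bool {
    let Ok(layout) = repr_c_layout(&[(8, 8), (4, 4), (1, 1)]) else {
        return false;
    };
    layout.offsets
        == [
            offset_of!(DemoSlot, deadline),
            offset_of!(DemoSlot, charges),
            offset_of!(DemoSlot, flag),
        ]
        && layout.size == size_of::<DemoSlot>()
}
```

If a field is reordered or widened without updating the declared list, the comparison fails before any raw pointer cast can read the wrong bytes.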

The buffer is what the combat system reads and writes during a fight. The next page, the cast pipeline, is what actually mutates those slots when a spell lands.
