Taraṅga

The technology

Numbers as waves.
Gates as laws.

Stochastic computing replaces place-value arithmetic with probability. The hardware gets radically smaller because the mathematics is done by physics — at the price of time, paid in stream length. Here is the whole machine, plainly.

01

Encode

value becomes density

A number x in [0,1] becomes a bitstream in which each bit is 1 with probability x. The encoder (B2S) is just a comparator: each cycle it emits a 1 when a fresh pseudo-random value falls below x. We draw that sequence from a 16-bit LFSR bank, or from a deterministic low-discrepancy generator that is 7.5× smaller.
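A minimal software sketch of the comparator-style encoder may help here. It is an illustration, not the released Verilog: the tap set (16, 14, 13, 11), the seed 0xACE1, the use of a single LFSR rather than a bank, and the names `lfsr16` and `b2s` are all assumptions made for this example.

```python
def lfsr16(seed=0xACE1):
    """Yield successive states of a 16-bit Fibonacci LFSR.

    Assumed taps: 16, 14, 13, 11 (a commonly cited maximal-length choice).
    The state never becomes zero, so it cycles through 65535 values.
    """
    state = seed & 0xFFFF
    if state == 0:
        raise ValueError("LFSR seed must be non-zero")
    while True:
        yield state
        bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
        state = (state >> 1) | (bit << 15)


def b2s(x, length, seed=0xACE1):
    """Binary-to-stochastic: emit 1 whenever the LFSR state is below x's threshold."""
    threshold = round(x * 65536)  # x scaled to the 16-bit comparator range
    rng = lfsr16(seed)
    return [1 if next(rng) < threshold else 0 for _ in range(length)]


# Encode x = 0.25 over one full LFSR period and measure the density of ones.
stream = b2s(0.25, 65535)
density = sum(stream) / len(stream)
print(f"density of ones: {density:.5f}")
```

Over a full period the density lands within a fraction of a percent of x; shorter streams are noisier, which is the time-for-gates trade this page describes.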

02

Compute

probability does the math

For independent streams, P(A AND B) = P(A)·P(B), so an AND gate multiplies. A multiplexer whose select line is a stream of density 1/2 computes the scaled sum (a + b)/2, the average of its two inputs. A five-state machine bends a stream through tanh. The entire arithmetic vocabulary of shading and inference, in single-digit gate counts.
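The two gate identities above can be checked in a few lines. This sketch uses Python's seeded software random generator as a stand-in for independent hardware streams; the helper names and the operand values 0.5 and 0.4 are chosen only for illustration, and the tanh state machine is not modeled.

```python
import random


def bernoulli_stream(p, length, rng):
    """Return `length` independent bits, each 1 with probability p."""
    return [1 if rng.random() < p else 0 for _ in range(length)]


def density(stream):
    """Fraction of ones in a stream, i.e. the value it encodes."""
    return sum(stream) / len(stream)


rng = random.Random(1)   # fixed seed so the run is repeatable
L = 1 << 16              # stream length

a = bernoulli_stream(0.5, L, rng)
b = bernoulli_stream(0.4, L, rng)
sel = bernoulli_stream(0.5, L, rng)  # select line for the multiplexer

# One AND gate per bit pair: the output density approximates 0.5 * 0.4.
product = [x & y for x, y in zip(a, b)]

# One 2:1 multiplexer per bit pair: the output density approximates (0.5 + 0.4) / 2.
average = [x if s else y for x, y, s in zip(a, b, sel)]

print(f"AND  -> {density(product):.4f} (ideal 0.2000)")
print(f"MUX  -> {density(average):.4f} (ideal 0.4500)")
```

The identities only hold when the streams are independent: feeding the same stream into both AND inputs returns a, not a squared.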

03

Decode

count the ones

A counter (S2B) reads the result back: the fraction of ones over L cycles is the answer. For random streams the error shrinks as 1/√L; with the deterministic generator the result is exact at L = 2¹⁶. This counter's off-by-one once halved every result; our gate-level simulator caught it.
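To make the 1/√L behavior concrete, the sketch below decodes many random streams and compares the measured RMS error with the binomial standard deviation √(x(1−x)/L). It assumes ideal independent bits from a seeded software generator; the value x = 0.3, the trial count, and the function names are illustrative choices.

```python
import math
import random


def s2b(stream):
    """Stream-to-binary: count the ones and divide by the stream length."""
    return sum(stream) / len(stream)


def rms_error(x, length, trials, rng):
    """Root-mean-square decoding error over many independent random streams."""
    total = 0.0
    for _ in range(trials):
        stream = [1 if rng.random() < x else 0 for _ in range(length)]
        total += (s2b(stream) - x) ** 2
    return math.sqrt(total / trials)


rng = random.Random(7)  # fixed seed so the run is repeatable
x = 0.3
results = {}
for length in (256, 1024):
    measured = rms_error(x, length, trials=200, rng=rng)
    predicted = math.sqrt(x * (1 - x) / length)  # binomial standard deviation
    results[length] = (measured, predicted)
    print(f"L={length}: measured {measured:.4f}, predicted {predicted:.4f}")
```

Quadrupling L from 256 to 1024 halves the predicted error, which is why each extra bit of precision costs four times the cycles.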

04

Render

the pipeline around it

Command processor → SC vertex transform → binary rasterizer (exact coverage, made synthesizable with a per-triangle reciprocal) → SC fragment shading → Z-buffered framebuffer → VGA. Open Verilog, end to end.

The trade, quantified

Gates for cycles. Exactly this many.

What a compute unit costs
1,709 cells

4 multiply lanes + MAC + tanh + lerp — vs 1,912 cells for one binary MAC-4. Marginal extra multiply: 1 gate.

What a result costs
L cycles

29× longer per op at L=256 (~29 dB); 116× longer per op at L=1024 (~36.7 dB). This is the honest price of the one-gate multiply.

What the generator costs
7.5× smaller

Deterministic low-discrepancy generation vs the LFSR bank — and 15.4× less switching energy, with exact products at L=2¹⁶.
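One way to see why a deterministic scheme can give exact products at L = 2¹⁶ is sketched below. This is an assumption-laden illustration, not the released generator: it assumes 8-bit operands, uses two bit-reversed counters (a base-2 van der Corput sequence) as the comparison sources, and the function names are hypothetical.

```python
def bitrev8(n):
    """Reverse the 8 bits of n, turning a counter into a van der Corput sequence."""
    return int(f"{n:08b}"[::-1], 2)


def exact_product_ones(x8, y8):
    """Count the ones an AND gate produces over 2**16 deterministic cycles.

    x8 and y8 are 8-bit operands (0..256), representing x8/256 and y8/256.
    The fast counter (low byte) drives stream A; the slow counter (high byte)
    drives stream B, so every (A-phase, B-phase) pair occurs exactly once.
    """
    ones = 0
    for t in range(1 << 16):
        a = bitrev8(t & 0xFF) < x8   # stream A bit this cycle
        b = bitrev8(t >> 8) < y8     # stream B bit, held for 256 cycles
        ones += int(a and b)         # AND gate feeding the output counter
    return ones


ones = exact_product_ones(77, 200)
print(ones, ones / 65536, (77 / 256) * (200 / 256))
```

Because each of the 256 × 256 phase pairs is visited once, the count equals x8 · y8 exactly (15400 for the operands above), so the decoded density matches the true product with no sampling error.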

Every figure synthesized or simulated in the released flow. The quality–rate knob is a register: 68 fps rough, 4.5 fps fine, identical silicon. The full evidence →

Why graphics, of all things

Rendering forgives — and rewards — noise.

Human vision integrates noise over space and time; 8-bit color exposes only 256 levels; rasterization is already approximate. Rendering is the rare workload where SC's precision limit stops being a deficiency.

Better: the paper proves SC's quantization noise is spectrally white — exactly the dither that low-precision binary pipelines must add with extra hardware to avoid color banding. The wave's flaw is the display's feature.
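The dither effect itself is easy to reproduce in software. The sketch below is a generic illustration with uniform white noise, not the paper's SC noise model: it quantizes a smooth gradient to 8 levels with and without one LSB of dither, then averages neighboring samples as a crude stand-in for the eye's spatial integration. All constants and names are assumptions for this example.

```python
import math
import random

LEVELS = 8    # deliberately coarse output so banding is obvious
N = 4096      # samples along a smooth 0..1 gradient
WINDOW = 64   # box filter standing in for spatial integration by the eye


def quantize(v):
    """Round v to the nearest of LEVELS evenly spaced levels in [0, 1]."""
    top = LEVELS - 1
    index = min(top, max(0, math.floor(v * top + 0.5)))
    return index / top


def box_filter(signal, window):
    """Average non-overlapping windows of the signal."""
    return [sum(signal[i:i + window]) / window
            for i in range(0, len(signal), window)]


def mean_abs_error(signal, reference):
    """Mean absolute difference between two equally long sequences."""
    return sum(abs(s - r) for s, r in zip(signal, reference)) / len(reference)


rng = random.Random(3)  # fixed seed so the run is repeatable
lsb = 1 / (LEVELS - 1)
ramp = [i / (N - 1) for i in range(N)]

banded = [quantize(v) for v in ramp]
dithered = [quantize(v + (rng.random() - 0.5) * lsb) for v in ramp]

target = box_filter(ramp, WINDOW)
banded_err = mean_abs_error(box_filter(banded, WINDOW), target)
dithered_err = mean_abs_error(box_filter(dithered, WINDOW), target)
print(f"banded error {banded_err:.4f}, dithered error {dithered_err:.4f}")
```

After averaging, the dithered gradient tracks the original ramp far more closely than the staircase does, which is the effect the paragraph above attributes to white noise.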

See it dissolve banding live in the Signal Lab →