तरंग — wave
The GPU that
computes on
a wave.
Taranga builds StochastiCore: a complete, open, peer-revised graphics pipeline whose arithmetic rides probability instead of place-value. A number becomes the density of ones in a pulse train — and multiplication collapses to a single AND gate.
Honest by construction: stochastic computing is not a throughput or energy win per operation — and this site prints those numbers at full size, beside the ones where it shines.
The idea
Arithmetic, priced in single gates.
Binary arithmetic pays for precision with silicon: an 8-bit multiplier is ~320 gates, and accelerators are mostly oceans of them. Stochastic computing pays with time instead — encode values as random bitstreams and the laws of probability do the arithmetic: P(A∧B) = P(A)·P(B), so an AND gate is a multiplier.
That trade — gates for cycles — is wrong for a gaming GPU and exactly right for a different world: minimum-silicon displays, radiation-tolerant compute, voltage-overscaled edge devices, and a GPU cheap enough that a student can build one with their own hands.
| Operation | StochastiCore | Conventional binary |
|---|---|---|
| Multiply | 1 gate a single AND | ~320 gates 8-bit array multiplier |
| Scaled add | 1 gate a single MUX | ~40 gates 8-bit adder |
| 4-lane MAC | 7 cells 4 AND + 3 MUX | 1,912 cells synthesized binary MAC-4 |
| tanh | 52 cells 5-state FSM | LUT/CORDIC, hundreds typical implementations |
Synthesized counts from the released Yosys flow — reproducible from the public RTL. The fair fine print: an SC result needs L cycles to read out (L = 256–65,536), which is the entire trade. How it works →
The honest line
Where the wave wins — and where it loses.
- Area density: a full compute unit — 4 multipliers, MAC, tanh, lerp — in fewer cells than ONE binary MAC-4 (1,709 vs 1,912).
- Marginal multiply = 1 gate: parallelism priced like wire, not silicon.
- Graceful failure: above ~0.5% bit-error rate, SC compute out-degrades binary — a flipped bit is 1/L of an answer, never the MSB.
- Free anti-banding: SC's noise is provably white — low-precision gradients render smooth without dithering hardware.
- Not a throughput win: useful quality needs 29×–116× longer per operation; the binary baseline renders ~8× more frames.
- Not an energy win: gate-level measurement says 71 fJ/op binary vs 17.6k–69k fJ/op stochastic. Our own earlier model was wrong; the revision says so.
- Not a gaming GPU: the niche is minimum-silicon, reliability-critical, display-class compute — and we won't pretend otherwise.
The first product
A real GPU, for the price of a textbook.
The Taranga Dev Kit is the build guide made physical: a complete stochastic GPU on a $25–55 FPGA, driving a real VGA monitor, commanded over SPI by any microcontroller — assembled by you, understood by you, owned by you.
- Display
- 320×240 @ 60 Hz (VGA, pixel-doubled to 640×480)
- Color
- RGB332 — 256 colors via an R-2R resistor DAC
- Compute
- 4 parallel SC compute units (SIMD)
- Precision
- runtime knob: L = 256 (fast, noisy) → 65,536 (slow, near-exact)
- Host
- any microcontroller over SPI (Arduino, STM32)
- Footprint
- ~2,400 LUTs — ~30% of a $25–55 iCE40 HX8K board
Where it serves
Honest use cases.
Education & research kits
A GPU whose entire pipeline a student can read, synthesize, probe, and rebuild — multiplication is literally one AND gate you can point at. From Verilog to VGA glow in a weekend, on a fully open toolchain.
Radiation & harsh-environment compute
Above ~0.5% bit-error rate, stochastic arithmetic degrades gracefully where binary corrupts catastrophically — a single flipped bit is 1/L of an answer, not the MSB. A candidate fabric for CubeSats, high-altitude, and high-EMI industrial settings.
Minimum-silicon instrument displays
Gauges, medical monitors, and panel displays need smooth gradients, not teraflops. The SC fabric draws banding-free shading at 3-bit color — the dithering is free, by physics — in a corner of the cheapest FPGAs.
The Signal Lab
Multiply two numbers with one gate. Right now.
The Lab runs the same LFSR semantics as the RTL, live in your tab: watch streams converge, watch banding dissolve, and watch binary shatter under bit-flips while the wave bends.
Open the Signal Lab