KVInfer · 152M
KVInfer Studio
152M parameters · GPT-2 decoder-only · Custom C++ inference engine with AVX2 SIMD, OpenMP parallelism, and a session-persistent KV cache.
152M params
AVX2 SIMD
OpenMP
KV Cache
Streaming
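
To make the KV cache feature concrete, here is a minimal single-head sketch in C++. It is not taken from the KVInfer source: the `KVCache` struct, the `attend` function, and the flat row-major layout are all assumptions chosen for illustration. It shows the general idea only: each new token's key and value are appended once, and later queries attend over the stored rows instead of recomputing them.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical single-head KV cache. Names and layout are illustrative
// assumptions, not the engine's actual data structure.
struct KVCache {
    std::size_t head_dim;
    std::vector<float> keys;    // [n_tokens * head_dim], row-major
    std::vector<float> values;  // same layout as keys

    explicit KVCache(std::size_t d) : head_dim(d) {}

    // Number of tokens cached so far.
    std::size_t size() const { return keys.size() / head_dim; }

    // Append the key and value vectors of the newest token.
    void append(const std::vector<float>& k, const std::vector<float>& v) {
        assert(k.size() == head_dim && v.size() == head_dim);
        keys.insert(keys.end(), k.begin(), k.end());
        values.insert(values.end(), v.begin(), v.end());
    }
};

// Scaled dot-product attention of one query over every cached token.
std::vector<float> attend(const KVCache& cache, const std::vector<float>& q) {
    const std::size_t n = cache.size();
    const std::size_t d = cache.head_dim;
    std::vector<float> out(d, 0.0f);
    if (n == 0) return out;  // nothing cached yet

    std::vector<float> scores(n);
    const float scale = 1.0f / std::sqrt(static_cast<float>(d));

    // Each score is independent of the others, so this loop can be
    // parallelized when the file is compiled with OpenMP enabled.
#ifdef _OPENMP
#pragma omp parallel for
#endif
    for (std::ptrdiff_t t = 0; t < static_cast<std::ptrdiff_t>(n); ++t) {
        const std::size_t row = static_cast<std::size_t>(t) * d;
        float dot = 0.0f;
        for (std::size_t i = 0; i < d; ++i) dot += q[i] * cache.keys[row + i];
        scores[static_cast<std::size_t>(t)] = dot * scale;
    }

    // Numerically stable softmax over the scores.
    float max_score = scores[0];
    for (float s : scores) max_score = std::max(max_score, s);
    float denom = 0.0f;
    for (float& s : scores) {
        s = std::exp(s - max_score);
        denom += s;
    }

    // Weighted sum of the cached value rows.
    for (std::size_t t = 0; t < n; ++t) {
        const float w = scores[t] / denom;
        for (std::size_t i = 0; i < d; ++i) out[i] += w * cache.values[t * d + i];
    }
    return out;
}
```

In a decoding loop built on this sketch, each step would call `append` once and `attend` once, so the per-token cost grows linearly with the cached length. Keeping the `KVCache` object alive between requests is one plausible way to get session persistence. The inner dot-product loop is the kind of code that AVX2 intrinsics would typically replace.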