GPT
You can chat with a decentrally hosted, uncensored version of Llama 3.1 8B at chat.hypertensor.org. The model is validated in a subnet using the Decentralized Subnet Standard.
This is a beta version of the chat GPT; expect bugs.
Features
- Text generation
- Voice to text
- Text to voice
Performance
With capable GPUs, a fast internet connection, and nodes near your region, expect roughly 8-20 tokens/s.
| Model | GPU | Tokens Per Second |
| --- | --- | --- |
| Llama 3.1 8B | NVIDIA 3070 | 6-15 |
| Llama 3.1 8B | NVIDIA T4 | 4-8.5 |
| Llama 3.1 8B | NVIDIA 4090 | 7.1-20 |
Tokens per second can drop when many clients are using the subnet, or when there are no nodes in your region and latency increases. Because this is an unincentivized testnet, most nodes in the subnet won't be running high-performance hardware.
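If you want a rough sense of the throughput you are seeing from your own location, the sketch below times a streamed response and divides the token count by the elapsed time. The `token_stream` argument and the `fake_stream` generator are placeholders, since the chat interface's streaming API is not documented here; swap in whatever iterable yields tokens from a real response.

```python
import time
from typing import Iterable, Iterator


def tokens_per_second(token_stream: Iterable[str]) -> float:
    """Time any iterable that yields tokens as they arrive and return tokens/s.

    `token_stream` is a placeholder for the real streaming response; this
    sketch only measures the iteration itself.
    """
    start = time.monotonic()
    count = 0
    for _ in token_stream:
        count += 1
    elapsed = time.monotonic() - start
    return count / elapsed if elapsed > 0 else 0.0


def fake_stream(n: int = 50, delay: float = 0.1) -> Iterator[str]:
    """Stand-in generator that simulates roughly 10 tokens/s."""
    for i in range(n):
        time.sleep(delay)
        yield f"token_{i}"


if __name__ == "__main__":
    # Replace fake_stream() with the real streamed response to benchmark
    # the node path between you and the subnet.
    print(f"{tokens_per_second(fake_stream()):.1f} tokens/s")
```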