Troubleshooting
This page lists common errors and ways to address them.
Before starting your subnet validator node, ensure the following are true:
Run
nvidia-smi
This should display your GPU information.
Otherwise, you will need to install your GPU drivers using the following:
Run
python3 -c "import torch; print(torch.cuda.is_available())"
or
python -c "import torch; print(torch.cuda.is_available())"
This should print
True
Otherwise, reboot your server or ensure you have torch installed.
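The two checks above can be combined into one script. The following is a minimal sketch, not part of the official tooling: the function names check_nvidia_smi, check_torch_cuda and preflight are hypothetical, and it assumes that a zero exit code from nvidia-smi means the driver is working.

```python
import shutil
import subprocess


def check_nvidia_smi() -> bool:
    """Return True if nvidia-smi is on PATH and exits with code 0."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False
    try:
        result = subprocess.run([exe], capture_output=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def check_torch_cuda() -> bool:
    """Return True if torch is importable and reports a usable CUDA device."""
    try:
        import torch
    except ImportError:
        return False
    return bool(torch.cuda.is_available())


def preflight() -> dict:
    """Run both checks and return their results keyed by check name."""
    return {
        "nvidia_smi": check_nvidia_smi(),
        "torch_cuda": check_torch_cuda(),
    }
```

Calling print(preflight()) before starting the node shows which of the two checks, if any, is failing.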
I get this error on WSL:
hivemind.dht.protocol.ValidationError: local time must be within 3 seconds of others
What should I do?
All clocks on all nodes need to be synchronized. Please set the date using an NTP server:
sudo ntpdate pool.ntp.org
The server starts loading blocks and then prints:
Killed
What should I do?
This happens because Windows doesn't allocate much RAM to WSL by default, so the server gets OOM-killed.
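To see how much RAM WSL currently exposes to Linux, one option is to read /proc/meminfo from inside WSL. This is a minimal sketch: the helper names parse_meminfo and total_ram_gib are hypothetical, and it assumes the usual Linux layout where MemTotal is reported in kB.

```python
def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into {field: integer value}.

    Most fields are in kB; a few (e.g. HugePages_Total) are plain counts.
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def total_ram_gib(path: str = "/proc/meminfo") -> float:
    """Return the total RAM visible to this Linux environment, in GiB."""
    with open(path) as f:
        info = parse_meminfo(f.read())
    return info["MemTotal"] / 1024**2
```

Running total_ram_gib() before and after changing the WSL memory limit is one way to confirm that the new limit took effect.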
To increase the memory limit, go to C:/Users/username and create a .wslconfig file with these contents:
Then reboot WSL (run sudo reboot in the WSL console) and it should work fine.
I get this error:
torch.cuda.OutOfMemoryError: CUDA out of memory
What should I do?
If you use an Anaconda env, run this before starting the server:
If you use Docker, add this argument after --rm in the Docker command:
The WSL clock tends to get out of sync, which prevents the server from launching with the error:
hivemind.dht.protocol.ValidationError: local time must be within 3 seconds of others
To sync the WSL clock, run:
sudo ntpdate pool.ntp.org
See more fixes discussed on Stack Overflow.
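To estimate how far the local clock has drifted before launching the server, one can query an NTP server directly. The sketch below uses only the Python standard library. It is an approximation under stated assumptions: hivemind compares clocks between peers rather than against an NTP server, network delay is ignored, and the names parse_transmit_time, query_ntp and within_tolerance are hypothetical. The 3-second tolerance is taken from the error message.

```python
import socket
import struct

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_EPOCH_DELTA = 2208988800


def parse_transmit_time(packet: bytes) -> float:
    """Extract the server transmit timestamp (Unix seconds) from an NTP reply."""
    # The transmit timestamp occupies bytes 40..47:
    # 32-bit seconds since 1900 followed by a 32-bit binary fraction.
    seconds, fraction = struct.unpack("!II", packet[40:48])
    return seconds - NTP_EPOCH_DELTA + fraction / 2**32


def query_ntp(server: str = "pool.ntp.org", timeout: float = 5.0) -> float:
    """Send one SNTP client request and return the server time in Unix seconds."""
    request = b"\x1b" + 47 * b"\0"  # LI=0, version=3, mode=3 (client)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(request, (server, 123))
        packet, _ = sock.recvfrom(512)
    return parse_transmit_time(packet)


def within_tolerance(local: float, remote: float, tolerance: float = 3.0) -> bool:
    """Return True if the two timestamps differ by at most `tolerance` seconds."""
    return abs(local - remote) <= tolerance
```

For example, within_tolerance(time.time(), query_ntp()) (after import time) returns False when the local clock is more than 3 seconds off, which suggests running the ntpdate command above.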