Heartbeat Thread

Each node in a Hypertensor subnet periodically broadcasts a heartbeat to let the network know it is still active. This is a fundamental part of maintaining a healthy and synchronized subnet.

The Heartbeat Thread is one of the core background tasks that should be started by your Server class. It periodically publishes a DHT record containing the node’s ServerInfo, which includes important metadata such as its current status (Joining, Online, or Offline), identity, and role in the subnet.

What the Heartbeat Does

  • Signals liveness: Helps other nodes determine whether a peer is still online or has gone offline.

  • Communicates state: Each heartbeat includes a node’s status, typically one of:

    • Joining – Node has joined the subnet and is syncing or bootstrapping.

    • Online – Node is active and ready to serve requests.

    • Offline – Node is gracefully shutting down or stepping away.

  • Publishes metadata: Along with its status, each node includes a ServerInfo payload in the heartbeat, which can contain:

    • Node ID

    • Roles (e.g., validator, worker, standby)

    • Public key or peer ID

    • Version info or capabilities

    • Timestamp or epoch

How It Works

  • The Heartbeat Thread is a persistent background task started by the Server class.

  • It runs at a regular interval (e.g., every 10–30 seconds).

  • It creates or updates a DHT record under an agreed-upon key (such as the node's role in the subnet), keyed per node so each peer's heartbeat can be found and refreshed independently.

  • The record includes the signed ServerInfo and an expiration time, ensuring stale heartbeats are cleaned up automatically.

💡 You can use the DHT’s TTL (time-to-live) feature to control how long a heartbeat is considered valid. Other nodes can periodically traverse the DHT to prune expired or missing heartbeats.
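The loop described above can be sketched as a background thread. This is a minimal illustration, not the actual implementation: the `dht.store(key, subkey, value, expiration_time)` call mirrors a hivemind-style DHT API, and the class name, default interval, and TTL values are assumptions.

```python
import threading
import time


class HeartbeatThread(threading.Thread):
    """Periodically re-publishes this node's ServerInfo under a DHT key.

    Sketch only: `dht` is assumed to expose a hivemind-style
    store(key, subkey, value, expiration_time) method.
    """

    def __init__(self, dht, key, peer_id, server_info,
                 interval: float = 15.0, ttl: float = 60.0):
        super().__init__(daemon=True)
        self.dht = dht
        self.key = key                  # e.g., the node's role in the subnet
        self.peer_id = peer_id
        self.server_info = server_info  # object with a to_tuple() method
        self.interval = interval        # seconds between heartbeats
        self.ttl = ttl                  # how long a heartbeat stays valid
        self.stop_event = threading.Event()

    def run(self):
        while not self.stop_event.is_set():
            # Publish with an expiration time so stale heartbeats age out
            # of the DHT automatically.
            self.dht.store(
                key=self.key,
                subkey=self.peer_id,
                value=self.server_info.to_tuple(),
                expiration_time=time.time() + self.ttl,
            )
            # Sleep, but wake immediately if shutdown is requested.
            self.stop_event.wait(self.interval)

    def shutdown(self):
        self.stop_event.set()
```

Note that the TTL is deliberately longer than the interval, so a node survives a missed beat or two before other peers consider its record expired.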

Customizing ServerInfo

Every subnet may have different requirements for how nodes behave and coordinate. That’s why the ServerInfo structure is meant to be customized per subnet.

For example:

  • A training subnet might define roles like Trainer, Aggregator, Verifier.

  • A chat subnet might use roles like Responder, Router, Relay.

  • A subnet with a voting mechanism could embed stake, score, or proposal participation info.

This flexibility allows subnets to tailor node behavior while still using a standardized heartbeat mechanism.
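For instance, a training subnet's role enum might look like the following. The role names here are illustrative, not part of any standard:

```python
from enum import Enum


class TrainingRole(Enum):
    # Illustrative roles for a hypothetical training subnet;
    # each subnet defines its own.
    TRAINER = "trainer"        # runs training steps
    AGGREGATOR = "aggregator"  # combines gradients/updates
    VERIFIER = "verifier"      # audits other nodes' work
```

Because the role is carried as a plain string inside the heartbeat, other nodes can parse it back with `TrainingRole("trainer")` without any schema changes to the heartbeat mechanism itself.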

See Record Validator to learn how to enforce conditions on which keys can be stored in the DHT.

In this example, each node defines its possible states and roles, then bundles them into a ServerInfo dataclass that is published with every heartbeat:

import dataclasses
from enum import Enum
from typing import Dict, Optional, Tuple

import pydantic

class ServerState(Enum):
    OFFLINE = 0
    JOINING = 1
    ONLINE = 2

class ServerClass(Enum):
    """
    Make your own roles here
    """
    VALIDATOR = "validator"

RPS = pydantic.confloat(ge=0, allow_inf_nan=False, strict=True)

# Server node data to publish with each heartbeat
@pydantic.dataclasses.dataclass
class ServerInfo:
    state: ServerState
    role: ServerClass
    throughput: RPS

    public_name: Optional[str] = None
    version: Optional[str] = None

    inference_rps: Optional[RPS] = None # type: ignore

    using_relay: Optional[bool] = None
    next_pings: Optional[Dict[str, pydantic.confloat(ge=0, strict=True)]] = None # type: ignore

    def to_tuple(self) -> Tuple[int, str, float, dict]:
        extra_info = dataclasses.asdict(self)
        del extra_info["state"], extra_info["throughput"], extra_info["role"]
        return (self.state.value, self.role.value, self.throughput, extra_info)

    @classmethod
    def from_tuple(cls, source: tuple):
        if not isinstance(source, tuple):
            raise TypeError(f"Expected a tuple, got {type(source)}")
        state, role, throughput = source[:3]
        extra_info = source[3] if len(source) > 3 else {}
        # pydantic will validate existing fields and ignore extra ones
        return cls(state=ServerState(state), role=role, throughput=throughput, **extra_info)
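To make the tuple wire format concrete, here is a simplified round trip through `to_tuple` and `from_tuple`. This sketch uses plain dataclasses instead of pydantic and trims the field set, so the validation that pydantic performs in the code above is omitted:

```python
import dataclasses
from enum import Enum
from typing import Optional


class ServerState(Enum):
    OFFLINE = 0
    JOINING = 1
    ONLINE = 2


class ServerClass(Enum):
    VALIDATOR = "validator"


@dataclasses.dataclass
class ServerInfo:
    state: ServerState
    role: ServerClass
    throughput: float
    version: Optional[str] = None

    def to_tuple(self):
        # Everything except the three positional fields travels
        # in the trailing dict.
        extra = dataclasses.asdict(self)
        del extra["state"], extra["role"], extra["throughput"]
        return (self.state.value, self.role.value, self.throughput, extra)

    @classmethod
    def from_tuple(cls, source):
        state, role, throughput = source[:3]
        extra = source[3] if len(source) > 3 else {}
        return cls(state=ServerState(state), role=ServerClass(role),
                   throughput=throughput, **extra)


info = ServerInfo(ServerState.ONLINE, ServerClass.VALIDATOR, 100.0, version="1.0")
packed = info.to_tuple()       # (2, "validator", 100.0, {"version": "1.0"})
restored = ServerInfo.from_tuple(packed)
```

The packed tuple is what actually lands in the DHT record: two enum values, the throughput, and a dict of everything else, which is why new optional fields can be added without breaking older readers.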

In the Inference Subnet, the roles are HOSTER and VALIDATOR.
