<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>The TokTok Project - Protocol</title>
<meta name="description" content="This document is a textual specification of the Tox protocol and all the supporting modules required to implement it.">
<link rel="canonical" href="https://toktok.ltd/spec.html">
<link href="https://use.fontawesome.com/releases/v5.0.6/css/all.css" rel="stylesheet">
<link rel="stylesheet" href="static/css/bootstrap.css" media="screen">
<link rel="stylesheet" href="static/css/style.css" media="screen">
</head>
<body>
<nav class="navbar">
<div class="container-fluid limit-width">
<div class="navbar-header">
<a class="navbar-brand" href="index.html">
<span class="fab fa-earlybirds"></span>
<span class="text">TokTok</span>
</a>
<input type="checkbox" id="hambox" class="navbar-toggle collapsed">
<label id="hamlabel" class="navbar-toggle collapsed fas fa-bars" for="hambox"></label>
<div class="tinynav navbar-toggle collapsed">
<ul class="nav navbar-nav">
<li>
<a href="mission.html">
<span class="fas fa-align-left">&nbsp;</span>
Mission
</a>
</li>
<li>
<a href="pulls.html">
<span class="fas fa-code-branch">&nbsp;</span>
Pull Requests
</a>
</li>
<li class="current">
<a href="spec.html">
<span class="fas fa-book">&nbsp;</span>
Protocol
</a>
</li>
<li>
<a href="documents.html">
<span class="fas fa-file-alt">&nbsp;</span>
Documents
</a>
</li>
<li>
<a href="integrations.html">
<span class="fas fa-list-ul">&nbsp;</span>
Integrations
</a>
</li>
<li>
<a href="https://tox.chat">
<span class="fas fa-lock">&nbsp;</span>
Tox
</a>
</li>
</ul>
</div>
</div>
<div id="navbar" class="navbar-collapse collapse">
<ul class="nav navbar-nav navbar-right">
<li>
<a href="mission.html">
<span class="fas fa-align-left">&nbsp;</span>
Mission
</a>
</li>
<li>
<a href="pulls.html">
<span class="fas fa-code-branch">&nbsp;</span>
Pull Requests
</a>
</li>
<li class="current">
<a href="spec.html">
<span class="fas fa-book">&nbsp;</span>
Protocol
</a>
</li>
<li>
<a href="documents.html">
<span class="fas fa-file-alt">&nbsp;</span>
Documents
</a>
</li>
<li>
<a href="integrations.html">
<span class="fas fa-list-ul">&nbsp;</span>
Integrations
</a>
</li>
<li>
<a href="https://tox.chat">
<span class="fas fa-lock">&nbsp;</span>
Tox
</a>
</li>
</ul>
</div>
</div>
</nav>
<section id="content">
<div class="container-fluid limit-width lead">
<ul id="markdown-toc">
<li><a href="#introduction" id="markdown-toc-introduction">Introduction</a> <ul>
<li><a href="#objectives" id="markdown-toc-objectives">Objectives</a> <ul>
<li><a href="#goals" id="markdown-toc-goals">Goals</a></li>
<li><a href="#non-goals" id="markdown-toc-non-goals">Non-goals</a></li>
</ul>
</li>
<li><a href="#threat-model" id="markdown-toc-threat-model">Threat model</a></li>
<li><a href="#data-types" id="markdown-toc-data-types">Data types</a></li>
<li><a href="#integers" id="markdown-toc-integers">Integers</a></li>
<li><a href="#strings" id="markdown-toc-strings">Strings</a></li>
</ul>
</li>
<li><a href="#crypto" id="markdown-toc-crypto">Crypto</a> <ul>
<li><a href="#key" id="markdown-toc-key">Key</a> <ul>
<li><a href="#key-pair" id="markdown-toc-key-pair">Key Pair</a></li>
<li><a href="#combined-key" id="markdown-toc-combined-key">Combined Key</a></li>
<li><a href="#nonce" id="markdown-toc-nonce">Nonce</a></li>
</ul>
</li>
<li><a href="#box" id="markdown-toc-box">Box</a></li>
</ul>
</li>
<li><a href="#node-info" id="markdown-toc-node-info">Node Info</a> <ul>
<li><a href="#transport-protocol" id="markdown-toc-transport-protocol">Transport Protocol</a></li>
<li><a href="#host-address" id="markdown-toc-host-address">Host Address</a></li>
<li><a href="#port-number" id="markdown-toc-port-number">Port Number</a></li>
<li><a href="#socket-address" id="markdown-toc-socket-address">Socket Address</a></li>
<li><a href="#node-info-packed-node-format" id="markdown-toc-node-info-packed-node-format">Node Info (packed node format)</a></li>
</ul>
</li>
<li><a href="#protocol-packet" id="markdown-toc-protocol-packet">Protocol Packet</a> <ul>
<li><a href="#packet-kind" id="markdown-toc-packet-kind">Packet Kind</a></li>
</ul>
</li>
<li><a href="#dht" id="markdown-toc-dht">DHT</a> <ul>
<li><a href="#distance" id="markdown-toc-distance">Distance</a></li>
<li><a href="#client-lists" id="markdown-toc-client-lists">Client Lists</a></li>
<li><a href="#k-buckets" id="markdown-toc-k-buckets">K-buckets</a> <ul>
<li><a href="#bucket-index" id="markdown-toc-bucket-index">Bucket Index</a></li>
<li><a href="#manipulating-k-buckets" id="markdown-toc-manipulating-k-buckets">Manipulating k-buckets</a></li>
</ul>
</li>
<li><a href="#dht-node-state" id="markdown-toc-dht-node-state">DHT node state</a> <ul>
<li><a href="#dht-search-entry" id="markdown-toc-dht-search-entry">DHT Search Entry</a></li>
<li><a href="#manipulating-the-dht-node-state" id="markdown-toc-manipulating-the-dht-node-state">Manipulating the DHT node state</a></li>
</ul>
</li>
<li><a href="#self-organisation" id="markdown-toc-self-organisation">Self-organisation</a></li>
<li><a href="#dht-packet" id="markdown-toc-dht-packet">DHT Packet</a></li>
<li><a href="#rpc-services" id="markdown-toc-rpc-services">RPC Services</a> <ul>
<li><a href="#replies-to-rpc-requests" id="markdown-toc-replies-to-rpc-requests">Replies to RPC requests</a></li>
<li><a href="#ping-service" id="markdown-toc-ping-service">Ping Service</a> <ul>
<li><a href="#ping-request-0x00" id="markdown-toc-ping-request-0x00">Ping Request (0x00)</a></li>
<li><a href="#ping-response-0x01" id="markdown-toc-ping-response-0x01">Ping Response (0x01)</a></li>
</ul>
</li>
<li><a href="#nodes-service" id="markdown-toc-nodes-service">Nodes Service</a> <ul>
<li><a href="#nodes-request-0x02" id="markdown-toc-nodes-request-0x02">Nodes Request (0x02)</a></li>
<li><a href="#nodes-response-0x04" id="markdown-toc-nodes-response-0x04">Nodes Response (0x04)</a></li>
</ul>
</li>
</ul>
</li>
<li><a href="#dht-operation" id="markdown-toc-dht-operation">DHT Operation</a> <ul>
<li><a href="#dht-initialisation" id="markdown-toc-dht-initialisation">DHT Initialisation</a></li>
<li><a href="#periodic-sending-of-nodes-requests" id="markdown-toc-periodic-sending-of-nodes-requests">Periodic sending of Nodes Requests</a></li>
<li><a href="#handling-nodes-response-packets" id="markdown-toc-handling-nodes-response-packets">Handling Nodes Response packets</a></li>
<li><a href="#handling-nodes-request-packets" id="markdown-toc-handling-nodes-request-packets">Handling Nodes Request packets</a></li>
<li><a href="#handling-ping-request-packets" id="markdown-toc-handling-ping-request-packets">Handling Ping Request packets</a></li>
<li><a href="#handling-ping-response-packets" id="markdown-toc-handling-ping-response-packets">Handling Ping Response packets</a></li>
<li><a href="#sending-ping-requests" id="markdown-toc-sending-ping-requests">Sending Ping Requests</a></li>
</ul>
</li>
<li><a href="#dht-request-packets" id="markdown-toc-dht-request-packets">DHT Request Packets</a> <ul>
<li><a href="#handling-dht-request-packets" id="markdown-toc-handling-dht-request-packets">Handling DHT Request packets</a></li>
<li><a href="#nat-ping-packets" id="markdown-toc-nat-ping-packets">NAT ping packets</a> <ul>
<li><a href="#nat-ping-request" id="markdown-toc-nat-ping-request">NAT ping request</a></li>
<li><a href="#nat-ping-response" id="markdown-toc-nat-ping-response">NAT ping response</a></li>
</ul>
</li>
<li><a href="#effects-of-chosen-constants-on-performance" id="markdown-toc-effects-of-chosen-constants-on-performance">Effects of chosen constants on performance</a></li>
</ul>
</li>
<li><a href="#nats" id="markdown-toc-nats">NATs</a></li>
<li><a href="#hole-punching" id="markdown-toc-hole-punching">Hole punching</a></li>
<li><a href="#dht-bootstrap-info-0xf0" id="markdown-toc-dht-bootstrap-info-0xf0">DHT Bootstrap Info (0xf0)</a></li>
</ul>
</li>
<li><a href="#lan-discovery" id="markdown-toc-lan-discovery">LAN discovery</a></li>
<li><a href="#messenger" id="markdown-toc-messenger">Messenger</a> <ul>
<li><a href="#online" id="markdown-toc-online"><code class="language-plaintext highlighter-rouge">ONLINE</code></a></li>
<li><a href="#offline" id="markdown-toc-offline"><code class="language-plaintext highlighter-rouge">OFFLINE</code></a></li>
<li><a href="#nickname" id="markdown-toc-nickname"><code class="language-plaintext highlighter-rouge">NICKNAME</code></a></li>
<li><a href="#statusmessage" id="markdown-toc-statusmessage"><code class="language-plaintext highlighter-rouge">STATUSMESSAGE</code></a></li>
<li><a href="#userstatus" id="markdown-toc-userstatus"><code class="language-plaintext highlighter-rouge">USERSTATUS</code></a></li>
<li><a href="#typing" id="markdown-toc-typing"><code class="language-plaintext highlighter-rouge">TYPING</code></a></li>
<li><a href="#message" id="markdown-toc-message"><code class="language-plaintext highlighter-rouge">MESSAGE</code></a></li>
<li><a href="#action" id="markdown-toc-action"><code class="language-plaintext highlighter-rouge">ACTION</code></a></li>
<li><a href="#msi" id="markdown-toc-msi"><code class="language-plaintext highlighter-rouge">MSI</code></a></li>
<li><a href="#file-transfer-related-packets" id="markdown-toc-file-transfer-related-packets">File Transfer Related Packets</a> <ul>
<li><a href="#file_sendrequest" id="markdown-toc-file_sendrequest"><code class="language-plaintext highlighter-rouge">FILE_SENDREQUEST</code></a></li>
<li><a href="#file_control" id="markdown-toc-file_control"><code class="language-plaintext highlighter-rouge">FILE_CONTROL</code></a></li>
<li><a href="#file_data" id="markdown-toc-file_data"><code class="language-plaintext highlighter-rouge">FILE_DATA</code></a></li>
</ul>
</li>
<li><a href="#group-chat-related-packets" id="markdown-toc-group-chat-related-packets">Group Chat Related Packets</a></li>
</ul>
</li>
<li><a href="#tcp-client" id="markdown-toc-tcp-client">TCP client</a></li>
<li><a href="#tcp-connections" id="markdown-toc-tcp-connections">TCP connections</a></li>
<li><a href="#tcp-server" id="markdown-toc-tcp-server">TCP server</a> <ul>
<li><a href="#encrypted-payload-types" id="markdown-toc-encrypted-payload-types">Encrypted payload types</a> <ul>
<li><a href="#routing-request-0x00" id="markdown-toc-routing-request-0x00">Routing request (0x00)</a></li>
<li><a href="#routing-request-response-0x01" id="markdown-toc-routing-request-response-0x01">Routing request response (0x01)</a></li>
<li><a href="#connect-notification-0x02" id="markdown-toc-connect-notification-0x02">Connect notification (0x02)</a></li>
<li><a href="#disconnect-notification-0x03" id="markdown-toc-disconnect-notification-0x03">Disconnect notification (0x03)</a></li>
<li><a href="#ping-packet-0x04" id="markdown-toc-ping-packet-0x04">Ping packet (0x04)</a></li>
<li><a href="#ping-response-pong-0x05" id="markdown-toc-ping-response-pong-0x05">Ping response (pong) (0x05)</a></li>
<li><a href="#oob-send-0x06" id="markdown-toc-oob-send-0x06">OOB send (0x06)</a></li>
<li><a href="#oob-recv-0x07" id="markdown-toc-oob-recv-0x07">OOB recv (0x07)</a></li>
<li><a href="#onion-packet-0x08" id="markdown-toc-onion-packet-0x08">Onion packet (0x08)</a></li>
<li><a href="#onion-packet-response-0x09" id="markdown-toc-onion-packet-response-0x09">Onion packet response (0x09)</a></li>
<li><a href="#data-0x10-and-up" id="markdown-toc-data-0x10-and-up">Data (0x10 and up)</a></li>
</ul>
</li>
</ul>
</li>
<li><a href="#friend-connection" id="markdown-toc-friend-connection">Friend connection</a></li>
<li><a href="#friend-requests" id="markdown-toc-friend-requests">Friend requests</a></li>
<li><a href="#group" id="markdown-toc-group">Group</a> <ul>
<li><a href="#message-ids" id="markdown-toc-message-ids">Message ids</a> <ul>
<li><a href="#ping-0x00" id="markdown-toc-ping-0x00">ping (0x00)</a></li>
<li><a href="#new_peer-0x10" id="markdown-toc-new_peer-0x10"><code class="language-plaintext highlighter-rouge">new_peer</code> (0x10)</a></li>
<li><a href="#kill_peer-0x11" id="markdown-toc-kill_peer-0x11"><code class="language-plaintext highlighter-rouge">kill_peer</code> (0x11)</a></li>
<li><a href="#freeze_peer-0x12" id="markdown-toc-freeze_peer-0x12"><code class="language-plaintext highlighter-rouge">freeze_peer</code> (0x12)</a></li>
<li><a href="#name-change-0x30" id="markdown-toc-name-change-0x30">Name change (0x30)</a></li>
<li><a href="#groupchat-title-change-0x31" id="markdown-toc-groupchat-title-change-0x31">Groupchat title change (0x31)</a></li>
<li><a href="#chat-message-0x40" id="markdown-toc-chat-message-0x40">Chat message (0x40)</a></li>
<li><a href="#action-me-0x41" id="markdown-toc-action-me-0x41">Action (/me) (0x41)</a></li>
</ul>
</li>
<li><a href="#timeouts-and-reconnection" id="markdown-toc-timeouts-and-reconnection">Timeouts and reconnection</a></li>
</ul>
</li>
<li><a href="#dht-group-chats" id="markdown-toc-dht-group-chats">DHT Group Chats</a> <ul>
<li><a href="#features" id="markdown-toc-features">Features</a></li>
<li><a href="#group-roles" id="markdown-toc-group-roles">Group roles</a></li>
<li><a href="#group-types" id="markdown-toc-group-types">Group types</a> <ul>
<li><a href="#public" id="markdown-toc-public">Public</a></li>
<li><a href="#private" id="markdown-toc-private">Private</a></li>
</ul>
</li>
<li><a href="#voice-state" id="markdown-toc-voice-state">Voice state</a></li>
<li><a href="#cryptography" id="markdown-toc-cryptography">Cryptography</a> <ul>
<li><a href="#permanent-keypairs" id="markdown-toc-permanent-keypairs">Permanent keypairs</a></li>
<li><a href="#session-keypairshared-symmetric-key" id="markdown-toc-session-keypairshared-symmetric-key">Session keypair/shared symmetric key</a></li>
<li><a href="#group-keypairs" id="markdown-toc-group-keypairs">Group keypairs</a></li>
</ul>
</li>
<li><a href="#founders" id="markdown-toc-founders">Founders</a> <ul>
<li><a href="#shared-state" id="markdown-toc-shared-state">Shared state</a></li>
<li><a href="#moderation" id="markdown-toc-moderation">Moderation</a></li>
<li><a href="#kicks" id="markdown-toc-kicks">Kicks</a></li>
<li><a href="#moderator-list" id="markdown-toc-moderator-list">Moderator list</a></li>
<li><a href="#sanctions-list" id="markdown-toc-sanctions-list">Sanctions list</a></li>
</ul>
</li>
<li><a href="#topics" id="markdown-toc-topics">Topics</a></li>
<li><a href="#state-syncing" id="markdown-toc-state-syncing">State syncing</a></li>
<li><a href="#dht-announcements" id="markdown-toc-dht-announcements">DHT Announcements</a></li>
</ul>
</li>
<li><a href="#dht-group-chats-packet-protocols" id="markdown-toc-dht-group-chats-packet-protocols">DHT Group Chats Packet Protocols</a> <ul>
<li><a href="#full-packet-structure" id="markdown-toc-full-packet-structure">Full Packet Structure</a> <ul>
<li><a href="#plaintext-header" id="markdown-toc-plaintext-header">Plaintext header</a></li>
<li><a href="#encrypted-header" id="markdown-toc-encrypted-header">Encrypted header</a></li>
<li><a href="#encrypted-payload" id="markdown-toc-encrypted-payload">Encrypted payload</a></li>
</ul>
</li>
<li><a href="#handshake-packet-payloads" id="markdown-toc-handshake-packet-payloads">Handshake packet payloads</a> <ul>
<li><a href="#handshake_request-0x00-and-handshake_response-0x01" id="markdown-toc-handshake_request-0x00-and-handshake_response-0x01">HANDSHAKE_REQUEST (0x00) and HANDSHAKE_RESPONSE (0x01)</a></li>
</ul>
</li>
<li><a href="#lossy-packet-payloads" id="markdown-toc-lossy-packet-payloads">Lossy Packet Payloads</a> <ul>
<li><a href="#ping-0x01" id="markdown-toc-ping-0x01">PING (0x01)</a></li>
<li><a href="#message_ack-0x02" id="markdown-toc-message_ack-0x02">MESSAGE_ACK (0x02)</a></li>
<li><a href="#invite_response_reject-0x03" id="markdown-toc-invite_response_reject-0x03">INVITE_RESPONSE_REJECT (0x03)</a></li>
</ul>
</li>
<li><a href="#lossless-packet-payloads" id="markdown-toc-lossless-packet-payloads">Lossless Packet Payloads</a> <ul>
<li><a href="#fragment-0xef" id="markdown-toc-fragment-0xef">FRAGMENT (0xef)</a></li>
<li><a href="#key_rotations-0xf0" id="markdown-toc-key_rotations-0xf0">KEY_ROTATIONS (0xf0)</a></li>
<li><a href="#tcp_relays-0xf1" id="markdown-toc-tcp_relays-0xf1">TCP_RELAYS (0xf1)</a></li>
<li><a href="#custom_packets-0xf2" id="markdown-toc-custom_packets-0xf2">CUSTOM_PACKETS (0xf2)</a></li>
<li><a href="#broadcast-0xf3" id="markdown-toc-broadcast-0xf3">BROADCAST (0xf3)</a> <ul>
<li><a href="#status-0x00" id="markdown-toc-status-0x00">STATUS (0x00)</a></li>
<li><a href="#nick-0x01" id="markdown-toc-nick-0x01">NICK (0x01)</a></li>
<li><a href="#plain_message-0x02" id="markdown-toc-plain_message-0x02">PLAIN_MESSAGE (0x02)</a></li>
<li><a href="#action_message-0x03" id="markdown-toc-action_message-0x03">ACTION_MESSAGE (0x03)</a></li>
<li><a href="#private_message-0x04" id="markdown-toc-private_message-0x04">PRIVATE_MESSAGE (0x04)</a></li>
<li><a href="#peer_exit-0x05" id="markdown-toc-peer_exit-0x05">PEER_EXIT (0x05)</a></li>
<li><a href="#peer_kick-0x06" id="markdown-toc-peer_kick-0x06">PEER_KICK (0x06)</a></li>
<li><a href="#set_mod-0x07" id="markdown-toc-set_mod-0x07">SET_MOD (0x07)</a></li>
<li><a href="#set_observer-0x08" id="markdown-toc-set_observer-0x08">SET_OBSERVER (0x08)</a></li>
</ul>
</li>
<li><a href="#peer_info_request-0xf4" id="markdown-toc-peer_info_request-0xf4">PEER_INFO_REQUEST (0xf4)</a></li>
<li><a href="#peer_info_response-0xf5" id="markdown-toc-peer_info_response-0xf5">PEER_INFO_RESPONSE (0xf5)</a></li>
<li><a href="#invite_request-0xf6" id="markdown-toc-invite_request-0xf6">INVITE_REQUEST (0xf6)</a></li>
<li><a href="#invite_response-0xf7" id="markdown-toc-invite_response-0xf7">INVITE_RESPONSE (0xf7)</a></li>
<li><a href="#sync_request-0xf8" id="markdown-toc-sync_request-0xf8">SYNC_REQUEST (0xf8)</a></li>
<li><a href="#sync_response-0xf9" id="markdown-toc-sync_response-0xf9">SYNC_RESPONSE (0xf9)</a></li>
<li><a href="#topic-0xfa" id="markdown-toc-topic-0xfa">TOPIC (0xfa)</a></li>
<li><a href="#shared_state-0xfb" id="markdown-toc-shared_state-0xfb">SHARED_STATE (0xfb)</a></li>
<li><a href="#mod_list-0xfc" id="markdown-toc-mod_list-0xfc">MOD_LIST (0xfc)</a></li>
<li><a href="#sanctions_list-0xfd" id="markdown-toc-sanctions_list-0xfd">SANCTIONS_LIST (0xfd)</a> <ul>
<li><a href="#sanctions-list-entry" id="markdown-toc-sanctions-list-entry">Sanctions List Entry</a></li>
<li><a href="#sanctions-credentials" id="markdown-toc-sanctions-credentials">Sanctions Credentials</a></li>
</ul>
</li>
<li><a href="#friend_invite-0xfe" id="markdown-toc-friend_invite-0xfe">FRIEND_INVITE (0xfe)</a></li>
<li><a href="#hs_response_ack-0xff" id="markdown-toc-hs_response_ack-0xff">HS_RESPONSE_ACK (0xff)</a></li>
</ul>
</li>
</ul>
</li>
<li><a href="#net-crypto" id="markdown-toc-net-crypto">Net crypto</a></li>
<li><a href="#networktxt" id="markdown-toc-networktxt">network.txt</a></li>
<li><a href="#onion" id="markdown-toc-onion">Onion</a></li>
<li><a href="#ping-array" id="markdown-toc-ping-array">Ping array</a></li>
<li><a href="#state-format" id="markdown-toc-state-format">State Format</a> <ul>
<li><a href="#sections" id="markdown-toc-sections">Sections</a> <ul>
<li><a href="#nospam-and-keys-0x01" id="markdown-toc-nospam-and-keys-0x01">Nospam and Keys (0x01)</a></li>
<li><a href="#dht-0x02" id="markdown-toc-dht-0x02">DHT (0x02)</a> <ul>
<li><a href="#dht-sections" id="markdown-toc-dht-sections">DHT Sections</a> <ul>
<li><a href="#nodes-0x04" id="markdown-toc-nodes-0x04">Nodes (0x04)</a></li>
</ul>
</li>
</ul>
</li>
<li><a href="#friends-0x03" id="markdown-toc-friends-0x03">Friends (0x03)</a></li>
<li><a href="#name-0x04" id="markdown-toc-name-0x04">Name (0x04)</a></li>
<li><a href="#status-message-0x05" id="markdown-toc-status-message-0x05">Status Message (0x05)</a></li>
<li><a href="#status-0x06" id="markdown-toc-status-0x06">Status (0x06)</a></li>
<li><a href="#tcp-relays-0x0a" id="markdown-toc-tcp-relays-0x0a">Tcp Relays (0x0A)</a></li>
<li><a href="#path-nodes-0x0b" id="markdown-toc-path-nodes-0x0b">Path Nodes (0x0B)</a></li>
<li><a href="#conferences-0x14" id="markdown-toc-conferences-0x14">Conferences (0x14)</a></li>
<li><a href="#eof-0xff" id="markdown-toc-eof-0xff">EOF (0xFF)</a></li>
</ul>
</li>
</ul>
</li>
</ul>
<!-- Leave a blank line under this or the following header will break. -->
<h2 id="introduction">Introduction</h2>
<p>This document is a textual specification of the Tox protocol and all the
supporting modules required to implement it. The goal of this document
is to give enough guidance to permit a complete and correct
implementation of the protocol.</p>
<h3 id="objectives">Objectives</h3>
<p>This section provides an overview of goals and non-goals of Tox. It
provides the reader with:</p>
<ul>
<li>
<p>a basic understanding of what problems Tox intends to solve;</p>
</li>
<li>
<p>a means to validate whether those problems are indeed solved by the
protocol as specified;</p>
</li>
<li>
<p>the ability to make better tradeoffs and decisions in their own
reimplementation of the protocol.</p>
</li>
</ul>
<h4 id="goals">Goals</h4>
<ul>
<li>
<p><strong>Authentication:</strong> Tox aims to provide authenticated communication.
This means that during a communication session, both parties can be
sure of the other party's identity. Users are identified by their
public key. The initial key exchange is currently not in scope for
the Tox protocol. In the future, Tox may provide a means for initial
authentication using a challenge/response or shared secret based
exchange.</p>
<p>If the secret key is compromised, the user's identity is
compromised, and an attacker can impersonate that user. When this
happens, the user must create a new identity with a new public key.</p>
</li>
<li>
<p><strong>End-to-end encryption:</strong> The Tox protocol establishes end-to-end
encrypted communication links. Shared keys are deterministically
derived using a Diffie-Hellman-like method, so keys are never
transferred over the network.</p>
</li>
<li>
<p><strong>Forward secrecy</strong>: Session keys are re-negotiated when the peer
connection is established.</p>
</li>
<li>
<p><strong>Privacy</strong>: When Tox establishes a communication link, it aims to
avoid leaking to any third party the identities of the parties
involved (i.e. their public keys).</p>
<p>Furthermore, it aims to avoid allowing third parties to determine
the IP address of a given user.</p>
</li>
<li>
<p><strong>Resilience:</strong></p>
<ul>
<li>
<p>Independence of infrastructure: Tox avoids relying on servers as
much as possible. Communications are not transmitted via or
stored on central servers. Joining a Tox network requires
connecting to a well-known node called a bootstrap node. Anyone
can run a bootstrap node, and users need not put any trust in
them.</p>
</li>
<li>
<p>Tox tries to establish communication paths in difficult network
situations. This includes connecting to peers behind a NAT or
firewall. Various techniques help achieve this, such as UDP
hole-punching, UPnP, NAT-PMP, other untrusted nodes acting as
relays, and DNS tunnels.</p>
</li>
<li>
<p>Resistance to basic denial of service attacks: short timeouts
make the network dynamic and resilient against poisoning
attempts.</p>
</li>
</ul>
</li>
<li>
<p><strong>Minimum configuration:</strong> Tox aims to be nearly zero-conf.
User-friendliness is an important aspect of security. Tox aims to
make security easy to achieve for average users.</p>
</li>
</ul>
<h4 id="non-goals">Non-goals</h4>
<ul>
<li>
<p><strong>Anonymity</strong> is not in scope for the Tox protocol itself, but it
provides an easy way to integrate with software providing anonymity,
such as Tor.</p>
<p>By default, Tox tries to establish direct connections between peers;
as a consequence, each is aware of the other's IP address, and third
parties may be able to determine that a connection has been
established between those IP addresses. One of the reasons for
making direct connections is that relaying real-time multimedia
conversations over anonymity networks is not feasible with the
current network infrastructure.</p>
</li>
</ul>
<h3 id="threat-model">Threat model</h3>
<p>TODO(iphydf): Define one.</p>
<h3 id="data-types">Data types</h3>
<p>All data types are defined before their first use, and their binary
protocol representation is given. The protocol representations are
normative and must be implemented exactly as specified. For some types,
human-readable representations are suggested. An implementation may
choose to provide no such representation or a different one. The
implementation is free to choose any in-memory representation of the
specified types.</p>
<p>Binary formats are specified in tables with length, type, and content
descriptions. If applicable, specific enumeration types are used, so
types may be self-explanatory in some cases. The length can be either a
fixed number in bytes (e.g. <code class="language-plaintext highlighter-rouge">32</code>), a number in bits (e.g. <code class="language-plaintext highlighter-rouge">7</code> bit), a
choice of lengths (e.g. <code class="language-plaintext highlighter-rouge">4 | 16</code>), or an inclusive range (e.g.
<code class="language-plaintext highlighter-rouge">[0, 100]</code>). Open ranges are denoted <code class="language-plaintext highlighter-rouge">[n,]</code> to mean a minimum length of
<code class="language-plaintext highlighter-rouge">n</code> with no specified maximum length.</p>
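<p>The length notation can be checked mechanically. The following is a
minimal sketch, assuming lengths are given in bytes and written exactly
as in the examples above; the helper name <code class="language-plaintext highlighter-rouge">length_matches</code> is
hypothetical and not part of the protocol. Bit lengths are left out.</p>

```python
def length_matches(spec, n):
    """Check a byte length n against a length spec written as in the tables.

    Supported forms: "32" (fixed), "4 | 16" (choice), "[0, 100]" (inclusive
    range) and "[n,]" such as "[1,]" (minimum only). Bit lengths are skipped.
    """
    spec = spec.strip()
    if spec.startswith("["):
        # Range form: split the text between the brackets at the comma.
        low, high = (part.strip() for part in spec[1:-1].split(","))
        if high == "":
            # Open range: only a minimum length is specified.
            return n >= int(low)
        return n in range(int(low), int(high) + 1)
    # Fixed length or choice of lengths separated by "|".
    return n in {int(choice) for choice in spec.split("|")}
```

<p>With this helper, <code class="language-plaintext highlighter-rouge">length_matches("4 | 16", 16)</code> holds, while
<code class="language-plaintext highlighter-rouge">length_matches("[0, 100]", 101)</code> does not.</p>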
<h3 id="integers">Integers</h3>
<p>The protocol uses four bounded unsigned integer types. Bounded means
they have an upper bound beyond which incrementing is not defined. The
integer types support modular arithmetic, so overflow wraps around to
zero. Unsigned means their lower bound is 0. Signed integer types are
not used. The binary encoding of all integer types is a fixed-width byte
sequence with the integer encoded in <a href="https://en.wikipedia.org/wiki/Endianness">Big
Endian</a> unless stated
otherwise.</p>
<table>
<thead>
<tr>
<th style="text-align: left">Type name</th>
<th style="text-align: left">C type</th>
<th style="text-align: left">Length</th>
<th style="text-align: left">Upper bound</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left">Word8</td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code></td>
<td style="text-align: left">1</td>
<td style="text-align: left">255 (0xff)</td>
</tr>
<tr>
<td style="text-align: left">Word16</td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint16_t</code></td>
<td style="text-align: left">2</td>
<td style="text-align: left">65535 (0xffff)</td>
</tr>
<tr>
<td style="text-align: left">Word32</td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint32_t</code></td>
<td style="text-align: left">4</td>
<td style="text-align: left">4294967295 (0xffffffff)</td>
</tr>
<tr>
<td style="text-align: left">Word64</td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint64_t</code></td>
<td style="text-align: left">8</td>
<td style="text-align: left">18446744073709551615 (0xffffffffffffffff)</td>
</tr>
</tbody>
</table>
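<p>As an illustration of the encoding described above, the sketch below
packs and unpacks the four integer types as fixed-width Big Endian byte
sequences. The names <code class="language-plaintext highlighter-rouge">WIDTHS</code>, <code class="language-plaintext highlighter-rouge">encode_word</code>, and
<code class="language-plaintext highlighter-rouge">decode_word</code> are hypothetical helpers chosen for this example; an
implementation is free to structure this differently.</p>

```python
# Illustrative sketch: Word8..Word64 as fixed-width Big Endian byte sequences.
# Lengths in bytes, taken from the table above.
WIDTHS = {"Word8": 1, "Word16": 2, "Word32": 4, "Word64": 8}


def encode_word(type_name, value):
    """Encode value as a fixed-width Big Endian byte sequence."""
    width = WIDTHS[type_name]
    # Modular arithmetic: values past the upper bound wrap around to zero.
    wrapped = value % (2 ** (8 * width))
    return wrapped.to_bytes(width, "big")


def decode_word(type_name, data):
    """Decode a fixed-width Big Endian byte sequence into an integer."""
    width = WIDTHS[type_name]
    if len(data) != width:
        raise ValueError(f"{type_name} needs exactly {width} bytes")
    return int.from_bytes(data, "big")
```

<p>For example, <code class="language-plaintext highlighter-rouge">encode_word("Word16", 0x1234)</code> yields the two bytes
<code class="language-plaintext highlighter-rouge">0x12 0x34</code>, and <code class="language-plaintext highlighter-rouge">encode_word("Word8", 256)</code> wraps around to a
single zero byte.</p>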
<h3 id="strings">Strings</h3>
<p>A String is a data structure used for human-readable text. Strings are
sequences of glyphs. A glyph consists of one non-zero-width Unicode code
point and zero or more zero-width Unicode code points. The
human-readable representation of a String starts and ends with a
quotation mark (<code class="language-plaintext highlighter-rouge">"</code>) and contains all human-readable glyphs verbatim.
Control characters are represented in an isomorphic human-readable way:
every control character has exactly one human-readable
representation, and a mapping exists from the human-readable
representation to the control character. Therefore, the use of Unicode
Control Characters (U+240x) is not permitted without an additional marker.</p>
<h2 id="crypto">Crypto</h2>
<p>The Crypto module contains all the functions and data types related to
cryptography. This includes random number generation, encryption and
decryption, key generation, operations on nonces and generating random
nonces.</p>
<h3 id="key">Key</h3>
<p>A Crypto Number is a large fixed size unsigned (non-negative) integer.
Its binary encoding is as a Big Endian integer in exactly the encoded
byte size. Its human-readable encoding is a base-16 number encoded as a
String. The NaCl implementation
<a href="https://github.com/jedisct1/libsodium">libsodium</a> supplies the
functions <code class="language-plaintext highlighter-rouge">sodium_bin2hex</code> and <code class="language-plaintext highlighter-rouge">sodium_hex2bin</code> to aid in implementing
the human-readable encoding. The in-memory encoding of these crypto
numbers in NaCl already satisfies the binary encoding, so for
applications directly using those APIs, binary encoding and decoding is
the <a href="https://en.wikipedia.org/wiki/Identity_function">identity
function</a>.</p>
<p>Tox uses four kinds of Crypto Numbers:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Type</th>
<th style="text-align: left">Bits</th>
<th style="text-align: left">Encoded byte size</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left">Public Key</td>
<td style="text-align: left">256</td>
<td style="text-align: left">32</td>
</tr>
<tr>
<td style="text-align: left">Secret Key</td>
<td style="text-align: left">256</td>
<td style="text-align: left">32</td>
</tr>
<tr>
<td style="text-align: left">Combined Key</td>
<td style="text-align: left">256</td>
<td style="text-align: left">32</td>
</tr>
<tr>
<td style="text-align: left">Nonce</td>
<td style="text-align: left">192</td>
<td style="text-align: left">24</td>
</tr>
</tbody>
</table>
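<p>The human-readable encoding can be sketched with the Python standard
library in place of <code class="language-plaintext highlighter-rouge">sodium_bin2hex</code> and <code class="language-plaintext highlighter-rouge">sodium_hex2bin</code>. The
helper names <code class="language-plaintext highlighter-rouge">to_hex</code> and <code class="language-plaintext highlighter-rouge">from_hex</code> are assumptions made for this
example; the sizes come from the table above.</p>

```python
# Encoded byte sizes taken from the table above.
SIZES = {"Public Key": 32, "Secret Key": 32, "Combined Key": 32, "Nonce": 24}


def to_hex(kind, data):
    """Human-readable encoding: base-16 text of the binary encoding."""
    if len(data) != SIZES[kind]:
        raise ValueError(f"{kind} must be {SIZES[kind]} bytes")
    return data.hex()


def from_hex(kind, text):
    """Parse the base-16 text back into the binary encoding."""
    data = bytes.fromhex(text)
    if len(data) != SIZES[kind]:
        raise ValueError(f"{kind} must be {SIZES[kind]} bytes")
    return data
```

<p>A 32-byte key therefore becomes a 64-character String, and decoding
that String returns the original bytes.</p>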
<h4 id="key-pair">Key Pair</h4>
<p>A Key Pair is a pair of Secret Key and Public Key. A new key pair is
generated using the <code class="language-plaintext highlighter-rouge">crypto_box_keypair</code> function of the NaCl crypto
library. Two separate calls to the key pair generation function must
return distinct key pairs. See the <a href="https://nacl.cr.yp.to/box.html">NaCl
documentation</a> for details.</p>
<p>A Public Key can be computed from a Secret Key using the NaCl function
<code class="language-plaintext highlighter-rouge">crypto_scalarmult_base</code>, which computes the scalar product of a
standard group element and the Secret Key. See the <a href="https://nacl.cr.yp.to/scalarmult.html">NaCl
documentation</a> for details.</p>
<h4 id="combined-key">Combined Key</h4>
<p>A Combined Key is computed from a Secret Key and a Public Key using the
NaCl function <code class="language-plaintext highlighter-rouge">crypto_box_beforenm</code>. Given two Key Pairs KP1 (SK1, PK1)
and KP2 (SK2, PK2), the Combined Key computed from (SK1, PK2) equals the
one computed from (SK2, PK1). This allows for symmetric encryption, as
peers can derive the same shared key from their own secret key and their
peer's public key.</p>
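<p>The symmetry property can be illustrated with a toy Diffie-Hellman
exchange over integers. This is only an analogy: Tox derives the
Combined Key with <code class="language-plaintext highlighter-rouge">crypto_box_beforenm</code>, not with the group or the
parameters below. The constants and function names here are
assumptions made for the sake of a runnable example and must not be
used for real key exchange.</p>

```python
import secrets

# Toy parameters for illustration only: a Mersenne prime modulus and a small
# base. Tox itself uses crypto_box_beforenm, not this group.
P = 2 ** 127 - 1
G = 3


def toy_keypair():
    """Return a (secret, public) pair in the toy group."""
    secret = secrets.randbelow(P - 3) + 2
    return secret, pow(G, secret, P)


def toy_combined_key(own_secret, peer_public):
    """Combine one's own secret key with the peer's public key."""
    return pow(peer_public, own_secret, P)


sk1, pk1 = toy_keypair()
sk2, pk2 = toy_keypair()
# Both sides arrive at the same value without ever sending it.
assert toy_combined_key(sk1, pk2) == toy_combined_key(sk2, pk1)
```

<p>Only the public values cross the network; each side combines the
peer's public value with its own secret to reach the same shared
value.</p>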
<p>In the Tox protocol, packets are encrypted using the public key of the
receiver and the secret key of the sender. The receiver decrypts the
packets using the receiver's secret key and the sender's public key.</p>
<p>The fact that the same key is used to encrypt and decrypt packets on
both sides means that packets being sent could be replayed back to the
sender if there is nothing to prevent it.</p>
<p>Shared key generation is the most resource-intensive part of
encryption and decryption, so resource usage can be reduced
considerably by saving the shared keys and reusing them as much as
possible.</p>
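<p>One way to reuse shared keys is a small cache keyed by the
(secret key, public key) pair. The sketch below is a hypothetical
design, not taken from any particular implementation; the <code class="language-plaintext highlighter-rouge">derive</code>
argument stands in for a binding to <code class="language-plaintext highlighter-rouge">crypto_box_beforenm</code>.</p>

```python
class SharedKeyCache:
    """Remember combined keys so the costly derivation runs once per pair."""

    def __init__(self, derive):
        # derive stands in for a binding to crypto_box_beforenm (assumption).
        self._derive = derive
        self._keys = {}

    def get(self, secret_key, public_key):
        """Return the combined key, deriving it only on first use."""
        pair = (secret_key, public_key)
        if pair not in self._keys:
            self._keys[pair] = self._derive(secret_key, public_key)
        return self._keys[pair]
```

<p>This sketch never evicts entries. A long-running node would likely
want to bound the cache size or expire entries for peers it no longer
talks to.</p>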
<h4 id="nonce">Nonce</h4>
<p>A random nonce is generated using <code class="language-plaintext highlighter-rouge">randombytes</code>, the
cryptographically secure random number generator from the NaCl library.</p>
<p>A nonce is incremented by interpreting it as a Big Endian number and
adding 1. If the nonce has the maximum value, the value after the
increment is 0.</p>
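<p>A minimal sketch of the increment rule, assuming a 24 byte nonce as used by
<code class="language-plaintext highlighter-rouge">crypto_box</code>. The name <code class="language-plaintext highlighter-rouge">increment_nonce</code> is illustrative and not part of any
library.</p>

```python
NONCE_SIZE = 24  # bytes; assumed nonce length (crypto_box nonce size)


def increment_nonce(nonce: bytes) -> bytes:
    """Return nonce + 1, treating the nonce as a Big Endian integer."""
    if len(nonce) != NONCE_SIZE:
        raise ValueError("nonce must be exactly 24 bytes")
    value = int.from_bytes(nonce, "big")
    # Incrementing the maximum value wraps around to 0.
    value = (value + 1) % (1 << (8 * NONCE_SIZE))
    return value.to_bytes(NONCE_SIZE, "big")
```

<p>Because the nonce is Big Endian, the carry propagates from the last byte
towards the first.</p>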
<p>Most parts of the protocol use random nonces. This prevents new nonces
from being associated with previous nonces. If many different packets
could be tied together due to how the nonces were generated, it might
for example lead to tying DHT and onion announce packets together. This
would introduce a flaw in the system as non-friends could tie some
people's DHT keys and long-term keys together.</p>
<h3 id="box">Box</h3>
<p>The Tox protocol differentiates between two types of text: Plain Text
and Cipher Text. Cipher Text may be transmitted over untrusted data
channels. Plain Text can be Sensitive or Non Sensitive. Sensitive Plain
Text must be transformed into Cipher Text using the encryption function
before it can be transmitted over untrusted data channels.</p>
<p>The encryption function takes a Combined Key, a Nonce, and a Plain Text,
and returns a Cipher Text. It uses <code class="language-plaintext highlighter-rouge">crypto_box_afternm</code> to perform the
encryption. The meaning of the sentence “encrypting with a secret key, a
public key, and a nonce” is: compute a combined key from the secret key
and the public key and then use the encryption function for the
transformation.</p>
<p>The decryption function takes a Combined Key, a Nonce, and a Cipher
Text, and returns either a Plain Text or an error. It uses
<code class="language-plaintext highlighter-rouge">crypto_box_open_afternm</code> from the NaCl library. Since the cipher is
symmetric, the encryption function can also perform decryption, but will
not perform message authentication, so the implementation must be
careful to use the correct functions.</p>
<p><code class="language-plaintext highlighter-rouge">crypto_box</code> uses xsalsa20 symmetric encryption and poly1305
authentication.</p>
<p>The create and handle request functions are the encrypt and decrypt
functions for a type of DHT packets used to send data directly to other
DHT nodes. To be honest they should probably be in the DHT module but
they seem to fit better here. TODO: What exactly are these functions?</p>
<h2 id="node-info">Node Info</h2>
<h3 id="transport-protocol">Transport Protocol</h3>
<p>A Transport Protocol is a transport layer protocol directly below the
Tox protocol itself. Tox supports two transport protocols: UDP and TCP.
The binary representation of the Transport Protocol is a single bit: 0
for UDP, 1 for TCP. If encoded as a standalone value, the bit is stored in
the least significant bit of a byte. If followed by other bit-packed
data, it consumes exactly one bit.</p>
<p>The human-readable representation for UDP is <code class="language-plaintext highlighter-rouge">UDP</code> and for TCP is <code class="language-plaintext highlighter-rouge">TCP</code>.</p>
<h3 id="host-address">Host Address</h3>
<p>A Host Address is either an IPv4 or an IPv6 address. The binary
representation of an IPv4 address is a Big Endian 32 bit unsigned
integer (4 bytes). For an IPv6 address, it is a Big Endian 128 bit
unsigned integer (16 bytes). The binary representation of a Host Address
is a 7 bit unsigned integer specifying the address family (2 for IPv4,
10 for IPv6), followed by the address itself.</p>
<p>Thus, when packed together with the Transport Protocol, the first (most
significant) bit of the packed byte is the protocol and the remaining 7 bits
are the address family.</p>
<h3 id="port-number">Port Number</h3>
<p>A Port Number is a 16 bit number. Its binary representation is a Big
Endian 16 bit unsigned integer (2 bytes).</p>
<h3 id="socket-address">Socket Address</h3>
<p>A Socket Address is a pair of Host Address and Port Number. Together
with a Transport Protocol, it is sufficient information to address a
network port on any internet host.</p>
<h3 id="node-info-packed-node-format">Node Info (packed node format)</h3>
<p>The Node Info data structure contains a Transport Protocol, a Socket
Address, and a Public Key. This is sufficient information to start
communicating with that node. The binary representation of a Node Info
is called the “packed node format”.</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Type</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code> bit</td>
<td style="text-align: left">Transport Protocol</td>
<td style="text-align: left">UDP = 0, TCP = 1</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">7</code> bits</td>
<td style="text-align: left">Address Family</td>
<td style="text-align: left">2 = IPv4, 10 = IPv6</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">4 | 16</code></td>
<td style="text-align: left">IP address</td>
<td style="text-align: left">4 bytes for IPv4, 16 bytes for IPv6</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">2</code></td>
<td style="text-align: left">Port Number</td>
<td style="text-align: left">Port number</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">Public Key</td>
<td style="text-align: left">Node ID</td>
</tr>
</tbody>
</table>
<p>The packed node format is a way to store the node info in a small yet
easy to parse format. To store more than one node, simply append another
one to the previous one: <code class="language-plaintext highlighter-rouge">[packed node 1][packed node 2][...]</code>.</p>
<p>In the packed node format, the first byte (high bit protocol, lower 7
bits address family) is called the IP Type. The following table is
informative and can be used to simplify the implementation.</p>
<table>
<thead>
<tr>
<th style="text-align: left">IP Type</th>
<th style="text-align: left">Transport Protocol</th>
<th style="text-align: left">Address Family</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">2 (0x02)</code></td>
<td style="text-align: left">UDP</td>
<td style="text-align: left">IPv4</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">10 (0x0a)</code></td>
<td style="text-align: left">UDP</td>
<td style="text-align: left">IPv6</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">130 (0x82)</code></td>
<td style="text-align: left">TCP</td>
<td style="text-align: left">IPv4</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">138 (0x8a)</code></td>
<td style="text-align: left">TCP</td>
<td style="text-align: left">IPv6</td>
</tr>
</tbody>
</table>
<p>The number <code class="language-plaintext highlighter-rouge">130</code> is used for an IPv4 TCP relay and <code class="language-plaintext highlighter-rouge">138</code> is used to
indicate an IPv6 TCP relay.</p>
<p>The reason for these numbers is that the numbers on Linux for IPv4 and
IPv6 (the <code class="language-plaintext highlighter-rouge">AF_INET</code> and <code class="language-plaintext highlighter-rouge">AF_INET6</code> defines) are <code class="language-plaintext highlighter-rouge">2</code> and <code class="language-plaintext highlighter-rouge">10</code>. The TCP
numbers are just the UDP numbers <code class="language-plaintext highlighter-rouge">+ 128</code>.</p>
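<p>The packed node format can be sketched as follows. This is an illustration
under stated assumptions, not a reference implementation: the function names
<code class="language-plaintext highlighter-rouge">pack_node</code> and <code class="language-plaintext highlighter-rouge">unpack_node</code> are hypothetical, and the host address is
passed as a text string for convenience.</p>

```python
import ipaddress
import struct

AF_IPV4 = 2    # address family value for IPv4 in the packed format
AF_IPV6 = 10   # address family value for IPv6 in the packed format
PUBLIC_KEY_SIZE = 32


def pack_node(tcp: bool, host: str, port: int, public_key: bytes) -> bytes:
    """Encode one Node Info in the packed node format."""
    if len(public_key) != PUBLIC_KEY_SIZE:
        raise ValueError("public key must be 32 bytes")
    addr = ipaddress.ip_address(host)
    family = AF_IPV4 if addr.version == 4 else AF_IPV6
    # High bit: transport protocol (TCP = 1). Low 7 bits: address family.
    ip_type = (0x80 if tcp else 0x00) | family
    return bytes([ip_type]) + addr.packed + struct.pack(">H", port) + public_key


def unpack_node(data: bytes):
    """Decode one packed node; return (node, remaining bytes)."""
    if not data:
        raise ValueError("empty input")
    tcp = bool(data[0] & 0x80)
    family = data[0] & 0x7F
    if family == AF_IPV4:
        addr_len = 4
    elif family == AF_IPV6:
        addr_len = 16
    else:
        raise ValueError("unknown address family")
    end = 1 + addr_len + 2 + PUBLIC_KEY_SIZE
    if len(data) < end:
        raise ValueError("truncated packed node")
    host = str(ipaddress.ip_address(data[1:1 + addr_len]))
    (port,) = struct.unpack(">H", data[1 + addr_len:3 + addr_len])
    public_key = data[3 + addr_len:end]
    return (tcp, host, port, public_key), data[end:]
```

<p>Since <code class="language-plaintext highlighter-rouge">unpack_node</code> returns the unconsumed bytes, a list of packed nodes can
be decoded by calling it repeatedly until the remainder is empty.</p>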
<h2 id="protocol-packet">Protocol Packet</h2>
<p>A Protocol Packet is the top level Tox protocol element. All other
packet types are wrapped in Protocol Packets. It consists of a Packet
Kind and a payload. The binary representation of a Packet Kind is a
single byte (8 bits). The payload is an arbitrary sequence of bytes.</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Type</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left">Packet Kind</td>
<td style="text-align: left">The packet kind identifier</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">[0,]</code></td>
<td style="text-align: left">Bytes</td>
<td style="text-align: left">Payload</td>
</tr>
</tbody>
</table>
<p>These top level packets can be transported in a number of ways, the most
common way being over the network using UDP or TCP. The protocol itself
does not prescribe transport methods, and an implementation is free to
implement additional transports such as WebRTC, IRC, or pipes.</p>
<p>In the remainder of the document, different kinds of Protocol Packet are
specified with their packet kind and payload. The packet kind is not
repeated in the payload description (TODO: actually it mostly is, but
later it won't).</p>
<p>Inside a Protocol Packet's payload, other packet types can specify
additional packet kinds. E.g. inside a Crypto Data packet (<code class="language-plaintext highlighter-rouge">0x1b</code>), the
<a href="#messenger">Messenger</a> module defines its protocols for messaging, file
transfers, etc. Top level Protocol Packets are themselves not encrypted,
though their payload may be.</p>
<h3 id="packet-kind">Packet Kind</h3>
<p>The following is an exhaustive list of top level packet kind names and
their number. Their payload is specified in dedicated sections. Each
section is named after the Packet Kind it describes followed by the byte
value in parentheses, e.g. <a href="#ping-request-0x00">Ping Request (0x00)</a>.</p>
<table>
<thead>
<tr>
<th style="text-align: left">Byte value</th>
<th style="text-align: left">Packet Kind</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">0x00</code></td>
<td style="text-align: left">Ping Request</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">0x01</code></td>
<td style="text-align: left">Ping Response</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">0x02</code></td>
<td style="text-align: left">Nodes Request</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">0x04</code></td>
<td style="text-align: left">Nodes Response</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">0x18</code></td>
<td style="text-align: left">Cookie Request</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">0x19</code></td>
<td style="text-align: left">Cookie Response</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">0x1a</code></td>
<td style="text-align: left">Crypto Handshake</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">0x1b</code></td>
<td style="text-align: left">Crypto Data</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">0x20</code></td>
<td style="text-align: left">DHT Request</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">0x21</code></td>
<td style="text-align: left">LAN Discovery</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">0x80</code></td>
<td style="text-align: left">Onion Request 0</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">0x81</code></td>
<td style="text-align: left">Onion Request 1</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">0x82</code></td>
<td style="text-align: left">Onion Request 2</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">0x83</code></td>
<td style="text-align: left">Announce Request</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">0x84</code></td>
<td style="text-align: left">Announce Response</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">0x85</code></td>
<td style="text-align: left">Onion Data Request</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">0x86</code></td>
<td style="text-align: left">Onion Data Response</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">0x8c</code></td>
<td style="text-align: left">Onion Response 3</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">0x8d</code></td>
<td style="text-align: left">Onion Response 2</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">0x8e</code></td>
<td style="text-align: left">Onion Response 1</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">0xf0</code></td>
<td style="text-align: left">Bootstrap Info</td>
</tr>
</tbody>
</table>
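<p>As an illustration, the table above can be turned into an enumeration and
used to split a Protocol Packet into its Packet Kind and payload. The names
<code class="language-plaintext highlighter-rouge">PacketKind</code> and <code class="language-plaintext highlighter-rouge">parse_protocol_packet</code> are assumptions made for this sketch;
rejecting unknown kinds is one possible policy, not something the table
prescribes.</p>

```python
from enum import IntEnum


class PacketKind(IntEnum):
    PING_REQUEST = 0x00
    PING_RESPONSE = 0x01
    NODES_REQUEST = 0x02
    NODES_RESPONSE = 0x04
    COOKIE_REQUEST = 0x18
    COOKIE_RESPONSE = 0x19
    CRYPTO_HANDSHAKE = 0x1A
    CRYPTO_DATA = 0x1B
    DHT_REQUEST = 0x20
    LAN_DISCOVERY = 0x21
    ONION_REQUEST_0 = 0x80
    ONION_REQUEST_1 = 0x81
    ONION_REQUEST_2 = 0x82
    ANNOUNCE_REQUEST = 0x83
    ANNOUNCE_RESPONSE = 0x84
    ONION_DATA_REQUEST = 0x85
    ONION_DATA_RESPONSE = 0x86
    ONION_RESPONSE_3 = 0x8C
    ONION_RESPONSE_2 = 0x8D
    ONION_RESPONSE_1 = 0x8E
    BOOTSTRAP_INFO = 0xF0


def parse_protocol_packet(data: bytes):
    """Split a Protocol Packet into (Packet Kind, payload)."""
    if not data:
        raise ValueError("a Protocol Packet needs at least the Packet Kind byte")
    kind = PacketKind(data[0])  # raises ValueError for values not in the table
    return kind, data[1:]
```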
<h2 id="dht">DHT</h2>
<p>The DHT is a self-organizing swarm of all nodes in the Tox network. A
node in the Tox network is also called a “Tox node”. When we talk about
“peers”, we mean any node that is not the local node (the subject). This
module takes care of finding the IP and port of nodes and establishing a
route to them directly via UDP using <a href="#hole-punching">hole punching</a> if
necessary. The DHT only runs on UDP and so is only used if UDP works.</p>
<p>Every node in the Tox DHT has an ephemeral Key Pair called the DHT Key
Pair, consisting of the DHT Secret Key and the DHT Public Key. The DHT
Public Key acts as the node address. The DHT Key Pair is renewed every
time the Tox instance is closed or restarted. An implementation may
choose to renew the key more often, but doing so will disconnect all
peers.</p>
<p>The DHT public key of a friend is found using the <a href="#onion">onion</a>
module. Once the DHT public key of a friend is known, the DHT is used to
find them and connect directly to them via UDP.</p>
<h3 id="distance">Distance</h3>
<p>A Distance is a non-negative integer. Its human-readable representation is a
base-16 number. Distance (type) is an <a href="https://en.wikipedia.org/wiki/Ordered_semigroup">ordered
monoid</a> with the
associative binary operator <code class="language-plaintext highlighter-rouge">+</code> and the identity element <code class="language-plaintext highlighter-rouge">0</code>.</p>
<p>The DHT uses a
<a href="https://en.wikipedia.org/wiki/Metric_(mathematics)">metric</a> to
determine the distance between two nodes. The Distance type is the
co-domain of this metric. The metric currently used by the Tox DHT is
the <code class="language-plaintext highlighter-rouge">XOR</code> of the nodes' public keys: <code class="language-plaintext highlighter-rouge">distance(x, y) = x XOR y</code>. For
this computation, public keys are interpreted as Big Endian integers
(see <a href="#key-1">Crypto Numbers</a>).</p>
<p>When we speak of a “close node”, we mean that its Distance to the node
under consideration is small compared to the Distance to other nodes.</p>
<p>An implementation is not required to provide a Distance type, so it has
no specified binary representation. For example, instead of computing a
distance and comparing it against another distance, the implementation
can choose to implement Distance as a pair of public keys and define an
ordering on Distance without computing the complete integral value. This
works, because as soon as an ordering decision can be made in the most
significant bits, further bits won't influence that decision.</p>
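<p>One way to order two keys by their distance from a base key without
building the full integer is to compare the XORed bytes from the most
significant byte onwards. This is a sketch only; the function name
<code class="language-plaintext highlighter-rouge">compare_distance</code> is hypothetical.</p>

```python
def compare_distance(base: bytes, a: bytes, b: bytes) -> int:
    """Return -1 if a is closer to base than b, 1 if farther, 0 if equal."""
    if not (len(base) == len(a) == len(b)):
        raise ValueError("all keys must have the same length")
    for base_byte, a_byte, b_byte in zip(base, a, b):
        dist_a = base_byte ^ a_byte
        dist_b = base_byte ^ b_byte
        if dist_a != dist_b:
            # The first differing byte decides; later bytes cannot change it.
            return -1 if dist_a < dist_b else 1
    return 0
```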
<p>XOR is a valid metric, i.e. it satisfies the required conditions:</p>
<ol>
<li>
<p>Non-negativity <code class="language-plaintext highlighter-rouge">distance(x, y) &gt;= 0</code>: Since public keys are Crypto
Numbers, which are by definition non-negative, their XOR is
necessarily non-negative.</p>
</li>
<li>
<p>Identity of indiscernibles <code class="language-plaintext highlighter-rouge">distance(x, y) == 0</code> iff <code class="language-plaintext highlighter-rouge">x == y</code>: The
XOR of two integers is zero iff they are equal.</p>
</li>
<li>
<p>Symmetry <code class="language-plaintext highlighter-rouge">distance(x, y) == distance(y, x)</code>: XOR is a symmetric
operation.</p>
</li>
<li>
<p>Subadditivity <code class="language-plaintext highlighter-rouge">distance(x, z) &lt;= distance(x, y) + distance(y, z)</code>:
follows from associativity, since <code class="language-plaintext highlighter-rouge">x XOR z = x XOR (y XOR y) XOR z =
distance(x, y) XOR distance(y, z)</code>, which is not greater than
<code class="language-plaintext highlighter-rouge">distance(x, y) + distance(y, z)</code> because XOR is addition without carries.</p>
</li>
</ol>
<p>In addition, XOR has other useful properties:</p>
<ul>
<li>
<p>Unidirectionality: given the key <code class="language-plaintext highlighter-rouge">x</code> and the distance <code class="language-plaintext highlighter-rouge">d</code> there
exists one and only one key <code class="language-plaintext highlighter-rouge">y</code> such that <code class="language-plaintext highlighter-rouge">distance(x, y) = d</code>.</p>
<p>The implication is that repeated lookups are likely to pass along
the same way and thus caching makes sense.</p>
<p>Source:
<a href="http://pdos.csail.mit.edu/~petar/papers/maymounkov-kademlia-lncs.pdf">maymounkov-kademlia</a></p>
</li>
</ul>
<p>Example: Given three nodes with keys 2, 5, and 6:</p>
<ul>
<li>
<p><code class="language-plaintext highlighter-rouge">2 XOR 5 = 7</code></p>
</li>
<li>
<p><code class="language-plaintext highlighter-rouge">2 XOR 6 = 4</code></p>
</li>
<li>
<p><code class="language-plaintext highlighter-rouge">5 XOR 2 = 7</code></p>
</li>
<li>
<p><code class="language-plaintext highlighter-rouge">5 XOR 6 = 3</code></p>
</li>
<li>
<p><code class="language-plaintext highlighter-rouge">6 XOR 2 = 4</code></p>
</li>
<li>
<p><code class="language-plaintext highlighter-rouge">6 XOR 5 = 3</code></p>
</li>
</ul>
<p>The closest node to both 2 and 5 is 6. The closest node to 6 is 5
with distance 3. This example shows that a key that is close in terms of
integer difference may not necessarily be close in terms of XOR: 2 is
numerically nearer to 5 than to 6, yet its XOR distance to 6 is smaller.</p>
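<p>The example can be checked with a few lines of code. Small integers stand in
for public keys here, and the helper names are illustrative.</p>

```python
def distance(x: int, y: int) -> int:
    """XOR metric on keys interpreted as non-negative integers."""
    return x ^ y


def closest(node: int, others):
    """Return the key in `others` with the smallest XOR distance to `node`."""
    return min(others, key=lambda other: distance(node, other))


nodes = [2, 5, 6]
for node in nodes:
    others = [other for other in nodes if other != node]
    nearest = closest(node, others)
    print(node, "is closest to", nearest, "at distance", distance(node, nearest))
```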
<h3 id="client-lists">Client Lists</h3>
<p>A Client List of <em>maximum size</em> <code class="language-plaintext highlighter-rouge">k</code> with a given public key as <em>base
key</em> is an ordered set of at most <code class="language-plaintext highlighter-rouge">k</code> nodes close to the base key. The
elements are sorted by <a href="#distance">distance</a> from the base key. Thus,
the first (smallest) element of the set is the closest one to the base
key in that set, the last (greatest) element is the furthest away. The
maximum size and base key are constant throughout the lifetime of a
Client List.</p>
<p>A Client List is <em>full</em> when the number of nodes it contains is the
maximum size of the list.</p>
<p>A node is <em>viable</em> for entry if the Client List is not <em>full</em> or the
node's public key has a lower distance from the base key than the
current entry with the greatest distance.</p>
<p>If a node is <em>viable</em> and the Client List is <em>full</em>, the entry with the
greatest distance from the base key is removed so that the size does not
exceed the maximum size.</p>
<p>Adding a node whose key already exists will result in an update of the
Node Info in the Client List. Removing a node for which no Node Info
exists in the Client List has no effect. Thus, removing a node twice is
permitted and has the same effect as removing it once.</p>
<p>The iteration order of a Client List is in order of distance from the
base key. I.e. the first node seen in iteration is the closest, and the
last node is the furthest away in terms of the distance metric.</p>
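<p>The Client List rules above can be sketched as a small class. This is an
illustration under assumptions: keys are modelled as integers rather than 32
byte strings, the Node Info is an opaque value, and the class and method names
are hypothetical.</p>

```python
class ClientList:
    """Bounded set of nodes, ordered by XOR distance from a fixed base key."""

    def __init__(self, base_key: int, max_size: int):
        if max_size < 1:
            raise ValueError("max_size must be at least 1")
        self.base_key = base_key
        self.max_size = max_size
        self._nodes = {}  # public key -> node info

    def _distance(self, key: int) -> int:
        return self.base_key ^ key

    def is_full(self) -> bool:
        return len(self._nodes) >= self.max_size

    def viable(self, key: int) -> bool:
        if not self.is_full():
            return True
        farthest = max(self._nodes, key=self._distance)
        return self._distance(key) < self._distance(farthest)

    def add(self, key: int, info) -> bool:
        if key in self._nodes:
            self._nodes[key] = info  # existing key: update the node info
            return True
        if not self.viable(key):
            return False
        if self.is_full():
            # Evict the entry with the greatest distance from the base key.
            del self._nodes[max(self._nodes, key=self._distance)]
        self._nodes[key] = info
        return True

    def remove(self, key: int) -> None:
        self._nodes.pop(key, None)  # removing an absent key has no effect

    def __iter__(self):
        # Closest node first, furthest node last.
        return iter(sorted(self._nodes.items(),
                           key=lambda item: self._distance(item[0])))
```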
<h3 id="k-buckets">K-buckets</h3>
<p>K-buckets is a data structure for efficiently storing a set of nodes
close to a certain key called the base key. The base key is constant
throughout the lifetime of a k-buckets instance.</p>
<p>A k-buckets is a map from small integers <code class="language-plaintext highlighter-rouge">0 &lt;= n &lt; 256</code> to Client Lists
of maximum size <code class="language-plaintext highlighter-rouge">k</code>. Each Client List is called a (k-)bucket. A
k-buckets is equipped with a base key, and each bucket has this key as
its base key. <code class="language-plaintext highlighter-rouge">k</code> is called the bucket size. The default bucket size is
8. A large bucket size was chosen to increase the speed at which peers
are found.</p>
<p>The above number <code class="language-plaintext highlighter-rouge">n</code> is the bucket index. It is a non-negative integer
with the range <code class="language-plaintext highlighter-rouge">[0, 255]</code>, i.e. the range of an 8 bit unsigned integer.</p>
<h4 id="bucket-index">Bucket Index</h4>
<p>The index of the bucket can be computed using the following function:
<code class="language-plaintext highlighter-rouge">bucketIndex(baseKey, nodeKey) = 255 - floor(log_2(distance(baseKey,
nodeKey)))</code>. This function is not defined when <code class="language-plaintext highlighter-rouge">baseKey == nodeKey</code>,
meaning k-buckets will never contain a Node Info about the base node.</p>
<p>Thus, every node with key <code class="language-plaintext highlighter-rouge">nodeKey</code> stored in the k-bucket with index
<code class="language-plaintext highlighter-rouge">n</code> satisfies <code class="language-plaintext highlighter-rouge">bucketIndex(baseKey, nodeKey) == n</code>. Equivalently, the
k-bucket with index <code class="language-plaintext highlighter-rouge">n</code> consists of nodes whose distance to the base
node lies in the range <code class="language-plaintext highlighter-rouge">[2^(255-n), 2^(256-n) - 1]</code>.</p>
<p>The bucket index can be efficiently computed by determining the first
bit at which the two keys differ, starting from the most significant
bit. So, if the local DHT key starts with e.g. <code class="language-plaintext highlighter-rouge">0x80</code> and the bucketed
node key starts with <code class="language-plaintext highlighter-rouge">0x40</code>, then the bucket index for that node is 0.
If the second bit differs, the bucket index is 1. If the keys are almost
exactly equal and only the last bit differs, the bucket index is 255.</p>
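<p>A short sketch of the bucket index computation for 32 byte keys. The name
<code class="language-plaintext highlighter-rouge">bucket_index</code> is illustrative, and returning <code class="language-plaintext highlighter-rouge">None</code> for equal keys is an
assumption about how to model the undefined case.</p>

```python
KEY_SIZE = 32  # bytes, i.e. 256 bits


def bucket_index(base_key: bytes, node_key: bytes):
    """Return the bucket index in [0, 255], or None if the keys are equal."""
    if len(base_key) != KEY_SIZE or len(node_key) != KEY_SIZE:
        raise ValueError("keys must be 32 bytes")
    dist = int.from_bytes(base_key, "big") ^ int.from_bytes(node_key, "big")
    if dist == 0:
        return None  # the function is undefined for the base key itself
    # bit_length() is floor(log2(dist)) + 1, so this equals
    # 255 - floor(log2(dist)): the position of the first differing bit.
    return 8 * KEY_SIZE - dist.bit_length()
```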
<h4 id="manipulating-k-buckets">Manipulating k-buckets</h4>
<p>TODO: this is different from Kademlia's least-recently-seen eviction
policy; why was the existing solution chosen, and how does it affect
security, performance and resistance to poisoning? The original paper claims
that preferring old live nodes results in better persistence and
resistance to basic DDoS attacks.</p>
<p>Any update or lookup operation on a k-buckets instance that involves a
single node requires us to first compute the bucket index for that node.
An update involving a Node Info with <code class="language-plaintext highlighter-rouge">nodeKey == baseKey</code> has no effect.
If the update results in an empty bucket, that bucket is removed from
the map.</p>
<p>Adding a node to, or removing a node from, a k-buckets consists of
performing the corresponding operation on the Client List bucket whose
index is that of the node's public key, except that adding a new node to
a full bucket has no effect. A node is considered <em>viable</em> for entry if
the corresponding bucket is not full.</p>
<p>Iteration order of a k-buckets instance is in order of distance from the
base key. I.e. the first node seen in iteration is the closest, and the
last node is the furthest away in terms of the distance metric.</p>
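<p>The k-buckets operations can be sketched as below. This is a simplified
model, not the reference implementation: keys are integers of
<code class="language-plaintext highlighter-rouge">key_bits</code> bits, each bucket is a plain dictionary, and all names are
hypothetical.</p>

```python
class KBuckets:
    """Map from bucket index to a bounded bucket of nodes."""

    def __init__(self, base_key: int, bucket_size: int, key_bits: int = 256):
        self.base_key = base_key
        self.bucket_size = bucket_size
        self.key_bits = key_bits
        self.buckets = {}  # bucket index -> {public key: node info}

    def bucket_index(self, key: int):
        dist = self.base_key ^ key
        return None if dist == 0 else self.key_bits - dist.bit_length()

    def add(self, key: int, info) -> bool:
        index = self.bucket_index(key)
        if index is None:
            return False  # updates involving the base key have no effect
        bucket = self.buckets.get(index, {})
        if key not in bucket and len(bucket) >= self.bucket_size:
            return False  # adding a new node to a full bucket has no effect
        bucket[key] = info
        self.buckets[index] = bucket
        return True

    def remove(self, key: int) -> None:
        index = self.bucket_index(key)
        bucket = self.buckets.get(index)
        if bucket is None or key not in bucket:
            return
        del bucket[key]
        if not bucket:
            del self.buckets[index]  # empty buckets are removed from the map

    def __iter__(self):
        # Iterate over all nodes, closest to the base key first.
        entries = [item for bucket in self.buckets.values()
                   for item in bucket.items()]
        return iter(sorted(entries, key=lambda item: self.base_key ^ item[0]))
```

<p>Unlike a plain Client List, a full bucket here never evicts an existing entry
to make room for a closer one.</p>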
<h3 id="dht-node-state">DHT node state</h3>
<p>Every DHT node contains the following state:</p>
<ul>
<li>
<p>DHT Key Pair: The Key Pair used to communicate with other DHT nodes.
It is immutable throughout the lifetime of the DHT node.</p>
</li>
<li>
<p>DHT Close List: A set of Node Infos of nodes that are close to the
DHT Public Key (public part of the DHT Key Pair). The Close List is
represented as a <a href="#k-buckets">k-buckets</a> data structure, with the
DHT Public Key as the Base Key.</p>
</li>
<li>
<p>DHT Search List: A list of Public Keys of nodes that the DHT node is
searching for, associated with a DHT Search Entry.</p>
</li>
</ul>
<p>A DHT node state is initialised using a Key Pair, which is stored in the
state as DHT Key Pair and as base key for the Close List. Both the Close
and Search Lists are initialised to be empty.</p>
<h4 id="dht-search-entry">DHT Search Entry</h4>
<p>A DHT Search Entry contains a Client List whose base key is the searched
node's Public Key. Once the searched node is found, it is also stored in
the Search Entry.</p>
<p>The maximum size of the Client List is set to 8. It must be the same as or
smaller than the bucket size of the Close List to make sure all the
closest peers found will know the node being searched (TODO(zugz): this
argument is unclear).</p>
<p>A DHT node state therefore contains one Client List for each bucket
index in the Close List, and one Client List for each DHT Search Entry.
These lists are not required to be disjoint: a node may be in multiple
Client Lists simultaneously.</p>
<p>A Search Entry is initialised with the searched-for Public Key. The
contained Client List is initialised to be empty.</p>
<h4 id="manipulating-the-dht-node-state">Manipulating the DHT node state</h4>
<p>Adding a search key to the DHT node state creates an empty entry in the
Search List. If a search entry for the public key already existed,
the “add” operation has no effect.</p>
<p>Removing a search key removes its search entry and all associated data
structures from memory.</p>
<p>The Close List and the Search Entries are termed the <code class="language-plaintext highlighter-rouge">Node Lists</code> of the
DHT State.</p>
<p>The iteration order over the DHT state is to first process the Close
List k-buckets, then the Search List entry Client Lists. Each of these
follows the iteration order in the corresponding specification.</p>
<p>A node info is considered to be contained in the DHT State if it is
contained in the Close List or in at least one of the Search Entries.</p>
<p>The size of the DHT state is defined to be the number of node infos it
contains, counted with multiplicity: node infos contained multiple
times, e.g. in the close list and in various search entries, are counted
as many times as they appear. Search keys do not directly count towards
the state size. So the size of the state is the sum of the sizes of the
Close List and the Search Entries.</p>
<p>The state size is relevant to later pruning algorithms that decide when
to remove a node info and when to request a ping from stale nodes.
Search keys, once added, are never automatically pruned.</p>
<p>Adding a Node Info to the state is done by adding the node to each Node
List in the state.</p>
<p>When adding a node info to the state, the search entry for the node's
public key, if it exists, is updated to contain the new node info. All
k-buckets and Client Lists that already contain the node info will also
be updated. See the corresponding specifications for the update
algorithms. However, a node info will not be added to a search entry
when it is the node to which the search entry is associated (i.e. the
node being searched for).</p>
<p>Removing a node info from the state removes it from all k-buckets. If a
search entry for the removed node's public key existed, the node info in
that search entry is unset. The search entry itself is not removed.</p>
<h3 id="self-organisation">Self-organisation</h3>
<p>Self-organising in the DHT occurs through each DHT peer connecting to an
arbitrary number of peers closest to their own DHT public key and some
that are further away.</p>
<p>If each peer in the network knows the peers with the DHT public key
closest to its DHT public key, then to find a specific peer with public
key X a peer just needs to recursively ask peers in the DHT for known
peers that have the DHT public keys closest to X. Eventually the searching
peer will find the peers in the DHT that are closest to X and, if the peer
with public key X is online, it will find that peer as well.</p>
<h3 id="dht-packet">DHT Packet</h3>
<p>The DHT Packet contains the sender's DHT Public Key, an encryption
Nonce, and an encrypted payload. The payload is encrypted with the DHT
secret key of the sender, the DHT public key of the receiver, and the
nonce that is sent along with the packet. DHT Packets are sent inside
Protocol Packets with a varying Packet Kind.</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Type</th>
<th style="text-align: left"><a href="#protocol-packet">Contents</a></th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">Public Key</td>
<td style="text-align: left">Sender DHT Public Key</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">24</code></td>
<td style="text-align: left">Nonce</td>
<td style="text-align: left">Random nonce</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">[16,]</code></td>
<td style="text-align: left">Bytes</td>
<td style="text-align: left">Encrypted payload</td>
</tr>
</tbody>
</table>
<p>The encrypted payload is at least 16 bytes long, because the encryption
includes a
<a href="https://en.wikipedia.org/wiki/Message_authentication_code">MAC</a> of 16
bytes. A 16 byte payload would thus be the empty message. The DHT
protocol never actually sends empty messages, so in reality the minimum
size is 25 bytes for the <a href="#ping-service">Ping Packet</a>: a 1 byte
response flag, an 8 byte Request ID, and the 16 byte MAC.</p>
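<p>The layout in the table can be sketched as a function that splits the body of
a DHT Packet (the bytes following the Packet Kind) into its three fields.
Decryption is left out, and the function name is an assumption made for this
example.</p>

```python
PUBLIC_KEY_SIZE = 32
NONCE_SIZE = 24
MAC_SIZE = 16


def split_dht_packet(body: bytes):
    """Return (sender public key, nonce, encrypted payload)."""
    minimum = PUBLIC_KEY_SIZE + NONCE_SIZE + MAC_SIZE
    if len(body) < minimum:
        raise ValueError("DHT Packet too short to contain key, nonce and MAC")
    sender_key = body[:PUBLIC_KEY_SIZE]
    nonce = body[PUBLIC_KEY_SIZE:PUBLIC_KEY_SIZE + NONCE_SIZE]
    encrypted = body[PUBLIC_KEY_SIZE + NONCE_SIZE:]
    return sender_key, nonce, encrypted
```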
<h3 id="rpc-services">RPC Services</h3>
<p>A DHT RPC Service consists of a Request packet and a Response packet. A
DHT RPC Packet contains a payload and a Request ID. This ID is a 64 bit
unsigned integer that helps identify the response for a given request.</p>
<h4 id="replies-to-rpc-requests">Replies to RPC requests</h4>
<p>A <em>reply</em> to a Request packet is a Response packet with the Request ID
in the Response packet set equal to the Request ID in the Request
packet. A response is accepted if and only if it is the first received
reply to a request which was sent sufficiently recently, according to a
time limit which depends on the service.</p>
<p>DHT RPC Packets are encrypted and transported within DHT Packets.</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Type</th>
<th style="text-align: left"><a href="#dht-packet">Contents</a></th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">[0,]</code></td>
<td style="text-align: left">Bytes</td>
<td style="text-align: left">Payload</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">8</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint64_t</code></td>
<td style="text-align: left">Request ID</td>
</tr>
</tbody>
</table>
<p>The minimum payload size is 0, but in reality the smallest sensible
payload size is 1. Since the same symmetric key is used in both
communication directions, an encrypted Request would be a valid
encrypted Response if they contained the same plaintext.</p>
<p>Parts of the protocol using RPC packets must take care to make Request
payloads not be valid Response payloads. For instance, <a href="#ping-service">Ping
Packets</a> carry a boolean flag that indicates whether the
payload corresponds to a Request or a Response.</p>
<p>The Request ID provides some resistance against replay attacks. If there
were no Request ID, it would be easy for an attacker to replay old
responses and thus provide nodes with out-of-date information. A Request
ID should be randomly generated for each Request which is sent.</p>
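<p>The packet layout and the reply-acceptance rule can be sketched as follows. This is a minimal illustration, not the toxcore implementation: the big-endian encoding of the Request ID, the <code class="language-plaintext highlighter-rouge">PendingRequests</code> class and all function names are assumptions made for the example, and encryption is left out.</p>

```python
import secrets
import struct
import time


def encode_rpc_packet(payload: bytes, request_id: int) -> bytes:
    """Payload followed by the 8-byte Request ID (big-endian is an assumption)."""
    return payload + struct.pack(">Q", request_id)


def decode_rpc_packet(plaintext: bytes) -> tuple[bytes, int]:
    """Split a decrypted RPC packet into (payload, request_id)."""
    if len(plaintext) < 8:
        raise ValueError("RPC packet must contain an 8-byte Request ID")
    (request_id,) = struct.unpack(">Q", plaintext[-8:])
    return plaintext[:-8], request_id


class PendingRequests:
    """Hypothetical tracker: accepts only the first timely reply to a request."""

    def __init__(self, time_limit: float, clock=time.monotonic):
        self.time_limit = time_limit  # service-dependent limit, in seconds
        self.clock = clock
        self.sent = {}  # (peer, request_id) -> time the request was sent

    def new_request(self, peer) -> int:
        # A fresh random Request ID is generated for every Request.
        request_id = secrets.randbits(64)
        self.sent[(peer, request_id)] = self.clock()
        return request_id

    def accept_reply(self, peer, request_id: int) -> bool:
        # pop() forgets the request, so a second (replayed) reply is rejected.
        sent_at = self.sent.pop((peer, request_id), None)
        return sent_at is not None and self.clock() - sent_at <= self.time_limit
```

<p>Removing the entry on the first reply is what makes a replayed Response fail the check, whether or not the time limit has expired.</p>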
<h4 id="ping-service">Ping Service</h4>
<p>The Ping Service is used to check if a node is responsive.</p>
<p>A Ping Packet payload consists of just a boolean value saying whether it
is a request or a response.</p>
<p>The one byte boolean inside the encrypted payload is added to prevent
peers from creating a valid Ping Response from a Ping Request without
decrypting the packet and encrypting a new one. Since symmetric
encryption is used, the encrypted Ping Response would be byte-wise equal
to the Ping Request without the discriminator byte.</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Type</th>
<th style="text-align: left"><a href="#rpc-services">Contents</a></th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left">Bool</td>
<td style="text-align: left">Response flag: 0x00 for Request, 0x01 for Response</td>
</tr>
</tbody>
</table>
<h5 id="ping-request-0x00">Ping Request (0x00)</h5>
<p>A Ping Request is a Ping Packet with the response flag set to False.
When a Ping Request is received and successfully decrypted, a Ping
Response packet is created and sent back to the requestor.</p>
<h5 id="ping-response-0x01">Ping Response (0x01)</h5>
<p>A Ping Response is a Ping Packet with the response flag set to True.</p>
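<p>As an illustration, the plaintext of a Ping Packet (the one-byte flag followed by the RPC Request ID) could be built like this. The function names are hypothetical and the big-endian Request ID encoding is an assumption; encryption and transport inside a DHT Packet are omitted.</p>

```python
import struct

PING_REQUEST_FLAG = 0x00
PING_RESPONSE_FLAG = 0x01


def encode_ping(is_response: bool, request_id: int) -> bytes:
    """One-byte response flag, then the 8-byte Request ID."""
    flag = PING_RESPONSE_FLAG if is_response else PING_REQUEST_FLAG
    return bytes([flag]) + struct.pack(">Q", request_id)


def make_ping_response(request_plaintext: bytes) -> bytes:
    """Build the Response plaintext for a decrypted Ping Request."""
    if len(request_plaintext) != 9 or request_plaintext[0] != PING_REQUEST_FLAG:
        raise ValueError("not a Ping Request")
    # Same Request ID, but with the flag flipped to mark a Response.
    return bytes([PING_RESPONSE_FLAG]) + request_plaintext[1:]
```

<p>Because the flag is part of the encrypted plaintext, the responder has to decrypt the Request and encrypt a new packet; it cannot simply echo the ciphertext back.</p>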
<h4 id="nodes-service">Nodes Service</h4>
<p>The Nodes Service is used to query another DHT node for up to 4 of the
nodes it knows that are closest to a requested node.</p>
<p>The DHT Nodes RPC service uses the Packed Node Format.</p>
<p>Only the UDP Protocol (IP Type <code class="language-plaintext highlighter-rouge">2</code> and <code class="language-plaintext highlighter-rouge">10</code>) is used in the DHT module
when sending nodes with the packed node format. This is because the TCP
Protocol is used to send TCP relay information and the DHT is UDP only.</p>
<h5 id="nodes-request-0x02">Nodes Request (0x02)</h5>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Type</th>
<th style="text-align: left"><a href="#rpc-services">Contents</a></th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">Public Key</td>
<td style="text-align: left">Requested DHT Public Key</td>
</tr>
</tbody>
</table>
<p>The DHT Public Key sent in the request is the one the sender is
searching for.</p>
<h5 id="nodes-response-0x04">Nodes Response (0x04)</h5>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Type</th>
<th style="text-align: left"><a href="#rpc-services">Contents</a></th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left">Int</td>
<td style="text-align: left">Number of nodes in the response (maximum 4)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">[39, 204]</code></td>
<td style="text-align: left">Node Infos</td>
<td style="text-align: left">Nodes in Packed Node Format</td>
</tr>
</tbody>
</table>
<p>An IPv4 node is 39 bytes, an IPv6 node is 51 bytes, so the maximum size
of the packed Node Infos is <code class="language-plaintext highlighter-rouge">51 * 4 = 204</code> bytes.</p>
<p>Nodes responses should contain the 4 closest nodes that the sender of
the response has in their lists of known nodes.</p>
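<p>The size arithmetic above can be checked with a small packing sketch. The field order used here (IP type byte, address, big-endian port, 32-byte public key) is an assumption about the Packed Node Format, chosen to match the 39 and 51 byte sizes; the function names are hypothetical.</p>

```python
import ipaddress
import struct

UDP_IPV4 = 2   # IP Type for UDP over IPv4
UDP_IPV6 = 10  # IP Type for UDP over IPv6
MAX_NODES = 4


def pack_node(ip: str, port: int, public_key: bytes) -> bytes:
    """Pack one node: 1 + (4 or 16) + 2 + 32 bytes."""
    if len(public_key) != 32:
        raise ValueError("public key must be 32 bytes")
    addr = ipaddress.ip_address(ip)
    ip_type = UDP_IPV4 if addr.version == 4 else UDP_IPV6
    return bytes([ip_type]) + addr.packed + struct.pack(">H", port) + public_key


def pack_nodes_response(nodes) -> bytes:
    """Node count byte followed by 1 to 4 packed nodes."""
    if not 1 <= len(nodes) <= MAX_NODES:
        raise ValueError("a Nodes Response carries 1 to 4 nodes")
    return bytes([len(nodes)]) + b"".join(pack_node(*n) for n in nodes)
```

<p>Four IPv6 nodes give the largest payload: one count byte plus <code class="language-plaintext highlighter-rouge">51 * 4 = 204</code> bytes of Node Infos.</p>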
<h3 id="dht-operation">DHT Operation</h3>
<h4 id="dht-initialisation">DHT Initialisation</h4>
<p>A new DHT node is initialised with a DHT State with a fresh random key
pair, an empty close list, and a search list containing 2 empty search
entries searching for the public keys of fresh random key pairs.</p>
<h4 id="periodic-sending-of-nodes-requests">Periodic sending of Nodes Requests</h4>
<p>For each Nodes List in the DHT State, every 20 seconds we send a Nodes
Request to a random node on the list, searching for the base key of the
list.</p>
<p>When a Nodes List first becomes populated with nodes, we send 5 such
random Nodes Requests in quick succession.</p>
<p>Random nodes are chosen because being able to predict which node will
receive the next request adds a possible attack vector and could make
some attacks that disrupt the network easier.</p>
<p>Furthermore, we periodically check every node for responsiveness by
sending it a Nodes Request: for each Nodes List in the DHT State, we
send each node on the list a Nodes Request every 60 seconds, searching
for the base key of the list. We remove from the DHT State any node from
which we persistently fail to receive Nodes Responses.</p>
<p>c-toxcore's implementation of checking and timeouts: A Last Checked time
is maintained for each node in each list. When a node is added to a
list, if doing so evicts a node from the list then the Last Checked time
is set to that of the evicted node, and otherwise it is set to 0. This
includes updating an already present node. Nodes from which we have not
received a Nodes Response for 122 seconds are considered Bad; they
remain in the DHT State, but are preferentially overwritten when adding
to the DHT State, and are ignored for all operations except the
once-per-60s checking described above. If we have not received a Nodes
Response for 182 seconds, the node is not even checked. So one check is
sent after the node becomes Bad. In the special case that every node in
the Close List is Bad, they are all checked once more.</p>
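<p>The two c-toxcore timeouts can be read as a three-way classification of a node by how long it has been silent. The sketch below is only an interpretation of the paragraph above: the status strings, the function name and the treatment of the exact boundary values are assumptions.</p>

```python
BAD_NODE_TIMEOUT = 122       # seconds without a Nodes Response before a node is Bad
CHECK_GIVE_UP_TIMEOUT = 182  # seconds after which the node is no longer checked


def node_status(now: float, last_response: float) -> str:
    """Classify a node by the time since its last Nodes Response."""
    silent_for = now - last_response
    if silent_for < BAD_NODE_TIMEOUT:
        return "good"
    if silent_for < CHECK_GIVE_UP_TIMEOUT:
        return "bad, still checked"
    return "bad, not checked"
```

<p>The window between the two timeouts is 60 seconds wide, which matches the once-per-60s check interval and is why one check is sent after a node becomes Bad.</p>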
<p>hs-toxcore implementation of checking and timeouts: We maintain a Last
Checked timestamp and a Checks Counter on each node on each Nodes List
in the DHT State. When a node is added to a list, these are set
respectively to the current time and to 0. This includes updating an
already present node. We periodically pass through the nodes on the
lists, and for each which is due a check, we: check it, update the
timestamp, increment the counter, and, if the counter is then 2, remove
the node from the list. This is pretty close to the behaviour of
c-toxcore, but much simpler. TODO: currently hs-toxcore doesn't do
anything to try to recover if the Close List becomes empty. We could
maintain a separate list of the most recently heard from nodes, and
repopulate the Close List with that if the Close List becomes empty.</p>
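<p>The timestamp-and-counter scheme described for hs-toxcore can be sketched in a few lines. This is an illustration in Python rather than hs-toxcore's own code; the names are hypothetical and the 60 second check period is taken from the periodic checking rule above, which is an assumption about what "due a check" means.</p>

```python
from dataclasses import dataclass

CHECK_PERIOD = 60  # seconds between checks (assumed from the 60 s rule above)
MAX_CHECKS = 2     # a node is removed once its Checks Counter reaches this


@dataclass
class NodeEntry:
    last_checked: float
    checks: int = 0


def add_node(nodes: dict, key: bytes, now: float) -> None:
    """Adding or updating a node resets its timestamp and counter."""
    nodes[key] = NodeEntry(last_checked=now)


def run_checks(nodes: dict, now: float, send_nodes_request) -> None:
    """One periodic pass over a Nodes List."""
    for key in list(nodes):  # copy the keys, since entries may be deleted
        entry = nodes[key]
        if now - entry.last_checked < CHECK_PERIOD:
            continue  # not due yet
        send_nodes_request(key)
        entry.last_checked = now
        entry.checks += 1
        if entry.checks >= MAX_CHECKS:
            del nodes[key]
```

<p>A node that answers is re-added through <code class="language-plaintext highlighter-rouge">add_node</code>, which resets its counter, so only nodes that stay silent across two consecutive checks are dropped.</p>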
<h4 id="handling-nodes-response-packets">Handling Nodes Response packets</h4>
<p>When we receive a valid Nodes Response packet, we first check that it is
a reply to a Nodes Request which we sent within the last 60 seconds to
the node from which we received the response, and that no previous reply
has been received. If this check fails, the packet is ignored. If the
check succeeds, first we add to the DHT State the node from which the
response was sent. Then, for each node listed in the response and for
each Nodes List in the DHT State which does not currently contain the
node and to which the node is viable for entry, we send a Nodes Request
to the node with the requested public key being the base key of the
Nodes List.</p>
<p>An implementation may choose not to send every such Nodes Request.
(c-toxcore only sends so many per list (8 for the Close List, 4 for a
Search Entry) per 50ms, prioritising the closest to the base key).</p>
<h4 id="handling-nodes-request-packets">Handling Nodes Request packets</h4>
<p>When we receive a Nodes Request packet from another node, we reply with
a Nodes Response packet containing the 4 nodes in the DHT State which
are the closest to the public key in the packet. If there are fewer than
4 nodes in the state, we reply with all the nodes in the state. If there
are no nodes in the state, no reply is sent.</p>
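<p>Selecting the nodes for a Nodes Response might look like the sketch below. It assumes the DHT distance metric is the XOR of the two public keys read as big-endian unsigned integers; the function names are hypothetical and the DHT State is reduced to a plain list of keys.</p>

```python
def xor_distance(a: bytes, b: bytes) -> int:
    """Distance between two public keys (assumed XOR metric, big-endian)."""
    return int.from_bytes(a, "big") ^ int.from_bytes(b, "big")


def nodes_for_response(known_keys, requested_key: bytes, limit: int = 4):
    """Return up to `limit` closest keys, or None when no reply should be sent."""
    if not known_keys:
        return None
    return sorted(known_keys, key=lambda k: xor_distance(k, requested_key))[:limit]
```

<p>Returning <code class="language-plaintext highlighter-rouge">None</code> for an empty state keeps the "no reply is sent" case distinct from a reply with fewer than 4 nodes.</p>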
<p>We also send a Ping Request when this is appropriate; see below.</p>
<h4 id="handling-ping-request-packets">Handling Ping Request packets</h4>
<p>When a valid Ping Request packet is received, we reply with a Ping
Response.</p>
<p>We also send a Ping Request when this is appropriate; see below.</p>
<h4 id="handling-ping-response-packets">Handling Ping Response packets</h4>
<p>When we receive a valid Ping Response packet, we first check that it is
a reply to a Ping Request which we sent within the last 5 seconds to the
node from which we received the response, and that no previous reply has
been received. If this check fails, the packet is ignored. If the check
succeeds, we add to the DHT State the node from which the response was
sent.</p>
<h4 id="sending-ping-requests">Sending Ping Requests</h4>
<p>When we receive a Nodes Request or a Ping Request, in addition to the
handling described above, we sometimes send a Ping Request. Namely, we
send a Ping Request to the node which sent the packet if the node is
viable for entry to the Close List and is not already in the Close List.
An implementation may (TODO: should?) choose not to send every such Ping
Request. (c-toxcore sends at most 32 every 2 seconds, preferring closer
nodes.)</p>
<h3 id="dht-request-packets">DHT Request Packets</h3>
<p>DHT Request packets are used to route encrypted data from a sender to
another node, referred to as the addressee of the packet, via a third
node.</p>
<p>A DHT Request Packet is sent as the payload of a Protocol Packet with
the corresponding Packet Kind. It contains the DHT Public Key of an
addressee, and a DHT Packet which is to be received by the addressee.</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Type</th>
<th style="text-align: left"><a href="#protocol-packet">Contents</a></th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">Public Key</td>
<td style="text-align: left">Addressee DHT Public Key</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">[72,]</code></td>
<td style="text-align: left">DHT Packet</td>
<td style="text-align: left">DHT Packet</td>
</tr>
</tbody>
</table>
<h4 id="handling-dht-request-packets">Handling DHT Request packets</h4>
<p>A DHT node that receives a DHT request packet checks whether the
addressee public key is their DHT public key. If it is, they will
decrypt and handle the packet. Otherwise, they will check whether the
addressee DHT public key is the DHT public key of one of the nodes in
their Close List. If it isn't, they will drop the packet. If it is, they
will resend the packet, unaltered, to that DHT node.</p>
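<p>The routing decision can be summarised as a small function. This is a sketch under assumptions: the Close List is modelled as a dictionary from DHT public key to address, the function name and return values are made up for the example, and dropping undersized packets is an added sanity check based on the minimum lengths in the table above.</p>

```python
def route_dht_request(own_key: bytes, close_list: dict, packet: bytes):
    """Return ("handle", dht_packet), ("forward", address, packet) or ("drop",)."""
    if len(packet) < 32 + 72:
        return ("drop",)  # too short to hold a key and a DHT Packet
    addressee, dht_packet = packet[:32], packet[32:]
    if addressee == own_key:
        return ("handle", dht_packet)  # addressed to us: decrypt and handle
    if addressee in close_list:
        # Forward the packet unaltered to the addressee.
        return ("forward", close_list[addressee], packet)
    return ("drop",)
```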
<p>DHT request packets are used for DHT public key packets (see
<a href="#onion">onion</a>) and NAT ping packets.</p>
<h4 id="nat-ping-packets">NAT ping packets</h4>
<p>A NAT ping packet is sent as the payload of a DHT request packet.</p>
<p>We use NAT ping packets to see if a friend we are not connected to
directly is online and ready to do the hole punching.</p>
<h5 id="nat-ping-request">NAT ping request</h5>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> (0xfe)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> (0x00)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">8</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint64_t</code> random number</td>
</tr>
</tbody>
</table>
<h5 id="nat-ping-response">NAT ping response</h5>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> (0xfe)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> (0x01)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">8</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint64_t</code> random number (the same that was received in request)</td>
</tr>
</tbody>
</table>
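<p>The two NAT ping layouts differ only in their second byte. A minimal encoding sketch, assuming the random number is written big-endian and using hypothetical function names:</p>

```python
import secrets
import struct

NAT_PING_ID = 0xFE
NAT_PING_REQUEST = 0x00
NAT_PING_RESPONSE = 0x01


def encode_nat_ping(kind: int, number: int) -> bytes:
    """0xfe, the request/response byte, then the 8-byte number."""
    return bytes([NAT_PING_ID, kind]) + struct.pack(">Q", number)


def new_nat_ping_request() -> tuple[bytes, int]:
    """Build a request and return it with the number to expect in the reply."""
    number = secrets.randbits(64)
    return encode_nat_ping(NAT_PING_REQUEST, number), number


def nat_ping_response_for(request: bytes) -> bytes:
    """Echo the request's number back in a response."""
    if len(request) != 10 or request[0] != NAT_PING_ID or request[1] != NAT_PING_REQUEST:
        raise ValueError("not a NAT ping request")
    return bytes([NAT_PING_ID, NAT_PING_RESPONSE]) + request[2:]
```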
<p>TODO: handling these packets.</p>
<h4 id="effects-of-chosen-constants-on-performance">Effects of chosen constants on performance</h4>
<p>If the bucket size of the k-buckets were increased, it would increase
the number of packets needed to check if each node is still alive, which
would increase the bandwidth usage, but reliability would go up. If the
number of nodes were decreased, reliability would go down along with
bandwidth usage. The reason for this relationship between reliability
and number of nodes is that, if we assume that not every node has its
UDP ports open or is behind a cone NAT, each node that is reachable in
this way must be able to store a certain number of nodes behind
restrictive NATs in order for others to be able to find those nodes.
For example, if 7/8 of the nodes were behind restrictive NATs, using 8
nodes would not be enough because the chances of some of these nodes
being impossible to find in the network would be too high.</p>
<p>TODO(zugz): this seems a rather wasteful solution to this problem.</p>
<p>If the ping timeouts and delays between pings were higher, it would
decrease the bandwidth usage but increase the number of disconnected
nodes that are still being stored in the lists. Decreasing these delays
would do the opposite.</p>
<p>If the maximum size (8) of the DHT Search Entry Client Lists were
increased, it would increase the bandwidth usage, might increase hole
punching efficiency on symmetric NATs (more ports to guess from, see
Hole punching) and might increase the reliability. Lowering this number
would have the opposite effect.</p>
<p>The timeouts and number of nodes in lists for toxcore were picked by
feel alone and are probably not the best values. This also applies to
the behavior, which is simple and should be improved in order to make
the network more resistant to Sybil attacks.</p>
<p>TODO: consider giving min and max values for the constants.</p>
<h3 id="nats">NATs</h3>
<p>We assume that peers are either directly accessible or are behind one of
3 types of NAT:</p>
<p>Cone NATs: Assign one whole port to each UDP socket behind the NAT; any
packet from any IP/port sent to that assigned port from the internet
will be forwarded to the socket behind it.</p>
<p>Restricted Cone NATs: Assign one whole port to each UDP socket behind
the NAT. However, it will only forward packets from IPs that the UDP
socket has sent a packet to.</p>
<p>Symmetric NATs: The worst kind of NAT, they assign a new port for each
IP/port a packet is sent to. They treat each new peer you send a UDP
packet to as a <code class="language-plaintext highlighter-rouge">connection</code> and will only forward packets from the
IP/port of that <code class="language-plaintext highlighter-rouge">connection</code>.</p>
<h3 id="hole-punching">Hole punching</h3>
<p>Holepunching on normal cone NATs is achieved simply through the way in
which the DHT functions.</p>
<p>Hole punching is considered when more than half of the 8 peers closest
to the friend in the DHT return an IP/port for the friend, and we send a
ping request to each of the returned IP/ports but get no response. If we
have sent 4 ping requests to 4 IP/ports that supposedly belong to the
friend and get no response, then this is enough for toxcore to start the
hole punching. The numbers 8 and 4 are used in toxcore and were chosen
based on feel alone and so may not be the best numbers.</p>
<p>Before starting the hole punching, the peer will send a NAT ping packet
to the friend via the peers that say they know the friend. If a NAT ping
response with the same random number is received the hole punching will
start.</p>
<p>If a NAT ping request is received, we will first check if it is from a
friend. If it is not from a friend it will be dropped. If it is from a
friend, a response with the same 8 byte number as in the request will be
sent back via the nodes that know the friend sending the request. If no
nodes from the friend are known, the packet will be dropped.</p>
<p>Receiving a NAT ping response therefore means that the friend is both
online and actively searching for us, as that is the only way they would
know nodes that know us. This is important because hole punching will
work only if the friend is actively trying to connect to us.</p>
<p>NAT ping requests are sent every 3 seconds in toxcore. If no response is
received for 6 seconds, the hole punching will stop. Sending them at
longer intervals would decrease bandwidth usage, but might increase the
chance of the other node going offline and of ping packets sent during
the hole punching going to a dead peer. Decreasing the intervals will
have the opposite effect.</p>
<p>There are 2 cases that toxcore handles for the hole punching. The first
case is when each of the 4+ peers returned the same IP and port. The
second is when the 4+ peers returned the same IP but different ports.</p>
<p>A third case that may occur is the peers returning different IPs and
ports. This can only happen if the friend is behind a very restrictive
NAT that cannot be hole punched or if the peer recently connected to
another internet connection and some peers still have the old one
stored. Since there is nothing we can do for the first option it is
recommended to just use the most common IP returned by the peers and to
ignore the other IP/ports.</p>
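<p>Sorting the returned IP/ports into these cases could be done as in the sketch below. It is an illustration only: the function name and the case labels are hypothetical, and when several IPs are returned it simply keeps the most common one, as recommended above.</p>

```python
from collections import Counter


def classify_endpoints(endpoints):
    """endpoints: list of (ip, port) pairs returned by peers close to the friend."""
    if not endpoints:
        raise ValueError("no endpoints returned")
    # Keep only the most common IP and ignore entries with any other IP.
    ip, _ = Counter(e_ip for e_ip, _ in endpoints).most_common(1)[0]
    ports = sorted({port for e_ip, port in endpoints if e_ip == ip})
    case = "same_port" if len(ports) == 1 else "different_ports"
    return ip, ports, case
```

<p>The returned case then selects the strategy: keep pinging the single IP/port, or start guessing ports around the ones that were reported.</p>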
<p>When the peers return the same IP and port, it means that the friend
is on a restricted cone NAT. These kinds of NATs can be
hole punched by getting the friend to send a packet to our public
IP/port. This means that hole punching can be achieved easily and that
we should just continue sending DHT ping packets regularly to that
IP/port until we get a ping response. This will work because the friend
is searching for us in the DHT and will find us and will send us a
packet to our public IP/port (or try to with the hole punching), thereby
establishing a connection.</p>
<p>For the case where peers do not return the same ports, this means that
the other peer is on a symmetric NAT. Some symmetric NATs open ports in
sequences so the ports returned by the other peers might be something
like: 1345, 1347, 1389, 1395. The method to hole punch these NATs is to
try to guess which ports are more likely to be used by the other peer
when they try sending us ping requests and send some ping requests to
these ports. Toxcore just tries all the ports beside each returned port
(ex: for the 4 ports previously it would try: 1345, 1347, 1389, 1395,
1346, 1348, 1390, 1396, 1344, 1346…) getting gradually further and
further away and, although this works, the method could be improved.
When using this method toxcore will try up to 48 ports every 3 seconds
until both connect. After 5 tries toxcore doubles this and starts trying
ports from 1024 (48 each time) along with the previous port guessing.
This is because it was observed to fix hole punching for some symmetric
NATs, most likely because a lot of them restart their count at
1024.</p>
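<p>The guessing order in the example (the returned ports first, then the ports one above, one below, two above, and so on) can be reproduced with a short generator. This is a sketch that matches the example sequence, not toxcore's actual code; the function names are hypothetical, and the extra scan from port 1024 is left out.</p>

```python
from itertools import islice


def guess_ports(returned_ports):
    """Yield the returned ports, then ports +1, -1, +2, -2, ... away from each."""
    ports = list(returned_ports)
    if not ports:
        return
    yield from ports
    for distance in range(1, 65536):
        for sign in (1, -1):
            for port in ports:
                candidate = port + sign * distance
                if 0 < candidate < 65536:  # skip values that are not valid ports
                    yield candidate


def first_guesses(returned_ports, n):
    """Take the first n guesses, for example 48 per 3-second round."""
    return list(islice(guess_ports(returned_ports), n))
```

<p>Duplicates such as 1346 are not filtered out here, which mirrors the example sequence; a real implementation might prefer to skip ports it has already tried.</p>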
<p>Increasing the number of ports tried per second would make the hole
punching go faster but might DoS NATs due to the large number of packets
being sent to different IPs in a short amount of time. Decreasing it
would make the hole punching slower.</p>
<p>This works in cases where both peers have different NATs. For example,
if A and B are trying to connect to each other: A has a symmetric NAT
and B a restricted cone NAT. A will detect that B has a restricted cone
NAT and keep sending ping packets to B's one IP/port. B will detect that
A has a symmetric NAT and will send packets to it to try guessing A's
ports. If B manages to guess the port A is sending packets from, they
will connect.</p>
<h3 id="dht-bootstrap-info-0xf0">DHT Bootstrap Info (0xf0)</h3>
<p>Bootstrap nodes are regular Tox nodes with a stable DHT public key. This
means the DHT public key does not change across restarts. DHT bootstrap
nodes have one additional request kind: Bootstrap Info. The request is
simply a packet of length 78 bytes where the first byte is 0xf0. The
other bytes are ignored.</p>
<p>The response format is as follows:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Type</th>
<th style="text-align: left"><a href="#protocol-packet">Contents</a></th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">4</code></td>
<td style="text-align: left">Word32</td>
<td style="text-align: left">Bootstrap node version</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">256</code></td>
<td style="text-align: left">Bytes</td>
<td style="text-align: left">Message of the day</td>
</tr>
</tbody>
</table>
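<p>A client-side sketch of this exchange is shown below. Several details are assumptions rather than statements of the specification: the ignored request bytes are sent as zeros, the version is read as a big-endian 32-bit integer, and the payload passed to the parser is taken to be the response with its leading packet kind byte already removed. The function names are hypothetical.</p>

```python
import struct

BOOTSTRAP_INFO_ID = 0xF0
REQUEST_LENGTH = 78
MAX_MOTD_LENGTH = 256


def bootstrap_info_request() -> bytes:
    """78-byte request; only the first byte is significant."""
    return bytes([BOOTSTRAP_INFO_ID]) + bytes(REQUEST_LENGTH - 1)


def parse_bootstrap_info(payload: bytes) -> tuple[int, bytes]:
    """Split a response payload into (version, message of the day bytes)."""
    if len(payload) < 4:
        raise ValueError("payload too short for a version field")
    (version,) = struct.unpack(">I", payload[:4])
    return version, payload[4:4 + MAX_MOTD_LENGTH]
```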
<h2 id="lan-discovery">LAN discovery</h2>
<p>LAN discovery is a way to discover Tox peers that are on a local
network. If two Tox friends are on a local network, the most efficient
way for them to communicate together is to use the local network. If a
Tox client is opened on a local network in which another Tox client
exists then good behavior would be to bootstrap to the network using the
Tox client on the local network. This is what LAN discovery aims to
accomplish.</p>
<p>LAN discovery works by sending a UDP packet through the toxcore UDP
socket to the interface broadcast address on IPv4, the global broadcast
address (255.255.255.255) and the multicast address on IPv6 (FF02::1) on
the default Tox UDP port (33445).</p>
<p>The LAN Discovery packet:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> (33)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">DHT public key</td>
</tr>
</tbody>
</table>
<p>LAN Discovery packets contain the DHT public key of the sender. When a
LAN Discovery packet is received, a DHT get nodes packet will be sent to
the sender of the packet. This means that the DHT instance will
bootstrap itself to every peer from which it receives one of these
packets. Through this mechanism, Tox clients will bootstrap themselves
automatically from other Tox clients running on the local network.</p>
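<p>Building and parsing the LAN Discovery packet is straightforward. The sketch below uses hypothetical function names and leaves out the UDP broadcast itself.</p>

```python
LAN_DISCOVERY_ID = 33
TOX_DEFAULT_PORT = 33445  # destination port for the broadcast


def lan_discovery_packet(dht_public_key: bytes) -> bytes:
    """Packet kind byte followed by the sender's DHT public key."""
    if len(dht_public_key) != 32:
        raise ValueError("DHT public key must be 32 bytes")
    return bytes([LAN_DISCOVERY_ID]) + dht_public_key


def parse_lan_discovery(packet: bytes) -> bytes:
    """Return the DHT public key announced by a LAN Discovery packet."""
    if len(packet) != 33 or packet[0] != LAN_DISCOVERY_ID:
        raise ValueError("not a LAN Discovery packet")
    return packet[1:]
```

<p>The key returned by the parser is unauthenticated, so a receiver would use it only as the target for a Nodes Request, not as proof that the peer exists.</p>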
<p>When enabled, toxcore sends these packets every 10 seconds to keep
delays low. The packets could be sent up to every 60 seconds but this
would make peer finding over the network 6 times slower.</p>
<p>LAN discovery enables two friends on a local network to find each other
as the DHT prioritizes LAN addresses over non LAN addresses for DHT
peers. Sending a get node request/bootstrapping from a peer successfully
should also add them to the list of DHT peers if we are searching for
them. The peer must not be immediately added if a LAN discovery packet
with a DHT public key that we are searching for is received as there is
no cryptographic proof that this packet is legitimate and not
maliciously crafted. This means that a DHT get node or ping packet must
be sent, and a valid response must be received, before we can say that
this peer has been found.</p>
<p>LAN discovery is the mechanism Tox uses to find, and connect directly to, peers on the local network.</p>
<h2 id="messenger">Messenger</h2>
<p>Messenger is the module at the top of all the other modules. It sits on
top of <code class="language-plaintext highlighter-rouge">friend_connection</code> in the hierarchy of toxcore.</p>
<p>Messenger takes care of sending and receiving messages using the
connection provided by <code class="language-plaintext highlighter-rouge">friend_connection</code>. The module provides a way
for friends to connect and makes it usable as an instant messenger. For
example, Messenger lets users set a nickname and status message which it
then transmits to friends when they are online. It also allows users to
send messages to friends and builds an instant messaging system on top
of the lower level <code class="language-plaintext highlighter-rouge">friend_connection</code> module.</p>
<p>Messenger offers two methods to add a friend. The first way is to add a
friend with only their long term public key. This is used when a friend
needs to be added but for some reason a friend request should not be
sent: the friend should only be added. This method is most commonly used
to accept friend requests but could also be used in other ways. If two
friends add each other using this function they will connect to each
other. Adding a friend using this method just adds the friend to
<code class="language-plaintext highlighter-rouge">friend_connection</code> and creates a new friend entry in Messenger for the
friend.</p>
<p>The Tox ID is used to identify peers so that they can be added as
friends in Tox. In order to add a friend, a Tox user must have the
friend's Tox ID. The Tox ID contains the long term public key of the
peer (32 bytes) followed by the 4 byte nospam (see: <code class="language-plaintext highlighter-rouge">friend_requests</code>)
value and a 2 byte XOR checksum. The method of sending the Tox ID to
others is up to the user and the client but the recommended way is to
encode it in hexadecimal format and have the user manually send it to
the friend using another program.</p>
<p>Tox ID:</p>
<p><img src="res/images/tox-id.png" alt="Tox ID" /></p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">long term public key</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">4</code></td>
<td style="text-align: left">nospam</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">2</code></td>
<td style="text-align: left">checksum</td>
</tr>
</tbody>
</table>
<p>The checksum is calculated by XORing the first two bytes of the ID with
the next two bytes, then the next two bytes until all the 36 bytes have
been XORed together. The result is then appended to the end to form the
Tox ID.</p>
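<p>The checksum rule translates directly into code. The function names below are hypothetical, and rendering the ID as upper-case hexadecimal is a presentation choice assumed for the example.</p>

```python
def tox_id_checksum(public_key: bytes, nospam: bytes) -> bytes:
    """XOR the 36 bytes together two bytes at a time."""
    if len(public_key) != 32 or len(nospam) != 4:
        raise ValueError("expected a 32-byte public key and a 4-byte nospam")
    data = public_key + nospam
    checksum = [0, 0]
    for i, byte in enumerate(data):
        checksum[i % 2] ^= byte  # even offsets feed byte 0, odd offsets byte 1
    return bytes(checksum)


def make_tox_id(public_key: bytes, nospam: bytes) -> str:
    """Public key, nospam and checksum as a 76-character hex string."""
    raw = public_key + nospam + tox_id_checksum(public_key, nospam)
    return raw.hex().upper()
```

<p>A convenient consequence is that XORing all 19 two-byte pairs of a complete Tox ID gives zero, which is a quick way to validate an ID typed in by a user.</p>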
<p>The user must make sure the Tox ID is not intercepted and replaced in
transit by a different Tox ID, which would mean the friend would connect
to a malicious person instead of the user. Taking reasonable precautions
against this is the user's responsibility, as it is outside the scope of
Tox. Tox assumes that the user has ensured that they are using the
correct Tox ID, belonging to the intended person, to add a friend.</p>
<p>The second method to add a friend is by using their Tox ID and a message
to be sent in a friend request. This way of adding friends will try to
send a friend request, with the set message, to the peer whose Tox ID
was added. The method is similar to the first one, except that a friend
request is crafted and sent to the other peer.</p>
<p>When a friend connection associated to a Messenger friend goes online,
an ONLINE packet will be sent to them. Friends are only set as online if
an ONLINE packet is received.</p>
<p>As soon as a friend goes online, Messenger will stop sending friend
requests to that friend, if it was sending them, as they are redundant
for this friend.</p>
<p>Friends will be set as offline if either the friend connection
associated to them goes offline or if an OFFLINE packet is received from
the friend.</p>
<p>Messenger packets are sent to the friend using the online friend
connection to the friend.</p>
<p>Should Messenger need to check whether any of the non lossy packets in
the following list were received by the friend, for example to implement
receipts for text messages, <code class="language-plaintext highlighter-rouge">net_crypto</code> can be used. The <code class="language-plaintext highlighter-rouge">net_crypto</code>
packet number, used to send the packets, should be noted and then
<code class="language-plaintext highlighter-rouge">net_crypto</code> checked later to see if the bottom of the send array is
after this packet number. If it is, then the friend has received them.
Note that <code class="language-plaintext highlighter-rouge">net_crypto</code> packet numbers could overflow after a long time,
so checks should happen within 2**32 <code class="language-plaintext highlighter-rouge">net_crypto</code> packets sent with
the same friend connection.</p>
<p>Message receipts for action messages and normal text messages are
implemented by adding the <code class="language-plaintext highlighter-rouge">net_crypto</code> packet number of each message,
along with the receipt number, to the top of a linked list that each
friend has as they are sent. Every Messenger loop, the entries are read
from the bottom, and entries are removed and passed to the client until
an entry is reached that refers to a packet not yet received by the
friend, at which point processing stops.</p>
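<p>The receipt bookkeeping can be sketched with a double-ended queue standing in for the linked list. Everything named here is hypothetical, and the wrap-around comparison of 32-bit packet numbers is an assumption about how "after this packet number" could be evaluated safely.</p>

```python
from collections import deque

PACKET_NUMBER_MODULUS = 2 ** 32


def is_delivered(send_array_bottom: int, packet_number: int) -> bool:
    """True when the bottom of the send array has moved past packet_number."""
    ahead = (send_array_bottom - packet_number) % PACKET_NUMBER_MODULUS
    return 0 < ahead < PACKET_NUMBER_MODULUS // 2


class ReceiptQueue:
    """Per-friend queue of (packet number, receipt number) pairs."""

    def __init__(self):
        self.entries = deque()  # oldest entry on the left (the "bottom")

    def message_sent(self, packet_number: int, receipt_number: int) -> None:
        self.entries.append((packet_number, receipt_number))

    def poll(self, send_array_bottom: int) -> list:
        """Called every loop: return receipt numbers that are now confirmed."""
        delivered = []
        while self.entries and is_delivered(send_array_bottom, self.entries[0][0]):
            delivered.append(self.entries.popleft()[1])
        return delivered
```

<p>Because entries are appended in send order, the loop can stop at the first unconfirmed packet without scanning the rest of the queue.</p>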
<p>List of Messenger packets:</p>
<h3 id="online"><code class="language-plaintext highlighter-rouge">ONLINE</code></h3>
<p>length: 1 byte</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> (0x18)</td>
</tr>
</tbody>
</table>
<p>Sent to a friend when a connection is established to tell them to mark
us as online in their friends list. This packet and the OFFLINE packet
are necessary as <code class="language-plaintext highlighter-rouge">friend_connections</code> can be established with
non-friends who are part of a groupchat. The two packets are used to
differentiate between these peers, connected to the user through
groupchats, and actual friends who ought to be marked as online in the
friendlist.</p>
<p>On receiving this packet, Messenger will show the peer as being online.</p>
<h3 id="offline"><code class="language-plaintext highlighter-rouge">OFFLINE</code></h3>
<p>length: 1 byte</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> (0x19)</td>
</tr>
</tbody>
</table>
<p>Sent to a friend when deleting the friend. Prevents a deleted friend
from seeing us as online if we are connected to them because of a group
chat.</p>
<p>On receiving this packet, Messenger will show this peer as offline.</p>
<h3 id="nickname"><code class="language-plaintext highlighter-rouge">NICKNAME</code></h3>
<p>length: 1 byte to 129 bytes.</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> (0x30)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">[0, 128]</code></td>
<td style="text-align: left">Nickname as a UTF8 byte string</td>
</tr>
</tbody>
</table>
<p>Used to send the nickname of the peer to others. This packet should be
sent to each friend every time they come online and each time the
nickname is changed.</p>
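<p>The layout above could be encoded as in the following minimal sketch. It
is an illustration only, not toxcore's implementation; the function name
<code class="language-plaintext highlighter-rouge">encode_nickname</code> and the constant names are hypothetical.</p>

```python
PACKET_ID_NICKNAME = 0x30   # packet ID from the table above
MAX_NICKNAME_LENGTH = 128   # upper bound of the [0, 128] length range


def encode_nickname(nickname: str) -> bytes:
    """Build a NICKNAME packet: one ID byte followed by the UTF-8 name."""
    payload = nickname.encode("utf-8")
    # The limit applies to the encoded byte length, not the character count.
    if len(payload) > MAX_NICKNAME_LENGTH:
        raise ValueError("nickname exceeds 128 bytes when encoded as UTF-8")
    return bytes([PACKET_ID_NICKNAME]) + payload
```

<p>An empty nickname is allowed by the length range and yields a 1 byte
packet containing only the packet ID.</p>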
<h3 id="statusmessage"><code class="language-plaintext highlighter-rouge">STATUSMESSAGE</code></h3>
<p>length: 1 byte to 1008 bytes.</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> (0x31)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">[0, 1007]</code></td>
<td style="text-align: left">Status message as a UTF8 byte string</td>
</tr>
</tbody>
</table>
<p>Used to send the status message of the peer to others. This packet
should be sent to each friend every time they come online and each time
the status message is changed.</p>
<h3 id="userstatus"><code class="language-plaintext highlighter-rouge">USERSTATUS</code></h3>
<p>length: 2 bytes</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> (0x32)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> status (0 = online, 1 = away, 2 = busy)</td>
</tr>
</tbody>
</table>
<p>Used to send the user status of the peer to others. This packet should
be sent to each friend every time they come online and each time the
user status is changed.</p>
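<p>A receiver could parse this packet roughly as sketched below. The names
<code class="language-plaintext highlighter-rouge">UserStatus</code> and <code class="language-plaintext highlighter-rouge">decode_userstatus</code> are hypothetical, and rejecting
status values above 2 is an assumption of this sketch rather than
something the text above mandates.</p>

```python
from enum import IntEnum

PACKET_ID_USERSTATUS = 0x32  # packet ID from the table above


class UserStatus(IntEnum):
    ONLINE = 0
    AWAY = 1
    BUSY = 2


def decode_userstatus(packet: bytes) -> UserStatus:
    """Parse a 2 byte USERSTATUS packet and return the status value."""
    if len(packet) != 2 or packet[0] != PACKET_ID_USERSTATUS:
        raise ValueError("not a USERSTATUS packet")
    # UserStatus() raises ValueError for values outside 0..2.
    return UserStatus(packet[1])
```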
<h3 id="typing"><code class="language-plaintext highlighter-rouge">TYPING</code></h3>
<p>length: 2 bytes</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> (0x33)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> typing status (0 = not typing, 1 = typing)</td>
</tr>
</tbody>
</table>
<p>Used to tell a friend whether the user is currently typing or not.</p>
<h3 id="message"><code class="language-plaintext highlighter-rouge">MESSAGE</code></h3>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> (0x40)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">[0, 1372]</code></td>
<td style="text-align: left">Message as a UTF8 byte string</td>
</tr>
</tbody>
</table>
<p>Used to send a normal text message to the friend.</p>
<h3 id="action"><code class="language-plaintext highlighter-rouge">ACTION</code></h3>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> (0x41)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">[0, 1372]</code></td>
<td style="text-align: left">Action message as a UTF8 byte string</td>
</tr>
</tbody>
</table>
<p>Used to send an action message (like an IRC action) to the friend.</p>
<h3 id="msi"><code class="language-plaintext highlighter-rouge">MSI</code></h3>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> (0x45)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">?</code></td>
<td style="text-align: left">data</td>
</tr>
</tbody>
</table>
<p>Reserved for Tox AV usage.</p>
<h3 id="file-transfer-related-packets">File Transfer Related Packets</h3>
<h4 id="file_sendrequest"><code class="language-plaintext highlighter-rouge">FILE_SENDREQUEST</code></h4>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> (0x50)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> file number</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">4</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint32_t</code> file type</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">8</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint64_t</code> file size</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">file id (32 bytes)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">[0, 255]</code></td>
<td style="text-align: left">filename as a UTF8 byte string</td>
</tr>
</tbody>
</table>
<p>Note that file type and file size are sent in big endian/network byte
format.</p>
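<p>The field order and byte order above can be illustrated with Python's
<code class="language-plaintext highlighter-rouge">struct</code> module. This is a sketch under the layout given in the table;
<code class="language-plaintext highlighter-rouge">encode_file_sendrequest</code> and the constant names are hypothetical and do
not correspond to toxcore functions.</p>

```python
import struct

PACKET_ID_FILE_SENDREQUEST = 0x50
FILE_ID_LENGTH = 32
MAX_FILENAME_LENGTH = 255
# ">" selects big endian with no padding:
# B = packet ID, B = file number, I = uint32 file type, Q = uint64 file size
_HEADER = struct.Struct(">BBIQ")


def encode_file_sendrequest(file_number: int, file_type: int, file_size: int,
                            file_id: bytes, filename: str) -> bytes:
    """Build a FILE_SENDREQUEST packet from its five fields."""
    name = filename.encode("utf-8")
    if len(file_id) != FILE_ID_LENGTH:
        raise ValueError("file id must be exactly 32 bytes")
    if len(name) > MAX_FILENAME_LENGTH:
        raise ValueError("filename exceeds 255 bytes when encoded as UTF-8")
    header = _HEADER.pack(PACKET_ID_FILE_SENDREQUEST, file_number,
                          file_type, file_size)
    return header + file_id + name
```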
<h4 id="file_control"><code class="language-plaintext highlighter-rouge">FILE_CONTROL</code></h4>
<p>length: 4 bytes if <code class="language-plaintext highlighter-rouge">control_type</code> isn't seek, 12 bytes (the 4 header
bytes plus the 8 byte seek parameter) if <code class="language-plaintext highlighter-rouge">control_type</code> is seek.</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> (0x51)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> <code class="language-plaintext highlighter-rouge">send_receive</code></td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> file number</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> <code class="language-plaintext highlighter-rouge">control_type</code></td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">8</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint64_t</code> seek parameter</td>
</tr>
</tbody>
</table>
<p><code class="language-plaintext highlighter-rouge">send_receive</code> is 0 if the control targets a file being sent (by the
peer sending the file control), and 1 if it targets a file being
received.</p>
<p><code class="language-plaintext highlighter-rouge">control_type</code> can be one of: 0 = accept, 1 = pause, 2 = kill, 3 = seek.</p>
<p>The seek parameter is only included when <code class="language-plaintext highlighter-rouge">control_type</code> is seek (3).</p>
<p>Note that if it is included the seek parameter will be sent in big
endian/network byte format.</p>
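<p>Following the field table above, a seek packet consists of the 4 header
bytes plus an 8 byte big endian position, and every other control type
is exactly 4 bytes. The sketch below encodes and decodes both forms; the
names <code class="language-plaintext highlighter-rouge">ControlType</code>, <code class="language-plaintext highlighter-rouge">encode_file_control</code> and
<code class="language-plaintext highlighter-rouge">decode_file_control</code> are hypothetical.</p>

```python
import struct
from enum import IntEnum

PACKET_ID_FILE_CONTROL = 0x51


class ControlType(IntEnum):
    ACCEPT = 0
    PAUSE = 1
    KILL = 2
    SEEK = 3


def encode_file_control(send_receive, file_number, control_type,
                        seek_position=None):
    """Build a FILE_CONTROL packet; seek_position is required only for SEEK."""
    control_type = ControlType(control_type)
    if send_receive not in (0, 1):
        raise ValueError("send_receive must be 0 or 1")
    header = bytes([PACKET_ID_FILE_CONTROL, send_receive, file_number,
                    control_type])
    if control_type == ControlType.SEEK:
        if seek_position is None:
            raise ValueError("seek requires a position")
        return header + struct.pack(">Q", seek_position)
    if seek_position is not None:
        raise ValueError("only seek carries a position")
    return header


def decode_file_control(packet: bytes):
    """Return (send_receive, file_number, control_type, seek_position)."""
    if len(packet) < 4 or packet[0] != PACKET_ID_FILE_CONTROL:
        raise ValueError("not a FILE_CONTROL packet")
    control_type = ControlType(packet[3])
    expected = 12 if control_type == ControlType.SEEK else 4
    if len(packet) != expected:
        raise ValueError("unexpected FILE_CONTROL length")
    position = None
    if control_type == ControlType.SEEK:
        (position,) = struct.unpack(">Q", packet[4:])
    return packet[1], packet[2], control_type, position
```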
<h4 id="file_data"><code class="language-plaintext highlighter-rouge">FILE_DATA</code></h4>
<p>length: 2 to 1373 bytes.</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> (0x52)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> file number</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">[0, 1371]</code></td>
<td style="text-align: left">file data piece</td>
</tr>
</tbody>
</table>
<p>Files are transferred in Tox using file transfers.</p>
<p>To initiate a file transfer, the sender creates and sends a
<code class="language-plaintext highlighter-rouge">FILE_SENDREQUEST</code> packet to the friend it wants to send the file
to.</p>
<p>The first part of the <code class="language-plaintext highlighter-rouge">FILE_SENDREQUEST</code> packet is the file number. The
file number is the number used to identify this file transfer. As the
file number is represented by a 1 byte number, the maximum number of
concurrent files Tox can send to a friend is 256. 256 file transfers per
friend is enough that clients can use tricks like queueing files if
there are more files needing to be sent.</p>
<p>256 outgoing files per friend means that there is a maximum of 512
concurrent file transfers, between two users, if both incoming and
outgoing file transfers are counted together.</p>
<p>As file numbers are used to identify the file transfer, the Tox instance
must make sure to use a file number that isn't used for another outgoing
file transfer to that same friend when creating a new outgoing file
transfer. File numbers are chosen by the file sender and stay unchanged
for the entire duration of the file transfer. The file number is used by
both <code class="language-plaintext highlighter-rouge">FILE_CONTROL</code> and <code class="language-plaintext highlighter-rouge">FILE_DATA</code> packets to identify which file
transfer these packets are for.</p>
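<p>One way to satisfy the uniqueness requirement is to track, per friend,
which of the 256 possible file numbers are in use. The class below is a
hypothetical sketch; picking the lowest free number is a choice made
here for simplicity, not something the protocol requires.</p>

```python
MAX_CONCURRENT_FILE_TRANSFERS = 256  # file numbers fit in one byte


class OutgoingFileNumbers:
    """Tracks file numbers in use for outgoing transfers to one friend."""

    def __init__(self):
        self._in_use = set()

    def allocate(self):
        """Return a free file number, or None if all 256 are taken."""
        for number in range(MAX_CONCURRENT_FILE_TRANSFERS):
            if number not in self._in_use:
                self._in_use.add(number)
                return number
        return None

    def release(self, number: int) -> None:
        """Free a file number once its transfer is finished or killed."""
        self._in_use.discard(number)
```

<p>When <code class="language-plaintext highlighter-rouge">allocate</code> returns <code class="language-plaintext highlighter-rouge">None</code>, a client could queue the file and retry
after another transfer to that friend completes.</p>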
<p>The second part of the file transfer request is the file type. This is
simply a number that identifies the type of file. For example, tox.h
defines the file type 0 as being a normal file and type 1 as being an
avatar, meaning the Tox client should use that file as an avatar. The
file type does not affect in any way how the file is transferred or the
behavior of the file transfer. It is set by the Tox client that creates
the file transfer and sent to the friend untouched.</p>
<p>The file size indicates the total size of the file that will be
transferred. A file size of <code class="language-plaintext highlighter-rouge">UINT64_MAX</code> (maximum value in a <code class="language-plaintext highlighter-rouge">uint64_t</code>)
means that the size of the file is undetermined or unknown. For example
if someone wanted to use Tox file transfers to stream data they would
set the file size to <code class="language-plaintext highlighter-rouge">UINT64_MAX</code>. A file size of 0 is valid and behaves
exactly like a normal file transfer.</p>
<p>The file id is 32 bytes that can be used to uniquely identify the file
transfer. For example, avatar transfers use it as the hash of the avatar
so that the receiver can check if they already have the avatar for a
friend which saves bandwidth. It is also used to identify broken file
transfers across toxcore restarts (for more info see the file transfer
section of tox.h). The file transfer implementation does not care about
what the file id is, as it is only used by things above it.</p>
<p>The last part of the file transfer request is the optional file name which is
used to tell the receiver the name of the file.</p>
<p>When a <code class="language-plaintext highlighter-rouge">FILE_SENDREQUEST</code> packet is received, the implementation
validates and sends the info to the Tox client which decides whether
they should accept the file transfer or not.</p>
<p>To refuse or cancel a file transfer, they will send a <code class="language-plaintext highlighter-rouge">FILE_CONTROL</code>
packet with <code class="language-plaintext highlighter-rouge">control_type</code> 2 (kill).</p>
<p><code class="language-plaintext highlighter-rouge">FILE_CONTROL</code> packets are used to control the file transfer.
<code class="language-plaintext highlighter-rouge">FILE_CONTROL</code> packets are used to accept/unpause, pause, kill/cancel
and seek file transfers. The <code class="language-plaintext highlighter-rouge">control_type</code> parameter denotes what the
file control packet does.</p>
<p>The <code class="language-plaintext highlighter-rouge">send_receive</code> and file number are used to identify a specific file
transfer. Since file numbers for outgoing and incoming files are not
related to each other, the <code class="language-plaintext highlighter-rouge">send_receive</code> parameter is used to identify
if the file number belongs to files being sent or files being received.
If <code class="language-plaintext highlighter-rouge">send_receive</code> is 0, the file number corresponds to a file being sent
by the user sending the file control packet. If <code class="language-plaintext highlighter-rouge">send_receive</code> is 1, it
corresponds to a file being received by the user sending the file
control packet.</p>
<p><code class="language-plaintext highlighter-rouge">control_type</code> indicates the purpose of the <code class="language-plaintext highlighter-rouge">FILE_CONTROL</code> packet.
<code class="language-plaintext highlighter-rouge">control_type</code> of 0 means that the <code class="language-plaintext highlighter-rouge">FILE_CONTROL</code> packet is used to tell
the friend that the file transfer is accepted or that we are unpausing a
previously paused (by us) file transfer. <code class="language-plaintext highlighter-rouge">control_type</code> of 1 is used to
tell the other to pause the file transfer.</p>
<p>If one party pauses a file transfer, that party must be the one to
unpause it. Should both sides pause a file transfer, both sides must
unpause it before the file can be resumed. For example, if the sender
pauses the file transfer, the receiver must not be able to unpause it.
To unpause a file transfer, <code class="language-plaintext highlighter-rouge">control_type</code> 0 is used. Files can only be
paused when they are in progress and have been accepted.</p>
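<p>The pause rule can be modelled with one flag per side, as in the
hypothetical sketch below. A transfer only makes progress when neither
flag is set, and a control packet from the peer can only ever clear the
peer's own flag.</p>

```python
CONTROL_ACCEPT = 0  # also used to unpause
CONTROL_PAUSE = 1


class PauseState:
    """Tracks who has paused an accepted, in-progress file transfer."""

    def __init__(self):
        self.paused_by_us = False
        self.paused_by_peer = False

    def local_pause(self) -> None:
        self.paused_by_us = True     # we would send control_type 1

    def local_unpause(self) -> None:
        self.paused_by_us = False    # we would send control_type 0

    def on_peer_control(self, control_type: int) -> None:
        # The peer's packets only affect the peer's pause flag, so the
        # peer cannot unpause a transfer that we paused.
        if control_type == CONTROL_PAUSE:
            self.paused_by_peer = True
        elif control_type == CONTROL_ACCEPT:
            self.paused_by_peer = False

    @property
    def can_transfer(self) -> bool:
        return not (self.paused_by_us or self.paused_by_peer)
```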
<p><code class="language-plaintext highlighter-rouge">control_type</code> 2 is used to kill, cancel or refuse a file transfer. When
a <code class="language-plaintext highlighter-rouge">FILE_CONTROL</code> packet with this control type is received, the targeted
file transfer is considered dead, will immediately be wiped and its file
number can be reused. The peer sending the <code class="language-plaintext highlighter-rouge">FILE_CONTROL</code> must also wipe
the targeted file transfer from their side. This control type can be used
by both sides of the transfer at any time.</p>
<p><code class="language-plaintext highlighter-rouge">control_type</code> 3, the seek control type, is used to tell the sender of
the file to start sending from a different index in the file than 0. It
can only be used right after receiving a <code class="language-plaintext highlighter-rouge">FILE_SENDREQUEST</code> packet and
before accepting the file by sending a <code class="language-plaintext highlighter-rouge">FILE_CONTROL</code> with
<code class="language-plaintext highlighter-rouge">control_type</code> 0. When this <code class="language-plaintext highlighter-rouge">control_type</code> is used, an extra 8 byte
number in big endian format is appended to the <code class="language-plaintext highlighter-rouge">FILE_CONTROL</code> that is
not present with other control types. This number indicates the index in
bytes from the beginning of the file at which the file sender should
start sending the file. The goal of this control type is to ensure that
files can be resumed across core restarts. Tox clients can know if they
have received a part of a file by using the file id and then using this
packet to tell the other side to start sending from the last received
byte. If the seek position is greater than or equal to the size of the file,
the seek packet is invalid and the one receiving it will discard it.</p>
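<p>On the file sender's side, the two validity conditions for a seek (it
arrives before the accept, and the position lies inside the file) might
be checked as follows. The function name and its parameters are
hypothetical.</p>

```python
from typing import Optional


def apply_seek(file_size: int, accepted: bool, position: int) -> Optional[int]:
    """Return the new start offset, or None if the seek must be discarded."""
    if accepted:
        # Seek is only valid before the receiver has sent control_type 0.
        return None
    if position >= file_size:
        # A position at or past the end of the file is invalid.
        return None
    return position
```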
<p>To accept a file Tox will therefore send a seek packet, if it is needed,
and then send a <code class="language-plaintext highlighter-rouge">FILE_CONTROL</code> packet with <code class="language-plaintext highlighter-rouge">control_type</code> 0 (accept) to
tell the file sender that the file was accepted.</p>
<p>Once the file transfer is accepted, the file sender will start sending
file data in sequential chunks from the beginning of the file (or the
position from the <code class="language-plaintext highlighter-rouge">FILE_CONTROL</code> seek packet if one was received).</p>
<p>File data is sent using <code class="language-plaintext highlighter-rouge">FILE_DATA</code> packets. The file number corresponds
to the file transfer that the file chunks belong to. The receiver
assumes that the file transfer is over as soon as a chunk with the file
data size not equal to the maximum size (1371 bytes) is received. This
is how the sender tells the receiver that the file transfer is complete
in file transfers where the size of the file is unknown (set to
<code class="language-plaintext highlighter-rouge">UINT64_MAX</code>). The receiver also assumes that if the amount of received
data equals the file size received in the <code class="language-plaintext highlighter-rouge">FILE_SENDREQUEST</code>, the
file sending is finished and has been successfully received. Immediately
after this occurs, the receiver frees up the file number so that a new
incoming file transfer can use that file number. The implementation
should discard any extra data received that exceeds the file
size received at the beginning.</p>
<p>In 0 filesize file transfers, the sender will send one <code class="language-plaintext highlighter-rouge">FILE_DATA</code>
packet with a file data size of 0.</p>
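<p>The receiver's completion rules can be sketched as below: a short chunk
ends the transfer, reaching the announced size ends the transfer, and
bytes beyond the announced size are dropped. The class name and its
attributes are hypothetical, and the sketch ignores file numbers and
packet framing.</p>

```python
MAX_FILE_DATA_SIZE = 1371   # largest file data piece in one FILE_DATA packet
UINT64_MAX = 2**64 - 1      # file size meaning "unknown"


class IncomingTransfer:
    """Receiver-side bookkeeping for one accepted file transfer."""

    def __init__(self, file_size: int, start: int = 0):
        self.file_size = file_size
        self.received = start    # non-zero if a seek was sent before accepting
        self.finished = False

    def on_chunk(self, chunk: bytes) -> bytes:
        """Process one file data piece and return the bytes to keep."""
        if self.finished:
            return b""
        kept = chunk
        if self.file_size != UINT64_MAX:
            # Discard anything beyond the size announced in the request.
            kept = chunk[: self.file_size - self.received]
        self.received += len(kept)
        if len(chunk) != MAX_FILE_DATA_SIZE or self.received == self.file_size:
            self.finished = True
        return kept
```

<p>A zero size transfer finishes on its single empty chunk, because an
empty chunk is shorter than the maximum size.</p>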
<p>The sender will know if the receiver has received the file successfully
by checking if the friend has received the last <code class="language-plaintext highlighter-rouge">FILE_DATA</code> packet sent
(containing the last chunk of the file). <code class="language-plaintext highlighter-rouge">net_crypto</code> can be used to
check whether packets sent through it have been received by storing the
packet number of the sent packet and verifying later in <code class="language-plaintext highlighter-rouge">net_crypto</code> to
see whether it was received or not. As soon as <code class="language-plaintext highlighter-rouge">net_crypto</code> says the
other received the packet, the file transfer is considered successful,
wiped and the file number can be reused to send new files.</p>
<p><code class="language-plaintext highlighter-rouge">FILE_DATA</code> packets should be sent as fast as the <code class="language-plaintext highlighter-rouge">net_crypto</code>
connection can handle it respecting its congestion control.</p>
<p>If the friend goes offline, all file transfers are cleared in toxcore.
This makes it simpler for toxcore as it does not have to deal with
resuming file transfers. It also makes it simpler for clients as the
method for resuming file transfers remains the same, even if the client
is restarted or toxcore loses the connection to the friend because of a
bad internet connection.</p>
<h3 id="group-chat-related-packets">Group Chat Related Packets</h3>
<table>
<thead>
<tr>
<th style="text-align: left">Packet ID</th>
<th style="text-align: left">Packet Name</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left">0x60</td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">INVITE_GROUPCHAT</code></td>
</tr>
<tr>
<td style="text-align: left">0x61</td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">ONLINE_PACKET</code></td>
</tr>
<tr>
<td style="text-align: left">0x62</td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">DIRECT_GROUPCHAT</code></td>
</tr>
<tr>
<td style="text-align: left">0x63</td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">MESSAGE_GROUPCHAT</code></td>
</tr>
<tr>
<td style="text-align: left">0xC7</td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">LOSSY_GROUPCHAT</code></td>
</tr>
</tbody>
</table>
<p>Messenger also takes care of saving the friends list and other friend
information so that it's possible to close and start toxcore while
keeping all your friends, your long term key and the information
necessary to reconnect to the network.</p>
<p>Important information Messenger stores includes: the long term private
key, our current nospam value, our friends' public keys and any friend
requests the user is currently sending. The network DHT nodes, TCP
relays and some onion nodes are stored to aid reconnection.</p>
<p>In addition to this, a lot of optional data can be stored, such as the
usernames of friends, our current username, status messages of friends,
our status message, etc. The exact format of the
toxcore save is explained later.</p>
<p>The TCP server is run from the toxcore messenger module if the client
has enabled it. TCP server is usually run independently as part of the
bootstrap node package but it can be enabled in clients. If it is
enabled in toxcore, Messenger will add the running TCP server to the TCP
relay.</p>
<p>Messenger is the module that transforms code that can connect to friends
based on public key into a real instant messenger.</p>
<h2 id="tcp-client">TCP client</h2>
<p><code class="language-plaintext highlighter-rouge">TCP client</code> is the client for the TCP server. It establishes and keeps
a connection to the TCP server open.</p>
<p>All the packet formats are explained in detail in <code class="language-plaintext highlighter-rouge">TCP server</code> so this
section will only cover <code class="language-plaintext highlighter-rouge">TCP client</code> specific details which are not
covered in the <code class="language-plaintext highlighter-rouge">TCP server</code> documentation.</p>
<p>TCP clients can choose to connect to TCP servers through a proxy. Most
common types of proxies (SOCKS, HTTP) work by establishing a connection
through a proxy using the protocol of that specific type of proxy. After
the connection through that proxy to a TCP server is established, the
socket behaves from the point of view of the application exactly like a
TCP socket that connects directly to a TCP server instance. This means
supporting proxies is easy.</p>
<p><code class="language-plaintext highlighter-rouge">TCP client</code> first establishes a TCP connection, either through a proxy
or directly to a TCP server. It uses the DHT public key as its long term
key when connecting to the TCP server.</p>
<p>It establishes a secure connection to the TCP server. After establishing
a connection to the TCP server, and when the handshake response has been
received from the TCP server, the toxcore implementation immediately
sends a ping packet. Ideally the first packets sent would be routing
request packets but this solution aids code simplicity and allows the
server to confirm the connection.</p>
<p>Ping packets, like all other data packets, are sent as encrypted
packets.</p>
<p>Ping packets are sent by the toxcore TCP client every 30 seconds with a
timeout of 10 seconds, the same interval and timeout as toxcore TCP
server ping packets. They are the same because they accomplish the same
thing.</p>
<p><code class="language-plaintext highlighter-rouge">TCP client</code> must have a mechanism to make sure important packets
(routing requests, disconnection notifications, ping packets, ping
response packets) don't get dropped because the TCP socket is full.
Should this happen, the TCP client must save these packets and
prioritize sending them, in order, when the TCP socket on the server
becomes available for writing again. <code class="language-plaintext highlighter-rouge">TCP client</code> must also take into
account that packets might be bigger than the number of bytes it can
currently write to the socket. In this case, it must save the bytes of
the packet that it didn't write to the socket and write them to the
socket as soon as the socket allows so that the connection does not get
broken. It must also assume that it may receive only part of an
encrypted packet. If this occurs it must save the part of the packet it
has received and wait for the rest of the packet to arrive before
handling it.</p>
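<p>The sending half of these requirements could be organised as in the
sketch below, which keeps the unsent tail of a partially written packet
and an ordered queue of important packets. It is not toxcore's code: the
class is hypothetical, and <code class="language-plaintext highlighter-rouge">send</code> is assumed to be a callable that, like a
non-blocking socket write, returns how many bytes it accepted.</p>

```python
from collections import deque


class OutgoingQueue:
    """Buffers partial writes and important packets for one TCP socket."""

    def __init__(self, send):
        self._send = send          # callable(bytes) -> number of bytes accepted
        self._partial = b""        # unsent tail of the packet being written
        self._priority = deque()   # important packets, in sending order

    def _write(self, packet: bytes) -> None:
        sent = self._send(packet)
        self._partial = packet[sent:]

    def flush(self) -> None:
        """Call whenever the socket becomes writable again."""
        # Finish the interrupted packet first so the stream stays valid.
        if self._partial:
            self._write(self._partial)
        while not self._partial and self._priority:
            self._write(self._priority.popleft())

    def send_priority(self, packet: bytes) -> None:
        """Queue a packet that must never be dropped (e.g. a ping)."""
        self._priority.append(packet)
        self.flush()

    def send_data(self, packet: bytes) -> bool:
        """Try to send a droppable packet; False means the socket is busy."""
        self.flush()
        if self._partial or self._priority:
            return False
        self._write(packet)
        return True
```

<p>The receiving half would mirror this by accumulating bytes until a
whole encrypted packet is available before handling it.</p>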
<p><code class="language-plaintext highlighter-rouge">TCP client</code> can be used to open up a route to friends who are connected
to the TCP server. This is done by sending a routing request to the TCP
server with the DHT public key of the friend. This tells the server to
register a <code class="language-plaintext highlighter-rouge">connection_id</code> to the DHT public key sent in the packet. The
server will then respond with a routing response packet. If the
connection was accepted, the <code class="language-plaintext highlighter-rouge">TCP client</code> will store the <code class="language-plaintext highlighter-rouge">connection_id</code>
for this connection. The <code class="language-plaintext highlighter-rouge">TCP client</code> will make sure that routing
response packets are responses to a routing packet that it sent by
storing that it sent a routing packet to that public key and checking
the response against it. This prevents the possibility of a bad TCP
server exploiting the client.</p>
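<p>The check against unsolicited routing responses amounts to remembering
which public keys have an outstanding request. In the hypothetical
sketch below, how the server signals acceptance is left to the caller,
since the packet format is described in the TCP server section.</p>

```python
class RoutingRequests:
    """Matches routing responses against routing requests we sent."""

    def __init__(self):
        self._pending = set()      # public keys with an outstanding request
        self.connection_ids = {}   # public key -> connection_id

    def request_sent(self, public_key: bytes) -> None:
        self._pending.add(public_key)

    def handle_response(self, public_key: bytes, connection_id: int,
                        accepted: bool) -> bool:
        """Return False if the response does not match a request we sent."""
        if public_key not in self._pending:
            return False           # unsolicited: ignore it
        self._pending.remove(public_key)
        if accepted:
            self.connection_ids[public_key] = connection_id
        return True
```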
<p>The <code class="language-plaintext highlighter-rouge">TCP client</code> will handle connection notifications and disconnection
notifications by alerting the module using it that the connection to the
peer is up or down.</p>
<p><code class="language-plaintext highlighter-rouge">TCP client</code> will send a disconnection notification to kill a connection
to a friend. It must send a disconnection notification packet regardless
of whether the peer was online or offline so that the TCP server will
unregister the connection.</p>
<p>Data to friends can be sent through the TCP relay using OOB (out of
band) packets and connected connections. To send an OOB packet, the DHT
public key of the friend must be known. OOB packets are sent blind:
there is no way to query the TCP relay to see if the friend is
connected before sending one. OOB packets should be sent when the
connection to the friend via the TCP relay isn't in a connected state
but it is known that the friend is connected to that relay. If the
friend is connected via the TCP relay, then normal data packets must be
sent as they are smaller than OOB packets.</p>
<p>OOB recv and data packets must be handled and passed to the module using
it.</p>
<h2 id="tcp-connections">TCP connections</h2>
<p><code class="language-plaintext highlighter-rouge">TCP_connections</code> takes care of handling multiple TCP client instances
to establish a reliable connection via TCP relays to a friend.
Connecting to a friend with only one relay would not be very reliable,
so <code class="language-plaintext highlighter-rouge">TCP_connections</code> provides the level of abstraction needed to manage
multiple relays. For example, it ensures that if a relay goes down, the
connection to the peer will not be impacted. This is done by connecting
to the other peer with more than one relay.</p>
<p><code class="language-plaintext highlighter-rouge">TCP_connections</code> is above <a href="#tcp-client"><code class="language-plaintext highlighter-rouge">TCP client</code></a> and below
<code class="language-plaintext highlighter-rouge">net_crypto</code>.</p>
<p>A TCP connection in <code class="language-plaintext highlighter-rouge">TCP_connections</code> is defined as a connection to a
peer through one or more TCP relays. To connect to another peer with
<code class="language-plaintext highlighter-rouge">TCP_connections</code>, a connection in <code class="language-plaintext highlighter-rouge">TCP_connections</code> to the peer with
DHT public key X will be created. Some TCP relays which we know the peer
is connected to will then be associated with that peer. If the peer
isn't connected directly yet, these relays will be the ones that the
peer has sent to us via the onion module. The peer will also send some
relays it is directly connected to once a connection is established,
however, this is done by another module.</p>
<p><code class="language-plaintext highlighter-rouge">TCP_connections</code> has a list of all relays it is connected to. It tries
to keep the number of relays it is connected to as small as possible in
order to minimize load on relays and lower bandwidth usage for the
client. The desired number of TCP relay connections per peer is set to 3
in toxcore with the maximum number set to 6. The reason for these
numbers is that 1 would mean no backup relays and 2 would mean only 1
backup. To be sure that the connection is reliable 3 seems to be a
reasonable lower bound. The maximum number of 6 is the maximum number of
relays that can be tied to each peer. If 2 peers are connected each to
the same 6+ relays and they both need to be connected to that amount of
relays because of other friends this is where this maximum comes into
play. There is no reason why this number is 6 but in toxcore it has to
be at least double the desired number (3) because the code assumes
this.</p>
<p>If necessary, <code class="language-plaintext highlighter-rouge">TCP_connections</code> will connect to TCP relays to use them
to send onion packets. This is only done if there is no UDP connection
to the network. When there is a UDP connection, packets are sent with
UDP only because sending them with TCP relays can be less reliable. It
is also important that we are connected at all times to some relays as
these relays will be used by TCP only peers to initiate a connection to
us.</p>
<p>In toxcore, each client is connected to 3 relays even if there are no
TCP peers and the onion is not needed. It might be optimal to only
connect to these relays when toxcore is initializing as this is the only
time when peers will connect to us via TCP relays we are connected to.
Due to how the onion works, after the initialization phase, where each
peer is searched in the onion and then if they are found the info
required to connect back (DHT pk, TCP relays) is sent to them, there
should be no more peers connecting to us via TCP relays. This may be a
way to further reduce load on TCP relays, however, more research is
needed before it is implemented.</p>
<p><code class="language-plaintext highlighter-rouge">TCP_connections</code> picks one relay and uses only it for sending data to
the other peer. The reason for not picking a random connected relay for
each packet is that it severely deteriorates the quality of the link
between two peers and makes performance of lossy video and audio
transmissions really poor. For this reason, one relay is picked and used
to send all data. If for any reason no more data can be sent through
that relay, the next relay is used. This may happen if the TCP socket is
full and so the relay should not necessarily be dropped if this occurs.
Relays are only dropped if they time out or if they become useless (if
the relay is one too many or is no longer being used to relay data to
any peers).</p>
<p><code class="language-plaintext highlighter-rouge">TCP_connections</code> in toxcore also contains a mechanism to make
connections go to sleep. TCP connections to other peers may be put to
sleep if the connection to the peer establishes itself with UDP after
the connection is established with TCP. UDP is the method preferred by
<code class="language-plaintext highlighter-rouge">net_crypto</code> to communicate with other peers. In order to keep track of
the relays which were used to connect with the other peer in case the
UDP connection fails, they are saved by <code class="language-plaintext highlighter-rouge">TCP_connections</code> when the
connection is put to sleep. Any relays which were only used by this
redundant connection are saved then disconnected from. If the connection
is awakened, the relays are reconnected to and the connection is
reestablished. Putting a connection to sleep is the same as saving all
the relays used by the connection and removing the connection. Awakening
the connection is the same as creating a new connection with the same
parameters and restoring all the relays.</p>
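<p>The sleep and wake behaviour described above can be sketched as follows. This is a minimal model and not the toxcore implementation: the <code class="language-plaintext highlighter-rouge">ConnectionTable</code> class and its method names are hypothetical, and relays are represented by plain identifiers.</p>

```python
class ConnectionTable:
    """Hypothetical model of putting a peer connection to sleep and waking it."""

    def __init__(self):
        self.active = {}    # peer public key -> set of relay ids in use
        self.sleeping = {}  # peer public key -> saved set of relay ids

    def add(self, peer, relays):
        self.active[peer] = set(relays)

    def connected_relays(self):
        # A relay stays connected only while some active connection uses it.
        used = set()
        for relays in self.active.values():
            used |= relays
        return used

    def sleep(self, peer):
        # Same as saving the relays and removing the connection.
        self.sleeping[peer] = self.active.pop(peer)

    def wake(self, peer):
        # Same as creating a new connection and restoring its relays.
        self.active[peer] = self.sleeping.pop(peer)
```

<p>In this sketch a relay that was used only by the sleeping connection drops out of <code class="language-plaintext highlighter-rouge">connected_relays()</code> and reappears once the connection is awakened.</p>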
<p>A method to detect potentially dysfunctional relays that try to disrupt
the network by lying that they are connecting to a peer when they are
not or that maliciously drop all packets should be considered. Toxcore
doesn't currently implement such a system and adding one requires more
research and likely also requires extending the protocol.</p>
<p>When <code class="language-plaintext highlighter-rouge">TCP_connections</code> connects to a relay, it will create a new
<a href="#tcp-client"><code class="language-plaintext highlighter-rouge">TCP_client</code></a> instance for that relay. At any time if the
<code class="language-plaintext highlighter-rouge">TCP_client</code> instance reports that it has disconnected, the TCP relay
will be dropped. Once the TCP relay reports that it is connected,
<code class="language-plaintext highlighter-rouge">TCP_connections</code> will find all the connections that are associated to
the relay and announce to the relay that it wants to connect to each of
them with routing requests. If the relay reports that the peer for a
connection is online, the connection number and relay will be used to
send data in that connection with data packets. If the peer isn't
reported as online but the relay is associated to a connection, TCP OOB
(out of band) packets will be used to send data instead of data packets.
TCP OOB packets are used in this case since the relay most likely has
the peer connected but it has not sent a routing request to connect to
us.</p>
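<p>The choice between data packets and TCP OOB packets for one relay can be summarised as a small decision function. The function name and its boolean parameters are hypothetical and only restate the rule from the paragraph above.</p>

```python
def choose_send_method(relay_connected, relay_tied_to_connection, peer_online):
    """Pick how to send through one relay: "data", "oob" or None."""
    if not relay_connected or not relay_tied_to_connection:
        return None  # this relay cannot be used for the connection
    if peer_online:
        return "data"  # the relay reported the peer online, use data packets
    return "oob"  # peer not reported online yet, fall back to TCP OOB
```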
<p><code class="language-plaintext highlighter-rouge">TCP_connections</code> is used as the bridge between individual <code class="language-plaintext highlighter-rouge">TCP_client</code>
instances and <code class="language-plaintext highlighter-rouge">net_crypto</code>, or the bridge between individual connections
and something that requires an interface that looks like one connection.</p>
<h2 id="tcp-server">TCP server</h2>
<p>The TCP server in Tox acts as a TCP relay between clients who cannot
connect directly to each other or who for some reason are limited to
using the TCP protocol to connect to each other.
<code class="language-plaintext highlighter-rouge">TCP_server</code> is typically run only on actual server machines, but any Tox
client could host one, as the API to run one is exposed through the tox.h
API.</p>
<p>To connect to a hosted TCP server toxcore uses the TCP client module.</p>
<p>The TCP server implementation in toxcore currently works either with
epoll on Linux or with unoptimized but portable socket polling.</p>
<p>TCP connections between the TCP client and the server are encrypted to
prevent an outsider from knowing information like who is connecting to
whom just by looking at someone's connection to a TCP server. This is
useful when someone connects through something like Tor, for example. It
also prevents someone from injecting data into the stream and lets us
assume that any data received was not tampered with and is exactly what
was sent by the client.</p>
<p>When a client first connects to a TCP server he opens up a TCP
connection to the IP and port the TCP server is listening on. Once the
connection is established he sends a handshake packet; the server
then responds with his own and a secure connection is established. The
connection is then said to be unconfirmed and the client must send
some encrypted data to the server before the server can mark the
connection as confirmed. The reason it works like this is to prevent a
type of attack where a peer would send a handshake packet and then time
out right away. To prevent this the server must wait a few seconds for a
sign that the client received his handshake packet before confirming the
connection. Both can then communicate with each other using the
encrypted connection.</p>
<p>The TCP server essentially acts as just a relay between 2 peers. When a
TCP client connects to the server he tells the server which clients he
wants the server to connect him to. The server will only let two clients
connect to each other if both have indicated to the server that they
want to connect to each other. This is to prevent non-friends from
checking if someone is connected to a TCP server. The TCP server also
supports sending packets blindly through it to a client with public key
X (OOB packets). However, the TCP server does not give any feedback to
say if the packet arrived or not, so this is only useful to send data to
friends who may not know that we are connected to the current TCP server
while we know they are. This occurs when one peer discovers the TCP
relay and DHT public key of the other peer before the other peer
discovers its DHT public key. In that case OOB packets would be used
until the other peer knows that the peer is connected to the relay and
establishes a connection through it.</p>
<p>In order to make toxcore work on TCP only, the TCP server supports
relaying onion packets from TCP clients and sending any responses back
to the TCP clients.</p>
<p>To establish a secure connection with a TCP server, send the following
128-byte handshake packet to the server:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">DHT public key of client</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">24</code></td>
<td style="text-align: left">Nonce for the encrypted data</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">72</code></td>
<td style="text-align: left">Payload (plus MAC)</td>
</tr>
</tbody>
</table>
<p>Payload is encrypted with the DHT private key of the client and public
key of the server and the nonce:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">Public key</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">24</code></td>
<td style="text-align: left">Base nonce</td>
</tr>
</tbody>
</table>
<p>The base nonce is the one the TCP client wants the TCP server to use to
decrypt the packets received from the TCP client.</p>
<p>The first 32 bytes are the public key (DHT public key) that the TCP
client is announcing itself to the server with. The next 24 bytes are a
nonce which the TCP client uses along with the secret key associated
with the public key in the first 32 bytes of the packet to encrypt the
rest of this packet. The encrypted part of this packet contains a
temporary public key that will be used for encryption during the
connection and will be discarded after. It also contains a base nonce
which will be used later for decrypting packets received from the TCP
client.</p>
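<p>The byte layout of the client handshake can be sketched as below. The sketch only splits the fields and performs no encryption or decryption. The function and constant names are hypothetical, and the 16-byte MAC size is inferred from the difference between the 72-byte payload on the wire and the 56 bytes (32 + 24) of plain content.</p>

```python
PUBLIC_KEY_SIZE = 32
NONCE_SIZE = 24
MAC_SIZE = 16  # inferred: 72-byte payload minus 56 bytes of plain content
PLAIN_SIZE = PUBLIC_KEY_SIZE + NONCE_SIZE  # temporary public key + base nonce
PAYLOAD_SIZE = PLAIN_SIZE + MAC_SIZE  # 72 bytes on the wire
HANDSHAKE_SIZE = PUBLIC_KEY_SIZE + NONCE_SIZE + PAYLOAD_SIZE  # 128 bytes


def split_client_handshake(packet):
    """Split a 128-byte client handshake into its three wire fields."""
    if len(packet) != HANDSHAKE_SIZE:
        raise ValueError("client handshake must be exactly 128 bytes")
    dht_public_key = packet[:PUBLIC_KEY_SIZE]
    nonce = packet[PUBLIC_KEY_SIZE:PUBLIC_KEY_SIZE + NONCE_SIZE]
    payload = packet[PUBLIC_KEY_SIZE + NONCE_SIZE:]
    return dht_public_key, nonce, payload


def split_plain_payload(plain):
    """Split a decrypted payload into (temporary public key, base nonce)."""
    if len(plain) != PLAIN_SIZE:
        raise ValueError("decrypted payload must be exactly 56 bytes")
    return plain[:PUBLIC_KEY_SIZE], plain[PUBLIC_KEY_SIZE:]
```

<p>A real implementation would decrypt the 72-byte payload between the two steps, using the keys and nonce described above.</p>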
<p>If the server successfully decrypts the encrypted data in the handshake
packet, it responds with the following handshake response of length 96
bytes:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">24</code></td>
<td style="text-align: left">Nonce for the encrypted data</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">72</code></td>
<td style="text-align: left">Payload (plus MAC)</td>
</tr>
</tbody>
</table>
<p>Payload is encrypted with the private key of the server and the DHT
public key of the client and the nonce:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">Public key</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">24</code></td>
<td style="text-align: left">Base nonce</td>
</tr>
</tbody>
</table>
<p>The base nonce is the one the TCP server wants the TCP client to use to
decrypt the packets received from the TCP server.</p>
<p>The client already knows the long term public key of the server so it is
omitted in the response; instead only a nonce is present in the
unencrypted part. The encrypted part of the response has the same
elements as the encrypted part of the request: a temporary public key
tied to this connection and a base nonce which will be used later when
decrypting packets received from the TCP server, both unique to the
connection.</p>
<p>In toxcore the base nonce is generated randomly like all the other
nonces. It must be randomly generated to prevent nonce reuse. For
example, if the nonce used was 0 for both sides, then since both sides
use the same keys to encrypt packets they send to each other, two
packets would be encrypted with the same nonce. These packets could then
possibly be replayed back to the sender, which would cause issues. A
similar mechanism is used in <code class="language-plaintext highlighter-rouge">net_crypto</code>.</p>
<p>After this the client will know the connection temporary public key and
base nonce of the server and the server will know the connection base
nonce and temporary public key of the client.</p>
<p>The client will then send an encrypted packet to the server. The
contents of the packet do not matter, but it must be a valid encrypted
data packet and the server must handle it normally (e.g. if it was a
ping, send a pong response). The only thing that matters is that the
packet was encrypted correctly by the client, because it means that the
client has correctly received the handshake response the server sent to
it and that the handshake the client sent to the server really came from
the client and not from an attacker replaying packets. The server must
prevent resource consuming attacks by timing out clients if they do not
send any encrypted packets to the server to prove that the connection
was established correctly.</p>
<p>Toxcore does not have a timeout for clients, instead it stores
connecting clients in large circular lists and times them out if their
entry in the list gets replaced by a newer connection. The reasoning
behind this is that it prevents TCP flood attacks from having a negative
impact on the currently connected nodes. There are however much better
ways to do this and the only reason toxcore does it this way is because
writing it was very simple. When connections are confirmed they are
moved somewhere else.</p>
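<p>A circular list of unconfirmed connections, as described above, might look roughly like this. It is a simplified illustration and not toxcore's data structure: the class name, the slot count and the use of plain Python objects as connections are all assumptions.</p>

```python
class UnconfirmedRing:
    """Fixed-size circular list of unconfirmed connections (illustrative)."""

    def __init__(self, size):
        self.slots = [None] * size
        self.next_index = 0

    def add(self, connection):
        """Store a new connection; return the evicted one, if any."""
        index = self.next_index
        evicted = self.slots[index]
        self.slots[index] = connection
        self.next_index = (index + 1) % len(self.slots)
        return evicted

    def confirm(self, connection):
        """Remove a connection from the ring once it is confirmed."""
        index = self.slots.index(connection)
        self.slots[index] = None
        return connection
```

<p>An evicted entry is the one that gets "timed out": the caller would close its socket. A flood of new connections therefore only evicts other unconfirmed entries and never touches confirmed clients, which live elsewhere.</p>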
<p>When the server confirms the connection he must look in the list of
connected peers to see if he is already connected to a client with the
same announced public key. If this is the case the server must kill the
previous connection because this means that the client previously timed
out and is reconnecting. Because of the design of toxcore it is very
unlikely that two different legitimate peers will have the same public
key, so this is the correct behavior.</p>
<p>Encrypted data packets look like this to outsiders:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">2</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint16_t</code> length of data</td>
</tr>
<tr>
<td style="text-align: left">variable</td>
<td style="text-align: left">encrypted data</td>
</tr>
</tbody>
</table>
<p>In a TCP stream they would look like:
<code class="language-plaintext highlighter-rouge">[[length][data]][[length][data]][[length][data]]...</code>.</p>
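<p>Splitting a TCP stream into length-prefixed packets can be sketched as follows. The function names are hypothetical. The length is read as a big endian <code class="language-plaintext highlighter-rouge">uint16_t</code>, and the 2048-byte limit is the hard maximum that this specification gives for the toxcore TCP server.</p>

```python
MAX_ENCRYPTED_SIZE = 2048  # hard maximum quoted for the toxcore TCP server


def frame(encrypted):
    """Prefix one encrypted payload with its big endian uint16 length."""
    if len(encrypted) > MAX_ENCRYPTED_SIZE:
        raise ValueError("encrypted payload is too large")
    return len(encrypted).to_bytes(2, "big") + encrypted


def unframe(buffer):
    """Return (complete payloads, leftover bytes) from a stream buffer."""
    packets = []
    offset = 0
    while len(buffer) - offset >= 2:
        length = int.from_bytes(buffer[offset:offset + 2], "big")
        if length > MAX_ENCRYPTED_SIZE:
            raise ValueError("length field exceeds the maximum")
        if length > len(buffer) - offset - 2:
            break  # incomplete packet, wait for more bytes from the socket
        packets.append(buffer[offset + 2:offset + 2 + length])
        offset += 2 + length
    return packets, buffer[offset:]
```

<p>The leftover bytes are kept and prepended to whatever the socket delivers next, since TCP gives no guarantee that a read ends on a packet boundary.</p>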
<p>Both the client and the server use temporary connection keys (a public
and private key pair on each side), which are generated for the
connection. Each side sends its temporary public key to the other in the
handshake. The keys are then used as the next diagram shows to generate a
shared key which is equal on both sides.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Client: Server:
generate_shared_key( generate_shared_key(
[temp connection public key of server], [temp connection public key of client],
[temp connection private key of client]) [temp connection private key of server])
= =
[shared key] [shared key]
</code></pre></div></div>
<p>The generated shared key is equal on both sides and is used to encrypt
and decrypt the encrypted data packets.</p>
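<p>Why both sides arrive at the same shared key can be illustrated with a toy Diffie-Hellman exchange over integers. This is a stand-in chosen only because it runs with the standard library: it is not the elliptic curve operation toxcore uses, the prime and generator below are arbitrary illustrative values, and the code is not secure.</p>

```python
PRIME = 2 ** 127 - 1  # illustrative modulus, far too small for real use
GENERATOR = 3  # illustrative base


def public_from_private(private_key):
    """Derive the toy public key that is sent in the handshake."""
    return pow(GENERATOR, private_key, PRIME)


def generate_shared_key(their_public_key, our_private_key):
    """Combine the other side's public key with our own private key."""
    return pow(their_public_key, our_private_key, PRIME)


client_private, server_private = 123456789, 987654321
client_public = public_from_private(client_private)
server_public = public_from_private(server_private)

# Each side only ever uses its own private key and the other's public key.
client_shared = generate_shared_key(server_public, client_private)
server_shared = generate_shared_key(client_public, server_private)
```

<p>Both calls compute the generator raised to the product of the two private keys, which is why <code class="language-plaintext highlighter-rouge">client_shared</code> and <code class="language-plaintext highlighter-rouge">server_shared</code> are equal.</p>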
<p>Each encrypted data packet sent to the client will be encrypted with the
shared key and with a nonce equal to the server base nonce plus the
number of packets sent so far, so the first packet uses nonce + 0, the
second uses nonce + 1 and so on. Note that nonces, like all other
numbers sent over the network in toxcore, are numbers in big endian
format, so when increasing them by 1 the least significant byte is the
last one.</p>
<p>Each packet received from the client will be decrypted with the shared
key and with a nonce equal to the client base nonce plus the number of
packets received so far, counted in the same way: the first packet uses
nonce + 0, the second uses nonce + 1 and so on.</p>
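<p>Deriving the nonce for the n-th packet from a base nonce can be sketched like this. The function name is hypothetical, and wrapping around when the 24-byte number overflows is an assumption of this sketch.</p>

```python
NONCE_SIZE = 24


def nonce_for_packet(base_nonce, packet_index):
    """Return base_nonce + packet_index as a 24-byte big endian number."""
    if len(base_nonce) != NONCE_SIZE:
        raise ValueError("nonce must be 24 bytes")
    value = int.from_bytes(base_nonce, "big") + packet_index
    # Assumption: wrap around on overflow so the result fits in 24 bytes.
    value %= 2 ** (8 * NONCE_SIZE)
    return value.to_bytes(NONCE_SIZE, "big")
```

<p>For a base nonce ending in the byte <code class="language-plaintext highlighter-rouge">0xff</code>, packet 1 uses a nonce ending in <code class="language-plaintext highlighter-rouge">0x01 0x00</code>: the carry moves towards the front because the last byte is the least significant one.</p>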
<p>Encrypted data packets have a hard maximum size of 2 + 2048 bytes in the
toxcore TCP server implementation. 2048 bytes is big enough to make sure
that all toxcore packets can go through and leaves some extra space in
case the protocol needs to be changed in the future. The 2 bytes are the
size of the length field and the 2048 bytes are the maximum size of the
encrypted part, so the maximum total size is 2050 bytes. In current
toxcore, the largest encrypted data packets sent will be of size
2 + 1417, which is 1419 bytes in total.</p>
<p>The logic behind the format of the handshake is that we:</p>
<ol>
<li>
<p>need to prove to the server that we own the private key related to
the public key we are announcing ourselves with.</p>
</li>
<li>
<p>need to establish a secure connection that has perfect forward
secrecy</p>
</li>
<li>
<p>prevent any replay, impersonation or other attacks</p>
</li>
</ol>
<p>How it accomplishes each of those points:</p>
<ol>
<li>
<p>If the client does not own the private key related to the public key
they will not be able to create the handshake packet.</p>
</li>
<li>
<p>Temporary session keys generated by the client and server in the
encrypted part of the handshake packets are used to encrypt/decrypt
packets during the session.</p>
</li>
<li>
<p>The following attacks are prevented:</p>
<ul>
<li>
<p>Attacker modifies any byte of the handshake packets: Decryption
fails, no attacks possible.</p>
</li>
<li>
<p>Attacker captures the handshake packet from the client and
replays it later to the server: Attacker will never get the
server to confirm the connection (no effect).</p>
</li>
<li>
<p>Attacker captures a server response and sends it to the client
next time they try to connect to the server: Client will never
confirm the connection. (See: <code class="language-plaintext highlighter-rouge">TCP_client</code>)</p>
</li>
<li>
<p>Attacker tries to impersonate a server: They won't be able to
decrypt the handshake and won't be able to respond.</p>
</li>
<li>
<p>Attacker tries to impersonate a client: Server won't be able to
decrypt the handshake.</p>
</li>
</ul>
</li>
</ol>
<p>The logic behind the format of the encrypted packets is that:</p>
<ol>
<li>
<p>TCP is a stream protocol, we need packets.</p>
</li>
<li>
<p>Any attacks must be prevented</p>
</li>
</ol>
<p>How it accomplishes each of those points:</p>
<ol>
<li>
<p>2 bytes before each packet of encrypted data denote the length. We
assume a functioning TCP will deliver bytes in order, which makes it
work. If the TCP doesn't, it most likely means it is under attack, and
for that see the next point.</p>
</li>
<li>
<p>The following attacks are prevented:</p>
<ul>
<li>
<p>Modifying the length bytes will make the connection time out, make
decryption fail, or both.</p>
</li>
<li>
<p>Modifying any encrypted bytes will make decryption fail.</p>
</li>
<li>
<p>Injecting any bytes will make decryption fail.</p>
</li>
<li>
<p>Trying to reorder the packets will make decryption fail because
of the ordered nonce.</p>
</li>
<li>
<p>Removing any packets from the stream will make decryption fail
because of the ordered nonce.</p>
</li>
</ul>
</li>
</ol>
<h3 id="encrypted-payload-types">Encrypted payload types</h3>
<p>The following represents the various types of data that can be sent
inside encrypted data packets.</p>
<h4 id="routing-request-0x00">Routing request (0x00)</h4>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> (0x00)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">Public key</td>
</tr>
</tbody>
</table>
<h4 id="routing-request-response-0x01">Routing request response (0x01)</h4>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> (0x01)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> rpid</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">Public key</td>
</tr>
</tbody>
</table>
<p>rpid is the invalid <code class="language-plaintext highlighter-rouge">connection_id</code> (0) if the request was refused, or the assigned <code class="language-plaintext highlighter-rouge">connection_id</code> if
it was accepted.</p>
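<p>Encoding and decoding the plain (pre-encryption) form of the two routing packets can be sketched as below. The function names are hypothetical and the byte layout follows the two tables above.</p>

```python
ROUTING_REQUEST = 0x00
ROUTING_RESPONSE = 0x01
PUBLIC_KEY_SIZE = 32


def encode_routing_request(public_key):
    """Build the plain payload of a routing request."""
    if len(public_key) != PUBLIC_KEY_SIZE:
        raise ValueError("public key must be 32 bytes")
    return bytes([ROUTING_REQUEST]) + public_key


def encode_routing_response(rpid, public_key):
    """Build the plain payload of a routing response (rpid 0 means refused)."""
    if len(public_key) != PUBLIC_KEY_SIZE:
        raise ValueError("public key must be 32 bytes")
    return bytes([ROUTING_RESPONSE, rpid]) + public_key


def decode_routing_response(payload):
    """Return (connection_id or None if refused, public key)."""
    if len(payload) != 2 + PUBLIC_KEY_SIZE or payload[0] != ROUTING_RESPONSE:
        raise ValueError("not a routing response")
    rpid = payload[1]
    return (rpid if rpid != 0 else None), payload[2:]
```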
<h4 id="connect-notification-0x02">Connect notification (0x02)</h4>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> (0x02)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> <code class="language-plaintext highlighter-rouge">connection_id</code> of connection that got connected</td>
</tr>
</tbody>
</table>
<h4 id="disconnect-notification-0x03">Disconnect notification (0x03)</h4>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> (0x03)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> <code class="language-plaintext highlighter-rouge">connection_id</code> of connection that got disconnected</td>
</tr>
</tbody>
</table>
<h4 id="ping-packet-0x04">Ping packet (0x04)</h4>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> (0x04)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">8</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint64_t</code> <code class="language-plaintext highlighter-rouge">ping_id</code> (0 is invalid)</td>
</tr>
</tbody>
</table>
<h4 id="ping-response-pong-0x05">Ping response (pong) (0x05)</h4>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> (0x05)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">8</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint64_t</code> <code class="language-plaintext highlighter-rouge">ping_id</code> (0 is invalid)</td>
</tr>
</tbody>
</table>
<h4 id="oob-send-0x06">OOB send (0x06)</h4>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> (0x06)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">Destination public key</td>
</tr>
<tr>
<td style="text-align: left">variable</td>
<td style="text-align: left">Data</td>
</tr>
</tbody>
</table>
<h4 id="oob-recv-0x07">OOB recv (0x07)</h4>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> (0x07)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">Sender public key</td>
</tr>
<tr>
<td style="text-align: left">variable</td>
<td style="text-align: left">Data</td>
</tr>
</tbody>
</table>
<h4 id="onion-packet-0x08">Onion packet (0x08)</h4>
<p>Same format as initial onion packet but packet id is 0x08 instead of
0x80.</p>
<h4 id="onion-packet-response-0x09">Onion packet response (0x09)</h4>
<p>Same format as onion packet but packet id is 0x09 instead of 0x8e.</p>
<h4 id="data-0x10-and-up">Data (0x10 and up)</h4>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> packet id</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> connection id</td>
</tr>
<tr>
<td style="text-align: left">variable</td>
<td style="text-align: left">data</td>
</tr>
</tbody>
</table>
<p>The TCP server is set up in a way to minimize waste while relaying the
many packets that might go between two Tox peers, hence clients must
create connections to other clients on the relay. The connection number
is a <code class="language-plaintext highlighter-rouge">uint8_t</code> and must be greater than or equal to 16 in order to be valid.
Because a <code class="language-plaintext highlighter-rouge">uint8_t</code> has a maximum value of 255 it means that the maximum
number of different connections to other clients that each connection
can have is 240. The reason valid <code class="language-plaintext highlighter-rouge">connection_ids</code> are 16 or greater is
because they are the first byte of data packets. Currently only numbers 0
to 9 are taken, however we keep a few extras in case we need to extend
the protocol without breaking it completely.</p>
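<p>Handing out connection ids in the valid range (16 to 255, 240 values in total) can be sketched as follows. The class is hypothetical, and returning the existing id when the same public key is requested twice is an assumption of this sketch.</p>

```python
FIRST_CONNECTION_ID = 16
LAST_CONNECTION_ID = 255
MAX_CONNECTIONS = LAST_CONNECTION_ID - FIRST_CONNECTION_ID + 1  # 240


class ConnectionIds:
    """Hands out connection ids 16..255 for one client (illustrative)."""

    def __init__(self):
        self.by_key = {}  # peer public key -> connection_id

    def allocate(self, public_key):
        """Return a connection_id for public_key, or 0 if none is free."""
        if public_key in self.by_key:
            return self.by_key[public_key]  # assumption: reuse existing id
        used = set(self.by_key.values())
        for connection_id in range(FIRST_CONNECTION_ID, LAST_CONNECTION_ID + 1):
            if connection_id not in used:
                self.by_key[public_key] = connection_id
                return connection_id
        return 0  # invalid id: the routing request is refused

    def release(self, public_key):
        """Forget a connection so that its id can be reused."""
        self.by_key.pop(public_key, None)
```

<p>Once all 240 ids are taken, <code class="language-plaintext highlighter-rouge">allocate</code> returns 0, which corresponds to a refused routing request. Releasing a connection frees its id for reuse.</p>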
<p>Routing request (Sent by client to server): Send a routing request to
the server saying that we want to connect to the peer with the given
public key, where the public key is the one the peer announced
themselves with. The server must respond to this with a Routing response.</p>
<p>Routing response (Sent by server to client): The response to the routing
request. It tells the client if the routing request succeeded (valid
<code class="language-plaintext highlighter-rouge">connection_id</code>) and, if it did, the id of the connection
(<code class="language-plaintext highlighter-rouge">connection_id</code>). The public key sent in the routing request is also
sent in the response so that the client can send many requests at the
same time to the server without having code to track which response
belongs to which public key.</p>
<p>The only reason a routing request should fail is if the connection has
reached the maximum number of simultaneous connections. In case the
routing request fails the public key in the response will be the public
key in the failed request.</p>
<p>Connect notification (Sent by server to client): Tell the client that
<code class="language-plaintext highlighter-rouge">connection_id</code> is now connected meaning the other is online and data
can be sent using this <code class="language-plaintext highlighter-rouge">connection_id</code>.</p>
<p>Disconnect notification (Sent by client to server): Sent when client
wants the server to forget about the connection related to the
<code class="language-plaintext highlighter-rouge">connection_id</code> in the notification. Server must remove this connection
and must be able to reuse the <code class="language-plaintext highlighter-rouge">connection_id</code> for another connection. If
the connection was connected the server must send a disconnect
notification to the other client. The other client must think that this
client has simply disconnected from the TCP server.</p>
<p>Disconnect notification (Sent by server to client): Sent by the server
to the client to tell them that the connection with <code class="language-plaintext highlighter-rouge">connection_id</code> that
was connected is now disconnected. It is sent either when the other
client of the connection disconnects or when they tell the server to kill
the connection (see above).</p>
<p>Ping and Pong packets (can be sent by both client and server, both will
respond): ping packets are used to know if the other side of the
connection is still live. TCP, once established, doesn't have any sane
timeouts (1 week isn't sane), so we are obliged to have our own way to
check if the other side is still live. Ping ids can be anything except
0. This is because toxcore sets the variable storing the
<code class="language-plaintext highlighter-rouge">ping_id</code> that was sent to 0 when it receives a pong response, which
means 0 is invalid.</p>
<p>The server should send ping packets every X seconds (toxcore
<code class="language-plaintext highlighter-rouge">TCP_server</code> sends them every 30 seconds and times out the peer if it
doesn't get a response in 10). The server should respond immediately to
ping packets with pong packets.</p>
<p>The server should respond to ping packets with pong packets with the
same <code class="language-plaintext highlighter-rouge">ping_id</code> as was in the ping packet. The server should check that
each pong packet contains the same <code class="language-plaintext highlighter-rouge">ping_id</code> as was in the ping, if not
the pong packet must be ignored.</p>
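<p>Tracking one outstanding ping and matching the pong against it can be sketched like this. The class and function names are hypothetical. Encoding the <code class="language-plaintext highlighter-rouge">ping_id</code> as big endian is an assumption here; since the id is only echoed back, the matching logic does not depend on the byte order.</p>

```python
PING = 0x04
PONG = 0x05


class PingTracker:
    """Tracks the single outstanding ping of one connection (illustrative)."""

    def __init__(self):
        self.pending_id = 0  # 0 means no ping is outstanding

    def make_ping(self, ping_id):
        """Build a ping payload and remember its id."""
        if ping_id == 0:
            raise ValueError("ping_id 0 is invalid")
        self.pending_id = ping_id
        return bytes([PING]) + ping_id.to_bytes(8, "big")

    def handle_pong(self, payload):
        """Return True if the pong matches the outstanding ping."""
        if len(payload) != 9 or payload[0] != PONG:
            return False
        ping_id = int.from_bytes(payload[1:], "big")
        if ping_id == 0 or ping_id != self.pending_id:
            return False  # ignore pongs that do not match
        self.pending_id = 0  # reset, which is why 0 cannot be a valid id
        return True


def make_pong(ping_payload):
    """Answer a ping with a pong carrying the same ping_id."""
    return bytes([PONG]) + ping_payload[1:9]
```

<p>Resetting <code class="language-plaintext highlighter-rouge">pending_id</code> to 0 after a matching pong mirrors the reason given above for why 0 is not a valid <code class="language-plaintext highlighter-rouge">ping_id</code>. Timers for sending pings and timing out the peer are left out.</p>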
<p>OOB send (Sent by client to server): If a peer whose announced public
key equals the destination public key in the packet is connected, the
data in the OOB send packet will be sent to that peer as an OOB recv
packet. If no such peer is connected, the packet is discarded. The
toxcore <code class="language-plaintext highlighter-rouge">TCP_server</code> implementation has a hard maximum OOB data length
of 1024. 1024 was picked because it is big enough for the <code class="language-plaintext highlighter-rouge">net_crypto</code>
packets related to the handshake and is large enough that any changes to
the protocol would not require breaking TCP server. It is however not
large enough for the biggest <code class="language-plaintext highlighter-rouge">net_crypto</code> packets sent with an
established <code class="language-plaintext highlighter-rouge">net_crypto</code> connection, which prevents sending those via
OOB packets.</p>
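<p>How a server might turn an OOB send packet into an OOB recv packet can be sketched as follows. The function name and the set of connected keys are hypothetical, and discarding packets with empty data is an assumption of this sketch.</p>

```python
OOB_SEND = 0x06
OOB_RECV = 0x07
PUBLIC_KEY_SIZE = 32
MAX_OOB_DATA = 1024  # hard maximum quoted for the toxcore TCP server


def relay_oob(sender_key, payload, connected_keys):
    """Turn an OOB send payload into (destination key, OOB recv payload).

    Returns None when the packet must be discarded.
    """
    if len(payload) == 0 or payload[0] != OOB_SEND:
        return None
    destination = payload[1:1 + PUBLIC_KEY_SIZE]
    data = payload[1 + PUBLIC_KEY_SIZE:]
    if len(destination) != PUBLIC_KEY_SIZE:
        return None
    # Assumption: empty data is discarded along with oversized data.
    if len(data) == 0 or len(data) > MAX_OOB_DATA:
        return None
    if destination not in connected_keys:
        return None  # no such peer is connected, discard silently
    return destination, bytes([OOB_RECV]) + sender_key + data
```

<p>The sender gets no feedback in any of the discard cases, which matches the "blind" nature of OOB packets.</p>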
<p>OOB recv (Sent by server to client): OOB recv are sent with the
announced public key of the peer that sent the OOB send packet and the
exact data.</p>
<p>OOB packets can be used just like normal data packets however the extra
size makes sending data only through them less efficient than data
packets.</p>
<p>Data: Data packets can only be sent and received if the corresponding
<code class="language-plaintext highlighter-rouge">connection_id</code> is connected (a Connect notification has been received
from it). If the server receives a Data packet for a non-connected or
nonexistent connection it will discard it.</p>
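<p>Telling the payload types apart by their first byte, and deciding whether a Data packet may be relayed, can be sketched as below. The names are hypothetical. Ids 0x0a to 0x0f are labelled "reserved" here because only 0 to 9 are currently taken.</p>

```python
PACKET_NAMES = {
    0x00: "routing request",
    0x01: "routing response",
    0x02: "connect notification",
    0x03: "disconnect notification",
    0x04: "ping",
    0x05: "pong",
    0x06: "oob send",
    0x07: "oob recv",
    0x08: "onion packet",
    0x09: "onion packet response",
}
FIRST_DATA_ID = 0x10


def classify(first_byte):
    """Name the payload type selected by the first byte of a payload."""
    if first_byte in PACKET_NAMES:
        return PACKET_NAMES[first_byte]
    if first_byte >= FIRST_DATA_ID:
        return "data"
    return "reserved"


def accept_data(connection_id, connected_ids):
    """A Data packet is relayed only for a currently connected id."""
    return connection_id >= FIRST_DATA_ID and connection_id in connected_ids
```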
<p>Why did I use different packet ids for all packets when some are only
sent by the client and some only by the server? It's less confusing.</p>
<h2 id="friend-connection">Friend connection</h2>
<p><code class="language-plaintext highlighter-rouge">friend_connection</code> is the module that sits on top of the DHT, onion and
<code class="language-plaintext highlighter-rouge">net_crypto</code> modules and takes care of linking the 3 together.</p>
<p>Friends in <code class="language-plaintext highlighter-rouge">friend_connection</code> are represented by their real public key.
When a friend is added in <code class="language-plaintext highlighter-rouge">friend_connection</code>, an onion search entry is
created for that friend. This means that the onion module will start
looking for this friend and send that friend our DHT public key, and
the TCP relays we are connected to, in case a connection is only possible
with TCP.</p>
<p>Once the onion returns the DHT public key of the peer, the DHT public
key is saved, added to the DHT friends list and a new <code class="language-plaintext highlighter-rouge">net_crypto</code>
connection is created. Any TCP relays returned by the onion for this
friend are passed to the <code class="language-plaintext highlighter-rouge">net_crypto</code> connection.</p>
<p>If the DHT establishes a direct UDP connection with the friend,
<code class="language-plaintext highlighter-rouge">friend_connection</code> will pass the IP/port of the friend to <code class="language-plaintext highlighter-rouge">net_crypto</code>
and also save it to be used to reconnect to the friend if they
disconnect.</p>
<p>If <code class="language-plaintext highlighter-rouge">net_crypto</code> finds that the friend has a different DHT public key,
which can happen if the friend restarted their client, <code class="language-plaintext highlighter-rouge">net_crypto</code> will
pass the new DHT public key to the onion module and will remove the DHT
entry for the old DHT public key and replace it with the new one. The
current <code class="language-plaintext highlighter-rouge">net_crypto</code> connection will also be killed and a new one with
the correct DHT public key will be created.</p>
<p>When the <code class="language-plaintext highlighter-rouge">net_crypto</code> connection for a friend goes online,
<code class="language-plaintext highlighter-rouge">friend_connection</code> will tell the onion module that the friend is online
so that it can stop spending resources looking for the friend. When the
friend connection goes offline, <code class="language-plaintext highlighter-rouge">friend_connection</code> will tell the onion
module so that it can start looking for the friend again.</p>
<p>There are 2 types of data packets sent to friends with the <code class="language-plaintext highlighter-rouge">net_crypto</code>
connection handled at the level of <code class="language-plaintext highlighter-rouge">friend_connection</code>: Alive packets
and TCP relay packets. Alive packets are packets with the packet id or
first byte of data (only byte in this packet) being 16. They are used in
order to check if the other friend is still online. <code class="language-plaintext highlighter-rouge">net_crypto</code> does
not have any timeout when the connection is established so timeouts are
caught using this packet. In toxcore, this packet is sent every 8
seconds. If none of these packets are received for 32 seconds, the
connection is timed out and killed. These numbers seem to cause the
least issues and 32 seconds is not too long so that, if a friend times
out, toxcore won't falsely see them online for too long. Usually when a
friend goes offline they have time to send a disconnect packet in the
<code class="language-plaintext highlighter-rouge">net_crypto</code> connection which makes them appear offline almost
instantly.</p>
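<p>The Alive packet timing described above can be sketched as follows. This is
a minimal illustration, not toxcore's code: the <code class="language-plaintext highlighter-rouge">AliveTracker</code> class and
its method names are hypothetical, and only the packet id (16) and the 8 and 32
second values come from the text.</p>

```python
ALIVE_PACKET_ID = 16   # packet id of the Alive packet (from the text above)
PING_INTERVAL = 8      # seconds between Alive packets in toxcore
TIMEOUT = 32           # seconds without an Alive packet before the connection is killed


class AliveTracker:
    """Hypothetical helper tracking Alive packets for one friend connection."""

    def __init__(self, now: float) -> None:
        self.last_sent = now
        self.last_received = now

    def on_packet(self, data: bytes, now: float) -> bool:
        """Return True if data is an Alive packet, recording when it arrived."""
        if data == bytes([ALIVE_PACKET_ID]):
            self.last_received = now
            return True
        return False

    def should_send(self, now: float) -> bool:
        """True once at least PING_INTERVAL seconds passed since the last send."""
        return now - self.last_sent >= PING_INTERVAL

    def mark_sent(self, now: float) -> None:
        self.last_sent = now

    def timed_out(self, now: float) -> bool:
        """True once no Alive packet has been received for TIMEOUT seconds."""
        return now - self.last_received >= TIMEOUT
```

<p>Passing the current time in explicitly, rather than reading a clock inside
the helper, keeps the sketch easy to test.</p>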
<p>The timeout for when to stop retrying to connect to a friend by creating
new <code class="language-plaintext highlighter-rouge">net_crypto</code> connections when the old one times out in toxcore is
the same as the timeout for DHT peers (122 seconds). However, it is
calculated from the last time a DHT public key was received for the
friend or the time the friend's <code class="language-plaintext highlighter-rouge">net_crypto</code> connection went offline after
being online. The highest time is used to calculate when the timeout is.
<code class="language-plaintext highlighter-rouge">net_crypto</code> connections will be recreated (if the connection fails)
until this timeout.</p>
<p><code class="language-plaintext highlighter-rouge">friend_connection</code> sends a list of 3 relays (the same number as the
target number of TCP relay connections in <code class="language-plaintext highlighter-rouge">TCP_connections</code>) to each
connected friend every 5 minutes in toxcore. Immediately before sending
the relays, they are associated to the current
<code class="language-plaintext highlighter-rouge">net_crypto-&gt;TCP_connections</code> connection. This facilitates connecting
the two friends together using the relays as the friend who receives the
packet will associate the sent relays to the <code class="language-plaintext highlighter-rouge">net_crypto</code> connection
they received it from. When both sides do this they will be able to
connect to each other using the relays. The packet id or first byte of
the packet of share relay packets is 0x11. This is then followed by some
TCP relays stored in packed node format.</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> (0x11)</td>
</tr>
<tr>
<td style="text-align: left">variable</td>
<td style="text-align: left">TCP relays in packed node format (see DHT)</td>
</tr>
</tbody>
</table>
<p>If local IPs are received as part of the packet, the local IP will be
replaced with the IP of the peer that sent the relay. This is because we
assume this is the best way to attempt to connect to the TCP relay. If
the peer that sent the relay is using a local IP, then the sent local IP
should be used to connect to the relay.</p>
<p>All other data packets are passed by <code class="language-plaintext highlighter-rouge">friend_connection</code> up to the
upper Messenger module. It also separates lossy and lossless packets
from <code class="language-plaintext highlighter-rouge">net_crypto</code>.</p>
<p>Friend connection takes care of establishing the connection to the
friend and gives the upper messenger layer a simple interface to receive
and send messages, add and remove friends and know if a friend is
connected (online) or not connected (offline).</p>
<h2 id="friend-requests">Friend requests</h2>
<p>When a Tox user adds someone with Tox, toxcore will try sending a friend
request to that person. A friend request contains the long term public
key of the sender, a nospam number and a message.</p>
<p>Transmitting the long term public key is the primary goal of the friend
request as it is what the peer needs to find and establish a connection
to the sender. The long term public key is what the receiver adds to his
friends list if he accepts the friend request.</p>
<p>The nospam is a number used to prevent someone from spamming the network
with valid friend requests. It makes sure that the only people who have
seen the Tox ID of a peer are capable of sending them a friend request.
The nospam is one of the components of the Tox ID.</p>
<p>The nospam is a number or a list of numbers set by the peer. Only
received friend requests that contain a nospam that was set by the peer
are sent to the client to be accepted or refused by the user. The nospam
prevents random peers in the network from sending friend requests to
non-friends. The nospam is not long enough to be secure, meaning an
extremely persistent attacker could manage to send a spam friend request
to someone. 4 bytes is large enough to prevent spam from random peers in
the network. The nospam could also allow Tox users to issue different
Tox IDs and even change Tox IDs if someone finds a Tox ID and decides to
send it hundreds of spam friend requests. Changing the nospam would stop
the incoming wave of spam friend requests without any negative effects
on the user's friends list. For example, if users had to change
their public key to stop receiving friend requests, it would
mean they would have to essentially abandon all their current friends, as
friends are tied to the public key. The nospam is not used at all once
the friends have each other added, which means changing it won't have any
negative effects.</p>
<p>Friend request:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[uint32_t nospam][Message (UTF8) 1 to ONION_CLIENT_MAX_DATA_SIZE bytes]
</code></pre></div></div>
<p>Friend request packet when sent as an onion data packet:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[uint8_t (32)][Friend request]
</code></pre></div></div>
<p>Friend request packet when sent as a <code class="language-plaintext highlighter-rouge">net_crypto</code> data packet (If we are
directly connected to the peer because of a group chat but are not
friends with them):</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[uint8_t (18)][Friend request]
</code></pre></div></div>
<p>When a friend is added to toxcore with their Tox ID and a message, the
friend is added in <code class="language-plaintext highlighter-rouge">friend_connection</code> and then toxcore tries to send
friend requests.</p>
<p>When sending a friend request, toxcore will check if the peer which a
friend request is being sent to is already connected to using a
<code class="language-plaintext highlighter-rouge">net_crypto</code> connection which can happen if both are in the same group
chat. If this is the case the friend request will be sent as a
<code class="language-plaintext highlighter-rouge">net_crypto</code> packet using that connection. If not, it will be sent as an
onion data packet.</p>
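<p>As a rough sketch of the two layouts above, the following builds a friend
request and prefixes it with the packet id for the chosen transport. The
function names are hypothetical, the nospam is treated as 4 opaque bytes, and
the upper bound <code class="language-plaintext highlighter-rouge">ONION_CLIENT_MAX_DATA_SIZE</code> is not checked because its
value is not given in this section.</p>

```python
ONION_FRIEND_REQUEST_ID = 32    # first byte when sent as an onion data packet
CRYPTO_FRIEND_REQUEST_ID = 18   # first byte when sent as a net_crypto data packet


def build_friend_request(nospam: bytes, message: str) -> bytes:
    """Return [nospam][message]; nospam is taken as 4 raw bytes (assumption)."""
    body = message.encode("utf-8")
    if len(nospam) != 4:
        raise ValueError("nospam must be exactly 4 bytes")
    if not body:
        raise ValueError("message must be at least 1 byte")
    return nospam + body


def wrap_friend_request(request: bytes, connected: bool) -> bytes:
    """Prefix the packet id: net_crypto if already connected, onion otherwise."""
    packet_id = CRYPTO_FRIEND_REQUEST_ID if connected else ONION_FRIEND_REQUEST_ID
    return bytes([packet_id]) + request
```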
<p>Onion data packets contain the real public key of the sender and if a
<code class="language-plaintext highlighter-rouge">net_crypto</code> connection is established it means the peer knows our real
public key. This is why the friend request does not need to contain the
real public key of the peer.</p>
<p>Friend requests are sent with exponentially increasing intervals of 2
seconds, 4 seconds, 8 seconds, etc. in toxcore. This is so friend
requests get resent, but eventually at intervals that are so
big that they essentially expire. The sender has no way of knowing if a
peer refuses a friend request, which is why friend requests need to
expire in some way. Note that the interval is the minimum timeout: if
toxcore cannot send that friend request, it will try again until it
manages to send it. One reason for not being able to send the friend
request would be that the onion has not found the friend in the onion
and so cannot send an onion data packet to them.</p>
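<p>The resend schedule can be illustrated with a small generator. This is only
a sketch of the doubling interval; the name <code class="language-plaintext highlighter-rouge">resend_intervals</code> is
hypothetical, and the retry-until-sent behaviour is not modelled.</p>

```python
from itertools import islice
from typing import Iterator


def resend_intervals(first: int = 2) -> Iterator[int]:
    """Yield the minimum wait before each resend: 2, 4, 8, ... seconds."""
    interval = first
    while True:
        yield interval
        interval *= 2


# First five minimum intervals, in seconds.
first_five = list(islice(resend_intervals(), 5))
```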
<p>Received friend requests are passed to the client. The client is
expected to show the message from the friend request to the user and ask
the user if they want to accept the friend request or not. Friend
requests are accepted by adding the peer sending the friend request as a
friend and refused by simply ignoring it.</p>
<p>Friend requests are sent multiple times. To prevent the same friend
request from being passed to the client multiple times, toxcore keeps a
list of the last real public keys it received friend requests from and
discards any received friend request that is from a real public key in
that list. In toxcore this list is a simple circular list. There are
many ways this could be improved and made more efficient, as a circular
list isn't very efficient; however, it has worked well in toxcore so
far.</p>
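<p>A circular list of recently seen public keys might look like the sketch
below. The class name and the capacity of 8 are assumptions for illustration;
the list size used by toxcore is not stated here.</p>

```python
from typing import List, Optional


class RecentRequesters:
    """Fixed-size circular list of public keys that recently sent requests."""

    def __init__(self, capacity: int = 8) -> None:  # capacity is hypothetical
        self.slots: List[Optional[bytes]] = [None] * capacity
        self.next_index = 0

    def seen_before(self, public_key: bytes) -> bool:
        """Return True if the key is already listed; otherwise record it."""
        if public_key in self.slots:
            return True
        # Overwrite the oldest slot and advance, wrapping around at the end.
        self.slots[self.next_index] = public_key
        self.next_index = (self.next_index + 1) % len(self.slots)
        return False
```

<p>A request is passed to the client only when <code class="language-plaintext highlighter-rouge">seen_before</code> returns
false. The linear membership test is what makes this approach simple but not
very efficient.</p>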
<p>Friend requests from public keys that are already added to the friends
list should also be discarded.</p>
<h2 id="group">Group</h2>
<p>Group chats in Tox work by temporarily adding some peers present in the
group chat as temporary <code class="language-plaintext highlighter-rouge">friend_connection</code> friends, that are deleted
when the group chat is exited.</p>
<p>Each peer in the group chat is identified by their real long term public
key. Peers also transmit their DHT public keys to each other via the
group chat in order to speed up the connection by making it unnecessary
for the peers to find each other's DHT public keys with the onion, as
would happen had they added each other as normal friends.</p>
<p>The upside of using <code class="language-plaintext highlighter-rouge">friend_connection</code> is that group chats do not have
to deal with things like hole punching, peers only on TCP or other low
level networking things. The downside however is that every single peer
knows each other's real long term public key and DHT public key, meaning
that these group chats should only be used between friends.</p>
<p>Each peer adds a <code class="language-plaintext highlighter-rouge">friend_connection</code> for each of up to 4 other peers in
the group. If the group chat has 5 participants or fewer, each of the
peers will therefore have each of the others added to their list of
friend connections, and a peer wishing to send a message to the group
may communicate it directly to the other peers. When there are more than
5 peers, messages are relayed along friend connections.</p>
<p>Since the maximum number of peers per groupchat that will be connected
to with friend connections is 4, if the peers in the groupchat are
arranged in a circle and each peer connects to the 2 peers that are
closest to the right of them and the 2 peers that are closest to the
left of them, then the peers should form a well-connected circle of
peers.</p>
<p>Group chats in toxcore do this by subtracting the real long term public
key of each other peer in the group from our own (our PK - other peer
PK), using modular arithmetic, and finding the two peers for which the
result of this operation is the smallest. The operation is then reversed
(other peer PK - our PK) and done again with all the
public keys of the peers in the group. The 2 peers for which the result
is again the smallest are picked.</p>
<p>This gives 4 peers that are then added as a friend connection and
associated to the group. If every peer in the group does this, they will
form a circle of perfectly connected peers.</p>
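<p>The selection of the 4 closest peers can be sketched as below. This assumes
each 32 byte public key is interpreted as a big endian 256 bit integer and that
the subtraction is taken modulo 2^256; the exact comparison toxcore performs may
differ, and <code class="language-plaintext highlighter-rouge">closest_peers</code> is a hypothetical name.</p>

```python
from typing import List

MODULUS = 2 ** 256  # public keys are 32 bytes long


def key_to_int(public_key: bytes) -> int:
    """Interpret a public key as a big endian integer (an assumption)."""
    return int.from_bytes(public_key, "big")


def closest_peers(our_pk: bytes, others: List[bytes]) -> List[bytes]:
    """Pick up to 2 closest peers in each direction around the circle."""
    ours = key_to_int(our_pk)
    candidates = [pk for pk in others if pk != our_pk]
    # (our PK - other peer PK) mod 2^256: keep the two smallest results.
    one_side = sorted(
        candidates, key=lambda pk: (ours - key_to_int(pk)) % MODULUS
    )[:2]
    # (other peer PK - our PK) mod 2^256: keep the two smallest results.
    other_side = sorted(
        candidates, key=lambda pk: (key_to_int(pk) - ours) % MODULUS
    )[:2]
    chosen: List[bytes] = []
    for pk in one_side + other_side:
        if pk not in chosen:  # small groups can yield the same peer twice
            chosen.append(pk)
    return chosen
```

<p>Because the arithmetic wraps around, the peer with the smallest key treats
the peers with the largest keys as its neighbours on one side, which is what
closes the circle.</p>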
<p>Once the peers are connected to each other in a circle they relay each
other's messages. Every time a peer leaves the group or a new peer
joins, each member of the chat will recalculate the peers they should
connect to.</p>
<p>To join a group chat, a peer must first be invited to it by their
friend. To make a groupchat the peer will first create a groupchat and
then invite people to this group chat. Once their friends are in the
group chat, those friends can invite their other friends to the chat,
and so on.</p>
<p>To create a group chat, a peer generates a random 32 byte id that is
used to uniquely identify the group chat. 32 bytes is enough so that
when randomly generated with a secure random number generator every
groupchat ever created will have a different id. The goal of this 32
byte id is so that peers have a way of identifying each group chat, so
that they can prevent themselves from joining a groupchat twice for
example.</p>
<p>The groupchat will also have an unsigned 1 byte type. This type
indicates what kind of groupchat the groupchat is. The current types
are:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Type number</th>
<th style="text-align: left">Type</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">0</code></td>
<td style="text-align: left">text</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left">audio</td>
</tr>
</tbody>
</table>
<p>Text groupchats are text only, while audio indicates that the groupchat
supports sending audio to it as well as text.</p>
<p>The groupchat will also be identified by a unique unsigned 2 byte
integer, which in toxcore corresponds to the index of the groupchat in
the array it is being stored in. Every groupchat in the current instance
must have a different number. This number is used by groupchat peers
that are directly connected to us to tell us which packets are for which
groupchat. Every groupchat packet contains a 2 byte groupchat number.
Putting a 32 byte groupchat id in each packet would increase bandwidth
waste by a lot, and this is the reason why groupchat numbers are used
instead.</p>
<p>Using the group number as the index of the array used to store the
groupchat instances is recommended, because this kind of access is
usually most efficient and it ensures that each groupchat has a unique
group number.</p>
<p>When creating a new groupchat, the peer will add themselves as a
groupchat peer with a peer number of 0 and their own long term public
key and DHT public key.</p>
<p>Invite packets:</p>
<p>Invite packet:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> (0x60)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> (0x00)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">2</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint16_t</code> group number</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">33</code></td>
<td style="text-align: left">Group chat identifier</td>
</tr>
</tbody>
</table>
<p>Accept Invite packet:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> (0x60)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> (0x01)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">2</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint16_t</code> group number (local)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">2</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint16_t</code> group number to join</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">33</code></td>
<td style="text-align: left">Group chat identifier</td>
</tr>
</tbody>
</table>
<p>Member Information packet:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> (0x60)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> (0x02)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">2</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint16_t</code> group number (local)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">2</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint16_t</code> group number to join</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">33</code></td>
<td style="text-align: left">Group chat identifier</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">2</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint16_t</code> peer number</td>
</tr>
</tbody>
</table>
<p>A group chat identifier consists of a 1-byte type and a 32-byte id
concatenated:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> type</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> groupchat id</td>
</tr>
</tbody>
</table>
<p>To invite a friend to a group chat, an invite packet is sent to the
friend. These packets are sent using Messenger (if you look at the
Messenger packet id section, all the groupchat packet ids are in there).
Note that all numbers here, like all numbers sent using Tox packets, are
sent in big endian format.</p>
<p>The group chat number is, as explained above, the number used to uniquely
identify the groupchat instance from all the other groupchat instances
the peer has. It is sent in the invite packet because it is needed by
the friend in order to send back groupchat related packets.</p>
<p>What follows is the 33 byte group chat identifier.</p>
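<p>Putting the invite packet layout together, a minimal encoder and decoder
could look like this. The function names are hypothetical; the field sizes,
the packet ids and the big endian byte order come from the tables and text
above.</p>

```python
import struct
from typing import Tuple

GROUP_INVITE_PACKET_ID = 0x60
INVITE_ID = 0x00
IDENTIFIER_LENGTH = 33  # 1 byte type followed by a 32 byte groupchat id


def build_group_identifier(group_type: int, group_id: bytes) -> bytes:
    """Concatenate the 1 byte type and the 32 byte groupchat id."""
    if len(group_id) != 32:
        raise ValueError("groupchat id must be 32 bytes")
    return bytes([group_type]) + group_id


def build_invite(group_number: int, identifier: bytes) -> bytes:
    """Encode an invite packet; ">" selects big endian in struct."""
    if len(identifier) != IDENTIFIER_LENGTH:
        raise ValueError("group chat identifier must be 33 bytes")
    header = struct.pack(">BBH", GROUP_INVITE_PACKET_ID, INVITE_ID, group_number)
    return header + identifier


def parse_invite(packet: bytes) -> Tuple[int, int, bytes]:
    """Return (group number, group type, groupchat id) from an invite packet."""
    if len(packet) != 4 + IDENTIFIER_LENGTH:
        raise ValueError("unexpected invite packet length")
    packet_id, invite_id, group_number = struct.unpack(">BBH", packet[:4])
    if packet_id != GROUP_INVITE_PACKET_ID or invite_id != INVITE_ID:
        raise ValueError("not an invite packet")
    return group_number, packet[4], packet[5:]
```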
<p>To refuse the invite, the friend receiving it will simply ignore and
discard it.</p>
<p>To accept the invite, the friend will create their own groupchat
instance with the 1 byte type and 32 byte groupchat id sent in the
request, and send an invite accept packet back. The friend will also add
the peer who sent the invite as a groupchat connection, and mark the
connection as introducing the friend.</p>
<p>If the friend being invited is already in the group, they will respond
with a member information packet, add the peer who sent the invite as a
groupchat connection, and mark the connection as introducing both the
friend and the peer who sent the invite.</p>
<p>The first group number in the invite accept packet is the group number
of the groupchat the invited friend just created. The second group
number is the group number that was sent in the invite request. What
follows is the 33 byte group chat identifier which was sent in the
invite request. The member information packet is the same, but also
includes the current peer number of the invited friend.</p>
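<p>The two replies to an invite differ only in the second byte and the trailing
peer number. A sketch of both encoders, using hypothetical function names and
the field order given in the tables above:</p>

```python
import struct

GROUP_INVITE_PACKET_ID = 0x60
INVITE_ACCEPT_ID = 0x01
MEMBER_INFORMATION_ID = 0x02


def build_invite_accept(
    local_group_number: int, inviter_group_number: int, identifier: bytes
) -> bytes:
    """[0x60][0x01][our group number][group number to join][identifier]."""
    if len(identifier) != 33:
        raise ValueError("group chat identifier must be 33 bytes")
    header = struct.pack(
        ">BBHH",
        GROUP_INVITE_PACKET_ID,
        INVITE_ACCEPT_ID,
        local_group_number,
        inviter_group_number,
    )
    return header + identifier


def build_member_information(
    local_group_number: int,
    inviter_group_number: int,
    identifier: bytes,
    peer_number: int,
) -> bytes:
    """Same layout as the accept packet, with id 0x02 and our peer number."""
    if len(identifier) != 33:
        raise ValueError("group chat identifier must be 33 bytes")
    header = struct.pack(
        ">BBHH",
        GROUP_INVITE_PACKET_ID,
        MEMBER_INFORMATION_ID,
        local_group_number,
        inviter_group_number,
    )
    return header + identifier + struct.pack(">H", peer_number)
```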
<p>When a peer receives an invite accept packet they will check if the
group identifier sent back corresponds to the group identifier of the
groupchat with the group number also sent back. If so, a new peer number
will be generated for the peer that sent the invite accept packet. Then
the peer with their generated peer number, their long term public key
and their DHT public key will be added to the peer list of the
groupchat. A new peer message packet will also be sent to tell everyone
in the group chat about the new peer. The peer will also be added as a
groupchat connection, and the connection will be marked as introducing
the peer.</p>
<p>When a peer receives a member information packet they proceed as with an
invite accept packet, but use the peer number in the packet rather than
generating a new one, and mark the new connection as also introducing
the peer receiving the member information packet.</p>
<p>Peer numbers are used to uniquely identify each peer in the group chat.
They are used in groupchat message packets so that peers receiving them
can know who or which groupchat peer sent them. As groupchat message
packets are relayed, they must contain something that is used by others
to identify the sender. Since putting a 32 byte public key in each
packet would be wasteful, a 2 byte peer number is used instead. Each
peer in the groupchat has a unique peer number. Toxcore generates each
peer number randomly but makes sure newly generated peer numbers are not
equal to current ones already used by other peers in the group chat. If
two peers join the groupchat from two different endpoints there is a
small possibility that both will be given the same peer number, but the
probability of this occurring is low enough in practice that it is not
an issue.</p>
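<p>Generating a peer number that is not already in use can be sketched as
follows. The function name is hypothetical, and the use of Python's
<code class="language-plaintext highlighter-rouge">secrets</code> module is just one way to obtain a random 2 byte value.</p>

```python
import secrets
from typing import Set

PEER_NUMBER_COUNT = 65536  # a peer number is an unsigned 2 byte integer


def new_peer_number(used: Set[int]) -> int:
    """Draw random 2 byte numbers until one is not used in this group chat."""
    if len(used) >= PEER_NUMBER_COUNT:
        raise ValueError("no free peer numbers")
    while True:
        candidate = secrets.randbelow(PEER_NUMBER_COUNT)
        if candidate not in used:
            return candidate
```

<p>This only guards against collisions with peers the generating peer already
knows about, which is why two peers joining through different endpoints can
still, with low probability, end up with the same number.</p>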
<p>Peer online packet:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> (0x61)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">2</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint16_t</code> group number (local)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">33</code></td>
<td style="text-align: left">Group chat identifier</td>
</tr>
</tbody>
</table>
<p>Peer introduced packet:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> (0x62)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">2</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint16_t</code> group number (local)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> (0x01)</td>
</tr>
</tbody>
</table>
<p>For a groupchat connection to work, both peers in the groupchat must be
attempting to connect directly to each other.</p>
<p>Groupchat connections are established when both peers who want to
connect to each other either create a new friend connection to connect
to each other or reuse an existing friend connection that connects them
together (if they are friends or already are connected together because
of another group chat).</p>
<p>As soon as the connection to the other peer is opened, a peer online
packet is sent to the peer. The goal of the online packet is to tell the
peer that we want to establish the groupchat connection with them and
tell them the groupchat number of our groupchat instance. The peer
online packet contains the group number and the 33 byte group chat
identifier. The group number is the group number the peer has for the
group with the group id sent in the packet.</p>
<p>When both sides send an online packet to the other peer, a connection is
established.</p>
<p>When an online packet is received from a peer, if the connection to the
peer is already established (an online packet has been already
received), or if there is no group connection to that peer being
established, the packet is dropped. Otherwise, the group number to
communicate with the group via the peer is saved, the connection is
considered established, and an online packet is sent back to the peer. A
ping message is sent to the group. If this is the first group connection
to that group we establish, or the connection is marked as introducing
us, we send a peer query packet back to the peer. This is so we can get
the list of peers from the group. If the connection is marked as
introducing the peer, we send a new peer message to the group announcing
the peer, and a name message reannouncing our name.</p>
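<p>The handling of a received online packet can be sketched as a small state
update that returns the follow-up actions. Everything named here
(<code class="language-plaintext highlighter-rouge">GroupConnection</code>, <code class="language-plaintext highlighter-rouge">handle_peer_online</code>, the action strings) is
hypothetical; the sketch only mirrors the order of checks in the paragraph
above and does not validate the group chat identifier.</p>

```python
import struct
from dataclasses import dataclass
from typing import List, Optional

PEER_ONLINE_PACKET_ID = 0x61


def build_peer_online(local_group_number: int, identifier: bytes) -> bytes:
    """[0x61][our group number][33 byte group chat identifier]."""
    return struct.pack(">BH", PEER_ONLINE_PACKET_ID, local_group_number) + identifier


@dataclass
class GroupConnection:
    established: bool = False
    remote_group_number: Optional[int] = None
    introducing_us: bool = False    # connection is marked as introducing us
    introducing_peer: bool = False  # connection is marked as introducing the peer


def handle_peer_online(
    conn: Optional[GroupConnection], packet: bytes, first_connection: bool
) -> List[str]:
    """Update conn and return the names of the packets to send next."""
    # Drop if no connection is being established or it is already established.
    if conn is None or conn.established:
        return []
    (remote_group_number,) = struct.unpack(">H", packet[1:3])
    conn.remote_group_number = remote_group_number
    conn.established = True
    actions = ["send_online", "send_ping"]
    if first_connection or conn.introducing_us:
        actions.append("send_peer_query")
    if conn.introducing_peer:
        actions += ["send_new_peer", "send_name"]
    return actions
```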
<p>A groupchat connection can be marked as introducing one or both of the
peers it connects, to indicate that the connection should be maintained
until that peer is well connected to the group. A peer maintains a
groupchat connection to a second peer as long as the second peer is one
of the four closest peers in the groupchat to the first, or the
connection is marked as introducing a peer who still requires the
connection. A peer requires a groupchat connection to a second peer
which introduces the first peer until the first peer has more than 4
groupchat connections and receives a message from the second peer via a
different groupchat connection. The first peer then sends a peer
introduced packet to the second peer to indicate that they no longer
require the connection.</p>
<p>Peer query packet:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> (0x62)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">2</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint16_t</code> group number</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> (0x08)</td>
</tr>
</tbody>
</table>
<p>Peer response packet:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> (0x62)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">2</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint16_t</code> group number</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> (0x09)</td>
</tr>
<tr>
<td style="text-align: left">variable</td>
<td style="text-align: left">Peer info, repeated once for each peer</td>
</tr>
</tbody>
</table>
<p>The Peer info structure is as follows:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">2</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint16_t</code> peer number</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">Long term public key</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">DHT public key</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> Name length</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">[0, 255]</code></td>
<td style="text-align: left">Name</td>
</tr>
</tbody>
</table>
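<p>Since each Peer info entry carries its own name length, the variable part of
a peer response packet can be decoded by walking the buffer entry by entry. A
minimal sketch, with a hypothetical function name, operating on the bytes that
follow the 0x09 byte:</p>

```python
import struct
from typing import List, Tuple

FIXED_PART = 2 + 32 + 32 + 1  # peer number, two public keys, name length


def parse_peer_infos(data: bytes) -> List[Tuple[int, bytes, bytes, bytes]]:
    """Return (peer number, long term PK, DHT PK, name) for each entry."""
    peers = []
    offset = 0
    while offset != len(data):
        header_end = offset + FIXED_PART
        if header_end > len(data):
            raise ValueError("truncated Peer info header")
        (peer_number,) = struct.unpack(">H", data[offset:offset + 2])
        long_term_pk = data[offset + 2:offset + 34]
        dht_pk = data[offset + 34:offset + 66]
        name_length = data[offset + 66]
        name_end = header_end + name_length
        if name_end > len(data):
            raise ValueError("truncated name")
        peers.append((peer_number, long_term_pk, dht_pk, data[header_end:name_end]))
        offset = name_end
    return peers
```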
<p>Title response packet:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> (0x62)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">2</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint16_t</code> group number</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> (0x0a)</td>
</tr>
<tr>
<td style="text-align: left">variable</td>
<td style="text-align: left">Title</td>
</tr>
</tbody>
</table>
<p>Message packets:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> (0x63)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">2</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint16_t</code> group number</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">2</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint16_t</code> peer number</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">4</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint32_t</code> message number</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> with a value representing id of message</td>
</tr>
<tr>
<td style="text-align: left">variable</td>
<td style="text-align: left">Data</td>
</tr>
</tbody>
</table>
<p>Lossy Message packets:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> (0xc7)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">2</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint16_t</code> group number</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">2</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint16_t</code> peer number</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">2</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint16_t</code> message number</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> with a value representing id of message</td>
</tr>
<tr>
<td style="text-align: left">variable</td>
<td style="text-align: left">Data</td>
</tr>
</tbody>
</table>
<p>If a peer query packet is received, the receiver takes their list of
peers and creates a peer response packet which is then sent to the other
peer. If there are too many peers in the group chat and the peer
response packet would be larger than the maximum size of friend
connection packets (1373 bytes), more than one peer response packet is
sent back. A Title response packet is also sent back. This is how the
peer that joins a group chat finds out the list of peers in the group
chat and the title of the group chat right after joining.</p>
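<p>The splitting rule above can be sketched as follows. This is a minimal illustration, not the toxcore implementation: the function name <code class="language-plaintext highlighter-rouge">split_peer_infos</code> is hypothetical, and the 4-byte header size assumes the peer response layout shown in the table (packet id, group number, 0x09).</p>

```python
MAX_PACKET_SIZE = 1373       # maximum friend connection packet size, from the text
HEADER_SIZE = 1 + 2 + 1      # 0x62, uint16_t group number, 0x09 (assumed layout)


def split_peer_infos(encoded_peers, max_size=MAX_PACKET_SIZE):
    """Group already-encoded Peer info entries into payloads, one per packet."""
    payloads, current = [], b""
    for entry in encoded_peers:
        # Start a new packet when adding this entry would exceed the limit.
        if current and HEADER_SIZE + len(current) + len(entry) > max_size:
            payloads.append(current)
            current = b""
        current += entry
    if current:
        payloads.append(current)
    return payloads
```

<p>Each returned payload would then be prefixed with the peer response header and sent as its own packet.</p>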
<p>Peer response packets are straightforward and contain the information
for each peer (peer number, long term public key, DHT public key, name)
appended to each other. The title response is also straightforward.</p>
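<p>A minimal sketch of encoding and decoding the Peer info structure is shown below. The function names are hypothetical, and the peer number is assumed to be big endian, as the text states for the numbers in message packets.</p>

```python
import struct

FIXED_SIZE = 2 + 32 + 32 + 1   # peer number, two public keys, name length


def pack_peer_info(peer_number, long_term_pk, dht_pk, name):
    """Encode one Peer info entry (peer number assumed big endian)."""
    if len(long_term_pk) != 32 or len(dht_pk) != 32:
        raise ValueError("public keys must be 32 bytes")
    if len(name) > 255:
        raise ValueError("name length must fit in a uint8_t")
    return struct.pack(">H", peer_number) + long_term_pk + dht_pk + bytes([len(name)]) + name


def unpack_peer_infos(payload):
    """Decode consecutive Peer info entries from a peer response payload."""
    peers, offset = [], 0
    while offset < len(payload):
        if len(payload) - offset < FIXED_SIZE:
            raise ValueError("truncated peer info")
        (peer_number,) = struct.unpack_from(">H", payload, offset)
        long_term_pk = payload[offset + 2:offset + 34]
        dht_pk = payload[offset + 34:offset + 66]
        name_len = payload[offset + 66]
        name = payload[offset + FIXED_SIZE:offset + FIXED_SIZE + name_len]
        if len(name) != name_len:
            raise ValueError("truncated name")
        peers.append((peer_number, long_term_pk, dht_pk, name))
        offset += FIXED_SIZE + name_len
    return peers
```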
<p>The maximum length of both groupchat peer names and the groupchat title
is 128 bytes. This is the same maximum length as names in all of
toxcore.</p>
<p>When a peer receives a peer response packet, they will add each of the
received peers to their groupchat peer list, find the 4 closest peers to
them and create groupchat connections to them as was explained
previously. The DHT public key of an already known peer is updated to
one given in the response packet if the peer is frozen, or if it has
been frozen since its DHT public key was last updated.</p>
<p>When a peer receives a title response packet, they update the title for
the groupchat accordingly if the title has not already been set, or if
since it was last set there has been a time at which all peers were
frozen.</p>
<p>If the peer does not yet know their own peer number, as is the case if
they have just accepted an invitation, the peer will find themselves in
the list of received peers and use the peer number assigned to them as
their own. They are then able to send messages and invite other peers to
the groupchat. They immediately send a name message to announce their
name to the group.</p>
<p>Message packets are used to send messages to all peers in the groupchat.
To send a message packet, a peer will first take their peer number and
the message they want to send. Each message packet sent will have a
message number that is equal to the last message number sent + 1. Like
all other numbers (group chat number, peer number) in the packet, the
message number in the packet will be in big endian format.</p>
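<p>As an illustration, a Message packet could be assembled as in the sketch below. The function name is hypothetical, and wrapping the message number at 2<sup>32</sup> is an assumption that the text does not spell out.</p>

```python
import struct

PACKET_ID_MESSAGE = 0x63


def build_message_packet(group_number, peer_number, last_message_number, message_id, data=b""):
    """Return (message_number, packet) for the next message from this peer."""
    # Assumption: the uint32_t counter wraps around on overflow.
    message_number = (last_message_number + 1) & 0xFFFFFFFF
    # ">" selects big endian with no padding: 1 + 2 + 2 + 4 + 1 = 10 header bytes.
    header = struct.pack(">BHHIB", PACKET_ID_MESSAGE, group_number, peer_number,
                         message_number, message_id)
    return message_number, header + data
```

<p>The caller keeps the returned message number and passes it back in as <code class="language-plaintext highlighter-rouge">last_message_number</code> for the next message.</p>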
<p>When a Message packet is received, the peer receiving it will first
check that the peer number of the sender is in their peer list. If not,
the peer ignores the message but sends a peer query packet to the peer
the packet was directly received from. That peer should have the message
sender in their peer list, and so will send the sender’s peer info back
in a peer response.</p>
<p>If the sender is in the receiver’s peer list, the receiver now checks
whether they have already seen a message with the same sender and
message number. This is achieved by storing the 8 greatest message
numbers received from a given sender peer number. If the message has a
lower message number than all of those 8, it is assumed to have been
received. If the message has already been received according to this
check, or if it is a name or title message and another message of the
same type from the same sender with a greater message number has been
received, then the packet is discarded. Otherwise, the message is
processed as described below, and a Message packet with the message is
sent (relayed) to all current group connections except the one that it
was received from, and also to that one if that peer is the original
sender of the message. The only thing that should change in the Message
packet as it is relayed is the group number.</p>
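<p>The duplicate check described above can be sketched as follows. This is a simplified model with hypothetical names: it keeps the 8 greatest message numbers per sender and ignores both counter wraparound and the extra rule for name and title messages.</p>

```python
class SeenMessages:
    """Tracks the greatest message numbers received from one sender."""

    def __init__(self, capacity=8):
        self.capacity = capacity
        self.numbers = []   # kept sorted in descending order

    def check_and_record(self, message_number):
        """Return True if the message is new (and record it), else False."""
        if message_number in self.numbers:
            return False    # exact duplicate
        if len(self.numbers) == self.capacity and message_number < self.numbers[-1]:
            return False    # older than everything remembered: assumed received
        self.numbers.append(message_number)
        self.numbers.sort(reverse=True)
        del self.numbers[self.capacity:]   # forget the smallest if over capacity
        return True
```

<p>A message for which <code class="language-plaintext highlighter-rouge">check_and_record</code> returns <code class="language-plaintext highlighter-rouge">True</code> would be processed and relayed; any other message would be discarded.</p>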
<p>Lossy message packets are used to send audio packets to others in audio
group chats. Lossy packets work the same way as normal relayed groupchat
messages in that they are relayed to everyone in the group chat until
everyone has them, but there are a few differences. Firstly, the message
number is only a 2 byte integer. When receiving a lossy packet from a
peer the receiving peer will first check if a message with that message
number was already received from that peer. If it wasn’t, the packet
will be added to the list of received packets and then the packet will
be passed to its handler and then sent to the 2 closest connected
groupchat peers that are not the sender. The reason for it to be 2
instead of 4 (or 3 if we are not the original sender) as for lossless
message packets is that it reduces bandwidth usage without lowering the
quality of the received audio stream via lossy packets, at the cost of
reduced robustness against connections failing. To check if a packet was
already received, the last 256 message numbers received from each peer
are stored. If video were added, meaning a much higher number of packets
would be sent, this number would need to be increased. If the packet number
is in this list, then the packet has already been received.</p>
<h3 id="message-ids">Message ids</h3>
<h4 id="ping-0x00">ping (0x00)</h4>
<p>Sent approximately every 20 seconds by every peer. Contains no data.</p>
<h4 id="new_peer-0x10"><code class="language-plaintext highlighter-rouge">new_peer</code> (0x10)</h4>
<p>Tell everyone about a new peer in the chat.</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">2</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint16_t</code> Peer number</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">Long term public key</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">DHT public key</td>
</tr>
</tbody>
</table>
<h4 id="kill_peer-0x11"><code class="language-plaintext highlighter-rouge">kill_peer</code> (0x11)</h4>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">2</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint16_t</code> Peer number</td>
</tr>
</tbody>
</table>
<h4 id="freeze_peer-0x12"><code class="language-plaintext highlighter-rouge">freeze_peer</code> (0x12)</h4>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">2</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint16_t</code> Peer number</td>
</tr>
</tbody>
</table>
<h4 id="name-change-0x30">Name change (0x30)</h4>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left">variable</td>
<td style="text-align: left">Name (namelen)</td>
</tr>
</tbody>
</table>
<h4 id="groupchat-title-change-0x31">Groupchat title change (0x31)</h4>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left">variable</td>
<td style="text-align: left">Title (titlelen)</td>
</tr>
</tbody>
</table>
<h4 id="chat-message-0x40">Chat message (0x40)</h4>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left">variable</td>
<td style="text-align: left">Message (messagelen)</td>
</tr>
</tbody>
</table>
<h4 id="action-me-0x41">Action (/me) (0x41)</h4>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left">variable</td>
<td style="text-align: left">Message (messagelen)</td>
</tr>
</tbody>
</table>
<p>Ping messages are sent every 20 seconds by every peer. This is how other
peers know that the peers are still alive.</p>
<p>When a new peer joins, the peer which invited the joining peer will send
a new peer message to notify everyone that there is a new peer in the
chat. When a new peer message is received, the peer in the message must
be added to the peer list if it is not there already, and its DHT public
key must be set to that in the message.</p>
<p>Kill peer messages are used to indicate that a peer has quit the group
chat permanently. Freeze peer messages are similar, but indicate that
the quitting peer may later return to the group. Each is sent by the one
quitting the group chat right before they quit it.</p>
<p>Name change messages are used to change or set the name of the peer
sending it. They are also sent by a joining peer right after receiving
the list of peers in order to tell others what their name is.</p>
<p>Title change messages are used to change the title of the group chat and
can be sent by anyone in the group chat.</p>
<p>Chat and action messages are used by the group chat peers to send
messages to others in the group chat.</p>
<h3 id="timeouts-and-reconnection">Timeouts and reconnection</h3>
<p>Groupchat connections may go down, and this may lead to a peer becoming
disconnected from the group or the group otherwise splitting into
multiple connected components. To ensure the group becomes fully
connected again once suitable connections are re-established, peers keep
track of peers who are no longer visible in the group (“frozen” peers),
and try to re-integrate them into the group via any suitable friend
connections which may come to be available. The rejoin packet is used
for this.</p>
<p>Rejoin packet:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> (0x64)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">33</code></td>
<td style="text-align: left">Group chat identifier</td>
</tr>
</tbody>
</table>
<p>A peer in a groupchat is considered to be active when a group message or
rejoin packet is received from it, or a new peer message is received for
it. A peer which remains inactive for 60 seconds is set as frozen; this
means it is removed from the peer list and added to a separate list of
frozen peers. Frozen peers are disregarded for all purposes except those
discussed below.</p>
<p>If a frozen peer becomes active, we unfreeze it, meaning that we move it
from the frozen peers list to the peer list, and we send a name message
to the group.</p>
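<p>The freezing and unfreezing rules can be modelled as in the sketch below. The class and method names are hypothetical, and time is passed in explicitly to keep the example deterministic.</p>

```python
FREEZE_TIMEOUT = 60.0   # seconds of inactivity before a peer is frozen


class PeerTable:
    def __init__(self):
        self.active = {}      # public key -> time the peer was last active
        self.frozen = set()   # public keys of frozen peers

    def mark_active(self, public_key, now):
        """Record activity; return True if this unfroze the peer."""
        was_frozen = public_key in self.frozen
        self.frozen.discard(public_key)
        self.active[public_key] = now
        return was_frozen

    def freeze_inactive(self, now):
        """Move peers that have been inactive for the timeout to the frozen set."""
        stale = [pk for pk, seen in self.active.items() if now - seen >= FREEZE_TIMEOUT]
        for pk in stale:
            del self.active[pk]
            self.frozen.add(pk)
        return stale
```

<p>When <code class="language-plaintext highlighter-rouge">mark_active</code> returns <code class="language-plaintext highlighter-rouge">True</code>, the caller would send a name message to the group, as described above.</p>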
<p>Whenever we make a new friend connection to a peer, we check whether the
public key of the peer is that of any frozen peer. If so, we send a
rejoin packet to the peer along the friend connection, and create a
groupchat connection to the peer, marked as introducing us, and send a
peer online packet to the peer.</p>
<p>If we receive a rejoin packet from a peer along a friend connection,
then, after unfreezing the peer if it was frozen, we update the peer’s
DHT public key in the groupchat peer list to the key in the friend
connection, and create a groupchat connection for the peer, marked as
introducing the peer, and send a peer online packet to the peer.</p>
<p>When a peer is added to the peer list, any existing peer in the peer
list or frozen peers list with the same public key is first removed.</p>
<h2 id="dht-group-chats">DHT Group Chats</h2>
<p>This document details the groupchat implementation, giving a high level
overview of all the important features and aspects, as well as some
important low level implementation details. This documentation reflects
what is currently implemented at the time of writing; it is not
speculative. For detailed API docs see the groupchats section of the
tox.h header file.</p>
<h3 id="features">Features</h3>
<ul>
<li>
<p>Plain and action messages (/me)</p>
</li>
<li>
<p>Private messages</p>
</li>
<li>
<p>Public groups (peers may join using the group’s public key)</p>
</li>
<li>
<p>Private groups (joining requires an invite from a friend)</p>
</li>
<li>
<p>Permanence (a group cannot die as long as at least one peer
retains their group credentials)</p>
</li>
<li>
<p>Persistence across client restarts</p>
</li>
<li>
<p>Ability to set peer limits</p>
</li>
<li>
<p>Group roles (founder, moderators, users, observers)</p>
</li>
<li>
<p>Moderation (kicking, silencing, controlling which roles may speak)</p>
</li>
<li>
<p>Permanent group names (set on creation)</p>
</li>
<li>
<p>Topics (permission to modify is set by founder)</p>
</li>
<li>
<p>Password protection</p>
</li>
<li>
<p>Self-repairing (auto-rejoin on disconnect, group split protection,
state syncing)</p>
</li>
<li>
<p>Identity separation from the Tox ID</p>
</li>
<li>
<p>Ability to ignore peers</p>
</li>
<li>
<p>Nicknames can be set on a per-group basis</p>
</li>
<li>
<p>Peer statuses (online, away, busy) which can be set on a per-group
basis</p>
</li>
<li>
<p>Sending group name in invites</p>
</li>
<li>
<p>Ability to disconnect from group and join later with the same
credentials</p>
</li>
</ul>
<h3 id="group-roles">Group roles</h3>
<p>There are four distinct roles which are hierarchical in nature (higher
roles have all the privileges of lower roles).</p>
<ul>
<li>
<p><strong><code class="language-plaintext highlighter-rouge">Founder</code></strong> - The group’s creator. May set the role of all other
peers to anything except founder. May modify the shared state
(password, privacy state, topic lock, peer limit).</p>
</li>
<li>
<p><strong><code class="language-plaintext highlighter-rouge">Moderator</code></strong> - Promoted by the founder. May kick peers below this
role, as well as set peers with the user role to observer, and vice
versa. May also set the topic when the topic lock is enabled.</p>
</li>
<li>
<p><strong><code class="language-plaintext highlighter-rouge">User</code></strong> - Default non-founder role. May communicate with other
peers normally. May set the topic when the topic lock is disabled.</p>
</li>
<li>
<p><strong><code class="language-plaintext highlighter-rouge">Observer</code></strong> - Demoted by moderators and the founder. May observe
the group and ignore peers; may not communicate with other peers or
with the group.</p>
</li>
</ul>
<h3 id="group-types">Group types</h3>
<p>Groups can have two types: private and public. The type can be set on
creation, and may also be toggled by the group founder at any point
after creation. (<em>Note: password protection is independent of the group
type</em>)</p>
<h4 id="public">Public</h4>
<p>Anyone may join the group using the Chat ID. If the group is public,
information about peers inside the group, including their IP addresses
and group public keys (but not their Tox IDs) is visible to anyone with
access to a node storing their DHT announcement.</p>
<h4 id="private">Private</h4>
<p>The only way to join a private group is by having someone in your friend
list send you an invite. If the group is private, no peer/group
information (mentioned in the Public section) is present in the DHT; the
DHT is not used for any purpose at all. If a public group is set to
private, all DHT information related to the group will expire within a
few minutes.</p>
<h3 id="voice-state">Voice state</h3>
<p>The voice state, which may only be set by the founder, determines which
group roles have permission to speak. There are three voice states:</p>
<ul>
<li>
<p><strong><code class="language-plaintext highlighter-rouge">Founder</code></strong> - Only the founder may speak.</p>
</li>
<li>
<p><strong><code class="language-plaintext highlighter-rouge">Moderator</code></strong> - The founder and moderators may speak.</p>
</li>
<li>
<p><strong><code class="language-plaintext highlighter-rouge">All</code></strong> - Everyone except observers may speak.</p>
</li>
</ul>
<p>The voice state does not affect topic setting or private messages, and
is set to <strong><code class="language-plaintext highlighter-rouge">All</code></strong> by default.</p>
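<p>The relationship between roles and voice states can be expressed as a simple comparison, sketched below. The enum names and their numeric values are chosen for illustration only and are not taken from any implementation.</p>

```python
from enum import IntEnum


class Role(IntEnum):
    # Illustrative ordering: a higher value means more privileges.
    OBSERVER = 0
    USER = 1
    MODERATOR = 2
    FOUNDER = 3


class VoiceState(IntEnum):
    ALL = 0
    MODERATOR = 1
    FOUNDER = 2


# Lowest role that is allowed to speak in each voice state.
MIN_SPEAKING_ROLE = {
    VoiceState.ALL: Role.USER,
    VoiceState.MODERATOR: Role.MODERATOR,
    VoiceState.FOUNDER: Role.FOUNDER,
}


def may_speak(role, voice_state=VoiceState.ALL):
    """Return True if a peer with this role may speak to the group."""
    return role >= MIN_SPEAKING_ROLE[voice_state]
```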
<h3 id="cryptography">Cryptography</h3>
<p>Groupchats use the <a href="https://en.wikipedia.org/wiki/NaCl_\(software\)">NaCl/libsodium cryptography
library</a> for all
cryptography related operations. All group communication is end-to-end
encrypted. Message confidentiality, integrity, and repudiability are
guaranteed via <a href="https://en.wikipedia.org/wiki/Authenticated_encryption">authenticated
encryption</a>, and
<a href="https://en.wikipedia.org/wiki/Forward_secrecy">perfect forward secrecy</a>
is also provided.</p>
<p>One of the most important security improvements from the old groupchat
implementation is the removal of a message-relay mechanism that uses a
group-wide shared key. Instead, connections are 1-to-1 (a complete
graph), meaning an outbound message is sent once per peer, and
encrypted/decrypted using a session key unique to each pair of peers.
This prevents MITM attacks that were previously possible. This
additionally ensures that private messages are truly private.</p>
<p>Groups make use of 11 unique keys in total: Two permanent keypairs
(encryption and signature), two group keypairs (encryption and
signature), one session keypair (encryption), and one shared symmetric
key (encryption).</p>
<p>The Tox ID/Tox public key is not used for any purpose. As such, neither
peers in a given group nor in the group DHT can be matched with their
Tox ID. In other words, there is no way of identifying a peer aside from
their IP address, nickname, and group public key. (<em>Note: group
nicknames can be different from the client’s main nickname that their
friends see</em>).</p>
<h4 id="permanent-keypairs">Permanent keypairs</h4>
<p>When a peer creates or joins a group they generate two permanent
keypairs: an encryption keypair and a signature keypair, both of which
are unique to the group. The two public keys are the only guaranteed way
to identify a peer, and both keypairs will persist for as long as a peer
remains in the group (even across client restarts). If a peer exits the
group these keypairs will be lost forever.</p>
<p>This encryption keypair is not used for any encryption operations except
for the initial handshake when connecting to another peer. For usage
details on the signature key, see the <a href="#moderation"><code class="language-plaintext highlighter-rouge">Moderation</code></a>
section.</p>
<h4 id="session-keypairshared-symmetric-key">Session keypair/shared symmetric key</h4>
<p>When two peers establish a connection they each generate an ephemeral
session encryption keypair and exchange the resulting public keys.
With their own session secret key and the other’s session public key,
they will both generate the same symmetric encryption key. This
symmetric key, which must not be exposed to anyone else, will be used
for all further encryption and decryption operations between the two
peers for the duration of the session.</p>
<p>The purpose of this extra key exchange is to prevent an adversary from
decrypting messages from previous sessions in the event that a secret
encryption key becomes compromised. This is known as forward secrecy.</p>
<p>Session keys are periodically rotated to further reduce the potential
damage in the event of a security breach, as well as to mitigate certain
types of data-based cryptography attacks.</p>
<h4 id="group-keypairs">Group keypairs</h4>
<p>The group founder generates two additional permanent keypairs when the
group is created: an encryption keypair, and a signature keypair. The
public signature key is considered the <strong><code class="language-plaintext highlighter-rouge">Chat ID</code></strong> and is used as the
group’s permanent identifier, allowing other peers to join public groups
via the DHT. Every peer in the group holds a copy of the group’s public
encryption key along with the public signature key/Chat ID.</p>
<p>The group secret keys are similar to the permanent keypairs in that they
will persist across client restarts, but will be lost forever if the
founder exits the group. This is particularly important as
administration related functionality will not work without these keys.</p>
<p>See the <a href="#founders"><code class="language-plaintext highlighter-rouge">Founders</code></a> section for usage details.</p>
<h3 id="founders">Founders</h3>
<p>The peer who creates the group is the group’s founder. Founders have a
set of admin privileges, including:</p>
<ul>
<li>
<p>Promoting and demoting moderators</p>
</li>
<li>
<p>The ability to kick moderators alongside non-moderators</p>
</li>
<li>
<p>Setting the peer limit</p>
</li>
<li>
<p>Setting the group’s privacy state</p>
</li>
<li>
<p>Setting group passwords</p>
</li>
<li>
<p>Toggling the topic lock</p>
</li>
<li>
<p>Setting the voice state</p>
</li>
</ul>
<h4 id="shared-state">Shared state</h4>
<p>Groups contain a data structure called the <strong><code class="language-plaintext highlighter-rouge">shared state</code></strong> which is
given to every peer who joins the group. Within this structure resides
all data pertaining to the group that may only be modified by the group
founder. This includes the group name, the group type, the peer limit,
the topic lock, the password, and the voice state. The shared state
holds a copy of the group founder’s public encryption and signature
keys, which is how other peers in the group are able to verify the
identity of the group founder. It also contains a hash of the moderator
list.</p>
<p>The shared state is signed by the founder using the group secret
signature key. As the founder is the only peer who holds this secret
key, the shared state can be shared with new peers and cryptographically
verified even in the absence of the founder.</p>
<p>When the founder modifies the shared state, he increments the shared
state version, signs the new shared state data with the group secret
signature key, and broadcasts the new shared state data along with its
signature to the entire group. When a peer receives this broadcast, he
uses the group public signature key to verify that the data was signed
with the group secret signature key, and also verifies that the new
version is not older than the current version.</p>
<h4 id="moderation">Moderation</h4>
<p>The founder has the ability to promote other peers to the moderator
role. Moderators have all the privileges of normal users. In addition,
they have the power to kick peers whose role is below moderator, as well
as set their roles to anything below moderator. Moderators may also
modify the group topic when it is locked. Moderators have no power over
one another; only the founder can kick or change the role of a
moderator.</p>
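<p>The moderation rules above amount to two permission checks, sketched here. The enum values and function names are illustrative assumptions; only the ordering of the roles matters.</p>

```python
from enum import IntEnum


class Role(IntEnum):
    # Illustrative ordering: a higher value means more privileges.
    OBSERVER = 0
    USER = 1
    MODERATOR = 2
    FOUNDER = 3


def may_kick(actor, target):
    """Moderators and the founder may kick peers strictly below their own role."""
    return actor >= Role.MODERATOR and target < actor


def may_set_role(actor, target, new_role):
    """The target and the new role must both be strictly below the actor's role."""
    return actor >= Role.MODERATOR and target < actor and new_role < actor
```

<p>Under this model a moderator cannot act on another moderator, and nobody can assign the founder role, which matches the description above.</p>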
<h4 id="kicks">Kicks</h4>
<p>When a peer is kicked from the group, he will be disconnected from all
group peers, his role will be set to user, and his chat instance will be
left in a disconnected state. His public key will not be lost; he will
be able to reconnect to the group with the same identity.</p>
<h4 id="moderator-list">Moderator list</h4>
<p>Each peer holds a copy of the <strong><code class="language-plaintext highlighter-rouge">moderator list</code></strong>, which is an array of
public signature keys of peers who currently have the moderator role
(including those who are offline). A hash (sha256) of this list called
the <strong><code class="language-plaintext highlighter-rouge">mod_list_hash</code></strong> is stored in the shared state, which is itself
signed by the founder using the group secret signature key. This allows
the moderator list to be shared between untrusted peers, even in the
absence of the founder, while maintaining moderator verifiability.</p>
<p>When the founder modifies the moderator list, he updates the
<code class="language-plaintext highlighter-rouge">mod_list_hash</code>, increments the shared state version, signs the new
shared state, broadcasts the new shared state data along with its
signature to the entire group, then broadcasts the new moderator list to
the entire group. When a peer receives this moderator list (having
already verified the new shared state), he creates a hash of the new
list and verifies that it is identical to the <code class="language-plaintext highlighter-rouge">mod_list_hash</code>.</p>
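<p>The moderator list check can be sketched with a standard SHA-256 implementation. The exact input to the hash is an assumption here: the sketch hashes the 32-byte public signature keys concatenated in list order, and the function names are hypothetical.</p>

```python
import hashlib
import hmac


def compute_mod_list_hash(moderator_keys):
    """SHA-256 over the concatenated keys (assumed input layout)."""
    return hashlib.sha256(b"".join(moderator_keys)).digest()


def verify_mod_list(moderator_keys, expected_hash):
    """Check a received moderator list against the hash from the shared state."""
    # compare_digest avoids leaking how many leading bytes matched.
    return hmac.compare_digest(compute_mod_list_hash(moderator_keys), expected_hash)
```

<p>The expected hash must come from a shared state whose signature has already been verified; the hash comparison alone proves nothing about who produced the list.</p>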
<h4 id="sanctions-list">Sanctions list</h4>
<p>Each peer holds a copy of the <strong><code class="language-plaintext highlighter-rouge">sanctions list</code></strong>. This list
consists of peers who have been demoted to the observer role.</p>
<p>Entries contain the public key of the sanctioned peer, a timestamp of
the time the entry was made, the public signature key of the peer who
set the sanction, and a signature of the entry’s data, which is signed
by the peer who created the entry using their secret signature key.
Individual entries are verified by ensuring that the entry’s public
signature key belongs to the founder or is present in the moderator
list, and then verifying that the entry’s data was signed by the owner
of that key.</p>
<p>Although each individual entry can be verified, we still need a way to
verify that the list as a whole is complete, and identical for every
peer, otherwise any peer would be able to remove entries arbitrarily, or
replace the list with an older version. Therefore each peer holds a copy
of the <strong><code class="language-plaintext highlighter-rouge">sanctions list credentials</code></strong>. This is a data structure that
holds a version number, a hash (sha256) of all combined sanctions list
entries, a 16-bit checksum of the hash, the public signature key of the
last peer to have modified the list, and a signature of the hash, which
is signed by the secret signature key associated with the
aforementioned public signature key.</p>
<p>When a moderator or founder modifies the sanctions list, he will
increment the version, create a new hash of the list, make a checksum of
the hash, sign the hash+version with his secret signature key, and
replace the old public signature key with his own. He will then
broadcast the new changes (not the entire list) to the entire group
along with the new credentials. When a peer receives this broadcast, he
will verify that the new credentials version is not older than the
current version, validate the hash and checksum, and verify that the
changes were made by a moderator or the founder. If adding an entry, he
will verify that the entry was signed by the signature key of the
entry’s creator.</p>
<p>If a peer receives sanctions credentials with a version equal to their
own but with a different checksum, they will ignore the changes if the
new checksum is a smaller value than the checksum for their current
sanctions credentials.</p>
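<p>The version and checksum rules for sanctions credentials can be summarised by the sketch below. It covers only the ordering decision; validating the hash, the checksum itself, and the signature is assumed to happen separately, and the function name is hypothetical.</p>

```python
def accept_credentials(current_version, current_checksum, new_version, new_checksum):
    """Decide whether received sanctions credentials may replace the current ones."""
    if new_version < current_version:
        return False   # older than what we already hold
    if new_version == current_version and new_checksum < current_checksum:
        return False   # same version: the larger checksum wins the tie
    return True
```

<p>Because every peer applies the same tie-break, peers that receive conflicting updates with the same version should converge on the same list.</p>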
<p>When the founder kicks or demotes a moderator, he will first go through
the sanctions list and re-sign each entry made by that moderator using
the founder key, then re-broadcast the sanctions list to the entire
group. This is necessary to guarantee that all sanctions list entries
and the list’s credentials are signed by a current moderator or the founder at
all times.</p>
<p><em>Note: The sanctions list is not saved to the Tox save file, meaning
that if the group ever becomes empty, the sanctions list will be reset.
This is in contrast to the shared state and moderator list, which are
both saved and will persist even if the group becomes empty.</em></p>
<h3 id="topics">Topics</h3>
<p>The topic is an arbitrary string of characters with a maximum length of
512 bytes. The topic has two states: locked and unlocked. When locked,
only moderators and the founder may modify it. When unlocked, all peers
except observers may modify it. The integrity of the topic is maintained
in a manner similar to that of sanctions entries, using a data structure called
<strong><code class="language-plaintext highlighter-rouge">topic_info</code></strong>. This is a struct which contains the topic, a version,
a 16-bit checksum of the topic, and the public key of the peer who last
set the topic.</p>
<p>The topic lock state is kept track of in the shared state, and may only
be modified by the founder. If the topic lock is set to zero, this
indicates that the lock is enabled. If non-zero, the value is set to the
topic version of the last topic set when the lock was enabled. This
allows peers to ensure that the topic version is not modified while the
lock is disabled.</p>
<p>When the topic lock is <strong><code class="language-plaintext highlighter-rouge">enabled</code></strong>, the topic setter will create a new
checksum of the topic and increment the topic version. They will then
sign the topic and version with their secret signature key, replace the
public key with their own, and broadcast the new topic_info data along
with the signature to the entire group. When a peer receives this
broadcast they will check if the public signature key of the topic
setter either belongs to the founder or is in the moderator list, and
ensure that the version is not older than the current topic version.
They will then verify the signature using the setter's public signature
key and validate the checksum. If the received topic has the same
version as their own but a different checksum, they will ignore the new
topic if its checksum value is smaller than the checksum value for their
current topic.</p>
<p>When the topic lock is <strong><code class="language-plaintext highlighter-rouge">disabled</code></strong>, the topic setter will create a
new checksum of the topic and leave the version unchanged. They will
then sign the topic and version with their secret signature key, replace
the public key with their own, and broadcast the new topic_info data
along with the signature to the entire group. When a peer receives this
broadcast they will ensure that the topic setter is not in the sanctions
list, and ensure that the version is equal to the value that the topic
lock is set to, unless the setter has the Founder role, in which case
they will ignore the version. They will then verify the signature using
the setter's public signature key and validate the checksum.</p>
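<p>The acceptance rules for a received topic can be sketched as follows. This is a minimal illustration, not the toxcore implementation: the <code class="language-plaintext highlighter-rouge">TopicInfo</code> class, the <code class="language-plaintext highlighter-rouge">accept_topic</code> function and all parameter names are hypothetical, and signature and checksum verification are assumed to happen elsewhere and are passed in as a boolean.</p>

```python
from dataclasses import dataclass


@dataclass
class TopicInfo:
    """Hypothetical subset of topic_info needed for the decision."""
    version: int
    checksum: int
    setter_sig_pk: bytes


def accept_topic(new, current, topic_lock, founder_pk, moderators,
                 sanctioned, signature_ok):
    """Return True if the received topic should replace the current one.

    topic_lock follows the text: 0 means the lock is enabled, any other
    value is the topic version recorded while the lock was last enabled.
    signature_ok is the result of a signature and checksum check that is
    assumed to be performed by the caller.
    """
    if not signature_ok:
        return False
    is_founder = new.setter_sig_pk == founder_pk
    if topic_lock == 0:
        # Lock enabled: only the founder and moderators may set the topic.
        if not (is_founder or new.setter_sig_pk in moderators):
            return False
        if new.version < current.version:
            return False
        # Same version, different checksum: the larger checksum wins.
        if new.version == current.version and new.checksum < current.checksum:
            return False
        return True
    # Lock disabled: sanctioned peers (observers) may not set the topic.
    if new.setter_sig_pk in sanctioned:
        return False
    # The founder is exempt from the version check.
    return is_founder or new.version == topic_lock
```

<p>Passing the moderator list and sanctions list as plain sets keeps the sketch independent of how a client stores them.</p>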
<p>If the peer who set the current topic is kicked or demoted, or if the
topic lock is enabled, the peer who initiated the action will re-sign
the topic using their own signature key and rebroadcast it to the entire
group.</p>
<h3 id="state-syncing">State syncing</h3>
<p>Peers send four unsigned 16-bit integers and three unsigned 32-bit
integers along with their ping packets: their peer count[1], a
checksum of their peer list, their shared state version, their sanctions
credentials version, their peer roles checksum[2], their topic
version, and their topic checksum. If a peer receives a ping in which
any of the versions are greater than their own, or if their peer list
checksum does not match and their peer count is not greater than the
peer count received, this indicates that they may be out of sync with
the rest of the group. In this case they will send a sync request to the
respective peer, with the appropriate sync flags set to indicate what
group information they need.</p>
<p>In certain scenarios a peer may receive a topic version or sanctions
credentials version that is equal to their own, but with a different
checksum. This may occur if two or more peers in the group initiate an
action at the exact same time. If such a conflict occurs, the peer will
make the appropriate sync request if their checksum is a smaller value
than the one they received.</p>
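<p>The comparison a peer makes on receiving a ping can be sketched as below. The flag values are the ones listed for <code class="language-plaintext highlighter-rouge">Sync_Flags</code> in this document; everything else is an assumption made for illustration: the <code class="language-plaintext highlighter-rouge">SyncInfo</code> class and <code class="language-plaintext highlighter-rouge">sync_flags</code> function are hypothetical, and mapping both the shared state version and the sanctions credentials version onto the <code class="language-plaintext highlighter-rouge">STATE</code> flag is a guess. The peer roles checksum and the sanctions credentials tie-break are left out because this passage does not say how they are compared.</p>

```python
from dataclasses import dataclass

# Sync flag bits as listed for the SYNC_REQUEST packet.
PEER_LIST = 0x01
TOPIC = 0x02
STATE = 0x04


@dataclass
class SyncInfo:
    """Hypothetical container for the values carried in a ping."""
    peer_count: int
    peerlist_checksum: int
    shared_state_version: int
    sanctions_version: int
    roles_checksum: int  # carried in the ping, not used in this sketch
    topic_version: int
    topic_checksum: int


def sync_flags(ours, theirs):
    """Return the sync flags to request after comparing a received ping."""
    flags = 0
    # Peer list: checksums differ and we do not know more peers than they do.
    if (ours.peerlist_checksum != theirs.peerlist_checksum
            and ours.peer_count <= theirs.peer_count):
        flags |= PEER_LIST
    # Topic: newer version, or same version and our checksum is smaller.
    if theirs.topic_version > ours.topic_version or (
            theirs.topic_version == ours.topic_version
            and ours.topic_checksum < theirs.topic_checksum):
        flags |= TOPIC
    # Assumption: both versions below are refreshed through the STATE flag.
    if (theirs.shared_state_version > ours.shared_state_version
            or theirs.sanctions_version > ours.sanctions_version):
        flags |= STATE
    return flags
```

<p>A return value of zero means no sync request is needed for the values checked here.</p>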
<p>Peers that are connected to the DHT also occasionally append their IP
and port number to their ping packets for peers with which they do not
have a direct UDP connection established. This gives priority to direct
connections and ensures that TCP relays are used only as a fall-back, or
when a peer explicitly forces a TCP connection.</p>
<h3 id="dht-announcements">DHT Announcements</h3>
<p>Public groupchats leverage the Tox DHT network in order to allow for
groups that can be joined by anyone who possesses the <strong><code class="language-plaintext highlighter-rouge">Chat ID</code></strong>.
Group announcements have the same underlying functionality as normal Tox
friend announcements (including onion routing).</p>
<h2 id="dht-group-chats-packet-protocols">DHT Group Chats Packet Protocols</h2>
<p>All packet fields are considered mandatory unless flagged as
<strong><code class="language-plaintext highlighter-rouge">[optional]</code></strong>. The minimum size of an encrypted packet is 83 bytes
for lossless and 75 bytes for lossy. The maximum size of an encrypted
packet is 1400 bytes.</p>
<h3 id="full-packet-structure">Full Packet Structure</h3>
<h4 id="plaintext-header">Plaintext header</h4>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left">Packet Kind</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">Sender's Public Encryption Key</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">Receiver's Public Encryption Key <strong><code class="language-plaintext highlighter-rouge">[optional]</code></strong></td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">24</code></td>
<td style="text-align: left">Nonce</td>
</tr>
</tbody>
</table>
<h4 id="encrypted-header">Encrypted header</h4>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">0-8</code></td>
<td style="text-align: left">Padding</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left">Group Packet Identifier</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">8</code></td>
<td style="text-align: left">Message ID <strong><code class="language-plaintext highlighter-rouge">[optional]</code></strong></td>
</tr>
</tbody>
</table>
<h4 id="encrypted-payload">Encrypted payload</h4>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left">Variable</td>
<td style="text-align: left">Payload</td>
</tr>
</tbody>
</table>
<p>The plaintext header contains a <strong><code class="language-plaintext highlighter-rouge">Toxcore Network Packet Kind</code></strong> which
identifies the toxcore networking level packet type. These types are:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Type</th>
<th style="text-align: left">Net Packet ID</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><strong><code class="language-plaintext highlighter-rouge">NET_PACKET_GC_HANDSHAKE</code></strong></td>
<td style="text-align: left">0x5a</td>
</tr>
<tr>
<td style="text-align: left"><strong><code class="language-plaintext highlighter-rouge">NET_PACKET_GC_LOSSLESS</code></strong></td>
<td style="text-align: left">0x5b</td>
</tr>
<tr>
<td style="text-align: left"><strong><code class="language-plaintext highlighter-rouge">NET_PACKET_GC_LOSSY</code></strong></td>
<td style="text-align: left">0x5c</td>
</tr>
</tbody>
</table>
<p>For all <strong><code class="language-plaintext highlighter-rouge">NET_PACKET_GC_LOSSLESS</code></strong> and
<strong><code class="language-plaintext highlighter-rouge">NET_PACKET_GC_LOSSY</code></strong> packets, the sender's public encryption key
is used to identify the peer who sent the packet as well as the group
instance for which the packet is intended. It is also used to establish
a secure connection with the sender during the handshake protocol.</p>
<p>The receiver's public encryption key is only sent in
<strong><code class="language-plaintext highlighter-rouge">NET_PACKET_GC_HANDSHAKE</code></strong> packets, and is used to identify the group
instance for which the packet is intended.</p>
<p>The encrypted header for lossless and lossy packets contains between 0
and 8 bytes of empty padding. The <strong><code class="language-plaintext highlighter-rouge">Group Packet Identifier</code></strong> is used
to identify the type of group packet, and the <strong><code class="language-plaintext highlighter-rouge">Message ID</code></strong> is a
unique packet identifier which is used for the lossless UDP
implementation.</p>
<p>The encrypted payload contains arbitrary data specific to the respective
group packet identifier. The length may range from zero to the maximum
packet size (minus the headers). These payloads will be the focus of the
remainder of this document.</p>
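<p>The plaintext header layout described above can be read with a few slices. The sketch below is illustrative only: the function name and the returned dictionary keys are hypothetical, and the bytes after the nonce are returned undecrypted.</p>

```python
NET_PACKET_GC_HANDSHAKE = 0x5A
NET_PACKET_GC_LOSSLESS = 0x5B
NET_PACKET_GC_LOSSY = 0x5C

KEY_SIZE = 32
NONCE_SIZE = 24


def parse_plaintext_header(packet):
    """Split a raw group packet into its plaintext header fields.

    The receiver key is only present in handshake packets. Everything
    after the nonce is returned as-is under the "encrypted" key.
    """
    if not packet:
        raise ValueError("empty packet")
    kind = packet[0]
    if kind not in (NET_PACKET_GC_HANDSHAKE, NET_PACKET_GC_LOSSLESS,
                    NET_PACKET_GC_LOSSY):
        raise ValueError("not a group chat packet")
    has_receiver = kind == NET_PACKET_GC_HANDSHAKE
    header_len = 1 + KEY_SIZE + (KEY_SIZE if has_receiver else 0) + NONCE_SIZE
    if len(packet) < header_len:
        raise ValueError("packet shorter than plaintext header")
    offset = 1
    sender_pk = packet[offset:offset + KEY_SIZE]
    offset += KEY_SIZE
    receiver_pk = None
    if has_receiver:
        receiver_pk = packet[offset:offset + KEY_SIZE]
        offset += KEY_SIZE
    nonce = packet[offset:offset + NONCE_SIZE]
    offset += NONCE_SIZE
    return {
        "kind": kind,
        "sender_pk": sender_pk,
        "receiver_pk": receiver_pk,
        "nonce": nonce,
        "encrypted": packet[offset:],
    }
```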
<h3 id="handshake-packet-payloads">Handshake packet payloads</h3>
<p>Handshake packet payloads are structured as follows:</p>
<h4 id="handshake_request-0x00-and-handshake_response-0x01">HANDSHAKE_REQUEST (0x00) and HANDSHAKE_RESPONSE (0x01)</h4>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">Public Session Key</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">Public Signature Key</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left">Request Type</td>
</tr>
<tr>
<td style="text-align: left">variable</td>
<td style="text-align: left">1 Packed TCP Relay</td>
</tr>
</tbody>
</table>
<p>This packet type is used to initiate a secure connection with a peer.</p>
<p>The <strong><code class="language-plaintext highlighter-rouge">Public Session Key</code></strong> is a temporary key unique to this peer
which, along with its secret counterpart, will be used to create a
shared session encryption key. This keypair is used for all further
communication for the current session. It must only be used for a single
peer, and must be discarded once the connection with the peer is
severed.</p>
<p>The <strong><code class="language-plaintext highlighter-rouge">Public Signature Key</code></strong> is our own permanent signature key for
this group chat.</p>
<p>The <strong><code class="language-plaintext highlighter-rouge">Request Type</code></strong> is an identifier for the type of handshake being
initiated, defined as an enumerator starting at zero as follows:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Type</th>
<th style="text-align: left">ID</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><strong><code class="language-plaintext highlighter-rouge">HANDSHAKE_INVITE_REQUEST</code></strong></td>
<td style="text-align: left">0x00</td>
</tr>
<tr>
<td style="text-align: left"><strong><code class="language-plaintext highlighter-rouge">HANDSHAKE_PEER_INFO_EXCHANGE</code></strong></td>
<td style="text-align: left">0x01</td>
</tr>
</tbody>
</table>
<p>If the request type is an invite request, the receiving peer must
respond with an <strong><code class="language-plaintext highlighter-rouge">INVITE_REQUEST</code></strong> packet. If the request type is a
peer info exchange, the receiving peer must respond with a
<strong><code class="language-plaintext highlighter-rouge">PEER_INFO_RESPONSE</code></strong> packet followed immediately by a
<strong><code class="language-plaintext highlighter-rouge">PEER_INFO_REQUEST</code></strong> packet.</p>
<p>The packed TCP relay contains a TCP relay through which the receiver
may connect to the sender.</p>
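<p>As a rough illustration of the handshake payload layout, the sketch below packs and unpacks the four fields. The helper names are hypothetical, and the packed TCP relay is treated as opaque bytes because its encoding is defined elsewhere in this document.</p>

```python
from enum import IntEnum

KEY_SIZE = 32


class HandshakeRequestType(IntEnum):
    INVITE_REQUEST = 0x00
    PEER_INFO_EXCHANGE = 0x01


def pack_handshake(session_pk, signature_pk, request_type, packed_relay):
    """Build a handshake payload; packed_relay is an already packed relay."""
    if len(session_pk) != KEY_SIZE or len(signature_pk) != KEY_SIZE:
        raise ValueError("keys must be 32 bytes")
    request_type = HandshakeRequestType(request_type)
    return session_pk + signature_pk + bytes([request_type]) + packed_relay


def unpack_handshake(payload):
    """Split a handshake payload into its fields."""
    fixed = 2 * KEY_SIZE + 1
    if len(payload) < fixed:
        raise ValueError("handshake payload too short")
    session_pk = payload[:KEY_SIZE]
    signature_pk = payload[KEY_SIZE:2 * KEY_SIZE]
    request_type = HandshakeRequestType(payload[2 * KEY_SIZE])
    return session_pk, signature_pk, request_type, payload[fixed:]
```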
<h3 id="lossy-packet-payloads">Lossy Packet Payloads</h3>
<h4 id="ping-0x01">PING (0x01)</h4>
<p>A ping packet payload is structured as follows:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">2</code></td>
<td style="text-align: left">Peerlist Checksum</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">2</code></td>
<td style="text-align: left">Confirmed Peer Count</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">4</code></td>
<td style="text-align: left">Shared State Version</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">4</code></td>
<td style="text-align: left">Sanctions Credentials Version</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">2</code></td>
<td style="text-align: left">Peer Roles Checksum</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">4</code></td>
<td style="text-align: left">Topic Version</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">2</code></td>
<td style="text-align: left">Topic Checksum</td>
</tr>
<tr>
<td style="text-align: left">Variable</td>
<td style="text-align: left">Packed IP Address and Port</td>
</tr>
</tbody>
</table>
<p>Ping packets are periodically sent to every confirmed peer in order to
maintain peer connections, and to ensure that the group state is in sync
between peers. A peer is considered to be disconnected from the group
after a ping packet has not been received over a period of time.</p>
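<p>The fixed part of the ping payload is 20 bytes and maps directly onto a <code class="language-plaintext highlighter-rouge">struct</code> format string. The sketch below assumes big-endian (network) byte order, which this section does not state, and treats the trailing packed IP address and port as opaque bytes. The <code class="language-plaintext highlighter-rouge">Ping</code> tuple and helper names are hypothetical.</p>

```python
import struct
from collections import namedtuple

# Assumption: integers are encoded big-endian. Field order follows the table.
PING_FORMAT = ">HHIIHIH"
PING_FIXED_SIZE = struct.calcsize(PING_FORMAT)

Ping = namedtuple("Ping", [
    "peerlist_checksum",
    "peer_count",
    "shared_state_version",
    "sanctions_version",
    "roles_checksum",
    "topic_version",
    "topic_checksum",
    "packed_ip_port",  # opaque trailing bytes, may be empty
])


def pack_ping(ping):
    """Serialise the seven integers followed by the packed IP and port."""
    return struct.pack(PING_FORMAT, *ping[:7]) + ping.packed_ip_port


def unpack_ping(payload):
    """Parse a ping payload, keeping any trailing bytes undecoded."""
    if len(payload) < PING_FIXED_SIZE:
        raise ValueError("ping payload too short")
    fields = struct.unpack_from(PING_FORMAT, payload)
    return Ping(*fields, payload[PING_FIXED_SIZE:])
```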
<h4 id="message_ack-0x02">MESSAGE_ACK (0x02)</h4>
<p>Message ack packets are structured as follows:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">8</code></td>
<td style="text-align: left">Message ID</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left">Type</td>
</tr>
</tbody>
</table>
<p>This packet ensures that all lossless packets are successfully received
and processed in sequential order as they were sent.</p>
<p>Message ack types are defined by an enumerator beginning at zero as
follows:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Type</th>
<th style="text-align: left">ID</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><strong><code class="language-plaintext highlighter-rouge">GR_ACK_RECV</code></strong></td>
<td style="text-align: left">0x00</td>
</tr>
<tr>
<td style="text-align: left"><strong><code class="language-plaintext highlighter-rouge">GR_ACK_REQ</code></strong></td>
<td style="text-align: left">0x01</td>
</tr>
</tbody>
</table>
<p>If the type is <strong><code class="language-plaintext highlighter-rouge">GR_ACK_RECV</code></strong>, this indicates that the packet with
the given ID has been received and successfully processed. If the type
is <strong><code class="language-plaintext highlighter-rouge">GR_ACK_REQ</code></strong>, this indicates that the message with the given ID
should be sent again.</p>
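<p>A message ack payload is nine bytes long. The following sketch shows one way to encode and decode it, assuming a big-endian message ID (the byte order is not stated in this section). The helper names are hypothetical.</p>

```python
import struct
from enum import IntEnum

ACK_FORMAT = ">QB"  # assumption: big-endian 8-byte message ID, 1-byte type


class AckType(IntEnum):
    GR_ACK_RECV = 0x00  # packet was received and processed
    GR_ACK_REQ = 0x01   # packet should be sent again


def pack_ack(message_id, ack_type):
    """Encode a MESSAGE_ACK payload."""
    return struct.pack(ACK_FORMAT, message_id, AckType(ack_type))


def unpack_ack(payload):
    """Decode a MESSAGE_ACK payload into (message_id, AckType)."""
    if len(payload) != struct.calcsize(ACK_FORMAT):
        raise ValueError("ack payload must be 9 bytes")
    message_id, ack_type = struct.unpack(ACK_FORMAT, payload)
    return message_id, AckType(ack_type)
```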
<h4 id="invite_response_reject-0x03">INVITE_RESPONSE_REJECT (0x03)</h4>
<p>An invite response reject payload is structured as follows:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left">Type</td>
</tr>
</tbody>
</table>
<p>This packet alerts a peer that their invite request has been rejected.
The reason for the rejection is specified by the <strong><code class="language-plaintext highlighter-rouge">type</code></strong> field.</p>
<p>Rejection types are defined by an enumerator beginning at zero as
follows:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Type</th>
<th style="text-align: left">ID</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><strong><code class="language-plaintext highlighter-rouge">GROUP_FULL</code></strong></td>
<td style="text-align: left">0x00</td>
</tr>
<tr>
<td style="text-align: left"><strong><code class="language-plaintext highlighter-rouge">INVALID_PASSWORD</code></strong></td>
<td style="text-align: left">0x01</td>
</tr>
<tr>
<td style="text-align: left"><strong><code class="language-plaintext highlighter-rouge">INVITE_FAILED</code></strong></td>
<td style="text-align: left">0x02</td>
</tr>
</tbody>
</table>
<h3 id="lossless-packet-payloads">Lossless Packet Payloads</h3>
<h4 id="fragment-0xef">FRAGMENT (0xef)</h4>
<p>Fragment packets are structured as follows:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left">Lossless Packet Type <strong><code class="language-plaintext highlighter-rouge">[First chunk only]</code></strong></td>
</tr>
<tr>
<td style="text-align: left">Variable</td>
<td style="text-align: left">Arbitrary data</td>
</tr>
</tbody>
</table>
<p>A fragment packet represents one segment in a sequence of packet
fragments that together comprise one full lossless packet payload which
exceeds the maximum allowed packet chunk size (500 bytes). The first
byte in the first chunk must be a lossless packet type. Each chunk in
the sequence must be sent in succession.</p>
<p>The end of the sequence is signaled by a fragment packet with a length
of zero.</p>
<p>A fully assembled packet must be no greater than 50,000 bytes.</p>
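<p>Fragmentation and reassembly as described above can be sketched in a few lines. The function names are hypothetical, and counting the leading packet type byte towards the 50,000 byte limit is an assumption of this sketch.</p>

```python
MAX_CHUNK_SIZE = 500
MAX_ASSEMBLED_SIZE = 50000


def fragment(packet_type, payload):
    """Split a lossless payload into fragment bodies.

    The packet type byte leads the first chunk, and an empty chunk
    terminates the sequence.
    """
    data = bytes([packet_type]) + payload
    if len(data) > MAX_ASSEMBLED_SIZE:
        raise ValueError("assembled packet too large")
    chunks = [data[i:i + MAX_CHUNK_SIZE]
              for i in range(0, len(data), MAX_CHUNK_SIZE)]
    chunks.append(b"")
    return chunks


def reassemble(chunks):
    """Rebuild (packet_type, payload) from an ordered fragment sequence."""
    buf = bytearray()
    for chunk in chunks:
        if not chunk:
            if not buf:
                raise ValueError("terminator before any data")
            return buf[0], bytes(buf[1:])
        buf += chunk
        if len(buf) > MAX_ASSEMBLED_SIZE:
            raise ValueError("assembled packet too large")
    raise ValueError("missing zero-length terminator")
```

<p>The sketch relies on the lossless transport delivering chunks in order, so no per-fragment index is needed.</p>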
<h4 id="key_rotations-0xf0">KEY_ROTATIONS (0xf0)</h4>
<p>Key rotation packets are structured as follows:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><strong><code class="language-plaintext highlighter-rouge">is_response</code></strong></td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">Public Encryption Key</td>
</tr>
</tbody>
</table>
<p>Key rotation packets are used to rotate session encryption keys with a
peer. If <strong><code class="language-plaintext highlighter-rouge">is_response</code></strong> is false, the packet initiates a public key
exchange. Otherwise the packet is a response to a previously initiated
exchange.</p>
<p>The public encryption key must be a newly generated key which takes the
place of the previously used session key. The resulting shared session
key is generated using the same protocol as the initial handshake, and
must be kept secret.</p>
<p>Request packets should only be sent by the peer whose permanent public
encryption key for the given group is closer to the group <strong><code class="language-plaintext highlighter-rouge">Chat ID</code></strong>
according to the <a href="#distance"><code class="language-plaintext highlighter-rouge">Distance</code></a> metric.</p>
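<p>The rule for which side sends the request can be sketched as below, assuming the <code class="language-plaintext highlighter-rouge">Distance</code> metric referenced above is the XOR of the two keys interpreted as big-endian unsigned integers. The function names are hypothetical.</p>

```python
def distance(a, b):
    """XOR distance between two 32-byte keys read as big-endian integers."""
    return int.from_bytes(a, "big") ^ int.from_bytes(b, "big")


def should_initiate_rotation(our_pk, their_pk, chat_id):
    """Return True if our permanent key is closer to the Chat ID."""
    return distance(our_pk, chat_id) < distance(their_pk, chat_id)
```

<p>Because the comparison is deterministic, both peers reach the same answer without any negotiation, which avoids both sides initiating a rotation at once.</p>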
<h4 id="tcp_relays-0xf1">TCP_RELAYS (0xf1)</h4>
<p>A TCP relay packet payload is structured as follows:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left">Variable</td>
<td style="text-align: left">Packed TCP Relays</td>
</tr>
</tbody>
</table>
<p>The purpose of this packet is to share a list of TCP relays with a
confirmed peer. It is used to maintain a list of mutual TCP relays with
other peers, which are used to maintain TCP connections when direct
connections cannot be established.</p>
<p>This packet is sent to every confirmed peer whenever a new TCP relay is
added to our list, or periodically when we presently have no shared TCP
relays with a given peer.</p>
<h4 id="custom_packets-0xf2">CUSTOM_PACKETS (0xf2)</h4>
<p>A custom packet payload is structured as follows:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left">Variable</td>
<td style="text-align: left">Arbitrary Data</td>
</tr>
</tbody>
</table>
<p>This packet is used to send arbitrary data to another peer. It may be
used for client-side features.</p>
<h4 id="broadcast-0xf3">BROADCAST (0xf3)</h4>
<p>A broadcast packet payload is structured as follows:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left">Type</td>
</tr>
<tr>
<td style="text-align: left">Variable</td>
<td style="text-align: left">Payload</td>
</tr>
</tbody>
</table>
<p>This packet broadcasts a message to all confirmed peers in a group (with
the exception of <strong><code class="language-plaintext highlighter-rouge">PRIVATE_MESSAGE</code></strong>). The type of broadcast is
specified by the <strong><code class="language-plaintext highlighter-rouge">type</code></strong> field.</p>
<p>Broadcast types are defined and structured as follows:</p>
<h6 id="status-0x00">STATUS (0x00)</h6>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left">User status</td>
</tr>
</tbody>
</table>
<p>Indicates that a peer has changed their status. Statuses must be of type
<strong><code class="language-plaintext highlighter-rouge">USERSTATUS</code></strong>.</p>
<h6 id="nick-0x01">NICK (0x01)</h6>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left">Variable</td>
<td style="text-align: left">Name</td>
</tr>
</tbody>
</table>
<p>Indicates that a peer has changed their nickname. A nick must be greater
than 0 bytes, and may not exceed <strong><code class="language-plaintext highlighter-rouge">TOX_MAX_NAME_LENGTH</code></strong> bytes in
length.</p>
<h6 id="plain_message-0x02">PLAIN_MESSAGE (0x02)</h6>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left">Variable</td>
<td style="text-align: left">Arbitrary data</td>
</tr>
</tbody>
</table>
<p>Contains an arbitrary message. A plain message must be greater than 0
bytes, and may not exceed <strong><code class="language-plaintext highlighter-rouge">TOX_MAX_MESSAGE_LENGTH</code></strong> bytes.</p>
<h6 id="action_message-0x03">ACTION_MESSAGE (0x03)</h6>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left">Variable</td>
<td style="text-align: left">Arbitrary data</td>
</tr>
</tbody>
</table>
<p>Contains an arbitrary message. An action message must be greater than 0
bytes, and may not exceed <strong><code class="language-plaintext highlighter-rouge">TOX_MAX_MESSAGE_LENGTH</code></strong> bytes.</p>
<h6 id="private_message-0x04">PRIVATE_MESSAGE (0x04)</h6>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left">Variable</td>
<td style="text-align: left">Arbitrary data</td>
</tr>
</tbody>
</table>
<p>Contains an arbitrary message which is only sent to the intended peer. A
private message must be greater than 0 bytes, and may not exceed
<strong><code class="language-plaintext highlighter-rouge">TOX_MAX_MESSAGE_LENGTH</code></strong> bytes.</p>
<h6 id="peer_exit-0x05">PEER_EXIT (0x05)</h6>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left">Variable</td>
<td style="text-align: left">Arbitrary data <strong><code class="language-plaintext highlighter-rouge">[optional]</code></strong></td>
</tr>
</tbody>
</table>
<p>Indicates that a peer is leaving the group. Contains an optional parting
message which may not exceed <strong><code class="language-plaintext highlighter-rouge">TOX_GROUP_MAX_PART_LENGTH</code></strong>.</p>
<h6 id="peer_kick-0x06">PEER_KICK (0x06)</h6>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">Public Encryption Key</td>
</tr>
</tbody>
</table>
<p>Indicates that the peer associated with the public encryption key has
been kicked from the group by a moderator or the founder. This peer must
be removed from the peer list.</p>
<h6 id="set_mod-0x07">SET_MOD (0x07)</h6>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left">Flag</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">Public Signature Key</td>
</tr>
</tbody>
</table>
<p>Indicates that the peer associated with the public signature key has
either been promoted to or demoted from the <strong><code class="language-plaintext highlighter-rouge">Moderator</code></strong> role by the
group founder. If <strong><code class="language-plaintext highlighter-rouge">flag</code></strong> is non-zero, the peer should be promoted
and added to the moderator list. Otherwise they should be demoted to the
<strong><code class="language-plaintext highlighter-rouge">User</code></strong> role and removed from the moderator list.</p>
<h6 id="set_observer-0x08">SET_OBSERVER (0x08)</h6>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left">Flag</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">Public Encryption Key</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">Public Signature Key</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">137</code></td>
<td style="text-align: left">Sanctions List Entry <strong><code class="language-plaintext highlighter-rouge">[optional]</code></strong></td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">132</code></td>
<td style="text-align: left">Packed Sanctions List Credentials</td>
</tr>
</tbody>
</table>
<p>Indicates that the peer associated with the given public keys has either
been demoted to or promoted from the <strong><code class="language-plaintext highlighter-rouge">Observer</code></strong> role by the group
founder or a moderator. If <strong><code class="language-plaintext highlighter-rouge">flag</code></strong> is non-zero, the peer should be
demoted and added to the sanctions list. Otherwise they should be
promoted to the <strong><code class="language-plaintext highlighter-rouge">User</code></strong> role and removed from the sanctions list.</p>
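<p>The broadcast types listed above can be collected into an enumeration, with the leading type byte used to dispatch on the rest of the payload. The sketch below is illustrative only: the names <code class="language-plaintext highlighter-rouge">BroadcastType</code> and <code class="language-plaintext highlighter-rouge">parse_broadcast</code> are hypothetical, and only <code class="language-plaintext highlighter-rouge">PEER_KICK</code> and <code class="language-plaintext highlighter-rouge">SET_MOD</code> are decoded into fields.</p>

```python
from enum import IntEnum

KEY_SIZE = 32


class BroadcastType(IntEnum):
    STATUS = 0x00
    NICK = 0x01
    PLAIN_MESSAGE = 0x02
    ACTION_MESSAGE = 0x03
    PRIVATE_MESSAGE = 0x04
    PEER_EXIT = 0x05
    PEER_KICK = 0x06
    SET_MOD = 0x07
    SET_OBSERVER = 0x08


def parse_broadcast(payload):
    """Return (BroadcastType, fields) for a BROADCAST payload."""
    if not payload:
        raise ValueError("empty broadcast payload")
    btype = BroadcastType(payload[0])  # raises ValueError if unknown
    body = payload[1:]
    if btype is BroadcastType.PEER_KICK:
        if len(body) != KEY_SIZE:
            raise ValueError("PEER_KICK needs a 32-byte key")
        return btype, {"encryption_pk": body}
    if btype is BroadcastType.SET_MOD:
        if len(body) != 1 + KEY_SIZE:
            raise ValueError("SET_MOD needs a flag and a 32-byte key")
        # Non-zero flag promotes, zero demotes to the User role.
        return btype, {"promote": body[0] != 0, "signature_pk": body[1:]}
    # Other types are returned undecoded in this sketch.
    return btype, {"data": body}
```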
<h4 id="peer_info_request-0xf4">PEER_INFO_REQUEST (0xf4)</h4>
<p>A peer info request packet contains an empty payload. Its purpose is to
request a peer to send us information about themselves.</p>
<h4 id="peer_info_response-0xf5">PEER_INFO_RESPONSE (0xf5)</h4>
<p>A peer info response packet payload is structured as follows:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">2</code></td>
<td style="text-align: left">Group Password Length <strong><code class="language-plaintext highlighter-rouge">[optional]</code></strong></td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">Group Password <strong><code class="language-plaintext highlighter-rouge">[optional]</code></strong></td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">2</code></td>
<td style="text-align: left">Name Length</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">128</code></td>
<td style="text-align: left">Name</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left">Status</td>
</tr>
</tbody>
</table>
<p>This packet supplies information about ourselves to a peer. It is sent
as a response to a <strong><code class="language-plaintext highlighter-rouge">PEER_INFO_REQUEST</code></strong> packet or to a handshake with
the <strong><code class="language-plaintext highlighter-rouge">HANDSHAKE_PEER_INFO_EXCHANGE</code></strong> request type as part of the handshake
protocol. A password and length of password must be included in the
packet if the group is password protected.</p>
<h4 id="invite_request-0xf6">INVITE_REQUEST (0xf6)</h4>
<p>An invite request packet payload is structured as follows:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">2</code></td>
<td style="text-align: left">Group Password Length <strong><code class="language-plaintext highlighter-rouge">[optional]</code></strong></td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">Group Password <strong><code class="language-plaintext highlighter-rouge">[optional]</code></strong></td>
</tr>
</tbody>
</table>
<p>This packet requests an invite to the group. A password and length of
password must be included in the packet if the group is password
protected.</p>
<h4 id="invite_response-0xf7">INVITE_RESPONSE (0xf7)</h4>
<p>An invite response packet has an empty payload.</p>
<p>This packet alerts a peer who sent us an <strong><code class="language-plaintext highlighter-rouge">INVITE_REQUEST</code></strong> packet
that their request has been validated, which informs them that they may
continue to the next step in the handshake protocol.</p>
<p>Before sending this packet we first attempt to validate the invite
request. If validation fails, we instead send a packet of type
<strong><code class="language-plaintext highlighter-rouge">INVITE_RESPONSE_REJECT</code></strong> in response, and remove the peer from our
peer list.</p>
<h4 id="sync_request-0xf8">SYNC_REQUEST (0xf8)</h4>
<p>A sync request packet payload is structured as follows:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">2</code></td>
<td style="text-align: left"><strong><code class="language-plaintext highlighter-rouge">Sync_Flags</code></strong></td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">2</code></td>
<td style="text-align: left">Group Password Length <strong><code class="language-plaintext highlighter-rouge">[optional]</code></strong></td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">Group Password <strong><code class="language-plaintext highlighter-rouge">[optional]</code></strong></td>
</tr>
</tbody>
</table>
<p>This packet asks a peer to send us state information about the group
chat. The specific information being requested is specified via the
<strong><code class="language-plaintext highlighter-rouge">Sync_Flags</code></strong> field. A password and length of password must be
included in the packet if the group is password protected.</p>
<p><strong><code class="language-plaintext highlighter-rouge">Sync_Flags</code></strong> is a bitfield defined as a 16-bit unsigned integer
which may have the bits set for the respective values depending on what
information is being requested:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Type</th>
<th style="text-align: left">Set Bits</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><strong><code class="language-plaintext highlighter-rouge">PEER_LIST</code></strong></td>
<td style="text-align: left">0x01</td>
</tr>
<tr>
<td style="text-align: left"><strong><code class="language-plaintext highlighter-rouge">TOPIC</code></strong></td>
<td style="text-align: left">0x02</td>
</tr>
<tr>
<td style="text-align: left"><strong><code class="language-plaintext highlighter-rouge">STATE</code></strong></td>
<td style="text-align: left">0x04</td>
</tr>
</tbody>
</table>
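<p>As an illustration, the flag values above can be combined and packed into a request payload as follows. This is a minimal Python sketch rather than toxcore's API: the names <code class="language-plaintext highlighter-rouge">SyncFlags</code> and <code class="language-plaintext highlighter-rouge">pack_sync_request</code> are hypothetical, and both the big endian byte order and the zero padding of the password to 32 bytes are assumptions.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import enum
import struct

MAX_GC_PASSWORD_SIZE = 32  # fixed size of the Group Password field


class SyncFlags(enum.IntFlag):
    PEER_LIST = 0x01
    TOPIC = 0x02
    STATE = 0x04


def pack_sync_request(flags, password=b""):
    """Build a SYNC_REQUEST payload (hypothetical helper)."""
    if len(password) not in range(MAX_GC_PASSWORD_SIZE + 1):
        raise ValueError("password too long")
    # 2 byte Sync_Flags bitfield, assumed to be big endian.
    payload = struct.pack("!H", int(flags))
    if password:
        # Optional fields, present only for password protected groups.
        payload += struct.pack("!H", len(password))
        # Assumption: the password is zero padded to the fixed field size.
        payload += password.ljust(MAX_GC_PASSWORD_SIZE, b"\x00")
    return payload


# Ask for the peer list and the shared state of a password protected group.
request = pack_sync_request(SyncFlags.PEER_LIST | SyncFlags.STATE, b"hunter2")
</code></pre></div></div>
<p>A receiver can test individual bits the same way, for example <code class="language-plaintext highlighter-rouge">SyncFlags.TOPIC in flags</code>.</p>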
<h4 id="sync_response-0xf9">SYNC_RESPONSE (0xf9)</h4>
<p>A sync response packet payload is structured as follows:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">Public Encryption Key</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><strong><code class="language-plaintext highlighter-rouge">IP_Port_Is_Set</code></strong></td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left">TCP Relays Count</td>
</tr>
<tr>
<td style="text-align: left">Variable</td>
<td style="text-align: left">Packed IP_Port <strong><code class="language-plaintext highlighter-rouge">[optional]</code></strong></td>
</tr>
<tr>
<td style="text-align: left">Variable</td>
<td style="text-align: left">Packed TCP Relays <strong><code class="language-plaintext highlighter-rouge">[optional]</code></strong></td>
</tr>
</tbody>
</table>
<p>This packet is sent as a response to a peer who made a sync request via
the <strong><code class="language-plaintext highlighter-rouge">SYNC_REQUEST</code></strong> packet. It contains a single packed peer
announce, which is a data structure that contains all of the information
about a peer needed to initiate the handshake protocol via TCP relays, a
direct connection, or both.</p>
<p>If the <strong><code class="language-plaintext highlighter-rouge">IP_Port_Is_Set</code></strong> flag is non-zero, the packet will contain a
packed <strong><code class="language-plaintext highlighter-rouge">IP_Port</code></strong> of the peer associated with the given public key.
If <strong><code class="language-plaintext highlighter-rouge">TCP Relays Count</code></strong> is greater than 0, the packet will contain a
list of TCP relays that the peer associated with the given public key is
connected to.</p>
<p>When responding to a sync request, one separate sync response will be
sent for each peer in the peer list. All other requested group
information is sent via its respective packet.</p>
<h4 id="topic-0xfa">TOPIC (0xfa)</h4>
<p>A topic packet payload is structured as follows:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">64</code></td>
<td style="text-align: left">Topic Signature</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">4</code></td>
<td style="text-align: left">Topic Version</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">2</code></td>
<td style="text-align: left">Topic Checksum</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">2</code></td>
<td style="text-align: left">Topic Length</td>
</tr>
<tr>
<td style="text-align: left">Topic Length</td>
<td style="text-align: left">Topic</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">Public Signature Key</td>
</tr>
</tbody>
</table>
<p>This packet contains a topic as well as information used to validate the
topic. Sent when the topic changes, or in response to a
<strong><code class="language-plaintext highlighter-rouge">SYNC_REQUEST</code></strong> in which the <strong><code class="language-plaintext highlighter-rouge">TOPIC</code></strong> flag is set. A topic may not
exceed <strong><code class="language-plaintext highlighter-rouge">TOX_GROUP_MAX_TOPIC_LENGTH</code></strong> bytes in length.</p>
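<p>The layout above could be split into its fields roughly as follows. This is a sketch under stated assumptions: integers are taken to be big endian, <code class="language-plaintext highlighter-rouge">parse_topic</code> is a hypothetical name, and no signature or checksum validation is performed.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import struct

SIG_PUBLIC_KEY_SIZE = 32
# Topic Signature (64), Topic Version (4), Topic Checksum (2), Topic Length (2).
HEADER = struct.Struct("!64sIHH")


def parse_topic(payload):
    """Split a TOPIC payload into its fields (hypothetical helper)."""
    signature, version, checksum, length = HEADER.unpack_from(payload)
    expected = HEADER.size + length + SIG_PUBLIC_KEY_SIZE
    if len(payload) != expected:
        raise ValueError("unexpected TOPIC payload size")
    topic_end = HEADER.size + length
    return {
        "signature": signature,
        "version": version,
        "checksum": checksum,
        "topic": payload[HEADER.size:topic_end],
        "public_sig_key": payload[topic_end:],
    }


# Dummy packet: all-zero signature and key, version 3, topic "hello".
example = HEADER.pack(bytes(64), 3, 0xBEEF, 5) + b"hello" + bytes(32)
parsed = parse_topic(example)
</code></pre></div></div>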
<h4 id="shared_state-0xfb">SHARED_STATE (0xfb)</h4>
<p>A shared state packet payload is structured as follows:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">64</code></td>
<td style="text-align: left">Shared State Signature</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">4</code></td>
<td style="text-align: left">Shared State Version</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">64</code></td>
<td style="text-align: left">Founder Extended Public Key</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">2</code></td>
<td style="text-align: left">Peer Limit</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">2</code></td>
<td style="text-align: left">Group Name Length</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">48</code></td>
<td style="text-align: left">Group Name</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left">Privacy State</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">2</code></td>
<td style="text-align: left">Group Password Length</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">Group Password</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">Moderator List Hash (Sha256)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">4</code></td>
<td style="text-align: left">Topic Lock State</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left">Voice State</td>
</tr>
</tbody>
</table>
<p>This packet contains information about the group shared state. Sent to
all peers by the group founder whenever the shared state has changed.
Also sent in response to a <strong><code class="language-plaintext highlighter-rouge">SYNC_REQUEST</code></strong> in which the <strong><code class="language-plaintext highlighter-rouge">STATE</code></strong>
flag is set.</p>
<h4 id="mod_list-0xfc">MOD_LIST (0xfc)</h4>
<p>A moderation list packet payload is structured as follows:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">2</code></td>
<td style="text-align: left">Moderator Count</td>
</tr>
<tr>
<td style="text-align: left">Variable</td>
<td style="text-align: left">Moderator List</td>
</tr>
</tbody>
</table>
<p>This packet contains information about the moderator list, including the
number of moderators, and a list of public signature keys of all current
moderators. Sent to all peers by the group founder after the moderator
list has been modified. Also sent in response to a <strong><code class="language-plaintext highlighter-rouge">SYNC_REQUEST</code></strong> in
which the <strong><code class="language-plaintext highlighter-rouge">STATE</code></strong> flag is set.</p>
<p>The moderator list consists of one or more 32 byte public signature
keys.</p>
<p>This packet must always be sent after a <strong><code class="language-plaintext highlighter-rouge">SHARED_STATE</code></strong> packet, as
the moderator list is validated using data contained within the shared
state.</p>
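<p>A receiver might unpack the moderator list along these lines. The sketch assumes a big endian count and only checks that the payload size matches the count; validating the list against the shared state is not shown, and <code class="language-plaintext highlighter-rouge">parse_mod_list</code> is a hypothetical name.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import struct

SIG_PUBLIC_KEY_SIZE = 32


def parse_mod_list(payload):
    """Return the moderator public signature keys in a MOD_LIST payload."""
    (count,) = struct.unpack_from("!H", payload)  # 2 byte Moderator Count
    body = payload[2:]
    if len(body) != count * SIG_PUBLIC_KEY_SIZE:
        raise ValueError("moderator count does not match payload size")
    # Cut the body into consecutive 32 byte keys.
    return [
        body[i:i + SIG_PUBLIC_KEY_SIZE]
        for i in range(0, len(body), SIG_PUBLIC_KEY_SIZE)
    ]


mods = parse_mod_list(struct.pack("!H", 2) + b"\xaa" * 32 + b"\xbb" * 32)
</code></pre></div></div>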
<h4 id="sanctions_list-0xfd">SANCTIONS_LIST (0xfd)</h4>
<p>A sanctions list packet payload is structured as follows:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">2</code></td>
<td style="text-align: left">Sanctions List Count</td>
</tr>
<tr>
<td style="text-align: left">Variable</td>
<td style="text-align: left">Sanctions List</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">134</code></td>
<td style="text-align: left">Packed Sanctions List Credentials</td>
</tr>
</tbody>
</table>
<h6 id="sanctions-list-entry">Sanctions List Entry</h6>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left">Type</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">Public Signature Key</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">8</code></td>
<td style="text-align: left">Unix Timestamp</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">Public Encryption Key</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">64</code></td>
<td style="text-align: left">Signature</td>
</tr>
</tbody>
</table>
<h6 id="sanctions-credentials">Sanctions Credentials</h6>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">4</code></td>
<td style="text-align: left">Version</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">Hash (Sha256)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">2</code></td>
<td style="text-align: left">Checksum</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">Public Signature Key</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">64</code></td>
<td style="text-align: left">Signature</td>
</tr>
</tbody>
</table>
<p>This packet contains information about the sanctions list, including the
number of entries, the sanctions list, and the credentials needed to
validate the sanctions list.</p>
<p>Sanctions types are defined as an enumeration beginning at zero as
follows:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Type</th>
<th style="text-align: left">ID</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><strong><code class="language-plaintext highlighter-rouge">OBSERVER</code></strong></td>
<td style="text-align: left">0x00</td>
</tr>
</tbody>
</table>
<p>During a sync response, this packet must be sent after a <strong><code class="language-plaintext highlighter-rouge">MOD_LIST</code></strong>
packet, as the sanctions list is validated using the moderator list.</p>
<h4 id="friend_invite-0xfe">FRIEND_INVITE (0xfe)</h4>
<p>A friend invite packet payload is structured as follows:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left">Type</td>
</tr>
</tbody>
</table>
<p>Used to initiate or respond to a group invite to or from an existing
friend. The invite action is specified by the <strong><code class="language-plaintext highlighter-rouge">type</code></strong> field.</p>
<p>Invite types are defined as an enumeration beginning at zero as follows:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Type</th>
<th style="text-align: left">ID</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><strong><code class="language-plaintext highlighter-rouge">GROUP_INVITE</code></strong></td>
<td style="text-align: left">0x00</td>
</tr>
<tr>
<td style="text-align: left"><strong><code class="language-plaintext highlighter-rouge">GROUP_INVITE_ACCEPTED</code></strong></td>
<td style="text-align: left">0x01</td>
</tr>
<tr>
<td style="text-align: left"><strong><code class="language-plaintext highlighter-rouge">GROUP_INVITE_CONFIRM</code></strong></td>
<td style="text-align: left">0x02</td>
</tr>
</tbody>
</table>
<h4 id="hs_response_ack-0xff">HS_RESPONSE_ACK (0xff)</h4>
<p>A handshake response ack packet has an empty payload. This packet is
used to send acknowledgement that a lower level toxcore
<strong><code class="language-plaintext highlighter-rouge">NET_PACKET_GC_HANDSHAKE</code></strong> packet has been received, which is the
first step in the group handshake protocol. This packet will initiate an
invite request via the <strong><code class="language-plaintext highlighter-rouge">INVITE_REQUEST</code></strong> packet.</p>
<h2 id="net-crypto">Net crypto</h2>
<p>The Tox transport protocol is what Tox uses to establish connections
and send data securely to friends. It provides encryption, ordered
delivery, and perfect forward secrecy. It is a UDP protocol, but it is
also used when two friends connect over TCP relays.</p>
<p>The protocol is the same for connections over TCP relays and over
direct UDP, both for simplicity and so that a connection can switch
between the two without the peers needing to disconnect and reconnect.
For example, two Tox friends might first connect over TCP and a few
seconds later switch to UDP when a direct UDP connection becomes
possible. Opening the UDP route (hole punching) is done by the DHT
module, and opening a relayed TCP connection is done by the
<code class="language-plaintext highlighter-rouge">TCP_connection</code> module. The Tox transport protocol has the job of
connecting two peers (Tox friends) safely once a route or communications
link between them is found. Direct UDP is preferred over TCP because it
is direct and isn't limited by possibly congested TCP relays. Also, a
peer can only connect to another using the Tox transport protocol if
they know the real public key and DHT public key of the peer they want
to connect to. However, both the DHT and TCP connection modules require
this information in order to find and open the route to the peer, so we
assume this information is known by toxcore and has been passed to
<code class="language-plaintext highlighter-rouge">net_crypto</code> when the <code class="language-plaintext highlighter-rouge">net_crypto</code> connection was created.</p>
<p>Because this protocol has to work over UDP it must account for possible
packet loss and packets arriving in the wrong order, and it has to
implement some kind of congestion control. This is implemented above the
level at which the packets are encrypted, which prevents a malicious TCP
relay from disrupting the connection by modifying the packets that go
through it. Handling packet loss also makes the protocol work well over
TCP relays, which we assume may go down at any time: the connection
stays up even if it has to switch to another TCP relay, which will cause
some packet loss.</p>
<p>Before sending the actual handshake packet the peer must obtain a
cookie. This cookie step serves as a way for the receiving peer to
confirm that the peer initiating the connection can receive the
responses in order to prevent certain types of DoS attacks.</p>
<p>The peer receiving a cookie request packet must not allocate any
resources to the connection. They will simply respond to the packet with
a cookie response packet containing the cookie that the requesting peer
must then use in the handshake to initiate the actual connection.</p>
<p>The cookie response must be sent back using the exact same link the
cookie request packet was sent from. The reason for this is that if it
is sent back using another link, the other link might not work and the
peer will not be expecting responses from another link. For example, if
a request is sent from UDP with ip port X, it must be sent back by UDP
to ip port X. If it was received from a TCP OOB packet it must be sent
back by a TCP OOB packet via the same relay with the destination being
the peer who sent the request. If it was received from an established
TCP relay connection it must be sent back via that same exact
connection.</p>
<p>When a cookie request is received, the peer must not use the
information in the request packet for anything else and must not store
it. It must only create a cookie and a cookie response from it, send the
cookie response packet, and then forget the request. The reason for this
is to prevent possible attacks. For example, if a peer allocated long
term memory for each cookie request packet received, then a simple
packet flood would be enough to achieve an effective denial of service
attack by making the program run out of memory.</p>
<p>cookie request packet (145 bytes):</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[uint8_t 24]
[Sender's DHT Public key (32 bytes)]
[Random nonce (24 bytes)]
[Encrypted message containing:
[Sender's real public key (32 bytes)]
[padding (32 bytes)]
[uint64_t echo id (must be sent back untouched in cookie response)]
]
</code></pre></div></div>
<p>Encrypted message is encrypted with the sender's DHT private key, the
receiver's DHT public key and the nonce.</p>
<p>The packet id for cookie request packets is 24. The request contains the
DHT public key of the sender, whose private counterpart is used (along
with the DHT public key of the receiver) to encrypt the encrypted part
of the cookie request packet, and a nonce that is also used to encrypt
the encrypted part of the packet. Padding is used to maintain
backwards-compatibility with previous versions of the protocol. The echo
id in the cookie request must be sent back untouched in the cookie
response. This echo id is how the peer sending the request can be sure
that the response received was a response to the packet that it sent.</p>
<p>The reason for sending the DHT public key and real public key in the
cookie request is that both are contained in the cookie sent back in the
response.</p>
<p>Toxcore currently sends one cookie request packet per second, up to 8
times, and kills the connection if there are no responses.</p>
<p>cookie response packet (161 bytes):</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[uint8_t 25]
[Random nonce (24 bytes)]
[Encrypted message containing:
[Cookie]
[uint64_t echo id (that was sent in the request)]
]
</code></pre></div></div>
<p>Encrypted message is encrypted with the exact same symmetric key as the
cookie request packet it responds to but with a different nonce.</p>
<p>The packet id for cookie response packets is 25. The response contains a
nonce and an encrypted part encrypted with the nonce. The encrypted part
is encrypted with the same key used to decrypt the encrypted part of the
request meaning the expensive shared key generation needs to be called
only once in order to handle and respond to a cookie request packet with
a cookie response.</p>
<p>The Cookie (see below) and the echo id that was sent in the request are
the contents of the encrypted part.</p>
<p>The Cookie should be (112 bytes):</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[nonce]
[encrypted data:
[uint64_t time]
[Sender's real public key (32 bytes)]
[Sender's DHT public key (32 bytes)]
]
</code></pre></div></div>
<p>The cookie is a 112 byte piece of data that is created and sent to the
requester as part of the cookie response packet. A peer who wants to
connect to another must obtain a cookie packet from the peer they are
trying to connect to. The only way to send a valid handshake packet to
another peer is to first obtain a cookie from them.</p>
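<p>The byte counts quoted for the cookie, the cookie request and the cookie response can be checked by adding up their fields. The sketch below assumes that every encrypted part carries the 16 byte authenticator of NaCl authenticated encryption; the constant names are illustrative only.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>PUBLIC_KEY_SIZE = 32
NONCE_SIZE = 24
MAC_SIZE = 16      # authenticator added to each encrypted part
UINT64_SIZE = 8
PACKET_ID_SIZE = 1

# Cookie: nonce + encrypted(time + real public key + DHT public key).
COOKIE_SIZE = NONCE_SIZE + (UINT64_SIZE + 2 * PUBLIC_KEY_SIZE) + MAC_SIZE

# Cookie request: id + DHT public key + nonce
#                 + encrypted(real public key + padding + echo id).
COOKIE_REQUEST_SIZE = (
    PACKET_ID_SIZE + PUBLIC_KEY_SIZE + NONCE_SIZE
    + (PUBLIC_KEY_SIZE + PUBLIC_KEY_SIZE + UINT64_SIZE) + MAC_SIZE
)

# Cookie response: id + nonce + encrypted(cookie + echo id).
COOKIE_RESPONSE_SIZE = (
    PACKET_ID_SIZE + NONCE_SIZE + (COOKIE_SIZE + UINT64_SIZE) + MAC_SIZE
)

print(COOKIE_SIZE, COOKIE_REQUEST_SIZE, COOKIE_RESPONSE_SIZE)
</code></pre></div></div>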
<p>The cookie both proves to the receiver of the handshake that the peer
has received a cookie response and contains encrypted information that
tells the receiver of the handshake packet enough to decrypt and
validate the handshake packet and accept the connection.</p>
<p>When toxcore is started it generates a symmetric encryption key that it
uses to encrypt and decrypt all cookie packets (using NaCl authenticated
encryption exactly like encryption everywhere else in toxcore). Only the
instance of toxcore that created the packets knows the encryption key,
meaning any cookie it successfully decrypts and validates was created by
it.</p>
<p>The time variable in the cookie is used to prevent cookie packets that
are too old from being used. Toxcore has a time out of 15 seconds for
cookie packets. If a cookie packet is used more than 15 seconds after it
is created toxcore will see it as invalid.</p>
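<p>The freshness rule can be sketched as a simple age check. The function name is hypothetical, times are whole seconds, and rejecting cookies whose timestamp lies in the future is an extra assumption that the text above does not state.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>COOKIE_TIMEOUT = 15  # seconds


def cookie_is_fresh(cookie_time, now):
    """Return True if a cookie created at cookie_time may still be used."""
    age = now - cookie_time
    # Valid ages are 0..15 seconds inclusive; a negative age (timestamp
    # in the future) is also rejected, which is an assumption.
    return age in range(COOKIE_TIMEOUT + 1)
</code></pre></div></div>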
<p>When responding to a cookie request packet the sender's real public key
is the known key sent by the peer in the encrypted part of the cookie
request packet and the sender's DHT public key is the key used to encrypt
the encrypted part of the cookie request packet.</p>
<p>A cookie can also be generated to put inside the encrypted part of the
handshake: one of the requirements to connect successfully to someone
else is that we know their DHT public key and their real long term
public key, meaning there is enough information to construct the cookie.</p>
<p>Handshake packet:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[uint8_t 26]
[Cookie]
[nonce (24 bytes)]
[Encrypted message containing:
[24 bytes base nonce]
[session public key of the peer (32 bytes)]
[sha512 hash of the entire Cookie sitting outside the encrypted part]
[Other Cookie (used by the other to respond to the handshake packet)]
]
</code></pre></div></div>
<p>The packet id for handshake packets is 26. The cookie is a cookie
obtained by sending a cookie request packet to the peer and getting a
cookie response packet with a cookie in it. It may also be obtained in
the handshake packet by a peer receiving a handshake packet (Other
Cookie).</p>
<p>The nonce is a nonce used to encrypt the encrypted part of the handshake
packet. The encrypted part of the handshake packet is encrypted with the
long term keys of both peers. This is to prevent impersonation.</p>
<p>Inside the encrypted part of the handshake packet there is a base
nonce and a session public key. The base nonce is a nonce that the
other peer should use to encrypt each data packet, adding 1 to it for
each data packet sent (the first packet uses base nonce + 0, the next
base nonce + 1, etc.; for mathematical operations the nonce is treated
as a 24 byte number in big endian format). The session key is the
temporary connection public key that the peer has generated for this
connection and is sending to the other peer. This session key is used so
that the connection has perfect forward secrecy. It is important to save
the private key counterpart of the session public key sent in the
handshake, the public key received from the other peer, and both the
received and sent base nonces, as they are used to encrypt and decrypt
the data packets.</p>
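<p>Treating the nonce as a 24 byte big endian number can be sketched as below. The helper name <code class="language-plaintext highlighter-rouge">nonce_add</code> is hypothetical, and wrapping around to zero when the 24 byte value overflows is an assumption.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>NONCE_SIZE = 24


def nonce_add(nonce, amount):
    """Add amount to a 24 byte nonce interpreted as a big endian integer."""
    if len(nonce) != NONCE_SIZE:
        raise ValueError("nonce must be 24 bytes")
    value = int.from_bytes(nonce, "big")
    # Assumption: the value wraps around if it overflows 24 bytes.
    value = (value + amount) % (256 ** NONCE_SIZE)
    return value.to_bytes(NONCE_SIZE, "big")


base_nonce = bytes(23) + b"\xff"
# Nonces for the first three data packets: base + 0, base + 1, base + 2.
nonces = [nonce_add(base_nonce, n) for n in range(3)]
</code></pre></div></div>
<p>Note how adding 1 to a nonce ending in <code class="language-plaintext highlighter-rouge">0xff</code> carries into the second to last byte, which is what big endian addition implies.</p>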
<p>The hash of the cookie in the encrypted part is used to make sure that
an attacker has not taken an older valid handshake packet and replaced
the cookie in it with a newer one, which would let them replay the old
handshake and possibly disrupt the connection.</p>
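<p>The check that ties the encrypted part to the outer cookie might look like this. It is a sketch using Python's standard library; the function name is hypothetical and the constant time comparison is a design choice of this example, not something the text above requires.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import hashlib
import hmac


def cookie_hash_matches(outer_cookie, hash_from_encrypted_part):
    """Check that the handshake was built for the cookie it arrived with."""
    expected = hashlib.sha512(outer_cookie).digest()  # 64 bytes
    return hmac.compare_digest(expected, hash_from_encrypted_part)


old_cookie = bytes(112)
new_cookie = b"\x01" * 112
sent_hash = hashlib.sha512(old_cookie).digest()
</code></pre></div></div>
<p>Swapping in <code class="language-plaintext highlighter-rouge">new_cookie</code> while keeping the old encrypted part makes the check fail, so the modified handshake would be rejected.</p>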
<p>The Other Cookie is a valid cookie that we put in the handshake so
that the other can respond with a valid handshake without having to make
a cookie request to obtain one.</p>
<p>The handshake packet is sent by both sides of the connection. If a peer
receives a handshake it will check that the cookie is valid, that the
encrypted section decrypts and validates, that the cookie hash is valid,
and that the long term public key belongs to a known friend. If all
these are true then the connection is considered Accepted but not
Confirmed.</p>
<p>If there is no existing connection to the peer identified by the long
term public key to set to Accepted, one will be created with that
status. If a connection to that peer exists with a not yet Accepted
status, this connection is set to Accepted. If a connection with a
Confirmed status exists for this peer, the handshake packet will be
ignored and discarded (the reason for discarding it is that we do not
want slightly late handshake packets to kill the connection), except if
the DHT public key in the cookie contained in the handshake packet is
different from the known DHT public key of the peer. If this happens,
the connection will be immediately killed because it means it is no
longer valid, and a new connection will be created immediately with the
Accepted status.</p>
<p>Sometimes toxcore might receive the DHT public key of the peer first
with a handshake packet so it is important that this case is handled and
that the implementation passes the DHT public key to the other modules
(DHT, <code class="language-plaintext highlighter-rouge">TCP_connection</code>) because this does happen.</p>
<p>Handshake packets must be created only once during the connection but
must be sent in intervals until we are sure the other received them.
This happens when a valid encrypted data packet is received and
decrypted.</p>
<p>The states of a connection:</p>
<ol>
<li>
<p>Not accepted: Send handshake packets.</p>
</li>
<li>
<p>Accepted: A handshake packet has been received from the other peer
but no encrypted packets: continue (or start) sending handshake
packets because the peer can't know if the other has received them.</p>
</li>
<li>
<p>Confirmed: A valid encrypted packet has been received from the other
peer: Connection is fully established: stop sending handshake
packets.</p>
</li>
</ol>
<p>Toxcore sends a handshake packet every second, up to 8 times, and times
out the connection if it does not get confirmed (no encrypted packet is
received) within this time.</p>
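<p>The three states and their transitions can be modelled as a small state machine. This is an illustrative sketch with hypothetical names; timeouts and the changed DHT public key case are left out.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import enum


class ConnState(enum.Enum):
    NOT_ACCEPTED = 1
    ACCEPTED = 2
    CONFIRMED = 3


def on_valid_handshake(state):
    # A late handshake never downgrades a Confirmed connection.
    if state is ConnState.NOT_ACCEPTED:
        return ConnState.ACCEPTED
    return state


def on_valid_encrypted_packet(state):
    # Before a handshake is received there is no session key, so only
    # an Accepted connection can become Confirmed this way.
    if state is ConnState.ACCEPTED:
        return ConnState.CONFIRMED
    return state


def should_send_handshake(state):
    # Keep sending handshake packets until the connection is Confirmed.
    return state is not ConnState.CONFIRMED


state = ConnState.NOT_ACCEPTED
state = on_valid_handshake(state)         # now Accepted
state = on_valid_encrypted_packet(state)  # now Confirmed
</code></pre></div></div>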
<p>Perfect handshake scenario:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Peer 1                Peer 2
Cookie request -&gt;
                      &lt;- Cookie response
Handshake packet -&gt;
                      *accepts connection
                      &lt;- Handshake packet
*accepts connection
Encrypted packet -&gt;   &lt;- Encrypted packet
*confirms connection  *confirms connection
       Connection successful.
Encrypted packets -&gt;  &lt;- Encrypted packets
</code></pre></div></div>
<p>More realistic handshake scenario:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Peer 1                Peer 2
Cookie request -&gt;     *packet lost*
Cookie request -&gt;
                      &lt;- Cookie response
                      *Peer 2 randomly starts new connection to peer 1
                      &lt;- Cookie request
Cookie response -&gt;
Handshake packet -&gt;   &lt;- Handshake packet
*accepts connection   *accepts connection
Encrypted packet -&gt;   &lt;- Encrypted packet
*confirms connection  *confirms connection
       Connection successful.
Encrypted packets -&gt;  &lt;- Encrypted packets
</code></pre></div></div>
<p>The reason why the handshake is like this is because of certain design
requirements:</p>
<ol>
<li>
<p>The handshake must not leak the long term public keys of the peers
to a possible attacker who would be looking at the packets but each
peer must know for sure that they are connecting to the right peer
and not an impostor.</p>
</li>
<li>
<p>A connection must be able to be established if only one of the
peers has the information necessary to initiate a connection (DHT
public key of the peer and a link to the peer).</p>
</li>
<li>
<p>If both peers initiate a connection to each other at the same time
the connection must succeed without issues.</p>
</li>
<li>
<p>There must be perfect forward secrecy.</p>
</li>
<li>
<p>Must be resistant to any possible attacks.</p>
</li>
</ol>
<p>Due to how it is designed only one connection is possible at a time
between 2 peers.</p>
<p>Encrypted
packets:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> (0x1b)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">2</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint16_t</code> The last 2 bytes of the nonce used to encrypt this</td>
</tr>
<tr>
<td style="text-align: left">variable</td>
<td style="text-align: left">Payload</td>
</tr>
</tbody>
</table>
<p>The payload is encrypted with the session key and a nonce equal to the
base nonce set by the receiver in their handshake plus the packet number
(starting at 0, big endian math).</p>
<p>The packet id for encrypted packets is 27. Encrypted packets are the
packets used to send data to the other peer in the connection. Since
these packets can be sent over UDP the implementation must assume that
they can arrive out of order or even not arrive at all.</p>
<p>To get the key used to encrypt/decrypt each packet in the connection a
peer takes the session public key received in the handshake and the
private key counterpart of the key it sent in the handshake and
generates a shared key from them. This shared key will be identical for
both peers. It is important to note that connection keys must be wiped
when the connection is killed.</p>
<p>To create an encrypted packet to be sent to the other peer, the data is
encrypted with the shared key for this connection and the base nonce
that the other peer sent in the handshake packet with the total number
of encrypted packets sent in the connection added to it (base nonce +
0 for the first encrypted data packet sent, base nonce + 1 for the
second, etc. Note that the nonce is treated as a big endian number for
mathematical operations like additions). The 2 byte (<code class="language-plaintext highlighter-rouge">uint16_t</code>) number
at the beginning of the encrypted packet is the last 2 bytes of this 24
byte nonce.</p>
<p>To decrypt a received encrypted packet, the nonce the packet was
encrypted with is calculated using the base nonce that the peer sent to
the other and the 2 byte number at the beginning of the packet. This
assumes that packets will most likely arrive out of order and that some
will be lost, but that packet loss and reordering will never be severe
enough to make the 2 byte number need an extra byte. The packet is
decrypted using the shared key for the connection and the calculated
nonce.</p>
<p>Toxcore uses the following method to calculate the nonce for each
packet:</p>
<ol>
<li>
<p><code class="language-plaintext highlighter-rouge">diff</code> = (2 byte number on the packet) - (last 2 bytes of the
current saved base nonce) NOTE: treat the 3 variables as 16 bit
unsigned ints, the result is expected to sometimes roll over.</p>
</li>
<li>
<p>copy <code class="language-plaintext highlighter-rouge">saved_base_nonce</code> to <code class="language-plaintext highlighter-rouge">temp_nonce</code>.</p>
</li>
<li>
<p><code class="language-plaintext highlighter-rouge">temp_nonce = temp_nonce + diff</code>. <code class="language-plaintext highlighter-rouge">temp_nonce</code> is the correct nonce
that can be used to decrypt the packet.</p>
</li>
<li>
<p><code class="language-plaintext highlighter-rouge">DATA_NUM_THRESHOLD</code> = (1/3 of the maximum number that can be stored
in an unsigned 2 byte integer)</p>
</li>
<li>
<p>if decryption succeeds and <code class="language-plaintext highlighter-rouge">diff &gt; (DATA_NUM_THRESHOLD * 2)</code> then:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">saved_base_nonce = saved_base_nonce + DATA_NUM_THRESHOLD</code></li>
</ul>
</li>
</ol>
<p>First it takes the difference between the 2 byte number on the packet
and the last 2 bytes of the saved base nonce. Because the 3 values are
unsigned 16 bit ints and rollover is part of the math, something like
diff = (10 - 65535) means diff is equal to 11.</p>
<p>Then it copies the saved base nonce to a temp nonce buffer.</p>
<p>Then it adds diff to the nonce (the nonce is in big endian format).</p>
<p>Finally, if decryption was successful, it checks whether diff was bigger
than 2/3 of the maximum value that can be contained in a 16 bit unsigned
int and, if so, increases the saved base nonce by 1/3 of that maximum
value.</p>
<p>This is only one of many ways that the nonce for each encrypted packet
can be calculated.</p>
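<p>The five steps above can be sketched in Python as follows. This is a
minimal illustration rather than toxcore's code: the function names
<code class="language-plaintext highlighter-rouge">add_to_nonce</code>, <code class="language-plaintext highlighter-rouge">nonce_for_packet</code> and <code class="language-plaintext highlighter-rouge">advance_base_nonce</code> are
hypothetical, and the decryption itself is assumed to be done by the
caller between the last two calls.</p>

```python
NONCE_SIZE = 24                    # bytes in a nonce
UINT16_MOD = 2 ** 16               # 16 bit unsigned arithmetic wraps at this value
DATA_NUM_THRESHOLD = 0xFFFF // 3   # step 4: 1/3 of the largest 2 byte number


def add_to_nonce(nonce, amount):
    """Add amount to a 24 byte nonce treated as a big endian number."""
    value = (int.from_bytes(nonce, "big") + amount) % (2 ** (8 * NONCE_SIZE))
    return value.to_bytes(NONCE_SIZE, "big")


def nonce_for_packet(saved_base_nonce, num_on_packet):
    """Steps 1 to 3: return (temp_nonce, diff) for a received packet."""
    base_low = int.from_bytes(saved_base_nonce[-2:], "big")
    diff = (num_on_packet - base_low) % UINT16_MOD   # rollover is expected
    return add_to_nonce(saved_base_nonce, diff), diff


def advance_base_nonce(saved_base_nonce, diff):
    """Step 5: call this only after the packet decrypted successfully."""
    if diff > DATA_NUM_THRESHOLD * 2:
        return add_to_nonce(saved_base_nonce, DATA_NUM_THRESHOLD)
    return saved_base_nonce
```

<p>With a saved base nonce whose last 2 bytes are 0xFFFF and the number 10
on the packet, <code class="language-plaintext highlighter-rouge">nonce_for_packet</code> returns a diff of 11, and the addition
carries into the third byte from the end of the nonce.</p>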
<p>Encrypted packets that cannot be decrypted are simply dropped.</p>
<p>Base nonces are exchanged because the key for encrypting packets is
the same for received and sent packets, so there must be a
cryptographic way to make it impossible for someone to mount an attack
where they replay packets back to the sender and the sender
thinks that those packets came from the other peer.</p>
<p>Data in the encrypted
packets:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[our recvbuffers buffer_start, (highest packet number handled + 1), (big endian)]
[uint32_t packet number if lossless, sendbuffer buffer_end if lossy, (big endian)]
[data]
</code></pre></div></div>
<p>Encrypted packets may be lossy or lossless. Lossy packets are simply
encrypted packets that are sent to the other. If they are lost, arrive
in the wrong order or even if an attacker duplicates them (be sure to
take this into account for anything that uses lossy packets) they will
simply be decrypted as they arrive and passed upwards to what should
handle them depending on the data id.</p>
<p>Lossless packets are packets containing data that will be delivered in
order by the implementation of the protocol. In this protocol, the
receiver tells the sender which packet numbers he has received and which
he has not, and the sender must resend any packets that are dropped. Any
attempt at duplicating packets will cause all copies (except the first
received) to be ignored.</p>
<p>Each lossless packet contains both a 4 byte number indicating the
highest packet number received and processed and a 4 byte packet number
which is the packet number of the data in the packet.</p>
<p>In lossy packets, the layout is the same except that instead of a packet
number, the second 4 byte number represents the packet number of a
lossless packet if one were sent right after. This number is used by the
receiver to know if any packets have been lost. (for example if it
receives 4 packets with numbers (0, 1, 2, 5) and then later a lossy
packet with this second number as: 8 it knows that packets: 3, 4, 6, 7
have been lost and will request them)</p>
<p>How the reliability is achieved:</p>
<p>First it is important to say that packet numbers do roll over: the next
number after 0xFFFFFFFF (maximum value in 4 bytes) is 0. Hence, all the
mathematical operations dealing with packet numbers are assumed to be
done only on unsigned 32 bit integers unless said otherwise. For example
0 - 0xFFFFFFFF equals 1 because of the rollover.</p>
<p>When sending a lossless packet, the packet is created with its packet
number being the number of the last lossless packet created + 1
(starting at 0). The packet numbers are used for both reliability and
in-order delivery and so must be sequential.</p>
<p>The packet is then stored along with its packet number in order for the
peer to be able to send it again if the receiver does not receive it.
Packets are only removed from storage when the receiver confirms they
have received them.</p>
<p>When the receiver receives a packet, he stores it along
with its packet number in an array. If there is already a packet with
that number in the buffer, the packet is dropped. If the packet number
is smaller than the last packet number that was processed, the packet is
dropped. A processed packet means it was removed from the buffer and
passed upwards to the relevant module.</p>
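<p>The receive-side rules above can be modelled with a small buffer keyed
by packet number. The class below is an illustrative sketch, not
toxcore's array implementation: <code class="language-plaintext highlighter-rouge">ReceiveBuffer</code> and its fields are
hypothetical names, and treating any number that is more than half the
32 bit range ahead as already processed is an assumption made here to
cope with rollover.</p>

```python
UINT32_MOD = 2 ** 32


class ReceiveBuffer:
    """Toy model of the lossless receive array (all names are illustrative)."""

    def __init__(self):
        self.next_to_process = 0   # lowest packet number not yet passed upwards
        self.pending = {}          # packet number -> data still in the buffer

    def receive(self, number, data):
        """Store one packet and return the packets passed upwards, in order."""
        ahead = (number - self.next_to_process) % UINT32_MOD
        if ahead >= UINT32_MOD // 2:   # assumption: this number was already processed
            return []
        if number in self.pending:     # duplicate of a packet already in the buffer
            return []
        self.pending[number] = data
        delivered = []
        # Hand over every packet that is now contiguous with the processed ones.
        while self.next_to_process in self.pending:
            delivered.append(self.pending.pop(self.next_to_process))
            self.next_to_process = (self.next_to_process + 1) % UINT32_MOD
        return delivered
```

<p>Feeding this buffer the packet numbers 3, 2, 0, 2 in that order passes
only packet 0 upwards and leaves 2 and 3 in the buffer until packet 1
arrives.</p>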
<p>Assuming a new connection, the sender sends 5 lossless packets to the
receiver: 0, 1, 2, 3, 4 are the packet numbers sent and the receiver
receives: 3, 2, 0, 2 in that order.</p>
<p>The receiver will save the packets and discard the second packet with
the number 2, so he has: 0, 2, 3 in his buffer. He will pass the first
packet to the relevant module and remove it from the array, but since
packet number 1 is missing he will stop there. Contents of the buffer
are now: 2, 3. The receiver knows packet number 1 is missing and will
request it from the sender by using a packet request packet:</p>
<p>data ids:</p>
<table>
<thead>
<tr>
<th style="text-align: left">ID</th>
<th style="text-align: left">Data</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left">0</td>
<td style="text-align: left">padding (skipped until we hit a non zero (data id) byte)</td>
</tr>
<tr>
<td style="text-align: left">1</td>
<td style="text-align: left">packet request packet (lossy packet)</td>
</tr>
<tr>
<td style="text-align: left">2</td>
<td style="text-align: left">connection kill packet (lossy packet)</td>
</tr>
<tr>
<td style="text-align: left"></td>
<td style="text-align: left"></td>
</tr>
<tr>
<td style="text-align: left">16+</td>
<td style="text-align: left">reserved for Messenger usage (lossless packets)</td>
</tr>
<tr>
<td style="text-align: left">192+</td>
<td style="text-align: left">reserved for Messenger usage (lossy packets)</td>
</tr>
<tr>
<td style="text-align: left">255</td>
<td style="text-align: left">reserved for Messenger usage (lossless packet)</td>
</tr>
</tbody>
</table>
<p>Connection kill packets tell the other that the connection is over.</p>
<p>Data ids are the first byte of data in the packet.</p>
<p>packet request packet:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[uint8_t (1)][uint8_t num][uint8_t num][uint8_t num]...[uint8_t num]
</code></pre></div></div>
<p>Packet request packets are used by one side of the connection to request
packets from the other. To create a full packet request packet, the one
requesting the packet takes the last packet number that was processed
(sent to the relevant module and removed from the array; 0 in the
example above) and subtracts it from the number of the first missing
packet: (1 - 0) = 1. This means the full packet to request packet
number 1 will look like:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[uint32_t 1]
[uint32_t 0]
[uint8_t 1][uint8_t 1]
</code></pre></div></div>
<p>If packet number 4 was being requested as well, take the difference
between the packet number and the last packet number being requested:
(4 - 1) = 3. So the packet will look like:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[uint32_t 1]
[uint32_t 0]
[uint8_t 1][uint8_t 1][uint8_t 3]
</code></pre></div></div>
<p>But what if the number is greater than 255? Let's say the peer needs to
request packets 3, 6 and 1024; the packet will look like:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[uint32_t 1]
[uint32_t 2]
[uint8_t 1][uint8_t 3][uint8_t 3][uint8_t 0][uint8_t 0][uint8_t 0][uint8_t 253]
</code></pre></div></div>
<p>Each 0 in the packet represents adding 255 to a running total until a
non 0 byte is reached; that byte is then added as well, and the resulting
total is the difference to the next requested packet number.</p>
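<p>The encoding described above can be sketched as a pair of Python
functions. This is a minimal sketch under stated assumptions: the names
<code class="language-plaintext highlighter-rouge">encode_request</code> and <code class="language-plaintext highlighter-rouge">decode_request</code> are hypothetical, only the part of
the packet starting at the data id byte is handled (not the two
<code class="language-plaintext highlighter-rouge">uint32_t</code> fields in front of it), and packet number rollover is
ignored for brevity.</p>

```python
PACKET_ID_REQUEST = 1   # data id of a packet request packet


def encode_request(last_processed, missing):
    """Return the data id byte followed by the differences between requests."""
    out = bytearray([PACKET_ID_REQUEST])
    previous = last_processed
    for number in sorted(missing):
        gap = number - previous
        while gap > 255:        # each 0 byte stands for "add 255"
            out.append(0)
            gap -= 255
        out.append(gap)         # the non 0 byte that completes the difference
        previous = number
    return bytes(out)


def decode_request(last_processed, payload):
    """Recover the requested packet numbers from an encoded request."""
    if not payload or payload[0] != PACKET_ID_REQUEST:
        raise ValueError("not a packet request packet")
    requested = []
    current = last_processed
    for byte in payload[1:]:
        if byte == 0:
            current += 255      # keep accumulating, nothing is requested yet
        else:
            current += byte
            requested.append(current)
    return requested
```

<p>With 0 as the last processed packet number, <code class="language-plaintext highlighter-rouge">encode_request(0, [3, 6, 1024])</code>
produces the bytes 1, 3, 3, 0, 0, 0, 253, which matches the example
above, and <code class="language-plaintext highlighter-rouge">decode_request</code> turns them back into 3, 6 and 1024.</p>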
<p>This request is designed to be small when requesting packets in real
network conditions where the requested packet numbers will be close to
each other. Putting each requested 4 byte packet number would be very
simple but would make the request packets unnecessarily large which is
why the packets look like this.</p>
<p>When a request packet is received, it will be decoded and all packets in
between the requested packets will be assumed to be successfully
received by the other.</p>
<p>Packet request packets are sent at least every 1 second in toxcore and
more when packets are being received.</p>
<p>The current formula used is (note that this formula is likely
sub-optimal):</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>REQUEST_PACKETS_COMPARE_CONSTANT = 50.0
double request_packet_interval =
    (REQUEST_PACKETS_COMPARE_CONSTANT /
     (((double)num_packets_array(&amp;conn-&gt;recv_array) + 1.0) / (conn-&gt;packet_recv_rate + 1.0)));
</code></pre></div></div>
<p><code class="language-plaintext highlighter-rouge">num_packets_array(&amp;conn-&gt;recv_array)</code> returns the difference between
the highest packet number received and the last one handled. In the
toxcore code it refers to the total size of the current array (with the
holes which are the placeholders for not yet received packets that are
known to be missing).</p>
<p><code class="language-plaintext highlighter-rouge">conn-&gt;packet_recv_rate</code> is the number of data packets successfully
received per second.</p>
<p>This formula was created with the logic that the higher the delay in
packets (<code class="language-plaintext highlighter-rouge">num_packets_array(&amp;conn-&gt;recv_array)</code>) vs the speed of packets
received, the more request packets should be sent.</p>
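<p>To see how the formula behaves, it can be transcribed into Python. The
function and parameter names below are chosen for this illustration
only, and the unit of the returned interval is whatever unit toxcore
uses for it, which this sketch does not assume.</p>

```python
REQUEST_PACKETS_COMPARE_CONSTANT = 50.0


def request_packet_interval(num_packets_in_array, packet_recv_rate):
    """Interval between request packets, transcribed from the formula above."""
    backlog = num_packets_in_array + 1.0   # size of the receive array, plus one
    rate = packet_recv_rate + 1.0          # data packets received per second, plus one
    return REQUEST_PACKETS_COMPARE_CONSTANT / (backlog / rate)
```

<p>A larger receive array shortens the interval and a higher receive rate
lengthens it: with a receive rate of 0, going from an empty array to 9
entries reduces the interval from 50.0 to 5.0.</p>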
<p>Requested packets are resent as soon as the congestion control allows
it: resends obey the congestion control and do not bypass it. A requested
packet is resent once; subsequent request packets are then used to know
if the packet was received or if it should be resent again.</p>
<p>The ping or rtt (round trip time) between two peers can be calculated by
saving the time each packet was sent and taking the difference between
the time the latest packet confirmed received by a request packet was
sent and the time the request packet was received. The rtt can be
calculated for every request packet. The lowest one (for all packets)
will be the closest to the real ping.</p>
<p>This ping or rtt can be used to decide what to do when a request packet
asks for a packet we just sent: either resend it right away, or wait for
the next request packet because the other side may not yet have had time
to receive the packet.</p>
<p>The congestion control algorithm has the goal of guessing how many
packets can be sent through the link every second before none can be
sent through anymore. How it works is basically to send packets faster
and faster until none can go through the link and then stop sending them
faster than that.</p>
<p>Currently the congestion control uses the following formula in toxcore,
although that is probably not the best way to do it.</p>
<p>The current formula is to take the difference between the current size
of the send queue and the size of the send queue 1.2 seconds ago, take
the total number of packets sent in the last 1.2 seconds and subtract
the previous number from it.</p>
<p>Then divide this number by 1.2 to get a packet speed per second. If this
speed is lower than the minimum send rate of 8 packets per second, set
it to 8.</p>
<p>A congestion event can be defined as an event when the number of
requested packets exceeds the number of packets the congestion control
says can be sent during this frame. If a congestion event occurred
during the last 2 seconds, the packet send rate of the connection is set
to the send rate previously calculated, if not it is set to that send
rate times 1.25 in order to increase the speed.</p>
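<p>The calculation described above can be written out as a short Python
function. This is a sketch of the prose rather than toxcore's code: the
function name, the parameter names and the idea of passing the
congestion event in as a boolean are assumptions made for the example.</p>

```python
CONGESTION_WINDOW_SECONDS = 1.2   # period over which the send queue is compared
MIN_SEND_RATE = 8.0               # minimum send rate in packets per second
RATE_INCREASE_FACTOR = 1.25       # applied when no congestion event occurred


def estimate_send_rate(queue_size_now, queue_size_before,
                       packets_sent_in_window, congestion_event_recently):
    """Return the new packet send rate in packets per second.

    queue_size_before is the send queue size 1.2 seconds ago and
    congestion_event_recently says whether a congestion event occurred
    during the last 2 seconds.
    """
    queue_growth = queue_size_now - queue_size_before
    # Packets sent in the window, minus those that only piled up in the queue.
    got_through = packets_sent_in_window - queue_growth
    rate = max(got_through / CONGESTION_WINDOW_SECONDS, MIN_SEND_RATE)
    if congestion_event_recently:
        return rate
    return rate * RATE_INCREASE_FACTOR
```

<p>For instance, if nothing got through in the window the rate falls back
to the minimum of 8 packets per second, or 10 when no congestion event
was seen in the last 2 seconds.</p>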
<p>As noted above, this isn't perfect and a better solution can likely be
found or the numbers tweaked.</p>
<p>Sending high bandwidth data like files could make it impossible to send
very low bandwidth data like text messages. To fix this possible issue,
priority packets can ignore the congestion control completely: they are
placed into the send packet queue and sent even if the congestion control
says not to. This is used in toxcore for all non file transfer packets to
prevent file transfers from blocking normal message packets.</p>
<h2 id="networktxt">network.txt</h2>
<p>The network module is the lowest level module in toxcore, which
everything else depends on. This module is basically a UDP socket
wrapper: it serves as the sorting ground for packets received by the
socket, and initializes and uninitializes the socket. It also contains
many socket and networking related functions, plus some others like a
monotonic time function used by other toxcore modules.</p>
<p>Things of note in this module are the maximum UDP packet size define
(<code class="language-plaintext highlighter-rouge">MAX_UDP_PACKET_SIZE</code>), which sets the maximum UDP packet size toxcore
can send and receive, and the list of all UDP packet ids: <code class="language-plaintext highlighter-rouge">NET_PACKET_</code>. A UDP
packet id is the value of the first byte of a UDP packet and is how
each packet gets sorted to the right module that can handle it.
<code class="language-plaintext highlighter-rouge">networking_registerhandler()</code> is used by higher level modules in order
to tell the network object which packets to send to which module via a
callback.</p>
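<p>The sorting of packets by their first byte can be pictured with a small
Python model. It is not the toxcore C API: the class <code class="language-plaintext highlighter-rouge">Networking</code> and its
methods <code class="language-plaintext highlighter-rouge">register_handler</code> and <code class="language-plaintext highlighter-rouge">dispatch</code> are hypothetical stand-ins for the
role played by <code class="language-plaintext highlighter-rouge">networking_registerhandler()</code> and the polling function.</p>

```python
class Networking:
    """Toy model of the packet sorting done by the network module."""

    def __init__(self):
        self.handlers = {}   # packet id (first byte) -> callback

    def register_handler(self, packet_id, callback):
        """Tell the network object which callback handles this packet id."""
        self.handlers[packet_id] = callback

    def dispatch(self, source, packet):
        """Pass a received packet to its handler; return False if it is dropped."""
        if not packet:
            return False
        callback = self.handlers.get(packet[0])
        if callback is None:
            return False     # no module registered for this packet id
        callback(source, packet)
        return True
```

<p>A higher level module would register one callback per packet id it
understands, and every received packet whose first byte has no
registered callback is simply ignored in this model.</p>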
<p>It also contains data structures used for ip addresses in toxcore. IP4
and IP6 are the data structures for ipv4 and ipv6 addresses, and IP is the
data structure for storing either: its family can be set to <code class="language-plaintext highlighter-rouge">AF_INET</code>
(ipv4) or <code class="language-plaintext highlighter-rouge">AF_INET6</code> (ipv6). The family can also be set to another value like
<code class="language-plaintext highlighter-rouge">TCP_ONION_FAMILY</code>, <code class="language-plaintext highlighter-rouge">TCP_INET</code>, <code class="language-plaintext highlighter-rouge">TCP_INET6</code> or <code class="language-plaintext highlighter-rouge">TCP_FAMILY</code>, which are
invalid values in the network module but valid values in some other
modules, where they denote a special type of ip. <code class="language-plaintext highlighter-rouge">IP_Port</code> stores an IP
data structure with a port.</p>
<p>Since the network module interacts directly with the underlying
operating system with its socket functions it has code to make it work
on windows, linux, etc… unlike most modules that sit at a higher
level.</p>
<p>The network module currently uses the polling method to read from the
UDP socket. The <code class="language-plaintext highlighter-rouge">networking_poll()</code> function is called to read all the
packets from the socket and pass them to the callbacks set using the
<code class="language-plaintext highlighter-rouge">networking_registerhandler()</code> function. The reason it uses polling is
simply that it was easier to write it that way; another method would
be better here.</p>
<p>The goal of this module is to provide an easy interface to a UDP socket
and other networking related functions.</p>
<h2 id="onion">Onion</h2>
<p>The goal of the onion module in Tox is to prevent peers that are not
friends from finding out the temporary DHT public key from a known long
term public key of the peer and to prevent peers from discovering the
long term public key of peers when only the temporary DHT key is known.</p>
<p>It makes sure only friends of a peer can find it and connect to it, and
indirectly makes sure non friends cannot find the ip address of the peer
even when they know its Tox address.</p>
<p>The only way to prevent peers in the network from associating the
temporary DHT public key with the long term public key is to not
broadcast the long term key and only give others in the network that are
not friends the DHT public key.</p>
<p>The onion lets peers send their friends, whose real public key they know
as it is part of the Tox ID, their DHT public key so that the friends
can then find and connect to them without other peers being able to
identify the real public keys of peers.</p>
<p>So how does the onion work?</p>
<p>The onion works by enabling peers to announce their real public key to
peers by going through the onion path. It is like a DHT but through
onion paths. In fact it uses the DHT in order for peers to be able to
find the peers with ids closest to their public key by going through
onion paths.</p>
<p>In order to announce its real public key anonymously to the Tox network
while using the onion, a peer first picks 3 random nodes that it knows
(they can be from anywhere: the DHT, connected TCP relays or nodes found
while finding peers with the onion). The nodes should be picked in a way
that makes them unlikely to be operated by the same person perhaps by
looking at the ip addresses and looking if they are in the same subnet
or other ways. More research is needed to make sure nodes are picked in
the safest way possible.</p>
<p>The reason for 3 nodes is that 3 hops is what is used in Tor and other
anonymous onion based networks.</p>
<p>These nodes are referred to as nodes A, B and C. Note that if a peer
cannot communicate via UDP, its first node will be one of the TCP relays
it is connected to, which will be used to send its onion packet to the
network.</p>
<p>TCP relays can only be node A, the first node in the chain, as the TCP
relay is essentially acting as a gateway to the network. The data sent
to the TCP Client module to be sent as a TCP onion packet by the module
is different from the one sent directly via UDP. This is because it
doesn't need to be encrypted (the connection to the TCP relay server is
already encrypted).</p>
<p>First I will explain how communicating via onion packets works.</p>
<p>Note: nonce is a 24 byte nonce. The nested nonces are all the same as
the outer nonce.</p>
<p>Onion packet (request):</p>
<p>Initial (TCP) data sent as the data of an onion packet through the TCP
client module:</p>
<ul>
<li>
<p><code class="language-plaintext highlighter-rouge">IP_Port</code> of node B</p>
</li>
<li>
<p>A random public key PK1</p>
</li>
<li>
<p>Encrypted with the secret key SK1 and the public key of Node B and
the nonce:</p>
<ul>
<li>
<p><code class="language-plaintext highlighter-rouge">IP_Port</code> of node C</p>
</li>
<li>
<p>A random public key PK2</p>
</li>
<li>
<p>Encrypted with the secret key SK2 and the public key of Node C
and the nonce:</p>
<ul>
<li>
<p><code class="language-plaintext highlighter-rouge">IP_Port</code> of node D</p>
</li>
<li>
<p>Data to send to Node D</p>
</li>
</ul>
</li>
</ul>
</li>
</ul>
<p>Initial (UDP) (sent from us to node A):</p>
<ul>
<li>
<p><code class="language-plaintext highlighter-rouge">uint8_t</code> (0x80) packet id</p>
</li>
<li>
<p>Nonce</p>
</li>
<li>
<p>Our temporary DHT public key</p>
</li>
<li>
<p>Encrypted with our temporary DHT secret key and the public key of
Node A and the nonce:</p>
<ul>
<li>
<p><code class="language-plaintext highlighter-rouge">IP_Port</code> of node B</p>
</li>
<li>
<p>A random public key PK1</p>
</li>
<li>
<p>Encrypted with the secret key SK1 and the public key of Node B
and the nonce:</p>
<ul>
<li>
<p><code class="language-plaintext highlighter-rouge">IP_Port</code> of node C</p>
</li>
<li>
<p>A random public key PK2</p>
</li>
<li>
<p>Encrypted with the secret key SK2 and the public key of Node
C and the nonce:</p>
<ul>
<li>
<p><code class="language-plaintext highlighter-rouge">IP_Port</code> of node D</p>
</li>
<li>
<p>Data to send to Node D</p>
</li>
</ul>
</li>
</ul>
</li>
</ul>
</li>
</ul>
<p>(sent from node A to node B):</p>
<ul>
<li>
<p><code class="language-plaintext highlighter-rouge">uint8_t</code> (0x81) packet id</p>
</li>
<li>
<p>Nonce</p>
</li>
<li>
<p>A random public key PK1</p>
</li>
<li>
<p>Encrypted with the secret key SK1 and the public key of Node B and
the nonce:</p>
<ul>
<li>
<p><code class="language-plaintext highlighter-rouge">IP_Port</code> of node C</p>
</li>
<li>
<p>A random public key PK2</p>
</li>
<li>
<p>Encrypted with the secret key SK2 and the public key of Node C
and the nonce:</p>
<ul>
<li>
<p><code class="language-plaintext highlighter-rouge">IP_Port</code> of node D</p>
</li>
<li>
<p>Data to send to Node D</p>
</li>
</ul>
</li>
</ul>
</li>
<li>
<p>Nonce</p>
</li>
<li>
<p>Encrypted with temporary symmetric key of Node A and the nonce:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">IP_Port</code> (of us)</li>
</ul>
</li>
</ul>
<p>(sent from node B to node C):</p>
<ul>
<li>
<p><code class="language-plaintext highlighter-rouge">uint8_t</code> (0x82) packet id</p>
</li>
<li>
<p>Nonce</p>
</li>
<li>
<p>A random public key PK2</p>
</li>
<li>
<p>Encrypted with the secret key SK2 and the public key of Node C and
the nonce:</p>
<ul>
<li>
<p><code class="language-plaintext highlighter-rouge">IP_Port</code> of node D</p>
</li>
<li>
<p>Data to send to Node D</p>
</li>
</ul>
</li>
<li>
<p>Nonce</p>
</li>
<li>
<p>Encrypted with temporary symmetric key of Node B and the nonce:</p>
<ul>
<li>
<p><code class="language-plaintext highlighter-rouge">IP_Port</code> (of Node A)</p>
</li>
<li>
<p>Nonce</p>
</li>
<li>
<p>Encrypted with temporary symmetric key of Node A and the nonce:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">IP_Port</code> (of us)</li>
</ul>
</li>
</ul>
</li>
</ul>
<p>(sent from node C to node D):</p>
<ul>
<li>
<p>Data to send to Node D</p>
</li>
<li>
<p>Nonce</p>
</li>
<li>
<p>Encrypted with temporary symmetric key of Node C and the nonce:</p>
<ul>
<li>
<p><code class="language-plaintext highlighter-rouge">IP_Port</code> (of Node B)</p>
</li>
<li>
<p>Nonce</p>
</li>
<li>
<p>Encrypted with temporary symmetric key of Node B and the nonce:</p>
<ul>
<li>
<p><code class="language-plaintext highlighter-rouge">IP_Port</code> (of Node A)</p>
</li>
<li>
<p>Nonce</p>
</li>
<li>
<p>Encrypted with temporary symmetric key of Node A and the
nonce:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">IP_Port</code> (of us)</li>
</ul>
</li>
</ul>
</li>
</ul>
</li>
</ul>
<p>Onion packet (response):</p>
<p>initial (sent from node D to node C):</p>
<ul>
<li>
<p><code class="language-plaintext highlighter-rouge">uint8_t</code> (0x8c) packet id</p>
</li>
<li>
<p>Nonce</p>
</li>
<li>
<p>Encrypted with the temporary symmetric key of Node C and the nonce:</p>
<ul>
<li>
<p><code class="language-plaintext highlighter-rouge">IP_Port</code> (of Node B)</p>
</li>
<li>
<p>Nonce</p>
</li>
<li>
<p>Encrypted with the temporary symmetric key of Node B and the
nonce:</p>
<ul>
<li>
<p><code class="language-plaintext highlighter-rouge">IP_Port</code> (of Node A)</p>
</li>
<li>
<p>Nonce</p>
</li>
<li>
<p>Encrypted with the temporary symmetric key of Node A and the
nonce:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">IP_Port</code> (of us)</li>
</ul>
</li>
</ul>
</li>
</ul>
</li>
<li>
<p>Data to send back</p>
</li>
</ul>
<p>(sent from node C to node B):</p>
<ul>
<li>
<p><code class="language-plaintext highlighter-rouge">uint8_t</code> (0x8d) packet id</p>
</li>
<li>
<p>Nonce</p>
</li>
<li>
<p>Encrypted with the temporary symmetric key of Node B and the nonce:</p>
<ul>
<li>
<p><code class="language-plaintext highlighter-rouge">IP_Port</code> (of Node A)</p>
</li>
<li>
<p>Nonce</p>
</li>
<li>
<p>Encrypted with the temporary symmetric key of Node A and the
nonce:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">IP_Port</code> (of us)</li>
</ul>
</li>
</ul>
</li>
<li>
<p>Data to send back</p>
</li>
</ul>
<p>(sent from node B to node A):</p>
<ul>
<li>
<p><code class="language-plaintext highlighter-rouge">uint8_t</code> (0x8e) packet id</p>
</li>
<li>
<p>Nonce</p>
</li>
<li>
<p>Encrypted with the temporary symmetric key of Node A and the nonce:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">IP_Port</code> (of us)</li>
</ul>
</li>
<li>
<p>Data to send back</p>
</li>
</ul>
<p>(sent from node A to us):</p>
<ul>
<li>Data to send back</li>
</ul>
<p>Each packet is encrypted multiple times so that only node A will be able
to receive and decrypt the first packet and know where to send it to,
node B will only be able to receive that decrypted packet, decrypt it
again and know where to send it and so on. You will also notice a piece
of encrypted data (the sendback) at the end of the packet that grows
larger and larger at every layer with the IP of the previous node in it.
This is how the node receiving the end data (Node D) will be able to
send data back.</p>
<p>When a peer receives an onion packet, they will decrypt it, encrypt the
coordinates (IP/port) of the source along with the already existing
encrypted data (if it exists) with a symmetric key known only by the
peer and only refreshed every hour (in toxcore) as a security measure to
force expire paths.</p>
<p>Here's a diagram of how it works:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>peer
  -&gt; [onion1[onion2[onion3[data]]]] -&gt; Node A
  -&gt; [onion2[onion3[data]]][sendbackA] -&gt; Node B
  -&gt; [onion3[data]][sendbackB[sendbackA]] -&gt; Node C
  -&gt; [data][sendbackC[sendbackB[sendbackA]]] -&gt; Node D (end)
Node D
  -&gt; [sendbackC[sendbackB[sendbackA]]][response] -&gt; Node C
  -&gt; [sendbackB[sendbackA]][response] -&gt; Node B
  -&gt; [sendbackA][response] -&gt; Node A
  -&gt; [response] -&gt; peer
</code></pre></div></div>
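<p>The flow in the diagram can be traced with a toy Python model that keeps
only the structure of the layers. It contains no cryptography at all:
<code class="language-plaintext highlighter-rouge">seal</code> and <code class="language-plaintext highlighter-rouge">open_layer</code> are hypothetical stand-ins for encryption and
decryption, a single stand-in is used for both the public key layers and
the symmetric sendback layers, and every name here is invented for the
illustration.</p>

```python
def seal(owner, content):
    """Stand-in for one encryption layer: only `owner` can open it."""
    return ("sealed", owner, content)


def open_layer(owner, box):
    """Stand-in for decryption; fails when the layer belongs to another node."""
    tag, box_owner, content = box
    if tag != "sealed" or box_owner != owner:
        raise ValueError("decryption failed, packet dropped")
    return content


def build_request(path, destination, data):
    """Wrap data in one layer per path node, innermost layer for the last node."""
    next_hops = list(path[1:]) + [destination]
    packet = data
    for node, next_hop in zip(reversed(path), reversed(next_hops)):
        packet = seal(node, (next_hop, packet))
    return packet


def forward_request(node, source, packet, sendback):
    """Peel one layer and wrap the source into a larger sendback."""
    next_hop, inner = open_layer(node, packet)
    return next_hop, inner, seal(node, (source, sendback))


def route_response(node, sendback):
    """Open our own sendback layer to learn where the response goes next."""
    previous_hop, inner_sendback = open_layer(node, sendback)
    return previous_hop, inner_sendback


def round_trip(path, destination, data):
    """Send data along the path, then unwind the sendback for the response."""
    packet, sendback, source = build_request(path, destination, data), None, "peer"
    for node in path:
        next_hop, packet, sendback = forward_request(node, source, packet, sendback)
        source = node
    delivered = (next_hop, packet)   # what the last path node hands to the end node
    hop = path[-1]
    while sendback is not None:      # the response travels back hop by hop
        hop, sendback = route_response(hop, sendback)
    return delivered, hop
```

<p>Calling <code class="language-plaintext highlighter-rouge">round_trip(["A", "B", "C"], "D", "hello")</code> delivers the data to D and
ends with the response routed back to the peer, while a node that tries
to open a layer addressed to another node gets an error and drops the
packet.</p>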
<p>The random public keys in the onion packets are temporary public keys
generated for and used for that onion path only. This is done in order
to make it difficult for others to link different paths together. Each
encrypted layer must have a different public key. This is the reason why
there are multiple keys in the packet definitions above.</p>
<p>The nonce is used to encrypt all the layers of encryption. This 24 byte
nonce should be randomly generated. If it isn't randomly generated and
has a relation to nonces used for other paths, it could be possible to
tie different onion paths together.</p>
<p>The <code class="language-plaintext highlighter-rouge">IP_Port</code> is an ip and port in packed
format:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">TOX_AF_INET</code> (2) for IPv4 or <code class="language-plaintext highlighter-rouge">TOX_AF_INET6</code> (10) for IPv6</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">4 \| 16</code></td>
<td style="text-align: left">IP address (4 bytes if IPv4, 16 if IPv6)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">12 \| 0</code></td>
<td style="text-align: left">Zeroes</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">2</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint16_t</code> Port</td>
</tr>
</tbody>
</table>
<p>If IPv4 the format is padded with 12 bytes of zeroes so that both IPv4
and IPv6 have the same stored size.</p>
<p>The <code class="language-plaintext highlighter-rouge">IP_Port</code> will always end up being of size 19 bytes. This is to make
it hard to know if an ipv4 or ipv6 ip is in the packet just by looking
at the size. For ipv4, the 12 padding bytes must be set to 0 and not
left uninitialized, as uninitialized memory may leak some information.
All numbers here are in big endian format.</p>
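<p>The packed format from the table above can be written down in a few
lines of Python. This is a sketch that follows the table only: the
function names <code class="language-plaintext highlighter-rouge">pack_ip_port</code> and <code class="language-plaintext highlighter-rouge">unpack_ip_port</code> are hypothetical and are
not toxcore functions.</p>

```python
import ipaddress
import struct

TOX_AF_INET = 2            # family byte for IPv4
TOX_AF_INET6 = 10          # family byte for IPv6
PACKED_IP_PORT_SIZE = 19   # 1 family byte + 16 address bytes + 2 port bytes


def pack_ip_port(address, port):
    """Pack an ip string and a port into the fixed 19 byte format."""
    ip = ipaddress.ip_address(address)
    if ip.version == 4:
        # 4 address bytes followed by 12 explicit zero bytes of padding.
        body = bytes([TOX_AF_INET]) + ip.packed + bytes(12)
    else:
        body = bytes([TOX_AF_INET6]) + ip.packed
    return body + struct.pack(">H", port)   # port in big endian


def unpack_ip_port(data):
    """Return (ip string, port) from a packed 19 byte IP_Port."""
    if len(data) != PACKED_IP_PORT_SIZE:
        raise ValueError("packed IP_Port must be 19 bytes")
    port = struct.unpack(">H", data[17:19])[0]
    if data[0] == TOX_AF_INET:
        return str(ipaddress.IPv4Address(data[1:5])), port
    if data[0] == TOX_AF_INET6:
        return str(ipaddress.IPv6Address(data[1:17])), port
    raise ValueError("unknown address family")
```

<p>Both <code class="language-plaintext highlighter-rouge">pack_ip_port("127.0.0.1", 33445)</code> and <code class="language-plaintext highlighter-rouge">pack_ip_port("::1", 33445)</code> return
19 bytes, so the size alone does not reveal which kind of address is
inside.</p>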
<p>The <code class="language-plaintext highlighter-rouge">IP_Port</code> in the sendback data can be in any format as long as the
length is 19 bytes because only the one who writes it can decrypt it and
read it, however, using the previous format is recommended because of
code reuse. The nonce in the sendback data must be a 24 byte nonce.</p>
<p>Each onion layer has a different packet id that identifies it so that
an implementation knows exactly how to handle it. Note that any data
being sent back must be encrypted, appear random and not leak
information in any way, as all the nodes in the path will see it.</p>
<p>If anything is wrong with the received onion packets (decryption fails)
the implementation should drop them.</p>
<p>The implementation should have code for each different type of packet
that handles it, adds (or decrypts) a sendback and sends it to the next
peer in the path. There are a lot of packets but an implementation
should be very straightforward.</p>
<p>Note that if the first node in the path is a TCP relay, the TCP relay
must put an identifier (instead of an IP/Port) in the sendback so that
it knows that any response should be sent to the appropriate peer
connected to the TCP relay.</p>
<p>This explained how to create onion packets and how they are sent back.
Next is what is actually sent and received on top of these onion packets
or paths.</p>
<p>Note: nonce is a 24 byte nonce.</p>
<p>announce request packet:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> (0x83)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">24</code></td>
<td style="text-align: left">Nonce</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">A public key (real or temporary)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">?</code></td>
<td style="text-align: left">Payload</td>
</tr>
</tbody>
</table>
<p>The public key is our real long term public key if we want to announce
ourselves, a temporary one if we are searching for friends.</p>
<p>The payload is encrypted with the secret key part of the sent public
key, the public key of Node D and the nonce, and
contains:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">Ping ID</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">Public key we are searching for</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">Public key that we want those sending back data packets to use</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">8</code></td>
<td style="text-align: left">Data to send back in response</td>
</tr>
</tbody>
</table>
<p>If the ping id is zero, respond with an announce response packet.</p>
<p>If the ping id matches the one the node sent in the announce response
and the public key matches the one being searched for, add the part used
to send data to our list. If the list is full, replace the
furthest entry.</p>
<p>data to route request packet:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> (0x85)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">Public key of destination node</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">24</code></td>
<td style="text-align: left">Nonce</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">Temporary just generated public key</td>
</tr>
<tr>
<td style="text-align: left">variable</td>
<td style="text-align: left">Payload</td>
</tr>
</tbody>
</table>
<p>The payload is encrypted with that temporary secret key and the nonce
and the public key from the announce response packet of the destination
node. If Node D holds the return data for the destination node, it
forwards the contents of this packet to that node as a data to route
response packet.</p>
<p>The data in the previous packet is in format:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">Real public key of sender</td>
</tr>
<tr>
<td style="text-align: left">variable</td>
<td style="text-align: left">Payload</td>
</tr>
</tbody>
</table>
<p>The payload is encrypted with real secret key of the sender, the nonce
in the data packet and the real public key of the receiver:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> id</td>
</tr>
<tr>
<td style="text-align: left">variable</td>
<td style="text-align: left">Data (optional)</td>
</tr>
</tbody>
</table>
<p>Data sent to us:</p>
<p>announce response packet:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> (0x84)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">8</code></td>
<td style="text-align: left">Data to send back in response</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">24</code></td>
<td style="text-align: left">Nonce</td>
</tr>
<tr>
<td style="text-align: left">variable</td>
<td style="text-align: left">Payload</td>
</tr>
</tbody>
</table>
<p>The payload is encrypted with the DHT secret key of Node D, the public
key in the request and the nonce:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> <code class="language-plaintext highlighter-rouge">is_stored</code></td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">Ping ID or Public Key</td>
</tr>
<tr>
<td style="text-align: left">variable</td>
<td style="text-align: left">Maximum of 4 nodes in packed node format (see DHT)</td>
</tr>
</tbody>
</table>
<p>The packet contains a ping ID if <code class="language-plaintext highlighter-rouge">is_stored</code> is 0 or 2, or the public
key that must be used to send data packets if <code class="language-plaintext highlighter-rouge">is_stored</code> is 1.</p>
<p>If the <code class="language-plaintext highlighter-rouge">is_stored</code> is not 0, it means the information to reach the
public key we are searching for is stored on this node. <code class="language-plaintext highlighter-rouge">is_stored</code> is 2
as a response to a peer trying to announce himself to tell the peer that
he is currently announced successfully.</p>
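<p>The interpretation of the decrypted announce response payload can be sketched as follows. This is a minimal illustration with hypothetical names; the packed nodes are returned unparsed because their format is described in the DHT section.</p>

```python
def parse_announce_response(plaintext):
    """Interpret a decrypted announce response payload."""
    if len(plaintext) < 33:
        raise ValueError("payload needs is_stored plus a 32 byte field")
    is_stored = plaintext[0]
    if is_stored not in (0, 1, 2):
        raise ValueError("is_stored must be 0, 1 or 2")
    field = plaintext[1:33]
    result = {
        "is_stored": is_stored,
        "ping_id": None,
        "data_public_key": None,
        # Up to 4 nodes in packed node format, left undecoded here.
        "packed_nodes": plaintext[33:],
    }
    if is_stored == 1:
        # The searched key is stored on this node: the field is the
        # public key to use when sending data packets.
        result["data_public_key"] = field
    else:
        # 0 (not stored) or 2 (we are announced): the field is a ping ID.
        result["ping_id"] = field
    return result
```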
<p>data to route response packet:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> (0x86)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">24</code></td>
<td style="text-align: left">Nonce</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">Temporary just generated public key</td>
</tr>
<tr>
<td style="text-align: left">variable</td>
<td style="text-align: left">Payload</td>
</tr>
</tbody>
</table>
<p>The payload is encrypted with that temporary secret key, the nonce and
the public key from the announce response packet of the destination
node.</p>
<p>There are 2 types of request packets and 2 response packets to go with
them. The announce request is used to announce ourselves to a node and
announce response packet is used by the node to respond to this packet.
The data to route request packet is a packet used to send packets
through the node to another peer that has announced itself and that we
have found. The data to route response packet is what the node
transforms this packet into.</p>
<p>To announce ourselves to the network we must first find, using announce
packets, the peers with the DHT public key closest to our real public
key. We must then announce ourselves to these peers. Friends will then
be able to send messages to us using data to route packets by sending
them to these peers. To find the peers we have announced ourselves to,
our friends will find the peers closest to our real public key and ask
them if they know us. They will then be able to use the peers that know
us to send us some messages that will contain their DHT public key
(which we need to know to connect directly to them), TCP relays that
they are connected to (so we can connect to them with these relays if we
need to) and some DHT peers they are connected to (so we can find them
faster in the DHT).</p>
<p>Announce request packets are the same packets used slightly differently
if we are announcing ourselves or searching for peers that know one of
our friends.</p>
<p>If we are announcing ourselves we must put our real long term public key
in the packet and encrypt it with our long term private key. This is so
the peer we are announcing ourselves to can be sure that we actually own
that public key. If we are looking for peers we use a temporary public
key used only for packets looking for that peer in order to leak as
little information as possible. The <code class="language-plaintext highlighter-rouge">ping_id</code> is a 32 byte number which
is sent to us in the announce response and we must send back to the peer
in another announce request. This is done in order to prevent people
from easily announcing themselves many times as they have to prove they
can respond to packets from the peer before the peer will let them
announce themselves. This <code class="language-plaintext highlighter-rouge">ping_id</code> is set to 0 when none is known.</p>
<p>The public key we are searching for is set to our long term public key
when announcing ourselves and set to the long term public key of the
friend we are searching for if we are looking for peers.</p>
<p>When announcing ourselves, the public key we want others to use to send
us data back is set to a temporary public key and we use the private key
part of this key to decrypt packet routing data sent to us. This public
key is to prevent peers from saving old data to route packets from
previous sessions and be able to replay them in future Tox sessions.
This key is set to zero when searching for peers.</p>
<p>The sendback data is an 8 byte number that will be sent back in the
announce packet response. Its goal is to be used to learn which announce
request packet the response is responding to, and hence its location in
the unencrypted part of the response. This is needed in toxcore to find
and check info about the packet in order to decrypt it and handle it
correctly. Toxcore uses it as an index to its special <code class="language-plaintext highlighter-rouge">ping_array</code>.</p>
<p>Why don't we use different packets instead of having one announce packet
request and one response that does everything? It makes it a lot more
difficult for possible attackers to know if we are merely announcing
ourselves or if we are looking for friends as the packets for both look
the same and are the same size.</p>
<p>The unencrypted part of an announce response packet contains the
sendback data, which was sent in the request this packet is responding
to and a 24 byte random nonce used to encrypt the encrypted part.</p>
<p>The <code class="language-plaintext highlighter-rouge">is_stored</code> number is set to either 0, 1 or 2. 0 means that the
public key that was being searched for in the request isn't stored or known
by this peer. 1 means that it is and 2 means that we are announced
successfully at that node. Both 1 and 2 are needed so that when clients
are restarted it is possible to reannounce without waiting for the
timeout of the previous announce. This would not otherwise be possible
as a client would receive response 1 without a <code class="language-plaintext highlighter-rouge">ping_id</code> which is needed
in order to reannounce successfully.</p>
<p>When the <code class="language-plaintext highlighter-rouge">is_stored</code> number is 0 or 2, the next 32 bytes is a <code class="language-plaintext highlighter-rouge">ping_id</code>.
When <code class="language-plaintext highlighter-rouge">is_stored</code> is 1 it corresponds to a public key (the send back data
public key set by the friend in their announce request) that must be
used to encrypt and send data to the friend.</p>
<p>Then there are up to 4 optional nodes, in DHT packed nodes format
(see DHT), attached to the response which denote the 4 DHT peers with
the DHT public keys closest to the searched public key in the announce
request known by the peer (see DHT). To find these peers, toxcore uses
the same function as is used to find peers for get node DHT responses.
Peers wanting to announce themselves or searching for peers that know
their friends will recursively query closer and closer peers until they
find the closest possible and then either announce themselves to them or
just ping them every once in a while to know if their friend can be
contacted. Note that the distance function used for this is the same as
the Tox DHT.</p>
<p>Data to route request packets are packets used to send data directly to
another peer via a node that knows that peer. The public key is the
public key of the final destination where we want the packet to be sent
(the real public key of our friend). The nonce is a 24 byte random nonce
and the public key is a random temporary public key used to encrypt the
data in the packet and, if possible, only to send packets to this friend
(we want to leak as little info to the network as possible so we use
temp public keys as we don't want a peer to see the same public keys and
be able to link things together). The data is encrypted data that we
want to send to the peer with the public key.</p>
<p>The route response packets are just the last elements (nonce, public
key, encrypted data) of the data to route request packet copied into a
new packet and sent to the appropriate destination.</p>
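<p>The transformation a node applies to a data to route request can be sketched as below. The function name is hypothetical; the field offsets follow the packet tables in this section.</p>

```python
DATA_ROUTE_REQUEST = 0x85
DATA_ROUTE_RESPONSE = 0x86
# Packet ID (1) + destination public key (32) + nonce (24) + temporary key (32).
REQUEST_HEADER_SIZE = 1 + 32 + 24 + 32


def route_request_to_response(packet):
    """Return (destination public key, data to route response packet)."""
    if len(packet) < REQUEST_HEADER_SIZE:
        raise ValueError("packet too short")
    if packet[0] != DATA_ROUTE_REQUEST:
        raise ValueError("not a data to route request")
    destination = packet[1:33]
    # Nonce, temporary public key and encrypted data are copied unchanged.
    response = bytes([DATA_ROUTE_RESPONSE]) + packet[33:]
    return destination, response
```

<p>The node never decrypts the payload; it only looks up the destination public key among its announced entries to decide where to send the response.</p>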
<p>To handle onion announce packets, toxcore first receives an announce
packet and decrypts it.</p>
<p>Toxcore generates <code class="language-plaintext highlighter-rouge">ping_id</code>s by taking a 32 byte sha hash of some
secret bytes generated when the instance is created, the
current time divided by a 300 second timeout, the public key of the
requester and the source ip/port that the packet was received from.
Since the ip/port that the packet was received from is in the <code class="language-plaintext highlighter-rouge">ping_id</code>,
the announce packets being sent with a ping id must be sent using the
same path as the packet that we received the <code class="language-plaintext highlighter-rouge">ping_id</code> from or
announcing will fail.</p>
<p>The reason for this 300 second timeout in toxcore is that it gives a
reasonable time (300 to 600 seconds) for peers to announce themselves.</p>
<p>Toxcore generates 2 different ping ids, the first is generated with the
current time (divided by 300) and the second with the current time + 300
(divided by 300). The two ping ids are then compared to the ping ids in
the received packets. The reason for doing this is that storing every
ping id received might be expensive and leave us vulnerable to a DoS
attack. This method makes sure that the other peer cannot generate <code class="language-plaintext highlighter-rouge">ping_id</code>s
itself and must ask for them. The reason for the 2 <code class="language-plaintext highlighter-rouge">ping_id</code>s is that we want
to make sure that the timeout is at least 300 seconds and cannot be 0.</p>
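<p>The stateless ping id scheme can be sketched as follows. This is only an illustration under stated assumptions: SHA-256 is assumed as the 32 byte hash, and the field order, the 8 byte big-endian encoding of the time window and the byte-string form of the source address are all hypothetical choices, not toxcore's wire-exact layout.</p>

```python
import hashlib
import hmac

PING_ID_TIMEOUT = 300  # seconds


def make_ping_id(secret, now, requester_key, source_addr, offset=0):
    """Hash the secret, the time window, the requester key and the source."""
    window = (now + offset) // PING_ID_TIMEOUT
    digest = hashlib.sha256()
    digest.update(secret)
    digest.update(window.to_bytes(8, "big"))  # encoding is an assumption
    digest.update(requester_key)
    digest.update(source_addr)
    return digest.digest()


def ping_id_is_valid(candidate, secret, now, requester_key, source_addr):
    """Accept a ping ID from the current window or the following one."""
    current = make_ping_id(secret, now, requester_key, source_addr)
    upcoming = make_ping_id(secret, now, requester_key, source_addr,
                            offset=PING_ID_TIMEOUT)
    return (hmac.compare_digest(candidate, current)
            or hmac.compare_digest(candidate, upcoming))
```

<p>Because the node hands out the id of the following window and accepts both windows, an id issued at any moment remains valid for between 300 and 600 seconds without the node storing anything per requester.</p>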
<p>If one of the two ping ids is equal to the ping id in the announce
request, the sendback data public key and the sendback data are stored
in the datastructure used to store announced peers. If the
implementation has a limit to how many announced entries it can store,
it should only store the entries closest (determined by the DHT distance
function) to its DHT public key. If the entry is already there, the
information will simply be updated with the new one and the timeout will
be reset for that entry.</p>
<p>Toxcore has a timeout of 300 seconds for announce entries, after which
they are removed. This is long enough to make sure the entries don't
expire prematurely but not long enough for peers to stay announced for
extended amounts of time after they go offline.</p>
<p>Toxcore will then copy the 4 DHT nodes closest to the public key being
searched to a new packet (the response).</p>
<p>Toxcore will check whether the public key being searched is in the
datastructure. If it isn't, it will copy the second generated <code class="language-plaintext highlighter-rouge">ping_id</code>
(the one generated with the current time plus 300 seconds) to the
response, set the <code class="language-plaintext highlighter-rouge">is_stored</code> number to 0 and send the packet back.</p>
<p>If the public key is in the datastructure, it will check whether the
public key that was used to encrypt the announce packet is equal to the
announced public key. If it isn't, then the requester is
searching for a peer and we know that peer. In this case <code class="language-plaintext highlighter-rouge">is_stored</code> is
set to 1 and the sending back data public key in the announce entry is
copied to the packet.</p>
<p>If the key used to encrypt the announce packet is equal to the
announced public key (which is also the public key being searched for
in the announce packet), the peer is announcing itself and an
entry for it exists. The sending back data public key is then checked to see
if it equals the one in the packet. If it is not equal, the entry
is outdated, probably because the announcing peer's toxcore instance was
restarted, so <code class="language-plaintext highlighter-rouge">is_stored</code> is set to 0. If it is equal,
the peer is announced correctly, so <code class="language-plaintext highlighter-rouge">is_stored</code> is set to 2. The
second generated <code class="language-plaintext highlighter-rouge">ping_id</code> is then copied to the packet.</p>
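<p>The three cases above can be condensed into one decision function. This is a simplified sketch with hypothetical names: <code class="language-plaintext highlighter-rouge">entries</code> stands in for the announce datastructure and maps an announced public key to its stored data public key.</p>

```python
def choose_is_stored(entries, searched_key, requester_key, data_key, ping_id):
    """Return (is_stored, 32 byte field) for an announce response.

    requester_key is the public key used to encrypt the request and
    ping_id is the second generated ping ID (current time plus 300).
    """
    stored_data_key = entries.get(searched_key)
    if stored_data_key is None:
        # The searched key is not known to this node.
        return 0, ping_id
    if requester_key != searched_key:
        # Someone else is searching for a peer announced here.
        return 1, stored_data_key
    if stored_data_key != data_key:
        # The peer is announcing itself but the entry is outdated,
        # for example after a restart of its instance.
        return 0, ping_id
    # The peer is announcing itself and the entry is current.
    return 2, ping_id
```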
<p>Once the packet is constructed a random 24 byte nonce is generated, the
packet is encrypted (the shared key used to decrypt the request can be
saved and used to encrypt the response to save an expensive key
derivation operation), the data to send back is copied to the
unencrypted part and the packet is sent back as an onion response
packet.</p>
<p>In order to announce itself using onion announce packets toxcore first
takes DHT peers, picks random ones and builds onion paths with them by
saving 3 nodes, calling it a path, generating some keypairs for
encrypting the onion packets and using them to send onion packets. If
the peer is only connected with TCP, the initial nodes will be bootstrap
nodes and connected TCP relays (for the first peer in the path). Once
the peer is connected to the onion he can fill up his list of known
peers with peers sent in announce responses if needed.</p>
<p>Onion paths have different timeouts depending on whether the path is
confirmed or unconfirmed. Unconfirmed paths (paths that core has never
received any responses from) have a timeout of 4 seconds with 2 tries
before they are deemed non working. This is because, due to network
conditions, there may be a large number of newly created paths that do
not work and so trying them a lot would make finding a working path take
much longer. The timeout for a confirmed path (from which a response was
received) is 10 seconds with 4 tries without a response. A confirmed
path has a maximum lifetime of 1200 seconds to make possible
deanonymization attacks more difficult.</p>
<p>Toxcore saves a maximum of 12 paths: 6 paths are reserved for announcing
ourselves and 6 others are used to search for friends. This may not be
the safest way (some nodes may be able to associate friends together)
however it is much more performant than having different paths for each
friend. The main benefit is that the announcing and searching are done
with different paths, which makes it difficult to know that peer with
real public key X is friends with Y and Z. More research is needed to
find the best way to do this. At first toxcore did have different paths
for each friend, however, that meant that each friend path was almost
never used (and checked). With a small number of paths for searching,
fewer resources are needed to find good paths. 6 paths are used
because 4 was too low and caused some performance issues: it took
longer to find good paths at the beginning, as only 4 could be
tried at a time. Too high a number, meanwhile, would mean each path is
used (and tested) less. The reason why the numbers are the same for both
types of paths is for code simplification purposes.</p>
<p>To search/announce itself to peers, toxcore keeps the 8 closest peers
(12 for announcing) to each key it is searching (or announcing itself
to). To populate these it starts by sending announce requests to random
peers for all the public keys it is searching for. It then recursively
searches closer and closer peers (DHT distance function) until it no
longer finds any. It is important to make sure it is not too aggressive
at searching the peers as some might no longer be online but peers might
still send announce responses with their information. Toxcore keeps
lists of last pinged nodes for each key searched so as not to ping dead
nodes too aggressively.</p>
<p>Toxcore decides if it will send an announce packet to one of the 4 peers
in the announce response by checking if the peer would be stored as one
of the stored closest peers if it responded. If it would not be, toxcore
doesn't send an announce request; if it would be, it sends one.</p>
<p>Peers are only put in the closest peers array if they respond to an
announce request. If the peers fail to respond to 3 announce requests
they are deemed timed out and removed. When sending an announce request
to a peer to which we have been announcing ourselves for at least 90
seconds and which has failed to respond to the previous 2 requests,
toxcore uses a random path for the request. This reduces the chances
that a good node will be removed due to bad paths.</p>
<p>The reason for the numbers of peers being 8 and 12 is that lower numbers
might make searching for and announcing too unreliable and a higher
number too bandwidth/resource intensive.</p>
<p>Toxcore uses <code class="language-plaintext highlighter-rouge">ping_array</code> (see <code class="language-plaintext highlighter-rouge">ping_array</code>) for the 8 byte sendback
data in announce packets to store information that it will need to
handle the response: the key to decrypt it, why the request was sent (to
announce ourselves or to search), for what key, and some other info. For
security purposes it checks to make sure the packet was received from
the right ip/port and checks if the key in the unencrypted part of the
packet is the right public key.</p>
<p>For peers we are announcing ourselves to, if we are not announced to
them toxcore tries every 3 seconds to announce ourselves to them until
they return that we have announced ourselves to them, then initially
toxcore sends an announce request packet every 15 seconds to see if we
are still announced and reannounce ourselves at the same time. Toxcore
sends every announce packet with the <code class="language-plaintext highlighter-rouge">ping_id</code> previously received from
that peer with the same path (if possible). Toxcore uses a timeout of 120
seconds rather than 15 seconds if we have been announcing to the peer
for at least 90 seconds, and the onion path we are using for the
peer has also been alive for at least 90 seconds, and we have not been
waiting for at least 15 seconds for a response to a request sent to the
peer, nor for at least 10 seconds for a response to a request sent via
the path. The timeout of at most 120 seconds means a <code class="language-plaintext highlighter-rouge">ping_id</code> received
in the last packet will not have had time to expire (300 second minimum
timeout) before it is resent 120 seconds later.</p>
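<p>The choice between the 3, 15 and 120 second intervals described above can be sketched as a single function. The parameter names are hypothetical; all values are in seconds.</p>

```python
def announce_interval(is_announced, announcing_for, path_age,
                      waiting_on_peer, waiting_on_path):
    """Seconds to wait before the next announce request to one peer."""
    if not is_announced:
        # Keep trying until the peer reports that we are announced.
        return 3
    stable = (
        announcing_for >= 90       # announcing to this peer for 90 s or more
        and path_age >= 90         # onion path alive for 90 s or more
        and waiting_on_peer < 15   # no request to the peer pending for 15 s
        and waiting_on_path < 10   # no request via the path pending for 10 s
    )
    return 120 if stable else 15
```

<p>Even the longest interval of 120 seconds stays well below the 300 second minimum lifetime of a ping id.</p>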
<p>For friends this is slightly different. It is important to start
searching for friends after we are fully announced. Assuming a perfect
network, we would need to do a search for friend public keys only
when first starting the instance (or going offline and back online) as
peers starting up after us would be able to find us immediately just by
searching for us. If we start searching for friends after we are
announced we prevent a scenario where 2 friends start their clients at
the same time but are unable to find each other right away because they
start searching for each other while they have not announced themselves.</p>
<p>For this reason, after the peer is announced successfully, for 17
seconds announce packets are sent aggressively every 3 seconds to each
known close peer (in the list of 8 peers) to search aggressively for
peers that know the peer we are searching for.</p>
<p>After this, toxcore sends requests once per 15 seconds initially, then
uses linear backoff to increase the interval. In detail, the interval
used when searching for a given friend is at least 15 and at most 2400
seconds, and within these bounds is calculated as one quarter of the
time since we began searching for the friend, or since the friend was
last seen. For this purpose, a friend is considered to be seen when some
peer reports that the friend is announced, or we receive a DHT Public
Key packet from the friend, or we obtain a new DHT key for them from a
group, or a friend connection for the friend goes offline.</p>
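<p>The backoff rule for friend searches reduces to a clamp, sketched below with a hypothetical function name. Integer seconds are assumed, and the initial aggressive phase (every 3 seconds for 17 seconds) is not modelled.</p>

```python
MIN_SEARCH_INTERVAL = 15     # seconds
MAX_SEARCH_INTERVAL = 2400   # seconds


def friend_search_interval(seconds_since_start_or_last_seen):
    """One quarter of the elapsed time, clamped to [15, 2400] seconds."""
    quarter = seconds_since_start_or_last_seen // 4
    return min(MAX_SEARCH_INTERVAL, max(MIN_SEARCH_INTERVAL, quarter))
```

<p>For example, a friend last seen 400 seconds ago is searched for every 100 seconds, and the interval reaches its 2400 second cap once the friend has not been seen for 9600 seconds.</p>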
<p>There are other ways this could be done and which would still work but,
if making your own implementation, keep in mind that these are likely
not the most optimized way to do things.</p>
<p>If we find peers (more than 1) that know a friend we will send them an
onion data packet with our DHT public key, up to 2 TCP relays we are
connected to and 2 DHT peers close to us to help the friend connect back
to us.</p>
<p>Onion data packets are packets sent as the data of data to route
packets.</p>
<p>Onion data packets:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">Long term public key of sender</td>
</tr>
<tr>
<td style="text-align: left">variable</td>
<td style="text-align: left">Payload</td>
</tr>
</tbody>
</table>
<p>The payload is encrypted with long term private key of the sender, the
long term public key of the receiver and the nonce used in the data to
route request packet used to send this onion data packet (shaves off 24
bytes).</p>
<p>DHT public key packet:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> (0x9c)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">8</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint64_t</code> <code class="language-plaintext highlighter-rouge">no_replay</code></td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">Our DHT public key</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">[39, 204]</code></td>
<td style="text-align: left">Maximum of 4 nodes in packed format</td>
</tr>
</tbody>
</table>
<p>The packet will only be accepted if the <code class="language-plaintext highlighter-rouge">no_replay</code> number is greater
than the <code class="language-plaintext highlighter-rouge">no_replay</code> number in the last packet received.</p>
<p>The nodes sent in the packet comprise 2 TCP relays to which we are
connected (or fewer if there are not 2 available) and a number of DHT
nodes from our Close List, with the total number of nodes sent being at
most 4. The nodes chosen from the Close List are those closest in DHT
distance to us. This allows the friend to find us more easily in the
DHT, or to connect to us via a TCP relay.</p>
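<p>Node selection for the DHT public key packet might look like the sketch below. Several details are assumptions: nodes are represented as hypothetical <code class="language-plaintext highlighter-rouge">(public_key, address)</code> tuples, the DHT distance is taken to be the XOR of the two keys compared as a big-endian unsigned integer (see DHT), and all remaining slots are filled with close nodes.</p>

```python
MAX_SENT_NODES = 4
MAX_TCP_RELAYS = 2


def xor_distance(key_a, key_b):
    """XOR metric, compared as a big-endian unsigned integer (assumed)."""
    return int.from_bytes(bytes(a ^ b for a, b in zip(key_a, key_b)), "big")


def select_nodes(tcp_relays, close_nodes, own_dht_key):
    """Pick up to 2 TCP relays, then the closest DHT nodes, 4 in total."""
    chosen = list(tcp_relays[:MAX_TCP_RELAYS])
    by_distance = sorted(
        close_nodes, key=lambda node: xor_distance(own_dht_key, node[0])
    )
    chosen.extend(by_distance[:MAX_SENT_NODES - len(chosen)])
    return chosen
```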
<p>Why another round of encryption? We have to prove to the receiver that
we own the long term public key we say we own when sending them our DHT
public key. Friend requests are also sent using onion data packets but
their exact format is explained in Messenger.</p>
<p>The <code class="language-plaintext highlighter-rouge">no_replay</code> number is protection if someone tries to replay an older
packet and should be set to an always increasing number. It is 8 bytes
so you should set a high resolution monotonic time as the value.</p>
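<p>On the receiving side, the <code class="language-plaintext highlighter-rouge">no_replay</code> rule amounts to remembering the highest value seen per friend during the session. The class below is a minimal sketch with hypothetical names.</p>

```python
class ReplayGuard:
    """Tracks the last accepted no_replay value for each friend."""

    def __init__(self):
        self._last_seen = {}

    def accept(self, friend_key, no_replay):
        """Return True only if no_replay is greater than the last one."""
        last = self._last_seen.get(friend_key)
        if last is not None and no_replay <= last:
            return False  # replayed or out-of-date packet
        self._last_seen[friend_key] = no_replay
        return True
```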
<p>We send this packet every 30 seconds if there is more than one peer (in
the 8) that says our friend is announced on them. This packet can
also be sent through the DHT module as a DHT request packet (see DHT) if
we know the DHT public key of the friend and are looking for them in the
DHT but have not connected to them yet. 30 seconds is a reasonable
timeout to not flood the network with too many packets while making sure
the other will eventually receive the packet. Since packets are sent
through every peer that knows the friend, the chance of packet loss
affecting all (up to 8) packets sent is low. If none of them gets
through, the cause is unlikely to be packet loss, so resending right
away without waiting has a high likelihood of failing again.</p>
<p>When sent as a DHT request packet (this is the data sent in the DHT
request packet):</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> (0x9c)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">Long term public key of sender</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">24</code></td>
<td style="text-align: left">Nonce</td>
</tr>
<tr>
<td style="text-align: left">variable</td>
<td style="text-align: left">Encrypted payload</td>
</tr>
</tbody>
</table>
<p>The payload is encrypted with long term private key of sender, the long
term public key of receiver and the nonce, and contains the DHT public
key packet.</p>
<p>When sent as a DHT request packet the DHT public key packet is (before
being sent as the data of a DHT request packet) encrypted with the long
term keys of both the sender and receiver and put in that format. This
is done for the same reason as the double encryption of the onion data
packet.</p>
<p>Toxcore tries to resend this packet through the DHT every 20 seconds. 20
seconds is a reasonable resend rate which isn't too aggressive.</p>
<p>Toxcore has a DHT request packet handler that passes received DHT public
key packets from the DHT module to this module.</p>
<p>If we receive a DHT public key packet, we will first check if the
packet is from a friend. If it is not from a friend, it will be
discarded. The <code class="language-plaintext highlighter-rouge">no_replay</code> will then be checked to see if it is good and
no packet with a lower one was received during the session. The DHT key,
the TCP nodes in the packed nodes and the DHT nodes in the packed nodes
will be passed to their relevant modules. The fact that we have the DHT
public key of a friend means this module has achieved its goal.</p>
<p>If a friend is online and connected to us, the onion will stop all of
its actions for that friend. If the peer goes offline it will restart
searching for the friend as if toxcore was just started.</p>
<p>If toxcore goes offline (no onion traffic for 75 seconds) toxcore will
aggressively reannounce itself and search for friends as if it was just
started.</p>
<h2 id="ping-array">Ping array</h2>
<p>Ping array is an array used in toxcore to store data for pings. It
enables the storage of arbitrary data that can then be retrieved later
by passing the 8 byte ping id that was returned when the data was
stored. It also frees data from pings that are older than a ping
expiring delay set when initializing the array.</p>
<p>Ping arrays are initialized with a size and a timeout parameter. The
size parameter denotes the maximum number of entries in the array and
the timeout denotes the number of seconds to keep an entry in the array.
Timeout and size must be bigger than 0.</p>
<p>Adding an entry to the ping array will make it return an 8 byte number
that can be used as the ping number of a ping packet. This number is
generated by first generating a random 8 byte number (toxcore uses the
cryptographic secure random number generator), dividing then multiplying
it by the total size of the array and then adding the index of the
element that was added. This generates a random looking number that will
return the index of the element that was added to the array. This number
is also stored along with the added data and the current time (to check
for timeouts). Data is added to the array in a cyclical manner (0, 1, 2,
3… (array size - 1), 0, 1, …). If the array is full, the oldest
element is overwritten.</p>
<p>To get data from the ping array, the ping number is passed to the
function that retrieves data from the array. The ping number modulo the
total size of the array gives the index at which the data is stored. If
there is no data stored at this index, the function returns an error.
The ping number is then checked against the ping number stored for this
element. If they are not equal, the function returns an error. If the
array element has timed out, the function returns an error. If all the
checks succeed, the function returns the exact data that was stored and
removes it from the array.</p>
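<p>The add and get steps described above can be sketched as follows. This
is a minimal illustration, not toxcore's implementation: the class name
<code class="language-plaintext highlighter-rouge">PingArray</code>, its method names, the use of
<code class="language-plaintext highlighter-rouge">KeyError</code> for errors, and the overflow guard are all assumptions
made for the example.</p>

```python
import secrets
import time

MAX_U64 = 2**64 - 1


class PingArray:
    """Illustrative ping array; names are hypothetical, not toxcore's API."""

    def __init__(self, size, timeout):
        if not (size > 0 and timeout > 0):
            raise ValueError("size and timeout must be bigger than 0")
        self.size = size
        self.timeout = timeout
        self.entries = [None] * size  # each slot: (ping_id, stored_at, data)
        self.next_index = 0           # slots are filled cyclically

    def add(self, data, now=None):
        now = time.monotonic() if now is None else now
        index = self.next_index
        self.next_index = (self.next_index + 1) % self.size
        rand = secrets.randbits(64)   # random 8 byte number
        # Integer-divide then multiply by the size to get a multiple of the
        # size, then add the index, so that ping_id % size == index.
        ping_id = (rand // self.size) * self.size + index
        # Guard added by this sketch: stay within 8 bytes without changing
        # the remainder modulo the size.
        if ping_id > MAX_U64:
            ping_id -= self.size
        # Overwrites whatever was in the slot (the oldest entry when full).
        self.entries[index] = (ping_id, now, data)
        return ping_id

    def get(self, ping_id, now=None):
        now = time.monotonic() if now is None else now
        index = ping_id % self.size
        entry = self.entries[index]
        if entry is None:
            raise KeyError("no data stored at this index")
        stored_id, stored_at, data = entry
        if stored_id != ping_id:
            raise KeyError("ping number does not match")
        if now - stored_at > self.timeout:
            raise KeyError("entry has timed out")
        self.entries[index] = None    # data is removed once retrieved
        return data
```

<p>In this sketch, a caller would store data with
<code class="language-plaintext highlighter-rouge">add</code> when sending a ping and pass the number echoed in the
response to <code class="language-plaintext highlighter-rouge">get</code>. A second
<code class="language-plaintext highlighter-rouge">get</code> with the same number fails because the entry has
already been removed.</p>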
<p>Ping array is used in many places in toxcore to efficiently keep track
of sent packets.</p>
<h2 id="state-format">State Format</h2>
<p>The reference Tox implementation uses a custom binary format to save the
state of a Tox client between restarts. This format is far from perfect
and will be replaced eventually. For the sake of maintaining
compatibility down the road, it is documented here.</p>
<p>The binary encoding of all integer types in the state format is a
fixed-width byte sequence with the integer encoded in Little Endian
unless stated otherwise.</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">4</code></td>
<td style="text-align: left">Zeroes</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">4</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint32_t</code> (0x15ED1B1F)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">?</code></td>
<td style="text-align: left">List of sections</td>
</tr>
</tbody>
</table>
<h3 id="sections">Sections</h3>
<p>The core of the state format consists of a list of sections. Every
section has its type and length specified at the beginning. In some
cases, a section only contains one item and thus takes up the entire
length of the section. This is denoted with <code class="language-plaintext highlighter-rouge">?</code>.</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">4</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint32_t</code> Length of this section</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">2</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint16_t</code> Section type</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">2</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint16_t</code> (0x01CE)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">?</code></td>
<td style="text-align: left">Section</td>
</tr>
</tbody>
</table>
<p>Section types:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Name</th>
<th style="text-align: left">Value</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left">NospamKeys</td>
<td style="text-align: left">0x01</td>
</tr>
<tr>
<td style="text-align: left">DHT</td>
<td style="text-align: left">0x02</td>
</tr>
<tr>
<td style="text-align: left">Friends</td>
<td style="text-align: left">0x03</td>
</tr>
<tr>
<td style="text-align: left">Name</td>
<td style="text-align: left">0x04</td>
</tr>
<tr>
<td style="text-align: left">StatusMessage</td>
<td style="text-align: left">0x05</td>
</tr>
<tr>
<td style="text-align: left">Status</td>
<td style="text-align: left">0x06</td>
</tr>
<tr>
<td style="text-align: left">TcpRelays</td>
<td style="text-align: left">0x0A</td>
</tr>
<tr>
<td style="text-align: left">PathNodes</td>
<td style="text-align: left">0x0B</td>
</tr>
<tr>
<td style="text-align: left">Conferences</td>
<td style="text-align: left">0x14</td>
</tr>
<tr>
<td style="text-align: left">EOF</td>
<td style="text-align: left">0xFF</td>
</tr>
</tbody>
</table>
<p>Not every section listed above is required to be present in order to
restore from a state file. Only NospamKeys is required.</p>
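<p>As a rough illustration of the layout above, the following sketch
checks the file header and walks the list of sections until it reaches
EOF. The function names are hypothetical, and the sketch assumes that
"Length of this section" counts only the section contents, not the 8
bytes of length, type, and marker that precede them.</p>

```python
STATE_MAGIC = 0x15ED1B1F   # second field of the file header
SECTION_MAGIC = 0x01CE     # marker that follows every section type
SECTION_EOF = 0xFF


def read_uint(data, offset, width):
    """Read a Little Endian unsigned integer of `width` bytes."""
    return int.from_bytes(data[offset:offset + width], "little")


def parse_state(data):
    """Return a list of (section_type, contents) pairs, stopping at EOF."""
    if data[0:4] != bytes(4) or read_uint(data, 4, 4) != STATE_MAGIC:
        raise ValueError("not a Tox state file")
    sections = []
    offset = 8
    while len(data) - offset >= 8:
        length = read_uint(data, offset, 4)
        section_type = read_uint(data, offset + 4, 2)
        if read_uint(data, offset + 6, 2) != SECTION_MAGIC:
            raise ValueError("bad section marker")
        # Assumption: `length` covers the contents only.
        contents = data[offset + 8:offset + 8 + length]
        if len(contents) != length:
            raise ValueError("truncated section")
        if section_type == SECTION_EOF:
            break
        sections.append((section_type, contents))
        offset += 8 + length
    return sections
```

<p>The returned contents can then be handed to a per-type parser keyed
on the section type values listed in the table above.</p>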
<h4 id="nospam-and-keys-0x01">Nospam and Keys (0x01)</h4>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">4</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint32_t</code> Nospam</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">Long term public key</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">Long term secret key</td>
</tr>
</tbody>
</table>
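<p>A minimal sketch of reading this section, assuming the contents are
exactly the 68 bytes listed above and that the nospam follows the state
format's Little Endian default. The function name and dictionary keys
are made up for the example.</p>

```python
def parse_nospam_keys(contents):
    """Split a NospamKeys section into its three fields."""
    if len(contents) != 4 + 32 + 32:
        raise ValueError("NospamKeys section must be 68 bytes")
    return {
        # Assumption: Little Endian, the default for the state format.
        "nospam": int.from_bytes(contents[0:4], "little"),
        "public_key": contents[4:36],
        "secret_key": contents[36:68],
    }
```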
<h4 id="dht-0x02">DHT (0x02)</h4>
<p>This section contains a list of DHT-related sections.</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">4</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint32_t</code> (0x159000D)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">?</code></td>
<td style="text-align: left">List of DHT sections</td>
</tr>
</tbody>
</table>
<h5 id="dht-sections">DHT Sections</h5>
<p>Every DHT section has the following structure:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">4</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint32_t</code> Length of this section</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">2</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint16_t</code> DHT section type</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">2</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint16_t</code> (0x11CE)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">?</code></td>
<td style="text-align: left">DHT section</td>
</tr>
</tbody>
</table>
<p>DHT section types:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Name</th>
<th style="text-align: left">Value</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left">Nodes</td>
<td style="text-align: left">0x04</td>
</tr>
</tbody>
</table>
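<p>The nested layout of the DHT section can be sketched in the same way
as the top-level section list. The names below are hypothetical, and the
sketch again assumes that the length field counts only the contents of
each DHT section.</p>

```python
DHT_MAGIC = 0x159000D          # first field of the DHT section
DHT_SECTION_MAGIC = 0x11CE     # marker that follows every DHT section type


def parse_dht(contents):
    """Return a list of (dht_section_type, contents) pairs."""
    if int.from_bytes(contents[0:4], "little") != DHT_MAGIC:
        raise ValueError("bad DHT magic number")
    sections = []
    offset = 4
    while len(contents) - offset >= 8:
        length = int.from_bytes(contents[offset:offset + 4], "little")
        kind = int.from_bytes(contents[offset + 4:offset + 6], "little")
        marker = int.from_bytes(contents[offset + 6:offset + 8], "little")
        if marker != DHT_SECTION_MAGIC:
            raise ValueError("bad DHT section marker")
        # Assumption: `length` covers the DHT section contents only.
        body = contents[offset + 8:offset + 8 + length]
        if len(body) != length:
            raise ValueError("truncated DHT section")
        sections.append((kind, body))
        offset += 8 + length
    return sections
```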
<h6 id="nodes-0x04">Nodes (0x04)</h6>
<p>This section contains a list of nodes. These nodes are used to quickly
reconnect to the DHT after a Tox client is restarted.</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">?</code></td>
<td style="text-align: left">List of nodes</td>
</tr>
</tbody>
</table>
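<p>The structure of a node is the same as <code class="language-plaintext highlighter-rouge">Node Info</code>. Note: this means
that the integers stored in these nodes are stored in Big Endian, as
they are in <code class="language-plaintext highlighter-rouge">Node Info</code>, and not in the Little Endian default of the
state format.</p>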
<h4 id="friends-0x03">Friends (0x03)</h4>
<p>This section contains a list of friends. A friend can either be a peer
we've sent a friend request to or a peer we've accepted a friend request
from.</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">?</code></td>
<td style="text-align: left">List of friends</td>
</tr>
</tbody>
</table>
<p>Friend:</p>
<p>The integers in this structure are stored in Big Endian format.</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> Status</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">Long term public key</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1024</code></td>
<td style="text-align: left">Friend request message as a byte string</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left">PADDING</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">2</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint16_t</code> Size of the friend request message</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">128</code></td>
<td style="text-align: left">Name as a byte string</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">2</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint16_t</code> Size of the name</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1007</code></td>
<td style="text-align: left">Status message as a byte string</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left">PADDING</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">2</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint16_t</code> Size of the status message</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> User status (see also: <code class="language-plaintext highlighter-rouge">USERSTATUS</code>)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">3</code></td>
<td style="text-align: left">PADDING</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">4</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint32_t</code> Nospam (only used for sending a friend request)</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">8</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint64_t</code> Last seen time</td>
</tr>
</tbody>
</table>
<p>Status can be one of:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Status</th>
<th style="text-align: left">Meaning</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left">0</td>
<td style="text-align: left">Not a friend</td>
</tr>
<tr>
<td style="text-align: left">1</td>
<td style="text-align: left">Friend added</td>
</tr>
<tr>
<td style="text-align: left">2</td>
<td style="text-align: left">Friend request sent</td>
</tr>
<tr>
<td style="text-align: left">3</td>
<td style="text-align: left">Confirmed friend</td>
</tr>
<tr>
<td style="text-align: left">4</td>
<td style="text-align: left">Friend online</td>
</tr>
</tbody>
</table>
<h4 id="name-0x04">Name (0x04)</h4>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">?</code></td>
<td style="text-align: left">Name as a UTF-8 encoded string</td>
</tr>
</tbody>
</table>
<h4 id="status-message-0x05">Status Message (0x05)</h4>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">?</code></td>
<td style="text-align: left">Status message as a UTF-8 encoded string</td>
</tr>
</tbody>
</table>
<h4 id="status-0x06">Status (0x06)</h4>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> User status (see also: <code class="language-plaintext highlighter-rouge">USERSTATUS</code>)</td>
</tr>
</tbody>
</table>
<h4 id="tcp-relays-0x0a">Tcp Relays (0x0A)</h4>
<p>This section contains a list of TCP relays.</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">?</code></td>
<td style="text-align: left">List of TCP relays</td>
</tr>
</tbody>
</table>
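<p>The structure of a TCP relay is the same as <code class="language-plaintext highlighter-rouge">Node Info</code>. Note: this
means that the integers stored in these nodes are stored in Big Endian,
as they are in <code class="language-plaintext highlighter-rouge">Node Info</code>, and not in the Little Endian default of
the state format.</p>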
<h4 id="path-nodes-0x0b">Path Nodes (0x0B)</h4>
<p>This section contains a list of path nodes used for onion routing.</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">?</code></td>
<td style="text-align: left">List of path nodes</td>
</tr>
</tbody>
</table>
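<p>The structure of a path node is the same as <code class="language-plaintext highlighter-rouge">Node Info</code>. Note: this
means that the integers stored in these nodes are stored in Big Endian,
as they are in <code class="language-plaintext highlighter-rouge">Node Info</code>, and not in the Little Endian default of
the state format.</p>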
<h4 id="conferences-0x14">Conferences (0x14)</h4>
<p>This section contains a list of saved conferences.</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">?</code></td>
<td style="text-align: left">List of conferences</td>
</tr>
</tbody>
</table>
<p>Conference:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> Groupchat type</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">Groupchat id</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">4</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint32_t</code> Message number</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">2</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint16_t</code> Lossy message number</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">2</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint16_t</code> Peer number</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">4</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint32_t</code> Number of peers</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> Title length</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">?</code></td>
<td style="text-align: left">Title</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">?</code></td>
<td style="text-align: left">List of peers</td>
</tr>
</tbody>
</table>
<p>All peers other than the saver are saved, including frozen peers. On
reload, they all start as frozen.</p>
<p>Peer:</p>
<table>
<thead>
<tr>
<th style="text-align: left">Length</th>
<th style="text-align: left">Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">Long term public key</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">32</code></td>
<td style="text-align: left">DHT public key</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">2</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint16_t</code> Peer number</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">8</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint64_t</code> Last active timestamp</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">1</code></td>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">uint8_t</code> Name length</td>
</tr>
<tr>
<td style="text-align: left"><code class="language-plaintext highlighter-rouge">?</code></td>
<td style="text-align: left">Name</td>
</tr>
</tbody>
</table>
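<p>Conferences and peers contain variable-length fields, so they have to
be read sequentially. The following sketch shows one way to do that. The
helper class and function names are hypothetical, and the integers are
assumed to be Little Endian because this section does not state
otherwise.</p>

```python
class Reader:
    """Sequential reader over a byte string (helper for this sketch)."""

    def __init__(self, data):
        self.data = data
        self.pos = 0

    def take(self, count):
        chunk = self.data[self.pos:self.pos + count]
        if len(chunk) != count:
            raise ValueError("truncated conference data")
        self.pos += count
        return chunk

    def uint(self, width):
        # Assumption: Little Endian, the default for the state format.
        return int.from_bytes(self.take(width), "little")


def parse_peer(reader):
    peer = {
        "public_key": reader.take(32),
        "dht_public_key": reader.take(32),
        "peer_number": reader.uint(2),
        "last_active": reader.uint(8),
    }
    peer["name"] = reader.take(reader.uint(1))  # length byte, then name
    return peer


def parse_conferences(contents):
    """Split a Conferences section into a list of dictionaries."""
    reader = Reader(contents)
    conferences = []
    while len(contents) > reader.pos:
        conference = {
            "type": reader.uint(1),
            "id": reader.take(32),
            "message_number": reader.uint(4),
            "lossy_message_number": reader.uint(2),
            "peer_number": reader.uint(2),
        }
        peer_count = reader.uint(4)
        conference["title"] = reader.take(reader.uint(1))
        conference["peers"] = [parse_peer(reader) for _ in range(peer_count)]
        conferences.append(conference)
    return conferences
```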
<h4 id="eof-0xff">EOF (0xFF)</h4>
<p>This section indicates the end of the state file. This section doesn't
have any content and thus its length is 0.</p>
<ol>
<li>
<p>We use a “real” peer count, which is the number of confirmed peers
in the peerlist (that is, peers with whom you have successfully
completed a handshake and exchanged peer info).</p>
</li>
<li>
<p>The peer roles checksum is calculated as follows: Make an unsigned
16-bit sum of each confirmed peer's role plus the first byte of
their respective public key, then add to this an unsigned 16-bit sum
of the sanctions credentials hash.</p>
</li>
</ol>
</div>
</section>
</body>
<footer id="footer" class="dark">
<div class="container-fluid limit-width">
<div class="languages"></div>
<div class="ext-med pull-left">
<a class="button" href="https://github.com/TokTok/c-toxcore" title="Star us on Github"><span class="icon fab fa-github"></span></a>
<a class="button" href="https://wiki.tox.chat/users/community#irc" title="Chat with us on IRC"><span class="icon fas fa-comments"></span></a>
<a class="button do-button" href="https://www.digitalocean.com/" title="Powered by DigitalOcean"><img class="button" src="static/img/do.svg" /></a>
</div>
<div class="cc pull-right">
<a href="http://creativecommons.org/licenses/by-sa/4.0/"><img src="static/img/CC.png"></a>
<br>
<p>This page was generated from
<a href="https://github.com/TokTok/website">
a file hosted in a public GitHub repository
</a>
— issue tickets and pull requests are very welcome!
</p>
</div>
</div>
</footer>
</html>