UDPMSG3 Protocol
Implementation
More information on deployments and implementations is available on the UDPMSG3 page.
Message fields
Key       Value
CHN       A channel name; this can be anything. For public chat channels, IRC-style names such as #anonet are recommended. Although channel names in general have no particular meaning, the "USER/<NICKNAME>" format is recommended for private messages.
CMD       The event that caused the message to be sent; for chat channels this is one of "MSG", "JOIN", "PART", "NICK", "ALIVE".
NET       The originating network, preferably a short name or abbreviation identifying the external network (if any) that originated the event. Purely informational.
USR       Nickname of the user (if any) who initiated the event. Using only alphanumeric characters is recommended.
MSG       The message associated with the event (only required for MSG, accepted for PART).
NEWNICK   New nickname (only valid for the NICK command).
DUMMY     An optional random value that guarantees the uniqueness of the packet. It is otherwise meaningless and should be ignored on reception.
Note here that no implementation is required to send a JOIN, PART or NICK event if these events occur, but it is considered nice behaviour to do so as it provides a more transparent view of the chatbox.
The MSG command indicates that a message has been sent by the user. JOIN and PART are used when a user joins and leaves a channel. NICK is used for nickname changes, but for simplicity an implementation may choose to send or interpret this command as a PART from the old nick followed by a JOIN from the new nick. The ALIVE command indicates that a user is still in a channel; it may be sent at periodic intervals to keep user lists up to date.
Message encoding
A message consists of a list of key-value pairs. Keys and values are stored as interleaved strings, each followed by a NULL byte (binary value 0). For this reason, NULL bytes are not allowed in keys or values. All keys and values should be encoded in ASCII or UTF-8. Binary data should be encoded using an extension-specific encoding (e.g. hex, base32, base64, base128); the choice of encoding is not part of the core protocol.
A list of key-value pairs is stored as follows:
<KEY1>\0<VALUE1>\0<KEY2>\0<VALUE2>\0
Note the trailing \0 character. \0 represents a NULL byte. <KEYx> and <VALUEx> represent the key and value of a key-value pair.
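As an illustration of the layout above, here is a minimal sketch in C of a structural check on an encoded message, together with one sample message. The function name udpmsg3_is_well_formed, the constant UDPMSG3_MAX_LEN and the sample field values are made up for this example and are not taken from any existing implementation; the 1024-byte limit is the one recommended in the broadcast section.

```c
#include <stddef.h>

/* Recommended upper bound on a complete message (see "Broadcast subsystem"). */
#define UDPMSG3_MAX_LEN 1024

/*
 * Returns 1 if buf looks like a well-formed key-value list, 0 otherwise.
 * Checks that the buffer is non-empty, within the size limit, ends with a
 * NULL byte, and holds an even number of NULL-terminated strings
 * (key, value, key, value, ...). It does not look at key names or encodings.
 */
int udpmsg3_is_well_formed(const char *buf, size_t len)
{
    size_t i, terminators = 0;

    if (len == 0 || len > UDPMSG3_MAX_LEN)
        return 0;
    if (buf[len - 1] != '\0')
        return 0;                    /* missing trailing \0 */
    for (i = 0; i < len; i++)
        if (buf[i] == '\0')
            terminators++;
    return terminators % 2 == 0;     /* every key needs a value */
}

/*
 * Sample chat message: CHN=#anonet, CMD=MSG, USR=alice, MSG=hello.
 * The string literal's implicit terminator acts as the trailing \0,
 * so the whole array (sizeof sample_msg bytes) is the message.
 */
const char sample_msg[] =
    "CHN\0#anonet\0CMD\0MSG\0USR\0alice\0MSG\0hello";
```

Passing sample_msg with sizeof sample_msg as the length is accepted, while dropping the last byte makes the check fail because the trailing \0 is gone.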
Broadcast subsystem
The broadcast system relies on the uniqueness of every message. Each node in the system should process each message at most once, so it should keep a backlog of messages it has recently seen. Storing an MD5 or SHA1 hash of the message is an acceptable compromise for the uniqueness check, to save memory and speed up lookups.
A node should send packets it receives to all its peers, except for the one that sent the packet. However, if the packet is returned to the sending node, the sending node should ignore the packet. How the packet is transferred or encapsulated is not specified. It is considered good practice to add a checksum or hash to the packet. A few recommendations for the underlying protocol are described below.
The length of a complete message should not exceed 1024 bytes, so that it can reliably be encapsulated in a UDP packet. If a packet is truncated by a lower networking protocol, the entire packet should be dropped.
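The backlog of recently seen messages could be kept as a small ring of message fingerprints. The sketch below is one possible approach, not a reference implementation: the names seen_cache and seen_check_and_add and the backlog size of 256 are assumptions, and the 64-bit FNV-1a hash is used only as a dependency-free stand-in for the MD5 or SHA1 digest mentioned above.

```c
#include <stddef.h>
#include <stdint.h>

#define SEEN_SLOTS 256   /* backlog size; an arbitrary choice for this sketch */

/* Fixed-size ring of fingerprints of recently seen messages. */
struct seen_cache {
    uint64_t hashes[SEEN_SLOTS];
    size_t   count;      /* number of valid entries, at most SEEN_SLOTS */
    size_t   next;       /* slot the next new fingerprint will overwrite */
};

/* 64-bit FNV-1a: a stand-in here for an MD5 or SHA1 digest. */
static uint64_t fnv1a64(const unsigned char *data, size_t len)
{
    uint64_t h = 0xcbf29ce484222325ULL;   /* FNV offset basis */
    size_t i;

    for (i = 0; i < len; i++) {
        h ^= data[i];
        h *= 0x100000001b3ULL;            /* FNV prime */
    }
    return h;
}

/*
 * Returns 1 if the message is already in the backlog (the caller drops it).
 * Returns 0 if it is new; the fingerprint is then recorded, replacing the
 * oldest entry once the ring is full, and the caller processes the message.
 */
int seen_check_and_add(struct seen_cache *c,
                       const unsigned char *msg, size_t len)
{
    uint64_t h = fnv1a64(msg, len);
    size_t i;

    for (i = 0; i < c->count; i++)
        if (c->hashes[i] == h)
            return 1;

    c->hashes[c->next] = h;
    c->next = (c->next + 1) % SEEN_SLOTS;
    if (c->count < SEEN_SLOTS)
        c->count++;
    return 0;
}
```

A zero-initialised struct seen_cache is an empty backlog. FNV-1a is not collision resistant against a malicious sender, so a node exposed to untrusted peers would likely want a real cryptographic digest instead.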
Packet transfer protocols
Note that these protocols are only recommendations to encourage compatibility among different implementations. The only requirement is that a message may not be changed in transit, so it should be checked for changes and truncation. Corrupted messages should be dropped.
UDP with SHA1 checksum
The packet is prefixed with a 20-byte (binary) SHA1 checksum and then sent as one UDP packet. The receiver should verify the checksum to make sure that the packet has not been damaged or truncated. The UDP connection can be configured in various ways: one bidirectional connection, one unidirectional connection, two unidirectional connections and variations thereof.
Data structure: <20-byte-SHA1-hash><Message>
TCP with framing
The packet is prefixed with a 2-octet big-endian length field. If the receiving side receives a packet with an impossible length (0 or more than 1024 bytes), it should ignore the packet and may close the connection. A client/consumer should be able to establish an outbound connection. A hub/router should be able to accept inbound connections, and may be able to establish outbound connections for the purpose of linking to other hubs; where it cannot, this is easy to work around using socat, for example.
Data structure: <2-byte-big-endian-length-field><Message>
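The framing described above might be implemented along the following lines. This is a sketch under assumptions: the function names frame_encode and frame_decode are invented for the example, and the length field is taken to count only the message bytes, not the 2-byte prefix itself, which the text does not state explicitly.

```c
#include <stddef.h>
#include <string.h>

#define UDPMSG3_MAX_LEN 1024   /* maximum message length from the spec */

/*
 * Writes <2-byte-big-endian-length-field><Message> into out.
 * Returns the frame size in bytes, or 0 if the message length is
 * impossible (0 or > 1024) or out is too small.
 */
size_t frame_encode(const unsigned char *msg, size_t msg_len,
                    unsigned char *out, size_t out_cap)
{
    if (msg_len == 0 || msg_len > UDPMSG3_MAX_LEN || out_cap < msg_len + 2)
        return 0;
    out[0] = (unsigned char)(msg_len >> 8);     /* high byte first */
    out[1] = (unsigned char)(msg_len & 0xff);
    memcpy(out + 2, msg, msg_len);
    return msg_len + 2;
}

/*
 * Inspects the bytes received so far on the stream.
 * Returns  1 when a complete frame is available; *msg and *msg_len then
 *            describe the message and the caller consumes *msg_len + 2 bytes,
 *          0 when more data is needed,
 *         -1 when the length field is impossible (0 or > 1024).
 */
int frame_decode(const unsigned char *buf, size_t buf_len,
                 const unsigned char **msg, size_t *msg_len)
{
    size_t len;

    if (buf_len < 2)
        return 0;
    len = ((size_t)buf[0] << 8) | buf[1];
    if (len == 0 || len > UDPMSG3_MAX_LEN)
        return -1;
    if (buf_len < len + 2)
        return 0;
    *msg = buf + 2;
    *msg_len = len;
    return 1;
}
```

Because TCP is a byte stream, a single read may deliver part of a frame or several frames at once, which is why the decoder reports "need more data" separately from "invalid length". On -1 the stream can no longer be resynchronised reliably, which fits the option of closing the connection.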
Example sourcecode
Key value pair construction and parsing in C
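    #include <string.h>

    /*
     * Decode an encoded key-value list of mlen bytes. keys is an array of
     * wanted key names ending with a NULL pointer. For each decoded key that
     * matches keys[i], values[i] is pointed at the value inside encoded.
     */
    void KVPDecodeValues(char* encoded, int mlen, char** keys, char** values) {
        int i, j;
        for (j = 0; j < mlen; j++) {
            char* key = encoded + j;
            for (; j < mlen && encoded[j]; j++);   /* skip to the end of the key */
            if (++j >= mlen) break;                /* key without a value */
            char* value = encoded + j;
            for (; j < mlen && encoded[j]; j++);   /* skip to the end of the value */
            if (j >= mlen) break;                  /* value is not terminated */
            for (i = 0; keys[i]; i++)
                if (strcmp(keys[i], key) == 0) values[i] = value;
        }
    }

    /*
     * Encode the keys/values arrays (both ending with a NULL pointer) into
     * encoded, a buffer of mlen bytes. Returns the encoded length, or -1 if
     * the buffer is too small.
     */
    int KVPEncodeValues(char* encoded, int mlen, char** keys, char** values) {
        int i, l = 0;
        for (i = 0; keys[i] && values[i]; i++) {
            int kl = strlen(keys[i]);
            int vl = strlen(values[i]);
            l += kl + vl + 2;
            if (l > mlen) return -1;
            memcpy(encoded, keys[i], kl);
            encoded += kl;
            *encoded++ = 0;
            memcpy(encoded, values[i], vl);
            encoded += vl;
            *encoded++ = 0;
        }
        return l;
    }

    /* Return the number of bytes KVPEncodeValues needs for these pairs. */
    int KVPEncodeGetLength(char** keys, char** values) {
        int i, l = 0;
        for (i = 0; keys[i] && values[i]; i++)
            l += strlen(keys[i]) + strlen(values[i]) + 2;
        return l;
    }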
Key value pair construction and parsing in PHP
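    // Decode an encoded message into an associative array of key => value.
    function KVPDecode($encoded) {
        $parts = explode("\0", $encoded);
        $ret = array();
        // The trailing \0 leaves one empty element at the end; skip it.
        for ($i = 0; $i < count($parts) - 1; $i += 2)
            $ret[$parts[$i]] = $parts[$i+1];
        return $ret;
    }

    // Encode an associative array of key => value into a message.
    function KVPEncode($arr) {
        $tmp = array();
        foreach ($arr as $key => $value) {
            $tmp[] = $key;
            $tmp[] = $value;
        }
        $tmp[] = '';   // produces the trailing \0
        return implode("\0", $tmp);
    }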