Internet Engineering Task Force (IETF)                         A. Bakker
Request for Comments: 7574                  Vrije Universiteit Amsterdam
Category: Standards Track                                    R. Petrocco
ISSN: 2070-1721                                           V. Grishchenko
                                           Technische Universiteit Delft
                                                               July 2015

            Peer-to-Peer Streaming Peer Protocol (PPSPP)

Abstract

The Peer-to-Peer Streaming Peer Protocol (PPSPP) is a protocol for disseminating the same content to a group of interested parties in a streaming fashion. PPSPP supports streaming of both prerecorded (on-demand) and live audio/video content. It is based on the peer-to-peer paradigm, where clients consuming the content are put on equal footing with the servers initially providing the content, to create a system where everyone can potentially provide upload bandwidth. It has been designed to provide short time-till-playback for the end user and to prevent disruption of the streams by malicious peers. PPSPP has also been designed to be flexible and extensible. It can use different mechanisms to optimize peer uploading, prevent freeriding, and work with different peer discovery schemes (centralized trackers or Distributed Hash Tables). It supports multiple methods for content integrity protection and chunk addressing. Designed as a generic protocol that can run on top of various transport protocols, it currently runs on top of UDP using Low Extra Delay Background Transport (LEDBAT) for congestion control.

Status of This Memo

This is an Internet Standards Track document.

This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Further information on Internet Standards is available in Section 2 of RFC 5741.

Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at http://www.rfc-editor.org/info/rfc7574.

Bakker, et al.
Standards Track [Page 1]

RFC 7574 PPSPP July 2015

Copyright Notice

Copyright (c) 2015 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

   1. Introduction
      1.1. Purpose
      1.2. Requirements Language
      1.3. Terminology
   2. Overall Operation
      2.1. Example: Joining a Swarm
      2.2. Example: Exchanging Chunks
      2.3. Example: Leaving a Swarm
   3. Messages
      3.1. HANDSHAKE
         3.1.1. Handshake Procedure
      3.2. HAVE
      3.3. DATA
      3.4. ACK
      3.5. INTEGRITY
      3.6. SIGNED_INTEGRITY
      3.7. REQUEST
      3.8. CANCEL
      3.9. CHOKE and UNCHOKE
      3.10. Peer Address Exchange
         3.10.1. PEX_REQ and PEX_RES Messages
      3.11. Channels
      3.12. Keep Alive Signaling
   4. Chunk Addressing Schemes
      4.1. Start-End Ranges
         4.1.1. Chunk Ranges
         4.1.2. Byte Ranges
      4.2. Bin Numbers
      4.3. In Messages
         4.3.1. In HAVE Messages
         4.3.2. In ACK Messages
   5. Content Integrity Protection
      5.1. Merkle Hash Tree Scheme
      5.2. Content Integrity Verification
      5.3. The Atomic Datagram Principle
      5.4. INTEGRITY Messages
      5.5. Discussion and Overhead
      5.6. Automatic Detection of Content Size
         5.6.1. Peak Hashes
         5.6.2. Procedure
   6. Live Streaming
      6.1. Content Authentication
         6.1.1. Sign All
         6.1.2. Unified Merkle Tree
            6.1.2.1. Signed Munro Hashes
            6.1.2.2. Munro Signature Calculation
            6.1.2.3. Procedure
            6.1.2.4. Secure Tune In
      6.2. Forgetting Chunks
   7. Protocol Options
      7.1. End Option
      7.2. Version
      7.3. Minimum Version
      7.4. Swarm Identifier
      7.5. Content Integrity Protection Method
      7.6. Merkle Tree Hash Function
      7.7. Live Signature Algorithm
      7.8. Chunk Addressing Method
      7.9. Live Discard Window
      7.10. Supported Messages
      7.11. Chunk Size
   8. UDP Encapsulation
      8.1. Chunk Size
      8.2. Datagrams and Messages
      8.3. Channels
      8.4. HANDSHAKE
      8.5. HAVE
      8.6. DATA
      8.7. ACK
      8.8. INTEGRITY
      8.9. SIGNED_INTEGRITY
      8.10. REQUEST
      8.11. CANCEL
      8.12. CHOKE and UNCHOKE
      8.13. PEX_REQ, PEX_RESv4, PEX_RESv6, and PEX_REScert
      8.14. KEEPALIVE
      8.15. Flow and Congestion Control
      8.16. Example of Operation
   9. Extensibility
      9.1. Chunk Picking Algorithms
      9.2. Reciprocity Algorithms
   10. IANA Considerations
      10.1. PPSPP Message Type Registry
      10.2. PPSPP Option Registry
      10.3. PPSPP Version Number Registry
      10.4. PPSPP Content Integrity Protection Method Registry
      10.5. PPSPP Merkle Hash Tree Function Registry
      10.6. PPSPP Chunk Addressing Method Registry
   11. Manageability Considerations
      11.1. Operations
         11.1.1. Installation and Initial Setup
         11.1.2. Migration Path
         11.1.3. Requirements on Other Protocols and Functional Components
         11.1.4. Impact on Network Operation
         11.1.5. Verifying Correct Operation
         11.1.6. Configuration
      11.2. Management Considerations
         11.2.1. Management Interoperability and Information
         11.2.2. Fault Management
         11.2.3. Configuration Management
         11.2.4. Accounting Management
         11.2.5. Performance Management
         11.2.6. Security Management
   12. Security Considerations
      12.1. Security of the Handshake Procedure
         12.1.1. Protection against Attack 1
         12.1.2. Protection against Attack 2
         12.1.3. Protection against Attack 3
      12.2. Secure Peer Address Exchange
         12.2.1. Protection against the Amplification Attack
         12.2.2. Example: Tracker as Certification Authority
         12.2.3. Protection against Eclipse Attacks
      12.3. Support for Closed Swarms
      12.4. Confidentiality of Streamed Content
      12.5. Strength of the Hash Function for Merkle Hash Trees
      12.6. Limit Potential Damage and Resource Exhaustion by Bad or Broken Peers
         12.6.1. HANDSHAKE
         12.6.2. HAVE
         12.6.3. DATA
         12.6.4. ACK
         12.6.5. INTEGRITY and SIGNED_INTEGRITY
         12.6.6. REQUEST
         12.6.7. CANCEL
         12.6.8. CHOKE
         12.6.9. UNCHOKE
         12.6.10. PEX_RES
         12.6.11. Unsolicited Messages in General
      12.7. Exclude Bad or Broken Peers
   13. References
      13.1. Normative References
      13.2. Informative References
   Acknowledgements
   Authors' Addresses

1. Introduction

1.1.
Purpose

This document describes the Peer-to-Peer Streaming Peer Protocol (PPSPP), designed for disseminating the same content to a group of interested parties in a streaming fashion. PPSPP supports streaming of both prerecorded (on-demand) and live audio/video content. It is based on the peer-to-peer paradigm where clients consuming the content are put on equal footing with the servers initially providing the content, to create a system where everyone can potentially provide upload bandwidth.

PPSPP has been designed to provide short time-till-playback for the end user and to prevent disruption of the streams by malicious peers. Central in this design is a simple method of identifying content based on self-certification. In particular, content in PPSPP is identified by a single cryptographic hash that is the root hash in a Merkle hash tree calculated recursively from the content [MERKLE] [ABMRKL]. This self-certifying hash tree allows every peer to directly detect when a malicious peer tries to distribute fake content. The tree can be used for both static and live content. Moreover, it ensures only a small amount of information is needed to start a download and to verify incoming chunks of content, thus ensuring short start-up times.

PPSPP has also been designed to be extensible for different transports and use cases. Hence, PPSPP is a generic protocol that can run directly on top of UDP, TCP, or other protocols. As such, PPSPP defines a common set of messages that make up the protocol, which can have different representations on the wire depending on the lower-level protocol used. When the lower-level transport allows, PPSPP can also use different congestion control algorithms. At present, PPSPP is set to run on top of UDP using LEDBAT for congestion control [RFC6817]. Using LEDBAT enables PPSPP to serve the content after playback (seeding) without disrupting the user who may have moved to different tasks that use its network connection.
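The self-certifying identifier described above can be illustrated with a small sketch. It is not the normative construction (Section 5.1 defines that). It assumes SHA-256 as the hash function, fixed-size chunks, and padding of the leaf layer to a power of two with all-zero filler hashes; the function names merkle_root and split_chunks and the 1024-byte default chunk size are hypothetical choices made for this illustration.

```python
import hashlib


def split_chunks(content, chunk_size=1024):
    """Split content into fixed-size chunks (the last one may be shorter)."""
    return [content[i:i + chunk_size]
            for i in range(0, len(content), chunk_size)]


def merkle_root(chunks, hash_fn=hashlib.sha256):
    """Return the root hash of a binary Merkle tree built over `chunks`."""
    if not chunks:
        raise ValueError("need at least one chunk")

    # The base of the tree is formed by the hashes of the chunks.
    level = [hash_fn(chunk).digest() for chunk in chunks]

    # Assumption for this sketch: pad the leaf layer to a power of two
    # with all-zero filler hashes so that every node has two children.
    width = 1
    while width < len(level):
        width *= 2
    filler = bytes(hash_fn().digest_size)
    level += [filler] * (width - len(level))

    # Each higher node is the hash of the concatenation of its two children.
    while len(level) > 1:
        level = [hash_fn(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0]


if __name__ == "__main__":
    original = split_chunks(b"example content " * 200)
    tampered = list(original)
    tampered[1] = b"fake chunk"
    # Any modified chunk changes the root, so it no longer matches the
    # identifier the peer started from.
    print(merkle_root(original).hex())
    print(merkle_root(original) == merkle_root(tampered))
```

In this sketch a peer that knows only the root hash can tell that the tampered chunk list does not belong to the content. The actual protocol verifies individual chunks as they arrive by using intermediate hashes rather than recomputing the whole tree.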
PPSPP is also flexible and extensible in the mechanisms it uses to promote client contribution and prevent freeriding, that is, how to deal with peers that only download content but never upload to others. It also allows different schemes for chunk addressing and content integrity protection, if the defaults are not fit for a particular use case. In addition, it can work with different peer discovery schemes, such as centralized trackers or fast Distributed Hash Tables [JIM11]. Finally, in this default setup, PPSPP maintains only a small amount of state per peer. A reference implementation of PPSPP over UDP is available [SWIFTIMPL].

The protocol defined in this document assumes that a peer has already discovered a list of (initial) peers using, for example, a centralized tracker [PPSP-TP]. Once a peer has this list of peers, PPSPP allows the peer to connect to other peers, request chunks of content, and discover other peers disseminating the same content.

The design of PPSPP is based on our research into making BitTorrent [BITTORRENT] suitable for streaming content [P2PWIKI]. Most PPSPP messages have corresponding BitTorrent messages and vice versa. However, PPSPP is specifically targeted towards streaming audio/video content and optimizes time-till-playback. It was also designed to be more flexible and extensible.

1.2. Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].

1.3. Terminology

message
   The basic unit of PPSPP communication. A message will have different representations on the wire depending on the transport protocol used. Messages are typically multiplexed into a datagram for transmission.

datagram
   A sequence of messages that is offered as a unit to the underlying transport protocol (UDP, etc.).
   The datagram is PPSPP's Protocol Data Unit (PDU).

content
   Either a live transmission or a prerecorded multimedia file.

chunk
   The basic unit in which the content is divided. For example, a block of N kilobytes. A chunk may be of variable size.

chunk ID
   Unique identifier for a chunk of content (e.g., an integer). Its type depends on the chunk addressing scheme used.

chunk specification
   An expression that denotes one or more chunk IDs.

chunk addressing scheme
   Scheme for identifying chunks and expressing the chunk availability map of a peer in a compact fashion.

chunk availability map
   The set of chunks a peer has successfully downloaded and checked the integrity of.

bin
   A number denoting a specific binary interval of the content (i.e., one or more consecutive chunks) in the bin numbers chunk addressing scheme (see Section 4).

content integrity protection scheme
   Scheme for protecting the integrity of the content while it is being distributed via the peer-to-peer network. That is, methods for receiving peers to detect whether a requested chunk has been modified, either maliciously by the sending peer or accidentally in transit.

hash
   The result of applying a cryptographic hash function, more specifically a Modification Detection Code (MDC) [HAC01], such as SHA-256 [FIPS180-4], to a piece of data.

Merkle hash tree
   A tree of hashes whose base is formed by the hashes of the chunks of content, and its higher nodes are calculated by recursively computing the hash of the concatenation of the two child hashes (see Section 5.1).

root hash
   The root in a Merkle hash tree calculated recursively from the content (see Section 5.1).

munro hash
   The hash of a subtree that is the unit of signing in the Unified Merkle Tree content authentication scheme for live streaming (see Section 6.1.2.1).

swarm
   A group of peers participating in the distribution of the same content.

swarm ID
   Unique identifier for a swarm of peers, in PPSPP a sequence of bytes. For video on demand with content integrity protection enabled, the identifier is the so-called root hash of a Merkle hash tree over the content. For live streaming, the swarm ID is a public key.

tracker
   An entity that records the addresses of peers participating in a swarm, usually for a set of swarms, and makes this membership information available to other peers on request.

choking
   When Peer A is choking Peer B, it means that A is currently not willing to accept requests for content from B.

seeding
   Peer A is said to be seeding when A has downloaded a static content file completely and is now offering it for others to download.

leeching
   Peer A is said to be leeching when A has not completely downloaded a static content file yet or is not offering to upload it to others.

channel
   A logical connection between two peers. The channel concept allows peers to use the same transport address for communicating with different peers.

channel ID
   Unique, randomly chosen identifier for a channel, local to each peer. So the two peers logically connected by a channel each have a different channel ID for that channel.

heavy payload
   A datagram has a heavy payload when it contains DATA messages, SIGNED_INTEGRITY messages, or a large number of smaller messages.

In this document the prefixes kilo-, mega-, etc., denote base 1024.

2. Overall Operation

The basic unit of communication in PPSPP is the message. Multiple messages are multiplexed into a single datagram for transmission. A datagram (and hence the messages it contains) will have different representations on the wire depending on the transport protocol used (see Section 8).

The overall operation of PPSPP is illustrated in the following examples.
The examples assume that the content distributed is static, UDP is used for transport, the Merkle Hash Tree scheme is used for content integrity protection, and that a specific policy is used for selecting which chunks to download.

2.1. Example: Joining a Swarm

Consider a user who wants to watch a video. To play the video, the user clicks on the play button of an HTML5